[GRASS-dev] [GRASS-user] importing big vector data

Markus Metz markus.metz.giswork at gmail.com
Tue Mar 4 13:08:36 PST 2014


On Tue, Mar 4, 2014 at 8:53 PM, Markus Metz
<markus.metz.giswork at gmail.com> wrote:
> On Tue, Mar 4, 2014 at 5:13 PM, Martin Landa <landa.martin at gmail.com> wrote:
>> Hi,
>>
>> [I moved this discussion to grass-dev ML which seems to be more appropriate]
>>
>> 2014-03-03 23:27 GMT+01:00 Markus Metz <markus.metz.giswork at gmail.com>:
>>>> -> after 110min failed with "ERROR: G_calloc: unable to allocate 101 *
>>>> 4 bytes of memory at allocation.c:81" (when Finding centroids for OGR
>>>> layer <buildings>...), memory peak 3.6GB
>>>
>>> This works for me on a desktop machine with quadcore AMD Phenom @ 4GHz
>>> and 16 GB RAM with Scientific Linux. Memory peak was 6.3 GB and it
>>> took 1 hour, but that included snapping which costs time and memory
>>> (my first attempt was also successful but I didn't measure processing
>>> time), and I processed at the same time a time series of climate data
>>> in a different GRASS session. There are a number of topological errors
>>> in the shapefile, amongst others there are 6619 overlapping areas
>>> which are too large to be removed by snapping and might need manual
>>> cleaning.
>>>
>>> The memory allocation errors you experienced are strange, they should
>>> not happen on a system with 48 GB RAM. Can you hack v.in.ogr and make
>>> it try to allocate e.g. 8 GB of memory right at the beginning? If that
>>> fails, you could check how much RAM is free on your system. Disk space
>>> can also be a limiting factor but seems unlikely in your case.
>>
>> right, this is strange, I used the code below
>>
>>     int i;
>>     int base = 1e6;
>>     char *buf = NULL;
>>     for(i = 1 ; i <= 8000; i++) {
>>       G_debug(0, "%d -> %ld", i, i * base);
>>       buf = G_realloc(buf, i * base);
>>       if (i % 100 == 0)
>>         sleep(1);
>>     }
>>
>> it fails when trying to allocate about 2GB,
>
> Since i and base are both int, integer overflow will occur for i *
> base > 2^31 - 1 and i * base will be converted to 1 by G_realloc().
> The second argument to G_realloc() must be of type size_t, see also
> man realloc.
>
> Thus, no wonder that the test fails when trying to allocate more than
> 2GB, but it does not explain why v.in.ogr fails. Did you perform a
> memory test with memtest?
>
>>
>> D0/0: 2047 -> 2047000000
>> D0/0: 2048 -> 2048000000
>> Current region rows: 3, cols: 7
>> ERROR: G_realloc: unable to allocate 2048000000 bytes of memory at main.c:101

I was wrong: 2048000000 is smaller than 2^31 - 1 = 2147483647, so there
is no integer overflow at this point.
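
To make the arithmetic explicit, here is a minimal sketch, assuming a
32-bit int and a 64-bit size_t as on typical 64-bit Linux. The helper
names max_safe_step() and request_size() are hypothetical and not part
of GRASS. With base = 1e6, the product i * base only leaves the int
range once i exceeds 2147, so the failure at i = 2048 has another cause.
Casting to size_t before multiplying avoids the overflow for larger i.

```c
#include <limits.h>
#include <stddef.h>

/* Largest step i for which i * base still fits in a signed int
 * (hypothetical helper, for illustration only). */
int max_safe_step(int base)
{
    return INT_MAX / base;
}

/* Request size computed in size_t. The casts are applied before the
 * multiplication, so the product is never evaluated as int. */
size_t request_size(int i, int base)
{
    return (size_t)i * (size_t)base;
}
```

In the test loop this would mean passing request_size(i, base) to
G_realloc() and printing the value with %zu instead of %ld.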
>>
>> There is memory available and no limits per process:

Are you sure that there are no limits per process?
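
One way to check this from inside the process is getrlimit(). The
following is only a sketch, assuming a POSIX system; print_limit() is a
hypothetical helper and not part of GRASS. It reports the soft and hard
limits for a given resource, e.g. RLIMIT_AS (address space) or
RLIMIT_DATA (data segment).

```c
#include <stdio.h>
#include <sys/resource.h>

/* Print one limit value, or "unlimited" for RLIM_INFINITY. */
static void print_value(const char *label, rlim_t value)
{
    if (value == RLIM_INFINITY)
        printf("  %s: unlimited\n", label);
    else
        printf("  %s: %llu bytes\n", label, (unsigned long long)value);
}

/* Print soft and hard limit of a resource (hypothetical helper).
 * Returns 0 on success, -1 if getrlimit() fails. */
int print_limit(const char *name, int resource)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;

    printf("%s\n", name);
    print_value("soft", rl.rlim_cur);
    print_value("hard", rl.rlim_max);
    return 0;
}
```

Calling print_limit("RLIMIT_AS", RLIMIT_AS) at the start of the test
program would show whether the process is capped near 2 GB. The same
information should be visible in the shell with "ulimit -a".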

Markus M

>>
>> $ free -m
>>              total       used       free     shared    buffers     cached
>> Mem:         48395       4587      43808          0       3272        361
>> -/+ buffers/cache:        953      47441
>> Swap:        47682          0      47682
>>
>> Trying to investigate more. In any case thanks for testing... Martin
>
> Without GRASS_VECTOR_LOWMEM I get a memory peak of 8.3GB and a
> processing time of 22 min. The conversion from non-topological vector
> to topological vector needs some time.
>
> With GRASS_VECTOR_LOWMEM I get a memory peak of 6.3GB. I would have
> expected a larger difference with regard to GRASS_VECTOR_LOWMEM.
>
> Markus M