[GRASS-dev] [GRASS-user] importing big vector data

Markus Metz markus.metz.giswork at gmail.com
Tue Mar 4 11:53:04 PST 2014


On Tue, Mar 4, 2014 at 5:13 PM, Martin Landa <landa.martin at gmail.com> wrote:
> Hi,
>
> [I moved this discussion to grass-dev ML which seems to be more appropriate]
>
> 2014-03-03 23:27 GMT+01:00 Markus Metz <markus.metz.giswork at gmail.com>:
>>> -> after 110min failed with "ERROR: G_calloc: unable to allocate 101 *
>>> 4 bytes of memory at allocation.c:81" (when Finding centroids for OGR
>>> layer <buildings>...), memory peak 3.6GB
>>
>> This works for me on a desktop machine with quadcore AMD Phenom @ 4GHz
>> and 16 GB RAM with Scientific Linux. Memory peak was 6.3 GB and it
>> took 1 hour, but that included snapping which costs time and memory
>> (my first attempt was also successful but I didn't measure processing
>> time), and I processed at the same time a time series of climate data
>> in a different GRASS session. There are a number of topological errors
>> in the shapefile, amongst others there are 6619 overlapping areas
>> which are too large to be removed by snapping and might need manual
>> cleaning.
>>
>> The memory allocation errors you experienced are strange, they should
>> not happen on a system with 48 GB RAM. Can you hack v.in.ogr and make
>> it try to allocate e.g. 8 GB of memory right at the beginning? If that
>> fails, you could check how much RAM is free on your system. Disk space
>> can also be a limiting factor but seems unlikely in your case.
>
> right, this is strange, I used the code below
>
>     int i;
>     int base = 1e6;
>     char *buf = NULL;
>     for(i = 1 ; i <= 8000; i++) {
>       G_debug(0, "%d -> %ld", i, i * base);
>       buf = G_realloc(buf, i * base);
>       if (i % 100 == 0)
>         sleep(1);
>     }
>
> it fails when trying to allocate about 2GB,

Since i and base are both int, the product i * base is computed as an
int and overflows once it exceeds 2^31 - 1; the overflowed value will
then be converted to 1 by G_realloc(). The second argument to
G_realloc() must be of type size_t, see also man realloc.

Thus, no wonder that the test fails when trying to allocate more than
2GB, but it does not explain why v.in.ogr fails. Did you perform a
memory test with memtest?

>
> D0/0: 2047 -> 2047000000
> D0/0: 2048 -> 2048000000
> Current region rows: 3, cols: 7
> ERROR: G_realloc: unable to allocate 2048000000 bytes of memory at main.c:101
>
> There is memory available and no limits per process:
>
> $ free -m
>              total       used       free     shared    buffers     cached
> Mem:         48395       4587      43808          0       3272        361
> -/+ buffers/cache:        953      47441
> Swap:        47682          0      47682
>
> Trying to investigate more. In any case thanks for testing... Martin

Without GRASS_VECTOR_LOWMEM I get a memory peak of 8.3GB and 22 min
processing time. The conversion from non-topological vector to
topological vector needs some time.

With GRASS_VECTOR_LOWMEM I get a memory peak of 6.3GB. I would have
expected GRASS_VECTOR_LOWMEM to make a larger difference.

Markus M