[GRASS-user] importing big vector data

Markus Metz markus.metz.giswork at gmail.com
Mon Mar 3 14:27:17 PST 2014


On Mon, Mar 3, 2014 at 5:39 PM, Martin Landa <landa.martin at gmail.com> wrote:
> Hi all,
>
> I was playing with big vector data and trying to import it into GRASS
> native format. I used the 'buildings' layer from OSM, provided as an Esri
> Shapefile for the region of the Czech Republic [1].
>
> This layer has almost 3.5 million polygons.
>
> $ ogrinfo buildings.shp -al | grep 'Count:'
> Feature Count: 3464029
>
> I used a PC with 16 CPUs (Intel Xeon @ 2.53GHz) and 48GB RAM with 64bit
> Debian stable. GRASS was compiled with large-file support.
>
> 1) $ v.in.ogr dsn=buildings.shp
>
> -> after 30min failed with "Unexpected error. Aborted", memory usage peak 4.1GB
>
> 2) $ v.in.ogr -c dsn=buildings.shp
>
> -> 20min with 3.0GB memory peak
>
> The result was not topologically clean, but import procedure finished.
>
> 3) $ GRASS_VECTOR_LOWMEM=1 v.in.ogr dsn=buildings.shp
>
> -> after 110min failed with "ERROR: G_calloc: unable to allocate 101 *
> 4 bytes of memory at allocation.c:81" (when Finding centroids for OGR
> layer <buildings>...), memory peak 3.6GB

This works for me on a desktop machine with a quad-core AMD Phenom @ 4GHz
and 16 GB RAM running Scientific Linux. Memory peak was 6.3 GB and the
import took 1 hour. That run included snapping, which costs time and
memory, and I was processing a time series of climate data in a
different GRASS session at the same time. (My first attempt was also
successful, but I didn't measure the processing time.) There are a
number of topological errors in the shapefile; among others, there are
6619 overlapping areas which are too large to be removed by snapping
and might need manual cleaning.

The memory allocation errors you experienced are strange; they should
not happen on a system with 48 GB RAM. Can you hack v.in.ogr and make
it try to allocate e.g. 8 GB of memory right at the beginning? If that
fails, you could check how much RAM is free on your system. Disk space
can also be a limiting factor but seems unlikely in your case.
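
To check free RAM and disk space without touching v.in.ogr, a minimal
shell sketch like the one below could be run in the same terminal that
starts GRASS. The function name report_limits is made up for
illustration, and the idea that a per-process limit (ulimit -v) might
be involved is only an assumption, not something established in this
thread. The /proc/meminfo part is Linux-specific.

```shell
#!/bin/sh
# Sketch only: print limits that could make even a small allocation fail.
# report_limits is a hypothetical helper name, not part of GRASS.
report_limits() {
    # Per-process address-space cap of this shell (inherited by child
    # processes); "unlimited" means no cap is set.
    echo "address space limit (kB): $(ulimit -v 2>/dev/null || echo unknown)"

    # System-wide memory figures, where /proc/meminfo exists (Linux).
    if [ -r /proc/meminfo ]; then
        grep -E '^(MemTotal|MemFree|MemAvailable|SwapFree):' /proc/meminfo
    else
        echo "/proc/meminfo not readable on this system"
    fi

    # Free space on the filesystem holding the given directory
    # (pass the directory that contains the GRASS database).
    df -k "${1:-.}"
}

report_limits .
```

If the address space limit shows a number instead of "unlimited",
allocations can fail long before the physical RAM is used up.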

Markus M

>
> 4) shp->postgis + v.external (simple features)
>
> $ shp2pgsql -I -D -S buildings.shp | psql pgis_db
>
> in 2min!
>
> $ v.external dsn=PG:dbname=pgis_db layer=buildings
>
> -> after 9min failed with "G_realloc: unable to allocate 40800000
> bytes of memory at cindex.c:113"
>
> GRASS_VECTOR_LOWMEM=1 v.external dsn=PG:dbname=pgis_db layer=buildings
>
> -> it took 17min with memory peak 2.2GB, link was created!
>
> I tried to render such data
>
> $ d.mon cairo output=/tmp/buildings.png
>
> $ g.region vect=buildings
>
> $ d.vect buildings
>
> but it failed after 30sec with
>
> ERROR: G_realloc: unable to allocate 48000 bytes of memory at list.c:46
>
> Any comments or suggestions are highly welcome...
>
> Thanks, Martin
>
> [1] http://download.geofabrik.de/europe/czech-republic-latest.shp.zip
>
> --
> Martin Landa * http://geo.fsv.cvut.cz/gwiki/Landa