[GRASS-dev] large vector problems

Wouter Boasson Wouter.Boasson at rivm.nl
Wed Feb 25 15:40:09 EST 2009


Dear Markus, Markus and Jens,

My MacBook survived the burn-in test :-) After 29 hours under full load,
alternately CPU-bound and disk-bound, I got my mega-vector file cleaned.
Thank you for the support and suggestions that helped solve my problems
with the large vector cleaning operation!

I recompiled the 6.5dev version on Linux without large file support and
without 64-bit, using the standard pre-compiled binary and devel packages
from the SUSE repositories, so nothing special at all. A final test showed
that the 6.4RC3 version works too.
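
For the record, the build itself was nothing exotic; roughly like this (a
sketch from memory, with only the PostgreSQL driver flag added):

    # Plain build as used for the test; flags from memory.
    ./configure --with-postgres      # PostgreSQL attribute driver
    make && make install
    # Large-file support would need --enable-largefile, which I
    # deliberately left out here.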

Probably two effects combined to cause my problems:
-lack of memory in the first place, and
-the database was on a Samba share (which works perfectly well, except with
these mega vector datasets)

Why it doesn't work _natively_ on the Mac is still an unanswered question.
Maybe some built-in memory limit? At least my SUSE system under VMware did
the job.
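
If anyone wants to rule out a shell-imposed limit on OS X, the per-process
limits can be inspected like this (a generic check, not something I have
verified as the cause):

    # Show the current shell's per-process resource limits;
    # the data segment and virtual memory sizes are the interesting ones.
    ulimit -a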

Although the dataset was cleaned, files of this size are virtually
impossible to handle, especially as standard querying, extract and overlay
operations with raster datasets simply take too much time. The fact that
ArcINFO Workstation (also a topological GIS) is an order of magnitude
faster and not so memory hungry makes me believe that there should be a way
to improve the speed of these kinds of operations, but unfortunately I'm
not an algorithm guru...

I'll post a few related issues with mega files that make working with them
very difficult. I'll file them as (separate) enhancement tickets on trac,
as they are in my opinion of major importance:
-selecting a large vector map from a dropdown box in the wxPython GUI takes
a long time
-renaming this vector took 25 minutes (PostgreSQL access!)
-v.extract is also incredibly slow (see the timing sketch after this list)
-removing a vector file with an unreachable PostgreSQL database link does
not work, not even in force mode
-v.what consumes several GB of RAM just to query a large vector map (why?)
-v.rast.stats suffers from setting masks, extracting polygons and querying;
these are particularly slow operations, and the module is no longer usable
for vector files of this size
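
To make the slowness reproducible for the tickets, I'll time the offending
modules roughly like this (map names and the where clause are placeholders,
not my actual data):

    # Placeholder map names; the point is just to capture wall-clock
    # times for the trac tickets.
    time v.extract input=bigvect output=subset where="cat < 1000"
    time v.what map=bigvect east_north=155000,463000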

I'll put my shell script to create a new mapset and automatically generate
a PostgreSQL schema somewhere on the wiki; a rough outline follows below.
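
In outline it does little more than this (a rough sketch with placeholder
names; the real script adds checks and error handling):

    #!/bin/sh
    # Sketch: create a mapset plus a matching PostgreSQL schema and
    # point the mapset's attribute connection at that schema.
    # MAPSET and grassdb are placeholders.
    MAPSET=mymapset
    g.mapset -c mapset="$MAPSET"                  # create and switch to it
    psql -d grassdb -c "CREATE SCHEMA $MAPSET;"   # matching schema
    db.connect driver=pg database=grassdb schema="$MAPSET"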

With kind regards,

Wouter Boasson (MSc)
Geo-IT Research and Coordination

RIVM - National Institute for Public Health and the Environment
Expertise Centre for Methodology and Information Services

Contact information
-----------------------
RIVM
VenZ/EMI, Pb 86
t.a.v. dhr. Drs. Wouter Boasson
Postbus 1
3720 BA Bilthoven

T +31(0)302748518
M +31(0)611131150
F +31(0)302744456
E wouter.boasson at rivm.nl
mo - th


