[GRASS-dev] Re: improve v.rast.stats speed? (Markus Metz), Vol 34, Issue 43

Thu Feb 19 16:09:17 EST 2009

Dear readers,
Is this the right place to post some questions and suggestions for
improvements?
I am responding to the remark about v.rast.stats: I fully agree that its
lack of speed is a problem. I had to run the operation on 3,000,000+
polygons over a 33,000 x 28,000 raster, which proved impossible.
It was also too much for ArcGIS; the only program that came through was
gvSIG, and stunningly fast, in a little over an hour.

I'm experimenting with the new 6.4RC3 version, to check whether GRASS could
replace ArcINFO in certain research groups within our institute. To avoid
problems with hardware resources I work with both an Intel Mac Core 2 Duo
2.4GHz with 4GB of RAM, as well as with a Pentium D Dual Core 3.0GHz with
4GB of RAM, so I'm at least prepared for heavy tasks.

Generally speaking, I run into problems when dealing with extremely large
datasets. Besides the previous problem I ran into the following:
-topology cleaning on large vector files (see bug report 494 on trac)
-saving a subselection of 1000 polygons/areas from a 1,500,000+ dataset
using v.extract: I stopped the operation after an hour or so. The new
vector data file gets created quite quickly, so it looks to me as if the
problem is in the database or data extraction
-deleting a vector dataset must be done manually when for some reason the
database link cannot be found, which is sometimes very annoying; the force
option should delete it anyway
-using the Python GUI, selecting a very large vector file from a dropdown
box (e.g. to show it in a map) takes a lot of time (it starts v.category,
which it shouldn't do)

Other weird problems I noted:
-using the old Tcl/Tk GUI, the map display doesn't update when loading
multiple datasets. Is this specific to my mega-datasets again?
-the classic d.what.rast on an X monitor (d.mon) does not work on my mega
raster

Apart from this, I created a simple enhancement in the form of a shell
script, which automatically generates a new database schema when you create
a new mapset while already logged in to a PostgreSQL database. This
makes it easy to create new mapsets connected to PostgreSQL, after having
initially set up a PostgreSQL database + PERMANENT schema. Maybe this
should be integrated into the g.mapset command.
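
As an illustration only (this is not the original shell script), the idea
could be sketched in Python as below. The function name, the choice of a
lowercased mapset name as the schema name, and the use of psql to issue
CREATE SCHEMA are all assumptions; the value GRASS expects for database=
also differs between versions, so it is passed through unchanged.

```python
import re


def mapset_schema_commands(mapset, dbname, grass_database=None):
    """Build the commands that create a mapset plus a matching schema.

    Hypothetical helper: it only returns argument lists and runs nothing.
    grass_database is the string handed to db.connect; some GRASS versions
    may expect a form such as "host=...,dbname=..." (assumption).
    """
    # Restrict the name to plain identifier characters, because it is
    # pasted into an SQL statement below.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", mapset):
        raise ValueError("unsuitable mapset name: %r" % mapset)

    schema = mapset.lower()  # assumed naming rule: schema = lowercased mapset
    if grass_database is None:
        grass_database = dbname

    return [
        # Create the mapset if it does not exist and switch to it.
        ["g.mapset", "-c", "mapset=%s" % mapset],
        # Create the schema in the already prepared PostgreSQL database.
        ["psql", "-d", dbname, "-c", "CREATE SCHEMA %s" % schema],
        # Point the new mapset's default DB connection at that schema.
        ["db.connect", "driver=pg", "database=%s" % grass_database,
         "schema=%s" % schema],
    ]
```

Inside a running GRASS session each returned list could then be passed to
subprocess.run(cmd, check=True), in order, so that a failure in one step
stops the remaining ones.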

This is about it so far. I actually like GRASS. It's not the most 'open'
software to work with from a user's perspective, but it's powerful, and the
new GUI is certainly promising, also for less experienced users! Only
the support for ultra-large datasets and some very slow vector operations
really need tuning. And I love the command line, so don't drop it in favour
of GUIs.

I hope you can do something useful with this information.

Kind regards,
Wouter