[GRASS5] Re: vector point dataset

Radim Blazek blazek at itc.it
Wed Nov 17 08:00:17 EST 2004


I have submitted the changes: the spatial index is no longer saved to
a file; it is built only if necessary. It is built automatically
if Vect_select_* is called, but it is probably better
to call Vect_build_spatial_index from the module (like in d.what.vect).
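
To illustrate the build-on-demand idea in isolation, here is a minimal Python sketch. It is not the GRASS library code: the class and method names (LazyPointIndex, build, select_by_box) are hypothetical, and a simple grid stands in for the real tree of boxes. It only shows an index that is never written to disk, is built on the first select, and can also be built explicitly up front.

```python
class LazyPointIndex:
    """Toy in-memory point index (hypothetical, not the GRASS API).

    A uniform grid stands in for the real spatial index; the point is
    only that the index is built on first use and never saved to a file.
    """

    def __init__(self, points, cell=1.0):
        self.points = list(points)
        self.cell = cell
        self._grid = None   # not built yet
        self.builds = 0     # counts how many times the index was built

    def build(self):
        """Build the index explicitly; does nothing if already built."""
        if self._grid is not None:
            return
        grid = {}
        for i, (x, y) in enumerate(self.points):
            key = (int(x // self.cell), int(y // self.cell))
            grid.setdefault(key, []).append(i)
        self._grid = grid
        self.builds += 1

    def select_by_box(self, xmin, ymin, xmax, ymax):
        """Return indices of points inside the box, building the index if needed."""
        self.build()
        hits = []
        for cx in range(int(xmin // self.cell), int(xmax // self.cell) + 1):
            for cy in range(int(ymin // self.cell), int(ymax // self.cell) + 1):
                for i in self._grid.get((cx, cy), []):
                    x, y = self.points[i]
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(i)
        return sorted(hits)


idx = LazyPointIndex([(0.5, 0.5), (2.5, 2.5), (5.0, 5.0)])
print(idx.select_by_box(0, 0, 3, 3))  # first select triggers the build
print(idx.select_by_box(4, 4, 6, 6))  # later selects reuse the same index
```

A caller that knows it will query many times can call build() once at the start, which makes the cost visible at a predictable place instead of inside the first select.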

Another change: it took a long time to free the memory occupied by support
structures, and usually that is not necessary. So now, by default,
support structures are not released when a vector is closed.
This reduced the time for v.build in the previous example from 54s
to 39s.

It is possible to use Vect_set_release_support if necessary
(for example in v.clean). Let me know about other cases where
Vect_set_release_support should be added.


Radim


Brent Wood wrote:
> 
> On Tue, 16 Nov 2004, Radim Blazek wrote:
> 
> 
>>I tried to change the library so that spatial index is not
>>stored to file and it is built only if needed.
>>
>>Test on vector : 149971 boundaries, 99972 areas, 99972 centroids
>>
>>module                        | Old version | New version |
>>v.build                       |    55s      |     54s     |
>>v.distance (from   1 point )  |     6s      |     29s     |
>>v.distance (from 100 points)  |  3m50s      |   4m30s     |
>>
>>It means that if a module needs the spatial index, it takes
>>5 times longer to get it ready than before, but the difference
>>is less important in real applications, because usually it is
>>not used just once.
>>
>>About 10 of the 60 modules need the spatial index.
>>
>>We can have either 10 faster modules or we can save the space occupied
>>by the spatial index file.
>>
>>What is your opinion?
> 
> 
> Thank you for following this up, 'tis appreciated.
> 
> Hmm.... I'm not using GRASS much, mostly GMT/QGIS/PostGIS, but I want to
> do more with GRASS. I appreciate your comment on the value of indices for
> points. I guess with modern hard drives the space is not a huge issue, and
> the time taken to import the file & create the index is only a once off,
> whereas queries are likely to be ongoing.
> 
> A typical use for me, would be taking 120,000,000 point elevations and
> building a DEM. I have mainly used GMT for this, so hope to simply import
> the GMT netCDF grids (so far unsuccessfully), but I was also interested in
> building the model with GRASS, as it supposedly has some excellent tools
> for this. I've tried a few times, but so far I have been unable to get a
> DEM built by GRASS.
> 
> GMT takes about 3 hrs on a fast PC. The same box with GRASS has been
> running for 15hrs with no result. I do need to look into this more, & it
> is not directly in answer to your question, but is background to what I'm
> trying to achieve with GRASS.
> 
> As long as GRASS was doing what it should, & the index files are useful I
> have no problem with them being built.
> 
> Something I'm not aware of is the approach used by GRASS57 for accessing
> data from a PostGIS table. If the points were stored in PostGIS and
> accessed by GRASS, I presume GRASS would not build an index. I have not
> yet tried to build a DEM using GRASS to work with points in PostGIS.
> 
> 
> 
> Thanks,
> 
>     Brent
> 
> 
> 
>>Radim
>>
>>
>>Radim Blazek wrote:
>>
>>>It is normal.
>>>Spatial index is important also for points, I think. Otherwise
>>>v.distance, for example, must always go through all points. Say that
>>>you have another vector with 4,000,000 points and you want to find the
>>>nearest in the first one. Without a spatial index it must do
>>>4,000,000 x 4,000,000 checks.
>>>The spatial index is stored as a tree of boxes, 6x8 bytes each, so 430 M is
>>>possible.
>>>
>>>Any advice appreciated....
>>>
>>>
>>>Brent Wood wrote:
>>>
>>>
>>>>I have a vector point dataset (ascii XYZ, the basis of a DEM).
>>>>
>>>>I'm importing into GRASS5.7 with
>>>>
>>>>cat <file> | v.in.ascii -Z output=nzxyz xcol=1 ycol=2 zcol=3 catcol=0
>>>>
>>>>There are about 4,000,000 XYZ points.
>>>>
>>>>The vector coor file is 150Mb. GRASS is now building the topology &
>>>>index.
>>>>The topo file is almost 200Mb and the sidx is at 430Mb and growing.
>>>>
>>>>Is this normal? It seems very excessive for a point dataset. A spatial
>>>>index & topology make more sense for line/polygon data, with over 4x the
>>>>actual data volume to store this extra info doesn't look right somehow...
>>>>
>>>>The system is a SuSE Linux 9.1 32 bit OS on A64 3500 with 1Gb memory.
>>>>Swapped out 1.4Gb & still using 800Mb main memory.
>>>>
>>>>
>>>>Any advice appreciated....
>>>>
>>>>
>>>> Brent Wood
>>>>
>>>>
>>>
>>
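
As a rough check of the numbers quoted in the thread above, the following Python sketch works through the arithmetic. It assumes one 6x8-byte box per point and counts only those leaf boxes; the inner nodes of the tree and any other per-node overhead are not modelled, so it gives a lower bound and does not reproduce the 430 M figure exactly. The function names are hypothetical.

```python
BOX_BYTES = 6 * 8  # one box: six 8-byte values, as stated in the thread


def brute_force_checks(n_from, n_to):
    """Distance checks needed when every point is compared with every point."""
    return n_from * n_to


def leaf_box_bytes(n_points):
    """Bytes for one box per point (assumption: leaf boxes only, no inner nodes)."""
    return n_points * BOX_BYTES


n = 4_000_000
print(brute_force_checks(n, n))       # 16000000000000
print(leaf_box_bytes(n) / 1e6, "MB")  # 192.0 MB
```

So without an index the nearest-point search needs on the order of 1.6e13 checks, while the leaf boxes alone already take about 192 MB for 4,000,000 points, which makes an index file of several hundred MB plausible once the rest of the tree is included.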



