[GRASS-dev] Re: [GRASS GIS] #516: v.extract slow on large datasets
GRASS GIS
trac at osgeo.org
Wed Mar 4 02:26:59 EST 2009
#516: v.extract slow on large datasets
--------------------------+-------------------------------------------------
Reporter: gisboa | Owner: grass-dev at lists.osgeo.org
Type: enhancement | Status: new
Priority: minor | Milestone: 6.4.0
Component: Vector | Version: unspecified
Resolution: | Keywords:
Platform: All | Cpu: All
--------------------------+-------------------------------------------------
Comment (by mmetz):
Replying to [ticket:516 gisboa]:
> Using v.extract on large datasets is incredibly slow. From a 3,000,000
> areas dataset I extracted the first 99 (id<100). It took 12 minutes to
> extract the geometries,
There are probably several reasons for this. The spatial index is built
from topology, which can take a while. The category index used to select
features is rather inefficient for large numbers of categories. Both of
these aspects are handled by the vector libs. v.extract itself also has
potential for speed improvement. Regarding the vector libs, changes to the
spatial index and the category index will only be done in grass7. Improving
v.extract is possible for grass6. I have some ideas, but I won't get to it
soon, and I don't know if anybody else will rewrite v.extract soon.
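
To illustrate why feature selection by category can scale badly, here is a
minimal sketch in Python. It is not GRASS code and makes no claim about how
the category index or v.extract is actually implemented. The function names
and the sample data are hypothetical. It only contrasts a linear membership
test per feature with a binary search in a sorted category list.

```python
from bisect import bisect_left


def select_linear(feature_cats, wanted):
    """Return indices of features whose category is in `wanted`.

    Each membership test scans a plain list, so the cost per feature
    grows with the number of requested categories.
    """
    wanted_list = list(wanted)
    return [i for i, cat in enumerate(feature_cats) if cat in wanted_list]


def select_sorted(feature_cats, wanted):
    """Same selection, using binary search in a sorted category list.

    The cost per feature grows only logarithmically with the number
    of requested categories.
    """
    wanted_sorted = sorted(set(wanted))
    selected = []
    for i, cat in enumerate(feature_cats):
        pos = bisect_left(wanted_sorted, cat)
        if pos < len(wanted_sorted) and wanted_sorted[pos] == cat:
            selected.append(i)
    return selected


# Hypothetical data: one category per feature, selection "id < 100".
feature_cats = [5, 150, 42, 99, 100, 7]
wanted = range(1, 100)

# Both strategies must select the same features.
assert select_linear(feature_cats, wanted) == select_sorted(feature_cats, wanted)
```

With a few million features, the difference between the two strategies
grows with the number of requested categories. Where the time actually goes
in v.extract would have to be confirmed by profiling.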
> after that it says 'writing attributes' for another 6 minutes. The pg
> process is a runner-up in top, consuming about 50% cpu time,
I think Glynn answered that in his comment to #513.
> Would this be another reason to implement the file based geometry index?
Probably yes. But that's not easy. There are "off-the-shelf" solutions for
that, but 1) someone needs to evaluate these solutions for their
suitability for grass, and 2) someone has to implement it.
> Maybe a few modules should be rewritten to perform a dedicated task on
> their own, instead of relying on others, if that makes it slow.
AFAICT, v.extract does not rely on other modules; it uses library
functions only. IMHO, modules should not bypass core libraries. If a
particular task is done inefficiently by the core libraries, these
libraries need to be improved. A workaround for a specific module would
only create a mess.
--
Ticket URL: <http://trac.osgeo.org/grass/ticket/516#comment:1>
GRASS GIS <http://grass.osgeo.org>