[GRASS-dev] large vector and indices, speed up db

Thu Feb 26 12:37:31 EST 2009

In response to:

>Hello,
>reading this thread, and being sometimes concerned with large vector
>files (associated with big related tables), I wonder if it's worth
>manually creating indexes (on cat field) : can it be an effective way to
>speed up queries, or is the kink elsewhere, at the geometric level (data
>handled by grass, not the linked DBMS) ?
>Thank you,
>Vincent

The problems with querying and topological operations must be at the
geometric level (e.g. clicking an area to see its attributes using v.what,
or cleaning operations). When working with PostgreSQL you don't have to
worry about the speed of simple select operations on a single column (like
the 'cat' column): it will return results within milliseconds, even from
tables with tens of millions of records. So when an operation that only
requires a simple select on 'cat' takes a lot of time, the slow part must
be 'determining the cat value'.
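
To make the point about single-column lookups concrete, here is a minimal
sketch. It uses Python's built-in sqlite3 module as a stand-in for
PostgreSQL so that it runs without a server, and it creates an index on
'cat' by hand. The table name (roads), the index name (roads_cat_idx) and
the row count are hypothetical, and the timing it prints says nothing
about any particular PostgreSQL setup.

```python
import sqlite3
import time

# Stand-in for the attribute table: SQLite instead of PostgreSQL, so the
# sketch runs without a server. Table and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (cat INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO roads VALUES (?, ?)",
    ((i, f"feature {i}") for i in range(1, 200001)),
)
# Manually created index on the key column, as asked about in the thread.
conn.execute("CREATE INDEX roads_cat_idx ON roads (cat)")
conn.commit()


def lookup(cat):
    """Return the attribute row for one category value, or None."""
    return conn.execute(
        "SELECT cat, label FROM roads WHERE cat = ?", (cat,)
    ).fetchone()


# Ask the planner how it would run the lookup. With the index in place it
# should report a search using roads_cat_idx rather than a full table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT cat, label FROM roads WHERE cat = ?",
    (150000,),
).fetchall()
plan_text = " ".join(str(step[-1]) for step in plan)

start = time.perf_counter()
row = lookup(150000)
elapsed_ms = (time.perf_counter() - start) * 1000

print(row)
print(f"lookup took {elapsed_ms:.3f} ms")
print(plan_text)
```

If the attribute lookup alone is this cheap, time spent in a slow query is
more likely going into finding the cat value from the geometry than into
the select itself.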

However, in the current implementation of some GRASS tools it seems to me
(correct me if I'm wrong) that there are a few operations where the
database is certainly the limiting factor. The most notable example is
renaming a large vector table, which is extremely slow. I noticed that
PostgreSQL is working very hard while it runs. Renaming the vector
directory, renaming the table in the database, and adjusting the contents
of the vector/<dataset>/dbln file should be sufficient. This should take a
split second, regardless of the size of the table. Doing this manually is
much faster than renaming with g.rename...
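
The three manual steps could be sketched roughly as follows. This is only
an illustration, not how g.rename works: it uses sqlite3 as a stand-in for
PostgreSQL, a temporary directory as a stand-in for a mapset, and it
ASSUMES that each dbln line holds whitespace-separated fields with the
table name in the second field. The function manual_rename and all names
in the demo (roads_old, roads_new, demo_db, demo_driver) are hypothetical.

```python
import os
import sqlite3
import tempfile


def manual_rename(mapset_dir, conn, old, new):
    """Rename a vector by hand: directory, table, then the dbln link."""
    vector_dir = os.path.join(mapset_dir, "vector")

    # 1. Rename the vector directory; no geometry data is copied.
    os.rename(os.path.join(vector_dir, old), os.path.join(vector_dir, new))

    # 2. Rename the attribute table; no rows are copied. Identifiers
    #    cannot be bound as parameters, so only pass trusted names here.
    conn.execute(f'ALTER TABLE "{old}" RENAME TO "{new}"')
    conn.commit()

    # 3. Point the dbln file at the new table name.
    #    ASSUMPTION: fields are whitespace-separated and the second field
    #    is the table name. Check a real dbln file before relying on this.
    dbln_path = os.path.join(vector_dir, new, "dbln")
    with open(dbln_path) as f:
        lines = f.read().splitlines()
    updated = []
    for line in lines:
        fields = line.split()
        if len(fields) > 1 and fields[1] == old:
            fields[1] = new
        updated.append(" ".join(fields))
    with open(dbln_path, "w") as f:
        f.write("\n".join(updated) + "\n")


# Demo on throwaway data: a fake mapset directory and an in-memory table.
mapset = tempfile.mkdtemp()
os.makedirs(os.path.join(mapset, "vector", "roads_old"))
with open(os.path.join(mapset, "vector", "roads_old", "dbln"), "w") as f:
    f.write("1 roads_old cat demo_db demo_driver\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads_old (cat INTEGER, label TEXT)")
conn.execute("INSERT INTO roads_old VALUES (1, 'a')")
conn.commit()

manual_rename(mapset, conn, "roads_old", "roads_new")

with open(os.path.join(mapset, "vector", "roads_new", "dbln")) as f:
    dbln_after = f.read()
print(dbln_after, end="")
```

None of the three steps touches the rows or the geometry, which is why the
manual route does not depend on the size of the table.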

I'm not sure about v.extract, which is also amazingly slow. Maybe it
regenerates cat IDs and must somehow keep the original information linked?

Kind regards,
Wouter