[GRASS5] Re: Feature-Request: Prompt for PostGIS-connection

Martin Pokorny martin at truffulatree.org
Wed Nov 5 18:32:43 EST 2003


On Fri, 3 Oct 2003, Radim Blazek wrote:
> On Thursday 02 October 2003 11:55, Stephan Holl wrote:
>> 	d.vect roads was really slow, but.... * PostGIS-Server is not on
>> 	the same maschine G57 is running, but inside a 10Mbit-LAN *
>> 	PostGIS-Server is 500Mhz-server Maybe this is a bottleneck?
> 
> No the problem is in GRASS, 1) GRASS uses random access (read line
> by line number), this is fast for native format but cannot be for
> RDBMS, as it must send query for each line and that is slow
> especially over network (send query and wait for respons for every
> line). I don't know how to solve this. 2) Currently, to read one
> line, GRASS performs more queries: - select type, NumPoints(),
> Dimension() - BEGIN (because binary cursors must be in transaction,
> IIRC) - DECLARE gcursor BINARY CURSOR FOR SELECT ASBINARY( "geom
> FROM table WHERE id = ...") - FETCH ALL in gcursor - CLOSE gcursor -
> COMMIT - SELECT field, cat FROM table WHERE id = ... this could be
> reduced, probably return to WKT and parse results in GRASS, so from
> 7 queries to 2, and PostGIS in GRASS will be faster in this ratio
> (7/2) (?). It was my stupid idea that binary cursor is be faster ;)
> 
> I am considering to either remove PostGIS support at all or do not
> officially support in 5.8 (or how they call it).

I've been working on this on and off for a few weekends now, and, in a
prototype implementation, I've been able to speed up PostGIS use, at
least in some cases. My benchmark has been to use v.in.ogr to import
layers from a TIGER file directly to a postgres database (both
attributes and geometry). Before my mods, importing the "pip" layer
(points only) took about 24 minutes for 26303 points. I've been able
to get the time down to around 2 1/2 minutes with various
modifications to read_post.c, write_post.c, and open_post.c, together
with the addition of some PL/pgSQL functions, and small mods to
v.in.ogr/main.c.

Other results are below. For reference, importing the CompleteChain
layer alone previously took about 6 hours, now it's 11 minutes. The
other example, I never had the patience to complete a test, but it
would have been at least 6 hours, and now it's 25 minutes.

I plan on doing more testing, but I need some advice, too (I haven't
done much with GRASS programming before). One tactic I use to speed up
reading is to buffer the results of a single query that encompasses
many features at a time. However, I need to vary the size of the
buffer depending on whether access is largely sequential (such as when
the features are first imported) or random (such as when building
topology). My approach appears to require two, somewhat global
modifications: 1) an abstract buffer size (or "access pattern")
setting function for all db types, and 2) use of this function in
several modules, such as v.in.ogr and v.clean. One possibility is to
set the default buffer size assuming a random access pattern (i.e,
small), and explicitly set the larger buffer only in the v.in.*
modules, but there are others. I don't have a good picture of the
access patterns yet, so it's hard for me to decide. Ideas, clues or
hints, anyone?

Cheers,
  Martin

-------------- next part --------------
time v.in.ogr dsn=../../../../data/pima/tgr04019 output=ppp layer=PIP
Layer: PIP
-----------------------------------------------------
26303 primitives registered      
0 areas built      8%
0 isles built
Number of nodes     :   26303         
Number of primitives:   26303
Number of points    :   26303
Number of lines     :   0
Number of boundaries:   0
Number of centroids :   0
Number of areas     :   0
Number of isles     :   0

real    2m21.927s
user    0m11.650s
sys     0m2.640s

time v.in.ogr dsn=../../../../data/pima/tgr04019 output=ccc layer=CompleteChain
Layer: CompleteChain
-----------------------------------------------------
93236 primitives registered      
0 areas built      8%
0 isles built
Number of nodes     :   67043         
Number of primitives:   93236
Number of points    :   0
Number of lines     :   93236
Number of boundaries:   0
Number of centroids :   0
Number of areas     :   0
Number of isles     :   0

real    10m58.132s
user    1m2.470s
sys     0m12.600s

time v.in.ogr -c dsn=../../../../data/pima/tgr04019
layer=CompleteChain,PIP output=tgr type=boundary,centroid snap=-1
Layer: CompleteChain
Layer: PIP
-----------------------------------------------------
119539 primitives registered
21768 areas built
95 isles built
Number of nodes     :   93345         
Number of primitives:   119539
Number of points    :   0
Number of lines     :   0
Number of boundaries:   93236
Number of centroids :   26303
Number of areas     :   21768
Number of isles     :   95
Number of incorrect boundaries   :   58178
Number of centroids outside area :   4535
Number of duplicate centroids    :   1
Number of areas without centroid :   1

real    24m50.441s
user    1m42.810s
sys     0m24.180s
-------------- next part --------------

-- 
Martin Pokorny
Tucson, AZ, USA


More information about the grass-dev mailing list