[postgis-users] Performance question (how many years will this take?)

Gregory S. Williamson gsw at globexplorer.com
Mon May 10 16:04:50 PDT 2004


What is your sort_mem setting? If it is the default of 1024 (kB, i.e. 1 MB), that might be hindering performance some. If pgsql_tmp directories are showing up, that would be indicative of sorts spilling to disk, I think. If there's only a minimal number of processes connecting, I'd bump this number way up, depending on the answers to the following (see the sketch after these questions):

What are the other settings for your postgres installation (shared_buffers, max_fsm_pages, etc.)?

Have you updated statistics recently? Both the regular postgres statistics, if you have any columns being used to retrieve data, and the GIS stats.
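To make those concrete, a sketch of the sort of thing I mean. Every number here is purely illustrative (parameter names are the 7.x ones), and the update_geometry_stats() call is per the PostGIS docs of the day; it is unneeded if your build gathers geometry stats through plain ANALYZE:

    # postgresql.conf, guesses for a 1 GB box -- tune to taste
    shared_buffers = 16384          # 8 kB pages, so about 128 MB
    sort_mem = 32768                # kB per sort; beware with many connections
    max_fsm_pages = 200000
    effective_cache_size = 65536    # 8 kB pages, so about 512 MB

and then, before the big query:

    SET sort_mem = 65536;           -- or raise it for this session only; unit is kB
    VACUUM ANALYZE uniq_event;
    VACUUM ANALYZE chat_cells3;
    SELECT update_geometry_stats('uniq_event', 'the_geom');
    SELECT update_geometry_stats('chat_cells3', 'the_geom');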

Still, it sounds like you will absolutely have to read every row of line data at least once, and probably every row of polygon data at least once, which is a lot of disk I/O.

Have you tried running the query with an EXPLAIN ANALYZE?
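One more thing worth checking there: intersects() on its own is just a function call, so the planner cannot use the GiST indexes for it. The usual idiom is to pair it with the && bounding-box operator, which is index-assisted, and let intersects() do the exact test only on the candidates that survive. A sketch against the tables in your query (plain EXPLAIN returns immediately, so you can confirm the index is being used before committing to more days of runtime):

    EXPLAIN
    SELECT 1
      FROM uniq_event, chat_cells3
     WHERE uniq_event.the_geom && chat_cells3.the_geom            -- bbox test, uses GiST
       AND intersects(uniq_event.the_geom, chat_cells3.the_geom); -- exact test on survivors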

Greg Williamson
DBA
GlobeXplorer LLC

(And you'd hoped for help; all I can offer is more questions! But the answers may inform people who know more about PostGIS than I do, e.g. the Refractions folks and heavier users of it than I am.)
 

-----Original Message-----
From: Wood Brent [mailto:pcreso at pcreso.com]
Sent: Monday, May 10, 2004 3:56 PM
To: PostGIS Users Discussion
Subject: [postgis-users] Performance question (how many years will this take?)



Hi,

I have two tables:

one with about 200,000 records; geometry is a straight line (LINESTRING with 2
   coord pairs)

t'other has about 50,000 cells (POLYGON with 4 vertices, or 5 if you count the
repeat of the first point as the last to close the polygon)

Both have the geometry column indexed.
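Both were created along the usual lines, schematically (the index names here are illustrative; gist_geometry_ops is the standard PostGIS opclass):

    CREATE INDEX uniq_event_geom_idx
        ON uniq_event USING gist (the_geom gist_geometry_ops);
    CREATE INDEX chat_cells3_geom_idx
        ON chat_cells3 USING gist (the_geom gist_geometry_ops);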

My hardware is a P4 2400 with 1 GB memory.

I'm running a query to carry out a spatial overlay of the two tables. Each line
feature crosses (typically) 10 cells.

select into...
...
...
where intersects(uniq_event.the_geom, chat_cells3.the_geom);
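Schematically the full statement is shaped like this, where the target table and column list are placeholders rather than the real ones:

    SELECT uniq_event.*, chat_cells3.*
      INTO overlay_result
      FROM uniq_event, chat_cells3
     WHERE intersects(uniq_event.the_geom, chat_cells3.the_geom);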

This has been running for 4 days so far. Very CPU-bound, minimal swap use.

I'm not surprised it has taken a while, with 10,000,000,000 (200,000 x 50,000)
comparisons to make.

top shows:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13085 postgres  15   0 13368  13m  11m R 41.4  1.5   2616:44 postmaster
 2431 root      14   0   632  504  504 R 40.7  0.1   8043:02 ncpmount


Is there some tweaking I can do to reduce the ncpmount workload? As far as I
know this is related to the pgsql running, as I'm not doing a lot else on this
system right now. This seems to be the process taking the most time, about 3x
the postmaster in accumulated time.

Is there any way to optimise this further? I would expect that manually
"boxing" the data into regional subsets would not be much help, since the GiST
indices already pretty much do this automatically.

A rough calculation (assuming 10 checks/sec) gives about 30 years
(10,000,000,000 / 10 / 60 / 60 / 24 / 365). The GiST index will possibly cut
this down enormously, but I'm not sure I'm not asking a bit much of things.

Advice welcome :-)


Brent Wood




