[postgis-users] Performance question (how many years will this take?)

Wood Brent pcreso at pcreso.com
Mon May 10 15:56:18 PDT 2004


Hi,

I have two tables:

one with about 200,000 records, where the geometry is a straight line
(LINESTRING with 2 coord pairs)

the other has about 50,000 cells (POLYGON with 4 vertices, or 5 if you count
the repeat of the first vertex as the last to close the polygon)

Both have the geometry column indexed.
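
For reference, the indexes were built along these lines, per the PostGIS docs
(the index names here are just illustrative):

  create index uniq_event_gist on uniq_event
    using gist (the_geom gist_geometry_ops);
  create index chat_cells3_gist on chat_cells3
    using gist (the_geom gist_geometry_ops);
  vacuum analyze uniq_event;    -- refresh planner statistics
  vacuum analyze chat_cells3;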

My hardware is a P4 2400 with 1 GB of memory.

I'm running a query to carry out a spatial overlay of the two tables. Each line
feature crosses (typically) 10 cells.

select into...
...
...
where intersects(uniq_event.the_geom, chat_cells3.the_geom);

This has been running for 4 days so far. It is very CPU-bound, with minimal
swap use.

I'm not surprised it has taken a while, with 10,000,000,000
(200,000 x 50,000) comparisons to make.
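
(For what it's worth, prefixing a plain select version of the query with
EXPLAIN should show whether the planner even considers the GiST indexes, e.g.:

  explain select ...
    from uniq_event, chat_cells3
   where intersects(uniq_event.the_geom, chat_cells3.the_geom);

where I'm guessing the from clause from the where clause above.)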

top shows:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13085 postgres  15   0 13368  13m  11m R 41.4  1.5   2616:44 postmaster
 2431 root      14   0   632  504  504 R 40.7  0.1   8043:02 ncpmount


Is there some tweaking I can do to reduce the ncpmount workload? As far as I
know this is related to the pgsql run, since I'm not doing much else on this
system right now. It seems to be the process taking the most time, having
accumulated about 3x the postmaster's CPU time.

Is there any way to optimise this further? I would expect that manually
"boxing" the data into regional subsets would not help much, since the GiST
indices should already be doing this pretty much automatically.
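
One thought, though: as I understand it, the planner can only reach a GiST
index via the && (bounding-box overlap) operator, not via the intersects()
function itself, so a where clause along these lines might behave quite
differently, with && doing the coarse index filter and intersects() the exact
test:

  where uniq_event.the_geom && chat_cells3.the_geom
    and intersects(uniq_event.the_geom, chat_cells3.the_geom);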

A rough calculation (assuming 10 checks/sec) gives
(10,000,000,000 / 10 / 60 / 60 / 24 / 365) about 30 years. The GiST index will
possibly cut this down enormously, but I suspect I may be asking a bit much of
things.
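
(Conversely, if the index did all the coarse filtering, so each line only had
to be exactly tested against the ~10 cells it actually crosses, that would be
roughly:

  200,000 lines x 10 cells = 2,000,000 intersects() checks
  2,000,000 checks / 10 per sec = 200,000 sec, i.e. a bit over 2 days

so whether the index kicks in makes essentially all the difference.)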

Advice welcome :-)


Brent Wood