[postgis-users] Performance question (how many years will this take?)
Wood Brent
pcreso at pcreso.com
Mon May 10 15:56:18 PDT 2004
Hi,
I have two tables:
one with about 200,000 records, each geometry a straight line (a
LINESTRING with 2 coordinate pairs);
t'other has about 50,000 cells (a POLYGON with 4 vertices, 5 if you
count the repeat of the first point as the last to close the ring).
Both have the geometry column indexed.
My hardware is a P4 2400 (2.4 GHz) with 1 GB of memory.
I'm running a query to carry out a spatial overlay of the two tables. Each
line feature typically crosses about 10 cells.
select into...
...
...
where intersects(uniq_event.the_geom, chat_cells3.the_geom);
This has been running for 4 days so far. It is very CPU bound, with minimal
swap use. I'm not surprised it has taken a while: a naive join means
200,000 x 50,000 = 10,000,000,000 comparisons.
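One thing I'm wondering about: should the query include an explicit
bounding-box test so the GiST index can filter pairs before the exact
intersects() check? A sketch of what I mean (the id columns and the
result table name here are made up for illustration):

SELECT uniq_event.gid  AS line_id,    -- hypothetical id columns
       chat_cells3.gid AS cell_id
  INTO line_cell_overlay              -- hypothetical result table
  FROM uniq_event, chat_cells3
 WHERE uniq_event.the_geom && chat_cells3.the_geom             -- bbox test, can use GiST
   AND intersects(uniq_event.the_geom, chat_cells3.the_geom);  -- exact geometry test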
top shows:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
13085 postgres  15   0 13368  13m  11m R 41.4  1.5  2616:44 postmaster
 2431 root      14   0   632  504  504 R 40.7  0.1  8043:02 ncpmount
Is there some tweaking I can do to reduce the ncpmount workload? As far as I
know it is related to the pgsql run, as I'm not doing much else on this
system right now. It is the process with the most accumulated time, about 3x
the postmaster.
Is there any way to optimise this further? I would expect that manually
"boxing" the data into regional subsets would not help much, since the GiST
indices already do this pretty much automatically.
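One thing I haven't done yet is look at the query plan. If I understand
EXPLAIN correctly, something like this should show whether the indices
are being considered at all (again a sketch, reusing the tables and the
&& test from above):

EXPLAIN
SELECT count(*)
  FROM uniq_event, chat_cells3
 WHERE uniq_event.the_geom && chat_cells3.the_geom
   AND intersects(uniq_event.the_geom, chat_cells3.the_geom);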
A rough calculation (assuming 10 checks/sec) gives
(10,000,000,000 / 10 / 60 / 60 / 24 / 365), about 30 years. The GiST index
should cut this down enormously: if each line only needs exact tests against
the ~10 cells it actually crosses, that is about 200,000 x 10 = 2,000,000
intersects() calls. But I'm not sure I'm not asking a bit much of things.
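Spelling out the worst case:

200,000 lines x 50,000 cells       = 10,000,000,000 pairs
10,000,000,000 pairs / 10 per sec  = 1,000,000,000 sec
1,000,000,000 sec / (60*60*24*365) ~= 31.7 years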
Advice welcome :-)
Brent Wood