[postgis-users] Performance question (how many years will this take?)
Paul Ramsey
pramsey at refractions.net
Mon May 10 18:46:44 PDT 2004
Wood Brent wrote:
> --- Paul Ramsey <pramsey at refractions.net> wrote:
>
>>Wood Brent wrote:
>>
>>
>>>select into...
>>>...
>>>...
>>>where intersects(uniq_event.the_geom, chat_cells3.the_geom);
>>
>>Is that really your where clause? You want this:
>>
>> where intersects(uniq_event.the_geom, chat_cells3.the_geom)
>> and uniq_event.the_geom && chat_cells3.the_geom
>>
>>So that the spatial index cuts down on the number of comparisons that
>>have to be made with the expensive exact intersects() function.
>
> I assume from this that _intersects_ does NOT do a pre-check with the GiST
> bounding box before the actual polygon check?
No, no functions do. If you want to use the index, you *must* use the &&
operator. (Or one of the other operators, please don't.)
> What I have now done is add a
> "where cell_id >= $START_ID and $CELL_ID < $END_ID"
> into the sql, and wrap it up in a script which starts at cell_id of 0 and
> increments in batches of 100 cells per iteration until it gets to the maximum
> cell_id. (note: there is a unique index on cell_id which is an integer attr)
Iterating outside the database seems to be incredibly efficient. I did 3
million fairly complex iteractions (pulling 10-200 records each and
analyzing them for time series properties) against a 50 million record
table, with run times of under an hour for the whole analysis. And the
program was in perl (like all great programs).
> This seems faster & is running without the postmaster process swapping out.
> So I think this will work OK, it's about 30% done already, while I wrote this.
Good stuff.
P.
More information about the postgis-users
mailing list