[postgis-users] RE: Question: How can I improve the performance of the function DISTANCE_SPHERE?

Obe, Regina robe.dnd at cityofboston.gov
Mon Sep 24 09:37:40 PDT 2007


What area does your data cover?  Does it cover the whole world or just a
limited region like a country?  If you can transform to a meter based
projection, that would be the most efficient.  Taht won't work
unfortunately if you need to cover the whole world.  In which case you
may want to dissect your data into different quadrants and project each
separately.
 
Also would be helpful if you could show us the Explain analyze of your
query.
 
something like
 
EXPLAIN SELECT ...   if explain analyze takes too long.
 
or 
 
EXPLAIN ANALYZE SELECT .....
 
 
So we can see what indexes it is using etc.  EXPLAIN Analyze would be
more informative, but sounds like it would take too long.
 
Hope that helps,
Regina


________________________________

From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
Reichle, Florian
Sent: Monday, September 24, 2007 12:19 PM
To: postgis-users at postgis.refractions.net
Subject: RE: Question: How can I improve the performance of the function
DISTANCE_SPHERE?



Hi,

i have tried this at my DB, but the query is even slow:

SELECT distinct point.field1, point.field2
FROM tab_point point LEFT JOIN
            (SELECT point.field1 as pfield1, point.field2 as pfield2
FROM tab_point point, tab_line line WHERE
 Expand(line.the_geom, 0.01)  && point.the_geom  AND
(DISTANCE_SPHEROID(point.the_geom, StartPoint(line.the_geom),
'SPHEROID["WGS 84",6378137,298.257223563]') < 50.00 OR
DISTANCE_SPHEROID(point.the_geom, Centroid(line.the_geom),
'SPHEROID["WGS 84",6378137,298.257223563]') < 50.00 OR
DISTANCE_SPHEROID(point.the_geom, EndPoint(line.the_geom),
'SPHEROID["WGS 84",6378137,298.257223563]') < 50.00) ) As ce 
           ON (ce.pfield1 = point.field1 and ce.pfield2 = point.field2)
WHERE ce.pfield1 IS NULL;

I'm thinking you are looking for
"All points whose closest line is > 50 meters away" which doesn't
require a slow cartesian product. --> Yes, that's right. I' am searching
for all points whose closest line is  > 50 meters. But how is the trick
do not use the
cartesian product?

I have many datas in the DB and it takes moren than 10 hours. 


Date: Mon, 17 Sep 2007 15:38:08 -0400
From: "Obe, Regina" <robe.dnd at cityofboston.gov>
Subject: RE: [postgis-users] Question: How can I improve the
        performance of  thefunction DISTANCE_SPHERE?
To: "PostGIS Users Discussion" <postgis-users at postgis.refractions.net>
Message-ID:
        <53F9CF533E1AA14EA1F8C5C08ABC08D201979F8A at ZDND.DND.boston.cob>
Content-Type: text/plain; charset="iso-8859-1"

Florian,

Two points
1) You have a cartesian product here which is very slow with the
tab_point point, tab_line line but it sounds like from your description
of your desired result that may be intentional but its hard to tell.
Just thought I would point it out.

I'm thinking you are looking for
"All points whose closest line is > 50 meters away" which doesn't
require a slow cartesian product.

But I could read your question a couple of ways.

2)
In general NOT IN is intuitive but much slower than doing a left join.
Also I think you want to  do a compound check.  Try doing the following
instead


The below query should give you "all points who are > 50 meters away
from the closest starting point the psuedo closest line".  Note I added
an Expand of a percent of a degree because it is not guaranteed your
point will be in the bounding box of the line and can still be within 50
meters but take that out if you want.  Also just going by the startpoint
is faulty - it might be better to use Centroid() instead of startpoint
or you could do (centroid or startpoint or endpoint) combination check -
see the second example below. 

If you could transform to a non-degree projection, that would be a lot
more efficient too and then you can use distance which would give you
distance from closest point on the line. 

SELECT distinct point.field1, point.field2
FROM tab_point point LEFT JOIN
            (SELECT point.field1 as pfield1, point.field2 as pfield2
FROM tab_point point, tab_line line WHERE
 Expand(line.the_geom, 0.01)  && point.the_geom  AND
DISTANCE_SPHEROID(point.the_geom, StartPoint(line.the_geom),
'SPHEROID["WGS 84",6378137,298.257223563]') < 50.00)  As ce 
           ON (ce.pfield1 = point.field1 and ce.pfield2 = point.field2)
WHERE ce.pfield1 IS NULL;

---What I mean by combination check

SELECT distinct point.field1, point.field2
FROM tab_point point LEFT JOIN
            (SELECT point.field1 as pfield1, point.field2 as pfield2
FROM tab_point point, tab_line line WHERE
 Expand(line.the_geom, 0.01)  && point.the_geom  AND
(DISTANCE_SPHEROID(point.the_geom, StartPoint(line.the_geom),
'SPHEROID["WGS 84",6378137,298.257223563]') < 50.00 OR
DISTANCE_SPHEROID(point.the_geom, Centroid(line.the_geom),
'SPHEROID["WGS 84",6378137,298.257223563]') < 50.00 OR
DISTANCE_SPHEROID(point.the_geom, EndPoint(line.the_geom),
'SPHEROID["WGS 84",6378137,298.257223563]') < 50.00) ) As ce 
           ON (ce.pfield1 = point.field1 and ce.pfield2 = point.field2)
WHERE ce.pfield1 IS NULL;



Caveat - the above examples wors reliably only if field1 and field2 are
never null of your point table are never null. It would be better to use
the primary key of the table.




________________________________

From: postgis-users-bounces at postgis.refractions.net on behalf of
Reichle, Florian
Sent: Mon 9/17/2007 11:04 AM
To: postgis-users at postgis.refractions.net
Subject: [postgis-users] Question: How can I improve the performance of
thefunction DISTANCE_SPHERE?



Hi,

I use the function DISTANCE_SPHERE to get a metric unit how far is a
point away from a polygon. I have two tables, one for the points and one
for the polygons. The polygons are MULTILINES.
The result of the query must be a list of all points, which are to far a
away from any polygon. So I reduce in the first step the list with &&
which drop all points which are intersects with the geom of any polygon.
After that i must proof if the distance of the rest to far away.
Therefor I use the DISTANCE_SPHERE function. But at here I have the
problem with the speed of the query, because the function take to much
time to compare alle elements in the tables. Have someone any better
idea?

Here my example:

SELECT distinct point.field1, point.field2, line.field1, line.field2
FROM tab_point point, tab_line line WHERE
point.field1 NOT IN (SELECT distinct point.field1, point.field2,
line.field1, line.field2 FROM tab_point point, tab_line line WHERE
point.the_geom && line.the_geom AND
DISTANCE_SPHEROID(point.the_geom, StartPoint(line.the_geom),
'SPHEROID["WGS 84",6378137,298.257223563]') < 50.00 GROUP BY
point.field1, point.field2, line.field1, line.field2 from );

Greetz




-----------------------------------------
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20070924/b581c2b8/attachment.html>


More information about the postgis-users mailing list