[postgis-users] point-in-polygon SQL sentence performance

Obe, Regina robe.dnd at cityofboston.gov
Mon Jun 30 05:56:56 PDT 2008


Pere,

Somehow I think your whole code is needlessly complicated, but I still
don't have a clear idea what you are trying to do and looking at your
code is actually more confusing than anything.  Maybe try to restate
exactly what you are trying to do minus the code.  

I'm guessing you are trying to do this (by the way wrapping a (SELECT
...) like that is very non-standard)

INSERT INTO user_polygon (userid, polygon_code,the_geom) 
SELECT 
distinct(polygon_code),the_geom FROM polygon where polygon."."the_geom
&&
GeometryFromText('".$row['astext']."',4326);


1) Are you using inherited tables and having a different table for each
user?  I'm assuming that is why you are doing

UPDATE ONLY 

rather than UPDATE  (UPDATE ONLY really only makes sense if you are
using inherited tables and don't want the update to cascade to child
tables, but then if its inherited - its still silly since you are just
updating child tables anyway)


2) Where is that $user coming from.  Make sure you know where that $user
is coming from and you properly sanitize it otherwise its a prime target
for SQL Injection attack.

3) How big is your user_polygon table?  Speed would depend on that

 - this particular statement you have is inefficient - because it is
either doing a correlated subselect for every record or its updating
every record to the same value. Since you are not prefixing things - its
hard for me to tell what exactly this is doing.  This I suppose would
make sense if you are creating a different table for each user

UPDATE ONLY user_polygon SET numreg = (SELECT COUNT (code) FROM
user_points
WHERE name = polygon_code);

but its generally more efficient to write a sub select

UPDATE user_polygon SET numreg = u.cnt
FROM (SELECT COUNT(up.code) as cnt, up.name
		FROM user_point up
		GROUP BY up.name) u
WHERE u.name = user_polygon.polygon_code;


3) I have no idea what the point of the first bounding box insert you
have is for 
since ST_intersects, ST_Contains already have bounding box checks in
them.  So I suspect this step you have is is not necessary

For this part

SELECT
distinct(polygon_code),the_geom FROM polygon where polygon."."the_geom
&&
GeometryFromText('".$row['astext']."',4326)


4) Again this statement
UPDATE ONLY user_polygon SET userid ='$user';

Is updating every record in user_polygon to the current user.  So if you
are looping thru a user set which I can only guess you are, this
statement is overwritting the previous updates.

If again you have one table per user, then this is fine - but its
cheaper to do everything in your insert rather than doing an update
after the fact.  Updates tend to be slower than inserts.

Hope that helps a bit,
Regina




 

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of pere
roca
Sent: Monday, June 30, 2008 4:25 AM
To: postgis-users at postgis.refractions.net
Subject: [postgis-users] point-in-polygon SQL sentence performance


hi!

I wanna make a point-in-polygon operation but takes looong time (about 4
minutes!) and the CPU consuming of the server
 becomes huge when performing this operation for 15000 points /
polygons:

I need to count how many user points fall in each polygon and perform
some
operations based on that. 
We initially have a point_table that stores multi-user point data. We
extract from there only this-user-data and create new tables:
user_points
and user_polygon. To the first one we will insert the code of the
polygon in
which it falls; counting how many many times the same polygon_code is
repeated, we will calculate number-points_in_polygon...
The important code is commented below (calling SQL from PHP):

//we use the points bbox to narrow the future spatial queries (will only
select polygons falling inside this bbox; from 1991 initial polygons,
only
85 are from now on used); it is useful when points are close each other
(if
there is an outlayer then bbox is not useful...)

$extent="select astext(extent(points_table.the_geom)) from points_table
where userid='$user'";
$extent_result=pg_query($extent); 

while ($row = pg_fetch_array($extent_result, NULL, PGSQL_ASSOC)) 
{
//creates a NEW polygon table that will store only polygon code and
polygon
geometry that fall INSIDE the point's bbox
   $poligon_sql2="INSERT INTO user_polygon (polygon_code,the_geom)
(SELECT
distinct(polygon_code),the_geom FROM polygon where polygon."."the_geom
&&
GeometryFromText('".$row['astext']."',4326));
UPDATE ONLY user_polygon SET userid ='$user';

//we previously created a new user point table, where we insert some
info.
The table is created because we are taking data from a multi-user table
and
now we want a single user-->a single point table

INSERT INTO user_points (genus, species, name, code, usuario, the_geom)
(SELECT genus, specie, codigo, oid, userid, the_geom FROM points WHERE
userid = '$user');

//in this user POINT table we set the code from the polygon's table
where
the point falls inside; 
//I expected this one to be the "big" operation , but it's not (takes
"only"
1 min 18 sec)

UPDATE ONLY user_points SET code = (SELECT polygon_code FROM
user_polygon
WHERE user_polygon.the_geom && user_points.the_geom AND CONTAINS
(user_polygon.the_geom,user_points.the_geom));

//updating user polygon table, counting how many times do we have a
polygon
code in our POINT table --> number of points in each polygon
// this updating operation takes 1 min 20 seconds! 

UPDATE ONLY user_polygon SET numreg = (SELECT COUNT (code) FROM
user_points
WHERE name = polygon_code);

// this updating operation also takes about 1 min 20 seconds! 

UPDATE ONLY user_polygon SET numtax = (SELECT COUNT (genus) FROM (SELECT
DISTINCT (genus) FROM user_points WHERE code = polygon_code) AS foo); 

   All important data is indexed; Any ideas to get better performance?

   thanks!
 
   Pere Roca
   EDIT project (http://edit.csic.es/edit_geo/prototype/edit.html)


-- 
View this message in context:
http://www.nabble.com/point-in-polygon-SQL-sentence-performance-tp181908
54p18190854.html
Sent from the PostGIS - User mailing list archive at Nabble.com.

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users
-----------------------------------------
The substance of this message, including any attachments, may be
confidential, legally privileged and/or exempt from disclosure
pursuant to Massachusetts law. It is intended
solely for the addressee. If you received this in error, please
contact the sender and delete the material from any computer.




More information about the postgis-users mailing list