[postgis-users] Performance external loop, WHERE IN (values), stored procedure or else to pass data to SQL query

Michal Seidl michal.seidl at gmail.com
Tue Jun 14 03:53:08 PDT 2016


Hello,
You understand my problem correctly. But even though I have a spatial 
index on the geometry column (the points are 2D), a single SQL query 
takes about 20 ms only with the && operator 
(ST_GeomFromText('POINT...') && geom_column). As far as I know, this 
operator compares only MBRs (which for points should be the same as 
equality).

With the other operators (ST_Distance, ST_Equals, =) the spatial index 
is not used (?) and a query takes about 500 ms. 500 points x 500 ms => 
250 s, which is quite a lot, and that does not count the time spent on 
the Python side. That's the problem.
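For what it's worth, the fast && check can be batched into a single 
round-trip by passing the whole Python array as one parameter and 
unnesting it server-side, roughly along the lines Leknín sketched below. 
A minimal sketch (the table name `points`, column `geom`, and SRID 4326 
are assumptions; only the query construction is shown, the driver call 
is left out):

```python
# Sketch: build ONE query that checks all candidate points at once,
# instead of one round-trip per point. Table name "points", geometry
# column "geom", and SRID 4326 are assumptions for illustration.

def build_batch_query(pts):
    """Return (sql, params) matching pts against points.geom using &&.

    && compares bounding boxes, so for 2D points it behaves like an
    equality test and can be answered from the GiST index.
    """
    # EWKT literals; the whole list is passed as ONE array parameter,
    # so a driver such as psycopg2 makes a single round-trip.
    ewkt = ["SRID=4326;POINT(%s %s)" % (x, y) for x, y in pts]
    sql = (
        "WITH cand AS (SELECT unnest(%s::geometry[]) AS pnt) "
        "SELECT ST_AsText(c.pnt) "
        "FROM cand c JOIN points p ON p.geom && c.pnt"
    )
    return sql, (ewkt,)

sql, params = build_batch_query([(1.0, 2.0), (3.5, 4.5)])
# then e.g.: cursor.execute(sql, params)
```

With psycopg2, a Python list adapts to a Postgres array, so the 500 
candidate points travel in one statement and the planner can use the 
spatial index for every && comparison.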

Thanks Michal

On 06/14/2016 10:36 AM, Rémi Cura wrote:
> Hey,
> I find it difficult to understand your problem exactly.
> If you have a set of points A in Python, of about 100 points,
> and a set of points B in Postgres, of a few million points,
> and you want to find which points of A are already in B,
> you wouldn't need to import your Python points into a temp Postgres
> table and index it.
>
> If your points are simple (for instance, only 2D, each dimension an int),
> you may be able to skip PostGIS altogether.
> Checking 100 points against a few million should take <10 ms with indexes.
>
> You can check ta
>
>
> 2016-06-13 20:46 GMT+02:00 Leknín Řepánek <godzilalalala at gmail.com>:
>
>     Python Array -> geometry(POINT, [SRID])[] -> unnest, you can use unnest
>     in CTE
>     something like
>
>     WITH cte AS (
>              SELECT unnest(array['POINT(1 1)'::geometry ......]) pnt
>     )
>     SELECT * FROM t JOIN cte ON ST_....
>
>
>     Millions of points aren't big data; if it is slow, maybe you should
>     check your indexes.
>
>     On Mon, Jun 13, 2016 at 08:08:34AM -0700, Michal Seidl wrote:
>      > Hello,
>      > I have probably a simple question, but I am stuck. I have a table
>      > with about 1E6 points. I have a few hundred points in a Python
>      > array. How do I pass these points into a PostGIS SQL query to
>      > check them against the table (whether the same point exists) with
>      > millions of points?
>      >
>      > I started with a Python loop checking every point individually,
>      > but it is slow. I have found lots of recipes for dealing with
>      > large data, but they assume the data is already in the database.
>      >
>      > Which way is good for this situation?
>      > - create a temporary table and insert the points into the DB first
>      > - keep the Python connection open and send a fast SQL query for
>      >   each point
>      > - push all points in one query? How?
>      > - write a stored procedure that accepts an array of data?
>      > - anything else?
>      >
>      > Maybe it is a more general SQL question, but I am working with
>      > points.
>      >
>      > Thanks for the advice, Michal
>      >
>      >
>      >
>      > --
>      > View this message in context:
>     http://postgis.17.x6.nabble.com/Performance-external-loop-WHERE-IN-values-stored-procedure-or-else-to-pass-data-to-SQL-query-tp5010138.html
>      > Sent from the PostGIS - User mailing list archive at Nabble.com.
>      > _______________________________________________
>      > postgis-users mailing list
>      > postgis-users at lists.osgeo.org
>      > http://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
>
>

