[postgis-users] Performance external loop, WHERE IN (values), stored procedure or else to pass data to SQL query
Michal Seidl
michal.seidl at gmail.com
Tue Jun 14 03:53:08 PDT 2016
Hello,
You understand my problem correctly. But although I have a spatial index
on the geometry column (the points are 2D), one SQL query takes about
20 ms, and only with the && operator: ST_GeomFromText('POINT..') &&
geom_column. As far as I know this operator works only on the MBR
(which for points should amount to exact equality).
With the other operators (ST_Distance, ST_Equals, =) the spatial index
is not used (?) and a query takes about 500 ms. 500 points x 500 ms =>
250 s, which is quite a lot, and that does not count the time spent on
the Python side.
That's the problem.
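(For reference, the standard pattern that keeps the index while still
testing exact equality is to combine the two tests; a minimal sketch,
with placeholder table and column names:

-- && alone is only a bounding-box test, but it can use the GiST index;
-- ST_Equals then confirms exact equality on the few bbox matches:
SELECT id
FROM points_table
WHERE geom && ST_GeomFromText('POINT(10 20)')
  AND ST_Equals(geom, ST_GeomFromText('POINT(10 20)'));
)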
Thanks Michal
On 06/14/2016 10:36 AM, Rémi Cura wrote:
> Hey,
> I find it difficult to understand your problem exactly.
> If you have a set of points A in Python, of about 100 points,
> and a set of points B in Postgres, of a few million points,
> and you want to find which points of A are already in B,
> you wouldn't need to import your Python points into a temp Postgres
> table and index it.
>
> If your points are simple (for instance, only 2D, with each dimension
> an int), you may skip PostGIS altogether.
> Checking 100 points against a few million should be < 10 ms with
> indexes.
>
> You can check ta
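> As a minimal sketch of that non-PostGIS variant (table and column
> names are assumptions), a plain btree index is enough:
>
> -- composite btree index on the raw integer coordinates
> CREATE INDEX ON big_points (x, y);
>
> -- check a literal list of candidate points in one round trip
> SELECT v.x, v.y
> FROM (VALUES (1, 1), (2, 3), (5, 8)) AS v(x, y)
> JOIN big_points b ON b.x = v.x AND b.y = v.y;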
>
>
> 2016-06-13 20:46 GMT+02:00 Leknín Řepánek <godzilalalala at gmail.com>:
>
> Python array -> geometry(POINT, [SRID])[] -> unnest; you can use
> unnest in a CTE,
> something like:
>
> WITH cte AS (
>     SELECT unnest(ARRAY['POINT(1 1)'::geometry, ...]) AS pnt
> )
> SELECT * FROM t JOIN cte ON ST_....
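> A fuller sketch of the same idea, assuming a table t with a
> GiST-indexed geometry column geom (the point literals are
> placeholders):
>
> WITH cte AS (
>     SELECT unnest(ARRAY['POINT(1 1)'::geometry,
>                         'POINT(2 2)'::geometry]) AS pnt
> )
> SELECT t.*
> FROM t
> JOIN cte
>   ON t.geom && cte.pnt            -- bbox test, can use the GiST index
>  AND ST_Equals(t.geom, cte.pnt);  -- exact match on the bbox hits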
>
>
> Millions of points aren't big data; if it is slow, maybe you should
> check your indexes.
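> For example (table and column names assumed):
>
> CREATE INDEX t_geom_gix ON t USING GIST (geom);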
>
> On Mon, Jun 13, 2016 at 08:08:34AM -0700, Michal Seidl wrote:
> > Hello,
> > I have probably a simple question, but I am stuck. I have a table
> > with about 1E6 points. I have a few hundred points in a Python
> > array. How do I pass these points into a PostGIS SQL query to check
> > them against the table with millions of points (to see whether the
> > same point already exists)?
> >
> > I started with a Python loop to check every point individually, but
> > it is slow.
> > I have found lots of recipes for dealing with large data, but they
> > assume the data are already in the database.
> >
> > Which way is good for this situation?
> > - create a temporary table and first insert the points into the DB
> > - try to keep the Python connection open and send an SQL query for
> > each point quickly
> > - try to push all points in one query? How? (see the sketch below)
> > - write a stored procedure that accepts an array of data?
> > - anything else?
> >
> > Maybe it is a more general SQL question, but I am working with
> > points.
> >
> > Thanks for advice Michal
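> > As a sketch of the "push all points in one query" option with
> > psycopg2 (the table name, SRID, DSN and coordinates below are
> > assumptions; execute_values expands the Python list of tuples into
> > the VALUES list):
> >
> > import psycopg2
> > from psycopg2.extras import execute_values
> >
> > conn = psycopg2.connect("dbname=mydb")  # hypothetical DSN
> > pts = [(1.0, 2.0), (3.5, 4.5)]          # hypothetical coordinates
> > with conn.cursor() as cur:
> >     # %s is replaced by the expanded VALUES list; fetch=True
> >     # returns the matching rows in one round trip
> >     matches = execute_values(cur, """
> >         SELECT p.id
> >         FROM points_table p
> >         JOIN (VALUES %s) AS v(x, y)
> >           ON p.geom && ST_SetSRID(ST_MakePoint(v.x, v.y), 4326)
> >     """, pts, fetch=True)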