[postgis-users] Spatially ordering data

Lars Aksel Opsahl Lars.Opsahl at nibio.no
Thu Mar 2 13:02:30 PST 2017


Hi


Thanks for testing it. We have used it quite bit and have not seen any scaling problems yet.


For a layer with 8 million polygons (about 1 billion points) it takes about one minute to create a content based grid. (Se example further down)


We use the && operator to check intersection to avoid loading all data, but what sometimes take time is to compute the extent of the layer. Before I used ST_estimatedExtent (https://github.com/larsop/content_balanced_grid/commit/66c8267590543951c1b5088ff021cc5b70e0acff) but that function gives to small extent in some cases. To use the ST_estimatedExtent, we have to use add some more tests to ensure that the extent is big enough, but since we have not seen any need yet its not done either.


We usually use this kind of grids to be able to run jobs i parallel with about equal load in each job. To be sure that each polygon is only handled by one job we can use ST_Centroid(or faster first point in exterior ring) and ST_CoveredBy cell.bTo be 100% sure you also have to check for ST_Centroid(or first point) on cell borders.



SELECT count(q_grid.cell::geometry(geometry,4258)) as geo

FROM (

SELECT(ST_Dump(

cbg_content_based_balanced_grid(ARRAY['org_ar5.ar5_flate geo'],4000))

).geom AS cell) AS q_grid

geo

------

5245

(1 row)

Time: 62370.532 ms


SELECT count(*) from org_ar5.ar5_flate;

count

---------

8263246

(1 row)


SELECT PostGIS_Full_Version();

postgis_full_version

--------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------

POSTGIS="2.2.2 r14797" GEOS="3.5.0-CAPI-1.9.0 r4084" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.9.2, released 2012/10/08" LIBXML="2.7.6" LIBJSON="0.11" TOP

OLOGY RASTER



Lars


________________________________
Fra: postgis-users <postgis-users-bounces at lists.osgeo.org> på vegne av Jonathan Moules <jonathan-lists at lightpear.com>
Sendt: 2. mars 2017 18:41
Til: PostGIS Users Discussion
Emne: Re: [postgis-users] Spatially ordering data

Thanks for the suggestions. Some thoughts after testing:


Darafei: (order by ST_Geohash) Interesting notion, but when implemented the grouping is tenuous at best. If I return 5,000 features they can be spread over 200km (bad) or 6km (good). It'd definitely better clustered than without this, but I was surprised at how dispersed some of the groups were, and that may be too much for my purposes.
Although I see Paul's answer to this SO question (https://gis.stackexchange.com/questions/188879/clustering-markers-using-quadtree/189296) may help. I've not tested it yet.

Lars: (content_balanced_grid) The issue here is what happens to features that overlap the grid edges, as I only want to process any given polygon once, and I can't clip them? I'd have to use a ST_Within to only get polygons inside the CBG (which is fine), but the remaining features would need an ST_Overlaps against the entire grid. I've been running that 30 mins so far on a small sample of my full database, and it's still running, so it definitely won't scale.

Remi: I was looking for something higher level. Implementing a Quad-Tree in SQL isn't something I'd be comfortable doing. Although if it is simple to do, it may be worth implementing as a feature within PostGIS; I'm sure others would also benefit from being able to easily spatially sort. I tried googling PostGIS quadtree but didn't find anything so I'm guessing PG doesn't implement this type of index by default.

Are there any other options? I may consider trying to glue together the two approaches, but that's going to take some thought.

Thanks,
Jonathan

---- On Wed, 01 Mar 2017 12:56:58 +0000 Travis Kirstine <traviskirstine at gmail.com> wrote ----
if you go with Remi suggestion you then can cluster the quad tree index to reorder records

http://postgis.net/docs/performance_tips.html



On 1 March 2017 at 07:11, Rémi Cura <remi.cura at gmail.com<mailto:remi.cura at gmail.com>> wrote:
hey,
the most classical approach would be to construct a quad tree<https://en.wikipedia.org/wiki/Quadtree>.
You can think of it as a grid that adapts its resolution locally according to the data.
This is also easy to code.

Cheers,
Rémi-C

2017-03-01 12:46 GMT+01:00 Darafei "Kom?pa" Praliaskouski <me at komzpa.net<mailto:me at komzpa.net>>:
Have a look here:

http://www.paulnorman.ca/blog/2016/05/improve-your-st-geohash-sorting-with-these-three-simple-tricks/

for simplest case, you can just order by your geom field.

??, 1 ???. 2017 ?. ? 14:44, Jonathan Moules <jonathan-lists at lightpear.com<mailto:jonathan-lists at lightpear.com>>:
Hi List,
I've got a database of about 60 million spatial features that I need to run through a process.

The process can't do all of them at once, so I want to get the data from postgis in spatial chunks of up to say 500,000 features per query.

The data itself isn't evenly spatially distributed, so gridding won't work either, and I don't see any prospectively useful functions in the reference.

Non-spatially this would be simple:

    SELECT * FROM my_table ORDER BY some_key ASC LIMIT 500000 OFFSET 0

Does anyone have any suggestions for how to spatially order data with PostGIS?

Thanks,
Jonathan
_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org<mailto:postgis-users at lists.osgeo.org>
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org<mailto:postgis-users at lists.osgeo.org>
https://lists.osgeo.org/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org<mailto:postgis-users at lists.osgeo.org>
https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org<mailto:postgis-users at lists.osgeo.org>
https://lists.osgeo.org/mailman/listinfo/postgis-users


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20170302/34701219/attachment.html>


More information about the postgis-users mailing list