[postgis-users] Fwd: ST_Intersection very slow

Rémi Cura remi.cura at gmail.com
Thu Mar 5 02:03:28 PST 2015


Finally I was able to test with your data.

I only worked on the big fat polygon with 40k rings
First I dumped the ring
[image: Images intégrées 1]

Then I create a grid. I aimed to 300k cells, good for my memory
[image: Images intégrées 2]

Then it is another dump : every segment (pair of successive points) of
every rings is put into a table
This is the special trick that make computing the intersection geometry so
fast. This table is annoyingly big (2.4 millions)
[image: Images intégrées 3]


The next step is crucial, we find the mapping between the segments and the
grid (which segment is in which cell).
We naturally take the segment that cross the cell, so we don't have any
precision issue
[image: Images intégrées 1]

Then comes the geometry crunching :
for each cell, we group the segment and use ST_polygonize to compute
possible surfaces (part of squares).
This is actually what would be very long without the trick of dumping ring
then dumping segments.
[image: Images intégrées 2]



We create "squarized" version of rings (conceptually, the ring is replaced
with a generalized version)
The idea is simple, on simple cells, we can use squarized version of rings
which will be a lot faster than using actual geometry
[image: Images intégrées 3]

then comes what should be a major speed up, but does't speed things in your
case.
This is because there are very "simple case", because the inner rings are
almost filling up the polygon
So the squarized rings allow to rule out only a few number of cells
[image: Images intégrées 4]

Now we have simplified the problem as much at is was possible.
Up to this it is very fast;

The last part (and I didn't found how to accelerate it) is to decide wether
part of square are within the outer and outside rings or not.




Cheers,
Rémi-C

2015-02-28 10:05 GMT+01:00 Mark Wynter <mark at dimensionaledge.com>:

> I will post tomorrow
>
> Sent from my iPhone
>
> On 28 Feb 2015, at 5:54 pm, John Abraham <jea at hbaspecto.com> wrote:
>
> That is so awesome.  That makes total sense to dump the rings and do them
> each separately.
>
> I have a client visiting the office Monday and Tuesday, so can't work on
> this much until Wednesday.
>
> Are you going to post to the list?
>
> --
> John
>
> On Feb 27, 2015, at 10:23 PM, Mark Wynter <mark at dimensionaledge.com>
> wrote:
>
> Hi John
>
> I have a tutorial version2 solution that tiles lc_class34 (not 32, my
> mistake earlier).
> This involves using ST_DumpRings, and using the path value to discern
> between exterior and interior rings.
> We tile and then use ST_Difference to remove the negative space of the
> interior rings.
>
> I’ve experimented with different tile sizes on a 2 vCPU machine.
> 32000 x 32000 tiles - Total run time =  170 seconds
> 8000 x 8000 tiles - Total run time =  845 seconds
> 4000 x 4000 tiles - total run time = 2408 seconds
>
> Could probably go smaller again.  The smaller the tile size, the quicker
> your downstream ST_Intersection queries will be with your Poly overlay.
>
> One way to offset the increase in tiling time (to produce smaller tiles)
> is to parallelise over more CPUs.
> The alternative to more CPUs  is to use a quadgrid tiling algorithm which
> recursively subdivides tiles into smaller tiles if say ST_NPoints exceeds a
> certain threshold.
>
> I use quadgrids quite a bit - an example  is on the front slider at
>  http:://www.dimensionaledge.com.  The quadgrid SQL algorithm sits in my
> private code repo and I’m keeping it close to my chest (for now).
>
>
> <Screen Shot 2015-02-28 at 3.50.24 pm.png><Screen Shot 2015-02-28 at
> 3.50.58 pm.png>
>
> On 26 Feb 2015, at 4:19 am, John Abraham <jea at hbaspecto.com> wrote:
>
> Wow guys.  131 seconds is a lot faster than 10 days.
>
> Remi, if you can work out a fast intersection method that uses points and
> lines as a preprocessor to dealing with the polygons, I think it would be a
> great addition to PostGIS.  ST_Intersection in PostGIS is often quite a bit
> slower than the implementation in Geomedia, Mapinfo, and (I hear) ArcGIS,
> so any functionality that results in speed improvements would be great.
>
> Mark, I can't wait to figure out why your system was fast!  I was
> following your (preliminary) tutorial and gridding the data was progressing
> very slowly.
>
> I have a provincial boundary file but there seems to be much ambiguity in
> GIS representations of the provincial boundary, so I won't send you the one
> I have.  I can try to assemble one from other sources.
>
> --
> John Abraham
> jea at hbaspecto.com
> 403-232-1060
>
> On Feb 25, 2015, at 6:28 AM, Mark Wynter <mark at dimensionaledge.com> wrote:
>
> Hi John
>
> I’ve just crunched your whole dataset.  The process takes 131 seconds for
> the vector tiling (using a 16 CPU machine).  Plus another 170 seconds for
> data prep at the start including making the poly's valid.
> For a 2 CPU machine, it will take circa 15 minutes, or double that using a
> single CPU.
>
> Only one small issue outstanding - and that relates to clipping the
> regular grid prior to tiling.  For clipping I used the union of the
> abmiw2wlcv_48tiles as supplied with the data - the problem is the
> abmiw2wlcv_48tiles are rough and ready, which produces voids. The voids
> using my method unfortunately get the same feature class as lc_class = 32.
> You’ll see this clearly on second screenshot.
> The way around this is to clip the regular grid using a high-res shapefile
> of the Alberta state boundary prior to the tile crunching the
> lancover_polygons_2010 table.  This is easily done - I just didn’t have the
> state boundary data.
>
> I need to get some sleep. John, Remi,  I will share with you the code
> tomorrow.  For others, I’ll be posting a tutorial that steps through the
> methods touched upon in this thread..
>
> John, the only difference between my tutorial and its application to your
> land cover data was a few tweaks to the data prep stage. Otherwise the same
> code pattern (no modification at all needed to the worker_function).  It
> was great to test the code with your data.
>
> Speak tomorrow.
> Mark
>
>
>
>
> <Screen Shot 2015-02-25 at 11.29.15 pm.png>
>
>
> <Screen Shot 2015-02-25 at 11.39.04 pm.png>
>
>
> <Screen Shot 2015-02-25 at 11.41.41 pm.png>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/5933ce77/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 51372 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/5933ce77/attachment-0007.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 142272 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/5933ce77/attachment-0008.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 402449 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/5933ce77/attachment-0009.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 94277 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/5933ce77/attachment-0010.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 61245 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/5933ce77/attachment-0011.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 125529 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/5933ce77/attachment-0012.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 377409 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/5933ce77/attachment-0013.png>


More information about the postgis-users mailing list