[postgis-users] ST_Intersection very slow
Rémi Cura
remi.cura at gmail.com
Thu Mar 5 02:06:23 PST 2015
Code is here :
https://github.com/Remi-C/PPPP_utilities/blob/master/postgis/rc_cut_big_geom_into_pieces_efficiently.sql
Cheers,
Rémi-C
2015-03-05 11:03 GMT+01:00 Rémi Cura <remi.cura at gmail.com>:
>
>
> Finally I was able to test with your data.
>
> I only worked on the big fat polygon with 40k rings
> First I dumped the ring
> [image: Images intégrées 1]
>
> Then I create a grid. I aimed to 300k cells, good for my memory
> [image: Images intégrées 2]
>
> Then it is another dump : every segment (pair of successive points) of
> every rings is put into a table
> This is the special trick that make computing the intersection geometry so
> fast. This table is annoyingly big (2.4 millions)
> [image: Images intégrées 3]
>
>
> The next step is crucial, we find the mapping between the segments and the
> grid (which segment is in which cell).
> We naturally take the segment that cross the cell, so we don't have any
> precision issue
> [image: Images intégrées 1]
>
> Then comes the geometry crunching :
> for each cell, we group the segment and use ST_polygonize to compute
> possible surfaces (part of squares).
> This is actually what would be very long without the trick of dumping ring
> then dumping segments.
> [image: Images intégrées 2]
>
>
>
> We create "squarized" version of rings (conceptually, the ring is replaced
> with a generalized version)
> The idea is simple, on simple cells, we can use squarized version of rings
> which will be a lot faster than using actual geometry
> [image: Images intégrées 3]
>
> then comes what should be a major speed up, but does't speed things in
> your case.
> This is because there are very "simple case", because the inner rings are
> almost filling up the polygon
> So the squarized rings allow to rule out only a few number of cells
> [image: Images intégrées 4]
>
> Now we have simplified the problem as much at is was possible.
> Up to this it is very fast;
>
> The last part (and I didn't found how to accelerate it) is to decide
> wether part of square are within the outer and outside rings or not.
>
>
>
>
> Cheers,
> Rémi-C
>
> 2015-02-28 10:05 GMT+01:00 Mark Wynter <mark at dimensionaledge.com>:
>
>> I will post tomorrow
>>
>> Sent from my iPhone
>>
>> On 28 Feb 2015, at 5:54 pm, John Abraham <jea at hbaspecto.com> wrote:
>>
>> That is so awesome. That makes total sense to dump the rings and do them
>> each separately.
>>
>> I have a client visiting the office Monday and Tuesday, so can't work on
>> this much until Wednesday.
>>
>> Are you going to post to the list?
>>
>> --
>> John
>>
>> On Feb 27, 2015, at 10:23 PM, Mark Wynter <mark at dimensionaledge.com>
>> wrote:
>>
>> Hi John
>>
>> I have a tutorial version2 solution that tiles lc_class34 (not 32, my
>> mistake earlier).
>> This involves using ST_DumpRings, and using the path value to discern
>> between exterior and interior rings.
>> We tile and then use ST_Difference to remove the negative space of the
>> interior rings.
>>
>> I’ve experimented with different tile sizes on a 2 vCPU machine.
>> 32000 x 32000 tiles - Total run time = 170 seconds
>> 8000 x 8000 tiles - Total run time = 845 seconds
>> 4000 x 4000 tiles - total run time = 2408 seconds
>>
>> Could probably go smaller again. The smaller the tile size, the quicker
>> your downstream ST_Intersection queries will be with your Poly overlay.
>>
>> One way to offset the increase in tiling time (to produce smaller tiles)
>> is to parallelise over more CPUs.
>> The alternative to more CPUs is to use a quadgrid tiling algorithm which
>> recursively subdivides tiles into smaller tiles if say ST_NPoints exceeds a
>> certain threshold.
>>
>> I use quadgrids quite a bit - an example is on the front slider at
>> http:://www.dimensionaledge.com. The quadgrid SQL algorithm sits in my
>> private code repo and I’m keeping it close to my chest (for now).
>>
>>
>> <Screen Shot 2015-02-28 at 3.50.24 pm.png><Screen Shot 2015-02-28 at
>> 3.50.58 pm.png>
>>
>> On 26 Feb 2015, at 4:19 am, John Abraham <jea at hbaspecto.com> wrote:
>>
>> Wow guys. 131 seconds is a lot faster than 10 days.
>>
>> Remi, if you can work out a fast intersection method that uses points and
>> lines as a preprocessor to dealing with the polygons, I think it would be a
>> great addition to PostGIS. ST_Intersection in PostGIS is often quite a bit
>> slower than the implementation in Geomedia, Mapinfo, and (I hear) ArcGIS,
>> so any functionality that results in speed improvements would be great.
>>
>> Mark, I can't wait to figure out why your system was fast! I was
>> following your (preliminary) tutorial and gridding the data was progressing
>> very slowly.
>>
>> I have a provincial boundary file but there seems to be much ambiguity in
>> GIS representations of the provincial boundary, so I won't send you the one
>> I have. I can try to assemble one from other sources.
>>
>> --
>> John Abraham
>> jea at hbaspecto.com
>> 403-232-1060
>>
>> On Feb 25, 2015, at 6:28 AM, Mark Wynter <mark at dimensionaledge.com>
>> wrote:
>>
>> Hi John
>>
>> I’ve just crunched your whole dataset. The process takes 131 seconds for
>> the vector tiling (using a 16 CPU machine). Plus another 170 seconds for
>> data prep at the start including making the poly's valid.
>> For a 2 CPU machine, it will take circa 15 minutes, or double that using
>> a single CPU.
>>
>> Only one small issue outstanding - and that relates to clipping the
>> regular grid prior to tiling. For clipping I used the union of the
>> abmiw2wlcv_48tiles as supplied with the data - the problem is the
>> abmiw2wlcv_48tiles are rough and ready, which produces voids. The voids
>> using my method unfortunately get the same feature class as lc_class = 32.
>> You’ll see this clearly on second screenshot.
>> The way around this is to clip the regular grid using a high-res
>> shapefile of the Alberta state boundary prior to the tile crunching the
>> lancover_polygons_2010 table. This is easily done - I just didn’t have the
>> state boundary data.
>>
>> I need to get some sleep. John, Remi, I will share with you the code
>> tomorrow. For others, I’ll be posting a tutorial that steps through the
>> methods touched upon in this thread..
>>
>> John, the only difference between my tutorial and its application to your
>> land cover data was a few tweaks to the data prep stage. Otherwise the same
>> code pattern (no modification at all needed to the worker_function). It
>> was great to test the code with your data.
>>
>> Speak tomorrow.
>> Mark
>>
>>
>>
>>
>> <Screen Shot 2015-02-25 at 11.29.15 pm.png>
>>
>>
>> <Screen Shot 2015-02-25 at 11.39.04 pm.png>
>>
>>
>> <Screen Shot 2015-02-25 at 11.41.41 pm.png>
>>
>>
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/fcf7e76c/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 61245 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/fcf7e76c/attachment-0007.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 125529 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/fcf7e76c/attachment-0008.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 51372 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/fcf7e76c/attachment-0009.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 402449 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/fcf7e76c/attachment-0010.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 377409 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/fcf7e76c/attachment-0011.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 142272 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/fcf7e76c/attachment-0012.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 94277 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20150305/fcf7e76c/attachment-0013.png>
More information about the postgis-users
mailing list