[postgis-devel] Introducing wagyu to validate MVT polygons
Björn Harrtell
bjorn.harrtell at gmail.com
Mon Dec 31 07:43:32 PST 2018
Interesting!
When I initially implemented ST_AsMVTGeom I looked at wagyu as it was
already pretty clear the main encoding performance bottleneck was with
MakeValid and that wagyu was likely faster. But I was (and still am) too
inexperienced with C and C++ to do what you have just done. Nice work!
/Björn
On Fri, 21 Dec 2018 at 17:00, Raúl Marín Rodríguez <
rmrodriguez at carto.com> wrote:
> Hi all,
>
> When CARTO decided to switch to ST_AsMVT as the default library to generate
> MVTs from the database we made a lot of benchmarks and performance
> improvements (most of them are available in 2.5 but some won't be available
> until 3.0) [1]. The short summary when you compare it to Mapnik is:
>
> # Good parts
> - Postgis can do work in parallel. When this is triggered, performance
> is faster than Mapnik.
> - Postgis is faster encoding lines (0.5x to 1.5x faster).
> - It is also faster encoding properties (2x faster in tiles with a large
> number of properties (40+)).
> - It is faster discarding small polygons (up to 20x faster in extreme
> cases).
>
> # Average
> - It has a similar performance encoding points.
>
> # Not so good
> - It is slower encoding polygons (2x for small [~10 point] polygons, 20x
> for big ones [1M points]).
>
> I also commented on this in that blog post [1]: "It would be interesting to
> analyze why Postgis validation (based on GEOS) is way slower than Mapnik’s
> (based on Boost), and addressing this would benefit multiple SQL functions
> that use it, like St_IsValid."
>
> After working on this to try to speed up the process (validation takes
> ~95-98% of the time in ST_AsMVTGeom) I've learned several things:
> - ST_MakeValid is buggy. There are several tickets in GEOS around this and
> even Postgis has some commented-out tests showing this buggy behaviour.
> - ST_MakeValid goes beyond what's necessary for MVTs, even being
> counterproductive sometimes (like collapsing polygons into lines, or lines
> into points).
> - ST_MakeValid can create new points that don't respect the MVT integer
> coordinates, so it produces invalid MVT geometries.
> - I was wrong when I said that Mapnik's validation was based on Boost; it
> is actually based on Wagyu [2].
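
To illustrate the integer-coordinate point in the list above: when a ring with integer vertices crosses itself, the crossing point generally does not fall on the integer grid, so a repair that adds a node there produces coordinates MVT cannot represent. A minimal sketch in plain Python, using exact fractions (the helper `segment_intersection` is hypothetical and is not part of PostGIS, GEOS or Wagyu):

```python
from fractions import Fraction


def segment_intersection(p1, p2, p3, p4):
    """Return the point where segments p1-p2 and p3-p4 cross, as exact
    Fractions, or None if they are parallel or do not cross.

    Hypothetical helper for illustration only; inputs are integer (x, y) pairs.
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    # Cross product of the two direction vectors; zero means parallel.
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None
    # Parameters along each segment (0..1 means "inside the segment").
    t = Fraction((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3), denom)
    u = Fraction((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1), denom)
    if not (0 <= t <= 1 and 0 <= u <= 1):
        return None
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


# Two edges of the "bowtie" ring (0,0) (2,1) (2,0) (0,1) (0,0):
# every vertex is an integer, but the crossing point is not.
crossing = segment_intersection((0, 0), (2, 1), (2, 0), (0, 1))
on_integer_grid = all(c.denominator == 1 for c in crossing)
```

Here `crossing` has a y coordinate of 1/2, so `on_integer_grid` is false: a validator working in doubles may emit that node, while one working on integers has to snap or restructure the rings instead.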
>
> I started with the idea of improving ST_MakeValid but it soon proved
> extremely hard, and some of the shortcuts valid for MVT weren't OK for the
> general case.
> With this in mind, a week ago I decided to have a look at Wagyu and see how
> hard it would be to integrate it in Postgis. Both the code [3] and the
> performance comparison [4] can be seen at Github; it is 1x faster for small
> polygons, 20x faster for large ones.
>
> You might be wondering why I decided to do this in Postgis instead of GEOS,
> which is a C++ library that we already depend on. The main reason is that
> I have little to no experience with it, and doing it the right way would
> require moving the validation code from Postgis to GEOS, adding the ability
> to work with integers (it only allows doubles right now) and then exposing
> all of this through its C API. So, what has taken me a week (and a good
> chunk of that was autotools) would require multiple months of work.
>
> I see 2 ways to include the library and I'm not sure which one is best for
> this case:
> - Use system libraries. Packages for `wagyu` and `geometry.hpp` are only
> available in Debian and Fedora, but not in other Linux distributions, OSX
> or Windows. If they were widely available I think this would be the best
> option, as it has been done with other libraries like geos, protobuf, etc.
> - Bring the library code into the project. This is what I've done in my PR.
>
> A C++11 compiler is required and it should be trivial to switch between
> the 2 (a couple of configure flags). I think that the first option would be
> best if we could have packagers making wagyu more widely available, but even
> though both I and my company only use Linux, I don't want this improvement
> to be Linux only.
>
> Some comments about the PR:
> - I've created a minimal C API to do the operations we need (clipping with
> validation) for the MVT use case (integers and opposite winding order).
> Wagyu has other functionalities like Union or XOR but those aren't exposed,
> and it can also work with doubles but for this use case it was 10% slower.
> - Using wagyu is optional; it is only used if you pass `--use-wagyu` to
> configure. It will use CXX and CXXFLAGS, not CC and CFLAGS.
> - The library only supports polygons, so any other geometries are still
> passed to the make_valid based on GEOS.
> - The MVT process now transforms into MVT coordinates before clipping.
> This is to make the code more similar for the 2 methods (GEOS and Wagyu)
> and to make the clipping process consistent (we had hacks to account for
> half units and so on which are now gone).
> - Some outputs change when using this library, most notably dropping some
> geometries that have extreme self intersections (on input).
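
The "transform into MVT coordinates before clipping" item above can be pictured roughly as follows. This is a Python sketch under stated assumptions, not the code from the PR: the names `to_tile_coords` and `clip_box` are hypothetical, and the extent of 4096, the buffer of 256 and round-to-nearest snapping are illustrative choices. The idea is that once points are in integer tile units (y axis pointing down), the clip rectangle is itself a plain integer box, so no half-unit adjustments are needed.

```python
def to_tile_coords(x, y, bounds, extent=4096):
    """Map a point inside `bounds` to integer tile units.

    bounds is (xmin, ymin, xmax, ymax) in the source CRS. The y axis is
    flipped because tile coordinates grow downwards. Hypothetical helper;
    Python's round() is used for snapping, which is an assumption here.
    """
    xmin, ymin, xmax, ymax = bounds
    tx = round((x - xmin) * extent / (xmax - xmin))
    ty = round((ymax - y) * extent / (ymax - ymin))
    return tx, ty


def clip_box(extent=4096, buffer=256):
    """Clip rectangle in tile units: the tile itself plus a buffer margin."""
    return (-buffer, -buffer, extent + buffer, extent + buffer)


tile_bounds = (0.0, 0.0, 100.0, 100.0)
top_left = to_tile_coords(0.0, 100.0, tile_bounds)
bottom_right = to_tile_coords(100.0, 0.0, tile_bounds)
box = clip_box()
```

With these inputs the tile's top-left corner maps to the origin of the tile grid and the bottom-right corner to (extent, extent), and both the GEOS and the Wagyu path can clip against the same integer `box`.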
>
> Things that aren't done:
> - Add tests directly to libwagyu instead of relying on MVT tests.
> - Adapt MVT tests to pass with both methods.
> - Adapt CI to test both methods.
> - Update documentation for ST_AsMVTGeom.
> - Move uthash to the new `deps` folder.
>
>
> Although it's almost Christmas I'm expecting some conflict, especially
> around the fact that it's bringing a new library and code inside the
> project and that it requires new configure flags and a C++11 compiler if
> you decide to use it.
> What are your thoughts?
>
> [1] - https://carto.com/blog/inside/An-update-on-MVT-encoders/
> [2] - https://github.com/mapbox/wagyu
> [3] - https://github.com/postgis/postgis/pull/356
> [4] - https://github.com/postgis/postgis/files/2703289/20181221_mvt_postgis_trunk_vs_20181221_mvt_postgis_wagyu.pdf
>
> Regards
>
> --
> Raúl Marín Rodríguez
> carto.com
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel