Intersection tests with curved polygons

Andrea Aime andrea.aime at geosolutionsgroup.com
Tue Jan 7 02:55:57 PST 2025


Hi all,
to compare the alternatives, I've run some tests using pgbench, using the
full original dataset (36k curved polygons),
with 32 clients and 4000 transactions each, testing 3 scenarios:

   - The current query using ST_Intersects as is (returns 2 polygons, a
   wrong result, but I'm treating it as the baseline)
   - Using && + ST_Distance
   - Using && + linearized intersection, with a precision good enough to
   get the correct result (0.01 meters)
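
For reference, here is a rough sketch of what the three pgbench scripts could look like. The actual files are not included in this message, so everything here is an assumption: the table name (testdata), the columns (geom, ogc_fid) and the probe point are borrowed from the earlier message quoted below, and the baseline is assumed to be a plain ST_Intersects filter.

```sql
-- baseline.sql (assumed content): plain intersects test; PostGIS
-- linearizes the curves internally before delegating to GEOS
SELECT ogc_fid FROM testdata
WHERE ST_Intersects(geom,
        ST_GeomFromText('POINT (25492818 6677399.98)', 3879));

-- distance.sql (assumed content): bounding box prefilter with &&,
-- then the curve-aware ST_Distance compared against zero
SELECT ogc_fid FROM testdata
WHERE geom && ST_GeomFromText('POINT (25492818 6677399.98)', 3879)
  AND ST_Distance(geom,
        ST_GeomFromText('POINT (25492818 6677399.98)', 3879)) = 0;

-- linearized.sql (assumed content): bounding box prefilter with &&,
-- then an explicit linearization with a 0.01 m max deviation
-- (tolerance_type = 1) and symmetric output (flags = 1)
SELECT ogc_fid FROM testdata
WHERE geom && ST_GeomFromText('POINT (25492818 6677399.98)', 3879)
  AND ST_Intersects(ST_CurveToLine(geom, 0.01, 1, 1),
        ST_GeomFromText('POINT (25492818 6677399.98)', 3879));
```

Each query would go in its own file, since pgbench runs the whole content of a script file as one transaction.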

I find the results surprising:

> pgbench -U cite helsinki -f baseline.sql  -c 32 -t 2000
tps = 10227.810087 (without initial connection time)

> pgbench -U cite helsinki -f distance.sql  -c 32 -t 2000
tps = 19989.627257 (without initial connection time)

> pgbench -U cite helsinki -f linearized.sql  -c 32 -t 2000
tps = 8984.594159 (without initial connection time)

So... testing with the distance is twice as fast as the other options? Wow,
have we been doing intersection tests wrong all this time? ROFL
Other possible ideas:

   - There is something specific to having curves in the mix, and having to
   pay the cost of linearization makes distance competitive?
   - The specific dataset is playing an important role in the result?


Ah, since there seems to be a real issue, I've also opened a ticket in
trac: https://trac.osgeo.org/postgis/ticket/5832#ticket

Regards,

Andrea Aime



On Sun, Dec 22, 2024 at 4:44 PM Andrea Aime <
andrea.aime at geosolutionsgroup.com> wrote:

> Hi Paul,
> thanks a lot for following up. Comments inline below.
>
>> These are literally CurvePolygon type?
>>
>
> The column type is just "geometry(Geometry,3879)", while ST_GeometryType
> returns "multisurface" for both.
> When doing a ST_AsText instead, you'll get something like:
>
> MULTISURFACE(CURVEPOLYGON(COMPOUNDCURVE((...
>
> for both.
>
>
>> It’s probably getting caught in our lack of full curve support.
>> I would be interested in the ST_Distance between the point and those two
>> CurvePolygons. (Because, for distance, we have a postgis-native
>> implementation that supports curves).
>>
>
> =# SELECT ogc_fid, ST_Distance(ST_GeomFromText('POINT (25492818
> 6677399.98)', 3879), geom) FROM testdata;
>
>  ogc_fid |     st_distance
> ---------+---------------------
>     1258 | 0.01234572446598792
>    12875 |                   0
> (2 rows)
>
> Indeed, that is the correct answer: 12875 contains the point, while the
> other polygon is close to it.
>
>
>> Whereas for intersection, the calculation is delegated to GEOS *after
>> linearizing the inputs*. The logical problem you’re seeing could sit in
>> that linearization.
>>
>
> Let's check with different tolerances... yes, changing the tolerance
> changes the result:
>
> =#  SELECT ogc_fid FROM testdata WHERE ST_Intersects(ST_CurveToLine(geom,
> 0.01, 1, 1), ST_GeomFromText('POINT (25492818 6677399.98)', 3879));
>  ogc_fid
> ---------
>    12875
> (1 row)
>
> =#  SELECT ogc_fid FROM testdata WHERE ST_Intersects(ST_CurveToLine(geom,
> 0.02, 1, 1), ST_GeomFromText('POINT (25492818 6677399.98)', 3879));
>  ogc_fid
> ---------
>     1258
> (1 row)
>
> In the immediate future, I guess I could have the GeoTools PostGIS store
> use either approach when it knows curves are involved:
> first use && as a rough filter, and then either
> * ST_Distance equal to 0, or
> * an explicit linearization with a target tolerance (this is an urban
> application, so I'm guessing they will need centimeter, if not millimeter,
> precision).
>
> Is there a clear winner here in terms of performance, or is the performance
> of distance vs. linearized intersection more contextual to the geometries
> involved?
>
> Cheers
> Andrea
>