[postgis-users] Semantics of difference() function

Sun Jun 8 16:59:05 PDT 2008

On Jun 5, 2008, at 11:43 AM, Martin Davis wrote:
> I think the operation also is meaningful for the case (line,  
> polygon) => line (this is equivalent to a combination of  
> line.difference(polygon) and line.intersection(polygon) )

Ah, yes, split(line, polygon) should definitely be allowed. However, I  
was originally imagining that split(line, polygon), as well as  
split(polygon1, polygon2), should be the same as difference(*,  
polygon), which is geometrically consistent with the simpler forms  
split(*, line) and split(line, point):

     ####
----####--->   becomes   --->    --->
     ####                  1       2

But I see where it might be more useful for the polygon in the second  
argument to be reduced to its boundary first:

     ####                     +--+
----####--->   becomes   ----|--|--->   becomes   ---> ---> --->
     ####                     +--+                  1    2    3

But the latter is easily produced with the boundary function if  
someone is so inclined:

split(polygon1, polygon2) == split(polygon1, boundary(polygon2))
split(line, polygon) == split(line, boundary(polygon))

This is more involved, but less confusing to those like me who learn  
from consistency. What do other people think?

> Having thought about this with an eye to implementation, I've take  
> the definition a bit further:
>
> * The result of split(g1, g2) is a two-component GeometryCollection,  
> GC(c1, c2).

If you define split(line, polygon) as the combination of  
difference(line, polygon) and intersection(line, polygon), then you  
could have three components, right? This would also happen with  
split(line1, line2) and line2 is a linear ring that crosses line1 twice.

> * The first component c1 is the portion of g1 which lies to inside,  
> to the left of, or before, g2 (depending on whether g2 is a polygon,  
> line or point).  The second component c2 is the portion of g1 which  
> lies outside, to the right of, or after g2. * The types of c1 and c2  
> are the same as g1.  c1 and c2 may be empty, in degenerate cases  
> (e.g,. where one or more components of g1 are not intersected, or  
> intersected only at their boundary).

It seems to me that "left" and "right" don't always work for lines, as  
you could also have "below" or "above". Or are you referencing the  
directionality of g2 here, as you traverse it? But if it's a linear  
ring, the inner portion will be both to the left and right. I wonder  
if it wouldn't make more sense to simply use the directionality of g1  
instead, describing its pieces as "first", "second", etc., as I  
suggest in my drawings above?

As for polygons, there's no obvious spatial ordering for them, unless  
you take their vertex order as significant. That seems reasonable to  
me. For example,

split(POLYGON((0 0, 2 0, 2 2, 0 2, 0 0)), LINESTRING(1 -1, 1 3))  
becomes GEOMETRYCOLLECTION(POLYGON((0 0, 1 0, 1 2, 0 2, 0 0)),  
POLYGON((1 0, 2 0, 2 2, 1 2, 1 0)))

because the first polygon has the first vertex of the original and the  
second polygon is built from later vertices. I'm not certain if this  
could be consistently defined, though; it would require comparisons of  
multiple vertices when one or more are shared by the split polygons.

> * Any components of g2 which have a position which is not well- 
> defined relative to g2 (e.g.which are not intersected, or which are  
> intersected only partially), are returned as components of c1 (---  
> this rule is fairly arbitrary - a different strategy might make  
> equal sense)

So c1 is itself a Geometry Collection? This seems unnecessarily  
complicated; I would prefer a flat object.

-- Andy

> Andy Anderson wrote:
>>
>> On Jun 4, 2008, at 11:49 AM, Martin Davis wrote:
>>> To answer a question you posed on your blog, the reason that when  
>>> you subtract a line from a polygon you get basically the same  
>>> polygon, rather than say the polygon split into two halves, is  
>>> that if the latter was provided as a MultiPolygon it would be  
>>> invalid, because the halves would share line segments down the  
>>> middle.  Also, if the line did not fully overlap the polygon  
>>> there's no choice - you have to return the original poly.  The  
>>> behaviour is thus consistent between the two cases.
>>>
>>
>>> Unfortunately the SFS (and no other standard I'm aware of) doesn't  
>>> define the precise semantics of the overlay operations.  So I made  
>>> 'em up!  Hopefully they are consistent and reasonable.  There's no  
>>> doubt alternative definitions which might be useful in some cases  
>>> - but you have to choose one definition for any given function.   
>>> (For the situation above, many people would like a "cut polygon by  
>>> line" operation - hopefully that will get provided as a new  
>>> function sometime soon).
>>
>> I've seen "split" as a term that is commonly used for such  
>> operations, e.g.
>>
>> geometry = ST_Split(geometry1, geometry2)
>>
>> This should be defined for geometry1 of dimensionality D ≥ 1 and  
>> geometry2 of dimensionality D or D-1, i.e. (polygon, polygon),  
>> (polygon, line), (line, line), and (line, point). The result should  
>> be GeometryType(geometry) == 'GeometryCollection' where for any n  
>> GeometryType(ST_GeometryN(geometry, n)) == GeometryType(geometry1).
>>
>> -- Andy
>>
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>
> -- 
> Martin Davis
> Senior Technical Architect
> Refractions Research, Inc.
> (250) 383-3022
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20080608/af4e1873/attachment.html>