[postgis-users] Semantics of difference() function

Mon Jun 9 12:10:02 PDT 2008

Difference does seem to provide both sides of the subtractor:

	select st_astext(st_difference(GEOMFROMTEXT('LINESTRING(-1 1, 3 1)'),
		GEOMFROMTEXT('POLYGON((0 0, 2 0, 2 2, 0 2, 0 0))')))

returns

	'MULTILINESTRING((-1 1,0 1),(2 1,3 1))'

-- Andy

On Jun 9, 2008, at 2:44 PM, Martin Davis wrote:

> I see the difference (pun intended) between difference() and split()  
> being that split returns the geometries on both sides of the  
> splitting geometry, ordered so that the user can decide which one(s)  
> he wants.  This is more powerful than difference, which only returns  
> a single geometry from one side of the split.
>
> And I agree - I don't think polygons should be reduced to their  
> boundary first - the user can do this explicitly if they require it.
>
> Andy Anderson wrote:
>> On Jun 5, 2008, at 11:43 AM, Martin Davis wrote:
>>> I think the operation also is meaningful for the case (line,  
>>> polygon) => line (this is equivalent to a combination of  
>>> line.difference(polygon) and line.intersection(polygon) )
>>
>> Ah, yes, split(line, polygon) should definitely be allowed.  
>> However, I was originally imagining that split(line, polygon), as  
>> well as split(polygon1, polygon2), should be the same as  
>> difference(*, polygon), which is geometrically consistent with the  
>> simpler forms split(*, line) and split(line, point):
>>
>>    ####
>> ----####--->   becomes   --->    --->
>>    ####                  1       2
>>
>> But I see where it might be more useful for the polygon in the  
>> second argument to be reduced to its boundary first:
>>
>>    ####                     +--+   ----####--->   becomes    
>> ----|--|--->   becomes   ---> ---> --->
>>    ####                     +--+                  1    2    3
>>
>> But the latter is easily produced with the boundary function if  
>> someone is so inclined:
>>
>> split(polygon1, polygon2) == split(polygon1, boundary(polygon2))  
>> split(line, polygon) == split(line, boundary(polygon))
>> This is more involved, but less confusing to those like me who  
>> learn from consistency. What do other people think?
>>
>>> Having thought about this with an eye to implementation, I've take  
>>> the definition a bit further:
>>>
>>> * The result of split(g1, g2) is a two-component  
>>> GeometryCollection, GC(c1, c2).
>>
>> If you define split(line, polygon) as the combination of  
>> difference(line, polygon) and intersection(line, polygon), then you  
>> could have three components, right? This would also happen with  
>> split(line1, line2) and line2 is a linear ring that crosses line1  
>> twice.
>>
>>> * The first component c1 is the portion of g1 which lies to  
>>> inside, to the left of, or before, g2 (depending on whether g2 is  
>>> a polygon, line or point).  The second component c2 is the portion  
>>> of g1 which lies outside, to the right of, or after g2. * The  
>>> types of c1 and c2 are the same as g1.  c1 and c2 may be empty, in  
>>> degenerate cases (e.g,. where one or more components of g1 are not  
>>> intersected, or intersected only at their boundary).
>>
>> It seems to me that "left" and "right" don't always work for lines,  
>> as you could also have "below" or "above". Or are you referencing  
>> the directionality of g2 here, as you traverse it? But if it's a  
>> linear ring, the inner portion will be both to the left and right.  
>> I wonder if it wouldn't make more sense to simply use the  
>> directionality of g1 instead, describing its pieces as "first",  
>> "second", etc., as I suggest in my drawings above?
>>
>> As for polygons, there's no obvious spatial ordering for them,  
>> unless you take their vertex order as significant. That seems  
>> reasonable to me. For example,
>>
>> split(POLYGON((0 0, 2 0, 2 2, 0 2, 0 0)), LINESTRING(1 -1, 1 3))  
>> becomes GEOMETRYCOLLECTION(POLYGON((0 0, 1 0, 1 2, 0 2, 0 0)),  
>> POLYGON((1 0, 2 0, 2 2, 1 2, 1 0)))
>>
>> because the first polygon has the first vertex of the original and  
>> the second polygon is built from later vertices. I'm not certain if  
>> this could be consistently defined, though; it would require  
>> comparisons of multiple vertices when one or more are shared by the  
>> split polygons.
>>
>>> * Any components of g2 which have a position which is not well- 
>>> defined relative to g2 (e.g.which are not intersected, or which  
>>> are intersected only partially), are returned as components of c1  
>>> (--- this rule is fairly arbitrary - a different strategy might  
>>> make equal sense)
>>
>> So c1 is itself a Geometry Collection? This seems unnecessarily  
>> complicated; I would prefer a flat object.
>>
>> -- Andy
>>
>>> Andy Anderson wrote:
>>>>
>>>> On Jun 4, 2008, at 11:49 AM, Martin Davis wrote:
>>>>> To answer a question you posed on your blog, the reason that  
>>>>> when you subtract a line from a polygon you get basically the  
>>>>> same polygon, rather than say the polygon split into two halves,  
>>>>> is that if the latter was provided as a MultiPolygon it would be  
>>>>> invalid, because the halves would share line segments down the  
>>>>> middle.  Also, if the line did not fully overlap the polygon  
>>>>> there's no choice - you have to return the original poly.  The  
>>>>> behaviour is thus consistent between the two cases.
>>>>>
>>>>
>>>>> Unfortunately the SFS (and no other standard I'm aware of)  
>>>>> doesn't define the precise semantics of the overlay operations.   
>>>>> So I made 'em up!  Hopefully they are consistent and  
>>>>> reasonable.  There's no doubt alternative definitions which  
>>>>> might be useful in some cases - but you have to choose one  
>>>>> definition for any given function.  (For the situation above,  
>>>>> many people would like a "cut polygon by line" operation -  
>>>>> hopefully that will get provided as a new function sometime soon).
>>>>
>>>> I've seen "split" as a term that is commonly used for such  
>>>> operations, e.g.
>>>>
>>>> geometry = ST_Split(geometry1, geometry2)
>>>>
>>>> This should be defined for geometry1 of dimensionality D ≥ 1 and  
>>>> geometry2 of dimensionality D or D-1, i.e. (polygon, polygon),  
>>>> (polygon, line), (line, line), and (line, point). The result  
>>>> should be GeometryType(geometry) == 'GeometryCollection' where  
>>>> for any n GeometryType(ST_GeometryN(geometry, n)) ==  
>>>> GeometryType(geometry1).
>>>>
>>>> -- Andy
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at postgis.refractions.net <mailto:postgis-users at postgis.refractions.net 
>>>> >
>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>
>>>
>>> -- 
>>> Martin Davis
>>> Senior Technical Architect
>>> Refractions Research, Inc.
>>> (250) 383-3022
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net <mailto:postgis-users at postgis.refractions.net 
>>> >
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>
> -- 
> Martin Davis
> Senior Technical Architect
> Refractions Research, Inc.
> (250) 383-3022
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users