[postgis-users] Semantics of difference() function

Mon Jun 9 14:58:10 PDT 2008

Sorry, I tend to think of "sides" in a linear sense rather than a  
radial sense, so I was not interpreting your last comment properly.

In any case, you mean three operations -- difference, intersection,  
and collect:

	select st_astext(st_collect(
		st_difference(GEOMFROMTEXT('LINESTRING(-1 1, 3 1)'),  
GEOMFROMTEXT('POLYGON((0 0, 2 0, 2 2, 0 2, 0 0))')),
		st_intersection(GEOMFROMTEXT('LINESTRING(-1 1, 3 1)'),  
GEOMFROMTEXT('POLYGON((0 0, 2 0, 2 2, 0 2, 0 0))'))
	))

returns

	'GEOMETRYCOLLECTION(MULTILINESTRING((-1 1,0 1),(2 1,3  
1)),LINESTRING(0 1,2 1))'

"Total runtime: 1.270 ms"

But you really only need two, boundary and difference:

	select st_astext(
		st_difference(GEOMFROMTEXT('LINESTRING(-1 1, 3 1)'),  
ST_Boundary(GEOMFROMTEXT('POLYGON((0 0, 2 0, 2 2, 0 2, 0 0))'))))

returns

	'MULTILINESTRING((-1 1,0 1),(0 1,2 1),(2 1,3 1))'

This doesn't significantly increase performance, though:

"Total runtime: 1.207 ms"

Round-off issues should be much less, I think.

My previous case with a solid Polygon split ("the sledgehammer" rather  
than "the ring axe") has "Total runtime: 0.469 ms".

I'll be interested to see what you come up with!

-- Andy

On Jun 9, 2008, at 3:18 PM, Martin Davis wrote:

> Yes - but it doesn't provide the section LINESTRING(0 1, 2 1) inside  
> the polygon.  I envision this also being provided by split().    
> (Otherwise you have to run two operations, difference and  
> intersection - with the associated performance penalty, and also a  
> risk that round-off will cause results to be computed which don't  
> precisely match)
>
> Andy Anderson wrote:
>> Difference does seem to provide both sides of the subtractor:
>>
>>    select st_astext(st_difference(GEOMFROMTEXT('LINESTRING(-1 1, 3  
>> 1)'),
>>        GEOMFROMTEXT('POLYGON((0 0, 2 0, 2 2, 0 2, 0 0))')))
>>
>> returns
>>
>>    'MULTILINESTRING((-1 1,0 1),(2 1,3 1))'
>>
>> -- Andy
>>
>> On Jun 9, 2008, at 2:44 PM, Martin Davis wrote:
>>
>>> I see the difference (pun intended) between difference() and  
>>> split() being that split returns the geometries on both sides of  
>>> the splitting geometry, ordered so that the user can decide which  
>>> one(s) he wants.  This is more powerful than difference, which  
>>> only returns a single geometry from one side of the split.
>>>
>>> And I agree - I don't think polygons should be reduced to their  
>>> boundary first - the user can do this explicitly if they require it.
>>>
>>> Andy Anderson wrote:
>>>> On Jun 5, 2008, at 11:43 AM, Martin Davis wrote:
>>>>> I think the operation also is meaningful for the case (line,  
>>>>> polygon) => line (this is equivalent to a combination of  
>>>>> line.difference(polygon) and line.intersection(polygon) )
>>>>
>>>> Ah, yes, split(line, polygon) should definitely be allowed.  
>>>> However, I was originally imagining that split(line, polygon), as  
>>>> well as split(polygon1, polygon2), should be the same as  
>>>> difference(*, polygon), which is geometrically consistent with  
>>>> the simpler forms split(*, line) and split(line, point):
>>>>
>>>>   ####
>>>> ----####--->   becomes   --->    --->
>>>>   ####                  1       2
>>>>
>>>> But I see where it might be more useful for the polygon in the  
>>>> second argument to be reduced to its boundary first:
>>>>
>>>>   ####                     +--+   ----####--->   becomes    
>>>> ----|--|--->   becomes   ---> ---> --->
>>>>   ####                     +--+                  1    2    3
>>>>
>>>> But the latter is easily produced with the boundary function if  
>>>> someone is so inclined:
>>>>
>>>> split(polygon1, polygon2) == split(polygon1, boundary(polygon2))  
>>>> split(line, polygon) == split(line, boundary(polygon))
>>>> This is more involved, but less confusing to those like me who  
>>>> learn from consistency. What do other people think?
>>>>
>>>>> Having thought about this with an eye to implementation, I've  
>>>>> take the definition a bit further:
>>>>>
>>>>> * The result of split(g1, g2) is a two-component  
>>>>> GeometryCollection, GC(c1, c2).
>>>>
>>>> If you define split(line, polygon) as the combination of  
>>>> difference(line, polygon) and intersection(line, polygon), then  
>>>> you could have three components, right? This would also happen  
>>>> with split(line1, line2) and line2 is a linear ring that crosses  
>>>> line1 twice.
>>>>
>>>>> * The first component c1 is the portion of g1 which lies to  
>>>>> inside, to the left of, or before, g2 (depending on whether g2  
>>>>> is a polygon, line or point).  The second component c2 is the  
>>>>> portion of g1 which lies outside, to the right of, or after g2.  
>>>>> * The types of c1 and c2 are the same as g1.  c1 and c2 may be  
>>>>> empty, in degenerate cases (e.g,. where one or more components  
>>>>> of g1 are not intersected, or intersected only at their boundary).
>>>>
>>>> It seems to me that "left" and "right" don't always work for  
>>>> lines, as you could also have "below" or "above". Or are you  
>>>> referencing the directionality of g2 here, as you traverse it?  
>>>> But if it's a linear ring, the inner portion will be both to the  
>>>> left and right. I wonder if it wouldn't make more sense to simply  
>>>> use the directionality of g1 instead, describing its pieces as  
>>>> "first", "second", etc., as I suggest in my drawings above?
>>>>
>>>> As for polygons, there's no obvious spatial ordering for them,  
>>>> unless you take their vertex order as significant. That seems  
>>>> reasonable to me. For example,
>>>>
>>>> split(POLYGON((0 0, 2 0, 2 2, 0 2, 0 0)), LINESTRING(1 -1, 1 3))  
>>>> becomes GEOMETRYCOLLECTION(POLYGON((0 0, 1 0, 1 2, 0 2, 0 0)),  
>>>> POLYGON((1 0, 2 0, 2 2, 1 2, 1 0)))
>>>>
>>>> because the first polygon has the first vertex of the original  
>>>> and the second polygon is built from later vertices. I'm not  
>>>> certain if this could be consistently defined, though; it would  
>>>> require comparisons of multiple vertices when one or more are  
>>>> shared by the split polygons.
>>>>
>>>>> * Any components of g2 which have a position which is not well- 
>>>>> defined relative to g2 (e.g.which are not intersected, or which  
>>>>> are intersected only partially), are returned as components of  
>>>>> c1 (--- this rule is fairly arbitrary - a different strategy  
>>>>> might make equal sense)
>>>>
>>>> So c1 is itself a Geometry Collection? This seems unnecessarily  
>>>> complicated; I would prefer a flat object.
>>>>
>>>> -- Andy
>>>>
>>>>> Andy Anderson wrote:
>>>>>>
>>>>>> On Jun 4, 2008, at 11:49 AM, Martin Davis wrote:
>>>>>>> To answer a question you posed on your blog, the reason that  
>>>>>>> when you subtract a line from a polygon you get basically the  
>>>>>>> same polygon, rather than say the polygon split into two  
>>>>>>> halves, is that if the latter was provided as a MultiPolygon  
>>>>>>> it would be invalid, because the halves would share line  
>>>>>>> segments down the middle.  Also, if the line did not fully  
>>>>>>> overlap the polygon there's no choice - you have to return the  
>>>>>>> original poly.  The behaviour is thus consistent between the  
>>>>>>> two cases.
>>>>>>>
>>>>>>
>>>>>>> Unfortunately the SFS (and no other standard I'm aware of)  
>>>>>>> doesn't define the precise semantics of the overlay  
>>>>>>> operations.  So I made 'em up!  Hopefully they are consistent  
>>>>>>> and reasonable.  There's no doubt alternative definitions  
>>>>>>> which might be useful in some cases - but you have to choose  
>>>>>>> one definition for any given function.  (For the situation  
>>>>>>> above, many people would like a "cut polygon by line"  
>>>>>>> operation - hopefully that will get provided as a new function  
>>>>>>> sometime soon).
>>>>>>
>>>>>> I've seen "split" as a term that is commonly used for such  
>>>>>> operations, e.g.
>>>>>>
>>>>>> geometry = ST_Split(geometry1, geometry2)
>>>>>>
>>>>>> This should be defined for geometry1 of dimensionality D ≥ 1  
>>>>>> and geometry2 of dimensionality D or D-1, i.e. (polygon,  
>>>>>> polygon), (polygon, line), (line, line), and (line, point). The  
>>>>>> result should be GeometryType(geometry) == 'GeometryCollection'  
>>>>>> where for any n GeometryType(ST_GeometryN(geometry, n)) ==  
>>>>>> GeometryType(geometry1).
>>>>>>
>>>>>> -- Andy
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> postgis-users mailing list
>>>>>> postgis-users at postgis.refractions.net <mailto:postgis-users at postgis.refractions.net 
>>>>>> >
>>>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>>>
>>>>>
>>>>> -- 
>>>>> Martin Davis
>>>>> Senior Technical Architect
>>>>> Refractions Research, Inc.
>>>>> (250) 383-3022
>>>>>
>>>>> _______________________________________________
>>>>> postgis-users mailing list
>>>>> postgis-users at postgis.refractions.net <mailto:postgis-users at postgis.refractions.net 
>>>>> >
>>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at postgis.refractions.net
>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>
>>>
>>> -- 
>>> Martin Davis
>>> Senior Technical Architect
>>> Refractions Research, Inc.
>>> (250) 383-3022
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>
> -- 
> Martin Davis
> Senior Technical Architect
> Refractions Research, Inc.
> (250) 383-3022
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users