[postgis-users] Center of Points Collection

Chris Hermansen chris.hermansen at timberline.ca
Wed Sep 16 09:08:47 PDT 2009


A median in X will either yield a point in the set or the average of the 
two points bracketing the (theoretical) median.  Same in Y.  In 2D the 
medians of both will often produce a point that is not in either set, 
even in cases where there is not an interpolation going on.  Also what 
does one do about duplicate coordinates?

In the median there is clearly the concept of "half of the points being 
on one side of the median, half on the other side".  It's this 
"sidedness" in 2D that makes me fret.

Here is an example for which I can't conceive of a great solution:

POINT(1 6),
POINT(0 5),
POINT(0 4),
POINT(2 3),
POINT(2 2),
POINT(2 1),
POINT(1 0)

What is the median of that?  Simon, what your propose (medians of x and 
y) should I think give POINT(1.5 3) if duplicates are only counted once 
and POINT(1 3) if not.

Simon Greener wrote:
> What would happen if you calculated the median of the x and y 
> ordinates instead of the average?
>
> WITH points As
> (
> SELECT st_collect(column1) as the_geom
>   FROM (values ('POINT(0 0)'),
>                ('POINT(0 1)'),
>                ('POINT(0 2)'),
>                ('POINT(1 0)')) as foo
> )
> SELECT astext(st_makepoint(avg(x),avg(y)))
>   FROM ( SELECT st_x(geom) as x, st_y(geom) as y
>            FROM (select (ST_Dump(the_geom)).geom as geom from points 
> as foo) as p
>           ORDER by st_x(geom) ASC, st_y(geom) ASC
>           LIMIT (select case when mod(ST_NumGeometries(the_geom),2) = 
> 0 then 1 else 2 end from points)
>           OFFSET (select case when mod(ST_NumGeometries(the_geom),2) = 
> 0 then ST_NumGeometries(the_geom)/2-1 else 
> (ST_NumGeometries(the_geom)-1)/2 end from points)
>         ) T;
> -- where p=1 and n=N/2-1 when the number of non null rows N is even, 
> or p=2 and n=(N-1)/2 when N is odd.
>
> "POINT(0 1)"
>
> See 
> http://scorreiait.wordpress.com/2008/10/28/how-to-compute-a-median-in-sql
>
> S
> On Wed, 16 Sep 2009 05:25:29 +1000, Chris Hermansen 
> <chris.hermansen at timberline.ca> wrote:
>
>> Hmm on further thought I'm not completely clear on the definition of
>> "median" in two dimensions, either.  I found this interesting article on
>> Google, including a way cool Java app for calculating and graphing:
>>
>> http://www.tiac.net/~sw/2007/11/2D_median/index.html
>>
>> pcreso at pcreso.com wrote:
>>> Hi guys,
>>>
>>> A bit more difficult, & way out in left field, but if you use PL/R 
>>> to create a median function for Postgres, you could build your point 
>>> from the median(X) & (Y) values instead of the average.
>>>
>>> Where this would actually lie obviously depends on the distribution 
>>> of the points. The centroid is most affected by (actually defined 
>>> by) outlying points, the avg somewhat less & the median less still.
>>>
>>> Of course once you have PL/R to play with, you have much more 
>>> flexibility to look at returning statistics from datasets than just 
>>> the median.
>>>
>>> Cheers,
>>>
>>>   Brent Wood
>>>
>>>
>>>
>>>
>>> --- On Tue, 9/15/09, Kevin Neufeld <kneufeld at refractions.net> wrote:
>>>
>>>
>>>> From: Kevin Neufeld <kneufeld at refractions.net>
>>>> Subject: Re: [postgis-users] Center of Points Collection
>>>> To: "PostGIS Users Discussion" <postgis-users at postgis.refractions.net>
>>>> Date: Tuesday, September 15, 2009, 8:51 AM
>>>> Paul Ramsey wrote:
>>>>
>>>>> Faster than creating a multipoint is to recognize that
>>>>>
>>>> ST_Centroid()
>>>>
>>>>> is just going to return the center of the bbox of the
>>>>>
>>>> collection
>>>>
>>>>> anyways...
>>>>>
>>>> Unfortunately, Paul, ST_Centroid returns the center of
>>>> mass, not the center of the bbox.
>>>>
>>>> SELECT astext(st_centroid(st_collect(column1))),
>>>> FROM (values ('POINT(0 0)'),
>>>>
>>>>    ('POINT(0 1)'),
>>>>
>>>>    ('POINT(0 2)'),
>>>>
>>>>    ('POINT(1 0)')) as foo;
>>>>       astext
>>>> ------------------
>>>>  POINT(0.25 0.75)
>>>> (1 row)
>>>>
>>>> Your second post, taking the avg of the x,y does seem to be
>>>> the nice approach, and produces the same results as
>>>> ST_Centroid - the center of mass.
>>>>
>>>> SELECT astext(st_makepoint(avg(st_x(column1)),
>>>> avg(st_y(column1))))
>>>> FROM (values ('POINT(0 0)'),
>>>>
>>>>    ('POINT(0 1)'),
>>>>
>>>>    ('POINT(0 2)'),
>>>>
>>>>    ('POINT(1 0)')) as foo;
>>>>       astext
>>>> ------------------
>>>>  POINT(0.25 0.75)
>>>> (1 row)
>>>>
>>>> If Dustin is after the center of the collection, then
>>>> something along your first suggestion might be more
>>>> appropriate.
>>>> (taking the center of the extents)
>>>>
>>>> Cheers,
>>>> Kevin
>>>> _______________________________________________
>>>> postgis-users mailing list
>>>> postgis-users at postgis.refractions.net
>>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>>
>>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>
>>
>>
>
>


-- 
Regards,  (please note new mobile number below)

Chris Hermansen         mailto:chris.hermansen at timberline.ca
tel+1.604.714.2878 · fax+1.604.733.0631 · mob+1.778.840.4625
Timberline Natural Resource Group · http://www.timberline.ca
401 · 958 West 8th Avenue  · Vancouver BC · Canada · V5Z 1E5




More information about the postgis-users mailing list