[postgis-users] quantiles, quartiles, or jenks natural

David William Bitner david.bitner at gmail.com
Thu Jan 25 08:58:45 PST 2007


Robert,

If you don't call set.seed, I believe R already picks a random seed on its
own.  I call it with a fixed seed because I needed to be able to
run the query multiple times on the same data and be sure I would get the
same results.  I've had great success using this approach with a number of
different R functions.
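
For anyone who wants to check this outside the database, here is a minimal
sketch in plain R rather than PL/R.  The class_breaks helper is a
hypothetical name used only for illustration; its body is the kmeans function
from this thread with the seed passed in, and the values are the sample array
from the example quoted below.  The exact breaks may differ between R
versions, so the sketch only checks that two runs with the same seed agree.

```r
# Sample values from the example quoted further down in this thread.
x <- c(1,1,1,1,4,3,5,6,45,3,5,7,8,6,4,3,2,1,32,6,7,5,6,7,8)

# Hypothetical helper (not part of PL/R): same logic as the kmeans
# function in this thread, with the seed as an explicit parameter.
class_breaks <- function(values, k, seed) {
  set.seed(seed)                 # fix the RNG so the starting centers repeat
  v <- sort(values)
  km <- kmeans(v, k)
  # Renumber clusters by ascending center, then take min/max of each class.
  class_rank <- match(km$cluster, order(km$centers))
  sort(unlist(tapply(v, factor(class_rank), range)))
}

a <- class_breaks(x, 4, 2007)
b <- class_breaks(x, 4, 2007)
identical(a, b)                  # TRUE: same seed and data, same breaks
```

Leaving out the set.seed line makes each run start from a fresh random state,
so two calls are no longer guaranteed to agree.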

I'm glad to see that you are getting some use out of it too.

David

On 1/25/07, Robert Burgholzer <rburghol at chesapeakebay.net> wrote:
>
> David/others,
> I have finally revisited this thread, having forgotten all about it, and
> missing the finale. I wish I had realized you solved my problem already
> (with your array_accum), as I came up with a less elegant solution --
> after hours of struggle :).
>
> Anyhow, I have modified your "kmeans" function slightly to make it a bit
> more robust (I think), allowing it to use decimals instead of integers,
> and allowing you to pass in the seed value yourself (assuming that some
> utility exists in being able to supply a random, rather than static,
> seed). That said, I don't really know if kmeans is supposed to act on
> non-integer values, but it seems to behave OK.
>
> CREATE OR REPLACE FUNCTION kmeans(double precision[], int4, int4)
>   RETURNS double precision[] AS '
> set.seed(arg3)
> km=kmeans(sort(arg1),arg2)
> sort(unlist(tapply(sort(arg1),factor(match(km$cluster,order(km$centers))),range)))
> ' LANGUAGE 'plr' VOLATILE STRICT;
>
> I have posted this code, as well as an implementation of the "quantile"
> function (now using your more robust array_accum implementation) at:
>
> http://soulswimmer.dynalias.net/db/R/r_functions.01.sql
>
> Comments, suggestions, and other R function implementations are most
> welcome.
>
> r.b.
>
>
> -----Original Message-----
> From: postgis-users-bounces at postgis.refractions.net
> [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
> David Bitner
> Sent: Thursday, March 02, 2006 12:45 PM
> To: PostGIS Users Discussion
> Subject: Re: [postgis-users] quantiles, quartiles, or jenks natural
>
> By the way, the set.seed call is there so that I get the same results
> on subsequent calls against the same dataset.  I make one call from PHP
> to PostgreSQL to create my legend with the class intervals, and another
> to divvy up my dataset into predefined class styles in my MapServer
> mapfile, so both calls need to come up with the same results.
>
> On 3/2/06, David Bitner <osgis.lists at gmail.com> wrote:
> > The main function that I made is kmeans which takes an array of the
> > values that you want to classify and the number of classes that you
> > want and spits out an array of the break points for the data.
> >
> > For example:
> > select kmeans(array[1,1,1,1,4,3,5,6,45,3,5,7,8,6,4,3,2,1,32,6,7,5,6,7,8],4)
> > returns
> > {1,2,3,4,5,8,32,45}
> > which can be interpreted as use these classes:
> > 1-2,3-4,5-8,32-45
> > which extending to have no gaps would be the same as either
> > 1-2,3-4,5-31,32-45 or
> > 1-2,3-4,5-8,9-45
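
The two gap-free variants above can be worked out mechanically from the
returned array.  A minimal sketch in plain R, assuming integer data (so
neighbouring classes are one unit apart) and that the array alternates lower
and upper bounds; the variable names are only for illustration:

```r
# Break array from the example above: alternating lower/upper bounds.
breaks <- c(1, 2, 3, 4, 5, 8, 32, 45)

lower <- breaks[c(TRUE, FALSE)]    # 1 3 5 32
upper <- breaks[c(FALSE, TRUE)]    # 2 4 8 45
n <- length(lower)

# Variant A: stretch each upper bound to just below the next lower bound.
upper_a <- c(lower[-1] - 1, upper[n])
classes_a <- paste(lower, upper_a, sep = "-")   # "1-2" "3-4" "5-31" "32-45"

# Variant B: start each class just above the previous upper bound.
lower_b <- c(lower[1], upper[-n] + 1)
classes_b <- paste(lower_b, upper, sep = "-")   # "1-2" "3-4" "5-8" "9-45"
```

For non-integer data the "minus one" and "plus one" steps would not apply;
sharing a single boundary between neighbouring classes is one alternative.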
> >
> > I generally just call this using an array_accum aggregate like this:
> > select kmeans(array_accum(myintegercol),4)  from mytable
> >
> > As I said before, I kept getting some parse errors that I haven't had
> > time to look into when I tried writing the function on multiple lines,
> > so the function is all on one line.
> >
> > CREATE OR REPLACE FUNCTION kmeans(_int8, int4)
> >   RETURNS _int8 AS
> > 'set.seed(2007);km=kmeans(sort(arg1),arg2);sort(unlist(tapply(sort(arg1),factor(match(km$cluster,order(km$centers))),range)))'
> >   LANGUAGE 'plr' VOLATILE STRICT;
> >
> > CREATE AGGREGATE array_accum(
> >   BASETYPE=anyelement,
> >   SFUNC=array_append,
> >   STYPE=anyarray,
> >   INITCOND='{}'
> > );
> >
> >
> >
> > On 3/2/06, Stephen Woodbridge <woodbri at swoodbridge.com> wrote:
> > > David,
> > >
> > > Please post it to the listserv, I would be interested also. I have
> > > yet to jump into PL/R but it is on my list to do.
> > >
> > > Thanks,
> > >    -Steve
> > >
> > > David Bitner wrote:
> > > > > I ended up jumping into the PL/R world and just created an
> > > > > aggregate wrapper around kmeans to get my class values. They
> > > > > ended up being very, very close (identical in some cases) to
> > > > > classifications that had been done with Jenks Natural Breaks.
> > > > > If you want the same results every time you run a classification
> > > > > on the same data, you need to set the same seed value for the
> > > > > random number generator before each run.
> > > > >
> > > > > It's pretty basic and my code is ugly due to some R parser errors
> > > > > that I could only get past by throwing all the code on one line
> > > > > with no spaces (hey, it worked, and I didn't have time to look
> > > > > into the parser error), but I can throw the code up if anyone
> > > > > would like.
> > > >
> > > > On 3/2/06, Robert Burgholzer <rburghol at chesapeakebay.net> wrote:
> > > >
> > > >>OK,
> > > > >>I'm coming into this late, but I am a user of PL/R and PostGIS,
> > > > >>and would appreciate it if any progress on developing
> > > > >>classification routines were posted to this list, or I would be
> > > > >>interested in being notified offline.
> > > >>
> > > >>Thanks!
> > > >>
> > > >>r.b.
> > > >>
> > > > >>-----Original Message-----
> > > > >>From: postgis-users-bounces at postgis.refractions.net
> > > > >>[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of
> > > > >>Amit Kulkarni
> > > > >>Sent: Wednesday, March 01, 2006 1:20 PM
> > > > >>To: postgis-users at postgis.refractions.net
> > > > >>Subject: Re: [postgis-users] quantiles, quartiles, or jenks natural
> > > >>
> > > > >>Sorry, I have been catching up on the past few months' emails. I
> > > > >>just want to add that I read that quantiles and minimum boundary
> > > > >>error are better than jenks. Also minimum boundary error takes
> > > > >>into account the underlying topology.
> > > > >>
> > > > >>The two being better are mentioned in
> > > > >>
> > > > >>Brewer, Cynthia A. & Pickle, Linda (2002) Evaluation of Methods
> > > > >>for Classifying Epidemiological Data on Choropleth Maps in Series.
> > > > >>Annals of the Association of American Geographers 92 (4), 662-681
> > > > >>
> > > > >>And the minimum boundary algorithm is supposedly mentioned in
> > > > >>
> > > > >>Cromley, E. K., and R. G. Cromley. 1996. An analysis of
> > > > >>alternative classification schemes for medical atlas mapping.
> > > > >>European Journal of Cancer 32A (9): 1551-59.
> > > > >>
> > > > >>Cromley, R. G., and R. D. Mrozinski. 1999. The classification of
> > > > >>ordinal data for choropleth mapping. The Cartographic Journal 36
> > > > >>(2): 101-9.
> > > >>
> > > >>HTH,
> > > >>amit
> > > >>
> > > >>
> > > >>Date: Tue, 14 Feb 2006 12:38:39 -0800
> > > >>From: Paul Ramsey <pramsey at refractions.net>
> > > >>
> > > > >>I did some in PHP, but the algorithms are relatively braindead,
> > > > >>the quantile stuff in particular.  Jenks I did some research on
> > > > >>but never really found a definitive description of the process.
> > > > >>Some of the descriptions ended up sounding like a k-means
> > > > >>clustering idea, which is not cheap!
> > > >>
> > > >>P.
> > > >>
> > > >>_______________________________________________
> > > >>postgis-users mailing list
> > > >>postgis-users at postgis.refractions.net
> > > >>http://postgis.refractions.net/mailman/listinfo/postgis-users



-- 
************************************
David William Bitner