[postgis-users] spatial distribution maps (heat maps?)

Mr. Puneet Kishor punk.kish at gmail.com
Tue Dec 6 19:09:59 PST 2011


Hi all,

Thanks to your ideas, I was able to implement spatial clustering. I queried the data via Perl and DBD::Pg (Perl is my tool of choice for most things), and used Statistics::R to bridge to R. k-means is built into R's standard stats package as kmeans(). Ten lines of code, and it was all done.

    use Statistics::R;

    # Start an R session and hand it the cluster count and the coordinates
    my $R = Statistics::R->new();
    my $clusters = 100;
    $R->set( 'n.clusters', $clusters );
    $R->set( 'lon', \@lon );
    $R->set( 'lat', \@lat );

    # Bind the coordinates into a two-column matrix and run k-means on it
    $R->run(q`sample.data <- cbind(lon, lat)`);
    $R->run(q`cl <- kmeans(sample.data, n.clusters)`);

    # Cluster centers (longitude, latitude) and the number of points per cluster
    my $lon  = $R->get( 'as.numeric(cl$centers[,1])' );
    my $lat  = $R->get( 'as.numeric(cl$centers[,2])' );
    my $size = $R->get( 'cl$size' );
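
Since the goal is to send the cluster summary to the browser, here is a minimal sketch of one way to package the three results into JSON. The helper name clusters_to_records and the use of JSON::PP are assumptions for illustration, not part of my actual code, and the coordinates below are stand-in values.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: zip the three parallel array refs (as returned by
# $R->get) into one list of { lon, lat, size } records.
sub clusters_to_records {
    my ($lon, $lat, $size) = @_;
    return [
        map {
            # "+ 0" forces numeric values so they are encoded as JSON numbers
            +{
                lon  => $lon->[$_] + 0,
                lat  => $lat->[$_] + 0,
                size => $size->[$_] + 0,
            }
        } 0 .. $#{$lon}
    ];
}

# Stand-in values for illustration only; real ones come from $R->get(...).
my $records = clusters_to_records(
    [ '-89.4', '-93.1' ],
    [ '43.07', '44.98' ],
    [ '1200',  '850'   ],
);

# canonical() sorts the keys so the output is stable between runs
my $json = JSON::PP->new->canonical->encode($records);
print "$json\n";
```

One record per cluster keeps the payload small: with 100 clusters the browser receives 100 points instead of 120K.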


It takes about 8 seconds for 120K records, and I am sure I can make it a tad faster. That said, it doesn't really matter, because once I calculate the clustering I store it on disk using Storable. Subsequent calls take a couple of hundred milliseconds.
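
The caching step can be sketched roughly as follows. This is an illustration under assumptions rather than my exact code: the helper name cached_clusters, the cache file name, and the stand-in compute routine are all hypothetical. Only store and retrieve come from Storable itself.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical helper: return the cached clustering if the cache file
# exists; otherwise run the (slow) computation and save its result.
sub cached_clusters {
    my ($cache_file, $compute) = @_;
    return retrieve($cache_file) if -e $cache_file;
    my $clusters = $compute->();
    store($clusters, $cache_file);
    return $clusters;
}

# Temporary location for this demo; a real cache would use a fixed path.
my $cache_file = tempdir(CLEANUP => 1) . '/clusters.sto';

# Stand-in for the R call; counts how often the slow path actually runs.
my $runs = 0;
my $compute = sub {
    $runs++;
    return { lon => [1.5], lat => [2.5], size => [3] };
};

my $first  = cached_clusters($cache_file, $compute);   # computes and stores
my $second = cached_clusters($cache_file, $compute);   # read back from disk
print "clustering ran $runs time(s)\n";
```

The cache never expires in this sketch, so the file has to be deleted whenever the underlying data changes.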


On Dec 6, 2011, at 1:58 AM, Phil James wrote:

> I would use something that is designed to do this analysis. R would be an obvious choice, but there are others. Using Python, you can connect to PostGIS to grab the data and then use rpy to run R commands from Python. We have used this configuration successfully with Django, and it is fast enough for the web when generating an image. This could of course be optimised to cache outputs if performance is an issue.
> 
>>> ..
>> 
>> 
>> Reading the above made me realize that I should have rephrased my question -- I don't want to create images on the server side. I realize now that what I really want to do is to do spatial clustering on the server side and then send the data to the browser. I wrote my own very naive clustering routine in Perl, and also tried Algorithm::KMeans [1]. This kind of analysis allows me to create a summary of my data that I can then plot on the client (see image at [2]).
>> 
>> Of course, my algorithm is way too naive, and waaaaay too slow, although I *can* compute the summary and cache it using Storable.
>> 
>> So, here is my rephrased question -- I am looking to do spatial clustering on my Pg data. The added complication is that I do not have access to WKTRaster.
>> 
>> 
>> [1] http://search.cpan.org/~avikak/Algorithm-KMeans-1.30/lib/Algorithm/KMeans.pm
>> [2] http://dl.dropbox.com/u/3526821/occurrences.png
>> 






