[OpenLayers-Users] Cluster strategy : cluster coord update proposition

Tim Schaub tschaub at opengeo.org
Thu Dec 11 13:29:23 EST 2008


Hey-

Didrik Pinte wrote:
> The cluster (x,y) position is created using the first feature.geometry
> coordinates. To have a cluster that is "at the center" of the data it
> clusters, I would propose to update its position using all the feature
> added to the cluster. Something like this :
> 
>      addToCluster: function(cluster, feature) {
> +        //update pos of the cluster to the mean pos of the features
> +        var bounds = cluster.geometry.getBounds();
> +        bounds.extend(feature.geometry.getBounds());
>          cluster.cluster.push(feature);
>          cluster.attributes.count += 1;
> +        cluster.geometry = bounds.getCenterLonLat();
>      },
> 

Currently, if you have a cluster centered at 0, 0 and your cluster 
distance is 20, the cluster geometry will represent all features within 
20 pixels of the cluster.

This definition can be achieved with a simple and efficient algorithm: 
pick a location and gather all features within some distance.

If you think instead that the cluster position should be the centroid of 
all features within some distance of the first unclustered feature, then 
the resulting cluster (feature) does not have this same definition.  I'm 
not saying this is bad, but you could no longer say "this cluster 
represents all features within X distance of this point."  (Instead, the 
cluster definition would be something like "the centroid of all features 
within X distance of the first feature included in the cluster.")

And, if you want to determine the geometric center of your cluster, you 
should not do so in the way you suggest above.  That algorithm weights 
each newly added feature more heavily than the ones before it (the new 
location is the mean of the newest feature and the mean of all previous 
features).
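
To illustrate the weighting problem, here is a minimal one-dimensional 
sketch.  Both function names are hypothetical helpers, not part of 
OpenLayers, and the update rule is the one described in the paragraph 
above (average the running position with the newest value), not a copy 
of the proposed patch.

```javascript
// Hypothetical helper: update the position the way the paragraph above
// describes, averaging the running position with each new value.
function pairwiseMean(values) {
    var pos = values[0];
    for (var i = 1; i < values.length; i++) {
        pos = (pos + values[i]) / 2;
    }
    return pos;
}

// Hypothetical helper: plain arithmetic mean, every value weighted equally.
function arithmeticMean(values) {
    var sum = 0;
    for (var i = 0; i < values.length; i++) {
        sum += values[i];
    }
    return sum / values.length;
}

var xs = [0, 0, 0, 8];
var biased = pairwiseMean(xs);      // 4: the last value carries half the weight
var unbiased = arithmeticMean(xs);  // 2: each value carries a quarter
```

With the pairwise update, the most recently added value always counts 
for half of the result, no matter how many values came before it.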

So, if you want to determine the centroid of a cluster of points, keep 
them all and after the cluster is done, calculate the centroid.
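
A minimal sketch of that approach, assuming the cluster members are 
plain objects with numeric x and y properties.  The pointCentroid name 
is hypothetical and not part of OpenLayers.

```javascript
// Hypothetical helper: centroid of a finished cluster of points,
// computed once from all members so each point has equal weight.
function pointCentroid(points) {
    if (points.length === 0) {
        return null; // no members, no centroid
    }
    var sumX = 0;
    var sumY = 0;
    for (var i = 0; i < points.length; i++) {
        sumX += points[i].x;
        sumY += points[i].y;
    }
    return {x: sumX / points.length, y: sumY / points.length};
}

var members = [
    {x: 0, y: 0},
    {x: 4, y: 0},
    {x: 4, y: 6},
    {x: 0, y: 6}
];
var center = pointCentroid(members); // {x: 2, y: 3}
```

This treats every member as a single point, so it only gives a true 
centroid when the clustered features are points.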

And, if you really want the cluster to represent the centroid, then you 
should account for the full geometry of each feature in the cluster - 
not just the center point of the bounds.

I just thought the existing definition & algorithm was a nice 
combination of simple and efficient.

Tim


PS - In case the above is not clear, consider points in one dimension.

1 point at 0.  10 points at 10.  1 point at 11.  (In that order.)

Given the existing strategy and a distance of 10, the result looks like 
this:

1 cluster at 0 with 11 points.  1 cluster at 11 with 1 point.

The distance between clusters is always greater than strategy.distance.

The "geometric center" strategy with a distance of 10 (assuming the same 
algorithm) would look like this:

1 cluster at 9.0909 with 11 points.  1 cluster at 11 with 1 point.

The second results in two clusters that are about 2 units apart 
(11 - 9.0909 = 1.9091).  That feels a bit weird having specified a 
distance of 10.
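
The one-dimensional example can be reproduced with a short sketch.  The 
cluster1D and centroid1D functions are hypothetical, not the OpenLayers 
code, and the sketch assumes a point joins a cluster when its distance 
from the cluster's first point is less than or equal to the configured 
distance (which is what the counts above imply).

```javascript
// Hypothetical 1-D version of "pick a location and gather all
// features within some distance". Each cluster is anchored at the
// first point that could not join an existing cluster.
function cluster1D(points, distance) {
    var clusters = [];
    for (var i = 0; i < points.length; i++) {
        var home = null;
        for (var j = 0; j < clusters.length; j++) {
            if (Math.abs(points[i] - clusters[j].anchor) <= distance) {
                home = clusters[j];
                break;
            }
        }
        if (home) {
            home.members.push(points[i]);
        } else {
            clusters.push({anchor: points[i], members: [points[i]]});
        }
    }
    return clusters;
}

// Hypothetical helper: mean of a finished cluster's members.
function centroid1D(members) {
    var sum = 0;
    for (var i = 0; i < members.length; i++) {
        sum += members[i];
    }
    return sum / members.length;
}

// 1 point at 0, 10 points at 10, 1 point at 11, in that order.
var points = [0];
for (var k = 0; k < 10; k++) {
    points.push(10);
}
points.push(11);

var clusters = cluster1D(points, 10);
// Existing definition: clusters anchored at 0 and 11, 11 units apart.
var anchorGap = clusters[1].anchor - clusters[0].anchor;
// Centroid definition: same membership, but positions at 100/11 and 11.
var centroidGap = centroid1D(clusters[1].members) -
    centroid1D(clusters[0].members);
```

Membership is identical in both cases.  Only the reported position 
changes, and with it the gap between the two clusters.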

Again, the clustering algorithm is kept simple so that it stays 
efficient.  You can do some reading on other cluster algorithms - I just 
didn't think they would be performant enough.


> What do you think about that ? Interesting or not ? Is it worth
> submitting a patch for it ?
> 
> Didrik


-- 
Tim Schaub
OpenGeo - http://opengeo.org
Expert service straight from the developers.


