[mapserver-commits] r10999 - trunk/docs/en/development/rfc

Sat Feb 19 13:36:00 EST 2011

Author: tamas
Date: 2011-02-19 10:36:00 -0800 (Sat, 19 Feb 2011)
New Revision: 10999

Added:
   trunk/docs/en/development/rfc/ms-rfc-69.txt
Modified:
   trunk/docs/en/development/rfc/index.txt
Log:
Adding MS-RFC-69 initial revision


Modified: trunk/docs/en/development/rfc/index.txt
===================================================================

--- trunk/docs/en/development/rfc/index.txt	2011-02-18 15:51:47 UTC (rev 10998)
+++ trunk/docs/en/development/rfc/index.txt	2011-02-19 18:36:00 UTC (rev 10999)
@@ -78,3 +78,4 @@
    ms-rfc-66
    ms-rfc-67
    ms-rfc-68
+   ms-rfc-69

Added: trunk/docs/en/development/rfc/ms-rfc-69.txt
===================================================================
--- trunk/docs/en/development/rfc/ms-rfc-69.txt	                        (rev 0)
+++ trunk/docs/en/development/rfc/ms-rfc-69.txt	2011-02-19 18:36:00 UTC (rev 10999)
@@ -0,0 +1,207 @@
+.. _rfc69:
+
+=========================================================================
+MS RFC 69: Support for clustering of features in point layers
+=========================================================================
+
+:Date:  2011/02/19
+:Author: Tamas Szekeres
+:Contact: szekerest at gmail.com
+:Last Edited: $Date$
+:Status: Discussion Draft
+:Version: MapServer 6.0
+:Id: $Id$
+
+Description: This RFC proposes an implementation for clustering multiple features
+from a layer into single (aggregated) features based on their relative positions.
+
+1. Overview
+-----------
+
+To keep a map readable at a given view, we may need to limit the number of
+features rendered at neighbouring locations, which would otherwise overlap each
+other. MapServer currently has no mechanism that prevents symbols from
+overlapping based on their relative locations. A feasible solution should render
+isolated symbols as is, but create new (clustered) features for those symbols
+that would overlap at a particular scale.
+
+3. The proposed solution
+------------------------
+
+This functionality will be implemented as a separate layer data provider (implemented in mapcluster.c).
+This provider will be used internally (invoked from msDrawVectorLayer), and it could
+also be compiled as a separate layer data provider (i.e. a plugin).
+In the first case the clustering parameters are specified in the same layer that
+actually provides the data, for example:
+
+::
+
+  LAYER
+    TYPE POINT
+    CONNECTIONTYPE OGR
+    NAME cluster
+    PROCESSING "CLUSTERMAXDISTANCE=10"
+    PROCESSING "CLUSTERREGION=ellipse"
+    ...
+  END
+
+In the second case the data provider can be used as a separate layer (compiled as a plugin),
+as in the following example:
+
+::
+
+  LAYER
+    TYPE POINT
+    CONNECTIONTYPE PLUGIN
+    PLUGIN "msplugin_cluster.dll"
+    CONNECTION "sourcelayer" # reference to the source layer
+    NAME combine
+    PROCESSING "CLUSTERMAXDISTANCE=10"
+    PROCESSING "CLUSTERREGION=ellipse"
+    ...
+  END
+  LAYER
+    TYPE POINT
+    CONNECTIONTYPE OGR
+    NAME sourcelayer
+    ...
+  END
+
+We will create a single implementation that is suitable for both cases.
+The differences will be isolated by the USE_CLUSTER_EXTERNAL define setting.
+
+3.1 Implementing the single layer approach
+------------------------------------------
+
+In the single layer approach, msDrawVectorLayer will open this new layer data provider
+depending on the existence of the clustering options, something like:
+  
+::
+
+  if (msLayerGetProcessingKey( layer, "CLUSTERMAXDISTANCE" ))
+    status = msClusterLayerOpen(layer);
+  else
+    status = msLayerOpen(layer);
+
+
+In msClusterLayerOpen the data provider will override the vtable functions so that
+the subsequent LayerWhichShapes/LayerNextShape/LayerClose (and some further) functions
+are handled by this provider and not by the original data source.
+The clustering itself is performed in the LayerWhichShapes call. This is the only
+place where the features are retrieved from the original data source; they are then cached
+in the local clustering database (stored in layerinfo).
+
+The clustering process itself will be implemented in the following way:
+
+1) For each feature we create a tentative cluster and initialize its aggregate attributes
+   (such as the feature count, the average position and the variance). The features are added to
+   a customized quadtree data structure, which provides quick access when searching for the
+   neighbouring shapes.
+2) For each feature we retrieve all the neighbouring shapes (those that have already been retrieved)
+   within the specified distance (CLUSTERMAXDISTANCE) and search shape (CLUSTERREGION) by using a quadtree search.
+   In each of the intersecting clusters we update the feature count (n), the average position (avg)
+   and the variance (var) by using the following recursive formula:
+   
+::
+
+  n = n + 1
+  avg(n) = avg(n-1) * (n-1) / n + x(n) / n
+  var(n) = var(n-1) * (n-1) / n + pow2(x(n) - avg(n)) / (n-1)
+
+3) In a second pass we rank the tentative clusters based on their feature count, the offset of the
+   average position relative to the initial position, and the variance. The best ranking clusters are
+   identified by minimizing the position offset and the variance. The individual features (having rank=0)
+   will be retrieved first in this approach.
+4) The best ranking clusters are added to the finalization list (in layerinfo), and the finalized
+   clusters (and the related features) are removed from the quadtree.
+5) Based on the finalized features we update the average position and the variance of the affected
+   clusters that still exist in the quadtree.
+6) Repeat from #4 while features remain in the quadtree.
+   
+The finalized features are served from the finalization list, which is preserved while the layer is open.
+In LayerClose the vtable of the layer will be restored to the original methods
+(by calling msInitializeVirtualTable).
+
+3.1 Implementing the multiple layer approach
+--------------------------------------------
+
+In this case mapcluster.c will be compiled as a plugin which sets the vtable methods during
+plugin initialization. The clustering itself is performed in the LayerWhichShapes call.
+The features are accessed by opening the source layer (as referenced in the CONNECTION parameter),
+but the bulk of the clustering process is the same as described in the previous section.
+
+3.2 Handling the feature attributes (items)
+-------------------------------------------
+
+The cluster layer itself will provide only the "Cluster:FeatureCount" aggregated attribute
+(which can be used to configure the labels to contain the feature count of the clustered shape).
+However, the cluster layer will also support retrieving further attributes from the original
+data source as referenced in the LAYER configuration.
+The ITEMS processing option can be used to specify the set of attributes according to the user's preference.
+
+3.2 Projections
+---------------
+
+In the multiple layer (plugin) configuration the cluster layer data provider
+will support transforming the feature positions between the layers. The clustering itself
+takes place in the projection of the cluster layer.
+
+3.3 Handling classes and styles
+-------------------------------
+
+The symbology and labelling of the cluster layers can be defined in the same way as for any other layer, by specifying
+the classes and styles. STYLEITEM AUTO is not planned to be supported at this phase.
+
+3.4 Query processing
+--------------------
+
+In the single layer approach the clustering happens only when rendering the layer (background); the query
+itself operates on the original data source and retrieves all the features within the specified region.
+In the multiple layer approach the query is executed on the cluster layer, so the clustered features can
+be retrieved (as displayed on the screen). The clustered features are preserved while the layer is open,
+so the single pass query approach is provided by the driver.
+
+4. Implementation Details
+-------------------------
+
+In order to implement this enhancement the following changes should be made in the MapServer codebase:
+   
+1) Modify mapdraw.c to invoke msClusterLayerOpen based on the existence of the clustering parameters.
+2) Implement mapcluster.c containing the code of the cluster layer data source.
+
+4.1 Files affected
+------------------
+
+The following files will be modified/created by this RFC:
+
+::
+
+  Makefile.vc
+  Makefile.in
+  mapcluster.c (new)
+
+4.2 MapScript Issues
+--------------------
+
+There's no need to modify the MapScript interface within the scope of this RFC.
+
+4.3 Backwards Compatibility Issues
+----------------------------------
+
+This change provides new functionality; no backwards compatibility issues are expected.
+
+5. Bug ID
+---------
+
+The ticket for RFC-69 (containing implementation code) can be found here.
+
+Bug 3700_
+ 
+.. _3700: http://trac.osgeo.org/mapserver/ticket/3700 
+
+6. Voting history
+-----------------
+
+None
+
+


Property changes on: trunk/docs/en/development/rfc/ms-rfc-69.txt
___________________________________________________________________
Added: svn:keywords
   + Date Id
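
As a rough illustration of the recursive update given in step 2 of the RFC text above, the following Python sketch applies the avg/var recursion to a single coordinate and compares the result with a direct computation. The `RunningStats` class, the `direct_stats` helper and the sample values are hypothetical and not part of MapServer. The sketch assumes that `var` is the population variance and that the variance update starts at n = 2, since the formula divides by (n-1).

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running aggregates for one coordinate of a tentative cluster (hypothetical helper)."""
    n: int = 0
    avg: float = 0.0
    var: float = 0.0

    def add(self, x: float) -> None:
        """Fold one more value into the aggregates using the recursion from the RFC."""
        self.n += 1
        n = self.n
        if n == 1:
            # Assumption: the first member initializes the aggregates,
            # because the variance formula divides by (n-1).
            self.avg = x
            self.var = 0.0
            return
        # avg(n) = avg(n-1) * (n-1) / n + x(n) / n
        self.avg = self.avg * (n - 1) / n + x / n
        # var(n) = var(n-1) * (n-1) / n + pow2(x(n) - avg(n)) / (n-1)
        self.var = self.var * (n - 1) / n + (x - self.avg) ** 2 / (n - 1)


def direct_stats(values):
    """Population mean and variance computed in one pass over all values, for comparison."""
    n = len(values)
    avg = sum(values) / n
    var = sum((v - avg) ** 2 for v in values) / n
    return avg, var


# Sample x coordinates of features falling into the same tentative cluster.
xs = [10.0, 12.0, 11.0, 15.0]
stats = RunningStats()
for x in xs:
    stats.add(x)
expected_avg, expected_var = direct_stats(xs)
```

For these sample values the incremental aggregates agree with the direct computation, which suggests that a cluster can be updated one feature at a time without revisiting the features already added to it.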