[mapserver-dev] RE: [mapserver-users] colorramp and datarange on the fly?

Stephen Woodbridge woodbri at swoodbridge.com
Thu Feb 4 13:38:33 EST 2010


To build on what Steve is saying it might help to think of features in 
mapserver as requiring two separate components:

1) the algorithm
2) the presentation

Obviously you cannot present something that you do not have. The 
algorithm is how you generate stuff. Once we have stuff we need to 
present it via the multitude of rendering modules and interfaces, like 
the CGI interface or template interface, etc., rendered as text, JSON, 
XML, etc.

I think adding a STATS object might be a good way to add pluggable 
algorithms and filters into the layer.

A year plus ago, I did some research into thematic cartography and the 
math and statistics involved in that.

Here are some of the data classification methods we might want to support:

1) equal intervals
2) quantiles (quartiles are n=4 quantiles)
3) mean-standard deviation
4) maximum breaks
5) natural breaks
6) optimal

You can probably find more on these on Wikipedia if you want the math 
behind them.
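
To make the first three concrete, here is a minimal Python sketch of how 
the break points could be computed. None of this is mapserver code; the 
function names are made up for illustration, and the mean-standard 
deviation variant assumes breaks at -1, 0 and +1 standard deviations, 
which is only one common choice.

```python
import bisect
import statistics

# All function names here are hypothetical; this is not a mapserver API.

def equal_interval_breaks(values, n_classes):
    """Interior break points that split [min, max] into equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]

def quantile_breaks(values, n_classes):
    """Interior break points so each class holds roughly the same count."""
    # statistics.quantiles returns n - 1 cut points for n groups.
    return statistics.quantiles(values, n=n_classes, method="inclusive")

def mean_stddev_breaks(values, multiples=(-1.0, 0.0, 1.0)):
    """Break points at the mean plus the given multiples of the std deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [mean + m * sd for m in multiples]

def classify(value, breaks):
    """0-based class index; a value equal to a break goes in the upper class."""
    return bisect.bisect_right(breaks, value)
```

Once the breaks are known, classify() gives the class index that a 
renderer would use to pick a colour or symbol.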

In addition to the above, for choropleths we might want to consider 
bivariate or multivariate analysis.

And there is also a need to apply filters to the data like:

1) eliminate outliers
2) numerous smoothing algorithms that might be appropriate
   - triangulation
   - inverse distance
   - kriging
   - various interpolation routines that solve various issues with the data
3) probably others
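
As a rough illustration of two of these filters, here is a small Python 
sketch. It is only a sketch under assumptions: outliers are defined as 
values more than k standard deviations from the mean (one of several 
possible rules), inverse distance weighting uses a plain power-of-distance 
weight, and the function names are invented for the example.

```python
import math
import statistics

# Hypothetical helper names; not part of mapserver.

def drop_outliers(values, k=2.0):
    """Keep values within k population standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return list(values)  # all values identical, nothing to drop
    return [v for v in values if abs(v - mean) <= k * sd]

def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (sx, sy, value) tuples.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a sample point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den
```

Kriging and the other interpolation routines need a good deal more 
machinery than this, which is part of why the scope question matters.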

You might look on Amazon used books for some reference material on 
thematic mapping. I got the following:

Introduction to Thematic Cartography, by Judith Tyner
Thematic Cartography and Visualization, by Terry Slocum
Thematic Cartography and Geographic Visualization, by Slocum, McMaster, 
Kessler and Howard

These are all excellent and there are a lot of others.

I think the point here is that a lot of research has gone into this 
field. If we want to build tools that will allow mapserver to work 
effectively in this area, and I DO think we should do this, then we need 
to understand the scope of what needs to be done, even if it is not all 
done at once. We also need to build a pluggable interface that allows us 
to easily add stats analysis and apply statistical filters as the need 
arises. I think for the most part what Steve L has described will meet 
(I'm guessing) 80-90% of the requirements that might reasonably be 
thrown at us, and the others we can most likely say are too specialized 
or out of scope.
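
By "pluggable" I mean something along these lines, sketched in Python 
rather than C just to show the shape. The registry, the decorator and the 
method names are all hypothetical; the idea is only that a new 
classification method gets registered under a name and the caller selects 
it by that name.

```python
import statistics
from typing import Callable, Dict, List, Sequence

# Hypothetical registry; nothing like this exists in mapserver today.
Classifier = Callable[[Sequence[float], int], List[float]]
_CLASSIFIERS: Dict[str, Classifier] = {}

def register_classifier(name: str):
    """Decorator that adds a break-point function to the registry."""
    def decorator(func: Classifier) -> Classifier:
        _CLASSIFIERS[name] = func
        return func
    return decorator

@register_classifier("equal_interval")
def equal_interval(values, n_classes):
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]

@register_classifier("quantile")
def quantile(values, n_classes):
    return statistics.quantiles(values, n=n_classes, method="inclusive")

def compute_breaks(method: str, values, n_classes: int) -> List[float]:
    """Look up a registered method by name and run it."""
    try:
        classifier = _CLASSIFIERS[method]
    except KeyError:
        raise ValueError(f"unknown classification method: {method!r}") from None
    return classifier(values, n_classes)
```

Adding natural breaks later would then be one more registered function, 
with no change to the code that calls compute_breaks().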

I will also note that for the most part it can be argued that mapserver 
should NOT do the analysis, that that is the realm of other tools, but 
given that the analysis has been performed, mapserver needs to support 
rendering it. Some of the analysis mentioned above is non-trivial to do 
on the fly. It is not clear to me where we draw a line in the sand and 
say yes, this is in mapserver; no, that is not in mapserver.
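
At the cheap end of that spectrum is the brute-force two-pass idea quoted 
further down the thread: loop through the features once to get the range, 
then again to draw. A minimal Python sketch, assuming features are plain 
dicts and the ramp is a linear blend between two RGB colours (the function 
names and the feature layout are made up for the example):

```python
# Hypothetical names; features are plain dicts for illustration only.

def data_range(features, item):
    """Pass 1: scan every feature once for the min and max of one attribute."""
    values = [f[item] for f in features]
    return min(values), max(values)

def ramp_color(value, lo, hi, start_rgb, end_rgb):
    """Linearly interpolate between two RGB colours for a value in [lo, hi]."""
    if hi == lo:
        t = 0.0  # degenerate range, use the start colour
    else:
        t = (value - lo) / (hi - lo)
        t = max(0.0, min(1.0, t))  # clamp values outside the range
    return tuple(round(s + (e - s) * t) for s, e in zip(start_rgb, end_rgb))

def render_colors(features, item, start_rgb=(255, 255, 255), end_rgb=(255, 0, 0)):
    """Pass 2: assign each feature a colour using the range from pass 1."""
    lo, hi = data_range(features, item)
    return [ramp_color(f[item], lo, hi, start_rgb, end_rgb) for f in features]
```

This is simple and driver independent, but it reads the data twice, which 
is exactly the performance trade-off being discussed.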

Ok, sorry, this is probably way more than any of you want to know :)

-Steve W

Lime, Steve D (DNR) wrote:
> You’re always mixing things up aren’t you…
> 
> I don’t think you need a  special mode. This strikes me as just another 
> special form of template output based on a query.  Presently you can get 
> layer level  query data (e.g. number of results) and of course spatial 
> and non-spatial attributes for each feature. What’s needed here is the 
> mechanism to gather layer stats (and a definition of what those are) in 
> either draw or query context and then a means of using or presenting 
> them.  Imagine a new “stats” tag within a result set like so (within 
> a resultset tag):
> 
>     [stats item="foo" format="$stddev,$min,$max"]
> 
> You could use that to populate XML, JSON or just plain HTML output for 
> use on the client. This tag would draw from a populated layerStatsObj 
> which might consist of an array or hash of itemStatsObj’s…  You could 
> populate those objects on-the-fly or perhaps even from a file. Would 
> probably need a new STATS … END block within a layer to drive this. Just 
> thinking out loud.
> 
> Steve
> 
> *From:* Bob Basques [mailto:Bob.Basques at ci.stpaul.mn.us]
> *Sent:* Thursday, February 04, 2010 8:19 AM
> *To:* Lime, Steve D (DNR); Stephen Woodbridge; Jan Hartmann
> *Cc:* MapServer Dev Mailing List; mapserver-users at lists.osgeo.org
> *Subject:* RE: [mapserver-dev] RE: [mapserver-users] colorramp and 
> datarangeon the fly?
> 
>  
> 
> All,
> 
>  
> 
> Just to mix it up a little, what about doing a half and half approach? 
>  MapServer could generate something simpler (the data basics from a data 
> read on the fly) and return a simple ramping config file, which could 
> then be passed back as a file for the custom aspects.
> 
>  
> 
> Just having these two options would open up some doors. MapServer 
> returning a simple ramp structure as a file could be used in a user 
> interface to do more complicated theming, for example.  Once the basic 
> data limits are known, the ramping divisions are much easier to decide on.
> 
>  
> 
> Could this work as a compiled-in module for those that need/want it, 
> instead of always being in? Or would it make more sense to build it in? 
> Something similar to a mode=legend sort of thing, like mode=ramp_txt?
> 
>  
> 
> bobb
> 
>  
> 
> 
> 
>  >>> "Lime, Steve D (DNR)" <Steve.Lime at state.mn.us> wrote:
> 
> At one point I toyed with the idea of supporting a .stats file for a 
> layer and supplying a routine (command-line) that would populate it. The 
> file would contain data like you're talking about. Doesn't help with the 
> on-the-fly needs Bart was articulating.
> 
> Steve
> 
> -----Original Message-----
> From: mapserver-users-bounces at lists.osgeo.org 
> [mailto:mapserver-users-bounces at lists.osgeo.org] On Behalf Of Stephen 
> Woodbridge
> Sent: Wednesday, February 03, 2010 10:39 AM
> To: Jan Hartmann
> Cc: Lime, Steve D (DNR); MapServer Dev Mailing List; 
> mapserver-users at lists.osgeo.org
> Subject: Re: [mapserver-dev] RE: [mapserver-users] colorramp and 
> datarange on the fly?
> 
> Right, I think there are two use cases:
> 
> 1) data exploration   - can be slower but needs flexibility
> 2) production serving - needs to be fast, and probably limits
> flexibility to some predefined models
> 
> I think that there is also another angle to this, which is how the
> summary data is computed for example:
> 
> 1) min/max/average/std
> 2) statistical analysis
> 3) binning into some number of classes
> 4) removing outliers so the results are not skewed by them
> 5) etc
> 
> There are a lot of ways that people might need to summarize their data.
> 
> If the data is in a database, then you can add all the analysis, slicing
> and dicing to the database and the rendering to mapserver.
> 
> So, I think that it would be nice to be able to read some "metadata"
> about a layer and then use that for building the display using something
> like colorramp and datarange. We might want to look at ways that we
> could establish in mapserver for fetching the "metadata" about a layer.
> For example:
> 
> 1) define the "metadata" in the METADATA object
> 2) define a .met file for a shapefile or tileindex that contained the
> "metadata" for that layer
> 3) define a separate SQL query that could be used to fetch the
> "metadata" for the layer
> 4) something similar for other layer providers.
> 5) scan the data in two passes to compute some simple "metadata"
> 
> Thoughts?
> 
> -Steve W
> 
> 
> Jan Hartmann wrote:
>  > If you allow two passes, you can have all sorts of summarized values in
>  > the template, to be used in the second pass, like Bart's actual extent
>  > used for coloring. Doesn't look too difficult to implement to me, as long
>  > as the two passes only get called when really necessary. I'm not sure if
>  > performance is an issue for MapServer itself: if you really want high
>  > performance, you should use the underlying format or database directly.
>  >
>  > Jan
>  >
>  > On 3-2-2010 17:02, Lime, Steve D (DNR) wrote:
>  >> How big a change would depend on the implementation. The brute force
>  >> approach where you simply loop through features once to compute ranges
>  >> and then again to draw would probably be pretty straightforward and
>  >> driver independent. Wouldn't be fast (but would be simple). Complexity
>  >> would be added as you try and boost performance by:
>  >>
>  >>    - allowing drivers to compute stats in their own way (e.g. add to
>  >> the layer API something like msLayerGetStats(...))
>  >>    - caching geometries from a first pass through the shapes for the
>  >> second
>  >>
>  >> Steve
>  >>
>  >> BTW The color ramp support needs to be cleaned up first. I think we
>  >> scared the originator of that code away when an RFC was originally put
>  >> together.
>  >>
>  >> -----Original Message-----
>  >> From: mapserver-users-bounces at lists.osgeo.org
>  >> [mailto:mapserver-users-bounces at lists.osgeo.org] On Behalf Of Bart van
>  >> den Eijnden
>  >> Sent: Wednesday, February 03, 2010 5:12 AM
>  >> To: mapserver-users at lists.osgeo.org
>  >> Subject: [mapserver-users] colorramp and datarange on the fly?
>  >>
>  >> Hi list,
>  >>
>  >> is it possible to have a colorramp in Mapserver based on the min and
>  >> max value in the current extent?
>  >>
>  >> So instead of predefining the min and max in DATARANGE, have Mapserver
>  >> use the min and max value of the dataset in the current extent?
>  >>
>  >> If not, would it be an easy change or a very complex one?
>  >>
>  >> Best regards,
>  >> Bart
>  >> _______________________________________________
>  >> mapserver-users mailing list
>  >> mapserver-users at lists.osgeo.org
>  >> http://lists.osgeo.org/mailman/listinfo/mapserver-users
>  >> _______________________________________________
>  >> mapserver-dev mailing list
>  >> mapserver-dev at lists.osgeo.org
>  >> http://lists.osgeo.org/mailman/listinfo/mapserver-dev
>  >>   
> 