[mapserver-users] Dual role of LAYER is confusing

Tue Feb 11 04:30:51 PST 2014

Hi,

I will change course towards "Defining input data for Mapserver is confusing". The dual role of LAYER is just a detail, and using a layer as input for the heatmap generator actually feels rather good. But there is something more general that smells rotten to me, and I cannot be the most negative person to say so. After all, I do know something about mapfiles: I maintain a few hundred WMS layers, and I can do many tricky things very effectively with "copy-paste-edit" and "find-replace" and with heavy use of INCLUDEs. I can say that the system is powerful, flexible and maintainable - for me. Yet each summer I find it impossible to convince whoever covers for me during my holidays of that.

Let me try to explain why defining the input data for Mapserver sometimes feels awkward. As a user, when I create a new vector layer, all input data feel the same. After all, they are just vectors and I do not care if they come from a shapefile, a database, or a tileindex. When there are hundreds of layers to maintain, it would be very nice if:

- Defining the input data always followed the same logic, and even better, the same syntax
- The input data definition could be put on one line, so it would be simple to edit the whole mapfile with a text editor and search-replace
- Changing from a shapefile to a bunch of files through ogrtileindex, or to a database, could be done with one simple edit
- The input data definition were reusable across many layers and mapfiles.

Let's look at how we define the input vector data now. I need only 6 of these methods myself; the others I collected from the Mapserver documentation. Naturally I use raster data too, which adds 4 more input types (file, shapefile tileindex, spatialite tileindex, WMS), but those feel rather convenient to me. So these are the vector data input methods, with usage examples.

1) Shapefile:
DATA "shapefile"

2) Native shapefile tileindex:
TILEINDEX "shapefile"
TILEITEM "LOCATION"

3) Tileindex through OGR
CONNECTIONTYPE OGR
TILEINDEX "/shapefile.shp,0"
TILEITEM "LOCATION"

4) Tileindex from a layer, which must be defined separately
TILEINDEX "layer"

5) Connection through OGR, in this case to Spatialite database
CONNECTIONTYPE OGR
CONNECTION "/spatialite.sqlite"
DATA "select * from layer"

6) Native SDE-connection
CONNECTION "sdemachine.iastate.edu,port:5151,sde,username,password"
CONNECTIONTYPE SDE
DATA "HOBU.STATES_LAYER,SHAPE,SDE.DEFAULT"
FILTER "where MYCOLUMN is not NULL"

7) Connection through a plugin
CONNECTIONTYPE PLUGIN
PLUGIN "msplugin_mssql2008.dll"
CONNECTION "Server=.\MSSQLSERVER2008;Database=Maps;Integrated Security=true"
DATA "ogr_geometry from rivers USING UNIQUE ogr_fid USING SRID=4326"

8) Native Oracle connection
CONNECTIONTYPE oraclespatial
DATA "MYGEOMETRY FROM MYTABLE USING UNIQUE MYTABLE_ID"

9) Native PostGIS connection
CONNECTIONTYPE POSTGIS
CONNECTION "host=yourhostname dbname=yourdatabasename user=yourdbusername
            password=yourdbpassword port=yourpgport"
DATA "geometrycolumn from yourtablename"

10) Connection to remote WFS service
CONNECTION "http://demo.mapserver.org/cgi-bin/wfs?"
CONNECTIONTYPE WFS
METADATA
  "wfs_typename"          "continents"
  "wfs_version"           "1.0.0"
  "wfs_connectiontimeout" "60"
  "wfs_maxfeatures"       "10"
END

I do not believe that some bizarre developer planned all these data input systems for Mapserver v. 0.1. Rather, they have just appeared one after another over the years.

As a user I can't say what might work and what wouldn’t, so I can only challenge the developers to think about how to make data input more user-friendly. However, I do have some thoughts.
- Individual layers are not so hard to configure, but it is irritating that there are so many different configurations.
- Fortunately, many formats are supported only through OGR, which gives the same syntax for all of them: CONNECTIONTYPE + CONNECTION + DATA.
- I wonder why DATA was not originally defined as a block, as analogous things such as PROJECTION, SYMBOL, CLASS and STYLE were. Perhaps nobody thought about anything other than shapefiles.
- We could introduce a new object "DATADEFINITION" that holds everything needed to read the input vectors: a path to a shapefile, or CONNECTIONTYPE+CONNECTION+DATA, or an ogrtileindex path.
- Like SYMBOL, DATADEFINITION (or shorter, DATADEF) could have a name.
- DATADEF would be defined like a symbol
          DATADEF
            NAME "streetdata"
            CONNECTIONTYPE OGR
            CONNECTION "/data/osm_sqlite"
            DATA "select * from lines where highway is not null"
          END
- DATADEF could be defined in the mapfile and reused by its name.
- In LAYER, one line would be enough for defining the input data source, while support for the old syntax would remain.
          LAYER
            DATADEF "streetdata"
            ......
- We have SYMBOLSET and FONTSET, and for the same reasons we could have a DATADEFSET file. DATADEFs stored in the DATADEFSET file could be reused across multiple mapfiles. Hey, only one or a handful of places to update after a password change! In practice that would make it possible to change the passwords.
Hmm, it should also be possible to encrypt the DATADEFSET file, but I leave that to the developers.
- I am not sure whether it would work, but perhaps it could be possible to separate, if needed, DATADEF and DATADEFFILTER. There is an example in the native SDE connector, which seems to have a FILTER. For OGR layers, for example, it would mean storing only CONNECTIONTYPE and CONNECTION in DATADEF. The SQL select could come from DATADEFFILTER. That would make it possible to reuse a common database connection in a mapfile and create layers by adding one line for each unique select. Unfortunately the DATADEFFILTER would probably not work if DATADEF is changed to read data from a shapefile instead of a PostGIS db. Another problem for the developers.
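
For comparison, here is a rough sketch of how far the existing INCLUDE directive already gets towards the DATADEF idea. The file name "streetdata.inc" and the layer name "streets" are made up for illustration, and the sketch assumes only that INCLUDE pastes the contents of the named file into the mapfile at that point.

          # streetdata.inc (hypothetical file): holds only the input data definition
          CONNECTIONTYPE OGR
          CONNECTION "/data/osm_sqlite"
          DATA "select * from lines where highway is not null"

          # In any mapfile that needs the same data
          LAYER
            NAME "streets"
            TYPE LINE
            INCLUDE "streetdata.inc"
            # classes and styles as usual
          END

This already gives one place to edit per data source. The difference from DATADEF is mostly that the fragment is identified by a file path rather than by a name, and that every data source needs a file of its own.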

-Jukka Rahkonen-

thomas bonfort wrote: 
> On 10 February 2014 15:13, Rahkonen Jukka  (Tike)
> <jukka.rahkonen at mmmtike.fi> wrote:
> > Hi,
> >
> > In most cases in Mapfile LAYER is defining the output of the service. However,
> > it looks like the original way to use DATA has not been flexible enough to suit
> > new sources of data and as a workaround one layer is used as an input for
> > another layer. For example if a shapefile is used as tileindex it is enough to write
> > TILEINDEX "tiger/index.shp"
> > However, if one wants to take tileindex from Spatialite it must be
> > done by defining a layer first
> > LAYER  # The tileindex layer
> >         NAME "spatialite_tileindex"
> >         STATUS OFF
> >         TYPE POLYGON
> >         CONNECTIONTYPE OGR
> >         CONNECTION "/orthophotoindex.sqlite"
> >         DATA "select * from orthophotos where year=2013"
> >         PROJECTION
> >           "init=epsg:3067"
> >         END
> > END # End of tileindex layer
> > LAYER # The orthophoto layer
> >         NAME "orthophotos_2013 "
> >         STATUS ON
> >         TILEINDEX "spatialite_tileindex"  # Name of the tileindex layer
> >         ....
> >
> > I guarantee that many if not all new Mapserver users consider this as tricky.
> > However, situation can be even more tricky. The new RFC about
> > heatmaps/density maps gives an example
> > http://mapserver.org/development/rfc/ms-rfc-108.html.
> > The source data for the heatmap layer is configured as
> >
> > LAYER
> >     NAME "heatmap"
> >     TYPE raster
> >     CONNECTIONTYPE kerneldensity
> >     CONNECTION "points"
> >
> > and here "points" is referring to a layer.  If the "points" layer is using
> > ogrtileindex where tileindex is not a shapefile but comes from a database we
> > will need three layers for publishing the heatmap:
> > - ogrtileindex layer
> > - "points" layer
> > - heatmap layer
> >
> > This is doable but confusing. What is also confusing is that I have not yet found
> > a way to hide those technical layers from WMS.
> 
> Supposing you had that use-case given your data files, how would you specify it
> in a more readable way if you didn't have this syntax?

> > Until now after making a web search during a coffee break, it is
> > naturally done with a layer level metadata item "ows_enable_request"
> > "!*" as described in
> > http://mapserver.org/de/development/rfc/ms-rfc-67.html.  Great. But
> > still I feel the dual role of "layer" a bit confusing.  At least in
> > the tileindex case I wish I could simply write just one line even if
> > with somehow complicated syntax. Perhaps it could be something like
> >
> > TILEINDEX "SELECT * from orthophotos where year=2013 USING OGR
> > CONNECTION /orthophotoindex.sqlite"
> 
> I'd argue that referencing an existing layer is less confusing. In any case it is
> more powerful, more flexible and more maintainable in our codebase.

> Interesting points nevertheless....

-Jukka-

> best regards,
> thomas
> 
> >
> > -Jukka Rahkonen-
> >
> >
> >
> >
> > _______________________________________________
> > mapserver-users mailing list
> > mapserver-users at lists.osgeo.org
> > http://lists.osgeo.org/mailman/listinfo/mapserver-users