[postgis-users] Postigs layer in Mapserver

Fri Nov 15 08:47:29 PST 2002

(Just an addition to Chris's and Horst's replies, with a few ramblings 
of my own)

Oscar Testa wrote:
 > DATA "the_geom from (select locarg.the_geom, locarg.oid, count(*) as
 > cantidad from locarg, lugar where locarg.codpostal = ugar.codigopostal
 > group by 1,2) as prueba"

You can't combine a COUNT() statement with a geometry. Count summarizes 
the fields described by GROUP BY. For example, if you have a table with 
has population data containing rows for many people with the same postal 
code , you can do:

select count (*) from data
group by postal_code

and you wil get back something like

count
------
   123
   432
77786
   445
  ...
  ...

This gives the number of people for each postal code.
You cannot just add the postal_code field to this query, because there 
are many persons for every postal code. If you want to see the postal 
code for each total, you have to use an aggregate function for that too, 
like min:

select min(postal_code), count(*) from data
group by postal_code

which will give back:

min   | count
---------------
1011      123
1012      432
1014    77786
1015      445
...       ...
...       ...

For a better layout, rename the aggregate column names:

select min(postal_code) as postal_code, count(*) as total_pop from data
group by postal_code:

postal_code | total_pop
--------------------
1011         123
  ...         ...

And now the big leap: how to plot this aggregated table with 
MapServer/PostGIS. Just as with the postal code, you cannot just put in 
a geometry field in the population table, because there are many 
different rows with data for the same location. Instead you use a second 
table (let's call it zipcodes), in which there are two field: the postal 
code and the geometry (its central coordinate in most cases).  You can 
link this table to the computed aggregate table like this:

select the_geom from
zipcodes,
(select min(postal_code) as aggr_postal_code, count(*) as total_pop
      from data
      group by postal_code
) as aggrdata
where zipcodes.postal_code = aggrdata.postal_code

The thing to remember here is that ALL the computed variables of the 
inner select AND the inner select itself have to be named explicitly (as 
aggr_postal_code, as total_pop, as aggrdata). That way MapServer can see 
them.

A MapServer CLASS statement using this DATA could be

LABELITEM aggr_postal_code
CLASS
    EXPRESSION "([total_pop] < 1000)"
    COLOR 30 50 60
    SYMBOL ...
    SIZE ...
END
CLASS
    EXPRESSION "([total_pop] < 2000)"
    COLOR ...
    SYMBOL ...
    SIZE ...
END
...
etc

This wil return a map with symbols at the postal code locations, sized 
accorded to the number of people on that location.

This is pretty hefty SQL, but it is worth some hard thinking to figure 
out how it works. Marketing analysis with GIS is done almost exclusively 
this way. The beauty about PostGIS is the close integration between 
database queries and mapping. It's very easy (by scripting the DATA 
statement) to get all kinds of related spatial distributions this way 
(partial populations, male/female distributions, customer types, just 
take your choice). Of course, this can be done by desktop applications 
like ArcView or MapInfo, but with considerably more effort and not as 
elegant or generalized, and not for WebMapping. You need at least 
ArcIMS/Oracle Spatial for that, and I don't think they will do it as 
elegantly as MapServer/PostGIS (or as fast, if you manage your indexes 
well).

Jan Hartmann
Department of Geography
University of Amsterdam
jhart at frw.uva.nl