[Aust-NZ] NZ Geo Placenames

Peter webwiz at pl.net
Fri Apr 8 01:15:55 EDT 2011


Francis Markham wrote:
> Is "linz geonames" the same as the NZ Gazetteer?  They have data
> available at:
> http://www.linz.govt.nz/placenames/find-names/nz-gazetteer-official-names/index.aspx
>
> Cheers,


I haven't compared them line by line, but having used both for various applications I understand them to be related and derived from the same dataset. Note that they are very poor, especially if you try to do things like compute which TLA polygon Thames or Coromandel are in. Answer: none, they are in the sea.

FTR, no one told me till now that the NZ Fire Service maintains a supposedly accurate placenames layer as part of their extensive emergency services suburbs/localities dataset. From looking around I found out that it is supplied by Terralink for free but under a restrictive license. Having dealt with TLA data on this sort of basis, I imagine that involves saying what you want it for and signing a declaration.

From what I've heard, this effectively precludes reuse or derived works of just about any kind.

Also FTR, I have made a start on this task and the metadata is below. If anyone wants to preview the dataset to date, let me know; I'm always open to feedback. Otherwise I will be releasing it when it reaches 1.0.

Peter

PS. Does anyone know if there is any satellite data that I'm allowed to load as a WMS layer in QGIS?
I discovered the Google layer plugin but it's tedious to use (and I imagine illegal).

--------------------

ps-placenames.xls  metadata

This dataset comprises a single points layer. When the dataset reaches 1.0 status it will be released as .shp and .csv but for now it is in .xls format. It is very much a work in progress.

Currently it comprises a subset of the LINZ geographic placenames shape layer, modified as follows:

The point features are only those with a DESC_CODE attribute of METR, TOWN, or POPL.
They were downloaded from Koordinates in NZTM, and the dataset remains in NZTM at this time.
The data was loaded into QGIS and overlaid on the following layers as a base reference:
  
   a. original LINZ placenames
   b. OSM nz-locations points layer, downloaded in WGS84 shape format and converted to NZTM by QGIS
   c. Google geocoded placename equivalents, for reference only (WGS84 conversion as above). No Google coordinates appear in this dataset.
   d. Zenbu POIs, latest set as of 5/4/11, downloaded in CSV format and imported using the QGIS delimited text plugin, after removing all Chatham Islands points and those with 0,0 values.
   d. Zenbu AllSuburbsRegions dataset (a heavily hand-modified dataset derived from a LINZ BDE extract), courtesy of Zenbu.
   e. LINZ road-centerlines, sealed and highway, ex Koordinates
   f. LINZ residential areas
   g. LINZ building-locations and building footprints
   g. Olivier and Co nz-urban-north and south polygons, downloaded from Koordinates in same manner as above.
  
Therefore in practice sources d, e and g form the effective basis of this dataset. You will find some example methodology PNGs enclosed.
Be aware that e and g are referenced to the LINZ roading data, while d is likely referenced to whatever roading dataset Google possesses and is possibly affected by Google's spherical Mercator projection. As such, some minor discrepancies may occur when moving from one to the other. One would expect the LINZ road lines to be the source of Google's road lines, but I doubt it. The LINZ road lines are known to be a reasonably old and rough dataset.
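
As a rough illustration of two of the filtering steps above (keeping only METR, TOWN and POPL features, and dropping POIs with 0,0 coordinates), here is a minimal Python sketch. The actual work was done in QGIS; the function names, the X/Y column names for the POI file, and the sample rows are all assumptions made up for illustration.

```python
import csv
import io

# DESC_CODE values kept in this dataset (from the description above)
KEEP_CODES = {"METR", "TOWN", "POPL"}

def filter_placenames(rows):
    """Keep only rows whose DESC_CODE is METR, TOWN or POPL."""
    return [r for r in rows if r["DESC_CODE"].strip().upper() in KEEP_CODES]

def drop_zero_points(rows, x_field="X", y_field="Y"):
    """Drop rows whose coordinates are exactly 0,0 (treated as missing)."""
    kept = []
    for r in rows:
        x, y = float(r[x_field]), float(r[y_field])
        if x == 0.0 and y == 0.0:
            continue
        kept.append(r)
    return kept

def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

# Made-up sample data, not real records from any of the sources
SAMPLE = """ID,NAME,DESC_CODE,X,Y
1,Exampletown,TOWN,1800000,5900000
2,Example Hill,HILL,1800500,5900500
3,Exampleville,POPL,0,0
"""

rows = read_rows(SAMPLE)
places = filter_placenames(rows)   # drops the HILL row
clean = drop_zero_points(places)   # drops the 0,0 row
```

The two filters are kept separate because they apply to different inputs in the description: the DESC_CODE filter to the LINZ placenames, the 0,0 cleanup to the Zenbu POIs.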

Regardless of the above, this dataset was created using the following criteria, in order of priority:

- attempts to represent the present (2011) subjective 'center' of each place as defined by its commercial/retail center
- i.e. main streets where they exist, or any kind of central retail cluster, even a couple of shops in very small places.
- the coordinate is almost always at the junction of two or more roads.
- most of the time the coordinate is at or near the centroid of the POI cluster
- failing any significant retail presence, the coordinate tends to be placed near the main road junction to the community.
- also taken into account, when the above criteria fail to yield a definitive answer, are the centroids of:
   . the urban polygons
   . the clusters of building footprints/locations.

To be clear, the coordinates dataset is manually produced by eye without any kind of computation. As such the points are placed approximately, perhaps plus or minus 10m, but given that the roads layer is not that flash, no attempt was made to snap the coordinates to the road junctions themselves. However, on balance the dataset is of significantly better quality than its LINZ forebear.
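
Although the points were placed by eye, a computed POI centroid could serve as a cross-check on individual placements. The sketch below is one assumed way of doing that and is not part of the method described above; the coordinates are made up and are treated as planar NZTM metres.

```python
import math

def centroid(points):
    """Mean of (x, y) pairs; adequate as a centroid for a small planar cluster."""
    if not points:
        raise ValueError("need at least one point")
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def offset_m(a, b):
    """Straight-line distance in metres between two planar coordinates."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical POI cluster and hand-placed point (NZTM-style metres)
pois = [(1800000.0, 5900000.0), (1800040.0, 5900000.0),
        (1800040.0, 5900030.0), (1800000.0, 5900030.0)]
hand_placed = (1800023.0, 5900019.0)

c = centroid(pois)             # centre of the POI cluster
d = offset_m(hand_placed, c)   # how far the hand-placed point sits from it
```

A large offset would not necessarily mean the hand-placed point is wrong, since the criteria above deliberately favour road junctions and main streets over a pure centroid.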

I have removed many of the outdated attribute fields, and what remains is as follows:

- ID	
- NAME	
- DESC_CODE
- LANDDISTRI
- X
- Y
- WKT_GEOM

Please note that the first four are exactly as specified in the original dataset. However, I consider them to be of minimal value, other than the ID, which can be used to match against the balance of the placenames dataset.

NAME has had ALL CAPS removed.
DESC_CODE has a peculiar idea about the relative size of each place and I will be fixing this shortly.
LANDDISTRI is an old administrative region that bears only a passing resemblance to current TLA regions. This will also be fixed prior to the 1.0 release.
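
How the ALL CAPS names were actually converted is not stated here. As a minimal sketch of one way to do it, the hypothetical helper below capitalises each space- or hyphen-separated part of a name.

```python
import re

def decapitalise(name):
    """Convert an ALL CAPS name to capitalised words.

    Each run of characters between spaces or hyphens is capitalised as a
    unit, so an apostrophe inside a word does not start a new capital.
    """
    return re.sub(r"[^\s-]+", lambda m: m.group(0).capitalize(), name)

examples = [decapitalise(n) for n in ("THAMES", "HAWKE'S BAY", "BAY OF ISLANDS")]
```

Note that small words such as "of" are capitalised as well, so names containing them would still need a manual pass.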


License: clarification needs to be sought about the derived nature of the LINZ data and the Zenbu (via Google) data. Pending these copyright complications, the actual points data is essentially an original work, released as public domain.

Todo:
1. Create polygon bounding layers for all places that aren't present in Olivier and Co's urban areas.
2. Compute the approximate area in hectares for each 'place' polygon.
3. Compute the TLA district and region for each place.
4. Add Zenbu's "nearest town" field for each place.
5. Attempt to derive the population for each place from Statistics NZ data or local government sources.
6. Compute a more meaningful place size score, enabling mapmakers to render places accordingly.
7. Move on to the SBRB, USAT and LOC subsets, attempting to create a single quality, liberally licensed placenames dataset. I expect this to be extremely difficult, as the places have nothing to see, or are places without commonly understood centroids. Hence it may not happen, and reliance on the best available source may have to do.
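
Todo items 2 and 3 are standard planar computations once the polygons are in NZTM metres. The sketch below shows one possible approach in plain Python (the shoelace formula for area, ray casting for containment). The function names and sample polygon are assumptions for illustration, and holes and multi-part polygons are not handled.

```python
def area_hectares(ring):
    """Planar area of a simple polygon via the shoelace formula.

    `ring` is a list of (x, y) vertices in metres; it may be open or
    closed. 1 hectare = 10,000 square metres.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0 / 10000.0

def point_in_polygon(pt, ring):
    """Ray-casting test. Points exactly on the boundary may go either way."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can cross it
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def find_district(pt, districts):
    """Return the name of the first district ring containing pt, else None."""
    for name, ring in districts.items():
        if point_in_polygon(pt, ring):
            return name
    return None

# Made-up 1000 m x 500 m rectangle standing in for a district polygon
square = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 500.0), (0.0, 500.0)]
districts = {"District A": square}

ha = area_hectares(square)
inside_name = find_district((500.0, 250.0), districts)
outside_name = find_district((1500.0, 250.0), districts)
```

A result of None from find_district is the "in the sea" case: the point falls outside every polygon supplied.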


Peter Scott 8/4/2011



