[gdal-dev] ogr2ogr problem converting UK Ordnance Survey MasterMap

Peter J Halls P.Halls at york.ac.uk
Tue Jul 13 03:16:45 EDT 2010


Even, Jez,

    Sadly, I will not be able to try this out myself for at least a
couple of weeks, due to other commitments.

    I think Even's solution of splitting the multiple entries into a large
number of simple entries is workable, but I do have some caveats.  I think it is
pragmatic, and the only viable approach for the majority of output structures
which, like Shapefiles / dBase IV, cannot support list columns.  It should be
adequate for those who need to draw maps from these data - the cartographic
instructions should be easily handled by this solution; my doubts concern the
use of these data as inputs to analytic processes.  My caveats are:

1) I would like to know, for each 'list', how many entries were found in the
source and how many columns were actually written.  For example, the limit might
be set at 80 but there may be 85 in the source, so I would like to know that for
a particular record only a subset / the first 'n' have been stored.  This gives
the possibility of making adjustments to the parameters.  Of course, it also
takes up a couple of precious columns per list entry, reducing further the
proportion of the list that can be recorded ...
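
To make caveat 1 concrete, here is a minimal Python sketch of the bookkeeping I
have in mind.  It is not how ogr2ogr does it: the function name
split_list_field, the numbered column names and the '_nf' / '_nk' count columns
are all hypothetical, chosen only for illustration (and kept short because
dBase column names are limited to 10 characters).

```python
from typing import Dict, Sequence, Union


def split_list_field(name: str, values: Sequence[str],
                     max_subfields: int) -> Dict[str, Union[str, int]]:
    """Flatten one list field into numbered columns plus two count columns.

    Hypothetical helper: the column naming is an assumption for illustration,
    not the naming used by ogr2ogr.
    """
    if max_subfields < 0:
        raise ValueError("max_subfields must not be negative")
    kept = list(values[:max_subfields])
    # One simple column per retained value: name1, name2, ...
    row: Dict[str, Union[str, int]] = {
        f"{name}{i}": value for i, value in enumerate(kept, start=1)
    }
    row[f"{name}_nf"] = len(values)  # number found in the source
    row[f"{name}_nk"] = len(kept)    # number actually kept
    return row


# 85 values in the source, limit set at 80 (the figures from the example above)
source_values = [f"value{i}" for i in range(1, 86)]
row = split_list_field("ref", source_values, 80)
truncated = row["ref_nf"] > row["ref_nk"]
```

A record is truncated whenever the two counts differ, which is the information
needed to decide whether to rerun with a larger limit.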

2) This 'list' structure is used by OS for both 'informational' entries, eg
changeHistory, and 'data' entries, eg referenceToTopographicArea (which lists all
the TOIDs that comprise a complex feature such as, for example, 'Station Road'
in ITN).  In the latter case, OS are adopting an approach similar to that used by
ESRI for the old 'coverage' format: the geometry is created once, not repeated,
and pointers are made to those parts of it required for a specific purpose.
Here, truncation of the list means that the output data may not be usable for
the intended purpose.  Unfortunately, other than using lists, I cannot see a
viable alternative, as there can be many thousands of these - which can easily
exceed the maximum number of columns permitted in several of the output formats.
I had originally been considering comma separated lists in a single string, but
these can quickly exceed the maximum string length, which brings us back to the
reasons for Even's solution.
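
A rough calculation shows how quickly a single string fills up.  The sketch
below is only illustrative: ids_that_fit is a hypothetical helper, the
20-character identifier length is an assumption (check your own TOIDs), 80 is
the truncation width Even mentions, and 254 is, as far as I know, the usual
maximum width of a dBase character column.

```python
def ids_that_fit(id_length: int, field_width: int, separator: str = ",") -> int:
    """Return how many whole fixed-length identifiers fit in one string field.

    n identifiers joined by the separator need
    n * id_length + (n - 1) * len(separator) characters.
    """
    if id_length <= 0 or field_width < 0:
        raise ValueError("id_length must be positive and field_width non-negative")
    sep = len(separator)
    return (field_width + sep) // (id_length + sep)


ASSUMED_TOID_LENGTH = 20  # assumption, not taken from the OS specification

fit_in_80 = ids_that_fit(ASSUMED_TOID_LENGTH, 80)
fit_in_254 = ids_that_fit(ASSUMED_TOID_LENGTH, 254)
```

Under those assumptions only a handful of identifiers survive, against the many
thousands that a complex feature may reference.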

This is a form of topology embedded within the OS data, and it might be
desirable to continue with the 'no topology here' principle.  This is another
of the 'problems': many (most today?) spatial data structures are not designed
to store topology, yet topology does have its uses.

3) What to do when a limit is reached.  As I have not yet had the chance to try
Even's development, I do not know what approach has been chosen.  From the
perspective of using the output, I guess that I want a list of the FIDs (TOIDs)
which contain truncated data structures: this would permit some measure of
choice when handling these data ... a sort of 'exceptions list'.  Of course,
this does not permit the recovery of the lost data ... nor does it allow me to
differentiate between those columns that do not matter to me and those that do,
but it may be the most practical approach.
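
The kind of 'exceptions list' I mean could be sketched as follows.  This is a
sketch under stated assumptions, not a description of Even's implementation:
truncation_report is a hypothetical function, and the features are represented
here as a plain Python mapping of FID to list-valued fields rather than read
through OGR.

```python
from typing import List, Mapping, Sequence, Tuple


def truncation_report(features: Mapping[str, Mapping[str, Sequence[str]]],
                      max_subfields: int) -> List[Tuple[str, str, int]]:
    """List (fid, field name, number found) for every list longer than the limit."""
    report: List[Tuple[str, str, int]] = []
    for fid, fields in features.items():
        for field_name, values in fields.items():
            if len(values) > max_subfields:
                report.append((fid, field_name, len(values)))
    return report


# Made-up sample data: the FIDs and values are placeholders, not real TOIDs.
features = {
    "fid_A": {
        "changeHistory": ["c1", "c2"],
        "referenceToTopographicArea": ["t1", "t2", "t3", "t4"],
    },
    "fid_B": {"changeHistory": ["c1"]},
}
report = truncation_report(features, 3)
```

Recording the field name alongside the FID would at least let me separate
truncated columns that matter to me from those that do not, although the lost
values themselves remain unrecoverable.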

Enough: I must try Even's work out for myself ...

Thanks and best wishes,

Peter


Even Rouault wrote:
> Jez,
> 
> if you check out the latest GDAL trunk, you'll find a new -splitlistfields
> option for ogr2ogr that will split fields of type IntegerList, RealList or
> StringList into as many subfields of single type as necessary. You can also
> specify -maxsubfields an_integer_value to limit the number of subfields (this
> can be useful if you just want to keep the first element of the list, or to
> keep the number of subfields to a reasonable number, as some features from
> your GML file have a large number of elements in the list).
> 
> Even
> 
> On Monday 12 July 2010 20:04:00, Even Rouault wrote:
>> Jez,
>>
>> Yes this is a limitation of the shapefile format (and most drivers,
>> PostgreSQL databases being one of the exceptions).
>>
>> Try adding -fieldTypeToString IntegerList,RealList,StringList to your
>> ogr2ogr command line. This will transform any field of those types into a
>> String field by concatenating the values into a single string (what you can
>> see with ogrinfo). Beware that if the list is longer than a few items,
>> there will be a truncation at 80 characters.
>>
>> I'm looking into whether it is practical to add an option to ogr2ogr
>> to split fields of type *List into several fields of simple type.
>>
>> Best regards,
>>
>> Even
>>
>> PS: For the record, in http://download.osgeo.org/gdal/daily/, you can find
>> daily snapshots of the source code of the trunk (1.8.0dev) and the 1.7
>> stable branch.
>>
>> On Monday 12 July 2010 18:09:16, Jez Walters wrote:
>>> Even,
>>>
>>>
>>> I've just rebuilt GDAL/OGR using the latest code from the GDAL 'trunk',
>>> but now I get the following error using ogr2ogr to convert an OS
>>> MasterMap chunk (e.g.
>>> http://www.ordnancesurvey.co.uk/oswebsite/products/innovations/sampledata/OSMasterMap_Topo/58116-SX9192-2c1.gz)
>>> into ESRI shapefiles:
>>>
>>> "ERROR 6: Can't create fields of type StringList on shapefile layers."
>>>
>>> The various fields for which this error is reported do not appear to be
>>> in the resultant shapefiles. Unfortunately this makes the new GDAL code
>>> unusable for me.  :-(
>>>
>>> Any thoughts?
>>>
>>>
>>> Jez
>>>
>>>
>>> -----Original Message-----
>>> From: Even Rouault [mailto:even.rouault at mines-paris.org]
>>> Sent: Sunday 11 July 2010 11:12
>>> To: gdal-dev at lists.osgeo.org
>>> Cc: Martin Daly; Peter J Halls; Jez Walters
>>> Subject: Re: [gdal-dev] ogr2ogr problem converting UK Ordnance Survey
>>> MasterMap
>>>
>>> Just to inform you that now that the NAS driver is in GDAL trunk, I've
>>> been able to port its enhancements to the main GML driver. On the few
>>> samples I've tested, OS Mastermap GML files seem to be read correctly
>>> now.
>>>
>>> See http://trac.osgeo.org/gdal/ticket/3680
>>>
>>> On Friday 02 July 2010 09:04:38, Martin Daly wrote:
>>>>>     Here it is not only GDAL/OGR that has a problem!  Currently, I
>>>>> know of no importer that can handle this construct, other than the
>>>>> tool (from Snowflake) used by OSGB to generate it - and there is also
>>>>> the question of onwards storage.
>>>> Not even close, I'm afraid.
>>>>
>>>> There are plenty of tools to read (all parts of) OS MM:
>>>>
>>>> http://www.ordnancesurvey.co.uk/oswebsite/products/osmastermap/information/technical/software.html
>>>>
>>>> e.g. (an excellent one, at a very reasonable price...)
>>>>
>>>> http://www.ordnancesurvey.co.uk/oswebsite/products/osmastermap/information/technical/software/cadcorp.html
>>>>
>>>> Also, as far as I am aware, OS GB use in-house software to generate the
>>>> data.
>>>>
>>>> Martin
>>>>
>>>> _______________________________________________
>>>> gdal-dev mailing list
>>>> gdal-dev at lists.osgeo.org
>>>> http://lists.osgeo.org/mailman/listinfo/gdal-dev
> 
> 

-- 
--------------------------------------------------------------------------------
Peter J Halls, GIS Advisor, University of York
Telephone: 01904 433806     Fax: 01904 433740
Snail mail: Computing Service, University of York, Heslington, York YO10 5DD
This message has the status of a private and personal communication
--------------------------------------------------------------------------------

