[gdal-dev] GDAL GML driver and .xsd schema mapping

Even Rouault even.rouault at mines-paris.org
Mon Nov 14 14:25:09 EST 2011


Jukka,

> 
> It is not totally clear to me to what extent the GDAL GML driver is
> utilising the .xsd schema if it is present.  I have an example where the
> schema obviously is not utilised or respected. In the schema the
> nationalCode element is defined as:
>           </element>
>           <element name="nationalCode" type="string">
>             <annotation>
>               <documentation>-- Definition --&#13;
> Thematic identifier corresponding to the national administrative codes
> defined in each country.&#13;
> 
> In the GML file the nationalCode attribute has values which are strings but
> they always contain only numeric characters. The .gfs file created by GDAL
> is interpreting the attribute data type as Integer:
>     <PropertyDefn>
>       <Name>nationalCode</Name>
>       <ElementPath>nationalCode</ElementPath>
>       <Type>Integer</Type>
>     </PropertyDefn>

In fact, the issue is not with nationalCode, because OGR knows the "string" 
type. The issue is with the elements that follow it, such as inspireId, whose 
type "base:IdentifierPropertyType" is not understood. When the XSD parser 
doesn't understand an element, it is preferable for it to give up completely 
rather than keep only the fields it understands and lose the others, which 
would lead to information loss. When the XSD parser fails, or the XSD doesn't 
exist, a full scan of the GML is triggered to auto-discover its structure, 
which is then recorded in a .gfs file. As a side effect, this auto-discovery 
cannot guess that a field should be treated as a string when all the values 
found in the file are in fact integers.
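
To illustrate why a schema-less scan ends up with Integer here, below is a
minimal Python sketch of value-based type guessing. It is not the actual OGR
code; the function name guess_field_type and the sample values are made up
for the example, and the real auto-discovery may handle more types.

```python
def guess_field_type(values):
    """Guess a field type from raw text values only (no schema available)."""

    def all_parse(cast):
        # True if every value can be converted with the given constructor.
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if not values:
        return "String"
    if all_parse(int):
        return "Integer"
    if all_parse(float):
        return "Real"
    return "String"


# Hypothetical nationalCode values: strings in the schema, digits in the data.
national_codes = ["0101", "2300", "75056"]
print(guess_field_type(national_codes))
```

Nothing in the values themselves says they are codes rather than numbers, so
only the schema (or a manual edit of the .gfs) can carry that information.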

> 
> Ogr2ogr conversion to other formats gives a correct data type for this
> attribute if I edit first manually the .gfs file to use String type for
> nationalCode. I would like to know if this is how it is planned to be and
> users just need to remember to correct all the
> strings-with-only-numeric-characters attribute types manually.
> 
> Perhaps this behaviour has something to do with the data which is using an
> Inspire schema that cannot be totally converted into ogr model?  Comments
> of ticket 4328 http://trac.osgeo.org/gdal/ticket/4328 especially 
> http://trac.osgeo.org/gdal/changeset/23315  seem to suggest so.

Yes, this schema is too complex for the OGR XSD parser (and, beyond the 
capabilities of the parser itself, it doesn't fit into the Simple Feature 
model). The parser only understands a subset of the possible schemas. It 
should match pretty much what is described as "Compliance level SF-0" of 
"GML 3.1.1 simple features profile - OGC(R) 06-049r1": 
http://portal.opengeospatial.org/files/?artifact_id=15201
In a single sentence, this basically means only simple fields with a single 
occurrence each.
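
As a rough illustration of the "give up on anything not understood" rule, the
following Python sketch lists the fields of a schema that fall outside a small
set of simple types or that may occur more than once. It is only a sketch
under assumptions: SIMPLE_TYPES, unsupported_elements and the sample schema
are invented for the example, geometry properties are ignored, and the set of
types the real parser accepts may differ.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumed list of simple types; the types OGR really accepts may differ.
SIMPLE_TYPES = {"string", "integer", "int", "long", "short", "decimal",
                "float", "double", "boolean", "date", "dateTime"}


def unsupported_elements(xsd_text):
    """Return (name, type) pairs for fields that are not simple, single-valued."""
    problems = []
    root = ET.fromstring(xsd_text)
    for sequence in root.iter(XS + "sequence"):
        for el in sequence.findall(XS + "element"):
            type_name = el.get("type")
            if type_name is None:
                continue  # inline/anonymous types are out of scope here
            local_name = type_name.split(":")[-1]
            repeated = el.get("maxOccurs", "1") != "1"
            if local_name not in SIMPLE_TYPES or repeated:
                problems.append((el.get("name"), type_name))
    return problems


# Hypothetical, heavily reduced schema with the two fields discussed above.
SAMPLE_XSD = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:base="urn:example:base">
  <complexType name="UnitType">
    <sequence>
      <element name="nationalCode" type="string"/>
      <element name="inspireId" type="base:IdentifierPropertyType"/>
    </sequence>
  </complexType>
</schema>"""

problems = unsupported_elements(SAMPLE_XSD)
# One unsupported field is enough to reject the whole schema.
print("schema usable:", not problems, problems)
```

Rejecting the schema as a whole, rather than dropping the listed fields, is
what avoids silently losing inspireId in the output.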

> 
> If that is the case, would it be possible to make ogr2ogr print the
> message /* Too complex schema for us. Aborts parsing */  also on screen as
> a warning?  Otherwise users can believe that the .xsd schema file is used
> even when it is not.

Hopefully the following should be enough:

r23378 /trunk/gdal/ogr/ogrsf_frmts/gml/ogrgmldatasource.cpp: GML: add debug 
information to know if we use/generate .gfs file while there's a .xsd we ignore

I'm not sure it is a good idea to make it more verbose than a debug trace.

Best regards,

Even