[pycsw-devel] gmd schema 2005 broken? Schemas parser error

Tom Kralidis tomkralidis at gmail.com
Sat Oct 3 05:54:26 PDT 2015


Hi Jachym: looks like the metadata record itself is not valid.  CSW
itself leaves GetRecords and GetRecordById responses relatively open
(either the default Dublin Core + ows:BoundingBox, or anything else),
given the various metadata formats.

pycsw does not perform XML schema validation when ingesting metadata
records (either via CSW-T operations or via the pycsw-admin.py
load_records command).  Rather, pycsw parses the metadata record
(raising an error if it is not in a supported format) and inserts it
into the repository.
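
To illustrate the difference between parsing and validating, here is a
rough sketch using only the Python standard library.  It is not pycsw's
actual code: the identify_format name and the SUPPORTED_ROOTS table are
made up for illustration.  It only checks that the record is well-formed
and that its root element belongs to a known format; no XSD is consulted.

```python
import xml.etree.ElementTree as ET

# Hypothetical table of root elements -> format labels (illustration only,
# not the list pycsw really supports).
SUPPORTED_ROOTS = {
    "{http://www.isotc211.org/2005/gmd}MD_Metadata": "ISO 19139",
    "{http://www.opengis.net/cat/csw/2.0.2}Record": "Dublin Core (CSW record)",
}


def identify_format(xml_text):
    """Parse a record and report its format, without schema validation."""
    # ET.fromstring only checks well-formedness; it raises ParseError on
    # broken XML but never looks at an XSD.
    root = ET.fromstring(xml_text)
    try:
        return SUPPORTED_ROOTS[root.tag]
    except KeyError:
        raise ValueError("unsupported metadata format: %s" % root.tag)
```

A record that is well-formed but violates the ISO schema would pass this
kind of check, which is why schema-invalid records can end up in a
repository.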

Note you can use pycsw-admin.py to validate XML records before insertion with:

pycsw-admin.py -c validate_xml -x file.xml -s file.xsd

Or you can use the pycsw API (assuming 1.10.2) like:

from pycsw.admin import validate_xml
result = validate_xml(xml_filepath, xsd_filepath)
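
If you would rather call lxml directly, something along these lines
should work.  This is a minimal sketch, assuming lxml is installed; the
validate_against_xsd helper is a made-up name and not part of pycsw.  It
keeps "the schema itself failed to compile" separate from "the record
does not conform to the schema", which is the distinction that matters
when reading an xmllint "Schemas parser error".

```python
from lxml import etree


def validate_against_xsd(xml_bytes, xsd_bytes):
    """Return (is_valid, messages) for an XML document checked against an XSD.

    Raises etree.XMLSchemaParseError if the XSD itself cannot be compiled.
    """
    schema = etree.XMLSchema(etree.fromstring(xsd_bytes))
    doc = etree.fromstring(xml_bytes)
    is_valid = schema.validate(doc)
    # error_log holds one entry per validation problem found in the document
    messages = ["line %d: %s" % (e.line, e.message) for e in schema.error_log]
    return is_valid, messages


# Tiny in-memory schema so the example does not need network access
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="title" type="xs:string"/>
</xs:schema>"""

ok, errors = validate_against_xsd(b"<title>roads</title>", XSD)
bad, bad_errors = validate_against_xsd(b"<name>roads</name>", XSD)
```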

Having said this, we never turned on validating parsers when ingesting
metadata records, given the reality that many (if not most) are not
schema-valid.  As an enhancement, we could provide a configuration
setting which turns on strict mode validation when ingesting metadata
records into pycsw, like:

metadata_validation=off|warn|strict
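
To make the three modes concrete, here is a rough sketch of how they
could behave.  Nothing like this exists in pycsw today; the ingest
function and its validator / insert callables are hypothetical
placeholders for whatever the real implementation would use.

```python
import logging

LOGGER = logging.getLogger(__name__)

VALID_MODES = ("off", "warn", "strict")


def ingest(record, validator, insert, mode="off"):
    """Insert a record, applying the proposed metadata_validation mode.

    validator(record) returns a list of error strings (empty if valid);
    insert(record) stores the record.  Both are hypothetical callables.
    """
    if mode not in VALID_MODES:
        raise ValueError("metadata_validation must be one of %s" % (VALID_MODES,))
    if mode != "off":
        errors = validator(record)
        if errors and mode == "strict":
            # strict: refuse to store invalid records
            raise ValueError("record failed validation: " + "; ".join(errors))
        if errors:
            # warn: log the problems but store the record anyway
            LOGGER.warning("record is not valid: %s", "; ".join(errors))
    insert(record)
```

With mode "off" the validator is never called, so the current behaviour
(and its speed) would be preserved by default.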

What do folks think?

..Tom


On Fri, Oct 2, 2015 at 5:12 PM, Jachym Cepicky <jachym.cepicky at gmail.com> wrote:
> Hi all,
>
> I would like to validate some records from our PyCSW instance, but it seems
> that the schemas are broken? I tested on the Linux command line and with
> Python lxml, with the same result:
>
> # download the metadata record
> $ wget -O record.xml \
> "http://geosense.cz/cgi-bin/csw.py?service=CSW&version=2.0.2&Request=GetRecordById&id=CZ-29002567-srv-jedinecny_nazev_sluzby&outputSchema=http://www.isotc211.org/2005/gmd&elementSetName=full"
>
> # create local xsd schema file
> $ cat << EOF > schema.xsd
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
> targetNamespace="http://www.isotc211.org/2005/srv"
> elementFormDefault="qualified" version="0.1">
> <xs:include
> schemaLocation="http://schemas.opengis.net/iso/19139/20060504/srv/serviceMetadata.xsd"/>
> </xs:schema>
> EOF
>
> # finally test the schema
> $ xmllint --schema schema.xsd record.xml  --noout
>
> result:
>
> http://schemas.opengis.net/iso/19139/20060504/gml/coordinateOperations.xsd:48:
> element element: Schemas parser error : Element
> '{http://www.w3.org/2001/XMLSchema}element', attribute 'ref': The QName
> value '{http://www.isotc211.org/2005/gmd}AbstractDQ_PositionalAccuracy' does
> not resolve to a(n) element declaration.
> WXS schema schema.xsd failed to compile
>
> Any hints? Am I using the proper version of the XSD?
>
> Thanks for any hints
>
> Jachym
>

