[pycsw-devel] gmd schema 2005 broken? Schemas parser error

Tom Kralidis tomkralidis at gmail.com
Mon Oct 5 12:47:55 PDT 2015


Hi Jachym: comments interleaved:

On Sat, Oct 3, 2015 at 11:35 AM, Jachym Cepicky
<jachym.cepicky at gmail.com> wrote:
> hi tom,
>
> in PyWPS-4 we will have exactly this option for Inputs - they may pass the
> validation based on configuration. I think it is a good feature to have in
> order to keep your (meta)data records in shape.
>

If you can file a GitHub issue
(https://github.com/geopython/pycsw/issues/new), that would be great.
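
To make the issue concrete, here is a rough sketch of how the
metadata_validation=off|warn|strict setting proposed in the quoted message
below might behave at ingest time. This is only an illustration under
stated assumptions: ValidationMode, ingest, validate and insert are
hypothetical names, not existing pycsw API, and validate stands in for
whatever schema check ends up being used.

```python
import logging
from enum import Enum
from typing import Callable, List

LOGGER = logging.getLogger(__name__)


class ValidationMode(Enum):
    """Hypothetical values for a metadata_validation setting."""
    OFF = "off"
    WARN = "warn"
    STRICT = "strict"


def ingest(record_xml: str,
           mode: ValidationMode,
           validate: Callable[[str], List[str]],
           insert: Callable[[str], None]) -> None:
    """Insert a record, honouring the validation mode.

    validate returns a list of error messages (empty when the record is
    valid); insert stores the record. Both are assumed callables.
    """
    if mode is not ValidationMode.OFF:
        errors = validate(record_xml)
        if errors and mode is ValidationMode.STRICT:
            # strict: refuse to insert an invalid record
            raise ValueError("invalid metadata record: " + "; ".join(errors))
        for error in errors:
            # warn: report the problem but insert anyway
            LOGGER.warning("metadata validation: %s", error)
    insert(record_xml)
```

With off the validator is never called, which matches what pycsw does
today; warn keeps today's permissive ingest but makes the problems visible.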

> However, I would say, this problem happens *before* the record gets
> validated.
>
> What profile (namespace) are you guys using for your geospatial metadata (of
> data and services)?
>

pycsw generally leaves that up to the metadata provider.  For ISO, generally:

19139 (data): http://www.isotc211.org/2005/gmd/gmd.xsd
19119 (services): http://schemas.opengis.net/iso/19139/20060504/srv/srv.xsd

Note that the CSW ISO Application Profile provides
http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd as
a convenience to support both data and services.
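
If it helps, choosing between these schemas could be scripted roughly as
follows. This is a minimal sketch, assuming that a record counts as a
service record when it contains any element in the srv namespace;
pick_schema and SCHEMAS are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

SRV_NS = "http://www.isotc211.org/2005/srv"

# Schema locations as listed in this message.
SCHEMAS = {
    "data": "http://www.isotc211.org/2005/gmd/gmd.xsd",
    "service": "http://schemas.opengis.net/iso/19139/20060504/srv/srv.xsd",
    "apiso": "http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd",
}


def pick_schema(record_xml: str, prefer_apiso: bool = False) -> str:
    """Return the XSD URL to validate a record against."""
    if prefer_apiso:
        # apiso.xsd covers both data and service records
        return SCHEMAS["apiso"]
    root = ET.fromstring(record_xml)
    # Assumption: any element in the srv namespace marks a service record.
    uses_srv = any(el.tag.startswith("{" + SRV_NS + "}") for el in root.iter())
    return SCHEMAS["service" if uses_srv else "data"]
```

Passing prefer_apiso=True sidesteps the data/service distinction entirely,
which is the convenience the Application Profile schema is meant to offer.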

Having said this, once I fix a few problems in it (invalid ISO dates, the
ordering of some elements, invalid xs:ID values), the metadata document
from your example URL validates with no problem against
http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd
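
Two of the problems mentioned above (invalid ISO dates, invalid xs:ID
values) can be spotted with a quick pre-check before running a full schema
validation. The sketch below is not a substitute for validating against the
XSD, and it rests on assumptions: it treats every attribute whose local
name is "id" as an xs:ID, approximates the NCName rule with ASCII
characters only, and ignores time zone suffixes on dates. The function
names are made up for this example.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

GCO_DATE = "{http://www.isotc211.org/2005/gco}Date"
# ASCII-only approximation of the NCName rule that xs:ID values follow.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_iso_date(text):
    """Accept YYYY, YYYY-MM or YYYY-MM-DD (no time zone suffix)."""
    text = (text or "").strip()
    if re.fullmatch(r"\d{4}", text):
        return True
    if re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", text):
        return True
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        date.fromisoformat(text)  # rejects e.g. 2015-02-30
    except ValueError:
        return False
    return True


def precheck(record_xml):
    """Return a list of problems found in gco:Date and id values."""
    problems = []
    seen_ids = set()
    for el in ET.fromstring(record_xml).iter():
        if el.tag == GCO_DATE and not is_iso_date(el.text):
            problems.append(f"invalid gco:Date value: {el.text!r}")
        for name, value in el.attrib.items():
            # Assumption: attributes with local name "id" are typed xs:ID.
            if name.rsplit("}", 1)[-1] != "id":
                continue
            if not NCNAME.fullmatch(value):
                problems.append(f"invalid xs:ID value: {value!r}")
            elif value in seen_ids:
                problems.append(f"duplicate xs:ID value: {value!r}")
            seen_ids.add(value)
    return problems
```

An empty list only means these two checks passed; element ordering and
everything else still need the real schema validation.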

Hope this helps.

..Tom


> Thanks
>
> jachym
>
> P.S. Sorry if my questions seem odd - I'm quite new to the practical side
> of this field
>
> On Sat, Oct 3, 2015 at 14:54, Tom Kralidis <tomkralidis at gmail.com>
> wrote:
>>
>> Hi Jachym: looks like the metadata record itself is not valid.  CSW
>> itself leaves GetRecords and GetRecordById responses relatively open
>> (either the default Dublin Core + ows:BoundingBox, or anything else),
>> given the various metadata formats.
>>
>> pycsw does not perform XML validation when ingesting metadata records
>> (either via CSW-T operations or via the pycsw-admin.py load_records
>> command).  Rather, pycsw parses the metadata record (assuming it's a
>> supported format, else throws an error) and inserts it into the
>> repository.
>>
>> Note you can use pycsw-admin.py to validate XML records before insertion
>> with:
>>
>> pycsw-admin.py -c validate_xml -x file.xml -s file.xsd
>>
>> Or you can use the pycsw API (assuming 1.10.2) like:
>>
>> from pycsw.admin import validate_xml
>> result = validate_xml(xml_filepath, xsd_filepath)
>>
>> Having said this, we never turned on validating parsers when ingesting
>> metadata records, given the reality that many, if not most, are not
>> valid XML. As an enhancement, we could provide a configuration setting
>> which turns on strict mode validation when ingesting metadata records
>> into pycsw, like:
>>
>> metadata_validation=off|warn|strict
>>
>> What do folks think?
>>
>> ..Tom
>>
>>
>> On Fri, Oct 2, 2015 at 5:12 PM, Jachym Cepicky <jachym.cepicky at gmail.com>
>> wrote:
>> > Hi all,
>> >
>> > I would like to validate some records from our PyCSW instance, but it
>> > seems that the schemas are broken. I tested on the Linux command line
>> > and with Python lxml, with the same result:
>> >
>> > # download the metadata record
>> > $ wget -O record.xml "http://geosense.cz/cgi-bin/csw.py?service=CSW&version=2.0.2&Request=GetRecordById&id=CZ-29002567-srv-jedinecny_nazev_sluzby&outputSchema=http://www.isotc211.org/2005/gmd&elementSetName=full"
>> >
>> > # create local xsd schema file
>> > $ cat << EOF > schema.xsd
>> > <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>> > targetNamespace="http://www.isotc211.org/2005/srv"
>> > elementFormDefault="qualified" version="0.1">
>> > <xs:include schemaLocation="http://schemas.opengis.net/iso/19139/20060504/srv/serviceMetadata.xsd"/>
>> > </xs:schema>
>> > EOF
>> >
>> > # finally test the schema
>> > $ xmllint --schema schema.xsd record.xml  --noout
>> >
>> > result:
>> >
>> >
>> > http://schemas.opengis.net/iso/19139/20060504/gml/coordinateOperations.xsd:48:
>> > element element: Schemas parser error : Element
>> > '{http://www.w3.org/2001/XMLSchema}element', attribute 'ref': The QName
>> > value '{http://www.isotc211.org/2005/gmd}AbstractDQ_PositionalAccuracy'
>> > does not resolve to a(n) element declaration.
>> > WXS schema schema.xsd failed to compile
>> >
>> > Any hints? Am I using the proper version of the XSD?
>> >
>> > Thanks for any hints
>> >
>> > Jachym
>> >

