[pycsw-devel] Problem with harvesting from another csw server

Tom Kralidis tomkralidis at hotmail.com
Wed May 30 18:17:02 PDT 2012



FYI I have updated the codebase in svn trunk [1] to accept both http://www.isotc211.org/schemas/2005/gmd/ and http://www.isotc211.org/2005/gmd/ as valid csw:ResourceType values for ISO metadata. Thanks for catching this.

[1] http://sourceforge.net/apps/trac/pycsw/changeset/550

> From: tomkralidis at hotmail.com
> To: kathrin at waterinsight.nl; pycsw-devel at lists.osgeo.org
> Date: Wed, 30 May 2012 13:32:05 -0400
> Subject: Re: [pycsw-devel] Problem with harvesting from another csw server
> 
> 
> 
> Hi Kathrin: thanks for the info. At this point, pycsw supports harvesting individual metadata records and WMS endpoints (harvesting a WMS picks up all of its layers).
> 
> Currently, pycsw treats the ResourceType http://www.opengis.net/cat/csw/2.0.2 as a Dublin Core metadata record. We should extend this by checking whether the resource is a metadata record or a CSW Capabilities document and, if the latter, harvesting the entire CSW.
> 
> For a quick workaround, you could write a script that loops through an existing CSW, picks up all identifiers, and feeds them to pycsw's Harvest operation (where the Source element value would be a GetRecordById request for the record).
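
A minimal sketch of the workaround described above, not a tested pycsw recipe. It assumes Python 3 with only the standard library, a remote CSW that answers GetRecords over HTTP GET with brief csw:Record results, and a local pycsw with transactions enabled. The function names (kvp_url, parse_page, harvest_request, harvest_all) and the page size are hypothetical, chosen for illustration. Each record is harvested as Dublin Core, so ResourceType is http://www.opengis.net/cat/csw/2.0.2.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
DC_NS = "http://purl.org/dc/elements/1.1/"


def kvp_url(base_url, **params):
    """Build a CSW 2.0.2 key-value-pair request URL for base_url."""
    query = {"service": "CSW", "version": "2.0.2"}
    query.update(params)
    return base_url + "?" + urllib.parse.urlencode(query)


def parse_page(xml_text):
    """Return (identifiers, next_record) from one GetRecords response."""
    root = ET.fromstring(xml_text)
    results = root.find("{%s}SearchResults" % CSW_NS)
    ids = [el.text for el in results.iter("{%s}identifier" % DC_NS)]
    return ids, int(results.get("nextRecord", "0"))


def harvest_request(source_url):
    """Build a Harvest request body; the URL is XML-escaped (& becomes &amp;)."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<Harvest xmlns="%s" service="CSW" version="2.0.2">\n'
        ' <Source>%s</Source>\n'
        ' <ResourceType>%s</ResourceType>\n'
        ' <ResourceFormat>application/xml</ResourceFormat>\n'
        '</Harvest>' % (CSW_NS, escape(source_url), CSW_NS)
    )


def harvest_all(remote_csw, local_csw, page_size=10):
    """Page through remote_csw and post one Harvest per record to local_csw."""
    start = 1
    while start > 0:
        page = kvp_url(remote_csw, request="GetRecords",
                       typeNames="csw:Record", elementSetName="brief",
                       resultType="results", startPosition=start,
                       maxRecords=page_size)
        with urllib.request.urlopen(page) as resp:
            ids, next_record = parse_page(resp.read())
        for identifier in ids:
            source = kvp_url(remote_csw, request="GetRecordById",
                             id=identifier, elementSetName="full")
            req = urllib.request.Request(
                local_csw, data=harvest_request(source).encode("utf-8"),
                headers={"Content-Type": "application/xml"})
            urllib.request.urlopen(req).close()
        # nextRecord is 0 once the last page has been returned
        start = next_record if next_record > start else 0
```

It would be called as harvest_all("http://remote.example.org/csw", "http://localhost/pycsw/csw.py"), where both URLs are placeholders. Posting one Harvest per record keeps each request within what pycsw already supports (record-level harvesting), at the cost of one round trip per identifier.
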
> 
> Having said this, I think this would be a valuable addition.  If you can file an enhancement ticket at https://sourceforge.net/apps/trac/pycsw, this would be much appreciated; I will implement this for 1.4.0 (summer 2012).
> 
> Thanks
> 
> ..Tom
> 
> 
> 
> Date: Wed, 30 May 2012 18:36:22 +0200
> From: kathrin at waterinsight.nl
> To: pycsw-devel at lists.osgeo.org
> Subject: [pycsw-devel] Problem with harvesting from another csw server
> 
> Hello list,
> 
> I have just started testing pycsw for our metadata catalogue and am
> quite impressed so far. Great work!
> 
> However, I have now stumbled upon a problem when using the harvesting
> operation. I have enabled transactions in the default.cfg file and the
> harvesting test from the test suite works (after changing ResourceType
> from http://www.isotc211.org/schemas/2005/gmd/ to
> http://www.isotc211.org/2005/gmd). However, this only harvests
> individual metadata files. What I want to do is to harvest all records
> from another CSW server. So, from what I understand, I would have to
> change ResourceType to http://www.opengis.net/cat/csw/2.0.2 and give
> the link to the capabilities document as Source.
> 
> So, my query is:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <Harvest xmlns="http://www.opengis.net/cat/csw/2.0.2"
> xmlns:ogc="http://www.opengis.net/ogc"
> xmlns:gmd="http://www.isotc211.org/2005/gmd"
> xmlns:ows="http://www.opengis.net/ows"
> xmlns:xsd="http://www.w3.org/2001/XMLSchema"
> xmlns:dc="http://purl.org/dc/elements/1.1/"
> xmlns:dct="http://purl.org/dc/terms/"
> xmlns:gml="http://www.opengis.net/gml"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2
> http://schemas.opengis.net/csw/2.0.2/CSW-publication.xsd"
> service="CSW" version="2.0.2">
>  <Source>http://aiolos.survey.ntua.gr/pycsw/csw.py?service=CSW&version=2.0.2&request=GetCapabilities</Source>
>  <ResourceType>http://www.opengis.net/cat/csw/2.0.2</ResourceType>
>  <ResourceFormat>application/xml</ResourceFormat>
> </Harvest>
> 
> 
> 
> 
> And the exception I get is:
> 
> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> <!-- pycsw 1.2.0 -->
> <ows:ExceptionReport xmlns:dc="http://purl.org/dc/elements/1.1/"
> xmlns:inspire_common="http://inspire.ec.europa.eu/schemas/common/1.0"
> xmlns:xs="http://www.w3.org/2001/XMLSchema"
> xmlns:dct="http://purl.org/dc/terms/"
> xmlns:ows="http://www.opengis.net/ows"
> xmlns:apiso="http://www.opengis.net/cat/csw/apiso/1.0"
> xmlns:gml="http://www.opengis.net/gml"
> xmlns:dif="http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/"
> xmlns:xlink="http://www.w3.org/1999/xlink"
> xmlns:gco="http://www.isotc211.org/2005/gco"
> xmlns:gmd="http://www.isotc211.org/2005/gmd"
> xmlns:srv="http://www.isotc211.org/2005/srv"
> xmlns:ogc="http://www.opengis.net/ogc"
> xmlns:fgdc="http://www.opengis.net/cat/csw/csdgm"
> xmlns:inspire_ds="http://inspire.ec.europa.eu/schemas/inspire_ds/1.0"
> xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xmlns:os="http://a9.com/-/spec/opensearch/1.1/"
> xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope"
> xmlns:sitemap="http://www.sitemaps.org/schemas/sitemap/0.9"
> version="1.2.0" language="en-US"
> xsi:schemaLocation="http://www.opengis.net/ows
> http://schemas.opengis.net/ows/1.0.0/owsExceptionReport.xsd">
>  <ows:Exception locator="service" exceptionCode="InvalidRequest">
>    <ows:ExceptionText>Exception: document not well-formed.
> Error: EntityRef: expecting ';', line 3, column 72.</ows:ExceptionText>
>  </ows:Exception>
> </ows:ExceptionReport>
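
The "EntityRef: expecting ';'" message is what an XML parser typically reports for a bare "&", so it is consistent with the unescaped ampersands in the Source URL of the request above. Inside XML content each "&" has to be written as "&amp;". The sketch below only illustrates that point with the Python standard library (Python 3 assumed); is_well_formed is a hypothetical helper, and nothing here is pycsw code.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The Source URL from the request above, with bare ampersands
url = ("http://aiolos.survey.ntua.gr/pycsw/csw.py"
       "?service=CSW&version=2.0.2&request=GetCapabilities")


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = "<Source>%s</Source>" % url              # "&version" starts an entity reference
escaped = "<Source>%s</Source>" % escape(url)  # "&" written as "&amp;"

print(is_well_formed(raw))      # False
print(is_well_formed(escaped))  # True
```
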
> 
> 
> 
> Is there something here that I am missing? Pycsw version is 1.2.0,
> python 2.6.6 running on Debian squeeze.
> 
> Any help will be appreciated!
> 
> Best regards,
> Kathrin
> 
> 
> _______________________________________________
> pycsw-devel mailing list
> pycsw-devel at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/pycsw-devel