[pycsw-devel] Issues with the PostGIS backend + real live metadata

Adrià Mercader adria.mercader at okfn.org
Wed Aug 21 03:54:45 PDT 2013


Hi all,

I've been playing around trying to set up pycsw with Postgres +
PostGIS as the backend, with mixed results. Here is a list of things I
found:


* If I understood the docs correctly [1], if you are using PostGIS
(which the script will autodetect), the native PostGIS functions will
be used and there is no need for the plpythonu functions. But if you
call setup_db with create_plpythonu_functions=False, the geometry
column and the trigger are not created, as they are in the same code
block as the plpythonu functions creation. It looks like this commit
moved the indentation one level down [2].

The rest of the issues below assume that this is fixed (i.e. I can
use PostGIS without plpythonu), but please correct me if I'm wrong.
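
To make the first point concrete, here is a minimal sketch of the control
flow I would expect. It is not the actual pycsw code: the function name
build_setup_statements, the "records" table name, and the SRID 4326 /
POLYGON arguments are assumptions for illustration, and the comment lines
are placeholders for the real DDL.

```python
def build_setup_statements(table, postgis_detected, create_plpythonu_functions):
    """Return, in order, the SQL statements a setup routine would run.

    Hypothetical helper: it only builds strings, it does not touch a database.
    """
    statements = []

    # Optional step: the plpythonu helper functions (placeholder, not real DDL).
    if create_plpythonu_functions:
        statements.append("-- create plpythonu query functions here")

    # Separate step, NOT nested under the flag above: when PostGIS is
    # available, add the native geometry column and its trigger.
    if postgis_detected:
        # SRID 4326 and POLYGON are assumed values for this sketch.
        statements.append(
            "SELECT AddGeometryColumn('%s', 'wkb_geometry', 4326, 'POLYGON', 2)"
            % table
        )
        statements.append("-- create the trigger that fills wkb_geometry here")

    return statements


# PostGIS present, plpythonu functions not requested: the geometry column
# statement should still be produced.
for statement in build_setup_statements("records", True, False):
    print(statement)
```

The only point of the sketch is that the two conditions are siblings, so
turning off the plpythonu functions does not skip the geometry column.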

* After setting up the wkb_geometry column in this way I always got an
IntegrityError exception when trying to load a document which has a
bbox defined. This was easy to spot: there was a typo in the srid
passed to AddGeometryColumn.

This PR fixes both issues:

https://github.com/geopython/pycsw/pull/177


* On several documents I tried, I got exceptions because the columns
were defined with too short a length:

DataError('(DataError) value too long for type character varying(256)\n',)

See e.g. the abstract field in [3][4] or conditionapplyingtoaccessanduse
in [5] (you find this with many more fields once you start importing
large numbers of records).
I'm not a DB expert, and I certainly can't speak for sqlite, but I
would imagine that changing the column definitions from character
varying(x) to text would have no effect on performance while removing
all kinds of text length limit problems. This is a good resource: [6]
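
For an existing PostgreSQL database, the change could be applied with plain
ALTER TABLE statements. The sketch below only generates them; the helper
name widen_to_text and the "records" table name are assumptions, while the
two column names are the ones that failed for me above.

```python
def widen_to_text(table, columns):
    """Build PostgreSQL statements that change the given columns to text.

    Hypothetical helper: it returns SQL strings and does not execute them.
    """
    return [
        'ALTER TABLE "%s" ALTER COLUMN "%s" TYPE text;' % (table, column)
        for column in columns
    ]


# Columns that raised "value too long" in my tests; the table name is assumed.
for statement in widen_to_text(
    "records", ["abstract", "conditionapplyingtoaccessanduse"]
):
    print(statement)
```

The statements would need to be reviewed and run by hand (or through the
setup script) against the actual table.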

* As a really minor general comment, the exceptions raised while
loading documents [8] are a bit too noisy, as they output the whole
document, making it difficult to know what actually went wrong.
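
One possible way to tame this, sketched under my own assumptions (the
function name summarize_load_error and the 200 character limit are made up,
and this is not how pycsw currently reports errors), would be to report the
source, the exception type and a truncated message:

```python
def summarize_load_error(err, source, limit=200):
    """Return a one-line summary of a load failure.

    Hypothetical helper: collapses whitespace in the exception message and
    cuts it at `limit` characters so a whole document is never echoed.
    """
    message = " ".join(str(err).split())
    if len(message) > limit:
        message = message[:limit] + "..."
    return "%s: %s: %s" % (source, type(err).__name__, message)


# Simulate an exception whose message carries a very long payload.
try:
    raise ValueError("value too long for type\n" + "<xml>" * 500)
except ValueError as err:
    print(summarize_load_error(err, "record.xml", limit=60))
```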

Hope this helps,

Adrià




[1] http://pycsw.org/docs/administration.html#postgis
[2] https://github.com/geopython/pycsw/commit/c01ce4179a11c31a7ec46c1a9c5441225604e535#L4L255
[3] http://catalog.data.gov/harvest/object/3ff17b36-4a12-42c6-b615-8104291dadbd
[4] http://catalog.data.gov/harvest/object/2b7a2015-a942-48ff-ab03-a2728be6326a
[5] http://www.ngdc.noaa.gov/metadata/published/NOAA/NESDIS/NGDC/STP/Solar/iso/xml/G10136.xml
[6] http://www.depesz.com/2010/03/02/charx-vs-varcharx-vs-varchar-vs-text/
[7] http://catalog.data.gov/harvest/object/36074bbb-65eb-4e7a-b5e1-3b74e8bf836b
[8] https://github.com/amercader/pycsw/blob/master/pycsw/repository.py#L214

