[pycsw-devel] Initial plan to implement ISO and INSPIRE compatibility for PyCSW

Sun Mar 6 18:13:37 EST 2011

Hi Tom,

I have commented below...

> FYI clarification: Although SQLAlchemy saves us from being tied to a
> single database, Python's sqlite lib allows for calling Python functions
> (via create_function), which the code uses for bbox (passing to Shapely)
> at this point.  So we'll have to work on moving away from this and find.
>
Yes, SQLAlchemy helps a lot here, along with the functions used in
SQLite. In the future, we should also consider PostGIS as a way to
achieve better performance.
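
A minimal sketch of the create_function mechanism mentioned above, for reference. The table layout, the "minx,miny,maxx,maxy" string format and the function name are assumptions made for illustration; this is not the actual pycsw code, which passes the geometry to Shapely instead of comparing coordinates by hand.

```python
import sqlite3


def bbox_intersects(a, b):
    """Return 1 if two 'minx,miny,maxx,maxy' strings overlap, else 0."""
    aminx, aminy, amaxx, amaxy = (float(v) for v in a.split(","))
    bminx, bminy, bmaxx, bmaxy = (float(v) for v in b.split(","))
    overlap = (aminx <= bmaxx and bminx <= amaxx and
               aminy <= bmaxy and bminy <= amaxy)
    return int(overlap)


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the same name, taking 2 arguments.
conn.create_function("bbox_intersects", 2, bbox_intersects)

# Hypothetical table: one record per row, bbox stored as a string.
conn.execute("CREATE TABLE records (identifier TEXT, bbox TEXT)")
conn.executemany("INSERT INTO records VALUES (?, ?)", [
    ("rec-athens", "23.5,37.8,24.0,38.2"),
    ("rec-toronto", "-79.6,43.5,-79.1,43.9"),
])

rows = conn.execute(
    "SELECT identifier FROM records WHERE bbox_intersects(bbox, ?) = 1",
    ("20.0,35.0,28.0,41.0",),
).fetchall()
hits = [row[0] for row in rows]
print(hits)
```

Because the filter runs as a Python callback inside SQLite, it does not carry over to other databases, which is why a move to something like PostGIS would need a different spatial predicate.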

> Having said this, you are right; the strategy is similar to GN in terms
> of storing an entire XML document in a db record.  We are additionally
> storing the core queryables (Dublin Core) so as to be able to query more
> efficiently.  When returning GetRecords results, if
> elementsetname='full', then we return the entire XML document as stored;
> else, we present (either brief or summary) as per
> http://schemas.opengis.net/csw/2.0.2/record.xsd.
>
Great, this seems to be a good solution.
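
A rough sketch of the presentation logic described above: return the stored XML untouched for 'full', otherwise build a brief or summary record from the stored queryables. The `present` function, the `row` dictionary and the field lists are hypothetical, and the element sets are simplified subsets of what record.xsd defines (the dct and ows elements are left out).

```python
import xml.etree.ElementTree as ET

CSW = "http://www.opengis.net/cat/csw/2.0.2"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("csw", CSW)
ET.register_namespace("dc", DC)

# Simplified subsets of the elements record.xsd lists for each element set.
ELEMENT_SETS = {
    "brief": ("BriefRecord", ("identifier", "title", "type")),
    "summary": ("SummaryRecord",
                ("identifier", "title", "type", "subject", "format")),
}


def present(row, elementsetname):
    """Serialize one stored record according to the requested element set."""
    if elementsetname == "full":
        # The complete document is returned exactly as stored.
        return row["xml"]
    tag, fields = ELEMENT_SETS[elementsetname]
    root = ET.Element("{%s}%s" % (CSW, tag))
    for name in fields:
        value = row.get(name)
        if value is not None:
            ET.SubElement(root, "{%s}%s" % (DC, name)).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical db row: queryable columns plus the full XML document.
row = {
    "identifier": "rec-1",
    "title": "Sample dataset",
    "type": "dataset",
    "subject": "land cover",
    "xml": "<csw:Record xmlns:csw='%s'/>" % CSW,
}
brief = present(row, "brief")
summary = present(row, "summary")
print(brief)
```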
>>   - Also, there is the example of MDWeb project, another open
>> source Java implementation of CSW that has implemented the
>> full ISO 19115 schema within postgres (more than 1300 lines
>> of SQL) and stores all the info in this schema. My opinion is
>> that this would lead to unmaintainable code or we would lose
>> the current advantage to use all RDBMS systems available
>> through SQLAlchemy.
> Ouch!  IMHO that would be prone to error along the metadata lifecycle.
>
I think we should not consider a solution like this at all. I only
mentioned it as an example of an existing implementation :)

>>   - Finally, I see another possible solution, a middle path.
>> Store the main queryables in db columns (with extra tables
>> for "one to many" connections) but without following the ISO
>> UMLs. At the same time the xml will be stored in separate
>> column in the main table. This will make the xml import and
>> export functions more complicated, but will lead to better
>> performance. I think this direction would lead to ~15 extra
>> tables (eg one for "Keywords",  one for "Resource Language",
>> one for "Topic Category" etc)
>>
>>
> Good idea.  For ISO/INSPIRE support, we could start by mapping the ISO
> core queryables to the Dublin Core core queryables.  Core queryables
> over and above would need to be exposed specifically for ISO/INSPIRE
> support.
>
> I would think this work would best be done as a plugin to pycsw.
> Having said this, a plugin architecture would be valuable here, so pycsw
> can support n profiles over time.
>
Having read the code for mapping the queryables, and after your help
explaining the three-way mapping between SQLAlchemy, the XML schema and
the database, it is now clear to me that we can create an ISO (and later
an INSPIRE) mapping as a plugin. I agree.
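
To make the plugin idea a bit more concrete, here is a minimal sketch of a profile registry in which an ISO profile maps its queryables onto the columns already used for the Dublin Core ones. All class, function and column names here (`Profile`, `register`, `resolve`, `topic_category`, and so on) are hypothetical and do not reflect the existing pycsw code; only the queryable names come from the Dublin Core and ISO profile vocabularies.

```python
# Core (Dublin Core) queryables mapped to hypothetical db column names.
CORE = {
    "dc:title": "title",
    "dct:abstract": "abstract",
    "dc:subject": "keywords",
}

PROFILES = {}


class Profile:
    """Hypothetical base class: a profile maps queryables to db columns."""
    name = None
    queryables = {}


def register(cls):
    """Class decorator that makes a profile available by name."""
    PROFILES[cls.name] = cls()
    return cls


@register
class ISOProfile(Profile):
    name = "apiso"
    queryables = {
        # ISO queryables that reuse the columns of their Dublin Core peers.
        "apiso:Title": CORE["dc:title"],
        "apiso:Abstract": CORE["dct:abstract"],
        "apiso:Subject": CORE["dc:subject"],
        # Queryables over and above the core set need their own column.
        "apiso:TopicCategory": "topic_category",
    }


def resolve(queryable):
    """Find the db column for a queryable, core set first, then profiles."""
    if queryable in CORE:
        return CORE[queryable]
    for profile in PROFILES.values():
        if queryable in profile.queryables:
            return profile.queryables[queryable]
    raise KeyError(queryable)


print(resolve("apiso:Title"), resolve("apiso:TopicCategory"))
```

An INSPIRE profile could later be registered the same way, adding only the queryables it defines beyond the ISO ones.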

> I think it would be a good idea to:
>
> - start mapping out the core queryables (we should start with the ISO
> profile first IMHO, since it looks like INSPIRE is an extension based on
> the ISO profile)
> - establish a framework on adding plugins to the code
>
> We could use the wiki at https://sourceforge.net/apps/trac/pycsw/wiki to
> start and flesh out requirements.
>
> Thoughts?
>
> ..Tom
>
Yes, we should start on this and then build up.
We also have to decide about adding some extra tables to the db schema, 
but I guess this can wait for a bit later. Let's just test the mappings 
on a single table and then change this for performance if necessary.
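
For the later discussion about extra tables, a small sketch of what one "one to many" table next to the main table could look like. The table and column names are assumptions for illustration only; the main table keeps the full XML, and the keywords table exists purely to speed up queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Main table: core queryables plus the complete XML document.
    CREATE TABLE records (
        identifier TEXT PRIMARY KEY,
        title      TEXT,
        xml        TEXT
    );
    -- Extra table for a one-to-many queryable (many keywords per record).
    CREATE TABLE keywords (
        identifier TEXT REFERENCES records(identifier),
        keyword    TEXT
    );
""")

conn.execute(
    "INSERT INTO records VALUES (?, ?, ?)",
    ("rec-1", "Land cover of Attica", "<gmd:MD_Metadata/>"),
)
conn.executemany(
    "INSERT INTO keywords VALUES (?, ?)",
    [("rec-1", "land cover"), ("rec-1", "remote sensing")],
)

# Query on the extra table, return the stored XML from the main table.
rows = conn.execute(
    "SELECT r.identifier, r.xml FROM records r "
    "JOIN keywords k ON k.identifier = r.identifier "
    "WHERE k.keyword = ?",
    ("remote sensing",),
).fetchall()
print(rows)
```

The cost is that every insert or update of a record has to keep the extra tables in sync with the XML, which is the added import and export complexity mentioned earlier in the thread.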

I have a short list of tasks for myself:
1. Create a test configuration file for ISO/INSPIRE.
2. Check whether the config class needs any model changes.
3. Install a working server.
4. Test a scenario for the ISO mappings, based on my previous mail.

Best regards,
Angelos

-- 

Angelos Tzotsos
Remote Sensing Laboratory
National Technical University of Athens
http://users.ntua.gr/tzotsos