[pycsw-devel] Initial plan to implement ISO and INSPIRE compatibility for PyCSW

Angelos Tzotsos gcpp.kalxas at gmail.com
Tue Mar 1 18:42:34 EST 2011


Hi all,

I have been reading the code lately and was wondering how we should
proceed in order to implement ISO and INSPIRE support in PyCSW.

I think the first thing to look at is the database schema:

  - For now, Tom has implemented Dublin Core + CSW 2.0.2 using
SQLAlchemy, Shapely and SQLite (which could actually be any database).
The basic db schema for this includes the Dublin Core queryable columns
in one single table, as well as the geographic extent in an extra field.
The rest of the XML metadata file is stored in a separate db column
(which can be queried through XPath).
Almost the same strategy is implemented in GeoNetwork CSW.

  - Another solution would be to store only the XML files in the
database and use only XPath queries. I suppose this would hurt
performance for large datasets.

  - Also, there is the example of the MDWeb project, another open source
Java implementation of CSW, which has implemented the full ISO 19115
schema within Postgres (more than 1300 lines of SQL) and stores all the
info in this schema. My opinion is that this would lead to
unmaintainable code, or we would lose the current advantage of being
able to use any RDBMS available through SQLAlchemy.

  - Finally, I see another possible solution, a middle path: store the
main queryables in db columns (with extra tables for "one to many"
relationships) but without following the ISO UML models. At the same
time the XML will be stored in a separate column in the main table. This
will make the XML import and export functions more complicated, but will
lead to better performance. I think this direction would lead to ~15
extra tables (e.g. one for "Keywords", one for "Resource Language", one
for "Topic Category", etc.).
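
To make the "middle path" a bit more concrete, here is a minimal sketch of
what I have in mind. It uses the standard library sqlite3 module instead of
SQLAlchemy only to stay self-contained, and every table, column and function
name in it is hypothetical, not something that exists in PyCSW today. It
shows one main table holding a few queryables plus the full XML, and one
"one to many" side table for keywords.

```python
import sqlite3

# Hypothetical schema: names are illustrative only, not PyCSW's actual tables.
SCHEMA = """
CREATE TABLE records (
    identifier TEXT PRIMARY KEY,   -- metadata file identifier
    title      TEXT NOT NULL,      -- dataset / resource title
    abstract   TEXT,
    bbox_wkt   TEXT,               -- geographic extent stored as WKT
    xml        TEXT NOT NULL       -- full metadata document
);
CREATE TABLE keywords (
    record_id TEXT NOT NULL REFERENCES records(identifier),
    keyword   TEXT NOT NULL
);
CREATE INDEX idx_keywords_keyword ON keywords(keyword);
"""


def create_catalogue():
    """Create an in-memory catalogue with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def insert_record(conn, identifier, title, abstract, bbox_wkt, xml, keywords):
    """Store the queryables, the raw XML, and one row per keyword."""
    conn.execute(
        "INSERT INTO records VALUES (?, ?, ?, ?, ?)",
        (identifier, title, abstract, bbox_wkt, xml),
    )
    conn.executemany(
        "INSERT INTO keywords VALUES (?, ?)",
        [(identifier, keyword) for keyword in keywords],
    )
    conn.commit()


def find_by_keyword(conn, keyword):
    """Return identifiers of records tagged with the given keyword."""
    rows = conn.execute(
        "SELECT r.identifier FROM records r "
        "JOIN keywords k ON k.record_id = r.identifier "
        "WHERE k.keyword = ? ORDER BY r.identifier",
        (keyword,),
    )
    return [row[0] for row in rows]
```

The other "one to many" elements (resource language, topic category, and
so on) would presumably follow the same pattern as the keywords table, which
is where the extra tables come from.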

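For comparison, the XML-only option mentioned above could look roughly like
the sketch below. It assumes a simplified, un-namespaced metadata document
(real ISO records use namespaced elements) and a hypothetical helper name.
The point is only that every query has to parse and scan every stored
document, which is why I expect it to be slow for large catalogues.

```python
import xml.etree.ElementTree as ET


def find_by_xpath(documents, path, value):
    """Return identifiers whose XML has an element at `path` equal to `value`.

    `documents` maps identifier -> XML string. Every document is parsed on
    every call, so the cost grows with the size of the catalogue.
    """
    matches = []
    for identifier, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        # ElementTree supports only a limited subset of XPath.
        if any((el.text or "").strip() == value for el in root.findall(path)):
            matches.append(identifier)
    return sorted(matches)
```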

But what kind of data needs to be stored in the db in order to have basic
compliance with ISO 19115 and INSPIRE for datasets?

The ISO core elements are (* marks the core queryables in CSW):
1.* Dataset Title (M - Mandatory)
2. Dataset reference date (M)
3. Dataset responsible party (O - Optional)
4.* Geographic location of the dataset (C - Conditional)
5. Dataset language (M)
6. Dataset character set (C)
7.* Dataset topic category (M) (includes keywords in CSW queryable)
8. Spatial resolution of the dataset (O)
9.* Abstract describing the dataset (M)
10.* Distribution format (O)
11. Additional extent information for the dataset (O)
12. Spatial representation type (O)
13.* Reference system (O)
14. Lineage (O)
15. On-line resource (O)
16.* Metadata file identifier (O)
17. Metadata standard name (O)
18. Metadata standard version (O)
19. Metadata language (C)
20. Metadata character set (C)
21. Metadata point of contact (M)
22.* Metadata date stamp (M)

plus CSW queryables
* "Any text"
* Type (default "dataset")


For INSPIRE the corresponding list is (numbers map to the ISO list above
and * marks queryables):
1.* Resource title (M) [1]
2.* Temporal reference (C) [0..n]
3.* Responsible organization (M) including both name of the organization 
and contact e-mail [1]
4.* Geographic Bounding Box (M) [1..n]
5. Resource language (C) [0..n]
7.* Topic category (M) [1..n]
8.* Spatial resolution (C) [0..n]
9.* Resource abstract (M) [1]
11. Temporal extent (C) [0..n]
14.* Lineage (M) [1]
15. Resource Locator (C) [0..n]
19. Metadata Language (M) [1]
21. Metadata point of contact (M) including both name of the 
organization and contact e-mail [1..n]
22. Metadata Date (M) [1]
23.* Resource Type (M) [1]
24.* Unique Resource Identifier (M) [1..n]
25.* Keyword (M) [1..n]
26.* Conformity (M) [1]
27.* Conditions for access and use (M) [1..n]
28.* Limitations on public access (M) [1..n]

  [1..n] indicates "1 to many"
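
As a rough illustration of how the multiplicities above could be checked on
import, here is a small sketch. The table contains only a few entries copied
from the INSPIRE list, and the record layout (a dict of lists) and function
names are hypothetical. It does not handle the conditional (C) elements,
whose [0..n] multiplicity alone says nothing about when they are required.

```python
import re

# A few entries copied from the INSPIRE list; not the complete set.
INSPIRE_MULTIPLICITY = {
    "Resource title": "[1]",
    "Geographic Bounding Box": "[1..n]",
    "Resource language": "[0..n]",
    "Keyword": "[1..n]",
    "Lineage": "[1]",
}


def parse_multiplicity(spec):
    """Turn '[1]', '[0..n]' or '[1..n]' into (minimum, maximum or None)."""
    match = re.fullmatch(r"\[(\d+)(?:\.\.(\d+|n))?\]", spec)
    if match is None:
        raise ValueError(f"unrecognised multiplicity: {spec!r}")
    low = int(match.group(1))
    upper = match.group(2)
    if upper is None:
        return low, low          # '[1]' means exactly one
    if upper == "n":
        return low, None         # unbounded upper limit
    return low, int(upper)


def check_record(record):
    """Return a list of multiplicity problems for a dict of element lists."""
    problems = []
    for name, spec in INSPIRE_MULTIPLICITY.items():
        low, high = parse_multiplicity(spec)
        count = len(record.get(name, []))
        if count < low:
            problems.append(f"{name}: expected at least {low}, found {count}")
        elif high is not None and count > high:
            problems.append(f"{name}: expected at most {high}, found {count}")
    return problems
```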


The sources for the above are:
http://portal.opengeospatial.org/files/?artifact_id=21460
http://inspire.jrc.ec.europa.eu/documents/Metadata/INSPIRE_MD_IR_and_ISO_v1_2_20100616.pdf
http://inspire.jrc.ec.europa.eu/documents/Network_Services/Technical_Guidance_Discovery_Services_v2.12.pdf

Another interesting document to read is:
http://www.neogeo-online.net/blog/wp-content/uploads/2011/01/201012_geonetwork_inspire_v0.6.pdf

Any thoughts, ideas, proposals on how to proceed?

Regards,
Angelos

-- 
Angelos Tzotsos
Remote Sensing Laboratory
National Technical University of Athens
http://users.ntua.gr/tzotsos
