[OSGeodata] geodata schema for persistence, discovery and binding
Norman Barker
norman.barker at gmail.com
Tue Aug 1 18:13:59 EDT 2006
Stefan,
(apologies for the new thread, I have had to change email - the link
to our Boulder mail server is down from the UK!)
I am keen on the concept of a bot (crawler) rather than calling out to
a web service (such as the Google API), mainly because a crawler
requires a seed URL, and as such it is the administrator's
responsibility to decide whether he/she should use data from that
institution.
I was initially planning to focus on searching for WCS
implementations since the DescribeCoverage operation gives you a
fairly rich metadata set (though perhaps not standardised), and I like
your idea of using protocols to dig in further programmatically. I am
looking to use J2EE to componentise this development (and to make it
fast to develop since I know it!!)
so that we have
client -> JMS Queue -> Message Driven Bean (receives url and search
parameters) -> calls interface to do the processing + database update.
With this type of coupling you can put any bean on the end to do your
work for you + it also scales if you choose to run a cluster to
harvest. The client will schedule jobs at timed intervals (using the
timer service, which is part of the J2EE spec).
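
The decoupling described above can be sketched roughly as follows.
This is only an illustration: a plain java.util.concurrent queue
stands in for the JMS queue so that it runs without an application
server, and the names HarvestJob, HarvestProcessor and HarvestWorker
are hypothetical, not taken from any existing code.

```java
import java.util.concurrent.BlockingQueue;

/** A harvest request: the seed URL plus search parameters (hypothetical shape). */
final class HarvestJob {
    final String url;
    final String searchParameters;

    HarvestJob(String url, String searchParameters) {
        this.url = url;
        this.searchParameters = searchParameters;
    }
}

/** The pluggable end of the pipeline: processing plus database update. */
interface HarvestProcessor {
    void process(HarvestJob job);
}

/**
 * Stand-in for the message-driven bean. It only takes jobs off the
 * queue and hands them to whatever processor was plugged in.
 */
final class HarvestWorker implements Runnable {
    private final BlockingQueue<HarvestJob> queue;
    private final HarvestProcessor processor;

    HarvestWorker(BlockingQueue<HarvestJob> queue, HarvestProcessor processor) {
        this.queue = queue;
        this.processor = processor;
    }

    /** Processes every job currently queued and returns how many were handled. */
    int drain() {
        int handled = 0;
        HarvestJob job;
        while ((job = queue.poll()) != null) {
            processor.process(job);
            handled++;
        }
        return handled;
    }

    @Override
    public void run() {
        drain();
    }
}
```

Because the worker only knows the HarvestProcessor interface, a
crawler, a WCS harvester or a test stub can sit on the end without
touching the queue code. In this sketch a ScheduledExecutorService
calling run() would play the part of the timer service.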
I wrote the queue part this afternoon, and I am looking to put a
crawler on the end tomorrow - I am concerned about licenses (I really
can't use the GPL - but am happy to contribute to Geotools (LGPL) and
hence it can roll up to GeoServer), so am probably going to write a
very simple crawler from scratch - but with the interface separation
it stays plug + play.
The attribute indicating protocol is interesting; if we base
'ingestion' on a URL passed as the search parameter identifier, we can
have local + remote ingestion with the same code - matching on
file:// or http:// or whatever scheme the identifier uses.
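
A minimal sketch of that scheme matching, assuming the identifier is
always a full URI. The class name IngestSource and the set of
supported schemes are only assumptions for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

/** Opens local and remote resources through one entry point (hypothetical helper). */
final class IngestSource {
    private IngestSource() {}

    /** Chooses how to open the resource by looking only at the URI scheme. */
    static InputStream open(String identifier) throws IOException {
        URI uri = URI.create(identifier);
        String scheme = uri.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("identifier has no scheme: " + identifier);
        }
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "file":
                // Local ingestion: read straight from the file system.
                return Files.newInputStream(Paths.get(uri));
            case "http":
            case "https":
                // Remote ingestion: fetch over the network.
                return uri.toURL().openStream();
            default:
                throw new IllegalArgumentException("unsupported scheme: " + scheme);
        }
    }
}
```

The caller gets an InputStream either way, so the parsing and database
update code does not need to know where the data came from.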
I am very interested to see how the database schema develops and how
it performs with PostGIS. Hopefully we can get something up in the
next 10 days.
Unfortunately I know very little about PostGIS (though I have wanted
to use it for a while), so when the schema develops further I would be
interested to see how you index GetCapabilities responses (free text
search? spatial index?)
many thanks,
Norman