[OSGeodata] geodata schema for persistence, discovery and binding

Norman Barker norman.barker at gmail.com
Tue Aug 1 18:13:59 EDT 2006


Stefan,

(apologies for the new thread, I have had to change email - the link
to our Boulder mail server is down from the UK!)

I am keen on the concept of a bot (crawler) rather than calling out to
a web service (such as the Google API), mainly because a crawler
requires a seed URL, and as such it is the administrator's
responsibility to decide whether he/she should use the data from that
institution.

I was initially planning to focus on searching for WCS
implementations, since the DescribeCoverage operation gives you a
fairly rich metadata set (though perhaps not standardised), and I like
your idea of using protocols to dig in further programmatically.  I am
looking to use J2EE to componentise this development (and to make it
fast to develop, since I know it!!)

so that we have

client -> JMS Queue -> Message Driven Bean (receives URL and search
parameters) -> calls interface to do the processing + database update.

With this type of coupling you can put any bean on the end to do your
work for you, and it also scales if you choose to run a cluster to
harvest.  The client will schedule jobs at timed intervals (part of
the J2EE spec - timer service).
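
A minimal sketch of that coupling, using plain-Java stand-ins so it
runs without an application server: a BlockingQueue plays the role of
the JMS queue and a worker thread plays the role of the message-driven
bean. The names HarvestRequest, HarvestProcessor, startConsumer and
stopAndWait are hypothetical, chosen for illustration only; a real
deployment would use the JMS and EJB APIs instead.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: a BlockingQueue stands in for the JMS queue and a
// worker thread stands in for the message-driven bean.
final class HarvestPipelineSketch {

    /** Hypothetical message body: the URL to harvest plus search parameters. */
    static final class HarvestRequest {
        final String url;
        final String searchParameters;

        HarvestRequest(String url, String searchParameters) {
            this.url = url;
            this.searchParameters = searchParameters;
        }
    }

    /** The interface the consumer calls; any implementation can be plugged in. */
    interface HarvestProcessor {
        void process(HarvestRequest request) throws Exception;
    }

    // Sentinel message that tells the worker to shut down.
    private static final HarvestRequest STOP = new HarvestRequest("", "");

    /** Starts a worker that takes requests off the queue and delegates to the processor. */
    static Thread startConsumer(BlockingQueue<HarvestRequest> queue, HarvestProcessor processor) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    HarvestRequest request = queue.take();
                    if (request == STOP) {
                        return;
                    }
                    try {
                        processor.process(request);
                    } catch (Exception e) {
                        // One failed harvest should not stop the worker.
                        System.err.println("harvest failed for " + request.url + ": " + e);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        return worker;
    }

    /** Queues the stop sentinel and waits for the worker to finish pending requests. */
    static void stopAndWait(BlockingQueue<HarvestRequest> queue, Thread worker) {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        BlockingQueue<HarvestRequest> queue = new LinkedBlockingQueue<>();
        Thread worker = startConsumer(queue,
                r -> System.out.println("harvesting " + r.url + " [" + r.searchParameters + "]"));
        queue.add(new HarvestRequest("http://example.org/wcs", "service=WCS"));
        stopAndWait(queue, worker);
    }
}
```

Because the worker only sees the HarvestProcessor interface, the
processing + database update can be swapped without touching the
queueing side. Running several workers against one queue is the
in-process analogue of adding cluster nodes.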

I wrote the queue part this afternoon, and I am looking to put a
crawler on tomorrow.  I am concerned with licenses (I really can't use
the GPL, but am happy to contribute to GeoTools (LGPL) and hence it
can roll up to GeoServer), so I am probably going to write a very
simple crawler from scratch - but it is plug + play with the interface
separation.
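
To illustrate what a very simple, pluggable crawler could look like,
here is a sketch that starts from a single seed URL and only follows
links on the seed's host, so the administrator's choice of seed bounds
what gets harvested. The PageFetcher interface, the crawl method, the
same-host rule and the regex-based link extraction are all assumptions
made for this example, not the crawler described in this thread.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: breadth-first crawl from one seed, restricted to the seed's host.
final class SeedCrawlerSketch {

    /** Hypothetical plug-in point: how a page body is obtained for a URL. */
    interface PageFetcher {
        String fetch(URI url) throws Exception;
    }

    // Naive href extraction; a real crawler would use an HTML parser.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the URLs attempted, in visit order, up to maxPages. */
    static Set<URI> crawl(URI seed, PageFetcher fetcher, int maxPages) {
        Set<URI> visited = new LinkedHashSet<>();
        Deque<URI> frontier = new ArrayDeque<>();
        frontier.add(seed);

        while (!frontier.isEmpty() && visited.size() < maxPages) {
            URI current = frontier.poll();
            if (!visited.add(current)) {
                continue; // already attempted
            }
            String body;
            try {
                body = fetcher.fetch(current);
            } catch (Exception e) {
                continue; // unreachable page: recorded as attempted, not expanded
            }
            if (body == null) {
                continue;
            }
            Matcher m = HREF.matcher(body);
            while (m.find()) {
                URI link;
                try {
                    link = current.resolve(m.group(1).trim());
                } catch (IllegalArgumentException e) {
                    continue; // malformed link
                }
                // Stay on the institution the administrator chose as seed.
                if (seed.getHost() != null && seed.getHost().equalsIgnoreCase(link.getHost())) {
                    frontier.add(link);
                }
            }
        }
        return visited;
    }
}
```

Swapping the PageFetcher (HTTP client, local file reader, test stub)
does not change the crawl logic, which is the plug + play property
the interface separation is meant to give.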

The attribute indicating protocol is interesting; if we base
'ingestion' on a URL identifier passed as a search parameter, we can
have local + remote ingestion with the same code - matching on file://
or http:// or whatever scheme the identifier uses.
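
One way to picture this is a small registry keyed by URI scheme, so
the caller passes an identifier and the matching ingester is picked
from its prefix. This is only a sketch under assumptions: the Ingester
interface and the register/ingest methods are hypothetical names, and
the ingesters here just return a description string rather than
reading any data.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Sketch only: choose an ingester from the scheme of the identifier.
final class IngestDispatchSketch {

    /** Hypothetical ingester contract; returns a short description of what it did. */
    interface Ingester {
        String ingest(URI source);
    }

    private final Map<String, Ingester> byScheme = new HashMap<>();

    /** Registers an ingester for a scheme such as "file" or "http". */
    void register(String scheme, Ingester ingester) {
        byScheme.put(scheme.toLowerCase(Locale.ROOT), ingester);
    }

    /** Parses the identifier and hands it to the ingester registered for its scheme. */
    String ingest(String identifier) {
        URI uri = URI.create(identifier); // throws IllegalArgumentException if malformed
        String scheme = uri.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("identifier has no scheme: " + identifier);
        }
        Ingester ingester = byScheme.get(scheme.toLowerCase(Locale.ROOT));
        if (ingester == null) {
            throw new IllegalArgumentException("no ingester registered for scheme: " + scheme);
        }
        return ingester.ingest(uri);
    }

    public static void main(String[] args) {
        IngestDispatchSketch dispatch = new IngestDispatchSketch();
        dispatch.register("file", uri -> "local file " + uri.getPath());
        dispatch.register("http", uri -> "remote host " + uri.getHost());

        System.out.println(dispatch.ingest("file:///data/coverage.tif"));
        System.out.println(dispatch.ingest("http://example.org/wcs?service=WCS"));
    }
}
```

Adding another protocol is then a matter of registering one more
ingester; the calling code does not change.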

I am very interested to see how the database schema develops and how
it performs with PostGIS.  Hopefully we can get something up in the
next 10 days.

Unfortunately I know very little about PostGIS (though I have wanted
to use it for a while), so as the schema develops further I would be
interested to see how you index GetCapabilities responses (free text
search? spatial index?)

many thanks,

Norman



