[pycsw-devel] configuration design options
tomkralidis at hotmail.com
Fri Apr 29 18:15:21 EDT 2011
> Date: Fri, 29 Apr 2011 18:58:27 +0300
> From: gcpp.kalxas at gmail.com
> To: pycsw-devel at lists.sourceforge.net
> Subject: Re: [pycsw-devel] configuration design options
> Hi Tom,
> Some thoughts about this:
> On 04/27/2011 10:32 PM, Tom Kralidis wrote:
> > Hi: I'd like to get some input and thoughts w.r.t. our current configuration mechanism.
> > Currently, we use the ConfigParser approach to set runtime options. This has proven to be a simple and lightweight approach in the spirit of pycsw.
> The ConfigParser has served us well :)
> > Reasoning: I would like to move APISO as core (not option) functionality by
> > default. The APISO code can stay where it is, however I think most uses will
> > call for APISO implementation.
> I agree that most users will only look for APISO for deployment.
> But what happens if we merge APISO with core? Will we merge inspire code
> in core too? (Since we thought it is not a separate profile, but an
> extension to APISO)
> > Having said this, we load APISO
> > (repository, core queryables, etc.) in a separate space (initially) to
> > core. So what happens if a user wants to load APISO and does not have any DC records to load? The code currently _always_ loads csw:Record.
> This is an issue, you are right
> > I'd like repositories to be set/accessed as a list (self.repositories). As well, I think harmonizing to one main configuration (instead of core config, then APISO config) will be cleaner in the long run.
> In my opinion, if we do this, the profile setup we use now will fade out
> over time. What happens if someone wants to implement a new profile?
> Will he have to add things to this main configuration xml file? On the
> other hand this would solve problems (like multilanguage service
> metadata in Inspire)
> > With the above in mind, there are issues with the ConfigParser approach in the code:
> > - numerous checks against missing/not set options
> > - ConfigParser does not allow repeatable section names or options. How can we set multiple [repository] objects?
> > Options:
> > 1) still use ConfigParser, and namespace [repository] objects like [repository-csw:Record], [repository-gmd:MD_Metadata] to make them unique. This would take some tweaking in the config parsing, but is certainly doable. Not sure how user-friendly or error-prone this would be.
> > 2) use JSON as a config format. This can be integrated in the code easily enough, but I think this is prone to error given the complexity of the format.
> > 3) XML. Using XML (with XML Schema) allows us to:
> > - perform validation (offline with sbin/validate_xml.py or at runtime) of the configuration to ensure validity
> > - have repeatable objects (like repository) and properties
> > - have the option to parse the XML and convert to a Python dict (which is what we do with ConfigParser), or work directly with the etree object(s) in the code (and bypass extra parsing). This would give us some performance gains (although parsing an XML file is more overhead than ConfigParser). I've attached sample XML and XSD files as examples.
> I prefer option 1 with option 3 being second, but I don't feel strongly
> about 1. Configuring through cfg files is very simple for the end
> user. I believe that non-developer users have trouble configuring
> through xml files.
> On the other hand, xml has many advantages for development...
> > In the end, the goal of one main configuration and repeatable repository objects would benefit pycsw, and even opens up options to enable OGC:WFS support (not that this is in scope for pycsw, but perhaps a Python OGC Web Services framework which can implement n service types). Just a thought.
> > I'm mildly leaning towards option 3 (it would take some major code rework, which I am willing to implement), but would like to see what others think in terms of functionality and user-friendliness. OGC CITE support wouldn't be affected, but this would be a change to current configuration design; we are still in early phases, so better now than later :)
> Yes, if there is such a change ahead, better to do it now.
> > Thoughts? I hope this explanation is clear enough. Are there other options we can/should consider?
> > ..Tom
> My only hesitation is the profile-specific things that will change the
> xsd in the future. One configuration file means that each profile will
> need to add extra stuff to this file. This can get messy in the long run.
After much discussion on IRC, and some recommendations from others in the GIS / Python community, we are staying with ConfigParser as the config approach (option 1), with namespaced section names.
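As a rough illustration of what namespaced sections could look like, here is a minimal sketch using the standard library parser (spelled configparser in Python 3). The option names (database, table), the sample values, the "repository-" prefix constant and the load_repositories helper are all assumptions made up for this example, not the actual pycsw configuration or code.

```python
import configparser

# Hypothetical sample configuration: one [repository-<typename>] section
# per repository, so every section name stays unique.
SAMPLE = """
[server]
home = /var/www/pycsw

[repository-csw:Record]
database = sqlite:///records.db
table = records

[repository-gmd:MD_Metadata]
database = sqlite:///iso.db
table = md_metadata
"""

PREFIX = "repository-"  # assumed naming convention for this sketch


def load_repositories(text):
    """Collect every [repository-*] section into a list of dicts."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    repositories = []
    for section in parser.sections():
        if section.startswith(PREFIX):
            repo = dict(parser.items(section))
            # The part after the prefix identifies the record type.
            repo["typename"] = section[len(PREFIX):]
            repositories.append(repo)
    return repositories


repositories = load_repositories(SAMPLE)
for repo in repositories:
    print(repo["typename"], repo["database"])
```

The resulting list plays the role of the self.repositories idea discussed above: a user who only needs ISO records would simply leave out the [repository-csw:Record] section.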
Thanks much for the discussion and way forward.