[Geodata] discoverability and the wiki

Aaron Straup Cope asc at spum.org
Fri Oct 5 09:29:36 EDT 2007


Hellos,

I recently attended FOSS4G, in Victoria, and stopped in during the open 
geodata BOF.

One of the issues people raised was how to organize and find all of the 
possible data that may be housed on osgeo servers.

Since there is already a working instance of Mediawiki I wondered aloud 
whether something like the Semantic MediaWiki (SMW) extensions would be 
useful.

	http://meta.wikimedia.org/wiki/Semantic_MediaWiki

Let me pause briefly to just say : 1) I don't really like wikis either 
and 2) I am not going to rain on everyone's parade with pedantic semweb 
hocus pocus. No, really.

But.

The SMW stuff does make it pretty easy to add just that little bit of
extra data so that you aren't living and dying by full-text search
alone, and MW templates, once you suffer the initial setup, make it
possible to mostly hide all of the hard stuff.

Both are still fraught with their own ongoing issues but they save
people from having to write something from scratch and it's a reasonable
80/20 solution to the problem of making it easy enough to bother
entering data but detailed enough to make it worth getting it back out
again.

Maybe.

Eventually someone said : It sounds like you're volunteering. At which
point it became bad form not to at least put together a proof of concept.

So here it is, with details (and bugs) below : http://proj.spum.org/

(Also : I am not wed to any of this and I offer it up only as a 
suggestion. This is all stuff that I am interested in beyond any needs 
to index and discover open geodata so I'm not going to take my toys and 
leave if people decide it doesn't fit their needs.)

---

At the moment, there are 3 basic page types : Project, ProjectRelease, 
Creator. There are more, being a wiki and all, but you get the idea. 
Here are three sample pages followed by their (complete) markup :

# http://www.proj.spum.org/index.php?title=Net-Flickr-API
# {{Project|Aaron Straup Cope}}

I picked one of my own Perl modules mostly just to see if the templates
would work with real (not test) data.

Originally that page called a "PerlProject" template that in turn called
"Project" but also added the following SMW statement :
[[doap language::Perl]]

Just, you know, because you can.

# http://www.proj.spum.org/index.php?title=Net-Flickr-API_1.67
# {{ProjectRelease|2007-09-03|perl|cpan}}

All three parameters are optional.

The second option is the license under which the release is...released. 
Licenses are passed as "short names" and teased out in to full names and 
proper URLs in a separate template.

(At the moment, it only knows about two licenses : perl and cc-by-3.0)
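
For anyone who would rather not read template markup, here is a minimal
Python sketch of the same short-name lookup. It is an illustration only,
not the template itself : the function name, the table and the full
names are made up for the example. The Perl URL is the one the wiki
stores as its dc license value and the cc-by-3.0 URL is the standard
Creative Commons one.

```python
# Hypothetical lookup table standing in for the wiki's license template.
# Keys are the short names passed to ProjectRelease; the full names are
# illustrative labels, not necessarily what the template prints.
LICENSES = {
    "perl": (
        "Artistic License (Perl)",
        "http://www.perl.com/pub/a/language/misc/Artistic.html",
    ),
    "cc-by-3.0": (
        "Creative Commons Attribution 3.0",
        "http://creativecommons.org/licenses/by/3.0/",
    ),
}


def expand_license(short_name):
    """Return (full name, URL) for a short name, or None if it is unknown."""
    return LICENSES.get(short_name.strip().lower())


if __name__ == "__main__":
    for name in ("perl", "cc-by-3.0", "gpl"):
        print(name, "->", expand_license(name))
```

Adding a license means adding one row to the table, which is roughly
the appeal of keeping the mapping in its own template.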

The third option is a "permalink" short-name. Presumably this would be 
mostly moot in an osgeo context and the default is in fact to hang 
everything off of something like osgeo.org/geodata/Project/VersionNumber

# http://www.proj.spum.org/index.php?title=Aaron_Straup_Cope
# {{SummaryCreator}}

Profit?

---

Here's a "complicated" example :

# http://www.proj.spum.org/index.php?title=SomeProject

{{Project|Bob Exampolopolis|Mr. Nubby}}

== Description ==

This is a fuzzy project!

{{Tags|fuzzy|dice|muffins}}

== Meta ==

{{meta|dc|coverage|foo}}

# http://www.proj.spum.org/index.php?title=SomeProject_0.9
{{ProjectRelease|2007-09-01|cc-by-3.0}}

---

In the example above tags actually get added as "dc subject" properties
(as well as categories) with all the work being hidden in the Tags
template.

The Meta template is just a more general way to add domain specific 
data. Prefixes, like dc, can be registered in SMW such that they are 
recognized and expanded to proper URLs.
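
To make the expansion concrete, here is a small Python sketch of what
the Tags and meta templates plausibly emit. The exact markup the real
templates produce is not shown in this message, so treat the output
shape as a guess modelled on the [[doap language::Perl]] statement
above. The function names are invented for the example and the prefix
URLs are the usual Dublin Core and DOAP namespaces, not values read
from the wiki's SMW setup.

```python
# Illustrative prefix table. These are the commonly used namespace URLs for
# Dublin Core and DOAP, not values read from the wiki's SMW configuration.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "doap": "http://usefulinc.com/ns/doap#",
}


def expand_tags(*tags):
    """Mimic {{Tags|...}}: one "dc subject" property and one category per tag."""
    lines = []
    for tag in tags:
        lines.append("[[dc subject::%s]]" % tag)
        lines.append("[[Category:%s]]" % tag)
    return "\n".join(lines)


def expand_meta(prefix, name, value):
    """Mimic {{meta|prefix|name|value}}: a single prefixed property."""
    return "[[%s %s::%s]]" % (prefix, name, value)


def property_url(prefix, name):
    """Expand a registered prefix to a full URL; raises KeyError if unknown."""
    return PREFIXES[prefix] + name


if __name__ == "__main__":
    print(expand_tags("fuzzy", "dice", "muffins"))
    print(expand_meta("dc", "coverage", "foo"))
    print(property_url("dc", "coverage"))
```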

---

Out of the box, SMW will let you search by properties. For example :

http://www.proj.spum.org/index.php?title=Special:SearchByProperty/Dc_license:%3Dhttp://www.perl.com/pub/a/language/misc/Artistic.html
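
If you want to generate links like that from a script, a minimal Python
sketch follows. The URL shape is copied from the example link above
rather than from any SMW documentation, so it may not hold for other
versions, and the function name is made up for the example.

```python
from urllib.parse import quote

# Base URL taken from the example link above.
WIKI_INDEX = "http://www.proj.spum.org/index.php"


def search_by_property_url(prop, value, index=WIKI_INDEX):
    """Build a Special:SearchByProperty link shaped like the example above."""
    # Page-title form of the property: first letter upper-cased,
    # spaces turned into underscores ("dc license" -> "Dc_license").
    title = (prop[:1].upper() + prop[1:]).replace(" ", "_")
    # The example escapes "=" as %3D but leaves ":" and "/" in the value alone.
    return "%s?title=Special:SearchByProperty/%s:%s" % (
        index, title, quote("=" + value, safe=":/"))


if __name__ == "__main__":
    print(search_by_property_url(
        "dc license",
        "http://www.perl.com/pub/a/language/misc/Artistic.html"))
```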

One of the things that's also nice about SMW is the ability to do
inline queries on a page. For example, on a Project page you can display
all the releases like this :

<ask format="broadtable" order="desc" mainlabel="Release">
[[dc versionOf::{{PAGENAME}}]]
[[doap version::*|Version]]
[[doap created::*|Created]]
[[dc license::*|License]]
[[doap download-page::*|Download]]
</ask>
[[doap name::{{#explode:{{PAGENAME}}| |0}}]]
[[is a::Project]]
{{ #if: {{{1}}} | 
{{for|call=doap-creator|sep=|1={{{1|@}}}|2={{{2|@}}}|3={{{3|@}}}|4={{{4|@}}}|5={{{5|@}}}|6={{{6|@}}}|7={{{7|@}}}|8={{{8|@}}}|9={{{9|@}}}|10={{{10|@}}}}}||}}

And, yes, the {{for|call}} stuff (well, actually, all of it) is a little 
like stabbing yourself in the eyes. That's why you hide it all in templates.

The <ask> stuff works great where it works. And not so much where it 
doesn't. For example :

- Either because of MW caching or some other issue/feature, when you add
a new release the project page is not automagically updated. You need
to re-save it for the changes to appear. Not great but I gather this is
on the SMW to-do list.

- You can't "ask" for, say, [[doap name::{{#var:bucket}}]] because the 
query parser doesn't evaluate templates. This is irritating.

- You can define YA template as the output format for a query but if it 
actually works yet (it is known to be unstable) I haven't figured out 
how. Once it does though you could, for example, define a query on a 
release page that asks for all the creators defined on the main project 
page and then squirts them in to the metadata for the release page.

Just a lot of little conveniences so that data need only be entered once 
but gets sprayed across a variety of places where it could be useful.

If I ever get templates working, ask/templates could also be useful for
creating an API-like interface, which neither MW nor SMW does very well
at the moment.

---

I suppose I will leave it there for now. The only other thing that I
know for sure doesn't work is the "RDF feed" links. A bunch of templates
are missing and I have no idea why the SMW developers don't include
them by default. It may just be that I am using a dev version of the
code.

Oh, and there are probably still sloppy XSS holes so buyer beware.

Have a poke around. If you're feeling brave follow some of the templates 
but you may want to cry. If you're interested in playing with a related 
project there is also :

	http://grape.spum.org/

This one has a More Better (tm) search interface/API but only because I 
started to abuse the actual SMW source code. They have since updated 
things and I can't face whatever changes I'll need to make as a result...

	http://grape.spum.org/pages/HowToSearch

---

Discuss!

