[Geodata] Fwd: The once and future OpenAerialMap
Schuyler Erle
schuyler at nocat.net
Mon Nov 2 13:06:46 EST 2009
My take on the latest attempt at jump-starting the OpenAerialMap effort.
Please reply to the talk at openaerialmap.org list if interested!
SDE
-------- Forwarded Message --------
From: Schuyler Erle <schuyler at nocat.net>
To: talk at openaerialmap.org
Subject: [OAM-talk] The once and future OpenAerialMap
Date: Mon, 02 Nov 2009 12:08:45 -0500
Recently there's been a surge of interest in OpenAerialMap in the
context of humanitarian crisis response, but most of the discussion in
the last couple of months has been in private emails and conversations.
I'd like to take the opportunity to try to restart the public discussion
in the hopes of moving forward.
This will be a long email, but I have a lot of thoughts I want to get
out. Please forgive the extent to which this missive covers ground that
has already been covered on this list.
----
OpenAerialMap is meant, to my understanding, to provide a Free and Open
archive of aerial and satellite imagery for general use.
The four main challenges behind OAM, as I see them, are as follows:
1. Community interest
2. Imagery cataloging
3. Reliable storage
4. Low-latency delivery
Community interest
------------------
Bona-fide community interest has been the main sticking point to date.
I've been agitating for an OAM-like service for nearly five years, but
OAM is going to continue to be merely an unfinished prototype and a nice
idea until we have a critical mass of interested parties to keep the
project going. Based on the private discussions around its utility for
crisis response, I think we now have a real use case that the community
can develop around.
The next question is one of organizing. If and when we have critical
mass, perhaps we should constitute a project steering committee (PSC)
under a community governance model akin to Apache, OpenLayers, etc., so
that someone is empowered to make executive decisions on behalf
of the community. This would mean that, once this list begins to reach
consensus, we have someone to take responsibility for moving the ball
forward, rather than us all staring at each other and wondering what
happens next.
Cataloging
----------
After talking with Christopher Schmidt last week, I'm convinced that the
imagery catalog is the key point on the critical path to a working
OpenAerialMap. Basically, ground imagery can be delivered *to* OAM in
two forms:
(1) hosted elsewhere and exposed via WMS, etc. MassGIS and Landsat-7 are
good examples.
(2) delivered in bulk (on DVD, hard drive, via upload) from a third
party, whether it's from a government or NGO mapping bureau, or a
homebrew UAV enthusiast.
Both of these data sources need to be cataloged appropriately in OAM, so
that they can be mosaicked into a single ground imagery layer at each
zoom level. The WMS-delivered datasets are easier to deal with in the
sense that they're already online and OAM can simply reproject, proxy,
and cache requests. The bulk-delivered datasets are a little more
complex because they require hosting somewhere, but once they're online,
they too can be exposed via WMS and cached in OAM by the same means.
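
As a rough illustration of the proxying step, here is a minimal sketch
in Python of how a cache might turn a z/x/y tile address into a WMS
GetMap request. It assumes the upstream server (or a reprojecting
MapServer in front of it) can answer in spherical Mercator (EPSG:3857)
with 256-pixel JPEG tiles; the function names, the base URL and the
layer name are placeholders, not part of any existing OAM code.

```python
import math
from urllib.parse import urlencode

# Half the width of the spherical-Mercator world, in metres.
ORIGIN_SHIFT = math.pi * 6378137.0


def tile_bbox(z, x, y):
    """Return (minx, miny, maxx, maxy) in EPSG:3857 for an XYZ tile.

    Assumes the common convention where y = 0 is the northernmost row.
    """
    size = 2 * ORIGIN_SHIFT / (2 ** z)   # tile edge length in metres
    minx = -ORIGIN_SHIFT + x * size
    maxy = ORIGIN_SHIFT - y * size
    return (minx, maxy - size, minx + size, maxy)


def getmap_url(base_url, layer, z, x, y, tile_px=256):
    """Build a WMS 1.1.1 GetMap URL covering exactly one tile.

    base_url is assumed to have no query string of its own.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "SRS": "EPSG:3857",   # assumption: upstream supports this SRS
        "BBOX": ",".join(f"{v:.6f}" for v in tile_bbox(z, x, y)),
        "WIDTH": tile_px,
        "HEIGHT": tile_px,
        "FORMAT": "image/jpeg",
    }
    return base_url + "?" + urlencode(params)
```

Because every tile maps to exactly one URL, the response can be stored
under the (z, x, y) key and served again without touching the upstream
server.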
Christopher has already written a very basic imagery catalog for OAM
using Django. IIRC, this catalog app generates a MapServer configuration
that performs the reprojection and mosaic. I would submit that we should
start with the existing OAM catalog code and look critically at how we
can extend it to be more comprehensive and robust, versus adopting
something else (GeoNetwork?) or starting from scratch.
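
To make the cataloging idea concrete, here is a small sketch in Python
of what a catalog record and a mosaic-ordering query could look like.
The field names and the "coarsest first, finest on top" rule are
assumptions for illustration only; this is not the schema of the
existing Django app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImagerySource:
    """One catalog entry (hypothetical fields)."""
    name: str
    kind: str                # "wms" (hosted elsewhere) or "bulk" (delivered to OAM)
    url: str
    bbox: tuple              # (west, south, east, north) in degrees
    metres_per_pixel: float  # nominal ground resolution


def intersects(a, b):
    """True if two (west, south, east, north) boxes overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def draw_order(catalog, bbox):
    """Sources covering bbox, coarsest first so the finest is drawn last."""
    hits = [s for s in catalog if intersects(s.bbox, bbox)]
    return sorted(hits, key=lambda s: s.metres_per_pixel, reverse=True)
```

A generator for a MapServer configuration could walk the result of
draw_order() and emit one layer per source, whichever way the source
originally reached OAM.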
Whether to crowdsource or curate the building of the catalog is out of
scope for the moment, but we should think about some model that lets
anyone submit WMS URLs or offer to upload imagery, with data curators
empowered by the project steering committee (PSC) to vet and manage the
"official" catalog itself.
The nice thing about starting by focusing on the catalog is that it
allows us to build a usable prototype by simply caching the mosaic in
RAM on some single machine for now, without having to commit to a
long-term storage plan first.
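
For the single-machine prototype, an in-memory cache could be as simple
as the following Python sketch: a least-recently-used tile store that
only calls the mosaic renderer on a miss. The class name and the fetch
callback are hypothetical; a real deployment would more likely use an
existing tile cache than hand-rolled code.

```python
from collections import OrderedDict


class TileCache:
    """Keep up to max_tiles rendered tiles in RAM, evicting the least recently used."""

    def __init__(self, max_tiles, fetch):
        self.max_tiles = max_tiles
        self.fetch = fetch            # callable (z, x, y) -> bytes, e.g. a WMS request
        self._tiles = OrderedDict()   # insertion order doubles as recency order

    def get(self, z, x, y):
        key = (z, x, y)
        if key in self._tiles:
            self._tiles.move_to_end(key)     # mark as most recently used
            return self._tiles[key]
        data = self.fetch(z, x, y)           # cache miss: render or proxy the tile
        self._tiles[key] = data
        if len(self._tiles) > self.max_tiles:
            self._tiles.popitem(last=False)  # drop the least recently used tile
        return data
```

Nothing here persists across restarts, which is the point: the cache can
be thrown away and rebuilt from the catalog at any time.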
Reliable storage
----------------
OAM in its most recent conception didn't really get far enough to look
at solving this problem. One issue is that we're looking at scaling up
to dozens or hundreds of TB of imagery, but let's assume that at least one
organization has a sufficient need and sufficient goodwill to step up
and offer to provide this to the community for free.
Such an offer raises the other issue, which is that no one will want
to invest the effort in (re)building OAM on a single hosting provider
if there's any risk that the host might lose interest (perhaps because
the project's liaison gets a new job elsewhere), drop the project, go
offline without warning, or experience significant downtime. I think we
have seen this with well-intentioned efforts in the past to offer our
community hosting for free but with no SLA. We need to avoid this
situation moving forward. An OAM that doesn't stay online is worse than
none at all.
I see a range of options between fully centralized and fully distributed
hosting:
(1) A single hosting provider offers hosting to "the community" under
some kind of binding commitment with a lengthy term and a definite SLA.
Since no money will be changing hands, it's hard to see how we ("the
community") could enforce such an agreement (or indeed who exactly the
hosting institution would be making the agreement with).
(2) Multiple hosting providers mirroring each other in some arrangement,
with round-robin DNS and the project steering committee empowered to
change the DNS records as necessary. The downside of this is that it
will take up a lot of bandwidth, and still puts the eggs into relatively
few baskets, but at least there's some redundancy.
(3) A quasi-peer-to-peer solution with a limited number of trusted,
high-capacity seed nodes that obtain, cryptographically sign, and cache
tiles derived from the imagery sources in the catalog. A larger number
of leaf nodes could serve actual client requests and maintain
distributed local caches. We could design a fixed level of redundancy
into this topology, provided there's enough space available. Naturally
this is the technical solution that I recommend considering.
Pure P2P solutions are great for exchanging large files, but typically
have too much latency to be practical for these purposes (see below).
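
Two pieces of option (3) can be sketched in a few lines of Python:
signing tiles at the seed nodes, and spreading each tile over a fixed
number of leaf nodes. Both are illustrations under stated assumptions.
The signature below is an HMAC with a shared secret, used only because
it is in the standard library; a real design would want public-key
signatures so that leaf nodes cannot forge tiles. The placement scheme
is rendezvous (highest-random-weight) hashing, which is one possible way
to get a fixed redundancy level, not something already chosen for OAM.
All names are hypothetical.

```python
import hashlib
import hmac


def sign_tile(secret, tile_key, data):
    """Return a hex signature binding the tile bytes to its key (HMAC stand-in)."""
    message = tile_key.encode() + b"\0" + data
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_tile(secret, tile_key, data, signature):
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign_tile(secret, tile_key, data), signature)


def replicas_for(tile_key, nodes, redundancy):
    """Pick `redundancy` leaf nodes for a tile via rendezvous hashing.

    Each node gets a pseudo-random score for this tile; the top scorers
    hold the replicas. Removing a node that does not hold the tile
    leaves the tile's replica set unchanged.
    """
    def score(node):
        return hashlib.sha256(f"{node}|{tile_key}".encode()).digest()

    return sorted(nodes, key=score, reverse=True)[:redundancy]
```

Any client or leaf node can compute replicas_for() locally, so finding a
tile needs no central lookup, which helps keep request latency low.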
Low-latency delivery
--------------------
The key consideration for consumption of OAM tiles is that Internet map
users expect imagery to load on the client side within milliseconds,
which is partly why OAM will need to cache tiles delivered from
third-party sources. Any kind of cache storage that depends on delivering
tiles from pure P2P networks or higher-latency institutional SANs is not
going to work for our purposes.
----
I think that's it. I hope, if you've gotten this far, that you'll treat
this email as a call to action. I look forward to hearing discussion of
the technical and community organizing details, and I hope to see more
participants commit to making this concept a reality. Thanks!
SDE