[OSGeo-Discuss] idea for an OSGeo project -- a new, open data format

Frank Warmerdam warmerdam at pobox.com
Tue Nov 13 07:42:10 PST 2007


P Kishor wrote:
> So, I am thinking, Shapefile is the de facto data standard for GIS
> data. Because it is open (albeit not Free), and given the deep and wide
> presence of ESRI's products from the beginning of the epoch, it has
> been widely adopted. Existence of shapelib, various language bindings,
> and ready use by products such as MapServer has continued to cement
> Shapefile as the format to use. All this is in spite of Shapefile's
> inherent drawbacks, particularly in the area of attribute data
> management.
> 
> What if we came up with a new and improved data format -- call it
> "Open Shapefile" (extension .osh) -- that would be completely Free,
> single-file based (instead of the multiple .shp, .dbf, .shx, etc.),
> and based on SQLite, giving the .osh format complete relational data
> handling capabilities. We would require a new version of Shapelib,
> improved language bindings, make it the default and preferred format
> for MapServer, and provide seamless and painless import of regular
> .shp data into .osh for native rendering. Its adoption would be quick
> in the open source community. The non-opensource community would
> not give a rat's behind for it, but it wouldn't affect them...
> they would still work with their preferred .shp until they learned
> better. By having a completely open and Free single-file based, built
> on SQLite, fully relational dbms capable spatial data format, it would
> be positioned for continued improvement and development.

Puneet,

I've had a similar idea kicking around in my head for a while, but I think
of it as "open geodatabase".  I see it filling a role similar to that of the
"personal geodatabase", with goals including:

  o RDBMS-style operations like SQL filtering, joins, etc.
  o Get past all the shapefile limitations related to the .dbf format (very
    restricted data types, short attribute names, lots of other limits)
  o Allow storing many layers in one file.
  o Built-in spatial indexing and attribute indexing.
  o OGC-style coordinate system and geometry support.
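
To make a few of these goals concrete, here is a rough sketch using Python's
standard sqlite3 module.  Everything in it is an assumption made up for
illustration, not a proposed spec: the table names (parcels, owners), the
column names, storing geometry as WKT text, and using plain bounding-box
columns with an ordinary index as a stand-in for a real spatial index.

```python
import sqlite3


def build_demo(conn):
    """Create a hypothetical single-file layout: one layer plus one attribute table."""
    conn.executescript("""
        CREATE TABLE owners (
            id        INTEGER PRIMARY KEY,
            full_name TEXT
        );
        -- Hypothetical layer table; geometry kept as WKT text for simplicity.
        CREATE TABLE parcels (
            id                      INTEGER PRIMARY KEY,
            owner_id                INTEGER,
            land_use_classification TEXT,   -- longer than a .dbf field name allows
            geom_wkt                TEXT,
            minx REAL, miny REAL, maxx REAL, maxy REAL
        );
        -- Attribute index, and a crude bounding-box index (not a true spatial index).
        CREATE INDEX parcels_land_use ON parcels (land_use_classification);
        CREATE INDEX parcels_bbox ON parcels (minx, maxx);
    """)
    conn.executemany(
        "INSERT INTO owners VALUES (?, ?)",
        [(1, "A. Smith"), (2, "B. Jones")],
    )
    conn.executemany(
        "INSERT INTO parcels VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 1, "residential", "POLYGON((0 0,1 0,1 1,0 1,0 0))", 0, 0, 1, 1),
            (2, 2, "commercial", "POLYGON((5 5,6 5,6 6,5 6,5 5))", 5, 5, 6, 6),
            (3, 1, "residential",
             "POLYGON((10 10,11 10,11 11,10 11,10 10))", 10, 10, 11, 11),
        ],
    )


def residential_in_window(conn, minx, miny, maxx, maxy):
    """SQL filter + join + bounding-box overlap test against a query window."""
    return conn.execute(
        """
        SELECT p.id, o.full_name
        FROM parcels AS p
        JOIN owners AS o ON o.id = p.owner_id
        WHERE p.land_use_classification = ?
          AND p.maxx >= ? AND p.minx <= ?
          AND p.maxy >= ? AND p.miny <= ?
        ORDER BY p.id
        """,
        ("residential", minx, maxx, miny, maxy),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # a file path would give the single-file case
    build_demo(conn)
    print(residential_in_window(conn, -1, -1, 2, 2))
```

Additional layers would simply be additional tables in the same file.  The
bounding-box test only narrows down candidates; an exact geometry test would
still have to happen in the layer above the database.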

I have had some hope that the existing SDF format supported by FDO would
be this new format; however, SDF is quite a complicated format, and the
only available open source implementation is quite heavily tied to FDO.
Once you carry along FDO the whole thing becomes fairly heavy in terms of
the amount of code required, and the interface complexity.  But (I think)
it satisfies most of my goals and already exists.

I do feel that we need to be cautious before launching "yet another format".
I'm also a bit dubious about some aspects of SQLite as a native data store.
In particular, its typeless "everything is a string" approach strikes me
as potentially being a problem.  It also remains to be seen whether we could
build fast spatial indexing directly in, though I suppose with a fat enough
middleware layer it could be done.
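
The typing concern can be checked directly.  The sketch below, again using
Python's standard sqlite3 module, inserts mixed values into a column declared
INTEGER and asks SQLite what it actually stored via the built-in typeof()
function.  In SQLite 3 each value carries its own type and the declared
column type acts as a conversion preference rather than a constraint, so
non-numeric text is accepted as-is.  The helper names (stored_types,
insert_strict) and the CHECK-based enforcement are assumptions of mine, shown
only as one way a format layer might tighten things up.

```python
import sqlite3


def stored_types(values):
    """Insert values into a column declared INTEGER; report what SQLite stored."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (n INTEGER)")
    conn.executemany("INSERT INTO t (n) VALUES (?)", [(v,) for v in values])
    rows = conn.execute("SELECT n, typeof(n) FROM t ORDER BY rowid").fetchall()
    conn.close()
    return rows


def insert_strict(value):
    """Hypothetical enforcement: a CHECK constraint rejects non-integer values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (n INTEGER CHECK (typeof(n) = 'integer'))")
    try:
        conn.execute("INSERT INTO t (n) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    for value, kind in stored_types([42, "17", "abc", 2.5]):
        print(repr(value), kind)
    print(insert_strict(42), insert_strict("abc"))
```

Numeric-looking text such as "17" is converted to an integer on the way in,
while "abc" is stored as text in the same column unless something like the
CHECK constraint above forbids it.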

PS. I'm still doubtful it would be faster than shapefiles+qix for most web
mapping needs.

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org



