[OSGeo-Discuss] Re: idea for an OSGeo project -- a new, open data format

P Kishor punk.kish at gmail.com
Tue Nov 13 11:51:44 EST 2007


One more thing -- the new format should be the default format for a
very popular project like MapServer, yet, MapServer should be able to
import the normal Shapefile format seamlessly. This feature would be
crucial for rapid dissemination and adoption of such a format.
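
To give a rough idea of what the attribute side of such an import
could look like, here is a minimal sketch in Python using only the
standard library. It reads the plain dBASE III layout used by .dbf
files and copies the records into an SQLite table. The helper names
read_dbf and import_dbf are hypothetical (they are not part of
Shapelib or MapServer), every value is kept as text, and the geometry
in .shp/.shx is not handled at all.

```python
import sqlite3
import struct


def read_dbf(data):
    """Parse dBASE III bytes into (field_names, rows); values stay as text."""
    # Bytes 4-11 of the header: record count, header length, record length.
    n_records, header_len, record_len = struct.unpack("<IHH", data[4:12])
    fields = []
    pos = 32  # field descriptors start right after the 32-byte header
    while data[pos] != 0x0D:  # 0x0D terminates the descriptor array
        name = data[pos:pos + 11].split(b"\x00", 1)[0].decode("ascii")
        length = data[pos + 16]  # field width in bytes
        fields.append((name, length))
        pos += 32
    rows = []
    for i in range(n_records):
        start = header_len + i * record_len
        rec = data[start:start + record_len]
        if rec[0:1] == b"*":  # first byte flags a deleted record
            continue
        offset, values = 1, []
        for _, length in fields:
            values.append(rec[offset:offset + length].decode("latin-1").strip())
            offset += length
        rows.append(values)
    return [name for name, _ in fields], rows


def import_dbf(conn, table, data):
    """Create an untyped table named `table` and load the DBF rows into it."""
    names, rows = read_dbf(data)
    cols = ", ".join(f'"{n}"' for n in names)
    marks = ", ".join("?" for _ in names)
    with conn:  # one transaction for the whole import
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
    return len(rows)
```

Calling import_dbf(sqlite3.connect("layer.osh"), "attrs",
open("layer.dbf", "rb").read()) would be the whole import step for
the attributes; the file names here are placeholders.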

On 11/13/07, P Kishor <punk.kish at gmail.com> wrote:
> Thanks everyone, for responding. Here is my "groundwork."
>
> The new format --
>
> - Should be fast. SQLite is plenty fast, and anything that simply
> "extends" the Shapefile format to inject relational capabilities
> should be pretty fast. It should definitely be faster than a
> geodatabase format (such as PostGIS/ArcSDE) and perhaps even faster
> than Shapefiles especially while accessing attribute data. DBF is
> sequential, and searching for textual information is particularly
> expensive. SQLite has been tuned to excellence. I have been working
> with it for a few years now, and it really is an amazing product,
> with an equally good development community, support, and
> capabilities. That it is in the public domain makes for a
> transfat-free icing on the cake.
>
> - Should be unencumbered by licenses and copyrights. Ideally, the new
> format could also be put back into public domain. We want to remove
> all encumbrances to encourage rapid and wide adoption.
>
> - Should be a single file. Well, some like multiple files and some
> like single files. We can achieve both objectives by using a
> tar-gzipped packaging such as Apple tends to use for much of its stuff
> (for example, its Pages word processor uses a tgzipped XML file along
> with other resources for icons and pictures and stuff). Or, if speed
> is going to be affected because of gzipping and gunzipping, just a
> package format (I have no idea if this is a Unix thing or a Mac OS
> thing -- we, in the Mac world, call them packages... they appear like
> files in the Finder, and like directories in the shell).
>
> - Should be easy to transition to. By building the new format on the
> structure of the Shapefile format, and *in fact*, calling it "open
> shapefiles" or some such thing, we indicate from its name that the
> transition is not that revolutionary but is evolutionary. This,
> hopefully, will bring some name-familiarity, and make the transition
> less scary.
>
> - Frank mentions SQLite's lack of datatypes as an issue -- I guess
> that is a matter of preference. I personally quite like that freedom
> as it gives me, the application developer, complete control over what
> goes where. SQLite actually does now have a few datatypes that it
> respects, but it doesn't complain when a value doesn't match. Since
> all users will be accessing the data via an application, as long as
> the application is well defined, it should be fine.
>
> - SQLite excels at the one thing it has been entrusted to do --
> retrieve the data it holds at extremely high speed, and maintain
> ACID data integrity in case of a programmatic catastrophe. The
> transactions themselves are worth the price of admission, which,
> happily, happens to be zero.
>
> - Langdon mentions Java support -- well, yes, use/work on SQLite JDBC.
> I have been using it for a few days now and find it to be a pretty
> competent conduit. Extend it, spatialize it. ANSI standard C is still
> that magic common denominator that compiles and works predictably on
> the largest number of systems. I have a lot against Java, but those
> who love Java should definitely work on tools for accessing and
> working with this new format, as that would only make the format more
> widely used and adopted.
>
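
To make the points above about attribute search, loose typing, and
transactions a bit more concrete, here is a minimal sketch using
Python's built-in sqlite3 module. The table, column names, and values
are made up for illustration; nothing here is a proposed schema.

```python
import sqlite3

# Hypothetical attribute table standing in for a .dbf file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attrs (fid INTEGER PRIMARY KEY, name TEXT, pop)")
conn.executemany(
    "INSERT INTO attrs (fid, name, pop) VALUES (?, ?, ?)",
    [(1, "Alpha", 120), (2, "Beta", "unknown"), (3, "Gamma", 75)],
)
conn.execute("CREATE INDEX attrs_name ON attrs (name)")
conn.commit()

# Text search: the planner can use the index instead of reading every row.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT fid FROM attrs WHERE name = ?", ("Gamma",)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

# Loose typing: the untyped 'pop' column holds integers and a string.
types = [r[0] for r in conn.execute("SELECT typeof(pop) FROM attrs ORDER BY fid")]

# Transactions: a batch that fails halfway is rolled back as a unit.
try:
    with conn:  # commits on success, rolls back on an exception
        conn.execute("INSERT INTO attrs (fid, name) VALUES (4, 'Delta')")
        conn.execute("INSERT INTO attrs (fid, name) VALUES (1, 'duplicate')")
except sqlite3.IntegrityError:
    pass
count = conn.execute("SELECT COUNT(*) FROM attrs").fetchone()[0]
```

The second insert violates the primary key, so the first insert of the
same batch is undone as well and the table keeps its original rows.
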
> Ok, enough for now.
>
>
>
> On Nov 13, 2007 8:52 AM, P Kishor <punk.kish at gmail.com> wrote:
> > So, I am thinking, Shapefile is the de facto data standard for GIS
> > data. Because it is open (albeit not Free), and because of the deep
> > and wide presence of ESRI's products from the beginning of the
> > epoch, it has been widely adopted. The existence of shapelib,
> > various language bindings, and ready use by products such as
> > MapServer has continued to cement Shapefile as the format to use.
> > All this is in spite of Shapefile's inherent drawbacks,
> > particularly in the area of attribute data management.
> >
> > What if we came up with a new and improved data format -- call it
> > "Open Shapefile" (extension .osh) -- that would be completely Free,
> > single-file based (instead of the multiple .shp, .dbf, .shx, etc.),
> > and based on SQLite, giving the .osh format complete relational data
> > handling capabilities? We would need to write a new version of
> > Shapelib, improve the language bindings, make .osh the default and
> > preferred format for MapServer, and provide seamless and painless
> > import of regular .shp data into .osh for native rendering. Its
> > adoption would be quick in the open source community. The
> > non-opensource community would probably not give a rat's behind for
> > it, but it wouldn't affect them... they would still work with their
> > preferred .shp until they learned better. By being a completely open
> > and Free, single-file, SQLite-based spatial data format with full
> > relational DBMS capabilities, it would be positioned for continued
> > improvement and development.
> >
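
As a very rough sketch of the single-file idea, the geometry and the
attributes could sit side by side as two tables in one SQLite file.
The table names, the use of Well-Known Binary for the geometry blob,
and the sample coordinates below are all assumptions made for
illustration, not a proposed .osh layout.

```python
import sqlite3
import struct


def point_wkb(x, y):
    """Encode a 2D point as little-endian Well-Known Binary (21 bytes)."""
    # 1 = little-endian byte order, 1 = WKB geometry type "Point".
    return struct.pack("<BIdd", 1, 1, x, y)


def create_layer(path):
    """Open (or create) a single-file layer with geometry and attributes."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS features (
            fid  INTEGER PRIMARY KEY,
            geom BLOB NOT NULL
        );
        CREATE TABLE IF NOT EXISTS attributes (
            fid  INTEGER PRIMARY KEY REFERENCES features(fid),
            name TEXT
        );
    """)
    return conn


conn = create_layer(":memory:")  # a real layer would use a file path
with conn:
    conn.execute("INSERT INTO features VALUES (1, ?)", (point_wkb(-89.4, 43.07),))
    conn.execute("INSERT INTO attributes VALUES (1, 'sample point')")

# One relational query returns the attribute and its geometry together.
name, geom = conn.execute(
    "SELECT a.name, f.geom FROM features f JOIN attributes a USING (fid)"
).fetchone()
byte_order, geom_type, x, y = struct.unpack("<BIdd", geom)
```

Everything a client needs then travels in one file, and the join
replaces the record-number bookkeeping between .shp and .dbf.
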
> > Is this too crazy?
> >
> > --
> > Puneet Kishor
> >
>


-- 
Puneet Kishor
http://punkish.eidesis.org/
Nelson Institute for Environmental Studies
http://www.nelson.wisc.edu/
Open Source Geospatial Foundation (OSGeo)
http://www.osgeo.org/
Summer 2007 S&T Policy Fellow, The National Academies
http://www.nas.edu/

