I have created a (currently empty) page on the OSGeo wiki for recording the concrete details that come out of this discussion: <a href="http://wiki.osgeo.org/index.php/Geodata_formats">http://wiki.osgeo.org/index.php/Geodata_formats
</a>. Please use it for your wishlists for a new open data format, lists of existing data formats with links to their specifications, and so on. Please also join the Geodata mailing list (<a href="http://www.osgeo.org/geodata">
http://www.osgeo.org/geodata</a>) and continue the debate about a new format on that list, as I believe it is a more appropriate venue.<br><br>David<br><br><div class="gmail_quote">On Nov 13, 2007 12:55 PM, P Kishor <
<a href="mailto:punk.kish@gmail.com">punk.kish@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">David,<br><div class="Ih2E3d">
<br><br>On 11/13/07, David William Bitner <<a href="mailto:david.bitner@gmail.com">david.bitner@gmail.com</a>> wrote:<br>> Part of the mission of the OSGeo Geodata committee<br>> (<a href="http://www.osgeo.org/geodata" target="_blank">
http://www.osgeo.org/geodata</a>) is to "promote the use of open geospatial<br>> formats". If there is a group that wants to continue pursuing the creation<br>> of a new open geodata format, I would like to encourage the use of the
<br>> geodata mailing list. That being said, I think part of the discussion that<br>> needs to be had is whether or not OSGeo should be creating standards in the<br>> first place.<br>><br>> A couple comments that I have on some of the discussion that has taken place
<br>> in this thread:<br>><br>> Regarding the suggestion that MapServer takes on this new format as the<br>> primary format: I think this is way beyond the scope of what OSGeo should<br>> be doing. Even if we spec a new standard, we (OSGeo) have no teeth to be
<br>> able to make any of our projects do any kind of implementation of that<br>> standard. The choice of formats that are used by any of our projects is<br>> driven by the needs of the users and developers and the resources (time,
<br>> money) that have been dedicated towards implementing them. If someone takes<br>> OpenShape or whatever and decides they have a business need that they can<br>> spend the time or money to get it implemented then it will be implemented.
<br>> Shapefile has been and will continue to be an important format for many projects<br>> as it is one of the most widely distributed formats in the GIS world, if not the most.<br>><br><br></div>I respectfully disagree. I think OSGeo has plenty of teeth for those who
<br>want to believe in it. In the end, yes, just like any real project, it<br>needs a core of committed developers and plenty of time (or money --<br>usually they are synonymous). This is not something that can happen<br>overnight, but if the idea is good, it deserves a start and support. That the
<br>long, long-term effects of a solid, relational, transactional, geodata<br>format would be very good is a reasonable assumption for me.<br><div class="Ih2E3d"><br>> Regarding the comments on standards wanking: Standards can get in the way
<br>> of progress along a straight line, but they can also encourage<br>> interoperability that can create better progress for everyone. To get a<br>> singular task done, standards often can slow things down, but there *are*
<br>> gains to be had from playing well with everyone else.<br><br></div>Here I totally agree. I am not sure how to interpret the "standards<br>wanking" statement. On the one hand it is a reasonably accurate<br>
assessment of a lot of public hand-wringing and open alliances (for a<br>really funny take on this, read Fake Steve's tirade on the open<br>handset alliance at<br><<a href="http://fakesteve.blogspot.com/2007/11/its-not-phone-its-alliance.html" target="_blank">
http://fakesteve.blogspot.com/2007/11/its-not-phone-its-alliance.html</a>>).<br>But, on the other hand, it is a pretty damning judgment on any attempt<br>to do things via collaboration, and thus, on OSGeo and such efforts
<br>itself.<br><br>My take is that if I can't do it alone, I will lay it out in the open<br>hoping someone better than me will work on it as well. If I can do it<br>alone, I will do it until I think it is ready to benefit from extra
<br>eyeballs. Sometimes getting started is the biggest hurdle.<br><div><div></div><div class="Wj3C7c"><br><br>><br>> David Bitner<br>> OSGeo, Public Geospatial Data Project Chair<br>><br>> On Nov 13, 2007 11:40 AM, Allan Doyle <
<a href="mailto:afdoyle@mit.edu">afdoyle@mit.edu</a>> wrote:<br>> ><br>> ><br>> > On Nov 13, 2007, at 12:24 , Steve Coast wrote:<br>> ><br>> > > OSM: $0<br>> > > CCBYSA: $0<br>
> > > Donation of entire Netherlands: Priceless<br>> > ><br>> > > Real artists ship. For everyone else there's standards wanking.<br>> ><br>> > Perhaps there's an art to wanking standards as well.
<br>> ><br>> ><br>> ><br>> ><br>> > ><br>> > ><br>> > ><br>> > ><br>> > > Seriously though, this is so kafka-esque. When OSM started it was<br>> > > like this: We should have got a committee to design a standard, then
<br>> > > we could think about a committee to design an ontology... and choose<br>> > > a name... and on some sunny distant day make a map.<br>> > ><br>> > ><br>> > ><br>> > > On 13 Nov 2007, at 17:09, P Kishor wrote:
<br>> > ><br>> > >> On 11/13/07, Landon Blake <<a href="mailto:lblake@ksninc.com">lblake@ksninc.com</a>> wrote:<br>> > >>> Puneet,<br>> > >>><br>> > >>> You wrote: "Should be easy to transition to. By building the new
<br>> > >>> format<br>> > >>> on the<br>> > >>> structure of the Shapefile format, and *in fact*, calling it "open<br>> > >>> shapefiles" or some such thing, we indicate from its name that the
<br>> > >>> transition is not that revolutionary but is evolutionary. This,<br>> > >>> hopefully, will bring some name-familiarity, and make the transition<br>> > >>> less scary."
<br>> > >>><br>> > >>> I really think you are going to run into problems using the<br>> > >>> "Shapefile"<br>> > >>> as part of the trademark or name for any product not sold by ESRI. I
<br>> > >>> strongly recommend against this move. Let people adopt the<br>> > >>> implementation of your idea for its merits, not for name recognition<br>> > >>> that comes from another product line.
<br>> > >><br>> > >> Good enough point to keep in mind, but not to get hung up over enough<br>> > >> to entangle us. Suggestions for names of the data format can be a<br>> > >> project in itself. "open spatial data format" or its variations could
<br>> > >> be chosen. Still, point taken.<br>> > >><br>> > >>><br>> > >>> You wrote: "ANSI standard C is still<br>> > >>> that magic common denominator that compiles and works predictably on
<br>> > >>> most number of systems. I have a lot against Java, but those who<br>> > >>> love<br>> > >>> Java should definitely work on tools for accessing and working with<br>> > >>> this new format as it would only make the format more widely used
<br>> > >>> and<br>> > >>> adopted."<br>> > >>><br>> > >>> It sounds to me like you are really describing a tool. File<br>> > >>> formats are
<br>> > >>> written in a binary encoding or text, not in a programming<br>> > >>> language. If<br>> > >>> you are designing a tool you can choose the programming language<br>> > >>> of your
<br>> > >>> choice, but be aware that this will limit the developers that<br>> > >>> adopt the<br>> > >>> tool. This will be the case no matter what language you choose to<br>> > >>> use,
<br>> > >>> whether it is C, Java, or something else.<br>> > >>><br>> > >>> If, in contrast, you are creating a file format, then programming<br>> > >>> languages shouldn't really matter. Binary and text data can be
<br>> > >>> accessed<br>> > >>> by almost all programming languages.<br>> > >>><br>> > >>> I think you need to decide if you want a tool or a data format. It<br>
> > >>> sounds like you are shooting more for a spatial database written<br>> > >>> in the<br>> > >>> C programming language that uses some form of the ESRI Shapefile<br>> > >>> as its
<br>> > >>> underlying data storage mechanism. To me that is a tool or piece of<br>> > >>> software, not a format. But maybe I don't completely understand your<br>> > >>> goal.
<br>> > >>><br>> > >><br>> > >> Well, I am, frankly, confused.<br>> > >><br>> > >> I was quite convinced I wasn't describing a "tool" but was describing
<br>> > >> a "format." Of course, to describe the format, I positioned it on the<br>> > >> "format" (the SQLite-compatible format) used and popularized by a<br>> > >> "tool" (SQLite, the library, which happens to be written in C). In my
<br>> > >> mind, having the data format based on SQLite *format* for its<br>> > >> relational attribute handling was the real winner. In that sense,<br>> > >> perhaps I conflated the format and the tool. I am not well versed in
<br>> > >> these things so I am probably already walking on thin ice, but that<br>> > >> shouldn't stop others.<br>> > >><br>> > >> So, forget that I mentioned C and Java... let's just concentrate on a
<br>> > >> way of laying out data on a disk that is not too dissimilar from how<br>> > >> Shapefile data are laid out, except that we utilize the<br>> > >> SQLite-compatible binary format for relational data handling, so that
<br>> > >> SQLite-enabled spatial tools can access this new format.<br>> > >><br>> > >> And, put this format into public domain.<br>> > >><br>> > >><br>> > >>>
<br>> > >>> -----Original Message-----<br>> > >>> From: <a href="mailto:discuss-bounces@lists.osgeo.org">discuss-bounces@lists.osgeo.org</a><br>> > >>> [mailto:<a href="mailto:discuss-bounces@lists.osgeo.org">
discuss-bounces@lists.osgeo.org</a>] On Behalf Of P Kishor<br>> > >>> Sent: Tuesday, November 13, 2007 8:35 AM<br>> > >>> To: OSGeo Discussions<br>> > >>> Subject: [OSGeo-Discuss] Re: idea for an OSGeo project -- a new,open
<br>> > >>> data format<br>> > >>><br>> > >>> Thanks everyone, for responding. Here is my "groundwork."<br>> > >>><br>> > >>> The new format --
<br>> > >>><br>> > >>> - Should be fast. SQLite is plenty fast, and anything that simply<br>> > >>> "extends" the Shapefile format to inject relational capabilities<br>
> > >>> should be pretty fast. It should definitely be faster than a<br>> > >>> geodatabase format (such as PostGIS/ArcSDE) and perhaps even faster<br>> > >>> than Shapefiles especially while accessing attribute data. DBF is
<br>> > >>> sequential, and searching for textual information is particularly<br>> > >>> expensive. SQLite has been tuned to excellence. I have been working<br>> > >>> with it for a few years now, and it really is an amazing product,
<br>> > >>> development community, support, and capabilities. That it is in<br>> > >>> public<br>> > >>> domain makes for a transfat-free icing on the cake.<br>> > >>>
<br>> > >>> - Should be unencumbered by licenses and copyrights. Ideally, the<br>> > >>> new<br>> > >>> format could also be put back into public domain. We want to remove<br>> > >>> all encumbrances to encourage rapid and wide adoption.
<br>> > >>><br>> > >>> - Should be a single file. Well, some like multiple files and some<br>> > >>> like single files. We can achieve both objectives by using a<br>> > >>> tar-gzipped packaging such as Apple tends to use for much of its
<br>> > >>> stuff<br>> > >>> (for example, its Pages wordprocessor uses a tgzipped xml file along<br>> > >>> with other resources for icons and pictures and stuff). Or, if speed
<br>> > >>> is going to be affected because of gzipping and gunzipping, just a<br>> > >>> package format (I have no idea if this is a Unix thing or a Mac OS<br>> > >>> thing -- we, in the Mac world, call them packages... they appear
<br>> > >>> like<br>> > >>> files in the Finder, and like directories in the shell).<br>> > >>><br>> > >>> - Should be easy to transition to. By building the new format on the
<br>> > >>> structure of the Shapefile format, and *in fact*, calling it "open<br>> > >>> shapefiles" or some such thing, we indicate from its name that the<br>> > >>> transition is not that revolutionary but is evolutionary. This,
<br>> > >>> hopefully, will bring some name-familiarity, and make the transition<br>> > >>> less scary.<br>> > >>><br>> > >>> - Frank mentions SQLite's lack of datatypes as an issue -- I guess
<br>> > >>> that is a matter of preference. I personally quite like that freedom<br>> > >>> as it gives me, the application developer, complete control over<br>> > >>> what<br>
> > >>> goes where. SQLite actually does now have a few datatypes that it<br>> > >>> respects, but it doesn't complain about mismatches. Since all users will be<br>> > >>> accessing the data via an application, as long as the application is
<br>> > >>> well defined, it should be fine.<br>> > >>><br>> > >>> - SQLite excels at one thing that it has been entrusted to do --<br>> > >>> retrieve data that it has been entrusted with at extremely fast
<br>> > >>> speeds, and maintain ACID data integrity in case of a programmatic<br>> > >>> catastrophe. The transactions themselves are worth their price of<br>> > >>> admission, which, happily, happens to be zero.
<br>> > >>><br>> > >>> - Landon mentions Java support -- well, yes, use/work on SQLite<br>> > >>> JDBC.<br>> > >>> I have been using it for a few days now and find it to be a pretty
<br>> > >>> competent conduit. Extend it, spatialize it. ANSI standard C is<br>> > >>> still<br>> > >>> that magic common denominator that compiles and works predictably on<br>
> > >>> most number of systems. I have a lot against Java, but those who<br>> > >>> love<br>> > >>> Java should definitely work on tools for accessing and working with<br>> > >>> this new format as it would only make the format more widely used
<br>> > >>> and<br>> > >>> adopted.<br>> > >>><br>> > >>> Ok, enough for now.<br>> > >>><br>> > >>><br>> > >>><br>> > >>> On Nov 13, 2007 8:52 AM, P Kishor <
<a href="mailto:punk.kish@gmail.com">punk.kish@gmail.com</a>> wrote:<br>> > >>>> So, I am thinking, Shapefile is the de facto data standard for GIS<br>> > >>>> data. Because it is open (albeit not Free), and given the deep and
<br>> > >>>> wide<br>> > >>>> presence of ESRI's products from the beginning of the epoch, it has<br>> > >>>> been widely adopted. The existence of shapelib, various language
<br>> > >>>> bindings,<br>> > >>>> and ready use by products such as MapServer has continued to cement<br>> > >>>> Shapefile as the format to use. All this is in spite of Shapefile's
<br>> > >>>> inherent drawbacks, particularly in the area of attribute data<br>> > >>>> management.<br>> > >>>><br>> > >>>> What if we came up with a new and improved data format -- call it
<br>> > >>>> "Open Shapefile" (extension .osh) -- that would be completely Free,<br>> > >>>> single-file based (instead of the multiple .shp, .dbf, .shx, etc.),<br>> > >>>> and based on SQLite, giving the .osh format complete relational
<br>> > >>>> data<br>> > >>>> handling capabilities. We would require a new version of Shapelib,<br>> > >>>> improved language bindings, make it the default and preferred
<br>> > >>>> format<br>> > >>>> for MapServer, and provide seamless and painless import of regular<br>> > >>>> .shp data into .osh for native rendering. Its adoption would be
<br>> > >>>> quick<br>> > >>>> in the open source community. The non-opensource community would<br>> > >>>> probably not give a rat's behind about it, but it wouldn't affect
<br>> > >>>> them...<br>> > >>>> they would still work with their preferred .shp until they learned<br>> > >>>> better. By having a completely open and Free single-file based,
<br>> > >>>> built<br>> > >>>> on SQLite, fully relational dbms capable spatial data format, it<br>> > >>>> would<br>> > >>>> be positioned for continued improvement and development.
<br>> > >>>><br>> > >>>> Is this too crazy?<br>> > >>>><br>> > >>>> --<br>> > >>>> Puneet Kishor<br>> > >>>><br>> > >>> _______________________________________________
<br>> > >>> Discuss mailing list<br>> > >>> <a href="mailto:Discuss@lists.osgeo.org">Discuss@lists.osgeo.org</a><br>> > >>> <a href="http://lists.osgeo.org/mailman/listinfo/discuss" target="_blank">
http://lists.osgeo.org/mailman/listinfo/discuss</a><br>><br>> > >>><br>> > >>><br>> > >>> Warning:<br>> > >>> Information provided via electronic media is not guaranteed
<br>> > >>> against defects including translation and transmission errors. If<br>> > >>> the reader is not the intended recipient, you are hereby notified<br>> > >>> that any dissemination, distribution or copying of this
<br>> > >>> communication is strictly prohibited. If you have received this<br>> > >>> information in error, please notify the sender immediately.<br>> > >>><br>> > >><br>> > >
<br>> > > have fun,<br>> > ><br>> > > SteveC | <a href="mailto:steve@asklater.com">steve@asklater.com</a> | <a href="http://www.asklater.com/steve/" target="_blank">http://www.asklater.com/steve/
</a><br>> > ><br>> ><br>> > --<br>> > Allan Doyle<br>> > Director of Technology
<br>> > MIT Museum<br>> > +1.617.452.2111<br>> ><br>><br>> --<br>> ************************************<br>> David William Bitner<br></div></div></blockquote></div><br><br clear="all"><br>-- <br>************************************
<br>David William Bitner