[Java-collab] Simple Text Exchange Format
Paul Austin
mail-lists at revolsys.com
Tue May 27 14:22:41 EDT 2008
All,
I saw in one of the other posts there was a discussion of a binary format
to replace shape files for quick random access to data. Someone suggested
using an embedded database such as H2 with a spatial extension. I think
that using a database is a much better way to go for this kind of
access. Otherwise if we come up with our own binary format we'll need to
deal with all the issues such as storage management and indexing that
databases already do for us.
I do however think that we need a simple format for exchange of data.
Exchanging data may be via files or via a web service. GML, in my view, is
very verbose, complex to read and write, and does not include an
embedded schema.
I have been working on a CSV derivative which I'm calling Enhanced-CSV
(ECSV). Basically it's a CSV file where the format is strict about the
placement of commas and the use of double quotes (""). It also has two
header sections. The first section is a list of properties about the
file, such as type name, projection, author and a list of which attribute
headers will follow. The next section is the attribute header (schema).
There can be multiple attribute header rows, such as the name, type,
length, precision and required flag of the attribute. Each header row has
one entry for each data column (attribute). Finally there is the data
section, which is just all your rows of data encoded as CSV. Geometries
are encoded as WKT.
Below is a sample of an ECSV file with the three sections:
{http://ns.ecsv.org/ecsv}typeName,QName,{GFT}GFT_CAPTURE_METHOD_CODE
{http://ns.ecsv.org/ecsv}srid,QName,{http://epsg.org}3005
{http://ns.ecsv.org/ecsv}attributeHeaderTypes,list,"{http://ns.ecsv.org/ecsv}attributeName,{http://ns.ecsv.org/ecsv}attributeType,{http://ns.ecsv.org/ecsv}attributeLength,{http://ns.ecsv.org/ecsv}attributeScale,{http://ns.ecsv.org/ecsv}attributeRequired"
CAPTURE_METHOD_CODE_ID,CODE_VALUE,WHO_CREATED,WHEN_CREATED
integer,string,string,dateTime
3,255,255,2147483647
0,0,0,0
false,false,false,false
1,Photogrammetric,PROXY_GFT,2008-05-26T00:00:00
2,Differential Gps,PROXY_GFT,2008-05-26T00:00:00
3,Tablet Digitizing,PROXY_GFT,2008-05-26T00:00:00
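To illustrate how the three sections might be read, here is a minimal Java sketch. It is not the reader mentioned in this post. The class name EcsvSketch and its methods are hypothetical, and it assumes that the attributeHeaderTypes property is the last line of the property section and that one attribute header row follows for each type it lists. The specification may well define the section boundaries differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of an ECSV reader; not the actual implementation. */
class EcsvSketch {
    static final String HEADER_TYPES =
        "{http://ns.ecsv.org/ecsv}attributeHeaderTypes";

    /** Property name -> raw property value (the declared type is ignored here). */
    final Map<String, String> properties = new LinkedHashMap<>();
    /** Attribute header type -> one value per data column. */
    final Map<String, List<String>> attributeHeaders = new LinkedHashMap<>();
    /** Data rows, one list of raw string values per row. */
    final List<List<String>> rows = new ArrayList<>();

    /** Splits one CSV line; quoted fields may contain commas and "" escapes. */
    static List<String> splitCsv(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (quoted) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"'); // doubled quote inside a quoted field
                    i++;
                } else if (c == '"') {
                    quoted = false;    // closing quote
                } else {
                    field.append(c);
                }
            } else if (c == '"') {
                quoted = true;         // opening quote
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());
        return fields;
    }

    /** Parses the lines of a file into properties, attribute headers and rows. */
    static EcsvSketch parse(List<String> lines) {
        EcsvSketch file = new EcsvSketch();
        List<String> headerTypes = new ArrayList<>();
        int i = 0;
        // Section 1: "name,type,value" properties, ending (by assumption)
        // at the attributeHeaderTypes property.
        while (i < lines.size()) {
            List<String> p = splitCsv(lines.get(i++));
            if (p.size() < 3) {
                throw new IllegalArgumentException("Bad property line " + i);
            }
            file.properties.put(p.get(0), p.get(2));
            if (p.get(0).equals(HEADER_TYPES)) {
                headerTypes = splitCsv(p.get(2)); // the value is itself a CSV list
                break;
            }
        }
        // Section 2: one attribute header row per listed header type.
        for (String type : headerTypes) {
            if (i >= lines.size()) {
                throw new IllegalArgumentException("Missing header row for " + type);
            }
            file.attributeHeaders.put(type, splitCsv(lines.get(i++)));
        }
        // Section 3: every remaining line is a data row.
        while (i < lines.size()) {
            file.rows.add(splitCsv(lines.get(i++)));
        }
        return file;
    }
}
```

Calling EcsvSketch.parse on the lines of the sample above would give three properties, five attribute header rows and three data rows. All values are kept as strings; converting them according to the attributeType row, or parsing WKT geometries, is left out of this sketch.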
I'm working on a specification for this format and hopefully should have
a draft up in the next month or so. I have developed a reader and writer
and a JUMP plug-in which I'll make available when I've finalized the
specification.
Is this something that would interest anyone else?
Paul