[Java-collab] Simple Text Exchange Format

Paul Austin mail-lists at revolsys.com
Tue May 27 14:22:41 EDT 2008


All,

I saw in one of the other posts a discussion of a binary format to 
replace shapefiles for quick random access to data. Someone suggested 
using an embedded database such as H2 with a spatial extension. I think 
that using a database is a much better way to go for this kind of 
access. Otherwise, if we come up with our own binary format, we'll need 
to deal with all the issues, such as storage management and indexing, 
that databases already handle for us.

I do however think that we need a simple format for exchange of data. 
Exchanging data may be via files or via a web service. GML in my view is 
very verbose and complex to read and write and does not include an 
embedded schema.

I have been working on a CSV derivative which I'm calling Enhanced-CSV 
(ECSV). Basically it's a CSV file where the format is strict about the 
placement of commas and the use of double quotes. It also has two header 
sections. The first section is a list of properties about the file, such 
as type name, projection, author and a list of which attribute headers 
will follow. The next section is the attribute header (schema). It can 
contain multiple rows, such as the name, type, length, precision and 
required flag of the attribute. Each row has one entry for each data 
column (attribute). Finally there is the data section, which is just all 
your rows of data encoded as CSV. Geometries are encoded as WKT.

Below is a sample of an ECSV file with the three sections.

{http://ns.ecsv.org/ecsv}typeName,QName,{GFT}GFT_CAPTURE_METHOD_CODE
{http://ns.ecsv.org/ecsv}srid,QName,{http://epsg.org}3005
{http://ns.ecsv.org/ecsv}attributeHeaderTypes,list,"{http://ns.ecsv.org/ecsv}attributeName,{http://ns.ecsv.org/ecsv}attributeType,{http://ns.ecsv.org/ecsv}attributeLength,{http://ns.ecsv.org/ecsv}attributeScale,{http://ns.ecsv.org/ecsv}attributeRequired"

CAPTURE_METHOD_CODE_ID,CODE_VALUE,WHO_CREATED,WHEN_CREATED
integer,string,string,dateTime
3,255,255,2147483647
0,0,0,0
false,false,false,false

1,Photogrammetric,PROXY_GFT,2008-05-26T00:00:00
2,Differential Gps,PROXY_GFT,2008-05-26T00:00:00
3,Tablet Digitizing,PROXY_GFT,2008-05-26T00:00:00
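
To make the layout concrete, here is a minimal sketch of how the sample 
above could be read. It is not the reader mentioned below; it is an 
illustration in Python that assumes the three sections are separated by 
blank lines, that every property row has exactly three fields (name, 
type, value), and that no quoted field contains a line break. The names 
parse_ecsv, ECSV_NS and SAMPLE are made up for this example.

```python
import csv

# Namespace prefix used by the property names in the sample (from the post).
ECSV_NS = "{http://ns.ecsv.org/ecsv}"

SAMPLE = """\
{http://ns.ecsv.org/ecsv}typeName,QName,{GFT}GFT_CAPTURE_METHOD_CODE
{http://ns.ecsv.org/ecsv}srid,QName,{http://epsg.org}3005
{http://ns.ecsv.org/ecsv}attributeHeaderTypes,list,"{http://ns.ecsv.org/ecsv}attributeName,{http://ns.ecsv.org/ecsv}attributeType,{http://ns.ecsv.org/ecsv}attributeLength,{http://ns.ecsv.org/ecsv}attributeScale,{http://ns.ecsv.org/ecsv}attributeRequired"

CAPTURE_METHOD_CODE_ID,CODE_VALUE,WHO_CREATED,WHEN_CREATED
integer,string,string,dateTime
3,255,255,2147483647
0,0,0,0
false,false,false,false

1,Photogrammetric,PROXY_GFT,2008-05-26T00:00:00
2,Differential Gps,PROXY_GFT,2008-05-26T00:00:00
3,Tablet Digitizing,PROXY_GFT,2008-05-26T00:00:00
"""


def local_name(qname):
    """Return the part of a '{namespace}name' string after the closing brace."""
    return qname.rsplit("}", 1)[-1]


def parse_ecsv(text):
    """Parse ECSV text into (properties, attributes, records).

    Assumption: sections are separated by blank lines and no quoted
    field spans more than one line.
    """
    # Group non-blank lines into sections.
    sections, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            sections.append(current)
            current = []
    if current:
        sections.append(current)
    if len(sections) != 3:
        raise ValueError("expected 3 sections, found %d" % len(sections))

    prop_rows, header_rows, data_rows = (list(csv.reader(s)) for s in sections)

    # File properties: name -> (type, value).
    properties = {name: (type_, value) for name, type_, value in prop_rows}

    # The attributeHeaderTypes property says what each header row means.
    header_types = properties[ECSV_NS + "attributeHeaderTypes"][1].split(",")
    if len(header_types) != len(header_rows):
        raise ValueError("attribute header rows do not match attributeHeaderTypes")
    keys = [local_name(t) for t in header_types]

    # Transpose the header rows so there is one dict per data column.
    attributes = [dict(zip(keys, column)) for column in zip(*header_rows)]

    # Data rows keyed by attribute name; values are left as strings.
    names = [a["attributeName"] for a in attributes]
    records = [dict(zip(names, row)) for row in data_rows]
    return properties, attributes, records


properties, attributes, records = parse_ecsv(SAMPLE)
print(properties[ECSV_NS + "srid"])
print(attributes[0])
print(records[1]["CODE_VALUE"])
```

The sketch leaves every value as a string. A real reader would presumably 
use the attribute type row to convert values (integer, dateTime and so 
on) and would parse WKT geometries with a geometry library.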


I'm working on a specification for this format and hopefully should have 
a draft up in the next month or so. I have developed a reader and writer 
and a JUMP plug-in which I'll make available when I've finalized the 
specification.

Is this something that would interest anyone else?

Paul

