[gdal-dev] OGR Field Types?

Even Rouault even.rouault at spatialys.com
Fri Jun 5 00:50:28 PDT 2015


Hi Stefan,

> 
> We've finished the GeoCSV spec. and we're almost ready to publish the
> Editable GeoCSV plugin for QGIS.
> 
> So, I have the following enhancement requests for the OGR CSV reader,
> regarding CSVT:
> 1. Accept "WKT" (case insensitive) indicating a WKT geometry field.
> 2. Accept "CoordX" and "CoordY" (in any order, case insensitive)
> indicating the easting/northing of a point geometry.
> 3. While opening the CSV file, look for a .prj file giving the coordinate
> reference system/CRS (same CRS format as Shapefiles).
Hmm, so ESRI WKT? I would rather have recommended the OGC WKT format (the one 
natively spoken by GDAL, based on OGC 01-009 
http://www.opengeospatial.org/standards/ct), which has the advantage of 
including EPSG codes explicitly. If you want to be up to date, there's also 
the WKT 2 / ISO TC 211 format ( 
http://docs.opengeospatial.org/is/12-063r5/12-063r5.html ), but GDAL doesn't 
handle it yet.
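
To make the EPSG point concrete, here is a minimal Python sketch (standard
library only) that compares an OGC WKT definition of WGS 84 with the
ESRI-style WKT usually found in a Shapefile .prj. The helper names
epsg_codes and top_level_epsg are made up for this illustration, and a
regular expression is only a stand-in: real code would go through the GDAL
spatial reference parser instead.

```python
import re

# OGC WKT (OGC 01-009 style) for WGS 84: every node carries an AUTHORITY.
OGC_WKT = (
    'GEOGCS["WGS 84",'
    'DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],'
    'AUTHORITY["EPSG","6326"]],'
    'PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],'
    'UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],'
    'AUTHORITY["EPSG","4326"]]'
)

# ESRI-style WKT as typically written to a .prj: no AUTHORITY nodes at all.
ESRI_WKT = (
    'GEOGCS["GCS_WGS_1984",'
    'DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],'
    'PRIMEM["Greenwich",0.0],'
    'UNIT["Degree",0.0174532925199433]]'
)

AUTHORITY_RE = re.compile(r'AUTHORITY\["EPSG","(\d+)"\]')


def epsg_codes(wkt):
    """Return every EPSG code found in AUTHORITY nodes, in document order."""
    return [int(code) for code in AUTHORITY_RE.findall(wkt)]


def top_level_epsg(wkt):
    """Return the EPSG code of the AUTHORITY node that closes the whole
    definition, or None when the WKT has no such node (hypothetical helper)."""
    match = re.search(r'AUTHORITY\["EPSG","(\d+)"\]\]\s*$', wkt)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(top_level_epsg(OGC_WKT))   # CRS code is stated in the text itself
    print(top_level_epsg(ESRI_WKT))  # nothing to read; the CRS must be guessed
```

With the OGC flavour a reader can pick up the CRS code directly; with the
ESRI flavour it has to be inferred from names and parameters.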

> 
> What do you think? Can I open an enhancement request for this?

Yes, I might have a look at that while doing something related in the CSV 
driver.
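
For the enhancement request, requests 1 and 2 could be sketched roughly as
follows in Python (standard library only). This is not what the CSV driver
does today, only an illustration of the proposal. It assumes the two
coordinate names are "CoordX" and "CoordY", and the function names
parse_csvt and find_geometry_columns are hypothetical. The width/precision
and subtype syntax is the one from the CSV driver documentation quoted
further down in this thread.

```python
import csv
import re

# A CSVT entry is a type name with an optional modifier in parentheses,
# e.g. Integer(5), Real(10.7), Integer(Boolean). Double quotes are optional.
TYPE_RE = re.compile(r'^\s*([A-Za-z0-9]+)\s*(?:\(([^)]*)\))?\s*$')


def parse_csvt(line):
    """Split one .csvt line into (type_name, modifier) pairs.

    The separator is always a comma; modifier is None when absent.
    """
    fields = next(csv.reader([line], skipinitialspace=True))
    parsed = []
    for field in fields:
        match = TYPE_RE.match(field)
        if match is None:
            raise ValueError("unrecognised CSVT entry: %r" % field)
        parsed.append((match.group(1), match.group(2)))
    return parsed


def find_geometry_columns(types):
    """Locate geometry columns by name, case-insensitively.

    Returns the indices of all WKT columns plus the first CoordX and
    CoordY columns. The coordinate pair may appear in either order but
    must be given together.
    """
    found = {"wkt": [], "x": None, "y": None}
    for index, (name, _modifier) in enumerate(types):
        key = name.lower()
        if key == "wkt":
            found["wkt"].append(index)
        elif key == "coordx" and found["x"] is None:
            found["x"] = index
        elif key == "coordy" and found["y"] is None:
            found["y"] = index
    if (found["x"] is None) != (found["y"] is None):
        raise ValueError("CoordX and CoordY must be given together")
    return found


if __name__ == "__main__":
    types = parse_csvt('"Integer(5)","String(15)","coordy","COORDX"')
    print(find_geometry_columns(types))
```

Several WKT columns are collected rather than rejected, since a restriction
to a single WKT column was questioned earlier in the thread.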

Even

> 
> Cheers, S.
> 
> 2015-05-22 1:07 GMT+02:00 Stefan Keller <sfkeller at gmail.com>:
> > 2015-05-22 0:53 GMT+02:00 Even Rouault <even.rouault at spatialys.com>:
> > ...
> > 
> >>> * "Easting","Northing"
> >> 
> >> X,Y or Geometry(X), Geometry(Y) or Point(X),Point(Y) would perhaps be
> >> easier to get. I don't know.
> > 
> > So let's propose Point(X),Point(Y) or PointX,PointY.
> > 
> > -S.
> > 
> > 2015-05-22 0:53 GMT+02:00 Even Rouault <even.rouault at spatialys.com>:
> >> On Friday 22 May 2015 00:33:43, Stefan Keller wrote:
> >>> 2015-05-21 23:34 GMT+02:00 Even Rouault <even.rouault at spatialys.com>:
> >>> ...
> >>> 
> >>> >> 4. "Geometry(Easting)","Geometry(Northing)"
> >>> > 
> >>> > For points only I guess?
> >>> 
> >>> Yes.
> >>> 
> >>> >> 5. "Geometry" -- encoded in WKT; having subtype values WKT
> >>> >> (default), Point, LineString, Polygon.
> >>> > 
> >>> > "WKT" is not really consistent with Point,LineString,Polygon since
> >>> > the latter would be expressed as WKT I guess. So perhaps WKT,
> >>> > WKT(Point), WKT(LineString), WKT(Polygon)?
> >>> 
> >>> Right, agreed. So, let's introduce the following two geometry types:
> >>> * "Easting","Northing"
> >> 
> >> X,Y or Geometry(X), Geometry(Y) or Point(X),Point(Y) would perhaps be
> >> easier to get. I don't know.
> >> 
> >>> * WKT, WKT(Point), WKT(LineString), WKT(Polygon)
> >>> 
> >>> There are additional restrictions to these geometry types:
> >>> * "Easting","Northing" must co-occur and should be neighboring columns
> >>> (in either order).
> >> 
> >> I guess most people would do that, but I don't see a strong rationale to
> >> impose neighboring columns.
> >> 
> >>> * if WKT is given, all rows are expected to contain the same geometry
> >>> type.
> >> 
> >> For generic WKT, all geometry types should be possible (as currently).
> >> Only for WKT(xxxx), geometries should be restricted to the specified
> >> type.
> >> 
> >>> * There's only one geometry column per .csvt, namely either
> >>> <<"Easting","Northing">> or <<WKT>>.
> >> 
> >> Kind of makes sense for Easting,Northing. But for WKT columns I don't
> >> see a reason for such a restriction. Actually there's already a
> >> secret/undocumented/debug mode in the CSV driver to read multiple
> >> geometry columns (I won't reveal it unless I'm tortured, but you can
> >> look at the code). This was mostly useful at the first stages when
> >> developing RFC 41.
> >> 
> >>> OK?
> >> 
> >> Not sure what kind of commitment would be expected if I say "OK", so
> >> I'll abstain ;-)
> >> 
> >>> --S.
> >>> 
> >>> 2015-05-21 23:34 GMT+02:00 Even Rouault <even.rouault at spatialys.com>:
> >>> > On Thursday 21 May 2015 23:17:26, Stefan Keller wrote:
> >>> >> Hi Even
> >>> >> 
> >>> >> I just see some type mod and subtype definitions for .csvt files in
> >>> >> the CSV docs [1] :
> >>> >> <<
> >>> >> In a single line the types for each column have to be listed with
> >>> >> double quotes and be comma separated (e.g., "Integer","String"). It
> >>> >> is also possible to specify explicitly the width and precision of
> >>> >> each column, e.g. "Integer(5)","Real(10.7)","String(15)". The
> >>> >> driver will then use these types as specified for the csv columns.
> >>> >> Starting with GDAL 2.0, subtypes can be passed between parentheses,
> >>> >> such as "Integer(Boolean)", "Integer(Int16)" and "Real(Float32)"
> >>> >> <<
> >>> >> 
> >>> >> Following questions and suggestions about .csvt:
> >>> >> 1. Do type names really have to be in double quotes?
> >>> > 
> >>> > No
> >>> > 
> >>> >> 2. Is the separator always comma or can it also be a semicolon?
> >>> > 
> >>> > Yes, always comma
> >>> > 
> >>> >> 3. What about a Geometry type with subtypes? I suggest to add
> >>> >> 4. "Geometry(Easting)","Geometry(Northing)"
> >>> > 
> >>> > For points only I guess? Well, you can build points with an OGR VRT
> >>> > from 2 CSV columns. I might at some point add an open option
> >>> > to specify the columns for the easting/longitude and
> >>> > northing/latitude.
> >>> > 
> >>> >> 5. "Geometry" -- encoded in WKT; having subtype values WKT
> >>> >> (default), Point, LineString, Polygon.
> >>> > 
> >>> > "WKT" is not really consistent with Point,LineString,Polygon since
> >>> > the latter would be expressed as WKT I guess. So perhaps WKT,
> >>> > WKT(Point), WKT(LineString), WKT(Polygon)?
> >>> > 
> >>> >> What do you think?
> >>> >> This could also make QGIS "Add Delimited Text Layer..." even better.
> >>> >> 
> >>> >> --S.
> >>> >> 
> >>> >> [1] http://www.gdal.org/drv_csv.html
> >>> >> 
> >>> >> 2015-04-29 22:57 GMT+02:00 Stefan Keller <sfkeller at gmail.com>:
> >>> >> > Hi Even,
> >>> >> > 
> >>> >> > Thanks!
> >>> >> > 
> >>> >> > 2015-04-29 20:35 GMT+02:00 Even Rouault <even.rouault at spatialys.com>:
> >>> >> >> Stefan
> >>> >> >> 
> >>> >> >>> Questions:
> >>> >> >>> 1. How is 'binary' encoded? E.g. when defining binary in a CSV
> >>> >> >>> file, how is it encoded? Hex?
> >>> >> >> 
> >>> >> >> There's no support in the CSV driver for binary data
> >>> >> >> 
> >>> >> >>> 2. Can a field in a CSV input file have an IntegerList or a
> >>> >> >>> Binary type?
> >>> >> >> 
> >>> >> >> No. Well, on writing, the IntegerList will be serialized as a
> >>> >> >> string, but it is not recognized as an IntegerList on reading.
> >>> >> >> 
> >>> >> >>> 3. What is the value delimiter in a field of type IntegerList,
> >>> >> >>> Integer64List, RealList, StringList?
> >>> >> >> 
> >>> >> >> The default serialization will be
> >>> >> >> (number_of_elements:val1,val2,...,valn), but currently it is
> >>> >> >> truncated to 80 characters.
> >>> >> >> 
> >>> >> >>> 4. "Boolean, Int16, Float32" are mentioned as subtypes. Are
> >>> >> >>> there more subtypes?
> >>> >> >> 
> >>> >> >> Not currently. See
> >>> >> >> https://trac.osgeo.org/gdal/wiki/rfc50_ogr_field_subtype
> >>> >> >> 
> >>> >> >> Even
> >>> >> >> 
> >>> >> >> --
> >>> >> >> Spatialys - Geospatial professional services
> >>> >> >> http://www.spatialys.com
> >>> > 
> >>> > --
> >>> > Spatialys - Geospatial professional services
> >>> > http://www.spatialys.com
> >> 
> >> --
> >> Spatialys - Geospatial professional services
> >> http://www.spatialys.com

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

