[gdal-dev] Whitespace in WKT

Even Rouault even.rouault at spatialys.com
Tue Jan 8 09:10:35 PST 2019


On mardi 8 janvier 2019 11:38:32 CET Andrew Bell wrote:
> Hi,
> 
> I have some WKT where point X and Y are separated by newline characters
> rather than spaces.  A look at OGRWktReadToken seems to eat spaces and
> tabs, but not newlines or other whitespace.  My reading of the OGC simple
> feature BNF doesn't help much, as AFAICT, the separator between is an
> "implied" space:
> 
> OGC 06-103r4
> 
> <point z> ::= <x> <y> <z>
> 
> I would have expected to see spacing specified something like:
> 
> <point z> ::= <x> <space> <y> <space> <z>
> 
> So I'm confused.  Are only tabs and spaces allowed?  Only a single
> space?  Is this defined somewhere I'm not seeing?

The BNF mentions only <space> " " // unicode "U+0020" (space)
but indeed doesn't use it in a rigorous way.
SQL/MM Part 3 (at least the publicly available draft of it, or the extract at
https://github.com/postgis/postgis/blob/svn-trunk/doc/bnf-wkt.txt ) doesn't 
even mention it...

The general practice in other implementations I've seen (PostGIS, Spatialite), 
on the write side, is to just use a single space to separate the coordinates 
of a tuple. Some implementations might add an extra space between a geometry 
name keyword and the opening parenthesis: "POINT (1 2)" vs "POINT(1 2)". I've 
never seen tabs or newlines in WKT output.
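
To illustrate that write-side convention, here is a minimal Python sketch. 
point_to_wkt() and its space_before_paren flag are hypothetical names made up 
for this example; this is not the code of PostGIS, Spatialite or GDAL, only 
the shape of the output they tend to produce.

```python
def point_to_wkt(x, y, space_before_paren=True):
    """Format a 2D point with exactly one U+0020 between the coordinates.

    Hypothetical helper for illustration only.
    """
    # Some writers emit "POINT (", others "POINT(": both are commonly seen.
    sep = " " if space_before_paren else ""
    # '.15g' keeps up to 15 significant digits and drops trailing zeros.
    return f"POINT{sep}({x:.15g} {y:.15g})"
```

Calling point_to_wkt(1, 2) gives the "POINT (1 2)" form, and passing 
space_before_paren=False gives the "POINT(1 2)" form.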

That said, from a quick test, the PostGIS WKT parser seems to support tabs 
and newlines, as well as repeated occurrences of those separators.

Confirmed by
https://github.com/postgis/postgis/blob/1ba28a8ea39e8be0eabc322c992a315d9c09528e/liblwgeom/lwin_wkt_lex.l#L92

We might be more tolerant on the read side. I can't think right now of any 
potential issues in doing so.
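
As a rough sketch of what "more tolerant" could mean, the Python below skips 
spaces, tabs, carriage returns and newlines (any number of them) before each 
token. This is not the actual OGRWktReadToken code; read_token(), tokenize() 
and the WKT_WHITESPACE set are names and choices assumed for the example.

```python
# Assumption: the separator set a tolerant reader would skip.
WKT_WHITESPACE = " \t\r\n"
WKT_PUNCTUATION = "(),"


def read_token(text, pos=0):
    """Return (token, next_pos); token is "" once the input is exhausted.

    Hypothetical helper, not GDAL's implementation.
    """
    n = len(text)
    # Skip any run of separators, not just a single space.
    while pos < n and text[pos] in WKT_WHITESPACE:
        pos += 1
    if pos >= n:
        return "", pos
    # Parentheses and commas are single-character tokens.
    if text[pos] in WKT_PUNCTUATION:
        return text[pos], pos + 1
    # Otherwise read a keyword or number up to the next separator.
    start = pos
    while (pos < n and text[pos] not in WKT_WHITESPACE
           and text[pos] not in WKT_PUNCTUATION):
        pos += 1
    return text[start:pos], pos


def tokenize(text):
    """Collect all tokens of a WKT string."""
    tokens = []
    pos = 0
    while True:
        token, pos = read_token(text, pos)
        if not token:
            return tokens
        tokens.append(token)
```

With such a reader, "POINT(1 2)" and the same point with its coordinates 
split by a newline or by several tabs yield the same token sequence, so the 
rest of the parser would not need to care which separator was used.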

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

