[gdal-dev] Whitespace in WKT
Even Rouault
even.rouault at spatialys.com
Tue Jan 8 09:10:35 PST 2019
On Tuesday 8 January 2019 11:38:32 CET Andrew Bell wrote:
> Hi,
>
> I have some WKT where point X and Y are separated by newline characters
> rather than spaces. A look at OGRWktReadToken seems to eat spaces and
> tabs, but not newlines or other whitespace. My reading of the OGC simple
> feature BNF doesn't help much, as AFAICT, the separator between is an
> "implied" space:
>
> OGC 06-103r4
>
> <point z> ::= <x> <y> <z>
>
> I would have expected to see spacing specified something like:
>
> <point z> ::= <x> <space> <y> <space> <z>
>
> So I'm confused. Are only tabs and spaces allowed? Only a single
> space? Is this defined somewhere I'm not seeing?
The BNF mentions only <space> " " // unicode "U+0020" (space)
but indeed doesn't use it in a rigorous way.
SQL/MM Part 3 (at least the draft of it publicly found or the extract at
https://github.com/postgis/postgis/blob/svn-trunk/doc/bnf-wkt.txt ) doesn't
even mention it...
The general practice in other implementations I've seen (PostGIS, Spatialite),
on the write side, is to just use a single space to separate the coordinates
of a tuple. Some implementations might add an extra space between a geometry
name keyword and the ( parenthesis: "POINT (1 2)" vs "POINT(1 2)". I've never
seen tabulations or newlines.
That said, from a quick test, the PostGIS WKT parser seems to support
tabulations and newlines, and several occurrences of those separators.
Confirmed by
https://github.com/postgis/postgis/blob/1ba28a8ea39e8be0eabc322c992a315d9c09528e/liblwgeom/lwin_wkt_lex.l#L92
We might be more tolerant on the read side. I can't think right now of any
potential issues in doing so.
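
To illustrate what "more tolerant on the read side" could look like, here is a
minimal sketch in Python. It is not the GDAL code: read_wkt_token and
tokenize_wkt are hypothetical names, and the choice of space, tab, carriage
return and newline as accepted separators is an assumption, not something the
OGC BNF spells out. The idea is simply to skip any run of those characters
before each token.

```python
# Hypothetical sketch, not the actual OGRWktReadToken implementation.
# Assumption: space, tab, CR and LF are all accepted as separators.
WKT_WHITESPACE = " \t\r\n"
DELIMITERS = "(),"


def read_wkt_token(text, pos=0):
    """Return (token, next_pos); token is '' once the input is exhausted."""
    n = len(text)
    # Skip any run of whitespace, not just spaces and tabs.
    while pos < n and text[pos] in WKT_WHITESPACE:
        pos += 1
    if pos >= n:
        return "", pos
    # Parentheses and commas are single-character tokens.
    if text[pos] in DELIMITERS:
        return text[pos], pos + 1
    # Otherwise read up to the next whitespace or delimiter.
    start = pos
    while (pos < n and text[pos] not in WKT_WHITESPACE
           and text[pos] not in DELIMITERS):
        pos += 1
    return text[start:pos], pos


def tokenize_wkt(text):
    """Split a WKT string into a flat list of tokens."""
    tokens = []
    pos = 0
    while True:
        token, pos = read_wkt_token(text, pos)
        if not token:
            return tokens
        tokens.append(token)
```

With this approach, "POINT Z (1\n2\t\t3)" and "POINT Z (1 2 3)" produce the
same token list, so the rest of a parser would not need to know which
separators were used.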
Even
--
Spatialys - Geospatial professional services
http://www.spatialys.com