[gdal-dev] [OSGeo-Standards] CSV spatial data on the web

Stefan Keller sfkeller at gmail.com
Wed Feb 17 04:53:44 PST 2016


Jeremy,

2016-02-17 11:48 GMT+01:00 Jeremy Palmer <JPalmer at linz.govt.nz>:
>> Allowing non-printable control-characters in strings makes it very
>> complicated and disables mainly all existing CSV software.
>
> Yes but Excel handles it. So does OGR and QGIS Delimited Text Provider.
> From memory the Microsoft ODBC CSV driver doesn’t - but that’s got lots of issues.

No, it doesn't handle line breaks as one record - nor is this allowed
in any of the CSV standardization approaches I know.
Actually, Excel is very bad at reading and exchanging CSV.
You probably meant Excel's .xls/.xlsx?
CSV is a line-oriented, human-readable text format.
A text field is not "formatted": technically speaking, it's an array of characters.
There is no way to copy&paste "anything" into a pure string including
non-printable chars -
unless it's either encoded/binary or has a markup (and both would need
pre- and postformatting).
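
To make the two positions concrete, here is a minimal sketch in Python. It only
shows how Python's standard csv module behaves, not Excel, OGR or QGIS. The
sample data and the backslash-based markup (escape_breaks / unescape_breaks)
are hypothetical conventions made up for illustration, not part of any CSV
specification.

```python
import csv
import io

# One header row plus one record whose "note" field contains a line break.
rows = [["id", "note"], ["1", "first line\nsecond line"]]

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
text = buf.getvalue()

# A strictly line-oriented reader sees three physical lines for two records.
physical_lines = text.splitlines()

# A quoting-aware reader reassembles the quoted field into a single record.
parsed = list(csv.reader(io.StringIO(text)))


def escape_breaks(value):
    """Pre-formatting step (hypothetical markup): keep a field on one line."""
    return (value.replace("\\", "\\\\")
                 .replace("\r", "\\r")
                 .replace("\n", "\\n"))


def unescape_breaks(value):
    """Post-formatting step: turn the markup back into real line breaks."""
    mapping = {"n": "\n", "r": "\r", "\\": "\\"}
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # Unknown escapes are passed through unchanged.
            out.append(mapping.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


escaped = escape_breaks(rows[1][1])
restored = unescape_breaks(escaped)
```

With the markup variant every record stays on one physical line, so
line-oriented tools keep working, but each reader has to run the
post-formatting step to get the real line breaks back.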

:Stefan



2016-02-17 11:48 GMT+01:00 Jeremy Palmer <JPalmer at linz.govt.nz>:
> Hi Stefan,
>
>> On 17/02/2016, at 11:31 PM, Stefan Keller <sfkeller at gmail.com> wrote:
>>
>> Hi Jeremy
>>
>> Semicolon is well supported in software.
>> Tab is poorly supported in some text editors.
>> Comma is heavily used in number values in European countries.
>> What delimiter do you prefer and why?
>>
>
> Comma because it’s a well-known default for most software. I agree semicolon is supported almost as well. I see your European issue.
>
>>> We also advertise the geometry datatype (useful for software quickly
>>> reading the data field metadata), field lengths/widths in the VRT,
>>
>> That's covered by CSVT, isn't it?
>
> The specification wasn’t clear and mentioned it could be supported. Looking at the actual OGR implementation, it seems field widths/lengths are supported (!), but still not the definition of the geometry field and type.
>
>>
>>> and have datasets with legitimate carriage returns within fields.
>>
>> Allowing non-printable control-characters in strings makes it very
>> complicated and disables mainly all existing CSV software.
>
> Yes but Excel handles it. So does OGR and QGIS Delimited Text Provider. From memory the Microsoft ODBC CSV driver doesn’t - but that’s got lots of issues.
>
>> Carriage return is one of the few undisputed end-of-line things in
>> CSV (besides CR, LF, CR/LF differences).
>> Markup to the rescue.
>> What do you prefer?
>>
>
> Markup can work, but then it becomes about the reader being able to turn that markup into real carriage returns again. Mostly, a pre-processing script is required first.
>
> Cheers
> Jeremy

