[gdal-dev] Field type detection by GeoJSON driver in the case of untidy data
Sean Gillies
sean.gillies at gmail.com
Mon Sep 23 10:59:58 PDT 2024
Thanks for the explanation, Even. Makes sense to me.
On Mon, Sep 23, 2024 at 10:35 AM Even Rouault <even.rouault at spatialys.com>
wrote:
> Sean,
>
> Yes, if there's a mix of data types, a String(JSON) field is reported to
> signal that.
>
> The only annoying thing is that, for backward compatibility with the past
> behaviour where we silently homogenized to a string, we didn't go as far as
> actually quoting strings, so this unfortunately isn't fully JSON compliant.
>
> I mean if we have out.geojson with:
>
> {
> "type": "FeatureCollection",
> "features": [
> { "type": "Feature", "properties": { "foo": "str" }, "geometry": null },
> { "type": "Feature", "properties": { "foo": 0 }, "geometry": null },
> { "type": "Feature", "properties": { "foo": ["a", "b"] }, "geometry": null }
> ]
> }
>
> $ ogrinfo -al out.geojson -q
>
> Layer name: out
> OGRFeature(out):0
> foo (String(JSON)) = str
>
> OGRFeature(out):1
> foo (String(JSON)) = 0
>
> OGRFeature(out):2
> foo (String(JSON)) = [ "a", "b" ]
>
> In theory, we should report "str", not just str. A GDAL 4.0 topic... ?
> Just recorded it in
> https://github.com/OSGeo/gdal/issues/8440#issuecomment-2368801316
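
A minimal sketch of how a consumer might cope with the unquoted strings shown above, assuming the field value arrives as the plain text that ogrinfo prints. The helper name decode_json_field is hypothetical and not part of GDAL; it only uses Python's standard json module.

```python
import json


def decode_json_field(raw):
    """Decode text read from a String(JSON) field, tolerating bare strings.

    Hypothetical helper, not a GDAL API.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON: assume it is a string that was written without quotes.
        return raw


for raw in ['str', '0', '[ "a", "b" ]']:
    print(repr(decode_json_field(raw)))
```

A fallback like this cannot tell a string of digits such as "0" from the number 0, since both appear as 0 once the quotes are dropped. That ambiguity is the cost of the backward-compatible behaviour.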
>
> To actually answer your last question, this is a bit more subtle than the
> above. For example, if there's a mix of strings and arrays of strings, we
> report a StringList field. If there's a mix of integer and floating-point
> numbers, we report a Real field (which is OK since JSON has just a
> "numeric" type).
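
The merging rules described above can be sketched in plain Python. This is an illustration of the rules as stated in this message only, not GDAL's actual implementation; the function names, the "Boolean" label, and the handling of nulls are assumptions.

```python
def kind(value):
    """Classify one JSON property value with an OGR-style type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "Boolean"         # label used for illustration only
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Real"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return "StringList"
    return "String(JSON)"


def merge(a, b):
    """Combine the types seen so far for one field."""
    if a == b:
        return a
    if {a, b} == {"Integer", "Real"}:
        return "Real"            # JSON has a single numeric type
    if {a, b} == {"String", "StringList"}:
        return "StringList"
    return "String(JSON)"        # any other mix


def detect(values):
    """Return the merged type for all values of one property."""
    result = None
    for value in values:
        if value is None:
            continue             # assumption: nulls do not affect the type
        k = kind(value)
        result = k if result is None else merge(result, k)
    return result


print(detect(["str", 0, ["a", "b"]]))  # the out.geojson example
print(detect(["a", ["b", "c"]]))
print(detect([1, 2.5]))
```

Under these assumptions the three calls print String(JSON), StringList and Real, matching the cases described in this thread.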
>
> Even
> On 23/09/2024 at 18:17, Sean Gillies via gdal-dev wrote:
>
> Hi all,
>
> The good thing about GeoJSON is that you don't need specialized GIS
> software to create it. The bad thing about GeoJSON is that people create it
> using software with none of the familiar GIS constraints.
>
> I've been looking at a collection of features that have the same set of
> properties (good), but one of the properties has a mix of strings (strings
> of digits, specifically) and unquoted numbers (0, specifically). In
> versions <= 3.5, GDAL detects this field's type to be "String". In versions
> >= 3.6, the field type is "String(JSON)". Is this intended behavior? Will
> all such fields be found to be "String(JSON)", or does it depend on their
> content?
>
> --
> Sean Gillies
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> -- http://www.spatialys.com
> My software is free, but my time generally not.
>
>
--
Sean Gillies