[gdal-dev] Field type detection by GeoJSON driver in the case of untidy data

Even Rouault even.rouault at spatialys.com
Mon Sep 23 09:35:21 PDT 2024


Sean,

yes, if a field contains a mix of data types, it is reported as 
String(JSON) to signal that.

The only annoying thing is that, for backward compatibility with the past 
behaviour where we silently homogenized to a string, we didn't go as far 
as actually quoting strings, so the values aren't fully JSON compliant, 
unfortunately.

I mean if we have out.geojson with:

{
"type": "FeatureCollection",
"features": [
{ "type": "Feature", "properties": { "foo": "str" }, "geometry": null },
{ "type": "Feature", "properties": { "foo": 0 }, "geometry": null },
{ "type": "Feature", "properties": { "foo": ["a", "b"] }, "geometry": null }
]
}

$ ogrinfo -al out.geojson -q

Layer name: out
OGRFeature(out):0
   foo (String(JSON)) = str

OGRFeature(out):1
   foo (String(JSON)) = 0

OGRFeature(out):2
   foo (String(JSON)) = [ "a", "b" ]

In theory, we should report "str", not just str. A GDAL 4.0 topic... ? 
Just recorded it in 
https://github.com/OSGeo/gdal/issues/8440#issuecomment-2368801316
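
Until that changes, client code reading such a field has to cope with 
values that are not always valid JSON. A minimal sketch in Python of one 
way to do it, assuming the raw value is available as a string; the 
helper name decode_json_field is hypothetical and not part of GDAL:

```python
import json


def decode_json_field(raw):
    """Decode the text of a String(JSON) field (hypothetical helper).

    Tries strict JSON first; if that fails, assumes the value is one of
    the unquoted strings described above and returns it unchanged.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


# The three values from the ogrinfo output above
for raw in ["str", "0", '[ "a", "b" ]']:
    print(repr(decode_json_field(raw)))
```

Note the ambiguity this cannot resolve: since strings are not quoted, a 
source string "0" and a source number 0 both come out as 0, and a string 
such as "true" or "null" would be decoded as a JSON literal.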

To actually answer your last question, this is a bit more subtle than 
the above. For example, if there's a mix of strings and arrays of 
strings, we report a StringList field. If there's a mix of integer and 
floating-point numbers, we report a Real field (which is OK since JSON 
has just a "numeric" type).
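
As an illustration only, the rules mentioned in this message can be 
modelled in a few lines of Python. This is a toy model, not the driver's 
actual algorithm: the function names are hypothetical, and booleans, 
nulls, objects and other array types are deliberately left out.

```python
def classify(value):
    """Map one JSON value to a simplified OGR-like type name."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; not covered by this sketch
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Real"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return "StringList"
    raise TypeError(f"value not covered by this sketch: {value!r}")


def infer_field_type(values):
    """Combine per-value types using only the rules from this message."""
    kinds = {classify(v) for v in values}
    if not kinds:
        raise ValueError("no values to inspect")
    if len(kinds) == 1:
        return next(iter(kinds))
    if kinds == {"Integer", "Real"}:
        return "Real"          # JSON does not distinguish the two
    if kinds == {"String", "StringList"}:
        return "StringList"
    return "String(JSON)"      # any other mix


print(infer_field_type(["str", 0, ["a", "b"]]))  # the out.geojson case
print(infer_field_type(["123", 0]))              # strings of digits and 0
print(infer_field_type(["a", ["b", "c"]]))
print(infer_field_type([1, 2.5]))
```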

Even

On 23/09/2024 at 18:17, Sean Gillies via gdal-dev wrote:
> Hi all,
>
> The good thing about GeoJSON is that you don't need specialized GIS 
> software to create it. The bad thing about GeoJSON is that people 
> create it using software with none of the familiar GIS constraints.
>
> I've been looking at a collection of features that have the same set 
> of properties (good), but one of the properties has a mix of strings 
> (strings of digits, specifically) and unquoted numbers (0, 
> specifically). In versions <= 3.5, GDAL detects this field's type to 
> be "String". In versions >= 3.6, the field type is "String(JSON)". Is 
> this intended behavior? Will all such fields be found to be 
> "String(JSON)", or does it depend on their content?
>
> -- 
> Sean Gillies

-- 
http://www.spatialys.com
My software is free, but my time generally not.