[gdal-dev] [EXTERNAL] Re: Large GeoJSONs and aborting file opening

Newcomb, Doug doug_newcomb at fws.gov
Thu Jul 29 07:26:06 PDT 2021


https://github.com/microsoft/USBuildingFootprints
Introduction. Microsoft Maps is releasing country wide open building footprints datasets in United States. This dataset contains 129,591,852 computer generated building footprints derived using our computer vision algorithms on satellite imagery.
Texas is 2.38 GB, but you get a variety of other sizes.

________________________________
From: gdal-dev <gdal-dev-bounces at lists.osgeo.org> on behalf of Mike Flannigan <mflan at mflan.com>
Sent: Thursday, July 29, 2021 9:49 AM
To: gdal-dev at lists.osgeo.org <gdal-dev at lists.osgeo.org>
Subject: [EXTERNAL] Re: [gdal-dev] Large GeoJSONs and aborting file opening






I would like to hear more about large GeoJSON files.
How large are they?

My GeoJSON files contain linear features only.  My
largest one is 50.2 MB with 1,230,000 newlines in it.
The next biggest one is 12 MB with 280,000 newlines.  These
and about 140 other GeoJSON files are open in the same project
and I have no problems.  In fact, I converted from
SHP to GeoJSON 2 years ago because I used to have problems
with SHP linear files.

I use QGIS 3.16.8 on Linux Mint.


Mike


On 7/28/21 2:36 PM, gdal-dev-request at lists.osgeo.org wrote:
> Date: Wed, 28 Jul 2021 12:22:12 -0700
> From: Simon Eves <simon.eves at omnisci.com>
> To: gdal-dev at lists.osgeo.org
> Subject: [gdal-dev] Large GeoJSONs and aborting file opening
> Message-ID:
>       <CAJf0KTRsaskSOsPv8tbA+iTB+TqL_ui5y4n05WGLDw_3guRs4w at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Dear All,
>
> I am aware that some improvements were made in the 2.3 timeframe with
> regard to dealing with large GeoJSON files, although even in 3.2, it's
> still very slow and memory hungry.
>
> Our system allows for aborting imports, but this only works reliably once
> it has actually got to the stage of reading features from the file. With
> the GeoJSON, it just sits in the GDALOpenEx call for ages.
>
> My question, therefore, is whether it might be practical to run the
> GDALOpenEx in a separate thread with a future to return the resulting
> handle, such that it could be monitored and killed if necessary?
>
> Mainly I would be concerned that killing the thread might trash some global
> GDAL state that might then not be recoverable, or that the open relies on
> some TLS for the process thread and therefore might not work properly.
>
> We're going to try it anyway, but opinions welcomed, thanks!
>
> Simon
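
For anyone following along, the "open in a separate thread with a future" idea from Simon's message can be sketched roughly as below. This is a minimal illustration in Python using only the standard library, not a tested GDAL recipe. fake_slow_open is a hypothetical stand-in for the slow GDALOpenEx call, and open_in_background and open_with_deadline are names made up for this example. It only shows the monitoring side: a Python thread cannot be forcibly killed, so on timeout the worker is abandoned rather than stopped, and the sketch says nothing about whether GDAL's global state or TLS survives that.

```python
import threading
import time
from concurrent.futures import Future, TimeoutError as FutureTimeoutError


def open_in_background(open_func, path):
    """Start open_func(path) on a daemon thread; return a Future for the result."""
    future = Future()

    def worker():
        try:
            future.set_result(open_func(path))
        except Exception as exc:
            # Pass open failures on to whoever waits on the future.
            future.set_exception(exc)

    # Daemon thread: it will not keep the process alive if it is abandoned.
    threading.Thread(target=worker, daemon=True).start()
    return future


def open_with_deadline(open_func, path, timeout_s):
    """Return the opened handle, or None if the open did not finish in time."""
    future = open_in_background(open_func, path)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeoutError:
        # The worker is NOT killed here; it keeps running in the background.
        return None


def fake_slow_open(path, delay_s=0.5):
    """Hypothetical stand-in for a slow dataset open (assumption, not GDAL)."""
    time.sleep(delay_s)
    return f"handle:{path}"


if __name__ == "__main__":
    print(open_with_deadline(fake_slow_open, "texas.geojson", timeout_s=2.0))
    print(open_with_deadline(fake_slow_open, "texas.geojson", timeout_s=0.05))
```

Note that an abandoned open may still complete later, so real code would need to close that late handle (for example from a callback added with future.add_done_callback) and would still pay the memory cost until then. If the open really has to be killed, doing it in a child process is the usual alternative, at the price of not being able to hand the handle back to the parent.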


_______________________________________________
gdal-dev mailing list
gdal-dev at lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev