<div dir="ltr">The problematic file in this case is about 30GB, with ~5.9M features: property parcels in Florida, each with a polygon of 5-10 vertices and 57 (!) other columns. Below is the first feature as printed by ogrinfo. The data appears to have originated as a Shapefile, which we have also converted to regular GeoJSON (the 30GB one) and to newline-delimited GeoJSONL/GeoJSONSeq.<div><br></div><div>The Shapefile version imports with no issues or obvious delays, with features flowing almost immediately and the overall process taking about 7 minutes.</div><div><br></div><div>The regular GeoJSON version spends over 20 minutes in the GDALOpenEx call, then another 10 minutes before features flow (I haven't yet looked into where that time goes), after which the import takes about the same 7 minutes.</div><div><br></div><div>The GeoJSONL/GeoJSONSeq version spends about 12 minutes in GDALOpenEx, then the same 10 minutes before features flow, and then the same 7 minutes.</div><div><br></div><div>Note that ogrinfo has the same initial delays (20 minutes and 12 minutes) before it prints anything.</div><div><br></div><div>This is all with GDAL 3.2.2 on Ubuntu 20.04, on a quad-core i7 at 4.2GHz with 32GB of RAM and an SSD.</div><div><br></div><div>The 7 minutes of import is not the issue: once features are flowing, our import can be aborted. 
The issue is the 20-30 minutes where it can't because it's (seemingly) stuck in GDAL calls.</div><div><br></div><div>Simon<br><div>______________________________</div><div><br></div><div>OGRFeature(fl_parcels):0<br> CNTYNAME (String) = ESCAMBIA<br> LINK (String) = 27-083S321301060003<br> PARCELID (String) = 083S321301060003<br> NPARNO (String) = 12-033-083S321301060003<br> DORUC (String) = 000<br> PAUC (String) = 1<br> PARUSEDESC (String) = VACANT RESIDENTIAL<br> SPASS_CD (String) = (null)<br> IMPROVVAL (Integer) = 0<br> LNDVAL (Integer64) = 1069<br> JV (Integer64) = 1069<br> JV_CHNG (Integer) = (null)<br> JV_HMSTD (Integer) = (null)<br> AV_SD (Integer64) = 1069<br> AV_NSD (Integer64) = 1069<br> AV_HMSTD (Integer) = (null)<br> JV_CLASS_U (Integer) = (null)<br> ONAME (String) = GRAF MABIE PARTNERSHIP<br> OADDR1 (String) = 5544 BAKER RD<br> OADDR2 (String) = (null)<br> OCITY (String) = MILTON<br> OSTATE (String) = FL<br> OZIPCD (String) = 32570<br> PHYADDR1 (String) = (null)<br> PHYCITY (String) = PERDIDO KEY<br> PHYZIP (String) = 32507<br> SLEGAL (String) = LT 6 BLK 3 PERDIDO BAY COUNTRY<br> ALTKEY (String) = 103002391<br> ACTYRBLT (Integer) = (null)<br> EFFYRBLT (Integer) = (null)<br> TOTLVGAREA (Integer) = (null)<br> NOBULDNG (Integer) = (null)<br> NORESUNTS (Integer) = (null)<br> PARSPLT (String) = (null)<br> LNDSQFOOT (Real) = 14610.000000000000000<br> CONSTCLASS (String) = (null)<br> SALEPRC1 (Integer64) = (null)<br> SALEYR1 (Integer) = (null)<br> SALEMO1 (Integer) = (null)<br> ORBOOK1 (String) = (null)<br> ORPAGE1 (String) = (null)<br> SALEPRC2 (Integer) = (null)<br> SALEYR2 (Integer) = (null)<br> SALEMO2 (Integer) = (null)<br> NBRHDCD (String) = (null)<br> PUBLICLND (String) = (null)<br> TAXAUTHCD (String) = MSTU<br> SEC (String) = 8<br> TWN (String) = 03S<br> RNG (String) = (null)<br> CENSUSBK (String) = 12033002604<br> SOURCEAGE (String) = ESCAMBIA COUNTY PROPERTY APPRAISER<br> SOURCEDATE (Integer64) = 1506643200<br> LAT_DD (Real) = 30.334948776670700<br> 
LONG_DD (Real) = -87.417015732515793<br> MGRS (String) = 16RDU5991555975<br> ACRES (Real) = 0.335475404173654<br> EXMPT (String) = (null)<br> LU_RES (String) = (null)<br> LUCODE (String) = 000<br> GCID (Integer) = 3070217<br> DESCRIPT (String) = VACANT RESIDENTIAL<br> FLAG (String) = (null)<br> FGDLAQDATE (Integer64) = 1509494400<br> AUTOID (Integer) = 3070217<br> Shape_Leng (Real) = 161.443473611912992<br> Shape_Area (Real) = 1357.620793937390090<br> POLYGON ((-87.4169063602653 30.3347107536787,-87.4169237108049 30.3346997733855,-87.4172722303389 30.3349846323649,-87.4172442347823 30.3350359296124,-87.4172384512691 30.3351985804435,-87.4171183385966 30.335195856325,-87.4169790313658 30.3350555432658,-87.4167283286418 30.3348031641612,-87.4167416558679 30.3347974225575,-87.4167635326352 30.3347875319118,-87.4167852417644 30.3347773059899,-87.4168030952182 30.334768463082,-87.4168206972148 30.3347594525361,-87.4168382153925 30.3347501905331,-87.416855482113 30.334740677073,-87.4168725811955 30.3347309121558,-87.4168895964589 30.334720937691,-87.4169063602653 30.3347107536787))<br></div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jul 29, 2021 at 6:49 AM Mike Flannigan &lt;<a href="mailto:mflan@mflan.com">mflan@mflan.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
I would like to hear more about large GeoJSON files.<br>
How large are they?<br>
<br>
My GeoJSON files contain linear features only. My<br>
largest one is 50.2 MB with 1,230,000 newlines in it.<br>
Next biggest one is 12 MB with 280,000 newlines. These<br>
and about 140 other geojsons are open in the same project<br>
and I have no problems. In fact I converted from<br>
SHP to geojson 2 years ago because I used to have problems<br>
with SHP linear files.<br>
<br>
I use QGIS 3.16.8 on Linux Mint.<br>
<br>
<br>
Mike<br>
<br>
<br>
On 7/28/21 2:36 PM, <a href="mailto:gdal-dev-request@lists.osgeo.org" target="_blank">gdal-dev-request@lists.osgeo.org</a> wrote:<br>
> Date: Wed, 28 Jul 2021 12:22:12 -0700<br>
> From: Simon Eves<<a href="mailto:simon.eves@omnisci.com" target="_blank">simon.eves@omnisci.com</a>><br>
> To: <a href="mailto:gdal-dev@lists.osgeo.org" target="_blank">gdal-dev@lists.osgeo.org</a><br>
> Subject: [gdal-dev] Large GeoJSONs and aborting file opening<br>
> Message-ID:<br>
> <<a href="mailto:CAJf0KTRsaskSOsPv8tbA%2BiTB%2BTqL_ui5y4n05WGLDw_3guRs4w@mail.gmail.com" target="_blank">CAJf0KTRsaskSOsPv8tbA+iTB+TqL_ui5y4n05WGLDw_3guRs4w@mail.gmail.com</a>><br>
> Content-Type: text/plain; charset="utf-8"<br>
><br>
> Dear All,<br>
><br>
> I am aware that some improvements were made in the 2.3 timeframe with<br>
> regard to handling large GeoJSON files, although even in 3.2 it is<br>
> still very slow and memory-hungry.<br>
><br>
> Our system allows for aborting imports, but this only works reliably once<br>
> it has actually got to the stage of reading features from the file. With<br>
> the GeoJSON, it just sits in the GDALOpenEx call for ages.<br>
><br>
> My question, therefore, is whether it might be practical to run the<br>
> GDALOpenEx in a separate thread with a future to return the resulting<br>
> handle, such that it could be monitored and killed if necessary?<br>
><br>
> Mainly I would be concerned that killing the thread might trash some global<br>
> GDAL state that might then not be recoverable, or that the open relies on<br>
> some TLS for the process thread and therefore might not work properly.<br>
><br>
> We're going to try it anyway, but opinions welcomed, thanks!<br>
><br>
> Simon<br>
<br>
<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr">Simon Eves<br>Senior Graphics Engineer, Rendering Group<br>100 Montgomery St (5th Floor), San Francisco, CA 94104, USA<br>Email: <a href="mailto:simon.eves@omnisci.com" target="_blank">simon.eves@omnisci.com</a> | Cell: +1 (415) 902-1996</div></div>
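<div><br></div><div>For illustration only, here is a minimal sketch of the "monitor and kill" idea from the original question. It swaps in a different technique from the one asked about: a child process rather than a thread, since a process can be terminated without touching the parent's state. It is POSIX-only (it assumes the "fork" start method), and <code>slow_open</code>, <code>run_abortable</code> and <code>should_abort</code> are hypothetical names; <code>slow_open</code> merely stands in for a long GDALOpenEx-style call and does not use GDAL at all.</div>

```python
import multiprocessing as mp
import time


def slow_open(path, delay_s):
    """Hypothetical stand-in for a long, uninterruptible open call."""
    time.sleep(delay_s)
    return f"opened {path}"


def _worker(conn, func, args):
    # Runs in the child: do the slow work and send back a picklable result.
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_abortable(func, args, should_abort, poll_s=0.05):
    """Run func(*args) in a child process, killing it if should_abort() is true.

    Returns ("ok", result), ("error", message) or ("aborted", None).
    """
    ctx = mp.get_context("fork")  # assumption: POSIX platform with fork
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_worker, args=(child_conn, func, args))
    proc.start()
    child_conn.close()  # the parent keeps only the read end
    try:
        while True:
            # Wait briefly for a result, then check for an abort request.
            if parent_conn.poll(poll_s):
                try:
                    return parent_conn.recv()
                except EOFError:
                    return ("error", "worker exited without a result")
            if should_abort():
                proc.terminate()  # only the child dies; the parent is untouched
                return ("aborted", None)
    finally:
        proc.join()
        parent_conn.close()
```

<div>One limit of this design: a dataset handle cannot be passed back across the process boundary, so in practice the child would have to perform the whole import, or stream features back to the parent, rather than just the open. Whether that trade-off is acceptable is a judgment call, and the sketch says nothing about GDAL's own thread-safety.</div>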