<div dir="auto">Improvements have hit master. I suspect some bottlenecks remain, but I currently lack the tests/means to investigate further in the short term and would appreciate feedback.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 31 Oct 2019 at 02:17, Björn Harrtell <<a href="mailto:bjorn.harrtell@gmail.com">bjorn.harrtell@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Thanks for trying out access to FlatGeobuf via HTTP.<div><br></div><div>For the record, I've been somewhat aware of this particular efficiency problem and I aim to improve it when I can get to it, because this is a use case where I definitely want FlatGeobuf to take first place. :)<div><br></div><div>/Björn</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 24 Oct 2019 at 20:05, Even Rouault <<a href="mailto:even.rouault@spatialys.com" target="_blank" rel="noreferrer">even.rouault@spatialys.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Thursday, 24 October 2019 17:42:23 CEST Rahkonen Jukka (MML) wrote:<br>

> Hi,<br>

> <br>

> I was experimenting with accessing some vector files through HTTP (the same data<br>

> as FlatGeobuf, GeoPackage, and shapefile). The file size in each format<br>

> was about 850 MB and the data amounted to about 240000 linestrings. I<br>

> made an ogrinfo request with a spatial filter that selects one feature and<br>

> checked the number of HTTP requests and the amount of requested data.<br>

> <br>

> FlatGeobuf<br>

> 19 HTTP requests<br>

> 33046509 bytes read<br>

<br>

Looking at the debug log, FlatGeobuf currently loads the whole index-of-features<br>

array ("Reading feature offsets index"), which accounts for 32.7 MB <br>

of the above 33 MB. This could probably be avoided by only loading the offsets <br>

of the selected features. The shapefile driver had the same issue a few years <br>

ago; it was fixed by initializing the offset array to zeroes and loading the <br>

offsets on demand.<br>
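<br>
The on-demand idea can be sketched roughly as below. This is only an illustration, not the actual shapefile or FlatGeobuf driver code: the class name LazyOffsetTable, the 8-byte little-endian entry layout and the 4-byte fake header are all assumptions made for the example.<br>

```python
import io
import struct


class LazyOffsetTable:
    """Hypothetical offset table whose entries are read only when first used."""

    ENTRY_SIZE = 8  # assumption: one little-endian uint64 per feature

    def __init__(self, fileobj, table_start, count):
        self._f = fileobj
        self._start = table_start
        # Start with zeroes; nothing is read from the file yet.
        self._offsets = [0] * count
        # Separate flags, because 0 could be a valid offset in this layout.
        self._loaded = [False] * count
        self.reads = 0  # number of reads actually issued

    def offset(self, i):
        """Return the offset of feature i, fetching its entry on first access."""
        if not self._loaded[i]:
            self._f.seek(self._start + i * self.ENTRY_SIZE)
            raw = self._f.read(self.ENTRY_SIZE)
            (self._offsets[i],) = struct.unpack("<Q", raw)
            self._loaded[i] = True
            self.reads += 1
        return self._offsets[i]


# Small in-memory stand-in for a remote file: fake header + offset table.
offsets = [100, 250, 4096, 70000]
blob = b"HDR!" + b"".join(struct.pack("<Q", o) for o in offsets)
table = LazyOffsetTable(io.BytesIO(blob), table_start=4, count=len(offsets))
selected = table.offset(2)  # only this one entry is read
```

With a remote file, each first access would turn into a small range request instead of one large download of the whole array.<br>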

<br>

> If somebody<br>

> really finds a use case for reading vector data from the web, it seems<br>

> obvious that being able to cache and re-use the spatial index<br>

> would be very beneficial. I can imagine that with shapefile it would mean <br>

> downloading the .qix file, with GeoPackage reading the contents of the<br>

> rtree index table, and with FlatGeobuf probably extracting the static<br>

> packed Hilbert R-tree index.<br>

<br>

General caching logic in /vsicurl/ would be preferable (downloading the <br>

'data' part of files might evict the indexes, but having dedicated logic <br>

in each driver to tell which files / regions of the files should be cached <br>

would be a bit annoying). Basically, doing a HEAD request on the file to get <br>

its last update date and keeping a local cache of downloaded pieces would <br>

be a more general solution.<br>
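<br>
A rough sketch of that kind of cache follows. It is not /vsicurl/ code: RangeCache, the head and fetch_range callables and the 16384-byte chunk size are hypothetical, and the network is replaced by an in-memory buffer so the example runs on its own.<br>

```python
class RangeCache:
    """Hypothetical chunk cache that is dropped when the remote file changes."""

    CHUNK = 16384  # assumed chunk size in bytes

    def __init__(self, head, fetch_range):
        self._head = head          # callable() -> validator, e.g. Last-Modified
        self._fetch = fetch_range  # callable(start, end) -> bytes of [start, end)
        self._validator = None
        self._chunks = {}
        self.revalidate()

    def revalidate(self):
        """Ask for the current validator and drop the cache if it changed."""
        validator = self._head()
        if validator != self._validator:
            self._chunks.clear()
            self._validator = validator

    def read(self, offset, size):
        """Return size bytes at offset, fetching only chunks not yet cached."""
        first = offset // self.CHUNK
        last = (offset + size - 1) // self.CHUNK
        out = bytearray()
        for idx in range(first, last + 1):
            if idx not in self._chunks:
                start = idx * self.CHUNK
                self._chunks[idx] = self._fetch(start, start + self.CHUNK)
            out += self._chunks[idx]
        skip = offset - first * self.CHUNK
        return bytes(out[skip:skip + size])


# In-memory stand-ins for the HEAD request and the range request.
data = bytes(range(256)) * 256  # 65536 bytes of fake file content
calls = []


def fake_head():
    return "Thu, 24 Oct 2019 00:00:00 GMT"


def fake_fetch(start, end):
    calls.append((start, end))
    return data[start:end]


cache = RangeCache(fake_head, fake_fetch)
piece = cache.read(100, 10)        # fetches chunk 0
piece_again = cache.read(100, 10)  # served from the cache
```

Such a cache does not distinguish index from data, so a size limit with eviction would still be needed to keep large 'data' reads from pushing the index chunks out.<br>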

<br>

Even<br>

<br>

-- <br>

Spatialys - Geospatial professional services<br>

<a href="http://www.spatialys.com" rel="noreferrer noreferrer" target="_blank">http://www.spatialys.com</a><br>

</blockquote></div>

</blockquote></div>