[gdal-dev] Ogr2ogr taking too much time to process a MapInfo TAB file
Daniel Morissette
dmorissette at mapgears.com
Thu Jul 28 07:25:37 PDT 2022
I confirm that the TAB dataset format uses 512-byte data blocks
organized in a tree structure, so reading from the file implies a lot
of random access across the whole file, even when the features are read
sequentially, because a single feature is stored in multiple data blocks
of various types (feature header blocks, feature coordinate blocks, etc.).
It would be interesting to know whether VSI_CACHE, as suggested by Even,
helps.
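
For anyone who wants to test this, here is a minimal sketch of the two
variants to compare. It reuses the ogr2ogr arguments from the original
report unchanged. The function names, the VSI_CACHE_SIZE value (in bytes,
purely illustrative) and the assumption that the archive holds a single
.tab dataset are mine, not something established in this thread, and
there is no guarantee the cache makes a difference here.

```shell
# Variant 1: keep reading through /vsizip/, with the VSI cache enabled
# through environment variables. Arguments: <zip file> <output csv>.
run_cached() {
  VSI_CACHE=YES VSI_CACHE_SIZE=100000000 \
  ogr2ogr -f CSV -skipfailures -makevalid /vsistdout/ "/vsizip/$1" \
    -oo AUTODETECT_TYPE=YES -lco CREATE_CSVT=YES > "$2"
}

# Variant 2: extract the archive first, then read the .tab file directly,
# which the original report says completes in under 30 seconds.
# Assumes the archive contains exactly one .tab dataset.
run_extracted() {
  tmpdir=$(mktemp -d) || return 1
  unzip -q "$1" -d "$tmpdir" || { rm -rf "$tmpdir"; return 1; }
  tab=$(find "$tmpdir" -iname '*.tab' | head -n 1)
  if [ -z "$tab" ]; then
    echo "no .tab file found in $1" >&2
    rm -rf "$tmpdir"
    return 1
  fi
  ogr2ogr -f CSV -skipfailures -makevalid /vsistdout/ "$tab" \
    -oo AUTODETECT_TYPE=YES -lco CREATE_CSVT=YES > "$2"
  status=$?
  rm -rf "$tmpdir"   # clean up the extracted copy
  return $status
}
```

Timing "run_cached onLDU.zip cached.csv" against
"run_extracted onLDU.zip extracted.csv" (for example with the shell's
"time" keyword) should show whether the cache closes the gap or whether
extracting first remains the practical workaround.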
Daniel
On 2022-07-27 11:55, Even Rouault wrote:
>
> Moises,
>
> I haven't reviewed the MITAB driver in depth, but reading from a .tab
> file may require random access, so it is not surprising that reading
> from a compressed file performs poorly. You might try setting the
> VSI_CACHE config option / env variable to YES, but there is no
> guarantee this will help for your use case.
>
> Even
>
> On 27/07/2022 at 11:39, Moises Calzado via gdal-dev wrote:
>> Hi everyone!
>>
>> We're using ogr2ogr to convert MapInfo TAB files into CSV format
>> using the following command:
>>
>> ogr2ogr -f CSV -skipfailures -makevalid /vsistdout/
>> /vsizip/onLDU.zip -oo AUTODETECT_TYPE=YES -lco CREATE_CSVT=YES >
>> test_2.csv
>>
>>
>> The file is ≈200 MB and the process takes too long to finish
>> (almost 20 min), so we don't know whether we're doing something
>> wrong in the command we launch.
>> However, if we launch the same command against the .tab file instead
>> of using the vsizip virtual file system, it takes less than 30
>> seconds to complete.
>>
>> Have you ever seen something like this? Do you know whether it's
>> expected for this kind of file to take so long to process, or are
>> we doing something wrong?
>>
>> Thanks so much for your help in advance,
>> Regards!
>> --
>> *Moises Calzado*
>>
>> Support Engineer
>>
>> (US) +1 917 463 3232 | (ES) +34 911 165 823 | mcalzado at carto.com
>>
>> <https://spatial-data-science-conference.com/2022/newyork/>
>>
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
> --
> http://www.spatialys.com
> My software is free, but my time generally not.
--
Daniel Morissette
Mapgears Inc
T: +1 418-696-5056 #201