[gdal-dev] Ogr2ogr taking too much time to process a MapInfo TAB file

Daniel Morissette dmorissette at mapgears.com
Thu Jul 28 07:25:37 PDT 2022


I confirm that a TAB dataset stores its data in 512-byte blocks 
organized in a tree structure. Reading the file therefore implies a lot 
of random access across the whole file, even if you read the features 
sequentially, because a single feature is spread over multiple data 
blocks of various types (feature header blocks, feature coordinate 
blocks, etc.). It would be interesting to know whether VSI_CACHE, as 
suggested by Even, helps.
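
To illustrate why that access pattern hurts on a compressed file, here 
is a rough sketch using Python's standard zipfile module. It is not 
GDAL's /vsizip/ implementation, and the names make_payload, build_zip, 
read_blocks, timed_read and data.bin are made up for the example. It 
only shows the general cost: a deflate stream cannot be entered in the 
middle, so jumping back to an earlier block means decompressing again 
from the start of the member.

```python
import io
import random
import time
import zipfile

BLOCK_SIZE = 512  # block size quoted above for TAB datasets
MEMBER = "data.bin"  # hypothetical member name, not a real TAB component


def make_payload(n_blocks: int) -> bytes:
    """Build n_blocks blocks; each block repeats its own 4-byte index."""
    return b"".join(
        i.to_bytes(4, "little") * (BLOCK_SIZE // 4) for i in range(n_blocks)
    )


def build_zip(payload: bytes) -> bytes:
    """Return an in-memory zip archive holding payload as one deflated member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(MEMBER, payload)
    return buf.getvalue()


def read_blocks(archive: bytes, order) -> list:
    """Read BLOCK_SIZE-byte blocks of the zipped member in the given order."""
    blocks = []
    with zipfile.ZipFile(io.BytesIO(archive)) as zf, zf.open(MEMBER) as fh:
        for index in order:
            # A backward seek outside the current buffer makes zipfile
            # restart decompression from the beginning of the member.
            fh.seek(index * BLOCK_SIZE)
            blocks.append(fh.read(BLOCK_SIZE))
    return blocks


def timed_read(archive: bytes, order) -> float:
    """Return the seconds spent reading the blocks in the given order."""
    start = time.perf_counter()
    read_blocks(archive, order)
    return time.perf_counter() - start


if __name__ == "__main__":
    n_blocks = 1000
    archive = build_zip(make_payload(n_blocks))
    sequential = list(range(n_blocks))
    shuffled = sequential[:]
    random.Random(0).shuffle(shuffled)
    print(f"sequential: {timed_read(archive, sequential):.3f} s")
    print(f"shuffled:   {timed_read(archive, shuffled):.3f} s")
```

On an uncompressed file each seek is cheap, which would be consistent 
with the large gap Moises reports between the .tab file and the zipped 
one. The timings printed here depend on the machine and say nothing 
about how much VSI_CACHE would recover in GDAL.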

Daniel

On 2022-07-27 11:55, Even Rouault wrote:
>
> Moises,
>
> I haven't reviewed the MITAB driver in depth, but reading from a .tab 
> file may require random access, so it is not surprising that reading 
> from a compressed file performs poorly. You might try setting the 
> VSI_CACHE config option / environment variable to YES, but there is no 
> guarantee this will help for your use case.
>
> Even
>
> On 27/07/2022 at 11:39, Moises Calzado via gdal-dev wrote:
>> Hi everyone!
>>
>> We're using ogr2ogr to convert MapInfo TAB files into CSV format 
>> using the following command:
>>
>>     ogr2ogr -f CSV -skipfailures -makevalid /vsistdout/ \
>>         /vsizip/onLDU.zip -oo AUTODETECT_TYPE=YES -lco CREATE_CSVT=YES \
>>         > test_2.csv
>>
>>
>> The file is ≈200 MB and the process takes almost 20 minutes to 
>> finish, so we don't know if we're doing something wrong in the 
>> command that we launch.
>> [Attached image: Screenshot 2022-07-20 at 12.55.14.png]
>> However, if we launch the same command against the .tab file instead 
>> of using the vsizip virtual file system, it takes less than 30 
>> seconds to complete.
>>
>> Have you ever seen something like this? Do you know if it's expected 
>> for this kind of file to take so long to process, or are we doing 
>> something wrong?
>>
>> Thanks so much for your help in advance,
>> Regards!
>> -- 
>> *Moises Calzado*
>>
>> Support Engineer
>>
>> (US) +1 917 463 3232 | (ES) +34 911 165 823 | mcalzado at carto.com
>>
>> _______________________________________________
>> gdal-dev mailing list
>> gdal-dev at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
> -- 
> http://www.spatialys.com
> My software is free, but my time generally not.


-- 
Daniel Morissette
Mapgears Inc
T: +1 418-696-5056 #201