[gdal-dev] Shapefile with corrupted index: SHAPE_RESTORE_SHX=YES doesn't correctly repair it.

Robert Hewlett rob.hewy at gmail.com
Mon May 15 08:43:40 PDT 2023


Hi,

Out of curiosity, if you isolate the shp, dbf and shx (make a copy) in a
separate folder is the data still corrupt?
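
A minimal sketch of that isolation step, assuming the aim is to keep other
sidecar files (spatial indexes and the like) out of the picture. The function
name and the paths are hypothetical, and extensions are matched in lower case
only.

```python
import shutil
from pathlib import Path


def isolate_shapefile(shp_path, dest_dir):
    """Copy only the .shp, .dbf and .shx of one layer into dest_dir."""
    src = Path(shp_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for ext in (".shp", ".dbf", ".shx"):
        part = src.with_suffix(ext)
        if part.exists():
            # copy2 keeps timestamps, which can help when comparing copies later
            shutil.copy2(part, dest / part.name)
            copied.append(part.name)
    return copied


# Hypothetical usage:
# isolate_shapefile("data/layer.shp", "isolated")
```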

Rob


On Mon, May 15, 2023 at 6:35 AM Andrea Giudiceandrea via gdal-dev <
gdal-dev at lists.osgeo.org> wrote:

> Hi devs,
> in a recent QGIS issue report at
> https://github.com/qgis/QGIS/issues/53058 , a user complains about an
> ESRI Shapefile layer that was corrupted after an attribute value was
> changed and the edit was saved. QGIS opens the corrupted layer without
> reporting any error or warning, but it shows only a subset of the
> original feature geometries: many records now have a null geometry
> associated, so they cannot be displayed.
>
> After some investigation, although I don't know why or how the layer
> was corrupted, it seems to me that the issue is mostly due to
> corruption of the .shx file: for various records it contains
> incorrect values for the offset and length of the record. This causes
> such records and the following ones to be read incorrectly, until the
> offsets in the .shx file and the data in the .shp file line up again.
>
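
A rough way to locate such entries is to compare each index entry with the
record header it points to. The sketch below assumes the layout from the ESRI
Shapefile specification: a 100-byte header in both files, 8-byte index entries
(offset and content length, big-endian, in 16-bit words), and .shp record
headers holding a 1-based record number plus content length. The function name
is hypothetical and this is not how GDAL or QGIS check the files.

```python
import struct

HEADER_BYTES = 100  # both the main file and the index start with a 100-byte header


def find_index_mismatches(shp_bytes, shx_bytes):
    """Return (record_no, reason) for every index entry that disagrees with the .shp."""
    problems = []
    n_records = (len(shx_bytes) - HEADER_BYTES) // 8
    for i in range(n_records):
        # Each index entry: offset and content length, both in 16-bit words
        offset_words, length_words = struct.unpack_from(
            ">ii", shx_bytes, HEADER_BYTES + 8 * i
        )
        pos = offset_words * 2
        if pos < HEADER_BYTES or pos + 8 > len(shp_bytes):
            problems.append((i + 1, "offset outside .shp"))
            continue
        # Record header in the .shp: record number, then content length
        rec_no, rec_len = struct.unpack_from(">ii", shp_bytes, pos)
        if rec_no != i + 1:
            problems.append((i + 1, "record number mismatch"))
        elif rec_len != length_words:
            problems.append((i + 1, "content length mismatch"))
    return problems
```

Reading both files with `open(path, "rb").read()` and passing the bytes in
should list the entries where the index and the data are out of step.
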
> Running the QGIS "Repair Shapefile" processing algorithm against this
> layer fails: the .shx file is actually updated, but the layer becomes
> totally invalid and can no longer be loaded in QGIS. The same happens
> when using ogrinfo directly after deleting the .shx file and setting
> SHAPE_RESTORE_SHX to YES: the .shx file is recreated, but the layer
> becomes unreadable by both QGIS and ogrinfo.
>
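
For reference, an index can in principle be rebuilt by walking the record
headers of the .shp. The following is only a sketch of that idea, assuming the
records in the .shp are contiguous and intact; it is not GDAL's
implementation, and the function name is hypothetical.

```python
import struct

HEADER_BYTES = 100


def rebuild_shx(shp_bytes):
    """Build .shx content by walking the record headers of a .shp."""
    entries = bytearray()
    pos = HEADER_BYTES
    while pos + 8 <= len(shp_bytes):
        _rec_no, length_words = struct.unpack_from(">ii", shp_bytes, pos)
        if length_words < 0:
            break  # nonsensical length: stop rather than walk backwards
        # Index entry: record offset and content length, in 16-bit words
        entries += struct.pack(">ii", pos // 2, length_words)
        pos += 8 + 2 * length_words
    # The index header has the same layout as the main file header,
    # except that the file length field (byte 24) describes the index itself.
    header = bytearray(shp_bytes[:HEADER_BYTES])
    struct.pack_into(">i", header, 24, (HEADER_BYTES + len(entries)) // 2)
    return bytes(header + entries)
```
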
> Inspecting the .shx file created by ogrinfo with SHAPE_RESTORE_SHX=YES
> (which is the same as the one created by the QGIS tool "Repair
> Shapefile"), it seems to me ogr fails to properly create the .shx file:
> it incorrectly stores, in the index file header, the total length in
> 16-bit words of the .shp file instead of the total length in 16-bit
> words of the .shx file itself.
> In this particular case,
> it stores the incorrect value 00 29 2A C2 = 2697922 16-bit words =
> 5395844 bytes
> instead of the correct value 00 02 1D 26 = 138534 16-bit words = 277068
> bytes
>
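
The arithmetic can be checked with a few lines of Python. This sketch assumes
the file length field is a big-endian 32-bit integer counting 16-bit words, as
in the Shapefile specification; the helper name is hypothetical.

```python
import struct


def header_length_bytes(field):
    """Decode the 4-byte big-endian file length field into (words, bytes)."""
    (words,) = struct.unpack(">i", field)
    return words, words * 2


stored = header_length_bytes(bytes.fromhex("00292AC2"))    # (2697922, 5395844)
expected = header_length_bytes(bytes.fromhex("00021D26"))  # (138534, 277068)

# If 277068 bytes is the real size of the index, it holds
# (277068 - 100) / 8 = 34621 eight-byte entries after the 100-byte header.
record_count = (expected[1] - 100) // 8
```
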
> Changing this incorrect value to the correct one in the repaired .shx
> file makes the layer valid again and shows the previously missing
> feature geometries (with only some glitches and a missing record).
>
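
A minimal sketch of that manual fix, assuming the only problem left in the
index is the length field at byte 24 of its header. The function name is
hypothetical, and it is safer to run it on a copy of the file.

```python
import os
import struct


def fix_shx_header_length(shx_path):
    """Rewrite the header length field so it matches the index file's own size."""
    size_bytes = os.path.getsize(shx_path)
    new_words = size_bytes // 2  # the field counts 16-bit words
    with open(shx_path, "r+b") as f:
        f.seek(24)
        (old_words,) = struct.unpack(">i", f.read(4))
        if old_words != new_words:
            f.seek(24)
            f.write(struct.pack(">i", new_words))
    return old_words, new_words
```

The returned pair (old value, new value) makes it easy to see whether the
header was actually wrong before the call.
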
> This behaviour seems odd to me, as I remember that both the Repair
> Shapefile tool and the SHAPE_RESTORE_SHX=YES setting worked well in
> the past for repairing Shapefiles with a corrupted index.
>
> Maybe the issue in this particular Shapefile prevents ogr from
> correctly repairing the index?
> For comparison, the old "Shape Checker utility" succeeds in repairing
> the .shx file: it creates the same .shx file as the one created by ogr,
> apart from the total file length value, which is correct.
>
> Any clue as to what may have gone wrong during the layer editing in QGIS
> that eventually corrupted the layer?
>
>
> Best regards.
>
> Andrea Giudiceandrea