[gdal-dev] core dump on dir info
Michael Sumner
mdsumner at gmail.com
Sun Feb 4 13:51:52 PST 2024
indeed there's no avx2:
cat /proc/cpuinfo|grep sse|head -n 1
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb
rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq
ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx
f16c hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse
3dnowprefetch osvw xop fma4 tbm perfctr_core ssbd ibpb vmmcall tsc_adjust
bmi1 virt_ssbd arat npt nrip_save arch_capabilities
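
For anyone wanting to script this check, here is a minimal sketch. The output above lists "avx" but not "avx2", so the match has to be on the whole flag name rather than a substring. The helper name has_cpu_flag is hypothetical (it is not part of GDAL or TileDB), and reading /proc/cpuinfo assumes Linux.

```shell
# Hypothetical helper (not from GDAL or TileDB): succeed if the given
# flags line contains the CPU flag as a whole, space-separated word.
has_cpu_flag() {
    flag=$1
    flags_line=$2
    # Pad with spaces so the first and last flags also match as whole words.
    case " $flags_line " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}

# First "flags" line of /proc/cpuinfo (Linux only); empty if unavailable.
flags_line=$(grep -m 1 '^flags' /proc/cpuinfo 2>/dev/null || true)

if has_cpu_flag avx2 "$flags_line"; then
    echo "avx2 present"
else
    # Workaround suggested in this thread for TileDB builds that assume AVX2.
    echo "avx2 missing: consider 'export GDAL_SKIP=TileDB'"
fi
```

Matching on the padded line avoids the false positive a plain "grep avx" would give on a CPU that only reports avx.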
Cheers, Mike
On Sun, Feb 4, 2024 at 10:55 PM Even Rouault <even.rouault at spatialys.com>
wrote:
> ok, so I believe this is the AVX2 issue I was talking about, as I realize
> that enabling AVX2 is the default mode when TileDB is built from source
> (which the Docker image does), and must be explicitly disabled with
> "./bootstrap --disable-avx2" (I've just changed the build recipe to include
> that, will take effect next time the images are refreshed)
>
> To confirm, can you send or just check the output of : cat
> /proc/cpuinfo|grep sse|head -n 1
>
> If there is no "avx2" in it, this is almost certainly (99.9%) the cause of the issue.
>
> Even
> On 04/02/2024 at 06:20, Michael Sumner wrote:
>
> skipping TileDB does fix:
>
> ogr2ogr /tmp/newdir
> https://github.com/SymbolixAU/geojsonsf/raw/master/inst/examples/geo_melbourne.geojson -f
> "ESRI Shapefile"
> export GDAL_SKIP=TileDB
> ogrinfo /tmp/newdir/
> INFO: Open of `/tmp/newdir/'
> using driver `ESRI Shapefile' successful.
> 1: geo_melbourne (Polygon)
>
> unset GDAL_SKIP
> ogrinfo /tmp/newdir/
> Illegal instruction (core dumped)
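
A possible variant of the workaround above, as a sketch: set GDAL_SKIP for a single command only, rather than exporting it for the whole shell session. The wrapper name with_gdal_skip is made up for illustration; it relies only on the standard env utility.

```shell
# Hypothetical wrapper (not part of GDAL): run one command with the given
# GDAL drivers skipped, leaving the calling shell's environment untouched.
with_gdal_skip() {
    drivers=$1
    shift
    # env sets the variable only for the child process it launches.
    env GDAL_SKIP="$drivers" "$@"
}

# Example use, assuming ogrinfo is installed:
#   with_gdal_skip TileDB ogrinfo /tmp/newdir/
```

This way there is no need to remember the later "unset GDAL_SKIP".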
>
> I failed to explain that I'm using gdal containers from the repo:
>
> docker run --rm -ti ghcr.io/osgeo/gdal:ubuntu-full-latest
>
> apt update
> apt install -y gdb
>
> Here's the output under gdb as you suggested; there was a lot, so I put
> it on a gist:
> https://gist.github.com/mdsumner/839ae6e05ededf640b65bfee3a20a4c0
>
> gdb --args ogrinfo /tmp/newdir/
> > run
> > thread apply all bt
>
> Thanks!
>
>
>
>
>
> On Sat, Feb 3, 2024 at 7:49 PM Even Rouault <even.rouault at spatialys.com>
> wrote:
>
>> - When it crashes under gdb, type "thread apply all bt" to get the stack
>> trace of all threads
>>
>> - I suspect there is a connection with
>> https://github.com/OSGeo/gdal/pull/9170 , but that pull request wouldn't
>> help here as "/tmp/newdir" could be a valid connection to TileDB
>>
>> - how did you get TileDB installed? It looks to be packaged? Which
>> distribution do you use?
>>
>> - SIGILL reminds me of issues with some TileDB builds using the AVX2
>> instruction set by default, which could cause some crash on host CPUs that
>> don't have AVX2 (unlikely on recent hardware though)
>>
>> - Setting GDAL_SKIP=TileDB should be a workaround
>>
>>
>> On 03/02/2024 at 07:15, Michael Sumner wrote:
>>
>> Thanks Even, so there's something about tiledb under gdb (or maybe I am
>> mangling the context, I will try variants of the host I'm using). Run with
>> valgrind included below.
>>
>> gdb --args ogrinfo /tmp/newdir/
>> ...
>> (gdb) run
>> Starting program: /usr/local/bin/ogrinfo /tmp/newdir/
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> [New Thread 0x7fffe7757640 (LWP 988)]
>> [New Thread 0x7fffe6f56640 (LWP 989)]
>> [New Thread 0x7fffde755640 (LWP 990)]
>> [New Thread 0x7fffd5f54640 (LWP 991)]
>> [New Thread 0x7fffc5753640 (LWP 992)]
>> [New Thread 0x7fffc4f52640 (LWP 993)]
>> [New Thread 0x7fffb4751640 (LWP 994)]
>> [New Thread 0x7fffabf50640 (LWP 995)]
>> [New Thread 0x7fffab74f640 (LWP 996)]
>> [New Thread 0x7fffa2f4e640 (LWP 997)]
>> [New Thread 0x7fff9a74d640 (LWP 998)]
>> [New Thread 0x7fff91f4c640 (LWP 999)]
>> [New Thread 0x7fff8974b640 (LWP 1000)]
>> [New Thread 0x7fff78f4a640 (LWP 1001)]
>> [New Thread 0x7fff78749640 (LWP 1002)]
>> [New Thread 0x7fff6f5ff640 (LWP 1003)]
>>
>> Thread 1 "ogrinfo" received signal SIGILL, Illegal instruction.
>> 0x00007ffff3773c9e in tiledb::common::ThreadPool::ThreadPool(unsigned
>> long) () from /lib/x86_64-linux-gnu/libtiledb.so.2.16
>>
>>
>>
>>
>> valgrind -s ogrinfo /tmp/newdir
>> ==704== Memcheck, a memory error detector
>> ==704== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
>> ==704== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
>> ==704== Command: ogrinfo /tmp/newdir
>> ==704==
>> INFO: Open of `/tmp/newdir'
>> using driver `ESRI Shapefile' successful.
>> 1: geo_melbourne (Polygon)
>> ==704==
>> ==704== HEAP SUMMARY:
>> ==704== in use at exit: 25,486 bytes in 216 blocks
>> ==704== total heap usage: 15,761 allocs, 15,545 frees, 2,390,169 bytes
>> allocated
>> ==704==
>> ==704== LEAK SUMMARY:
>> ==704== definitely lost: 0 bytes in 0 blocks
>> ==704== indirectly lost: 0 bytes in 0 blocks
>> ==704== possibly lost: 544 bytes in 1 blocks
>> ==704== still reachable: 24,942 bytes in 215 blocks
>> ==704== suppressed: 0 bytes in 0 blocks
>> ==704== Rerun with --leak-check=full to see details of leaked memory
>> ==704==
>> ==704== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
>>
>>
>> ogrinfo /tmp/newdir
>> Illegal instruction (core dumped)
>>
>> Cheers, Mike
>>
>>
>>
>>
>> On Sat, Feb 3, 2024 at 12:46 PM Even Rouault <even.rouault at spatialys.com>
>> wrote:
>>
>>> Michael,
>>>
>>> I'm wondering if there might not be something wrong with your build or
>>> runtime environment. Or there's something subtle, because that works fine
>>> for me with my dev build or in the
>>> ghcr.io/osgeo/gdal:alpine-normal-3.8.3 Docker image
>>>
>>> Try running "valgrind ogrinfo /tmp/newdir/" or "gdb --args ogrinfo
>>> /tmp/newdir/" (type "run") to get more useful information
>>>
>>> Even
>>> On 03/02/2024 at 02:35, Michael Sumner via gdal-dev wrote:
>>>
>>> I'm getting Illegal instruction / core dumped on ogrinfo of a
>>> directory:
>>>
>>> ogr2ogr /tmp/newdir
>>> https://github.com/SymbolixAU/geojsonsf/raw/master/inst/examples/geo_melbourne.geojson
>>> -f "ESRI Shapefile"
>>>
>>> ogrinfo /tmp/newdir/
>>> Illegal instruction (core dumped)
>>>
>>> I've worked back through some docker images and it wasn't a problem in
>>> 3.6.0, but I'm getting it since 3.7.0 - or I'm doing something wrong
>>> entirely.
>>>
>>> Cheers, Mike
>>>
>>>
>>> --
>>> Michael Sumner
>>> Software and Database Engineer
>>> Australian Antarctic Division
>>> Hobart, Australia
>>> e-mail: mdsumner at gmail.com
>>>
>>> _______________________________________________
>>> gdal-dev mailing list
>>> gdal-dev at lists.osgeo.org
>>> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>>>
>>> -- http://www.spatialys.com
>>> My software is free, but my time generally not.
>>>
>>>
>>
>
--
Michael Sumner
Software and Database Engineer
Australian Antarctic Division
Hobart, Australia
e-mail: mdsumner at gmail.com