[postgis-devel] running PostGIS on valgrind-enabled build / identifying memory issues

Tomas Vondra tomas.vondra at 2ndquadrant.com
Fri Feb 9 17:30:27 PST 2018


Hi,

We have a bunch of customers running PostGIS, and from time to time they
report some sort of crash to us. The reports are rare and not reproducible,
so it's difficult to identify the actual root cause. But the symptoms
are usually consistent with memory corruption - invalid pointers in core
dumps, segfaults, etc.

For PostgreSQL, valgrind turned out to be an invaluable tool for identifying
various memory issues, so after yet another crash report I had the
idea to try running a valgrind-enabled PostGIS build.

So I built the current PostgreSQL and PostGIS master branches (f069c91a57 and
d613598e67) with valgrind support:

    ./configure --enable-debug --enable-cassert \
                --prefix=/home/tomas/pg-debug \
                CFLAGS="-O0 -ggdb3 -fno-omit-frame-pointer \
                        -DUSE_VALGRIND -DRANDOMIZE_ALLOCATED_MEMORY"

Disabling optimizations makes valgrind reports more accurate in my
experience, and -ggdb3 makes them easier to analyze.

Running "make check" (in the PostGIS source tree) against such a build
identifies a fairly high number of possible issues. The report is
usually 0.5-1MB, but many of the entries follow the same pattern.

See the attached example report - pretty much all the initial reports
are "jumps on uninitialised value" in do_analyze_rel. I haven't gotten to
the bottom of this yet, but my understanding is that the GBOX in
compute_gserialized_stats_mode may not be fully initialized in some
cases, and some of the uninitialised fields are accessed later.
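
To illustrate the general pattern (a minimal sketch only, not the actual
PostGIS code: DemoBox, demo_box_init_2d and demo_box_is_flat are
hypothetical names, and the struct is not the real GBOX definition), a
partially filled struct can be made safe by zeroing it before setting the
fields that are known:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a bounding box; NOT the PostGIS GBOX layout. */
typedef struct
{
    bool   has_z;
    double xmin, xmax;
    double ymin, ymax;
    double zmin, zmax;
} DemoBox;

/*
 * Fill in only the 2D extent. The memset gives every other field a
 * defined value; without it, has_z/zmin/zmax would hold whatever was
 * in memory before.
 */
void
demo_box_init_2d(DemoBox *box, double xmin, double xmax,
                 double ymin, double ymax)
{
    assert(box != NULL);
    memset(box, 0, sizeof(*box));
    box->xmin = xmin;
    box->xmax = xmax;
    box->ymin = ymin;
    box->ymax = ymax;
}

/*
 * Branches on the Z fields. If the caller had skipped the memset above,
 * this comparison is where valgrind would typically complain that a
 * conditional jump depends on uninitialised value(s).
 */
bool
demo_box_is_flat(const DemoBox *box)
{
    if (!box->has_z)
        return true;
    return box->zmin == box->zmax;
}
```

Whether zeroing is the right fix in a real code path is a separate
question; sometimes the correct fix is to not read the field at all.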

There are a few more such patterns in the report, including for example
"invalid read" issues, which in my experience usually mean out-of-bounds
access (buffer overflows and such).
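
As a rough illustration of what typically sits behind an "invalid read"
(again a sketch with hypothetical names, demo_get_value and demo_sum,
not code taken from PostGIS), the usual culprit is an index that runs one
past the end of an allocation. A bounds-checked accessor avoids that:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy values[idx] into *out and return 1 if idx
 * is in range, otherwise leave *out untouched and return 0.
 */
int
demo_get_value(const double *values, size_t count, size_t idx, double *out)
{
    if (values == NULL || out == NULL || idx >= count)
        return 0;
    *out = values[idx];
    return 1;
}

/*
 * Sum all elements. Writing the loop condition as "i <= count" instead
 * of "i < count" is the classic off-by-one that reads past the end of
 * the array, which valgrind reports as an invalid read when the array
 * lives on the heap.
 */
double
demo_sum(const double *values, size_t count)
{
    double total = 0.0;
    double v;

    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
    {
        if (demo_get_value(values, count, i, &v))
            total += v;
    }
    return total;
}
```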


I'm wondering if those are known issues. While I find valgrind quite
reliable, it's still possible that some of those are false positives.

I plan to continue working on this, but I'm not very familiar with the
PostGIS code, so I'd appreciate help from the more experienced people on
this list. It's likely you'll immediately spot where the issue is.
Anyone willing to look at the valgrind report?


The attached pg_ctl.patch makes it easier to start PostgreSQL under
valgrind. You may need to modify the hard-coded paths, though.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pg_ctl.patch.gz
Type: application/gzip
Size: 587 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20180210/98a09811/attachment-0002.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: valgrind.log.gz
Type: application/gzip
Size: 23501 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20180210/98a09811/attachment-0003.gz>

