[postgis-devel] [pgrouting] need help with std::bad_alloc issue
Stephen Woodbridge
woodbri at swoodbridge.com
Wed Jun 12 13:56:22 PDT 2013
Interestingly, I just traced down another of these issues. It turns out
this one was another in the family of std::vector problems. I resolved
it by adding my_vector.reserve(my_size) to the code before we tried to
add anything to the vector using my_vector.push_back(...).
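For reference, a minimal sketch of that pattern. The name build_reserved
is made up for illustration and is not pgrouting code; it only assumes
the element count is known before filling the vector:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from pgrouting): fill a vector with n values,
// reserving the capacity once before the first push_back.
std::vector<int> build_reserved(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);                          // single allocation up front
    const std::size_t cap = v.capacity();  // remember capacity after reserve
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));  // stays within reserved capacity
    }
    assert(v.capacity() == cap);           // no reallocation happened while filling
    return v;
}
```

Without the reserve() call the vector grows in steps and reallocates
several times on the way to n elements; the result is the same, only
the allocation pattern differs.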
While it is probably good practice to do this, it is not required, and the
code works fine in a standalone command-line program. I'm not very
knowledgeable in C++ and how it deals with creating objects, etc., but
this feels a lot like a bug in the create extension facility, where it is
missing a step of calling the C++ library initialization code when the
shared library is loaded. Somehow my adding the call to the reserve()
method initializes what should have been initialized before, and what
does get initialized when I reconnect to the database and the
connection code makes sure all my libraries are loaded and initialized.
But this is just my wild guess based on what I'm observing, and I could
be way off base.
-Steve
On 6/12/2013 2:40 PM, Stephen Woodbridge wrote:
> On 6/12/2013 5:20 AM, Sandro Santilli wrote:
>> On Tue, Jun 11, 2013 at 11:32:47AM -0400, Stephen Woodbridge wrote:
>>> On 6/11/2013 10:58 AM, Bborie Park wrote:
>>>> Steve,
>>>>
>>>> On what platform? Windows? Linux?
>>>
>>> Linux, pg 9.2.4
>>>>
>>>> By the looks of it (I'm not very good at C++), std::bad_alloc comes
>>>> from a failed new allocation. Run gdb or valgrind yet?
>>>
>>> Yes, gdb is not very useful because PostgreSQL is compiled with
>>> -PIE and gdb does not support that well. valgrind is my friend; I
>>> have run it before, but it does not report anything useful in this
>>> case.
>>>
>>> I'm not great with C++ either, but I'm stuck on the fact that this
>>> seems like C++ does not know how much memory is available, so it
>>> fails, but after reconnecting to the database it seems to have a better
>>> idea.
>>
>> Did you look at the system memory state while testing ?
>> I suspect you're just not releasing memory associated with results
>> of queries run in previous connection, so that re-connection releases
>> them all for you, or something like that. Alternatively there might be
>> a memory leak in some database-side functions so that re-connecting
>> quits the old backend and releases all leaked memory with it.
>> Valgrind doesn't help because the memory isn't really lost, but rather
>> held by the postgresql backend pool (released on reconnect).
>>
>> As you said, bad_alloc doesn't come from palloc, but who's keeping
>> all the system memory busy is still not known at this stage, so it
>> could be either side. To be frank, I think it's more likely for it
>> to be in pgrouting code itself. What do you think?
>
> So I go back to the simplest case that I can run to reproduce this:
>
> 1. createdb
> 2. connect and create postgis and pgrouting extensions
> 3. create a simple small table
> 4. run the query, get the error
> 5. \c to reconnect in psql
> 6. run the same query; it works
>
> So new database, new connection, minimal work in session. Nothing that
> is requesting a huge amount of memory. This happens consistently
> regardless of server load or if I restart postgresql.
>
> A slight variation in the above sequence:
>
> 1. createdb
> 2. connect and create postgis and pgrouting extensions
> 3. \c to reconnect in psql
> 4. create a simple small table
> 5. run the query, get NO error
>
> You ask: What do I think?
>
> This is harder to say. The symptoms all point to something systemic
> happening. That is not to say that pgrouting is not triggering the
> problem in some way, it is just not obvious.
>
> There is a big difference in the structure of pgrouting functions
> compared to the postgis functions, in that almost all of our functions
> are SRFs (set-returning functions), and almost all of them use the SPI
> facility to run queries that fetch data from the database. I know we
> have to be careful to make sure that we do not try to hold data
> palloc'd during SPI across the multiple SRF calls. At one point I
> reviewed our code to make sure we did not do that.
>
> I have reviewed most (all?) of the code and made changes to a lot of the
> C++ code (scary because I'm not a C++ programmer) to do things like wrap
> all C++ functions called from C in a try-catch block, to report errors
> rather than crash the server. I have also changed a bunch of the
> std::vector uses to reserve the needed memory if I know up front what
> they will need. This has helped resolve a bunch of std::bad_alloc errors,
> but all of these were reproducible regardless of the reconnection.
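
A rough sketch of the kind of wrapper described above, for anyone
following along. All names here (solve, solve_wrapper, the error buffer
parameters) are hypothetical and not the actual pgrouting functions; the
only point is that no C++ exception leaves the extern "C" boundary, and
the C caller gets a return code plus a message it can report:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <exception>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical C++ worker standing in for real routing code.
static void solve(std::size_t n, std::vector<int>& out) {
    if (n == 0) throw std::invalid_argument("empty input");
    out.reserve(n);                        // may throw std::bad_alloc
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i));
    }
    assert(out.size() == n);
}

// Hypothetical C-callable entry point: returns 0 on success, -1 on error,
// and writes a message into err so the C side can report it.
extern "C" int solve_wrapper(std::size_t n, int* result_count,
                             char* err, std::size_t err_len) {
    try {
        std::vector<int> out;
        solve(n, out);
        *result_count = static_cast<int>(out.size());
        return 0;
    } catch (const std::bad_alloc&) {
        std::snprintf(err, err_len, "out of memory (std::bad_alloc)");
    } catch (const std::exception& e) {
        std::snprintf(err, err_len, "%s", e.what());
    } catch (...) {
        std::snprintf(err, err_len, "unknown C++ exception");
    }
    return -1;                             // reached only after a catch
}
```

The C caller would then check the return value and raise the message
through the normal PostgreSQL error reporting path, after the C++
frames have already unwound.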
>
> I found some code that I can call that will tell me the status of system
> memory usage. I think I will add it to the catch block and see what it
> reports. I have run this problem against valgrind
> without getting anything useful. And I have added debug output to trace
> down the specific line generating the error, and that was not illuminating.
>
> I hate problems like this.
>
> Thank you for your thoughts and comments. I'll keep plugging away at this.
>
> -Steve
>
>> --strk;
>>
>>>
>>> We use a lot of std::vector structures. These are arrays that get
>>> dynamically extended and as a result have a lot of realloc
>>> equivalents on them, which causes memory fragmentation. Where
>>> possible I have changed the code to reserve a minimum size, which has
>>> helped a lot, but that kind of issue is consistently reproduced
>>> regardless of reconnecting.
>>>
>>> I'll take another run at it with valgrind, but I'm pretty sure this
>>> will not show anything new.
>>>
>>> Thanks,
>>> -Steve
>>>
>>>> -bborie
>>>>
>>>> On Tue, Jun 11, 2013 at 7:33 AM, Stephen Woodbridge
>>>> <woodbri at swoodbridge.com> wrote:
>>>>> Hi devs,
>>>>>
>>>>> I have run into a strange problem with some pgrouting functions
>>>>> that you guys might have already seen in postgis.
>>>>>
>>>>> We have stored procedures that are C and C++ and in general they
>>>>> work fine, but if I create a database, connect to it, create
>>>>> extensions and run some commands I get std::bad_alloc error. If I
>>>>> simply reconnect to the database, the same command does not
>>>>> generate an error, and, in fact, if I reconnect after installing
>>>>> the extension, I never get this error.
>>>>>
>>>>> I have traced this down to the particular statement that is
>>>>> throwing the error, but there is nothing unique or particular
>>>>> about it. And we have seen this behavior in multiple commands.
>>>>>
>>>>> So I have to conclude that:
>>>>>
>>>>> 1. we use the same pattern for most of our commands so that might
>>>>> be flawed in some basic way regarding memory
>>>>>
>>>>> 2. that there is something strange about create extension in that the
>>>>> libraries are not getting initialized correctly (or completely?)
>>>>> until a connection is made. We are trying to verify whether this
>>>>> is unique to pg 9.2.
>>>>>
>>>>> 3. pgrouting installs multiple shared libraries in our extension
>>>>> and maybe postgresql assumes there is only going to be one shared
>>>>> library
>>>>>
>>>>> 4. or something else totally different that we are missing
>>>>>
>>>>> So has anyone seen anything like this with postgis code?
>>>>> Any thoughts on what this might be? or how to run it down?
>>>>>
>>>>> I did post an inquiry to the postgresql list, and Tom responded
>>>>> that there was not enough information and to compile the server with
>>>>> --enable-cassert, which I did (assuming my Debian recompile worked
>>>>> correctly), but since this is a C++ error and not a postgresql palloc
>>>>> issue, we have not seen any cassert errors.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> -Steve
>> _______________________________________________
>> postgis-devel mailing list
>> postgis-devel at lists.osgeo.org
>> http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-devel
>>