[mapserver-users] Mapcache segmentation faults under high load

Pieter Callewaert pieter.callewaert at be-mobile.com
Tue Feb 28 05:43:32 PST 2017


Hi,

We are trying to add mapcache to our stack because the number of requests on our mapservers has kept increasing over the last few years, and we are looking for smarter ways to scale.
We have a test setup, and under low load everything works perfectly. An ab benchmark on a single request also shows no problems.
However, when we replay some production data against the mapcache server with 500 concurrent users, we see the following errors in the Apache error log:

[Tue Feb 28 12:50:33.789479 2017] [core:notice] [pid 13:tid 140568766519168] AH00051: child pid 793 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Tue Feb 28 12:50:34.791605 2017] [core:notice] [pid 13:tid 140568766519168] AH00051: child pid 1384 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Tue Feb 28 12:50:35.792510 2017] [core:notice] [pid 13:tid 140568766519168] AH00051: child pid 1346 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Tue Feb 28 12:50:38.795504 2017] [core:notice] [pid 13:tid 140568766519168] AH00051: child pid 1338 exit signal Segmentation fault (11), possible coredump in /etc/apache2

And also:
[Tue Feb 28 12:54:49.804880 2017] [:error] [pid 8256:tid 140568096990976] [client xxx.xxx.xxx.xxx:59612] deleting a possibly stale lock after waiting on it for 30.034 seconds
[Tue Feb 28 12:54:49.815800 2017] [:error] [pid 796:tid 140568273237760] [client xxx.xxx.xxx.xxx:59631] deleting a possibly stale lock after waiting on it for 30.037 seconds
[Tue Feb 28 12:54:50.052952 2017] [:error] [pid 9502:tid 140568449484544] [client xxx.xxx.xxx.xxx:59900] deleting a possibly stale lock after waiting on it for 30.064 seconds
[Tue Feb 28 12:54:50.062164 2017] [:error] [pid 1364:tid 140568264845056] [client xxx.xxx.xxx.xxx:59951] deleting a possibly stale lock after waiting on it for 30.04 seconds
[Tue Feb 28 12:54:54.226499 2017] [:error] [pid 796:tid 140568239666944] [client xxx.xxx.xxx.xxx:34636] deleting a possibly stale lock after waiting on it for 30.046 seconds
[Tue Feb 28 12:54:55.358419 2017] [:error] [pid 797:tid 140568399128320] [client xxx.xxx.xxx.xxx:35329] deleting a possibly stale lock after waiting on it for 30.047 seconds
[Tue Feb 28 12:54:55.949560 2017] [:error] [pid 794:tid 140568256452352] [client xxx.xxx.xxx.xxx:35639] deleting a possibly stale lock after waiting on it for 30.037 seconds
[Tue Feb 28 12:55:00.062597 2017] [:error] [pid 1408:tid 140568122169088] [client xxx.xxx.xxx.xxx:38816] deleting a possibly stale lock after waiting on it for 30.036 seconds

In the syslog we see a lot of this:
[51314200.930086] traps: apache2[7324] general protection ip:7f308fc53fe9 sp:7f3089657b10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
[51314204.940646] traps: apache2[7469] general protection ip:7f308fc53fe9 sp:7f3082ffcb10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
[51314207.948949] traps: apache2[7626] general protection ip:7f308fc53fe9 sp:7f307c7efb10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
[51314209.954537] traps: apache2[7748] general protection ip:7f308fc53fe9 sp:7f30837fdb10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
[51314214.966671] traps: apache2[7954] general protection ip:7f308fc53fe9 sp:7f3088e56b10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
[51314229.004451] traps: apache2[8562] general protection ip:7f308fc53fe9 sp:7f307c7efb10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
[51314235.020377] traps: apache2[8798] general protection ip:7f308fc53fe9 sp:7f307f7f5b10 error:0
[51314235.020395] traps: apache2[8799] general protection ip:7f308fc53fe9 sp:7f307eff4b10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
[51314235.020535]  in libmapcache.so.1.4.1[7f308fc10000+56000]
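
Every trap above reports the same instruction pointer and the same load address for libmapcache.so.1.4.1, so the crash appears to happen at one fixed location in the library. A minimal sketch of working out that location as an offset into the library, using only the values from the lines above (the variable names are just for illustration):

```shell
# Values copied from the "traps:" lines above.
ip=0x7f308fc53fe9      # faulting instruction pointer
base=0x7f308fc10000    # load address of libmapcache.so.1.4.1
size=0x56000           # size of the mapping

# Offset of the faulting instruction inside the shared library.
offset=$(printf '0x%x' $(( $ip - $base )))
echo "offset in libmapcache.so.1.4.1: $offset"   # prints 0x43fe9

# Sanity check: the offset must fall inside the mapped region.
if [ $(( $ip - $base )) -lt $(( $size )) ]; then
    echo "offset lies inside the mapping"
fi
```

Assuming the library is built with debug symbols and not stripped, an offset like this can usually be resolved to a function and source line with a tool such as addr2line (`addr2line -f -e <path to libmapcache.so.1.4.1> 0x43fe9`).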

We compile and run Apache and mapcache in a Docker container (base image 14:04, run with -net host), compiling with these parameters:

# Install Mapcache itself
ADD https://github.com/mapserver/mapcache/archive/rel-1-4-1.tar.gz /
# Compile Mapcache for Apache
RUN mkdir -p /usr/local/src/mapcache && \
    tar xf rel-1-4-1.tar.gz -C /usr/local/src/mapcache --strip-components=1 && \
    mkdir /usr/local/src/mapcache/build && \
    cd /usr/local/src/mapcache/build && \
    cmake ../ \
    -DWITH_FCGI=0 -DWITH_APACHE=1 -DWITH_PCRE=0 \
    -DWITH_TIFF=0 -DWITH_BERKELEY_DB=0 -DWITH_MEMCACHE=0 \
    -DWITH_SQLITE=0 -DCMAKE_PREFIX_PATH="/etc/apache2" && \
    make && \
    make install

What we've tried:

- Changing the MPM from event to worker, and playing with the tuning options.
- Changing the cache and/or locking directory from tmpfs to normal disks.
- Disabling all unneeded options at compile time.

Mapcache.xml:
<?xml version="1.0" encoding="UTF-8"?>

<!-- see the accompanying mapcache.xml.sample for a fully commented configuration file -->

<mapcache>
   <cache name="disk" type="disk">
      <base>/tmp/mapcache/</base>
      <symlink_blank/>
   </cache>
   <source name="LOS-all" type="wms">
      <getmap>
         <params>
            <FORMAT>image/png</FORMAT>
            <LAYERS>layer1,layer2,layer3</LAYERS>
            <MAP>/maps/LOS.map</MAP>
         </params>
      </getmap>

      <http>
         <url>http://localhost:81/ms</url>
      </http>
   </source>
   <tileset name="LOS-all">
      <source>LOS-all</source>
      <cache>disk</cache>
      <grid>WGS84</grid>
      <grid>GoogleMapsCompatible</grid>
      <format>PNG</format>
      <metatile>5 5</metatile>
      <metabuffer>10</metabuffer>
      <expires>60</expires>
      <auto_expire>60</auto_expire>
   </tileset>

   <default_format>PNG</default_format>

   <service type="wms" enabled="false">
      <full_wms>assemble</full_wms>
      <resample_mode>bilinear</resample_mode>
      <format>JPEG</format>
      <maxsize>4096</maxsize>
   </service>
   <service type="wmts" enabled="true"/>
   <service type="tms" enabled="true"/>
   <service type="kml" enabled="true"/>
   <service type="gmaps" enabled="true"/>
   <service type="ve" enabled="false"/>
   <service type="mapguide" enabled="false"/>
   <service type="demo" enabled="false"/>

   <errors>report</errors>
   <locker type="disk">
      <directory>/tmp</directory>
      <timeout>30</timeout>
   </locker>

</mapcache>

Any idea what I can do to find out exactly what is going wrong?
Thanks in advance!

Kind regards,
Pieter Callewaert

