[mapserver-users] Mapcache segmentation faults under high load

thomas bonfort thomas.bonfort at gmail.com
Wed Mar 1 00:32:47 PST 2017


Try to get a backtrace from the crashes by compiling mapcache in debug
mode and configuring Apache to allow core dumps. The "stale lock" messages
you are seeing are side effects of the crashes, so that's not where I'd
start investigating.
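
As a rough sketch of what that could look like: the WITH_* flags and the
source directory are taken from your Dockerfile, CMAKE_BUILD_TYPE is the
standard CMake switch for a debug build, and CoreDumpDirectory is the Apache
directive that says where core files go. The core directory and the apache2
binary path used below are assumptions on my side, so adjust them.

```shell
#!/bin/sh
# Sketch only. Paths below are assumptions; adjust them to your container.
SRC_DIR=/usr/local/src/mapcache   # source tree used in the Dockerfile of the original post
CORE_DIR=/tmp/apache2-cores       # hypothetical directory for core files

# Rebuild mapcache with debug symbols (same WITH_* flags as the original build).
# Runs in a subshell so the caller's working directory is left alone.
build_debug() {
    (
        mkdir -p "$SRC_DIR/build" && cd "$SRC_DIR/build" || exit 1
        cmake ../ -DCMAKE_BUILD_TYPE=Debug \
            -DWITH_FCGI=0 -DWITH_APACHE=1 -DWITH_PCRE=0 \
            -DWITH_TIFF=0 -DWITH_BERKELEY_DB=0 -DWITH_MEMCACHE=0 \
            -DWITH_SQLITE=0 -DCMAKE_PREFIX_PATH="/etc/apache2" &&
        make && make install
    )
}

# Print the Apache directive to add to the server configuration.
core_dump_directive() {
    printf 'CoreDumpDirectory %s\n' "$CORE_DIR"
}

# Create the core directory and lift the core size limit.
# Must run in the shell that starts Apache; 1777 is a test-box shortcut,
# the directory only needs to be writable by the user Apache runs as.
enable_coredumps() {
    mkdir -p "$CORE_DIR" && chmod 1777 "$CORE_DIR" || return 1
    ulimit -c unlimited
}

# Print the gdb command that dumps a backtrace of every thread.
# $1 = path to the apache2 binary, $2 = path to the core file.
backtrace_command() {
    printf 'gdb -batch -ex "thread apply all bt full" %s %s\n' "$1" "$2"
}
```

Once a core file shows up, something like
`backtrace_command /usr/sbin/apache2 /tmp/apache2-cores/core` prints the gdb
invocation to run. Inside a container, where core files end up may also
depend on the host's kernel.core_pattern setting, so treat the directory
above as a starting point.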

regards,
thomas

On Tue, Feb 28, 2017 at 6:47 PM Pieter Callewaert <
pieter.callewaert at be-mobile.com> wrote:

> Hi,
>
>
>
> We are trying to add mapcache to our stack because, over the last few
> years, the number of requests on our mapservers has kept increasing, and
> we are looking for smarter ways to scale.
>
> We have a test setup, and under low load everything works perfectly. An ab
> benchmark on a single request also runs without problems.
>
> However, when we replay some production data against the mapcache
> server with 500 concurrent users, we see the following errors in the
> Apache error log:
>
>
>
> [Tue Feb 28 12:50:33.789479 2017] [core:notice] [pid 13:tid
> 140568766519168] AH00051: child pid 793 exit signal Segmentation fault
> (11), possible coredump in /etc/apache2
>
> [Tue Feb 28 12:50:34.791605 2017] [core:notice] [pid 13:tid
> 140568766519168] AH00051: child pid 1384 exit signal Segmentation fault
> (11), possible coredump in /etc/apache2
>
> [Tue Feb 28 12:50:35.792510 2017] [core:notice] [pid 13:tid
> 140568766519168] AH00051: child pid 1346 exit signal Segmentation fault
> (11), possible coredump in /etc/apache2
>
> [Tue Feb 28 12:50:38.795504 2017] [core:notice] [pid 13:tid
> 140568766519168] AH00051: child pid 1338 exit signal Segmentation fault
> (11), possible coredump in /etc/apache2
>
>
>
> And also:
>
> [Tue Feb 28 12:54:49.804880 2017] [:error] [pid 8256:tid 140568096990976]
> [client xxx.xxx.xxx.xxx:59612] deleting a possibly stale lock after waiting
> on it for 30.034 seconds
>
> [Tue Feb 28 12:54:49.815800 2017] [:error] [pid 796:tid 140568273237760]
> [client xxx.xxx.xxx.xxx:59631] deleting a possibly stale lock after waiting
> on it for 30.037 seconds
>
> [Tue Feb 28 12:54:50.052952 2017] [:error] [pid 9502:tid 140568449484544]
> [client xxx.xxx.xxx.xxx:59900] deleting a possibly stale lock after waiting
> on it for 30.064 seconds
>
> [Tue Feb 28 12:54:50.062164 2017] [:error] [pid 1364:tid 140568264845056]
> [client xxx.xxx.xxx.xxx:59951] deleting a possibly stale lock after waiting
> on it for 30.04 seconds
>
> [Tue Feb 28 12:54:54.226499 2017] [:error] [pid 796:tid 140568239666944]
> [client xxx.xxx.xxx.xxx:34636] deleting a possibly stale lock after waiting
> on it for 30.046 seconds
>
> [Tue Feb 28 12:54:55.358419 2017] [:error] [pid 797:tid 140568399128320]
> [client xxx.xxx.xxx.xxx:35329] deleting a possibly stale lock after waiting
> on it for 30.047 seconds
>
> [Tue Feb 28 12:54:55.949560 2017] [:error] [pid 794:tid 140568256452352]
> [client xxx.xxx.xxx.xxx:35639] deleting a possibly stale lock after waiting
> on it for 30.037 seconds
>
> [Tue Feb 28 12:55:00.062597 2017] [:error] [pid 1408:tid 140568122169088]
> [client xxx.xxx.xxx.xxx:38816] deleting a possibly stale lock after waiting
> on it for 30.036 seconds
>
>
>
> In the syslog we see a lot of this:
>
> [51314200.930086] traps: apache2[7324] general protection ip:7f308fc53fe9
> sp:7f3089657b10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
>
> [51314204.940646] traps: apache2[7469] general protection ip:7f308fc53fe9
> sp:7f3082ffcb10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
>
> [51314207.948949] traps: apache2[7626] general protection ip:7f308fc53fe9
> sp:7f307c7efb10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
>
> [51314209.954537] traps: apache2[7748] general protection ip:7f308fc53fe9
> sp:7f30837fdb10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
>
> [51314214.966671] traps: apache2[7954] general protection ip:7f308fc53fe9
> sp:7f3088e56b10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
>
> [51314229.004451] traps: apache2[8562] general protection ip:7f308fc53fe9
> sp:7f307c7efb10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
>
> [51314235.020377] traps: apache2[8798] general protection ip:7f308fc53fe9
> sp:7f307f7f5b10 error:0
>
> [51314235.020395] traps: apache2[8799] general protection ip:7f308fc53fe9
> sp:7f307eff4b10 error:0 in libmapcache.so.1.4.1[7f308fc10000+56000]
>
> [51314235.020535]  in libmapcache.so.1.4.1[7f308fc10000+56000]
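
The traps lines above already narrow things down a little: every crash has
the same instruction pointer, and subtracting the library's load address
from it gives the offset of the faulting instruction inside
libmapcache.so.1.4.1. A small sketch of that arithmetic follows. The library
install path is an assumption, and the lookup assumes the mapping shown in
the log starts at file offset 0, which is typical but not guaranteed.

```shell
#!/bin/sh
# Values copied from the "traps:" lines in the original post.
IP=0x7f308fc53fe9     # faulting instruction pointer
BASE=0x7f308fc10000   # load address of libmapcache.so.1.4.1

# Offset of the faulting instruction inside the shared library.
# $1 = instruction pointer, $2 = load address of the library.
fault_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

OFFSET=$(fault_offset "$IP" "$BASE")
echo "offset inside libmapcache: $OFFSET"   # prints 0x43fe9

# Assumed install path (CMake's default /usr/local prefix).
LIB=/usr/local/lib/libmapcache.so.1.4.1

# Resolve the offset to a function name only if the tools and file exist.
if [ -f "$LIB" ] && command -v addr2line >/dev/null 2>&1; then
    addr2line -f -C -e "$LIB" "$OFFSET"
fi
```

The lookup is only meaningful against the exact binary that crashed, and
without debug info addr2line will mostly print "??", so a full backtrace
from a debug build remains the more useful result.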
>
>
>
> We compile and run Apache/mapcache in a Docker container (base image 14:04,
> run with --net host), compiled with these parameters:
>
>
>
> # Install Mapcache itself
> ADD https://github.com/mapserver/mapcache/archive/rel-1-4-1.tar.gz /
>
> # Compile Mapcache for Apache
> RUN mkdir -p /usr/local/src/mapcache && \
>     tar xf rel-1-4-1.tar.gz -C /usr/local/src/mapcache --strip-components=1 && \
>     mkdir /usr/local/src/mapcache/build && \
>     cd /usr/local/src/mapcache/build && \
>     cmake ../ \
>     -DWITH_FCGI=0 -DWITH_APACHE=1 -DWITH_PCRE=0 \
>     -DWITH_TIFF=0 -DWITH_BERKELEY_DB=0 -DWITH_MEMCACHE=0 \
>     -DWITH_SQLITE=0 -DCMAKE_PREFIX_PATH="/etc/apache2" && \
>     make && \
>     make install
>
>
>
> What we’ve tried:
>
> - Change the MPM from event to worker, and play with the tuning options
> - Change the cache and/or locking directory from tmpfs to normal disks
> - Disable all unneeded options in the compile
>
>
>
> Mapcache.xml:
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <!-- see the accompanying mapcache.xml.sample for a fully commented configuration file -->
>
> <mapcache>
>    <cache name="disk" type="disk">
>       <base>/tmp/mapcache/</base>
>       <symlink_blank/>
>    </cache>
>    <source name="LOS-all" type="wms">
>       <getmap>
>          <params>
>             <FORMAT>image/png</FORMAT>
>             <LAYERS>layer1,layer2,layer3</LAYERS>
>             <MAP>/maps/LOS.map</MAP>
>          </params>
>       </getmap>
>
>       <http>
>          <url>http://localhost:81/ms</url>
>       </http>
>    </source>
>    <tileset name="LOS-all">
>       <source>LOS-all</source>
>       <cache>disk</cache>
>       <grid>WGS84</grid>
>       <grid>GoogleMapsCompatible</grid>
>       <format>PNG</format>
>       <metatile>5 5</metatile>
>       <metabuffer>10</metabuffer>
>       <expires>60</expires>
>       <auto_expire>60</auto_expire>
>    </tileset>
>
>    <default_format>PNG</default_format>
>
>    <service type="wms" enabled="false">
>       <full_wms>assemble</full_wms>
>       <resample_mode>bilinear</resample_mode>
>       <format>JPEG</format>
>       <maxsize>4096</maxsize>
>    </service>
>    <service type="wmts" enabled="true"/>
>    <service type="tms" enabled="true"/>
>    <service type="kml" enabled="true"/>
>    <service type="gmaps" enabled="true"/>
>    <service type="ve" enabled="false"/>
>    <service type="mapguide" enabled="false"/>
>    <service type="demo" enabled="false"/>
>
>    <errors>report</errors>
>    <locker type="disk">
>       <directory>/tmp</directory>
>       <timeout>30</timeout>
>    </locker>
>
> </mapcache>
>
>
>
> Any idea what I can do to find out exactly what is going wrong?
>
> Thanks in advance!
>
>
>
> Kind regards,
>
> Pieter Callewaert
>
>
>
>
> _______________________________________________
> mapserver-users mailing list
> mapserver-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/mapserver-users