[mapserver-dev] Trying to improve perfs of MapServer with PostGIS
Daniel Morissette
dmorissette at mapgears.com
Fri Dec 11 04:38:56 PST 2015
Hi Patrick,
We ran into similar behavior with a very similar setup on Amazon a
little while ago. In our case we had a dedicated DB server with SSD
(like yours) and multiple rendering nodes connecting to it. Adding more
nodes, or more powerful ones, still resulted in constant throughput,
with the additional rendering nodes' CPUs sitting mostly idle. After
much hair pulling we finally found out that we were hitting the 1 Gbps
limit of the network connection between the DB and the rendering nodes.
I suspect you are running into the same thing here: if you can verify
that the original (hex) response is about twice the size of the binary
one, that would support this explanation. We used PgBouncer to look at
the volume of data transferred by the DB server; it produces output
like this:
2014-06-06 18:49:44.597 27891 LOG Stats: 852 req/s, in 435114 b/s, out 83424883 b/s,query 7133 us
2014-06-06 18:50:44.597 27891 LOG Stats: 818 req/s, in 418046 b/s, out 79177139 b/s,query 5761 us
2014-06-06 18:52:44.598 27891 LOG Stats: 898 req/s, in 458041 b/s, out 88441708 b/s,query 7949 us
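To turn those stats lines into a link utilisation figure, something like the following sketch could be used. It assumes the "b/s" values are bytes per second and that the log lines have exactly the layout shown above; the function and field names (parse_stats, out_mbits, in_bps, ...) are made up for this example.

```python
import re

# Matches the "Stats:" part of a PgBouncer log line as shown above.
# Assumption: the layout is exactly "N req/s, in N b/s, out N b/s,query N us".
STATS_RE = re.compile(
    r"Stats:\s*(?P<req>\d+) req/s,\s*"
    r"in (?P<in_bps>\d+) b/s,\s*"
    r"out (?P<out_bps>\d+) b/s,\s*"
    r"query (?P<query_us>\d+) us"
)


def parse_stats(line):
    """Return the four counters of a stats line as ints, or None if no match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}


def out_mbits(stats):
    """Outgoing volume in Mbit/s, assuming 'b/s' means bytes per second."""
    return stats["out_bps"] * 8 / 1e6


SAMPLE = ("2014-06-06 18:49:44.597 27891 LOG Stats: 852 req/s, "
          "in 435114 b/s, out 83424883 b/s,query 7133 us")

if __name__ == "__main__":
    stats = parse_stats(SAMPLE)
    print(f"{stats['req']} req/s, out {out_mbits(stats):.1f} Mbit/s")
```

Feeding it the PgBouncer log while a test is running shows whether the "out" figure stays flat as the number of parallel requests increases.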
Then to really confirm it, look at the number of hits per second in each
test case (instead of the processing time per request). You will likely
notice that it is constant for a given zoom level once you hit the
network limit, no matter how many parallel requests you have. This is
because the number of hits per second that can be served is limited by
the volume of data per second that the DB server can deliver to
MapServer (around 800Mbits/sec for a 1 Gbps connection).
In our case, if I remember correctly, we used to hit a limit of about 40
maps/second, where each map was averaging 2MB of data... so we were
capping at 40 hits x 2MB ~= 80MBytes per second from the DB (or
~800Mbits/sec) ... which is, as I wrote above, in line with a 1Gbps limit.
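The same back-of-the-envelope calculation can be written as a small sketch. The helper names are made up for this example, 1 MB is taken as 10^6 bytes, and the 800 Mbit/s usable bandwidth is only the rough figure mentioned above, not a measured value.

```python
def data_rate_mbits(hits_per_second, bytes_per_map):
    """Payload volume in Mbit/s for a given hit rate, counting 8 bits per byte."""
    return hits_per_second * bytes_per_map * 8 / 1e6


def max_hits_per_second(usable_bits_per_second, bytes_per_map):
    """Upper bound on the hit rate when the DB link is the bottleneck."""
    return usable_bits_per_second / 8 / bytes_per_map


if __name__ == "__main__":
    map_size = 2_000_000  # about 2MB per map, as in our case

    # 40 maps/s at 2MB each: payload pulled from the DB server
    print(data_rate_mbits(40, map_size), "Mbit/s")

    # Ceiling on maps/s if about 800 Mbit/s of the link is usable (assumption)
    print(max_hits_per_second(800e6, map_size), "maps/s")
```

Counting strictly 8 bits per byte, 80 MBytes/sec is 640 Mbit/s of payload, which is in the same ballpark as the rough ~800 Mbit/s figure once protocol overhead is allowed for. Either way, the ceiling depends only on the link and the map size, not on the number of rendering nodes.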
Good luck!
Daniel
On 2015-12-11 4:13 AM, Patrick Valsecchi wrote:
> Hi,
>
> I was a bit disappointed by the performance of reading features from
> PostGIS compared to reading them from a local Shapefile. So I poked
> around in the code and saw that geometries were fetched from the
> DB as hex-encoded EWKB.
>
> So I went ahead and changed the code to use libpq's binary mode when we
> are fetching features. To avoid conversion nightmares when fetching
> attributes, I'm tricking Postgres into sending them in their textual
> form. That way, the behavior of MapServer should not change regarding
> the other columns.
>
> The patch can be found in github [1].
>
> General testing algorithm:
>   for each zoom level
>     start N threads at the same time, each doing:
>       for each iteration
>         compute a random BBOX
>         get the 1000x1000 image (read the content, then discard it)
>     wait for all the threads to stop
>     collect the stats from the threads (throwing out the slowest
>     and the fastest result)
>
> The DB is on a db.m4.large from Amazon with SSD disk. Mapserver is on
> apache with fcgid (up to 20 processes) and runs on a m4.xlarge. The
> measures are taken from another Amazon machine to avoid hitting the
> bandwidth limit too fast.
>
> Times are for one query from one thread and are given in milliseconds
> along with their standard deviation. Each time, I did a full run before
> taking measures.
>
> The data used is the Swiss villages (2479 features in total) with
> polygon contours in a 1.7MB shapefile [2], imported as is into PostGIS.
>
> nbThreads=1 nbIterations=20
> zoom level           1.00           4.00          16.00          32.00
> original       964 ±   14     222 ±  126      66 ±   18      68 ±   19
> binary         807 ±   13     194 ±  111      87 ±   28      79 ±   24
> shapefile      554 ±   94     187 ±  107      72 ±   26      56 ±    3
>
>
> nbThreads=5 nbIterations=20
> zoom level           1.00           4.00          16.00          32.00
> original      3686 ±  946     403 ±  264      84 ±   37      70 ±   22
> binary        1710 ±  486     340 ±  242     105 ±   59      89 ±   32
> shapefile      519 ±  225     278 ±  166      91 ±   58      80 ±   34
>
> nbThreads=10 nbIterations=20
> zoom level           1.00           4.00          16.00          32.00
> original      7287 ± 1936     800 ±  575     119 ±   79     110 ±   81
> binary        3737 ±  647     471 ±  294     123 ±   70     110 ±   54
> shapefile      884 ±  241     412 ±  269     111 ±  119      98 ±   57
>
> nbThreads=20 nbIterations=20
> zoom level           1.00           4.00          16.00          32.00
> original     14969 ± 2507    1643 ± 1221     239 ±  231     166 ±  103
> binary        7649 ±  730     857 ±  576     210 ±  121     181 ±   77
> shapefile     1455 ±  438     483 ±  326     143 ±   97     126 ±   75
>
> What is what:
>
> * original: Mapserver 7.0 (git 157fa47)
> * binary: Same as above with a small patch to configure libpq to use
> binary transfer.
> * shapefile: Same as original, but using a shapefile on the local disk
> (just here for comparison).
>
>
> We can see that when the machine gets parallel queries, we quickly gain
> a factor of 2 in performance when there are a lot of features (low zoom
> level).
> There is no measurable negative impact at higher zoom levels and lower
> loads.
>
> Now, what do you guys think? Do you see any risk? Or should I do a pull
> request?
>
> Thanks.
>
>
> [1]
> https://github.com/pvalsecc/mapserver/commit/45bd3d5795c9108618c37cc8c7472809cff54d16
> [2]
> http://www.swisstopo.admin.ch/internet/swisstopo/en/home/products/landscape/swissBOUNDARIES3D.html
>
>
>
> _______________________________________________
> mapserver-dev mailing list
> mapserver-dev at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/mapserver-dev
>
--
Daniel Morissette
http://www.mapgears.com/
T: +1 418-696-5056 #201
http://evouala.com/ - Location Intelligence Made Easy