[mapserver-users] Mapserver Storage

Stephen Woodbridge woodbri at swoodbridge.com
Fri Jan 28 15:38:40 EST 2011


On 1/28/2011 1:19 PM, Mark Korver wrote:
> There is a lot of analysis of EBS IO performance out there, like
> http://orion.heroku.com/past/2009/7/29/io_performance_on_ebs/
>
> But I think my earlier question, about the purpose of this mapserver
> system, needs to be addressed before we can go further.
>
> For example, if you are reading a lot of layers/shapefiles to render
> PNGs, that's one thing. If you are only interested in performance on
> image-based data, maybe just one giant layer of uncompressed GTiffs,
> that's another. It doesn't matter if you have the best IO if you don't
> have the data optimized.

I think you will find the best performance can be had by using 
uncompressed GTiff files that are internally tiled. In GDAL, use the -co 
"TILED=YES" creation option (for example with gdal_translate). The size 
of the GTiff matters less, because if it is internally tiled, access to 
the data is extremely efficient.
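
As a minimal sketch (the filenames and block sizes are placeholders, and
the overview step is optional), the conversion would look something like
this:

  # Rewrite the raster as an internally tiled, uncompressed GTiff
  gdal_translate -of GTiff -co TILED=YES -co BLOCKXSIZE=256 \
      -co BLOCKYSIZE=256 input.tif tiled.tif

  # Build internal overviews so zoomed-out reads stay cheap
  gdaladdo -r average tiled.tif 2 4 8 16 32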

Another option you might want to consider, if your data is relatively 
static, is to generate image tiles, like Google does, and precache all 
the tiles. These can be saved as JPEG files so you get some compression, 
and you do not need mapserver to handle each image request. Using tiles, 
I was not able to create a backlog in apache serving them, because I 
always ran into bandwidth limits before I could overload a single apache 
instance. This was based on using an EqualLogic SAN.
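
One way to pre-generate such a tile pyramid (a sketch only; the zoom
range and paths are placeholders, and note that gdal2tiles.py writes PNG
tiles, so JPEG output would need a different seeder or a conversion
step) is GDAL's gdal2tiles.py:

  # Cut a TMS tile pyramid for zoom levels 0-12 from the tiled GTiff
  gdal2tiles.py -z 0-12 tiled.tif /var/www/tiles/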

For another client, I created a mapserver instance on Amazon that was 
set up so the tiling tasks could be partitioned spatially. They then 
cloned something like 10 of those instances and finished the tiling in a 
couple of days. If I recall correctly, they bought USB drives, copied 
the tiles onto them from the Amazon instances, and had the drives mailed 
to their data center.
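
The spatial partitioning can be as simple as handing each clone its own
bounding box. A sketch (the coordinates and filenames are hypothetical):

  # On instance N: clip this instance's region, then tile only that piece
  gdal_translate -projwin -100.0 45.0 -90.0 35.0 tiled.tif region_n.tif
  gdal2tiles.py -z 0-12 region_n.tif tiles_region_n/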

-steve

> On Fri, Jan 28, 2011 at 11:53 AM, Paul Spencer<pspencer at dmsolutions.ca>  wrote:
>> I don't have actual numbers handy to back this up, but a rough
>> comparison from what we observed rendering relatively complex maps was
>> that EBS storage was perhaps 2-4 times as slow as a local Dell desktop
>> running Linux with eSATA 1TB drives (not high-end hardware for sure).
>> Using glusterfs was 8+ times as slow. We ended up using EBS since we
>> could fit all our data onto a 1TB disk; it replicates reasonably
>> quickly once snapshotted, so to scale we start a new instance and
>> replicate the EBS volume behind a load balancer. Adding more servers
>> and copies of the data makes up for the slower IO speed somewhat, but
>> every map draw still takes 2-4 times longer than on dedicated
>> hardware. On the other hand, it is very cost effective for scaling.
>> But for the amount of data that you are talking about, scaling by
>> duplicating the EBS volumes would be very expensive, and I really
>> think shared storage via glusterfs is a non-starter for that volume of
>> data if you want any kind of reasonable render time.
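
The snapshot-and-replicate workflow Paul describes can be scripted with
the EC2 API tools; a rough sketch (the volume, snapshot, and instance
IDs, the zone, and the device name are all placeholders):

  # Snapshot the source EBS data volume
  ec2-create-snapshot vol-aaaa1111

  # Create a fresh volume from that snapshot in the target zone
  ec2-create-volume --snapshot snap-bbbb2222 -z us-east-1a

  # Attach the new volume to the newly launched rendering instance
  ec2-attach-volume vol-cccc3333 -i i-dddd4444 -d /dev/sdf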
>>
>>
>> On 2011-01-28, at 10:57 AM, tigana.fluens at gmail.com wrote:
>>
>>> What exactly do you mean by pathetic I/O on S3/EBS?
>>
>>
>> __________________________________________
>>
>>    Paul Spencer
>>    Chief Technology Officer
>>    DM Solutions Group Inc
>>    http://research.dmsolutions.ca/
>>
>>
> _______________________________________________
> mapserver-users mailing list
> mapserver-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/mapserver-users


