[mapserver-dev] Speed in accessing World .wld files varies across disk systems

Chris Galli (XC Skies) cgalli at xcskies.com
Fri Oct 17 19:16:50 EDT 2008


That's exactly why I came up with such a bold theory :) My Python code 
is very simple and targets the exact same files. Fetching an arbitrary 
256x256 tile from any of several layers takes about 0.08 seconds with my 
code. The exact same mapserv request took roughly 0.3 seconds in this 
new (slower) environment.

When multiple tile requests are made via HTTP, mapserv slows down 
considerably, in some instances taking over 3 seconds per tile request. 
This is likely due to competition for disk IO resources, but my custom 
Python code running as a listening HTTP server consistently yields 0.07 
to 0.09 seconds per tile for identical simultaneous requests. This is 
why I'm still left scratching my head.
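
For anyone who wants to reproduce the comparison, here is a rough sketch 
of how one might time a direct open of a known file as the directory 
fills up. It is not my tile server and not what mapserv does. The file 
names (filler_*.png, world.wld) and the function names are made up for 
illustration, and the test runs against a warm OS cache, so it only 
shows the cost of a targeted open, not cold-disk behaviour.

```python
import os
import tempfile
import time


def time_direct_open(directory, filename, repeats=200):
    """Return the mean seconds taken to open and read one known file."""
    path = os.path.join(directory, filename)
    start = time.perf_counter()
    for _ in range(repeats):
        with open(path, "rb") as handle:
            handle.read()
    return (time.perf_counter() - start) / repeats


def populate(directory, count):
    """Create `count` small filler files plus a hypothetical world.wld."""
    for i in range(count):
        with open(os.path.join(directory, "filler_%06d.png" % i), "wb") as f:
            f.write(b"x")
    with open(os.path.join(directory, "world.wld"), "w") as f:
        f.write("1.0\n0.0\n0.0\n-1.0\n0.0\n0.0\n")


def compare(counts=(10, 1000)):
    """Map each directory size to the mean open time of world.wld."""
    results = {}
    for count in counts:
        with tempfile.TemporaryDirectory() as tmp:
            populate(tmp, count)
            results[count] = time_direct_open(tmp, "world.wld")
    return results


if __name__ == "__main__":
    for count, seconds in compare().items():
        print("%6d files: %.6f s per open" % (count, seconds))
```

If the per-open time stays roughly flat as the file count grows, the 
directory size alone is probably not the cause on that disk system.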

My temporary "solution" was to create more directories on the file 
system, one per day, and store a limited number of files in each, 
e.g. 20080901/*files* and 20080902/*files*. I only keep a rotating 
archive of 2 weeks, but this was the best I could do for now.
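
The per-day layout can be sketched roughly as follows. This is a 
minimal illustration, not my production code: the function names 
(day_directory, store, prune) and the keep_days parameter are 
assumptions, and it simply treats any directory whose name parses as 
YYYYMMDD as a day directory.

```python
import datetime
import os
import shutil


def day_directory(root, day):
    """Return the per-day directory path, e.g. root/20080901."""
    return os.path.join(root, day.strftime("%Y%m%d"))


def store(root, day, filename, data):
    """Write `data` into the directory for `day`, creating it if needed."""
    target = day_directory(root, day)
    os.makedirs(target, exist_ok=True)
    path = os.path.join(target, filename)
    with open(path, "wb") as handle:
        handle.write(data)
    return path


def prune(root, today, keep_days=14):
    """Delete day directories older than `keep_days`; return removed names."""
    cutoff = today - datetime.timedelta(days=keep_days)
    removed = []
    for name in sorted(os.listdir(root)):
        full = os.path.join(root, name)
        if not os.path.isdir(full):
            continue  # only directories are candidates for removal
        try:
            day = datetime.datetime.strptime(name, "%Y%m%d").date()
        except ValueError:
            continue  # not a day directory, leave it alone
        if day < cutoff:
            shutil.rmtree(full)
            removed.append(name)
    return removed
```

Calling prune() once a day with today's date keeps the archive at two 
weeks while each directory holds only that day's files.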

Thanks for your thoughts. I'll keep thinking about how else to debug this.

-Chris

Paul Ramsey wrote:
> Note that "lots of files in a directory" is a common performance
> anti-pattern. The fopen() has to linearly scan the directory contents
> to find the file requested, so too many files means too much seeking.
> However, you should see about the same performance fall off for
> Mapserver as for any other program doing file opens, and you haven't
> described that kind of behavior.
>
> P.
>
> On Fri, Oct 17, 2008 at 3:55 PM, Chris Galli (XC Skies)
> <cgalli at xcskies.com> wrote:
>   
>> Thanks for the post Paul. I can clearly see what you've pointed out so I can
>> chuck that theory out the window. I'll dig deeper into the intricacies of my
>> different disks again and see what I can shake out.
>>
>> -Chris
>>
>> Paul Ramsey wrote:
>>     
>>> Here's the function in question:
>>>
>>> http://trac.osgeo.org/mapserver/browser/trunk/mapserver/mapraster.c#L342
>>>
>>> As you can see, it doesn't do a directory search, though it does work
>>> its way through a number of possible extension options. Note that
>>> "wld" is the *first* option though, so that's not your problem.
>>>
>>> P.
>>>
>>> On Fri, Oct 17, 2008 at 3:29 PM, Chris Galli <cgalli at xcskies.com> wrote:
>>>
>>>       
>>>> Hi Everyone,
>>>>
>>>> I know the above statement seems like it deserves an obvious answer, so
>>>> first let me say that I understand the complexities of disk
>>>> implementations enough to realize that speed depends on a tremendous
>>>> number of factors and so cannot be easily discussed in terms of
>>>> absolutes when comparing different disk systems. With that said,
>>>> however, I'm seeing behaviour that leads me to believe the discovery
>>>> process for .wld files can be improved in mapserv. I've tested with
>>>> V 4.10 and 5.2 and they produce identical results.
>>>>
>>>> Here's the crux:
>>>> When rendering raster images (say png files) which use .wld world files
>>>> via the cgi interface, I get wildly different response times on
>>>> different linux systems. After a lengthy discovery process of why this
>>>> was, I have come to the conclusion that mapserv is probably not
>>>> targeting wld files directly on the file system, and instead looking
>>>> for matching wld files for raster images by using some type of 'wild
>>>> card' or other inefficient scan of the file's current directory.
>>>>
>>>> For example, if I place a single raster png file called world.png with a
>>>> world.wld in an empty directory and turn on mapserver debug, response
>>>> times seem reasonable. As I increase the number of files within the
>>>> directory, the mapserv raster rendering becomes increasingly slow
>>>> (asking for a single 256x256 tile from a 1MB png file). When I perform
>>>> the same test on another system, I barely see a slowdown in
>>>> performance. Why? Because one disk system is much more robust with
>>>> directory caching and disk-to-memory hardware. Fair enough. But when I
>>>> run the same tests on tiff files, both systems produce identical
>>>> results to within a few milliseconds. This implies that wld files are
>>>> likely not being targeted efficiently.
>>>>
>>>> In addition to the above, I have some custom python code that accesses
>>>> the exact same png raster files and serves them up at the exact extents
>>>> and tile size as does mapserv using the GD libs. And that code was
>>>> actually returning tiles faster on the system on which mapserv was
>>>> running so poorly. My code expects a file to exist and so does not need
>>>> to 'discover' it, making the process much more efficient.
>>>>
>>>> Does anyone know or suspect that the above is true? If so, how does one
>>>> go about providing more details and elevating this to a potential
>>>> change/enhancement?
>>>>
>>>> Thanks!
>>>>
>>>> -Chris
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Speed-in-accessing-World-.wld-files-varies-across-disk-systems-tp20042027p20042027.html
>>>> Sent from the Mapserver - Dev mailing list archive at Nabble.com.


