[mapserver-users] How to tile a large TIF image?

Ed McNierney ed at mcnierney.com
Fri Apr 18 07:55:42 PDT 2008


Dejan -

I NEVER say "always"...

But no, I do not think one could state that one approach (a single TIFF with
overviews vs. multiple TIFFs at different resolutions) is ALWAYS better than
the other.  There are a number of factors involved, some of which depend on
your filesystem and disk hardware characteristics.

There isn't usually much "overhead" in opening a file by itself.  What's
expensive is a random seek from one place on a disk to another - that's VERY
slow compared to almost anything else you can do on your machine except
typing.  If you have a single very large TIFF file with overviews, GDAL will
need to read the TIFF directory, locate the overview, and seek to it to read
it.  That's really not very much different from reading the disk directory,
locating the file, and seeking to it to read it in the other case.  A very
large file is more likely to be fragmented into different pieces across the
disk, which will require more seeks and slow things down.  And a large,
fragmented directory will slow down opening one file due to lots of disk
seeks.
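
As a rough illustration of why seeks matter more than the number of files
opened, here is a back-of-the-envelope sketch in Python.  The 10 ms seek
time, the 50 MB/s transfer rate and the seek counts are all assumptions
picked for illustration, not measurements; substitute figures for your own
disks.

```python
def read_time_ms(seeks, megabytes, seek_ms=10.0, throughput_mb_s=50.0):
    """Rough time for one read: seek cost plus sequential transfer cost.

    seek_ms and throughput_mb_s are assumed, illustrative disk figures.
    """
    return seeks * seek_ms + megabytes * 1000.0 / throughput_mb_s


# The same 4 MB of pixels fetched three ways (seek counts are hypothetical).
scenarios = {
    "one contiguous tile": read_time_ms(seeks=2, megabytes=4),
    "one fragmented file": read_time_ms(seeks=10, megabytes=4),
    "16 small files": read_time_ms(seeks=32, megabytes=4),
}

if __name__ == "__main__":
    for name, ms in scenarios.items():
        print(f"{name}: about {ms:.0f} ms")
```

Under these assumptions the amount of data transferred is identical in all
three cases, so any difference in the totals comes entirely from the seeks.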

I'm sure there are scenarios in which one approach is faster than the other,
but I would not worry about this factor right now.  The most important
things to do are:

1. Ensure there is a pre-built overview that matches the resolution
MapServer will request.  If possible, constrain the UI so that only a
predefined set of view scales is available, and build overviews at all of
those scales.

2. Divide large files into tiles (either internally or externally) that are
roughly comparable to the size of the output map image, or somewhat larger.
It is inefficient to read a lot of small files to create one map, and it is
also inefficient to parse through a single large (untiled) file to extract a
small area to create one map.

3. Don't put a large number of files into a single directory.  On many
filesystems this will cause the directory to be stored in separate pieces
across the disk, and this can turn "just open the file" into "lots of disk
seeks".  If you have more than a few thousand files or so in one directory
(that's a very rough guess), think about reorganizing the data.
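
The three points above can be sketched in Python as follows.  This is only
a planning aid under stated assumptions: it prints gdal_translate commands
rather than running them, the raster size (5000x3000), the 30-unit native
pixel size and the 2048-pixel tile size are hypothetical, and the
row_NNNN/tile_NNNN_NNNN.tif layout is just one possible way to keep
directories small.  The output directories would need to exist before the
commands are run.

```python
import math


def overview_factors(native_res, view_resolutions):
    """Integer decimation factors (the kind gdaladdo takes) for fixed view scales.

    native_res and view_resolutions are in map units per pixel.
    """
    factors = set()
    for res in view_resolutions:
        factor = round(res / native_res)
        if factor >= 2:  # a factor of 1 is the full-resolution image itself
            factors.add(factor)
    return sorted(factors)


def tile_windows(width, height, tile_size):
    """Yield (col, row, xoff, yoff, xsize, ysize) windows covering the raster."""
    for row in range(math.ceil(height / tile_size)):
        for col in range(math.ceil(width / tile_size)):
            xoff = col * tile_size
            yoff = row * tile_size
            # Edge tiles are clipped to the raster bounds.
            yield (col, row, xoff, yoff,
                   min(tile_size, width - xoff),
                   min(tile_size, height - yoff))


def tile_commands(src, out_dir, width, height, tile_size):
    """Build one gdal_translate command per tile, one subdirectory per tile row."""
    commands = []
    for col, row, xoff, yoff, xsize, ysize in tile_windows(width, height, tile_size):
        # Hypothetical naming scheme: a directory per row limits files per directory.
        dst = f"{out_dir}/row_{row:04d}/tile_{row:04d}_{col:04d}.tif"
        commands.append(
            f'gdal_translate -co "TILED=YES" '
            f"-srcwin {xoff} {yoff} {xsize} {ysize} {src} {dst}"
        )
    return commands


if __name__ == "__main__":
    print(overview_factors(30.0, [30.0, 60.0, 120.0, 240.0]))
    for cmd in tile_commands("gebco/bathymetry.tif", "gebco/tiles", 5000, 3000, 2048):
        print(cmd)
```

The tile size here is chosen to be somewhat larger than a typical output map
image, per point 2; pick whatever matches your own map sizes.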

     - Ed


On 4/18/08 10:01 AM, "Paul Spencer" <pspencer at dmsolutions.ca> wrote:

> Dejan,
> 
> I'm not really sure that there would be a noticeable performance
> difference, and I definitely don't feel sufficiently educated to
> comment on why one approach would be superior to another, except that
> I find it much easier to manage single files and to set up a map file
> to point to them.  The process of creating multiple files and setting
> up a tileindex can be automated, certainly, but I think it is a little
> more complicated especially for the beginner.
> 
> The multiple files approach requires mapserver to open the tileindex
> file, find the polygons that intersect the requested extent, extract
> the references to the actual disk files, open the disk files using
> gdal to get the content and composite those into the memory copy of
> the map.
> 
> The single file approach does something very similar except that GDAL
> handles it and there is only one open file handle.  I'm not sure that
> there is a significant overhead to the first approach just on the
> basis of files needing to be opened.
> 
> I'm hoping others more qualified will jump in here (Frank, Ed ...) and
> provide more educated opinions.
> 
> Cheers
> 
> Paul
> 
> On 18-Apr-08, at 9:40 AM, Dejan.Gambin at pula.hr wrote:
>> 
>> Paul,
>> 
>> I am just interested in one thing:
>> 
>> If you say "breaking" a single tiff file with "TILED=YES" is
>> equivalent to breaking into individual files and using tileindex -
>> does it mean it is ALWAYS better to create a single file with
>> "internal" tiling than create many files and using tileindex? I
>> suppose the answer is NO, but I would very much like to know why.
>> 
>> For example, if you need to display a region that covers several
>> "blocks", then in the first case you open just one file, in the
>> second you open several files. So the first method has less overhead,
>> right? What are the opposite situations then?
>> 
>> thanks very much
>> 
>> regards, dejan
>> 
>> mapserver-users-bounces at lists.osgeo.org wrote on 18.04.2008 13:11:47:
>> 
>>> Stefan,
>>> 
>>> I don't think you can specify 10'', you need something in pixels.  But
>>> the command is also probably not what you really need to do.
>>> 
>>> More likely, you should be doing the following:
>>> 
>>> gdal_translate -co "TILED=YES" gebco/bathymetry.tif \
>>>   gebco/bathymetry_tiled.tif
>>> 
>>> This will create a single tif file that has an internal block size of
>>> 256x256 - you can think of this as breaking your tif up into 256x256
>>> tiles but keeping them all within the same file.  This is equivalent
>>> to breaking the tif up into individual files, creating a shapefile
>>> that has a rectangle for each individual file's extent, and using
>>> that as a tile index in mapserver.
>>> 
>>> Next, you want to do this:
>>> 
>>> gdaladdo gebco/bathymetry_tiled.tif 2 4 6 8 16
>>> 
>>> This will pre-compute smaller versions of the tif image, called
>>> overviews, at 1/2, 1/4, 1/6, 1/8 and 1/16 of the original size of the
>>> tif - this makes it much more efficient for gdal to return exactly
>>> what mapserver is asking for at any given scale.  Depending on the
>>> range of scales that you need to display your raster image at, you
>>> can add more overview levels (or take some away).
>>> 
>>> With these two commands, you can make your rasters much more
>>> efficient for mapserver and probably avoid the need to split them up
>>> into many files.
>>> 
>>> It does make the file somewhat larger.  If you are working with very
>>> large files ( > 4GB ) then you may run into some problems with tiff
>>> and may need to investigate another format or a compressed format
>>> like ECW or MrSID (both requiring commercial licenses I believe).
>>> 
>>> There are also some built in compression schemes for tiffs in gdal
>>> which you can apply when running gdal_translate (see
>>> http://www.gdal.org/frmt_gtiff.html for creation options), for
>>> instance:
>>> 
>>> gdal_translate -co "TILED=YES" -co "COMPRESS=JPEG" -co "JPEG_QUALITY=80" \
>>>   gebco/bathymetry.tif gebco/bathymetry_tiled.tif
>>> 
>>> would compress the tif using JPEG compression (lossy) set to 80%
>>> quality (low compression, small loss).
>>> 
>>> Cheers
>>> 
>>> Paul
>>> 
>>> 
>>> 
>>> On 18-Apr-08, at 5:47 AM, Stefan Schwarzer wrote:
>>>> Thanks for the info.
>>>> 
>>>> Did it like this:
>>>> 
>>>> Library/Frameworks/GDAL.framework/Versions/1.5/Programs/gdal_translate \
>>>>   -outsize 10'' 10'' -co TILED=YES gebco/bathymetry.tif \
>>>>   gebco/bathymetry_tiled.tif
>>>> 
>>>> But get the message: "Segmentation fault"
>>>> 
>>>> Anything that I did wrong? Or should do differently?
>>>> 
>>>> Thanks for a hint,
>>>> 
>>>> Stef
>>>> 
>>>>> Hi,
>>>>> 
>>>>> Gdal_translate program has options for you:
>>>>> 
>>>>> c:\FWTools>gdal_translate
>>>>> Usage: gdal_translate [--help-general]
>>>>>        [-ot {Byte/Int16/UInt16/UInt32/Int32/Float32/Float64/
>>>>>              CInt16/CInt32/CFloat32/CFloat64}] [-strict]
>>>>>        [-of format] [-b band] [-outsize xsize[%] ysize[%]]
>>>>>        [-scale [src_min src_max [dst_min dst_max]]]
>>>>>        [-srcwin xoff yoff xsize ysize] [-projwin ulx uly lrx lry]
>>>>>        [-a_srs srs_def] [-a_ullr ulx uly lrx lry] [-a_nodata value]
>>>>>        [-gcp pixel line easting northing [elevation]]*
>>>>>        [-mo "META-TAG=VALUE"]* [-quiet] [-sds]
>>>>>        [-co "NAME=VALUE"]*
>>>>>        src_dataset dst_dataset
>>>>> 
>>>>> By playing with -srcwin or possibly with -outsize and -projwin you
>>>>> should be able to split your image as you wish.  Read more at
>>>>> http://gdal.org/gdal_translate.html
>>>>> 
>>>>> -Jukka Rahkonen-
>>>>> 
>>>>> 
>>>>> 
>>>>> From: mapserver-users-bounces at lists.osgeo.org
>>>>> [mailto:mapserver-users-bounces at lists.osgeo.org] On behalf of
>>>>> Stefan Schwarzer
>>>>> Sent: 18 April 2008 12:05
>>>>> To: mapserver-users at lists.osgeo.org
>>>>> Subject: [mapserver-users] How to tile a large TIF image?
>>>>> 
>>>>> Hi there,
>>>>> 
>>>>> Instead of a single large tif image, I would like to use smaller
>>>>> tiles. Although I am well aware of mapserver's and gdal's
>>>>> possibilities to create the shapes for it, I first need to "split"
>>>>> the large tif into 20 or 50 or 100 tiles.
>>>>> 
>>>>> Can anyone give me a hint as to what kind of software can do
>>>>> this?
>>>>> 
>>>>> There is an ArcGIS script, but it doesn't work on my machine
>>>>> (http://arcscripts.esri.com/details.asp?dbid=13978).
>>>>> 
>>>>> Thanks for any hints,
>>>>> 
>>>>> Stef
>>>>> 
>>>>> ____________________________________________________________________
>>>>> 
>>>>>   Stefan Schwarzer
>>>>> 
>>>>>   Lean Back and Relax - Enjoy some Nature Photography
>>>>>   http://photoblog.la-famille-schwarzer.de
>>>>> 
>>>>>   Appetite for Global Data? UNEP GEO Data Portal:
>>>>>   http://geodata.grid.unep.ch
>>>>> 
>>>>> ____________________________________________________________________
>>>> 
>>>> _______________________________________________
>>>> mapserver-users mailing list
>>>> mapserver-users at lists.osgeo.org
>>>> http://lists.osgeo.org/mailman/listinfo/mapserver-users
>>> 
>>> 
>>> __________________________________________
>>> 
>>>     Paul Spencer
>>>     Chief Technology Officer
>>>     DM Solutions Group Inc
>>>     http://www.dmsolutions.ca/




