GeoTIFF overviews / TILEINDEX / Large dataset performance

Dan Greve grevedan at HOTMAIL.COM
Fri Jun 10 16:49:51 EDT 2005


Unfortunately documentation is usually one of the last steps in the 
development process.  I'm one of the worst at even internally documenting my 
code, much less providing additional user information.

God it's good having Frank back from vacation.  I've learned a lot from him,
and he always seems to find time to respond to questions while saving the
GIS world.  Hear, hear!!

Seriously though, I'm pretty sure I understand most of what you said Frank, 
and will start building additional layers and looking into Group/Max Min 
scale.  If I get a chance (HA!) I'll write up a little tutorial of how I 
improve performance with large tiled rastersets.

-- dan greve
-- software engineer, Northrop Grumman
-- palm bay, FL

>From: Gambin Dejan <Dejan.Gambin at PULA.HR>
>Reply-To: Gambin Dejan <Dejan.Gambin at PULA.HR>
>To: MAPSERVER-USERS at LISTS.UMN.EDU
>Subject: Re: [UMN_MAPSERVER-USERS] GeoTIFF overviews / TILEINDEX / Large 
>dataset performance
>Date: Fri, 10 Jun 2005 08:33:20 +0200
>
>Frank,
>
>I have to say it is great to get such a detailed explanation of
>overviews/tiles/performance problems. I would personally very much like
>to see some kind of HOWTO that would explain this in more detail, with the
>technical/theoretical background. For example, how to create overviews
>that exactly match desired output resolution so no resampling occurs
>(with fixed set of zoom scales of course) and similar performance
>things.
>
>I would be happy enough if someone can point me to literature that
>explains all this stuff.
>
>regards
>
>dejan
>
>-----Original Message-----
>From: UMN MapServer Users List [mailto:MAPSERVER-USERS at LISTS.UMN.EDU] On
>Behalf Of Frank Warmerdam
>Sent: Thursday, June 09, 2005 9:53 PM
>To: MAPSERVER-USERS at LISTS.UMN.EDU
>Subject: Re: [UMN_MAPSERVER-USERS] GeoTIFF overviews / TILEINDEX / Large
>dataset performance
>
>
>On 6/4/05, Dan Greve <grevedan at hotmail.com> wrote:
> > To everyone,
> >
> > Are GeoTiff overviews taken advantage of by Mapserver?  What's the
> > best way to handle large datasets (300 GB, 260,000 files in my case)
> > when you want the user to be able to view the whole dataset.
>
>
>Dan,
>
>I think that Mark had the right idea with creating new overview layers
>to kick in at various scales.   To show an overview of your whole region
>it would be a disaster to have to touch all 260,000 of your files.
>
>To answer one specific question, MapServer will take advantage of
>overviews built into GeoTIFF files (assuming GDAL is in use).
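
A minimal sketch of how internal overviews might be added to each GeoTIFF with GDAL's gdaladdo utility. The helper names, the 256-pixel cutoff, the "average" resampling choice and the file name are illustrative assumptions, not anything stated in the thread. The sketch only builds the command line; it does not run it.

```python
def overview_levels(width, height, min_size=256):
    """Power-of-two decimation factors, added while the reduced image's
    larger dimension is still at least min_size pixels."""
    levels = []
    factor = 2
    while max(width, height) / factor >= min_size:
        levels.append(factor)
        factor *= 2
    return levels


def gdaladdo_command(path, levels, resampling="average"):
    """Argument list for: gdaladdo -r <resampling> <file> <levels...>"""
    return ["gdaladdo", "-r", resampling, path] + [str(n) for n in levels]


# Example: the 13000x13000 scene size mentioned in the thread
# (the file name "scene.tif" is hypothetical).
levels = overview_levels(13000, 13000)
print(" ".join(gdaladdo_command("scene.tif", levels)))
```

The argument list could be handed to subprocess.run once per file; with 260,000 files that loop would presumably be run as a batch job.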
>
> > I have a lot of data, probably about 300GB spread among 260,000 tiles.
> > Let's just say the region is... Texas.  I want the user to be able to
> > see the data set at ANY zoom factor.  He'd start out looking at the
> > entire state of Texas, and be able to zoom progressively into a city
> > block, and back out again.  If the data format could handle 300 GB in
> > a single file (I'm using GeoTIFF), theoretically the performance would
> > be better than if I created a TILEINDEX (shapefile) of the 260,000
> > tiles.  I've seen this in smaller datasets when requesting the entire
> > scene, even just a 13000x13000 dataset with just over 2500 tiles.
>
>If GeoTIFF supported very large files (there are plans for "BigTIFF
>support" one day) then I might encourage you to just create one huge
>internally tiled GeoTIFF file with lots and lots of overview levels.  You
>could use a format like Erdas Imagine that does support very large files
>and build one huge mosaic image, with lots of overviews.  It *ought* to
>work quite efficiently though there might be some efficiency hits with
>such a large dataset.  For instance, just processing the block pointers
>array might prove quite a bit of work.
>
>What Mark is suggesting is to:
>
>  o Create a tile index for all your files.  You will likely want a
>     spatial index built on this tileindex shapefile.
>
>  o Create a layer in your mapfile using this tileindex, perhaps named
>     "mosaic_fullres".
>
>  o I would suggest building internal overviews on all the individual
>     GeoTIFF files as well.
>
>  o View the resulting layer in MapServer, starting near full resolution.
>     Zoom out till performance degrades unacceptably.  This will be the
>     resolution at which you need to build a new "overview layer".
>     This isn't an overview within the files in question, it is a whole
>     new layer in the map.  There are a variety of ways to build it.  I
>     would likely prepare a script to generate it with MapServer itself,
>     by issuing a series of scripted render requests at your new chosen
>     overview resolution.
>
>  o If you produce this new layer as a set of tile files, you will also
>     need a tileindex for it.
>
>  o In the mapfile you will need to add this new set of tiles as a new
>     layer.  You will want to use the MINSCALE and MAXSCALE options on
>     this layer and the full resolution layer to ensure that renders
>     start to operate from this layer instead of the full resolution data
>     at a suitable resolution.
>
>     There is some mechanism (GROUP?  Using the same layer name?) to
>     ensure that this layer and the full res layer will be treated as a
>     single layer from a user-visible point of view.  I don't know the
>     details of this aspect.
>
>  o You can repeat the above overview layer steps to build additional
>     overview layers if needed till your full scene gives acceptable
>     performance.
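
The mapfile side of the steps above might look roughly like what this sketch prints. It is only an illustration under assumptions: "mosaic_fullres" comes from the thread, but the second layer name, the group name, the tileindex file names and the 50000 handoff scale are all hypothetical, and using GROUP to present the layers as one is the guess made in the thread, not something verified here. The full resolution layer gets a MAXSCALE and the overview layer a matching MINSCALE, so only one of them renders at any given scale.

```python
def raster_layer(name, tileindex, group, minscale=None, maxscale=None):
    """Return one mapfile LAYER block as text."""
    lines = [
        "LAYER",
        f'  NAME "{name}"',
        f'  GROUP "{group}"',
        "  TYPE RASTER",
        "  STATUS ON",
        f'  TILEINDEX "{tileindex}"',
    ]
    if minscale is not None:
        lines.append(f"  MINSCALE {minscale}")
    if maxscale is not None:
        lines.append(f"  MAXSCALE {maxscale}")
    lines.append("END")
    return "\n".join(lines)


def layer_stack(group, layers, handoffs):
    """layers: (name, tileindex) pairs ordered finest to coarsest.
    handoffs: ascending scale denominators, one between each adjacent pair."""
    if len(handoffs) != len(layers) - 1:
        raise ValueError("need exactly one handoff scale between adjacent layers")
    blocks = []
    for i, (name, tileindex) in enumerate(layers):
        minscale = handoffs[i - 1] if i > 0 else None
        maxscale = handoffs[i] if i < len(handoffs) else None
        blocks.append(raster_layer(name, tileindex, group, minscale, maxscale))
    return blocks


# Hypothetical names and handoff scale, for illustration only.
blocks = layer_stack(
    "mosaic",
    [("mosaic_fullres", "fullres_index.shp"), ("mosaic_ov1", "ov1_index.shp")],
    [50000],
)
print("\n\n".join(blocks))
```

Adding a further overview layer would mean one more (name, tileindex) pair and one more handoff scale. Later MapServer releases spell these keywords MINSCALEDENOM and MAXSCALEDENOM.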
>
>OK, looking over my garbled explanation, I'm not sure I have helped at
>all. This fairly common situation screams out for some sort of utility
>to help build the overview map layers.  Or at least we should have a
>more detailed HOWTO for this process than I am in a position to prepare
>just now.
>
> >
> > When you say
> >
> > "I create a new tile grid shapefile using that map extent as the size
> > of one tile. I tile the entire map area (in my case the world). "
> >
> > Do you mean you just duplicate the entire dataset with larger tiles
> > when a TILEINDEX search would take longer?
>
>He means to duplicate the whole dataset, but at the much reduced
>resolution at which the render performance started to degrade.  If your
>original files were fairly large, and had internal overviews built, I
>believe your first overview map layer would likely be at something like
>1/128'th of the resolution of the original data. So the overview dataset
>would then be 1/16000th the size of the original data or so.
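
The arithmetic behind the "1/16000th or so" figure can be checked in a few lines. This is a sketch that assumes file size scales with pixel count (uncompressed data), takes the 300 GB total from earlier in the thread, and counts 1 GB as 1024 MB; the function name is made up for the example.

```python
def size_fraction(decimation):
    """Fraction of the original pixel count left after reducing the
    resolution by `decimation` along both axes."""
    return 1 / decimation ** 2


fraction = size_fraction(128)
# 128 squared is 16384, hence "1/16000th or so".
print(f"1/{128 ** 2} of the original pixel count")
# Rough size of the first overview layer for a 300 GB dataset, in MB.
print(f"{300 * 1024 * fraction:.2f} MB")
```

Under those assumptions the first overview layer for the whole dataset comes to under 20 MB.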
>
> > When you say
> >
> > "I create a new aggregate image layer using calls to the map server to
> > generate an image for each new tile."
> >
> > I have no idea what you meant by "aggregate image layer"
>
>He means a whole new map layer which is at a reduced resolution. It is
>an aggregate of a whole bunch of calls to mapserver to render tiles of
>the total region.  (hence the need for a new tile index).
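
One way the scripted render requests could be generated is sketched below. Everything specific in it is an assumption: the base URL, mapfile path, extent, tile size and image size are made up, and the query parameters (map, mode, layers, mapext, mapsize) are meant as MapServer CGI controls that should be checked against the version in use. The sketch only builds URLs and fetches nothing.

```python
import math
from urllib.parse import urlencode


def tile_grid(minx, miny, maxx, maxy, tile_size):
    """Yield (col, row, bbox) for square tiles of tile_size map units
    covering the extent; edge tiles may extend past maxx/maxy."""
    cols = math.ceil((maxx - minx) / tile_size)
    rows = math.ceil((maxy - miny) / tile_size)
    for row in range(rows):
        for col in range(cols):
            x0 = minx + col * tile_size
            y0 = miny + row * tile_size
            yield col, row, (x0, y0, x0 + tile_size, y0 + tile_size)


def render_url(base_url, mapfile, layer, bbox, pixels):
    """URL asking MapServer to draw one tile as a pixels x pixels image."""
    params = {
        "map": mapfile,
        "mode": "map",
        "layers": layer,
        "mapext": " ".join(str(v) for v in bbox),
        "mapsize": f"{pixels} {pixels}",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical extent and server, for illustration only.
for col, row, bbox in tile_grid(0, 0, 100000, 50000, 50000):
    print(col, row, render_url("http://localhost/cgi-bin/mapserv",
                               "/data/texas.map", "mosaic_fullres", bbox, 512))
```

Each saved image would still need georeferencing (for example a world file written from its bbox) before it can be listed in the new tile index.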
>
> > Are you downsampling the image at all as you increase the tile sizes?
> > The raster howto on the UMN site has a snippet about Frank W. wanting
> > to implement using GeoTIFF overviews in the mapserver.  Does mapserver
> > currently take advantage of this? Could you elaborate on your pyramid
> > scheme?
>
>Yes, he means that it would be at a much reduced resolution.  The tile
>size in meters is much bigger, but the actual tile size in terms of
>pixels need not necessarily be much larger.
>
>Note that there are different types of tiling and overviews coming into
>play.
>
>  o Macro tiling: Each tile is a separate TIFF file, and a tileindex
>     shapefile is used to associate them to treat them as one layer in
>     the .map file.
>  o Internal tiling: A given TIFF file can be internally organized into
>     tiles as opposed to strips (scanlines).  This gives
>
>  o "map level overviews": using multiple layers in a .map file with
>     MINSCALE/MAXSCALE to select which layer to render from.
>  o "internal overviews": individual TIFF files can have overviews built
>     in and GDAL will automatically take advantage of them if present.
>
>Best regards,
>--
>---------------------------------------+--------------------------------------
>I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
>light and sound - activate the windows | http://pobox.com/~warmerdam
>and watch the world go round - Rush    | Geospatial Programmer for Rent


