GeoTIFF overviews / TILEINDEX / Large dataset performance

Gambin Dejan Dejan.Gambin at PULA.HR
Mon Jun 13 09:47:58 EDT 2005


Hi,

I can only say that I am beginning to understand the things that Frank has
told us. But I would like someone to clarify a few things:

1. Frank talks about different types of tiling and overviews:
	- Is the tiling type related to the overview type, and how? To be
more concrete, if I have a set of TIFF files (and I am using a tileindex
shapefile to associate them into one layer), can I use internal
overviews? Which combinations are possible? What are the advantages
and disadvantages? Is it better to use internal overviews/tiles or what?

2. Ed McNierney gave a good explanation of getting "the best possible
performance by constraining the map view scale to a list of specific
values" "and then creating overviews to exactly match those output
scales" so that no resampling occurs. Can someone give me an example of
creating an overview that exactly matches a predefined output scale? Any
example would be OK. If it is easier for someone, I can give my example.
My TIFFs are something like:
...
Subfile Type: (0 = 0x0)
  Image Width: 4500 Image Length: 6000
  Resolution: 72, 72 pixels/inch
  Bits/Sample: 8
  Compression Scheme: None
  Photometric Interpretation: RGB color
  Date & Time: "2004:12:14 17:10:55"
  Software: "Adobe Photoshop 7.0"
  Samples/Pixel: 3
  Rows/Strip: 6000
...

and suppose I want to build an overview TIFF file that exactly matches
the 1:10000 scale. You can use any additional input data if needed.

3. This is not so much related to overviews/tiles, but I would like to
know if there is any difference between using TIFFs with their associated
world files and real GeoTIFFs that have the geocoding information
embedded within the file. Are there any benefits to converting to "real"
GeoTIFF files using, for example, the geotifcp utility?
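
For context on the world-file side of this question, here is a minimal Python sketch of what a world file carries: six numbers defining an affine pixel-to-map transform. A GeoTIFF stores the same kind of georeferencing in its tags, and can additionally carry the coordinate system, which a world file alone does not. The function names and sample values are hypothetical.

```python
def parse_world_file(text):
    """Parse the six numbers of a world file into affine coefficients.

    The file lists them in the order A, D, B, E, C, F, where
        x = A*col + B*row + C
        y = D*col + E*row + F
    and (C, F) is the center of the upper-left pixel.
    """
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    a, d, b, e, c, f = values
    return a, b, c, d, e, f


def pixel_to_map(coeffs, col, row):
    """Map coordinates of the center of pixel (col, row)."""
    a, b, c, d, e, f = coeffs
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical world file: 0.5 m pixels, no rotation, north-up image.
sample = "0.5\n0.0\n0.0\n-0.5\n5000.25\n9999.75\n"
coeffs = parse_world_file(sample)
print(pixel_to_map(coeffs, 0, 0))
print(pixel_to_map(coeffs, 10, 20))
```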

thanks very much

dg

-----Original Message-----
From: Dan Greve [mailto:grevedan at hotmail.com] 
Sent: Friday, June 10, 2005 10:50 PM
To: Gambin Dejan; MAPSERVER-USERS at LISTS.UMN.EDU
Subject: Re: [UMN_MAPSERVER-USERS] GeoTIFF overviews / TILEINDEX / Large
dataset performa


Unfortunately documentation is usually one of the last steps in the
development process.  I'm one of the worst at even internally documenting
my code, much less providing additional user information.

God it's good having Frank back from vacation.  I've learned a lot from
him, and he always seems to find time to respond to questions while saving
the GIS world.  Hear, hear!!

Seriously though, I'm pretty sure I understand most of what you said,
Frank, and will start building additional layers and looking into
Group/Max Min scale.  If I get a chance (HA!) I'll write up a little
tutorial on how I improve performance with large tiled raster sets.

-- dan greve
-- software engineer, Northrop Grumman
-- palm bay, FL

>From: Gambin Dejan <Dejan.Gambin at PULA.HR>
>Reply-To: Gambin Dejan <Dejan.Gambin at PULA.HR>
>To: MAPSERVER-USERS at LISTS.UMN.EDU
>Subject: Re: [UMN_MAPSERVER-USERS] GeoTIFF overviews / TILEINDEX / 
>Large
>dataset performance
>Date: Fri, 10 Jun 2005 08:33:20 +0200
>
>Frank,
>
>I have to say it is great to get such a detailed explanation of 
>overviews/tiles/performance problems. I would personally very much like 
>to see some kind of HOWTO that would explain this in more detail, along 
>with the technical/theoretical background. For example, how to create 
>overviews that exactly match the desired output resolution so no 
>resampling occurs (with a fixed set of zoom scales, of course) and 
>similar performance matters.
>
>I would be happy enough if someone can point me to literature that 
>explains all this stuff.
>
>regards
>
>dejan
>
>-----Original Message-----
>From: UMN MapServer Users List [mailto:MAPSERVER-USERS at LISTS.UMN.EDU] 
>On Behalf Of Frank Warmerdam
>Sent: Thursday, June 09, 2005 9:53 PM
>To: MAPSERVER-USERS at LISTS.UMN.EDU
>Subject: Re: [UMN_MAPSERVER-USERS] GeoTIFF overviews / TILEINDEX / 
>Large dataset performance
>
>
>On 6/4/05, Dan Greve <grevedan at hotmail.com> wrote:
> > To everyone,
> >
> > Are GeoTIFF overviews taken advantage of by MapServer?  What's the 
> > best way to handle large datasets (300 GB, 260,000 files in my case)
> > when you want the user to be able to view the whole dataset?
>
>
>Dan,
>
>I think that Mark had the right idea with creating new overview layers
>to kick in at various scales.   To show an overview of your whole region
>it would be a disaster to have to touch all 260,000 of your files.
>
>To answer one specific question, MapServer will take advantage of 
>overviews built into GeoTIFF files (assuming GDAL is in use).
>
> > I have a lot of data, probably about 300GB spread among 260,000 
> > tiles.
> >
> > Let's just say the region is... Texas.  I want the user to be able 
> > to see the data set at ANY zoom factor.  He'd start out looking at 
> > the entire state of Texas, and be able to zoom progressively into a 
> > city block, and back out again.  If the data format could handle 300
> > GB in a single file (I'm using GeoTIFF), theoretically the 
> > performance would be better than if I created a TILEINDEX (shapefile)
> > of the 260,000 tiles.  I've seen this in smaller datasets when
> > requesting the entire scene, even just a 13000x13000 dataset with
> > just over 2500 tiles.
>
>If GeoTIFF supported very large files (there are plans for "BigTIFF 
>support" one day) then I might encourage you to just create one huge 
>internally tiled GeoTIFF file with lots and lots of overview levels.  
>You could use a format like Erdas Imagine that does support very large
>files and build one huge mosaic image, with lots of overviews.  It *ought*
>to work quite efficiently, though there might be some efficiency hits with
>such a large dataset.  For instance, just processing the block pointers
>array might prove quite a bit of work.
>
>What Mark is suggesting is to:
>
>  o Create a tile index for all your files.  You will likely want a
>     spatial index built on this tileindex shapefile.
>
>  o Create a layer in your mapfile using this tileindex, perhaps named
>     "mosaic_fullres".
>
>  o I would suggest building internal overviews on all the individual
>     GeoTIFF files as well.
>
>  o View the resulting layer in MapServer, starting near full resolution.
>     Zoom out till performance degrades unacceptably.  This will be a
>     new resolution at which you need to build a new "overview layer".
>     This isn't an overview within the files in question, it is a whole
>     new layer in the map.  There are a variety of ways to build it.  I
>     would likely prepare a script to generate it with MapServer itself,
>     by issuing a series of scripted render requests at your new chosen
>     overview resolution.
>
>  o If you produce this new layer as a set of tile files, you will also
>     need a tileindex for it.
>
>  o In the mapfile you will need to add this new set of tiles as a new
>     layer.  You will want to use the MINSCALE and MAXSCALE options on
>     this layer and the full resolution layer to ensure that renders
>     start to operate from this layer instead of the full resolution
>     data at a suitable resolution.
>
>     There is some mechanism (GROUP?  Using the same layer name?) to
>     ensure that this layer and the full res layer will be treated as a
>     single layer from a user-visible point of view.  I don't know the
>     details of this aspect.
>
>  o You can repeat the above overview layer steps to build additional
>     overview layers if needed till your full scene gives acceptable
>     performance.
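
The "series of scripted render requests" in the steps above needs a list of tile extents at the chosen overview resolution. A minimal Python sketch of that bookkeeping follows. The function name, the 1000-pixel tile size, and the sample extent are hypothetical, and the actual render call to MapServer is deliberately left out, since it depends on how you script it.

```python
import math


def tile_extents(minx, miny, maxx, maxy, cellsize, tile_pixels=1000):
    """Yield (minx, miny, maxx, maxy) extents covering the full area.

    Each tile is tile_pixels x tile_pixels pixels at `cellsize` map units
    per pixel, so tiles are large on the ground but modest in pixels.
    Tiles on the top and right edges may extend past the dataset extent.
    """
    tile_size = cellsize * tile_pixels
    cols = math.ceil((maxx - minx) / tile_size)
    rows = math.ceil((maxy - miny) / tile_size)
    for row in range(rows):
        for col in range(cols):
            x0 = minx + col * tile_size
            y0 = miny + row * tile_size
            yield (x0, y0, x0 + tile_size, y0 + tile_size)


# Hypothetical 2500 x 1000 unit area rendered at 1 unit per pixel.
tiles = list(tile_extents(0, 0, 2500, 1000, cellsize=1.0))
for extent in tiles:
    # A real script would render this extent to a georeferenced tile file
    # here, then build a tileindex over the resulting files.
    print(extent)
```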
>
>OK, looking over my garbled explanation, I'm not sure I have helped at 
>all. This fairly common situation screams out for some sort of utility
>to help build the overview map layers.  Or at least we should have a 
>more detailed HOWTO for this process than I am in a position to prepare
>just now.
>
> >
> > When you say
> >
> > "I create a new tile grid shapefile using that map extent as the 
> > size of one tile. I tile the entire map area (in my case the world)."
> >
> > Do you mean you just duplicate the entire dataset with larger tiles 
> > when a TILEINDEX search would take longer?
>
>He means to duplicate the whole dataset, but at the much reduced 
>resolution at which the render performance started to degrade.  If your
>original files were fairly large, and had internal overviews built, I 
>believe your first overview map layer would likely be at something like
>1/128th of the resolution of the original data. So the overview 
>dataset would then be 1/16000th the size of the original data or so.
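
The "1/16000th" figure follows from reducing the resolution along both axes. A short Python check of that arithmetic (the function name is hypothetical, and the 300 GB total is the figure quoted earlier in the thread):

```python
def pixel_count_ratio(linear_reduction):
    """How many original pixels one overview pixel replaces.

    Reducing resolution by `linear_reduction` along both x and y
    shrinks the pixel count by the square of that factor.
    """
    return linear_reduction ** 2


ratio = pixel_count_ratio(128)
print("1/128 resolution -> 1/%d of the pixels" % ratio)

# Rough size of an uncompressed 1/128 overview layer for a 300 GB dataset.
original_gb = 300
print("about %.1f MB" % (original_gb * 1024 / ratio))
```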
>
> > When you say
> >
> > "I create a new aggregate image layer using calls to the map server 
> > to generate an image for each new tile."
> >
> > I have no idea what you meant by "aggregate image layer"
>
>He means a whole new map layer which is at a reduced resolution. It is 
>an aggregate of a whole bunch of calls to mapserver to render tiles of 
>the total region.  (hence the need for a new tile index).
>
> > Are you downsampling the image at all as you increase the tile 
> > sizes?
> >
> > The raster howto on the UMN site has a snippet about Frank W. 
> > wanting to implement using GeoTIFF overviews in the mapserver.  Does
> > mapserver currently take advantage of this? Could you elaborate on
> > your pyramid scheme?
>
>Yes, he means that it would be at a much reduced resolution.  The tile 
>size in meters is much bigger, but the actual tile size in terms of 
>pixels need not necessarily be much larger.
>
>Note that there are different types of tiling and overviews coming into
>play.
>
>  o Macro tiling: Each tile is a separate TIFF file, and a tileindex
>     shapefile is used to associate them to treat them as one layer in
>     the .map file.
>  o Internal tiling: A given TIFF file can be internally organized into
>     tiles as opposed to strips (scanlines).  This gives
>
>  o "Map level overviews": using multiple layers in a .map file with
>     MINSCALE/MAXSCALE to select which layer to render from.
>  o "Internal overviews": individual TIFF files can have overviews built
>     in and GDAL will automatically take advantage of them if present.
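
For the "internal overviews" case, a common approach is to build overviews at successive power-of-two reductions until the image becomes small. The Python sketch below only computes such a list of levels. The stopping threshold of 256 pixels and the function name are assumptions, and the levels would then be handed to whatever tool builds the overviews (for example GDAL's gdaladdo utility).

```python
def overview_factors(width, height, min_size=256):
    """Power-of-two reduction factors for internal overviews.

    Factors are added while the larger image dimension, divided by the
    factor, is still at least `min_size` pixels (an assumed threshold).
    """
    factors = []
    factor = 2
    while max(width, height) / factor >= min_size:
        factors.append(factor)
        factor *= 2
    return factors


# The 4500 x 6000 pixel TIFF described earlier in this thread.
levels = overview_factors(4500, 6000)
print(levels)
for f in levels:
    print("1/%d overview: %d x %d pixels" % (f, 4500 // f, 6000 // f))
```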
>
>Best regards,
>--
>---------------------------------------+--------------------------------------
>I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
>light and sound - activate the windows | http://pobox.com/~warmerdam
>and watch the world go round - Rush    | Geospatial Programmer for Rent


