[Mapserver-users] generation of class colors and optimizing large shapefile access

Charlton Purvis cpurvis at asg.sc.edu
Fri Mar 21 14:37:43 EST 2003


I'm wondering if too-much-of-a-good-thing applies in the MapServer world.

My mind is racing w/ potential pros and cons of mixing the options you've given me together.  For example, I can certainly see the value and inherent simplicity of reducing the resolution into multiple shapefiles and only rendering them when their appropriate SCALE is encountered.  Further still, I can imagine that producing the coarse shapefiles would not be too time-consuming, but I can also see the overhead to reproduce these files if my original source data were to change.  Then again, maybe not, if I organized the data effectively.

I think a good start for my particular problem would be for me to:

* identify the levels of shapefile-granularity and how that correlates w/ SCALE boundaries.  

* Maybe I could create, say, 10 shapefiles from the original 1GB file, each shapefile 1/10 more fine than the next.  

* Perhaps I could run a combination of SHPTREE and TILEINDEX on all 10 of these shapefiles.

* My CLASSES would stay the same since we're dealing w/ the same types of data (elevation) no matter what the zoom factor.

And all of this w/o a RDBMS in the background.  Come to think of it, I don't think PostGIS would help me w/ the first two bullets.  But using what little knowledge I have gleaned from my partners in crime, it could possibly eliminate the third.

Thanks again for paying attention to another selfish email.

Charlton

-----Original Message-----
From: Ed McNierney [mailto:ed at topozone.com] 
Sent: Friday, March 21, 2003 1:49 PM
To: Charlton Purvis; mapserver-users at lists.gis.umn.edu
Subject: RE: [Mapserver-users] generation of class colors and optimizing large shapefile access

Charlton -

The other option I did not mention is the simplification of your shapefiles.  If you have, say, a shapefile with a single polyline in it that's the entire NC coastline at 1-meter resolution, then when you're zoomed way out you will waste a LOT of time having MapServer draw multiple tiny little line segments, all of which fit inside a single pixel.

In cases like these it's appropriate to look for tools (e.g. the "genfeat" sample feature generalization script that comes with ArcView) to create multiple shapefiles, each at a coarser resolution than the original.  You can then use THOSE files with MINSCALE/MAXSCALE to create layers with an appropriate level of detail for each zoom level.
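
As a rough illustration of what "generalization" means here, the sketch below uses the standard Douglas-Peucker line simplification algorithm in Python.  This is not necessarily what the ArcView "genfeat" script does, and reading and writing the shapefiles themselves is left out; the function names and the sample coordinates are made up for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: drop vertices that deviate from the line by <= tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Every interior vertex is within tolerance: keep only the endpoints.
        return [first, last]
    # Keep the farthest vertex and simplify each half separately.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical coastline fragment: the 0.01 wiggle disappears, the spike stays.
coast = [(0, 0), (1, 0.01), (2, 0), (3, 5), (4, 0)]
print(simplify(coast, 0.1))
```

A reasonable tolerance is about the ground size of one output pixel at the scale the coarse file is meant for.  If, say, a zoomed-out map covers 600,000 ground units across a 640-pixel image, one pixel is roughly 940 units, so detail much finer than that is wasted.  The recursion here is fine for a sketch but would want an iterative rewrite for very long polylines.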

	- Ed

-----Original Message-----
From: Charlton Purvis [mailto:cpurvis at asg.sc.edu]
Sent: Friday, March 21, 2003 1:41 PM
To: Ed McNierney; mapserver-users at lists.gis.umn.edu
Subject: RE: [Mapserver-users] generation of class colors and optimizing
large shapefile access


Kudos for an even better-articulated response, Ed!

Infinite thanks for sharing your experience.  I'll admit that we've been dancing on the fence of whether or not to stay in flatfile-land or plug it into PostGIS and let it take care of everything ("everything" being a gross overstatement).  Truth be told, I will probably have to try both before I'm completely happy.  You've shed bright light on flatfile approaches while other comrades have suggested that PostGIS will take care of my indexing issues.

Additionally, flatfiles are also somewhat compelling since we plan to share the data via MapServer queries as well as DODS (i.e. netCDF).

I'll read closely how to use SHPTREE and what type of parameters I'll need to control.  That sure will be a clear indication of whether or not I understand my data, not to mention my users!

Yes, I want to have an image of the entire shapefile as the first point-of-entry for the user.  Sounds like I'll need to keep my TIFF handy unless I want to have MapServer trudge through the entire shapefile -- not an option I'm willing to take.

As far as SCALE issues go, though, all I'm currently dealing w/ is one layer that might be divided into quite a few classes.  The layer is the topographic and bathymetric layers combined.  So I don't see any clear way to turn on and turn off parts of the layer based on SCALE or a clear way to take a sampling on-the-fly.  That is, I don't see a clear way for *me* to do it.  I'm hoping to leave it up to SHPTREE and TILEINDEX.

Thanks again.

Charlton

-----Original Message-----
From: Ed McNierney [mailto:ed at topozone.com] 
Sent: Friday, March 21, 2003 12:37 PM
To: Charlton Purvis; mapserver-users at lists.gis.umn.edu
Subject: RE: [Mapserver-users] generation of class colors and optimizing large shapefile access

Charlton -

Thanks for a well-articulated question!

You should focus on two MapServer tools for performance - TILEINDEX and SHPTREE spatial indexes.

The SHPTREE tool generates a spatial index for a shapefile.  Put simply, it acts analogously to a database index; it allows MapServer to perform spatial selection on a shapefile without having to do a linear scan of all the objects in the shapefile.  Use SHPTREE so that when you're drawing a portion of the NC coast, MapServer can quickly inspect only those objects inside a given shapefile that have a chance of being visible on the output image.

Remember, too, that each shapefile reports the extents of all the objects in the shapefile, so MapServer can quickly reject an entire shapefile if none of the objects in it will be drawn.

However (moving up a level), opening and checking a number of shapefiles to determine they're not needed takes time.  That's where the TILEINDEX comes in, created by the TILE4MS tool.  This tool takes a list of input shapefile names and creates a NEW shapefile (which can also be indexed with SHPTREE) that contains a rectangle for each input shapefile, showing its extent.  By opening the TILEINDEX file, MapServer can quickly ignore whole shapefiles that can't possibly show up on the map, and by using the SHPTREE index for the files it does open, it can quickly ignore those objects in the file that can't possibly show up on the map.
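
The idea behind the tile index can be sketched in a few lines of Python.  This is only an illustration of the bounding-box rejection described above, not MapServer's actual code; the file names, the coordinates, and the function names are hypothetical.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    """Axis-aligned bounding rectangle in map units."""
    minx: float
    miny: float
    maxx: float
    maxy: float


def intersects(a, b):
    """True if rectangles a and b overlap or touch."""
    return (a.minx <= b.maxx and a.maxx >= b.minx and
            a.miny <= b.maxy and a.maxy >= b.miny)


def tiles_to_open(tile_index, view):
    """Return the names of the shapefiles whose extent overlaps the map view.

    tile_index maps a shapefile name to its Extent, which is the information
    a tile index shapefile stores as one rectangle per input file.
    """
    return [name for name, extent in tile_index.items()
            if intersects(extent, view)]


# Hypothetical tiles: one for the northern coast, one for the southern coast.
tile_index = {
    "nc_north.shp": Extent(0, 50, 100, 100),
    "sc_south.shp": Extent(0, 0, 100, 49),
}
view = Extent(10, 10, 20, 20)           # zoomed in on the southern tile
print(tiles_to_open(tile_index, view))  # only the southern file is opened
```

The same test, applied per object instead of per file, is what the SHPTREE index speeds up inside each shapefile that does get opened.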

All of this is independent of scale.  You should certainly set appropriate MINSCALE and MAXSCALE values for each of these data layers.  If you zoom out to display all of NC and SC on one map, and your MAP file is set up to display every object in every layer, performance will be awful and the map illegible.  Remember that all this indexing allows MapServer to quickly discard objects that don't need to be drawn; if you compel everything to be drawn ANYWAY, the indexing won't help.
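
If you end up with several versions of the same data at different resolutions, the LAYER entries are repetitive enough to generate.  Here is a minimal Python sketch that writes one LAYER per dataset with non-overlapping MINSCALE/MAXSCALE bands.  The dataset names, the scale breaks, and the POLYGON type are assumptions for illustration; newer MapServer releases spell these keywords MINSCALEDENOM and MAXSCALEDENOM, so adjust for your version.

```python
def scale_layers(datasets, breaks, layer_type="POLYGON"):
    """Build mapfile LAYER blocks, one per dataset, split by scale.

    datasets: shapefile base names ordered from finest to coarsest.
    breaks:   scale denominators separating consecutive datasets, ascending;
              there must be exactly len(datasets) - 1 of them.
    """
    if len(breaks) != len(datasets) - 1:
        raise ValueError("need exactly one break between each pair of datasets")
    # None marks an open-ended band at either end of the scale range.
    bounds = [None] + list(breaks) + [None]
    blocks = []
    for i, data in enumerate(datasets):
        lo, hi = bounds[i], bounds[i + 1]
        lines = [
            "LAYER",
            f'  NAME "{data}"',
            f"  TYPE {layer_type}",
            "  STATUS ON",
            f'  DATA "{data}"',
        ]
        if lo is not None:
            lines.append(f"  MINSCALE {lo}")
        if hi is not None:
            lines.append(f"  MAXSCALE {hi}")
        lines.append("END")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# Hypothetical names and breaks: full detail when zoomed in, coarse when zoomed out.
print(scale_layers(["topobathy_fine", "topobathy_medium", "topobathy_coarse"],
                   [50000, 500000]))
```

The CLASS entries are omitted here; they would be the same in every LAYER since the data values do not change with resolution.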

	- Ed

Ed McNierney
President and Chief Mapmaker
TopoZone.com / Maps a la carte, Inc.
73 Princeton Street, Suite 305
North Chelmsford, MA  01863
ed at topozone.com
(978) 251-4242 

-----Original Message-----
From: Charlton Purvis [mailto:cpurvis at asg.sc.edu]
Sent: Friday, March 21, 2003 11:42 AM
To: mapserver-users at lists.gis.umn.edu
Subject: [Mapserver-users] generation of class colors and optimizing
large shapefile access


Boy I'm showing my naiveté here, but I'm going to go for broke.

I'm dealing w/ topo/bat data whose shapefile approaches 1GB in size.  I have been through all sorts of conversions using ESRI software to get it from its original format to shapefile, and I thought I'd carry it further into a TIFF.  But I'm not going to be happy w/ a TIFF.  Not at this stage of the game, since it's topo/bat data that is the core of our product.

I've been through the wiki and the archives, and I'm not convinced my lines of thinking are going the right direction.

A few things:

* ArcMap allows me to manipulate my layers as categories and assign colors and ramps to each value or value range.  No surprise there.  Thinking MapServer now . . . let's say I were completely crazy and wanted to break my shapefile into 100 classes w/ a different color per class.  Is there any way to pull the RGB values from something ArcMap-esque w/o doing it manually to populate my .map file?  It's not a simple ramp that I use.  I'm not above coding anything, but I can't imagine that this hasn't been encountered before.

* My shapefile is of NC and SC coasts.  Assuming I can handle the classes issue satisfactorily, how about performance issues?  What is the correct way to approach this problem w/ my large shapefile?  For example, the initial image is of the entire Carolinas and their coasts.  Even if it is a 640x480 image, there is no reason to have MapServer crunch the shapefile to try to render pixels that don't even appear on the screen.  And along the same vein, when I'm zoomed in to a SC street detail (streets and population to be added later), I don't need to have MapServer consumed w/ NC-related data.  I have read examples of setting the scale to keep roads from displaying until you're zoomed in to a satisfactory level, but I'm not clear on how this could apply to me since I'm dealing w/ a homogeneous set of land and water at this point.  Is the vector tile index the way to go?  (http://mapserver.gis.umn.edu/cgi-bin/wiki.pl?VectorTileIndex)  And that could muck up my classes and colors from my .map file, right?  How about different shapefiles for different levels of zooming?  Sounds painful.

* Finally, on a hopefully related note, would PostGIS create a clearer picture for me?  I'm a RDBMS fellow, not GIS by birth, but it seems reasonable to assume that slapping the large datafile into a PostGIS database would lend itself to more efficient access than as a flatfile.  Why?  I'm not quite sure, since I've read about shapefile indexing on flatfiles.  But I can't imagine breaking up a shapefile that's in the database just to have it render faster to accommodate issues in my second bullet above.
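
On the first bullet above, one way to avoid typing 100 CLASS entries by hand would be to generate them from a table of value breaks and RGB colors.  The Python sketch below assumes that table has already been obtained somehow (how to get it out of ArcMap is not covered), that the attribute is called ELEVATION, and that COLOR goes inside a STYLE block as newer MapServer versions expect (older versions put COLOR directly in the CLASS).  All names and numbers are hypothetical.

```python
def class_block(item, lo, hi, rgb):
    """Return one mapfile CLASS block matching lo <= item < hi."""
    r, g, b = rgb
    return "\n".join([
        "  CLASS",
        f'    NAME "{lo} to {hi}"',
        f"    EXPRESSION ([{item}] >= {lo} AND [{item}] < {hi})",
        "    STYLE",
        f"      COLOR {r} {g} {b}",
        "    END",
        "  END",
    ])


def class_blocks(item, breaks, colors):
    """Return CLASS blocks for consecutive break pairs.

    breaks: ascending class boundaries, one more entry than colors.
    colors: (r, g, b) tuples, one per class, in the same order.
    """
    if len(colors) != len(breaks) - 1:
        raise ValueError("need exactly one color per interval between breaks")
    return "\n".join(
        class_block(item, lo, hi, rgb)
        for lo, hi, rgb in zip(breaks, breaks[1:], colors)
    )


# Hypothetical table: two depth classes and one land class.
breaks = [-20, -10, 0, 10]
colors = [(0, 0, 128), (0, 0, 255), (0, 160, 0)]
print(class_blocks("ELEVATION", breaks, colors))
```

The output can be pasted into the LAYER in the .map file, or the script can be rerun whenever the color table changes.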

I'll admit that the other half of our outfit is ESRI-centric.  I've asked them how these issues are resolved w/i their suite of applications.  I'm eager to learn.

As always, thanks for your amazingly free and intuitive advice.  I will work to return the favor(s).

Charlton

 
 
 
Charlton Purvis
(803) 777-4453 : voice
(803) 777-8833 : fax
cpurvis at sc.edu
 
Advanced Solutions Group
Department of Physics and Astronomy
University of South Carolina
Columbia, SC 29208

_______________________________________________
Mapserver-users mailing list
Mapserver-users at lists.gis.umn.edu
http://lists.gis.umn.edu/mailman/listinfo/mapserver-users