data proliferation or the data that ate the disk space
Stephen Woodbridge
woodbri at SWOODBRIDGE.COM
Sat Mar 25 06:16:23 PST 2006
Richard,
Data management is a common problem. The best practices for me have been
to separate physical storage and logical storage. This is easiest to do
on Linux systems with symbolic links. For physical storage, I like to
keep datasets self-contained, especially if I have to update them at any
frequency. Because these are self-contained (i.e., in a single directory
tree), it is easy to create a parallel tree with new data and just swap
out the old data for the new data by changing the symlink to the new
data. This also allows any data tree to reside on any partition.
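
The swap described above could be sketched roughly as follows, assuming Python 3 on a POSIX system. The function name swap_symlink and the temporary ".tmp-" link name are hypothetical, chosen only for illustration. The idea is to build the new link under a temporary name and rename it over the old one, so readers of the link never see it missing.

```python
import os


def swap_symlink(link_path, new_target):
    """Point link_path at new_target, replacing any existing symlink."""
    link_dir = os.path.dirname(os.path.abspath(link_path))
    # Build the new link under a temporary name in the same directory
    # (the ".tmp-" prefix is an arbitrary, hypothetical choice).
    tmp_link = os.path.join(link_dir, ".tmp-" + os.path.basename(link_path))
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    # Rename the temporary link over the old one. rename() acts on the
    # link itself, not on the directory it points to.
    os.replace(tmp_link, link_path)
```

For example, swap_symlink("/u/application/tiger", "/u2/data/tiger2005fe") would repoint the logical name at the newer release while leaving both physical trees untouched.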
For logical storage, I think in terms of maps or applications and I
build a single directory for each. Into this directory, I link in the
physical datasets I need and I create all the tileindexes relative to
that directory. Then in the mapfile I set SHAPEPATH to point to that
directory. So for example, I have tiger data directories for the
separate tiger releases with physical names like:
/u/data/tiger2004fe/
/u/data/tiger2004se/
/u2/data/tiger2005fe/
In my application directory I have something like:
/u/application/tiger -> /u2/data/tiger2005fe/
I refer to the tiger data as "tiger" regardless of the version I am showing.
That way I can change the underlying data without the application caring
and I don't need to rebuild the tileindexes.
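
As a rough illustration of the logical layout, here is a minimal Python 3 sketch that builds an application directory and links each logical name to a physical dataset. The function name build_app_dir and the dictionary format are assumptions made for this example, not part of MapServer.

```python
import os


def build_app_dir(app_dir, datasets):
    """Create app_dir and link each logical name to its physical dataset.

    datasets maps a logical name (e.g. "tiger") to a physical path.
    """
    os.makedirs(app_dir, exist_ok=True)
    for name, target in datasets.items():
        link = os.path.join(app_dir, name)
        # Replace a stale link, but never touch a real file or directory.
        if os.path.islink(link):
            os.remove(link)
        os.symlink(target, link)
```

A call such as build_app_dir("/u/application", {"tiger": "/u2/data/tiger2005fe"}) would give the layout shown above. Tileindexes built relative to the application directory only ever see the name "tiger", which is why they survive a change of release.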
If I want to move the application to another server, I move the physical
datasets I need and the application directory and fix up the symlinks to
point to the respective new locations. 99% of the time I do not need
to rebuild the tileindexes.
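
The symlink fix-up after a move could look something like the sketch below, assuming Python 3 and that the physical data moved from one path prefix to another. The function name repoint_links and the example prefixes are hypothetical.

```python
import os


def repoint_links(app_dir, old_prefix, new_prefix):
    """Rewrite symlinks in app_dir whose targets start with old_prefix.

    Returns a list of (link name, new target) pairs for the links changed.
    """
    changed = []
    for name in sorted(os.listdir(app_dir)):
        link = os.path.join(app_dir, name)
        if not os.path.islink(link):
            continue  # leave real files and directories alone
        target = os.readlink(link)
        if target.startswith(old_prefix):
            new_target = new_prefix + target[len(old_prefix):]
            os.remove(link)
            os.symlink(new_target, link)
            changed.append((name, new_target))
    return changed
```

For instance, repoint_links("/u/application", "/u2/data", "/srv/data") would repoint every link that used to live under /u2/data, where /srv/data stands in for wherever the data landed on the new server. The link names inside the application directory stay the same, so the tileindexes remain valid.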
Hope this helps,
-Steve W.
Richard Taylor wrote:
> Hello LIST
>
> this is not just a MapServer question, but perhaps some of you farther
> down the path have insights that you are willing to pass on.
>
> As my learning curve progresses I find that local data volume is
> increasing rapidly. It started of course with local apps, then expanded
> with my introduction to MapServer, in my case ms4w, for getting the
> basics, then has continued on to local directories to send up to remote
> unix system instances.
>
> While the mapfiles allow one to give a full path to your data, meaning
> locally you can get at it wherever it is, that structure does not hold
> up well, or at all, with remote instances. The end result is multiple
> copies of many files, some of which are quite large: one for local apps,
> one for ms4w, and one for each remote mapserver.
>
> One solution is to keep getting larger storage space, but feeling this
> might be a common problem, I wonder if any of the long-term users or those
> with large data volumes have come to a 'best practises' solution to this
> issue.
>
> thanks in advance
>
> richard taylor
>