ATTN: mirror sites using wget
    Justin Hickey 
    jhickey at impact1.hpcc.nectec.or.th
       
    Fri Nov 27 05:13:13 EST 1998
    
    
  
Hello all,
If you maintain a mirror of Markus's web site using wget, you may run into
the following problem.
I recently found that wget was downloading extra files from Markus's site that
overwrote some of the pages of the grass site. My wget command was as follows:
    wget -b -m -np --cut-dirs=1 -P /www/hpcc/grass -o mirrorLog \
         http://www.geog.uni-hannover.de/grass/
As far as I know, this is supposed to download only files under
http://www.geog.uni-hannover.de/grass/. However, wget was also downloading
files from http://www.geog.uni-hannover.de/grasslinks/ and
http://www.geog.uni-hannover.de/grasslinksB/. As a result, the
grasslinks/index.html file overwrote the grass/index.html file, wiping out the
main home page for the grass site. It may have overwritten other files as well,
if they had the same name.
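
The overwrite is easier to see with a small sketch of what --cut-dirs=1 does to
the saved path. This is only an illustration, not wget's actual code: cut_dirs
is a hypothetical helper, and it ignores the host directory and the -P prefix.
It shows that once the first directory component is dropped, index.html from
grass/, grasslinks/ and grasslinksB/ all map to the same local name.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): drop the first N directory
# components from a relative URL path, roughly what --cut-dirs=N does
# to the path a file is saved under.
cut_dirs() {
    n=$1
    path=$2
    while [ "$n" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;   # remove up to and including the first "/"
            *)   break ;;             # only the file name is left
        esac
        n=$((n - 1))
    done
    printf '%s\n' "$path"
}

# All three remote paths collapse to the same local file name.
cut_dirs 1 grass/index.html
cut_dirs 1 grasslinks/index.html
cut_dirs 1 grasslinksB/index.html
```

Each call prints index.html, so whichever file is fetched last is the one that
stays on disk.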
I don't know why wget does this, but I have mailed the wget mailing list to see
what they say. If you find your mirror site has the same problem, a fix is to
use the -X option to exclude the "offending" directories, like so:
    wget -b -m -np -X /grasslinks/,/grasslinksB/ --cut-dirs=1 \
         -P /www/hpcc/grass -o mirrorLog \
         http://www.geog.uni-hannover.de/grass/
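
To check whether a mirror is affected, one option is to scan the log written by
-o for URLs outside /grass/. The sketch below is a rough aid under some
assumptions: stray_urls is a hypothetical helper, the log is assumed to contain
each fetched URL in full, and grep is assumed to support the -o option (GNU and
BSD grep do, but it is not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helper: read a wget log on stdin and print each URL on the
# mirrored host that is NOT under /grass/ (for example /grasslinks/...).
stray_urls() {
    grep -o 'http://www\.geog\.uni-hannover\.de/[^ ]*' |
        grep -v '^http://www\.geog\.uni-hannover\.de/grass/' |
        sort -u
}

# Only run against the log if it exists in the current directory.
if [ -f mirrorLog ]; then
    stray_urls < mirrorLog
fi
```

If this prints nothing, the log shows no fetches outside /grass/. Any
directories it does list are candidates for the -X option.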
Just thought I'd let you know.
-- 
Sincerely,
Jazzman (a.k.a. Justin Hickey)  e-mail: jhickey at hpcc.nectec.or.th
High Performance Computing Center
National Electronics and Computer Technology Center (NECTEC)
Bangkok, Thailand
==================================================================
People who think they know everything are very irritating to those
of us who do.  ---Anonymous
Jazz and Trek Rule!!!
==================================================================
    
    