Mirror site with wget

Venkatesh Raghavan raghavan at media.osaka-cu.ac.jp
Wed Sep 9 01:52:21 EDT 1998


Hi,
I would like to confirm the changes needed in the script that automatically
mirrors Markus's GRASS homepage. I have been having some trouble
with the mirroring script since the URL changed.

Will the following change to Justin's script work for mirroring the
site? I have changed --cut-dirs to 1 and substituted the
new URL, as follows:

wget -b -m -np --cut-dirs=1 -P /www/grass -o /home/webmaster/mirrorLog/httpLog0 \
    http://www.geog.uni-hannover.de/grass/

However, I am having some problems with the mirroring.
Could someone (Justin, maybe?) please confirm whether the
changes to the script are okay? It seems that I will have to redo
the entire mirror once again.
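
For reference, here is how I arrived at the --cut-dirs value. This is only a
sketch: count_dirs is a hypothetical helper of my own, not part of wget or of
Justin's script, and it assumes the URL names a directory and ends in a slash.
It counts the directory components in the URL path, which is the number that
--cut-dirs has to strip for the files to land directly under the -P directory.

```shell
#!/bin/sh
# Hypothetical helper: print the number of directory components in a URL path.
# Assumes the URL ends with "/" (i.e. it names a directory, not a file).
count_dirs() {
    path=${1#*://}                 # drop the scheme (http://, ftp://)
    case $path in
        */*) path=${path#*/} ;;    # drop the host name
        *)   path= ;;              # no path at all
    esac
    path=${path%/}                 # drop the trailing slash
    # Count the slash-separated components; an empty path gives 0.
    printf '%s\n' "$path" | awk -F/ '{ print NF }'
}

count_dirs http://www.geog.uni-hannover.de/grass/               # new URL
count_dirs http://www.laum.uni-hannover.de/iln/grass/grass42/   # old URL
```

For the new URL this gives 1 and for the old one 3, which agrees with the
values used in the two wget commands.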

Thanks in advance.

Venkatesh Raghavan
Osaka City University
raghavan at media.osaka-cu.ac.jp




Justin Hickey wrote:

> Hello all
>
> I set up a mirror site of Markus's http and ftp sites using wget. After
> compiling wget, this is what I did:
>
> 1. Defined directories under my http server and my ftp server to hold the grass
> data
>
> 2. Made changes (set proxies etc.) to the global wgetrc file (default is
> /usr/local/etc/wgetrc). The only change worth noting is that I added the
> following line so that the host name of the URLs would be dropped when saving
> them
>
>         add_hostdir = off
>
> Otherwise, a directory is created with the name of the host machine (eg
> www.laum.uni-hannover.de) as the root of your mirror site.
>
> 3. Wrote a shell script to run the wget commands (I plan to use this script as
> a cron job) shown below
>
> ------------------------------ begin script ----------------------------------
>
> #! /bin/sh
>
> # Rotate the logs
> mv /home/webmaster/mirrorLog/httpLog3 /home/webmaster/mirrorLog/httpLog4
> mv /home/webmaster/mirrorLog/httpLog2 /home/webmaster/mirrorLog/httpLog3
> mv /home/webmaster/mirrorLog/httpLog1 /home/webmaster/mirrorLog/httpLog2
> mv /home/webmaster/mirrorLog/httpLog0 /home/webmaster/mirrorLog/httpLog1
> mv /home/webmaster/mirrorLog/ftpLog3 /home/webmaster/mirrorLog/ftpLog4
> mv /home/webmaster/mirrorLog/ftpLog2 /home/webmaster/mirrorLog/ftpLog3
> mv /home/webmaster/mirrorLog/ftpLog1 /home/webmaster/mirrorLog/ftpLog2
> mv /home/webmaster/mirrorLog/ftpLog0 /home/webmaster/mirrorLog/ftpLog1
>
> # Get the grass html pages
> wget -b -m -np --cut-dirs=3 -P /www/grass -o /home/webmaster/mirrorLog/httpLog0
> http://www.laum.uni-hannover.de/iln/grass/grass42/
>
> # Get the grass ftp pages
> wget -b -m -np --cut-dirs=2 -P /ftp/grass -o /home/webmaster/mirrorLog/ftpLog0
> ftp://130.75.72.14/pub/grass421/
>
> ----------------------------------- end script ------------------------------
>
> Notes:
>
> The above keeps a rotation of 5 logs each for the http and ftp wget output
>
> The wget commands should all be on one line (of course)
>
> Explanation of the options:
>
>         -b              run in the background
>         -m              use the mirror options
>         -np             no parent files - only download files that are under
>                         the given URL, even if there are links to files in
>                         other directories (eg without -np, if there is a
>                         link to the website's top page you will download the
>                         whole site instead of just the grass files)
>         --cut-dirs=n    remove n directories from the path of the URL (eg http
>                         URL is www.laum.uni-hannover.de/iln/grass/grass42/ if n
>                         equals 3 then iln/grass/grass42 is removed from the
>                         URL. Otherwise the mirror site will have
>                         iln/grass/grass42 as its root)
>         -P <dest>       path to the mirror site destination
>         -o <log>        specify the log file
>
> I hope this is of help to anyone who is setting up a mirror site.
>
> --
> Sincerely,
>
> Jazzman (a.k.a. Justin Hickey)  e-mail: jhickey at hpcc.nectec.or.th
> High Performance Computing Center
> National Electronics and Computer Technology Center (NECTEC)
> Bangkok, Thailand
> ==================================================================
> People who think they know everything are very irritating to those
> of us who do.  ---Anonymous
>
> Jazz and Trek Rule!!!
> ==================================================================
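
As an aside on the log rotation in the quoted script: the eight mv lines could
probably be folded into a small loop. The following is a minimal sketch under
the same five-log scheme. The name rotate_logs is hypothetical and not part of
Justin's script, and the test for an existing file is my own addition so that
logs missing on the first few runs are skipped quietly.

```shell
#!/bin/sh
# Hypothetical helper: shift <dir>/<prefix>0 .. <prefix>3 up by one,
# so that <prefix>0 is free for the next wget run (five logs kept in total).
rotate_logs() {
    dir=$1
    prefix=$2
    i=3
    while [ "$i" -ge 0 ]; do
        next=$((i + 1))
        # Skip logs that do not exist yet (e.g. on the first few runs).
        if [ -f "$dir/$prefix$i" ]; then
            mv -f "$dir/$prefix$i" "$dir/$prefix$next"
        fi
        i=$((i - 1))
    done
}

# Intended use, with the paths from the script above:
#   rotate_logs /home/webmaster/mirrorLog httpLog
#   rotate_logs /home/webmaster/mirrorLog ftpLog
```

The loop runs from the oldest log to the newest, the same order as the mv
lines, so no log is overwritten before it has been moved.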

