
You can create a local copy of a website with the *NIX wget command like this:

wget -k -K -E -r -l 10 -p -N -F --restrict-file-names=windows -nH http://website.com/

The options in more detail:

  • -nH (--no-host-directories) do not create host-prefixed directories
  • -F (--force-html) treat input read from a file as HTML
  • -N (--timestamping) only retrieve files that are newer than the local copy
  • -l (--level=depth) maximum level of recursion
  • -k (--convert-links) rewrite links in the downloaded documents so they work locally
  • -K (--backup-converted) when converting a file, back up the original version with a '.orig' suffix; affects the behavior of -N
  • -E (--adjust-extension) adjust the file extension, e.g. save HTML pages with an .html ending
  • -r (--recursive) turn on recursive retrieval
  • -P (--directory-prefix=prefix) set the directory prefix, i.e. the directory where all other files and subdirectories will be saved to (the top of the retrieval tree); the default is '.' (the current directory). Used in the example after this list.
  • -p (--page-requisites) download all the files that are necessary to properly display a given HTML page, such as inlined images, sounds and referenced stylesheets
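
For reference, here is the same call spelled out with the long option names and a -P target directory; the directory name website-mirror and the URL are just placeholders, so adjust them to your setup:

wget --convert-links --backup-converted --adjust-extension --recursive --level=10 --page-requisites --timestamping --force-html --restrict-file-names=windows --no-host-directories --directory-prefix=website-mirror http://website.com/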

Sometimes you find yourself in a situation where you need to copy a website locally, e.g. for development work on a customer's site. In that case the wget command is hopefully at your fingertips: with the command above you can simply copy the website to your local machine.

If you are working on Windows, I recommend using the --restrict-file-names=windows option. Otherwise you can run into problems with file names that contain characters Windows does not allow, for example '?' from query strings or ':' from host:port URLs.
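
To illustrate roughly what the option does (the URL below is just an example): a page fetched from a dynamic URL such as http://website.com/page?id=1 would normally be stored under a local name containing the '?', which Windows rejects; with --restrict-file-names=windows, wget escapes such characters so the page is saved under a Windows-safe name instead:

wget --restrict-file-names=windows -E -p 'http://website.com/page?id=1'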