You can download the latest version of this document at the official URL http://www.gnu.org/software/wget/faq.html.
The maintainer of this document is Mauro Tortonesi (mauro at ferrara dot linux dot it). You are encouraged to email him with concerns regarding this FAQ. The original author of the GNU Wget FAQ is James Bouressa (james at bouressa dot com).
GNU Wget is a network utility to retrieve files from the World Wide Web using HTTP and FTP, the two most widely used Internet protocols. It works non-interactively, so it can keep running in the background after you have logged off. The program supports recursive retrieval of web-authoring pages as well as FTP sites -- you can use Wget to make mirrors of archives and home pages or to travel the Web like a WWW robot.
You can find the official Wget homepage at this URL:
There are also two other homepages related to Wget:
You can:
Source Tarball:
The main mailing list for end users is wget@sunsite.dk. You can subscribe by sending an email with a message body of "subscribe" to wget-subscribe@sunsite.dk. If you wish to post to the list, please be sure to include the complete output of the failing command when run with the -d flag. This greatly improves the likelihood and quality of responses.
You can view the mailing list archives at http://www.mail-archive.com/wget%40sunsite.dk/
Info about other mailing lists can be found on the GNU Wget home page.
On most UNIX-like operating systems, this will work:
$ gzip -dc wget-1.10.tar.gz | tar -xvf -
$ cd wget-1.10
$ ./configure
$ make
# make install
If it doesn't, be sure to look at the README and INSTALL files that came with your distribution. You can also run configure with the "--help" flag to get more options.
Try using:
wget -erobots=off http://your.site.here
Yes, starting from version 1.10, GNU Wget supports files larger than 2GB.
Yes. You can load your Mozilla/Firefox cookie file using the --load-cookies option. Wget will accept cookies by default and save them if you specify --save-cookies. Also see the --keep-session-cookies option, which forces saving of session cookies to disk. See the documentation for details.
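As a rough sketch of how these options fit together: the file that --load-cookies reads is plain text in the Netscape cookies.txt layout, one cookie per line with seven tab-separated fields. The helper names cookie_line and fetch_with_cookies below are made up for illustration and are not part of Wget.

```shell
# Print one cookie in the Netscape cookies.txt layout. The seven fields are:
# domain, subdomain flag, path, secure flag, expiry (Unix time), name, value.
# cookie_line is a hypothetical helper, not something Wget provides.
cookie_line() {
    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

# Load cookies from a file, fetch a URL, and write the updated cookies
# (including session cookies) back to the same file. Also hypothetical.
fetch_with_cookies() {
    # $1 = cookie file, $2 = URL
    wget --load-cookies "$1" --save-cookies "$1" --keep-session-cookies "$2"
}
```

For example, `cookie_line example.com FALSE / FALSE 2147483647 session_id abc123 > cookies.txt` followed by `fetch_with_cookies cookies.txt http://example.com/members/` would send that cookie with the request; the domain, cookie name, and value here are placeholders.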
Try putting single or double quotes around the URL:
wget 'http://my.funny/$url&with%characters special;to|my#operating<system'
Or try substituting the funny character with a percent sign (%) followed by the character's ASCII code in hexadecimal. So this URL:
wget 'http://my.funny/$url&with characters special;to|my#operating<system'
becomes:
wget 'http://my.funny/%24url%26with%20characters%20special%3Bto%7Cmy%23operating%3Csystem'
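The substitution can also be done mechanically. The following is a minimal sketch in POSIX shell, assuming ASCII input only; urlencode is a hypothetical helper name, not a Wget feature. It is meant for the part of the URL after the host name, since it would also encode the ":" and "/" of "http://".

```shell
# Percent-encode every character except letters, digits, and . _ ~ -
# urlencode is a hypothetical helper; it assumes plain ASCII input.
# The function body is a subshell, so its variables do not leak out.
urlencode() (
    LC_ALL=C
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            # "'$c" makes printf use the character's numeric code
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)
```

For instance, `urlencode 'a b&c'` prints `a%20b%26c`, which can then be appended to the host part of the URL passed to Wget.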
The server admin may be specifically denying the Wget user agent.
Try changing the identification string to something else:
wget -m -U "Mozilla/5.0 (compatible; Konqueror/3.2; Linux)" http://some.web.site
Wget is an HTTP/1.0 client. However, because the HTTP/1.1 protocol was designed to fully support HTTP/1.0 clients, Wget interoperates with any HTTP/1.1-compliant server.
In addition, Wget supports several features introduced by HTTP/1.1 and used by many web servers, such as keep-alive connections and the Host header.
Wget doesn't feature JavaScript support and is not capable of performing recursive retrieval of URLs included in JavaScript code.
In fact, it is impossible to extract URLs from JavaScript by merely parsing it. Web clients need to actually execute it, and Wget can't do that since it's not a GUI browser.
However, some heuristics could be applied to JavaScript source code to extract URLs from certain often-encountered JavaScript patterns, such as rollover image changes. This feature will probably be added in a later version of Wget.
I don't think there is a portable and reliable way to prevent this. You can work around it, though. Put your URLs with passwords into a file and invoke Wget with `wget -i FILE'. Or use `wget -i -' and type the URL followed by Ctrl-D. You may also be able to put this info in wgetrc.
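The `wget -i -' workaround can also be scripted. The sketch below assumes the goal is to keep the password out of Wget's command-line arguments; the helper names url_with_login and fetch_private, and the idea of keeping the password in a separate file, are illustrative assumptions rather than anything Wget prescribes. In most shells printf is a builtin, so the URL goes down the pipe without becoming an argument of an external command.

```shell
# Assemble a URL with embedded credentials from separate pieces.
# url_with_login is a hypothetical helper, not part of Wget.
url_with_login() {
    # $1 = scheme, $2 = user, $3 = password, $4 = host and path
    printf '%s://%s:%s@%s\n' "$1" "$2" "$3" "$4"
}

# Read the password from a file and hand the resulting URL to Wget on
# standard input instead of on its command line. Also hypothetical.
fetch_private() {
    # $1 = user, $2 = file holding the password, $3 = host and path
    url_with_login ftp "$1" "$(cat "$2")" "$3" | wget -i -
}
```

A call might look like `fetch_private alice ~/.ftp-pass ftp.example.com/pub/file.tar.gz`, where the user, file, and host are placeholders. A password containing characters such as "@" or "/" would need to be percent-encoded first.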
Later versions of Wget support posting of forms. Try:
wget --post-data="login=user&password=pw" http://www.yourclient.com/somepage.html
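When a form has several fields, the request body can be assembled from name=value pairs. This is a small sketch only: form_body and post_form are hypothetical helper names, and the pairs are assumed to be percent-encoded already where needed.

```shell
# Join name=value pairs with '&' to form a request body.
# form_body is a hypothetical helper; the subshell body keeps the IFS
# change from affecting the calling shell.
form_body() (
    IFS='&'
    printf '%s\n' "$*"
)

# POST the given pairs to a URL with Wget. post_form is hypothetical too.
post_form() {
    # $1 = URL, remaining arguments = name=value pairs
    url=$1
    shift
    wget --post-data="$(form_body "$@")" "$url"
}
```

With these definitions, `post_form http://www.yourclient.com/somepage.html login=user password=pw` issues the same request as the command above.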
Copyright (C) 2001, 2003, 2004 Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111, USA
Verbatim copying and distribution of this entire article is
permitted worldwide without royalty in any medium provided
this notice is preserved.
Updated: $Date: 2005/07/01 12:22:49 $ $Author: hniksic $