2.11 Recursive Accept/Reject Options

-A acclist --accept acclist
-R rejlist --reject rejlist
Specify comma-separated lists of file name suffixes or patterns to accept or reject (see Types of Files for more details).
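As a minimal sketch of how the two lists combine, the wrapper below keeps only image files and drops any whose name matches a wildcard pattern. The function name fetch_images, the suffix list, the reject pattern, and the host in the usage note are all assumptions made for illustration.

```shell
# Hypothetical helper: recursively fetch only images from the URL in $1.
# The suffix list and the reject pattern are illustrative assumptions.
fetch_images() {
    # -A keeps files whose names end in one of these suffixes;
    # -R drops names matching the wildcard pattern.
    # Both lists are quoted so the shell does not expand them.
    wget -r -A 'jpg,jpeg,png' -R '*thumb*' "$1"
}
```

It could be called as fetch_images http://www.example.com/gallery/, where the host name is a placeholder.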
-D domain-list
--domains=domain-list
Set domains to be followed. domain-list is a comma-separated list of domains. Note that it does not turn on -H.
--exclude-domains domain-list
Specify the domains that are not to be followed (see Spanning Hosts).
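Because -D does not turn on -H by itself, the two are normally given together. The sketch below assumes a hypothetical wrapper named fetch_within_domain and the placeholder domains example.com and ads.example.com.

```shell
# Hypothetical helper: follow links across hosts, but only inside example.com,
# and never into ads.example.com. Both domain names are placeholders.
fetch_within_domain() {
    # -H is given explicitly, because -D alone does not enable host spanning.
    wget -r -H -D example.com --exclude-domains ads.example.com "$1"
}
```

A call such as fetch_within_domain http://www.example.com/ would then be limited to hosts under example.com other than the excluded one.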


--follow-ftp
Follow FTP links from HTML documents. Without this option, Wget will ignore all the FTP links.


--follow-tags=list
Wget has an internal table of HTML tag/attribute pairs that it considers when looking for linked documents during a recursive retrieval. To have only a subset of those tags considered, specify them in a comma-separated list with this option.
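For illustration only, the sketch below restricts recursion to links found in a and area tags, so that other tags are not considered. The function name follow_hyperlinks_only is hypothetical.

```shell
# Hypothetical helper: recurse from the URL in $1, considering only
# the <a> and <area> tags when looking for linked documents.
follow_hyperlinks_only() {
    wget -r --follow-tags=a,area "$1"
}
```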
--ignore-tags=list
This is the opposite of the --follow-tags option. To skip certain HTML tags when recursively looking for documents to download, specify them in a comma-separated list.

In the past, this option was the best bet for downloading a single page and its requisites, using a command-line like:

          wget --ignore-tags=a,area -H -k -K -r http://site/document
     

However, the author of this option came across a page with tags like <LINK REL="home" HREF="/"> and came to the realization that specifying tags to ignore was not enough. One can't just tell Wget to ignore <LINK>, because then stylesheets will not be downloaded. Now the best bet for downloading a single page and its requisites is the dedicated --page-requisites option.
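A sketch of the newer approach, keeping the same -H, -k and -K options as the older command line but replacing --ignore-tags and -r with --page-requisites, might look like this. The wrapper name fetch_single_page is an assumption.

```shell
# Hypothetical helper: fetch one page plus the files it needs to display.
fetch_single_page() {
    # --page-requisites takes the place of --ignore-tags=a,area -r;
    # -H allows requisites hosted elsewhere, -k converts links for local
    # viewing, and -K keeps a backup of each file before conversion.
    wget --page-requisites -H -k -K "$1"
}
```

It would be invoked as fetch_single_page http://site/document, using the same placeholder URL as above.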

-H
--span-hosts
Enable spanning across hosts when doing recursive retrieving (see Spanning Hosts).
-L
--relative
Follow relative links only. Useful for retrieving a specific home page without any distractions, not even those from the same hosts (see Relative Links).
-I list
--include-directories=list
Specify a comma-separated list of directories you wish to follow when downloading (see Directory-Based Limits for more details). Elements of list may contain wildcards.
-X list
--exclude-directories=list
Specify a comma-separated list of directories you wish to exclude from download (see Directory-Based Limits for more details). Elements of list may contain wildcards.
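The two directory lists can be combined. In the sketch below, the directory names /docs, /manual* and /docs/old, as well as the function name fetch_docs_only, are placeholders chosen for illustration.

```shell
# Hypothetical helper: recurse only into /docs and directories matching
# /manual*, while skipping /docs/old. All directory names are placeholders.
fetch_docs_only() {
    # The lists are quoted so the shell passes the wildcard to Wget unchanged.
    wget -r -I '/docs,/manual*' -X '/docs/old' "$1"
}
```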
-np
--no-parent
Do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded (see Directory-Based Limits for more details).
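As a brief sketch, the wrapper below (the name fetch_subtree is hypothetical) retrieves a directory and everything beneath it without climbing to its parent.

```shell
# Hypothetical helper: recursively fetch the hierarchy below the URL in $1
# without ascending to its parent directory.
fetch_subtree() {
    wget -r -np "$1"
}
```

For example, fetch_subtree http://www.example.com/~user/docs/ would stay under /~user/docs/; the host and path are placeholders.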