WGET(1) GNU Wget WGET(1) NAME Wget - The non-interactive network downloader. SYNOPSIS wget [option]... [URL]... DESCRIPTION GNU Wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. Wget is non-interactive, meaning that it can work in the background, while the user is not logged on. This allows you to start a retrieval and disconnect from the system, letting Wget finish the work. By contrast, most Web browsers require the user's constant presence, which can be a great hindrance when transferring a lot of data. Wget can follow links in HTML, XHTML, and CSS pages, to create local versions of remote web sites, fully recreating the directory structure of the original site. This is sometimes referred to as "recursive downloading." While doing that, Wget respects the Robot Exclusion Standard (/robots.txt). Wget can be instructed to convert the links in downloaded files to point at the local files, for offline viewing. Wget has been designed for robustness over slow or unstable network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. If the server supports regetting, it will instruct the server to continue the download from where it left off. OPTIONS Option Syntax Since Wget uses GNU getopt to process command-line arguments, every option has a long form along with the short one. Long options are more convenient to remember, but take time to type. You may freely mix different option styles, or specify options after the command-line arguments. Thus you may write: wget -r --tries=10 http://fly.srk.fer.hr/ -o log The space between the option accepting an argument and the argument may be omitted. Instead of -o log you can write -olog. You may put several options that do not require arguments together, like: wget -drc This is completely equivalent to: wget -d -r -c Since the options can be specified after the arguments, you may terminate them with --. So the following will try to download URL -x, reporting failure to log: wget -o log -- -x The options that accept comma-separated lists all respect the convention that specifying an empty list clears its value. This can be useful to clear the .wgetrc settings. For instance, if your .wgetrc sets "exclude_directories" to /cgi-bin, the following example will first reset it, and then set it to exclude /~nobody and /~somebody. You can also clear the lists in .wgetrc. wget -X "" -X /~nobody,/~somebody Most options that do not accept arguments are boolean options, so named because their state can be captured with a yes-or-no ("boolean") variable. For example, --follow-ftp tells Wget to follow FTP links from HTML files and, on the other hand, --no-glob tells it not to perform file globbing on FTP URLs. A boolean option is either affirmative or negative (beginning with --no). All such options share several properties. Unless stated otherwise, it is assumed that the default behavior is the opposite of what the option accomplishes. For example, the documented existence of --follow-ftp assumes that the default is to not follow FTP links from HTML pages. Affirmative options can be negated by prepending --no- to the option name; negative options can be negated by omitting the --no- prefix. This might seem superfluous---if the default for an affirmative option is to not do something, then why provide a way to explicitly turn it off? But the startup file may in fact change the default.
For instance, using "follow_ftp = on" in .wgetrc makes Wget follow FTP links by default, and using --no-follow-ftp is the only way to restore the factory default from the command line. Basic Startup Options -V --version Display the version of Wget. -h --help Print a help message describing all of Wget's command-line options. -b --background Go to background immediately after startup. If no output file is specified via the -o option, output is redirected to wget-log. -e command --execute command Execute command as if it were a part of .wgetrc. A command thus invoked will be executed after the commands in .wgetrc, thus taking precedence over them. If you need to specify more than one wgetrc command, use multiple instances of -e. Logging and Input File Options -o logfile --output-file=logfile Log all messages to logfile. The messages are normally reported to standard error. -a logfile --append-output=logfile Append to logfile. This is the same as -o, only it appends to logfile instead of overwriting the old log file. If logfile does not exist, a new file is created. -d --debug Turn on debug output, meaning various information important to the developers of Wget if it does not work properly. Your system administrator may have chosen to compile Wget without debug support, in which case -d will not work. Please note that compiling with debug support is always safe---Wget compiled with the debug support will not print any debug info unless requested with -d. -q --quiet Turn off Wget's output. -v --verbose Turn on verbose output, with all the available data. The default output is verbose. -nv --no-verbose Turn off verbose without being completely quiet (use -q for that), which means that error messages and basic information still get printed. --report-speed=type Output bandwidth as type. The only accepted value is bits. -i file --input-file=file Read URLs from a local or external file. If - is specified as file, URLs are read from the standard input. (Use ./- to read from a file literally named -.) If this function is used, no URLs need be present on the command line. If there are URLs both on the command line and in an input file, those on the command line will be the first ones to be retrieved. If --force-html is not specified, then file should consist of a series of URLs, one per line. However, if you specify --force-html, the document will be regarded as html. In that case you may have problems with relative links, which you can solve either by adding "<base href="url">" to the documents or by specifying --base=url on the command line. If the file is an external one, the document will be automatically treated as html if the Content-Type matches text/html. Furthermore, the file's location will be implicitly used as base href if none was specified. -F --force-html When input is read from a file, force it to be treated as an HTML file. This enables you to retrieve relative links from existing HTML files on your local disk, by adding "<base href="url">" to HTML, or using the --base command-line option. -B URL --base=URL Resolves relative links using URL as the point of reference, when reading links from an HTML file specified via the -i/--input-file option (together with --force-html, or when the input file was fetched remotely from a server describing it as HTML). This is equivalent to the presence of a "BASE" tag in the HTML input file, with URL as the value for the "href" attribute.
For instance, if you specify http://foo/bar/a.html for URL, and Wget reads ../baz/b.html from the input file, it would be resolved to http://foo/baz/b.html. --config=FILE Specify the location of a startup file you wish to use. Download Options --bind-address=ADDRESS When making client TCP/IP connections, bind to ADDRESS on the local machine. ADDRESS may be specified as a hostname or IP address. This option can be useful if your machine is bound to multiple IPs. -t number --tries=number Set number of tries to number. Specify 0 or inf for infinite retrying. The default is to retry 20 times, with the exception of fatal errors like "connection refused" or "not found" (404), which are not retried. -O file --output-document=file The documents will not be written to the appropriate files, but all will be concatenated together and written to file. If - is used as file, documents will be printed to standard output, disabling link conversion. (Use ./- to print to a file literally named -.) Use of -O is not intended to mean simply "use the name file instead of the one in the URL;" rather, it is analogous to shell redirection: wget -O file http://foo is intended to work like wget -O - http://foo > file; file will be truncated immediately, and all downloaded content will be written there. For this reason, -N (for timestamp-checking) is not supported in combination with -O: since file is always newly created, it will always have a very new timestamp. A warning will be issued if this combination is used. Similarly, using -r or -p with -O may not work as you expect: Wget won't just download the first file to file and then download the rest to their normal names: all downloaded content will be placed in file. This was disabled in version 1.11, but has been reinstated (with a warning) in 1.11.2, as there are some cases where this behavior can actually have some use. Note that a combination with -k is only permitted when downloading a single document, as in that case it will just convert all relative URIs to external ones; -k makes no sense for multiple URIs when they're all being downloaded to a single file; -k can be used only when the output is a regular file. -nc --no-clobber If a file is downloaded more than once in the same directory, Wget's behavior depends on a few options, including -nc. In certain cases, the local file will be clobbered, or overwritten, upon repeated download. In other cases it will be preserved. When running Wget without -N, -nc, -r, or -p, downloading the same file in the same directory will result in the original copy of file being preserved and the second copy being named file.1. If that file is downloaded yet again, the third copy will be named file.2, and so on. (This is also the behavior with -nd, even if -r or -p are in effect.) When -nc is specified, this behavior is suppressed, and Wget will refuse to download newer copies of file. Therefore, ""no-clobber"" is actually a misnomer in this mode---it's not clobbering that's prevented (as the numeric suffixes were already preventing clobbering), but rather the multiple version saving that's prevented. When running Wget with -r or -p, but without -N, -nd, or -nc, re-downloading a file will result in the new copy simply overwriting the old. Adding -nc will prevent this behavior, instead causing the original version to be preserved and any newer copies on the server to be ignored. 
When running Wget with -N, with or without -r or -p, the decision as to whether or not to download a newer copy of a file depends on the local and remote timestamp and size of the file. -nc may not be specified at the same time as -N. Note that when -nc is specified, files with the suffixes .html or .htm will be loaded from the local disk and parsed as if they had been retrieved from the Web. --backups=backups Before (over)writing a file, back up an existing file by adding a .1 suffix (_1 on VMS) to the file name. Such backup files are rotated to .2, .3, and so on, up to backups (and lost beyond that). -c --continue Continue getting a partially-downloaded file. This is useful when you want to finish up a download started by a previous instance of Wget, or by another program. For instance: wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z If there is a file named ls-lR.Z in the current directory, Wget will assume that it is the first portion of the remote file, and will ask the server to continue the retrieval from an offset equal to the length of the local file. Note that you don't need to specify this option if you just want the current invocation of Wget to retry downloading a file should the connection be lost midway through. This is the default behavior. -c only affects resumption of downloads started prior to this invocation of Wget, and whose local files are still sitting around. Without -c, the previous example would just download the remote file to ls-lR.Z.1, leaving the truncated ls-lR.Z file alone. Beginning with Wget 1.7, if you use -c on a non-empty file, and it turns out that the server does not support continued downloading, Wget will refuse to start the download from scratch, which would effectively ruin existing contents. If you really want the download to start from scratch, remove the file. Also beginning with Wget 1.7, if you use -c on a file which is of equal size as the one on the server, Wget will refuse to download the file and print an explanatory message. The same happens when the file is smaller on the server than locally (presumably because it was changed on the server since your last download attempt)---because "continuing" is not meaningful, no download occurs. On the other side of the coin, while using -c, any file that's bigger on the server than locally will be considered an incomplete download and only "(length(remote) - length(local))" bytes will be downloaded and tacked onto the end of the local file. This behavior can be desirable in certain cases---for instance, you can use wget -c to download just the new portion that's been appended to a data collection or log file. However, if the file is bigger on the server because it's been changed, as opposed to just appended to, you'll end up with a garbled file. Wget has no way of verifying that the local file is really a valid prefix of the remote file. You need to be especially careful of this when using -c in conjunction with -r, since every file will be considered as an "incomplete download" candidate. Another instance where you'll get a garbled file if you try to use -c is if you have a lame HTTP proxy that inserts a "transfer interrupted" string into the local file. In the future a "rollback" option may be added to deal with this case. Note that -c only works with FTP servers and with HTTP servers that support the "Range" header. --start-pos=OFFSET Start downloading at zero-based position OFFSET. Offset may be expressed in bytes, kilobytes with the `k' suffix, or megabytes with the `m' suffix, etc. 
--start-pos takes precedence over --continue. When --start-pos and --continue are both specified, wget will emit a warning and then proceed as if --continue was absent. Server support for continued download is required, otherwise --start-pos cannot help. See -c for details. --progress=type Select the type of the progress indicator you wish to use. Legal indicators are "dot" and "bar". The "bar" indicator is used by default. It draws an ASCII progress bar graphic (a.k.a. "thermometer" display) indicating the status of retrieval. If the output is not a TTY, the "dot" indicator will be used by default. Use --progress=dot to switch to the "dot" display. It traces the retrieval by printing dots on the screen, each dot representing a fixed amount of downloaded data. The progress type can also take one or more parameters. The parameters vary based on the type selected. Parameters to type are passed by appending them to the type separated by a colon (:) like this: --progress=type:parameter1:parameter2. When using the dotted retrieval, you may set the style by specifying the type as dot:style. Different styles assign different meaning to one dot. With the "default" style each dot represents 1K, there are ten dots in a cluster and 50 dots in a line. The "binary" style has a more "computer"-like orientation---8K dots, 16-dots clusters and 48 dots per line (which makes for 384K lines). The "mega" style is suitable for downloading large files---each dot represents 64K retrieved, there are eight dots in a cluster, and 48 dots on each line (so each line contains 3M). If "mega" is not enough then you can use the "giga" style---each dot represents 1M retrieved, there are eight dots in a cluster, and 32 dots on each line (so each line contains 32M). With --progress=bar, there are currently two possible parameters, force and noscroll. When the output is not a TTY, the progress bar always falls back to "dot", even if --progress=bar was passed to Wget during invocation. This behaviour can be overridden and the "bar" output forced by using the "force" parameter as --progress=bar:force. By default, the bar-style progress bar scrolls the name of the file being downloaded from left to right if the filename exceeds the maximum length allotted for its display. In certain cases, such as with --progress=bar:force, one may not want the scrolling filename in the progress bar. By passing the "noscroll" parameter, Wget can be forced to display as much of the filename as possible without scrolling through it. Note that you can set the default style using the "progress" command in .wgetrc. That setting may be overridden from the command line. For example, to force the bar output without scrolling, use --progress=bar:force:noscroll. --show-progress Force wget to display the progress bar in any verbosity. By default, wget only displays the progress bar in verbose mode. One may however want wget to display the progress bar on screen in conjunction with any other verbosity modes like --no-verbose or --quiet. This is often a desired property when invoking wget to download several small/large files. In such a case, wget could simply be invoked with this parameter to get a much cleaner output on the screen. -N --timestamping Turn on time-stamping. --no-use-server-timestamps Don't set the local file's timestamp by the one on the server. By default, when a file is downloaded, its timestamps are set to match those from the remote file. This allows the use of --timestamping on subsequent invocations of wget.
However, it is sometimes useful to base the local file's timestamp on when it was actually downloaded; for that purpose, the --no-use-server-timestamps option has been provided. -S --server-response Print the headers sent by HTTP servers and responses sent by FTP servers. --spider When invoked with this option, Wget will behave as a Web spider, which means that it will not download the pages, just check that they are there. For example, you can use Wget to check your bookmarks: wget --spider --force-html -i bookmarks.html This feature needs much more work for Wget to get close to the functionality of real web spiders. -T seconds --timeout=seconds Set the network timeout to seconds seconds. This is equivalent to specifying --dns-timeout, --connect-timeout, and --read-timeout, all at the same time. When interacting with the network, Wget can check for timeout and abort the operation if it takes too long. This prevents anomalies like hanging reads and infinite connects. The only timeout enabled by default is a 900-second read timeout. Setting a timeout to 0 disables it altogether. Unless you know what you are doing, it is best not to change the default timeout settings. All timeout-related options accept decimal values, as well as subsecond values. For example, 0.1 seconds is a legal (though unwise) choice of timeout. Subsecond timeouts are useful for checking server response times or for testing network latency. --dns-timeout=seconds Set the DNS lookup timeout to seconds seconds. DNS lookups that don't complete within the specified time will fail. By default, there is no timeout on DNS lookups, other than that implemented by system libraries. --connect-timeout=seconds Set the connect timeout to seconds seconds. TCP connections that take longer to establish will be aborted. By default, there is no connect timeout, other than that implemented by system libraries. --read-timeout=seconds Set the read (and write) timeout to seconds seconds. The "time" of this timeout refers to idle time: if, at any point in the download, no data is received for more than the specified number of seconds, reading fails and the download is restarted. This option does not directly affect the duration of the entire download. Of course, the remote server may choose to terminate the connection sooner than this option requires. The default read timeout is 900 seconds. --limit-rate=amount Limit the download speed to amount bytes per second. Amount may be expressed in bytes, kilobytes with the k suffix, or megabytes with the m suffix. For example, --limit-rate=20k will limit the retrieval rate to 20KB/s. This is useful when, for whatever reason, you don't want Wget to consume the entire available bandwidth. This option allows the use of decimal numbers, usually in conjunction with power suffixes; for example, --limit-rate=2.5k is a legal value. Note that Wget implements the limiting by sleeping the appropriate amount of time after a network read that took less time than specified by the rate. Eventually this strategy causes the TCP transfer to slow down to approximately the specified rate. However, it may take some time for this balance to be achieved, so don't be surprised if limiting the rate doesn't work well with very small files. -w seconds --wait=seconds Wait the specified number of seconds between the retrievals. Use of this option is recommended, as it lightens the server load by making the requests less frequent. 
Instead of in seconds, the time can be specified in minutes using the "m" suffix, in hours using "h" suffix, or in days using "d" suffix. Specifying a large value for this option is useful if the network or the destination host is down, so that Wget can wait long enough to reasonably expect the network error to be fixed before the retry. The waiting interval specified by this function is influenced by "--random-wait", which see. --waitretry=seconds If you don't want Wget to wait between every retrieval, but only between retries of failed downloads, you can use this option. Wget will use linear backoff, waiting 1 second after the first failure on a given file, then waiting 2 seconds after the second failure on that file, up to the maximum number of seconds you specify. By default, Wget will assume a value of 10 seconds. --random-wait Some web sites may perform log analysis to identify retrieval programs such as Wget by looking for statistically significant similarities in the time between requests. This option causes the time between requests to vary between 0.5 and 1.5 * wait seconds, where wait was specified using the --wait option, in order to mask Wget's presence from such analysis. A 2001 article in a publication devoted to development on a popular consumer platform provided code to perform this analysis on the fly. Its author suggested blocking at the class C address level to ensure automated retrieval programs were blocked despite changing DHCP-supplied addresses. The --random-wait option was inspired by this ill-advised recommendation to block many unrelated users from a web site due to the actions of one. --no-proxy Don't use proxies, even if the appropriate *_proxy environment variable is defined. -Q quota --quota=quota Specify download quota for automatic retrievals. The value can be specified in bytes (default), kilobytes (with k suffix), or megabytes (with m suffix). Note that quota will never affect downloading a single file. So if you specify wget -Q10k ftp://wuarchive.wustl.edu/ls-lR.gz, all of the ls-lR.gz will be downloaded. The same goes even when several URLs are specified on the command-line. However, quota is respected when retrieving either recursively, or from an input file. Thus you may safely type wget -Q2m -i sites---download will be aborted when the quota is exceeded. Setting quota to 0 or to inf unlimits the download quota. --no-dns-cache Turn off caching of DNS lookups. Normally, Wget remembers the IP addresses it looked up from DNS so it doesn't have to repeatedly contact the DNS server for the same (typically small) set of hosts it retrieves from. This cache exists in memory only; a new Wget run will contact DNS again. However, it has been reported that in some situations it is not desirable to cache host names, even for the duration of a short-running application like Wget. With this option Wget issues a new DNS lookup (more precisely, a new call to "gethostbyname" or "getaddrinfo") each time it makes a new connection. Please note that this option will not affect caching that might be performed by the resolving library or by an external caching layer, such as NSCD. If you don't understand exactly what this option does, you probably won't need it. --restrict-file-names=modes Change which characters found in remote URLs must be escaped during generation of local filenames. Characters that are restricted by this option are escaped, i.e. replaced with %HH, where HH is the hexadecimal number that corresponds to the restricted character. 
This option may also be used to force all alphabetical cases to be either lower- or uppercase. By default, Wget escapes the characters that are not valid or safe as part of file names on your operating system, as well as control characters that are typically unprintable. This option is useful for changing these defaults, perhaps because you are downloading to a non-native partition, or because you want to disable escaping of the control characters, or you want to further restrict characters to only those in the ASCII range of values. The modes are a comma-separated set of text values. The acceptable values are unix, windows, nocontrol, ascii, lowercase, and uppercase. The values unix and windows are mutually exclusive (one will override the other), as are lowercase and uppercase. Those last are special cases, as they do not change the set of characters that would be escaped, but rather force local file paths to be converted either to lower- or uppercase. When "unix" is specified, Wget escapes the character / and the control characters in the ranges 0--31 and 128--159. This is the default on Unix-like operating systems. When "windows" is given, Wget escapes the characters \, |, /, :, ?, ", *, <, >, and the control characters in the ranges 0--31 and 128--159. In addition to this, Wget in Windows mode uses + instead of : to separate host and port in local file names, and uses @ instead of ? to separate the query portion of the file name from the rest. Therefore, a URL that would be saved as www.xemacs.org:4300/search.pl?input=blah in Unix mode would be saved as www.xemacs.org+4300/search.pl@input=blah in Windows mode. This mode is the default on Windows. If you specify nocontrol, then the escaping of the control characters is also switched off. This option may make sense when you are downloading URLs whose names contain UTF-8 characters, on a system which can save and display filenames in UTF-8 (some possible byte values used in UTF-8 byte sequences fall in the range of values designated by Wget as "controls"). The ascii mode is used to specify that any bytes whose values are outside the range of ASCII characters (that is, greater than 127) shall be escaped. This can be useful when saving filenames whose encoding does not match the one used locally. -4 --inet4-only -6 --inet6-only Force connecting to IPv4 or IPv6 addresses. With --inet4-only or -4, Wget will only connect to IPv4 hosts, ignoring AAAA records in DNS, and refusing to connect to IPv6 addresses specified in URLs. Conversely, with --inet6-only or -6, Wget will only connect to IPv6 hosts and ignore A records and IPv4 addresses. Neither options should be needed normally. By default, an IPv6-aware Wget will use the address family specified by the host's DNS record. If the DNS responds with both IPv4 and IPv6 addresses, Wget will try them in sequence until it finds one it can connect to. (Also see "--prefer-family" option described below.) These options can be used to deliberately force the use of IPv4 or IPv6 address families on dual family systems, usually to aid debugging or to deal with broken network configuration. Only one of --inet6-only and --inet4-only may be specified at the same time. Neither option is available in Wget compiled without IPv6 support. --prefer-family=none/IPv4/IPv6 When given a choice of several addresses, connect to the addresses with specified address family first. The address order returned by DNS is used without change by default. 
This avoids spurious errors and connect attempts when accessing hosts that resolve to both IPv6 and IPv4 addresses from IPv4 networks. For example, www.kame.net resolves to 2001:200:0:8002:203:47ff:fea5:3085 and to 203.178.141.194. When the preferred family is "IPv4", the IPv4 address is used first; when the preferred family is "IPv6", the IPv6 address is used first; if the specified value is "none", the address order returned by DNS is used without change. Unlike -4 and -6, this option doesn't inhibit access to any address family; it only changes the order in which the addresses are accessed. Also note that the reordering performed by this option is stable---it doesn't affect the order of addresses of the same family. That is, the relative order of all IPv4 addresses and of all IPv6 addresses remains intact in all cases. --retry-connrefused Consider "connection refused" a transient error and try again. Normally Wget gives up on a URL when it is unable to connect to the site because failure to connect is taken as a sign that the server is not running at all and that retries would not help. This option is for mirroring unreliable sites whose servers tend to disappear for short periods of time. --user=user --password=password Specify the username user and password password for both FTP and HTTP file retrieval. These parameters can be overridden using the --ftp-user and --ftp-password options for FTP connections and the --http-user and --http-password options for HTTP connections. --ask-password Prompt for a password for each connection established. Cannot be specified when --password is being used, because they are mutually exclusive. --no-iri Turn off internationalized URI (IRI) support. Use --iri to turn it on. IRI support is activated by default. You can set the default state of IRI support using the "iri" command in .wgetrc. That setting may be overridden from the command line. --local-encoding=encoding Force Wget to use encoding as the default system encoding. That affects how Wget converts URLs specified as arguments from locale to UTF-8 for IRI support. Wget uses the function "nl_langinfo()" and then the "CHARSET" environment variable to get the locale. If it fails, ASCII is used. You can set the default local encoding using the "local_encoding" command in .wgetrc. That setting may be overridden from the command line. --remote-encoding=encoding Force Wget to use encoding as the default remote server encoding. That affects how Wget converts URIs found in files from remote encoding to UTF-8 during a recursive fetch. This option is only useful for IRI support, for the interpretation of non-ASCII characters. For HTTP, remote encoding can be found in the HTTP "Content-Type" header and in the HTML "Content-Type http-equiv" meta tag. You can set the default encoding using the "remoteencoding" command in .wgetrc. That setting may be overridden from the command line. --unlink Force Wget to unlink the file instead of clobbering the existing file. This option is useful for downloading to a directory with hardlinks. Directory Options -nd --no-directories Do not create a hierarchy of directories when retrieving recursively. With this option turned on, all files will get saved to the current directory, without clobbering (if a name shows up more than once, the filenames will get extensions .n). -x --force-directories The opposite of -nd---create a hierarchy of directories, even if one would not have been created otherwise. E.g.
wget -x http://fly.srk.fer.hr/robots.txt will save the downloaded file to fly.srk.fer.hr/robots.txt. -nH --no-host-directories Disable generation of host-prefixed directories. By default, invoking Wget with -r http://fly.srk.fer.hr/ will create a structure of directories beginning with fly.srk.fer.hr/. This option disables such behavior. --protocol-directories Use the protocol name as a directory component of local file names. For example, with this option, wget -r http://host will save to http/host/... rather than just to host/.... --cut-dirs=number Ignore number directory components. This is useful for getting fine-grained control over the directory where recursive retrieval will be saved. Take, for example, the directory at ftp://ftp.xemacs.org/pub/xemacs/. If you retrieve it with -r, it will be saved locally under ftp.xemacs.org/pub/xemacs/. While the -nH option can remove the ftp.xemacs.org/ part, you are still stuck with pub/xemacs. This is where --cut-dirs comes in handy; it makes Wget not "see" number remote directory components. Here are several examples of how the --cut-dirs option works.
       No options        -> ftp.xemacs.org/pub/xemacs/
       -nH               -> pub/xemacs/
       -nH --cut-dirs=1  -> xemacs/
       -nH --cut-dirs=2  -> .
       --cut-dirs=1      -> ftp.xemacs.org/xemacs/
       ...
If you just want to get rid of the directory structure, this option is similar to a combination of -nd and -P. However, unlike -nd, --cut-dirs does not lose subdirectories---for instance, with -nH --cut-dirs=1, a beta/ subdirectory will be placed in xemacs/beta, as one would expect. -P prefix --directory-prefix=prefix Set directory prefix to prefix. The directory prefix is the directory where all other files and subdirectories will be saved to, i.e. the top of the retrieval tree. The default is . (the current directory). HTTP Options --default-page=name Use name as the default file name when it isn't known (i.e., for URLs that end in a slash), instead of index.html. -E --adjust-extension If a file of type application/xhtml+xml or text/html is downloaded and the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this option will cause the suffix .html to be appended to the local filename. This is useful, for instance, when you're mirroring a remote site that uses .asp pages, but you want the mirrored pages to be viewable on your stock Apache server. Another good use for this is when you're downloading CGI-generated materials. A URL like http://site.com/article.cgi?25 will be saved as article.cgi?25.html. Note that filenames changed in this way will be re-downloaded every time you re-mirror a site, because Wget can't tell that the local X.html file corresponds to remote URL X (since it doesn't yet know that the URL produces output of type text/html or application/xhtml+xml). As of version 1.12, Wget will also ensure that any downloaded files of type text/css end in the suffix .css, and the option was renamed from --html-extension, to better reflect its new behavior. The old option name is still acceptable, but should now be considered deprecated. At some point in the future, this option may well be expanded to include suffixes for other types of content, including content types that are not parsed by Wget. --http-user=user --http-password=password Specify the username user and password password on an HTTP server. According to the type of the challenge, Wget will encode them using either the "basic" (insecure), the "digest", or the Windows "NTLM" authentication scheme. Another way to specify username and password is in the URL itself.
Either method reveals your password to anyone who bothers to run "ps". To prevent the passwords from being seen, store them in .wgetrc or .netrc, and make sure to protect those files from other users with "chmod". If the passwords are really important, do not leave them lying in those files either---edit the files and delete them after Wget has started the download. --no-http-keep-alive Turn off the "keep-alive" feature for HTTP downloads. Normally, Wget asks the server to keep the connection open so that, when you download more than one document from the same server, they get transferred over the same TCP connection. This saves time and at the same time reduces the load on the server. This option is useful when, for some reason, persistent (keep-alive) connections don't work for you, for example due to a server bug or due to the inability of server- side scripts to cope with the connections. --no-cache Disable server-side cache. In this case, Wget will send the remote server an appropriate directive (Pragma: no-cache) to get the file from the remote service, rather than returning the cached version. This is especially useful for retrieving and flushing out-of-date documents on proxy servers. Caching is allowed by default. --no-cookies Disable the use of cookies. Cookies are a mechanism for maintaining server-side state. The server sends the client a cookie using the "Set-Cookie" header, and the client responds with the same cookie upon further requests. Since cookies allow the server owners to keep track of visitors and for sites to exchange this information, some consider them a breach of privacy. The default is to use cookies; however, storing cookies is not on by default. --load-cookies file Load cookies from file before the first HTTP retrieval. file is a textual file in the format originally used by Netscape's cookies.txt file. You will typically use this option when mirroring sites that require that you be logged in to access some or all of their content. The login process typically works by the web server issuing an HTTP cookie upon receiving and verifying your credentials. The cookie is then resent by the browser when accessing that part of the site, and so proves your identity. Mirroring such a site requires Wget to send the same cookies your browser sends when communicating with the site. This is achieved by --load-cookies---simply point Wget to the location of the cookies.txt file, and it will send the same cookies your browser would send in the same situation. Different browsers keep textual cookie files in different locations: "Netscape 4.x." The cookies are in ~/.netscape/cookies.txt. "Mozilla and Netscape 6.x." Mozilla's cookie file is also named cookies.txt, located somewhere under ~/.mozilla, in the directory of your profile. The full path usually ends up looking somewhat like ~/.mozilla/default/some-weird-string/cookies.txt. "Internet Explorer." You can produce a cookie file Wget can use by using the File menu, Import and Export, Export Cookies. This has been tested with Internet Explorer 5; it is not guaranteed to work with earlier versions. "Other browsers." If you are using a different browser to create your cookies, --load-cookies will only work if you can locate or produce a cookie file in the Netscape format that Wget expects. If you cannot use --load-cookies, there might still be an alternative. If your browser supports a "cookie manager", you can use it to view the cookies used when accessing the site you're mirroring. 
Write down the name and value of the cookie, and manually instruct Wget to send those cookies, bypassing the "official" cookie support: wget --no-cookies --header "Cookie: <name>=<value>" --save-cookies file Save cookies to file before exiting. This will not save cookies that have expired or that have no expiry time (so-called "session cookies"), but also see --keep-session-cookies. --keep-session-cookies When specified, causes --save-cookies to also save session cookies. Session cookies are normally not saved because they are meant to be kept in memory and forgotten when you exit the browser. Saving them is useful on sites that require you to log in or to visit the home page before you can access some pages. With this option, multiple Wget runs are considered a single browser session as far as the site is concerned. Since the cookie file format does not normally carry session cookies, Wget marks them with an expiry timestamp of 0. Wget's --load-cookies recognizes those as session cookies, but it might confuse other browsers. Also note that cookies so loaded will be treated as other session cookies, which means that if you want --save-cookies to preserve them again, you must use --keep-session-cookies again. --ignore-length Unfortunately, some HTTP servers (CGI programs, to be more precise) send out bogus "Content-Length" headers, which makes Wget go wild, as it thinks not all the document was retrieved. You can spot this syndrome if Wget retries getting the same document again and again, each time claiming that the (otherwise normal) connection has closed on the very same byte. With this option, Wget will ignore the "Content-Length" header---as if it never existed. --header=header-line Send header-line along with the rest of the headers in each HTTP request. The supplied header is sent as-is, which means it must contain name and value separated by a colon, and must not contain newlines. You may define more than one additional header by specifying --header more than once. wget --header='Accept-Charset: iso-8859-2' \ --header='Accept-Language: hr' \ http://fly.srk.fer.hr/ Specification of an empty string as the header value will clear all previous user-defined headers. As of Wget 1.10, this option can be used to override headers otherwise generated automatically. This example instructs Wget to connect to localhost, but to specify foo.bar in the "Host" header: wget --header="Host: foo.bar" http://localhost/ In versions of Wget prior to 1.10 such use of --header caused sending of duplicate headers. --max-redirect=number Specifies the maximum number of redirections to follow for a resource. The default is 20, which is usually far more than necessary. However, on those occasions where you want to allow more (or fewer), this is the option to use. --proxy-user=user --proxy-password=password Specify the username user and password password for authentication on a proxy server. Wget will encode them using the "basic" authentication scheme. Security considerations similar to those with --http-password pertain here as well. --referer=url Include `Referer: url' header in HTTP request. Useful for retrieving documents with server-side processing that assume they are always being retrieved by interactive web browsers and only come out properly when Referer is set to one of the pages that point to them. --save-headers Save the headers sent by the HTTP server to the file, preceding the actual contents, with an empty line as the separator.
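As a brief illustration of the header-related options above, the following is a sketch only; the host name example.com and the header values are placeholders, not anything defined by this manual. It sends a custom Referer and an extra Accept-Language header, and keeps the server's response headers in the saved file:
       wget --referer='http://example.com/index.html' \
            --header='Accept-Language: en' \
            --save-headers \
            http://example.com/page.html
The saved page.html then begins with the HTTP response headers, followed by an empty line and the document body, as described under --save-headers.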
-U agent-string --user-agent=agent-string Identify as agent-string to the HTTP server. The HTTP protocol allows the clients to identify themselves using a "User-Agent" header field. This enables distinguishing the WWW software, usually for statistical purposes or for tracing of protocol violations. Wget normally identifies as Wget/version, version being the current version number of Wget. However, some sites have been known to impose the policy of tailoring the output according to the "User-Agent"-supplied information. While this is not such a bad idea in theory, it has been abused by servers denying information to clients other than (historically) Netscape or, more frequently, Microsoft Internet Explorer. This option allows you to change the "User-Agent" line issued by Wget. Use of this option is discouraged, unless you really know what you are doing. Specifying empty user agent with --user-agent="" instructs Wget not to send the "User-Agent" header in HTTP requests. --post-data=string --post-file=file Use POST as the method for all HTTP requests and send the specified data in the request body. --post-data sends string as data, whereas --post-file sends the contents of file. Other than that, they work in exactly the same way. In particular, they both expect content of the form "key1=value1&key2=value2", with percent-encoding for special characters; the only difference is that one expects its content as a command-line parameter and the other accepts its content from a file. In particular, --post-file is not for transmitting files as form attachments: those must appear as "key=value" data (with appropriate percent-coding) just like everything else. Wget does not currently support "multipart/form-data" for transmitting POST data; only "application/x-www-form-urlencoded". Only one of --post-data and --post-file should be specified. Please note that wget does not require the content to be of the form "key1=value1&key2=value2", and neither does it test for it. Wget will simply transmit whatever data is provided to it. Most servers however expect the POST data to be in the above format when processing HTML Forms. Please be aware that Wget needs to know the size of the POST data in advance. Therefore the argument to "--post-file" must be a regular file; specifying a FIFO or something like /dev/stdin won't work. It's not quite clear how to work around this limitation inherent in HTTP/1.0. Although HTTP/1.1 introduces chunked transfer that doesn't require knowing the request length in advance, a client can't use chunked unless it knows it's talking to an HTTP/1.1 server. And it can't know that until it receives a response, which in turn requires the request to have been completed -- a chicken-and-egg problem. Note: As of version 1.15 if Wget is redirected after the POST request is completed, its behaviour will depend on the response code returned by the server. In case of a 301 Moved Permanently, 302 Moved Temporarily or 307 Temporary Redirect, Wget will, in accordance with RFC2616, continue to send a POST request. In case a server wants the client to change the Request method upon redirection, it should send a 303 See Other response code. This example shows how to log in to a server using POST and then proceed to download the desired pages, presumably only accessible to authorized users: # Log in to the server. This can be done only once. wget --save-cookies cookies.txt \ --post-data 'user=foo&password=bar' \ http://server.com/auth.php # Now grab the page or pages we care about. 
wget --load-cookies cookies.txt \ -p http://server.com/interesting/article.php If the server is using session cookies to track user authentication, the above will not work because --save-cookies will not save them (and neither will browsers) and the cookies.txt file will be empty. In that case use --keep-session-cookies along with --save-cookies to force saving of session cookies. --method=HTTP-Method For the purpose of RESTful scripting, Wget allows sending of other HTTP Methods without the need to explicitly set them using --header=Header-Line. Wget will use whatever string is passed to it after --method as the HTTP Method to the server. --body-data=Data-String --body-file=Data-File Must be set when additional data needs to be sent to the server along with the Method specified using --method. --body-data sends string as data, whereas --body-file sends the contents of file. Other than that, they work in exactly the same way. Currently, --body-file is not for transmitting files as a whole. Wget does not currently support "multipart/form-data" for transmitting data; only "application/x-www-form-urlencoded". In the future, this may be changed so that wget sends the --body-file as a complete file instead of sending its contents to the server. Please be aware that Wget needs to know the contents of BODY Data in advance, and hence the argument to --body-file should be a regular file. See --post-file for a more detailed explanation. Only one of --body-data and --body-file should be specified. If Wget is redirected after the request is completed, Wget will suspend the current method and send a GET request till the redirection is completed. This is true for all redirection response codes except 307 Temporary Redirect which is used to explicitly specify that the request method should not change. Another exception is when the method is set to "POST", in which case the redirection rules specified under --post-data are followed. --content-disposition If this is set to on, experimental (not fully-functional) support for "Content-Disposition" headers is enabled. This can currently result in extra round-trips to the server for a "HEAD" request, and is known to suffer from a few bugs, which is why it is not currently enabled by default. This option is useful for some file-downloading CGI programs that use "Content-Disposition" headers to describe what the name of a downloaded file should be. --content-on-error If this is set to on, wget will not skip the content when the server responds with a http status code that indicates error. --trust-server-names If this is set to on, on a redirect the last component of the redirection URL will be used as the local file name. By default it is used the last component in the original URL. --auth-no-challenge If this option is given, Wget will send Basic HTTP authentication information (plaintext username and password) for all requests, just like Wget 1.10.2 and prior did by default. Use of this option is not recommended, and is intended only to support some few obscure servers, which never send HTTP authentication challenges, but accept unsolicited auth info, say, in addition to form-based authentication. HTTPS (SSL/TLS) Options To support encrypted HTTP (HTTPS) downloads, Wget must be compiled with an external SSL library, currently OpenSSL. If Wget is compiled without SSL support, none of these options are available. --secure-protocol=protocol Choose the secure protocol to be used. Legal values are auto, SSLv2, SSLv3, TLSv1, TLSv1_1, TLSv1_2 and PFS. 
If auto is used, the SSL library is given the liberty of choosing the appropriate protocol automatically, which is achieved by sending a TLSv1 greeting. This is the default. Specifying SSLv2, SSLv3, TLSv1, TLSv1_1 or TLSv1_2 forces the use of the corresponding protocol. This is useful when talking to old and buggy SSL server implementations that make it hard for the underlying SSL library to choose the correct protocol version. Fortunately, such servers are quite rare. Specifying PFS enforces the use of the so-called Perfect Forward Security cipher suites. In short, PFS adds security by creating a one-time key for each SSL connection. It has a bit more CPU impact on client and server. We use known to be secure ciphers (e.g. no MD4) and the TLS protocol. --https-only When in recursive mode, only HTTPS links are followed. --no-check-certificate Don't check the server certificate against the available certificate authorities. Also don't require the URL host name to match the common name presented by the certificate. As of Wget 1.10, the default is to verify the server's certificate against the recognized certificate authorities, breaking the SSL handshake and aborting the download if the verification fails. Although this provides more secure downloads, it does break interoperability with some sites that worked with previous Wget versions, particularly those using self-signed, expired, or otherwise invalid certificates. This option forces an "insecure" mode of operation that turns the certificate verification errors into warnings and allows you to proceed. If you encounter "certificate verification" errors or ones saying that "common name doesn't match requested host name", you can use this option to bypass the verification and proceed with the download. Only use this option if you are otherwise convinced of the site's authenticity, or if you really don't care about the validity of its certificate. It is almost always a bad idea not to check the certificates when transmitting confidential or important data. --certificate=file Use the client certificate stored in file. This is needed for servers that are configured to require certificates from the clients that connect to them. Normally a certificate is not required and this switch is optional. --certificate-type=type Specify the type of the client certificate. Legal values are PEM (assumed by default) and DER, also known as ASN1. --private-key=file Read the private key from file. This allows you to provide the private key in a file separate from the certificate. --private-key-type=type Specify the type of the private key. Accepted values are PEM (the default) and DER. --ca-certificate=file Use file as the file with the bundle of certificate authorities ("CA") to verify the peers. The certificates must be in PEM format. Without this option Wget looks for CA certificates at the system-specified locations, chosen at OpenSSL installation time. --ca-directory=directory Specifies directory containing CA certificates in PEM format. Each file contains one CA certificate, and the file name is based on a hash value derived from the certificate. This is achieved by processing a certificate directory with the "c_rehash" utility supplied with OpenSSL. Using --ca-directory is more efficient than --ca-certificate when many certificates are installed because it allows Wget to fetch certificates on demand. Without this option Wget looks for CA certificates at the system-specified locations, chosen at OpenSSL installation time. 
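The following is a minimal sketch of the client-certificate options above; the file names client.pem, client.key, and ca-bundle.crt, as well as the URL, are placeholders for your own files rather than names this manual defines:
       wget --certificate=client.pem \
            --private-key=client.key \
            --ca-certificate=ca-bundle.crt \
            https://example.com/protected/report.pdf
Both the certificate and the key are assumed here to be in PEM format, which is the default; use --certificate-type and --private-key-type if yours are DER. --ca-certificate is only needed when the server's CA is not among the system-installed authorities.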
--random-file=file Use file as the source of random data for seeding the pseudo-random number generator on systems without /dev/random. On such systems the SSL library needs an external source of randomness to initialize. Randomness may be provided by EGD (see --egd-file below) or read from an external source specified by the user. If this option is not specified, Wget looks for random data in $RANDFILE or, if that is unset, in $HOME/.rnd. If none of those are available, it is likely that SSL encryption will not be usable. If you're getting the "Could not seed OpenSSL PRNG; disabling SSL." error, you should provide random data using some of the methods described above. --egd-file=file Use file as the EGD socket. EGD stands for Entropy Gathering Daemon, a user-space program that collects data from various unpredictable system sources and makes it available to other programs that might need it. Encryption software, such as the SSL library, needs sources of non-repeating randomness to seed the random number generator used to produce cryptographically strong keys. OpenSSL allows the user to specify his own source of entropy using the "RAND_FILE" environment variable. If this variable is unset, or if the specified file does not produce enough randomness, OpenSSL will read random data from EGD socket specified using this option. If this option is not specified (and the equivalent startup command is not used), EGD is never contacted. EGD is not needed on modern Unix systems that support /dev/random. --warc-file=file Use file as the destination WARC file. --warc-header=string Use string into as the warcinfo record. --warc-max-size=size Set the maximum size of the WARC files to size. --warc-cdx Write CDX index files. --warc-dedup=file Do not store records listed in this CDX file. --no-warc-compression Do not compress WARC files with GZIP. --no-warc-digests Do not calculate SHA1 digests. --no-warc-keep-log Do not store the log file in a WARC record. --warc-tempdir=dir Specify the location for temporary files created by the WARC writer. FTP Options --ftp-user=user --ftp-password=password Specify the username user and password password on an FTP server. Without this, or the corresponding startup option, the password defaults to -wget@, normally used for anonymous FTP. Another way to specify username and password is in the URL itself. Either method reveals your password to anyone who bothers to run "ps". To prevent the passwords from being seen, store them in .wgetrc or .netrc, and make sure to protect those files from other users with "chmod". If the passwords are really important, do not leave them lying in those files either---edit the files and delete them after Wget has started the download. --no-remove-listing Don't remove the temporary .listing files generated by FTP retrievals. Normally, these files contain the raw directory listings received from FTP servers. Not removing them can be useful for debugging purposes, or when you want to be able to easily check on the contents of remote server directories (e.g. to verify that a mirror you're running is complete). Note that even though Wget writes to a known filename for this file, this is not a security hole in the scenario of a user making .listing a symbolic link to /etc/passwd or something and asking "root" to run Wget in his or her directory. 
Depending on the options used, either Wget will refuse to write to .listing, making the globbing/recursion/time-stamping operation fail, or the symbolic link will be deleted and replaced with the actual .listing file, or the listing will be written to a .listing.number file. Even though this situation isn't a problem, though, "root" should never run Wget in a non-trusted user's directory. A user could do something as simple as linking index.html to /etc/passwd and asking "root" to run Wget with -N or -r so the file will be overwritten. --no-glob Turn off FTP globbing. Globbing refers to the use of shell-like special characters (wildcards), like *, ?, [ and ] to retrieve more than one file from the same directory at once, like: wget ftp://gnjilux.srk.fer.hr/*.msg By default, globbing will be turned on if the URL contains a globbing character. This option may be used to turn globbing on or off permanently. You may have to quote the URL to protect it from being expanded by your shell. Globbing makes Wget look for a directory listing, which is system-specific. This is why it currently works only with Unix FTP servers (and the ones emulating Unix "ls" output). --no-passive-ftp Disable the use of the passive FTP transfer mode. Passive FTP mandates that the client connect to the server to establish the data connection rather than the other way around. If the machine is connected to the Internet directly, both passive and active FTP should work equally well. Behind most firewall and NAT configurations passive FTP has a better chance of working. However, in some rare firewall configurations, active FTP actually works when passive FTP doesn't. If you suspect this to be the case, use this option, or set "passive_ftp=off" in your init file. --preserve-permissions Preserve remote file permissions instead of permissions set by umask. --retr-symlinks By default, when retrieving FTP directories recursively and a symbolic link is encountered, the symbolic link is traversed and the pointed-to files are retrieved. Currently, Wget does not traverse symbolic links to directories to download them recursively, though this feature may be added in the future. When --retr-symlinks=no is specified, the linked-to file is not downloaded. Instead, a matching symbolic link is created on the local filesystem. The pointed-to file will not be retrieved unless this recursive retrieval would have encountered it separately and downloaded it anyway. This option poses a security risk where a malicious FTP Server may cause Wget to write to files outside of the intended directories through a specially crafted .LISTING file. Note that when retrieving a file (not a directory) because it was specified on the command-line, rather than because it was recursed to, this option has no effect. Symbolic links are always traversed in this case. Recursive Retrieval Options -r --recursive Turn on recursive retrieving. The default maximum depth is 5. -l depth --level=depth Specify recursion maximum depth level depth. --delete-after This option tells Wget to delete every single file it downloads, after having done so. It is useful for pre-fetching popular pages through a proxy, e.g.: wget -r -nd --delete-after http://whatever.com/~popular/page/ The -r option is to retrieve recursively, and -nd to not create directories. Note that --delete-after deletes files on the local machine. It does not issue the DELE command to remote FTP sites, for instance. 
Also note that when --delete-after is specified, --convert-links is ignored, so .orig files are simply not created in the first place. -k --convert-links After the download is complete, convert the links in the document to make them suitable for local viewing. This affects not only the visible hyperlinks, but any part of the document that links to external content, such as embedded images, links to style sheets, hyperlinks to non-HTML content, etc. Each link will be changed in one of the two ways: · The links to files that have been downloaded by Wget will be changed to refer to the file they point to as a relative link. Example: if the downloaded file /foo/doc.html links to /bar/img.gif, also downloaded, then the link in doc.html will be modified to point to ../bar/img.gif. This kind of transformation works reliably for arbitrary combinations of directories. · The links to files that have not been downloaded by Wget will be changed to include the host name and absolute path of the location they point to. Example: if the downloaded file /foo/doc.html links to /bar/img.gif (or to ../bar/img.gif), then the link in doc.html will be modified to point to http://hostname/bar/img.gif. Because of this, local browsing works reliably: if a linked file was downloaded, the link will refer to its local name; if it was not downloaded, the link will refer to its full Internet address rather than presenting a broken link. The fact that the former links are converted to relative links ensures that you can move the downloaded hierarchy to another directory. Note that only at the end of the download can Wget know which links have been downloaded. Because of that, the work done by -k will be performed at the end of all the downloads. -K --backup-converted When converting a file, back up the original version with a .orig suffix. Affects the behavior of -N. -m --mirror Turn on options suitable for mirroring. This option turns on recursion and time-stamping, sets infinite recursion depth and keeps FTP directory listings. It is currently equivalent to -r -N -l inf --no-remove-listing. -p --page-requisites This option causes Wget to download all the files that are necessary to properly display a given HTML page. This includes such things as inlined images, sounds, and referenced stylesheets. Ordinarily, when downloading a single HTML page, any requisite documents that may be needed to display it properly are not downloaded. Using -r together with -l can help, but since Wget does not ordinarily distinguish between external and inlined documents, one is generally left with "leaf documents" that are missing their requisites. For instance, say document 1.html contains an "<IMG>" tag referencing 1.gif and an "<A>" tag pointing to external document 2.html. Say that 2.html is similar but that its image is 2.gif and it links to 3.html. Say this continues up to some arbitrarily high number. If one executes the command: wget -r -l 2 http://<site>/1.html then 1.html, 1.gif, 2.html, 2.gif, and 3.html will be downloaded. As you can see, 3.html is without its requisite 3.gif because Wget is simply counting the number of hops (up to 2) away from 1.html in order to determine where to stop the recursion. However, with this command: wget -r -l 2 -p http://<site>/1.html all the above files and 3.html's requisite 3.gif will be downloaded. Similarly, wget -r -l 1 -p http://<site>/1.html will cause 1.html, 1.gif, 2.html, and 2.gif to be downloaded.
One might think that: wget -r -l 0 -p http://<site>/1.html would download just 1.html and 1.gif, but unfortunately this is not the case, because -l 0 is equivalent to -l inf---that is, infinite recursion. To download a single HTML page (or a handful of them, all specified on the command-line or in a -i URL input file) and its (or their) requisites, simply leave off -r and -l: wget -p http://<site>/1.html Note that Wget will behave as if -r had been specified, but only that single page and its requisites will be downloaded. Links from that page to external documents will not be followed. Actually, to download a single page and all its requisites (even if they exist on separate websites), and make sure the lot displays properly locally, this author likes to use a few options in addition to -p: wget -E -H -k -K -p http://<site>/<document> To finish off this topic, it's worth knowing that Wget's idea of an external document link is any URL specified in an "<A>" tag, an "<AREA>" tag, or a "<LINK>" tag other than "<LINK REL="stylesheet">". --strict-comments Turn on strict parsing of HTML comments. The default is to terminate comments at the first occurrence of -->. According to specifications, HTML comments are expressed as SGML declarations. A declaration is special markup that begins with <! and ends with >, such as <!DOCTYPE ...>, that may contain comments between a pair of -- delimiters. HTML comments are "empty declarations", SGML declarations without any non-comment text. Therefore, <!--text--> is a valid comment, and so is <!--one-- --two-->, but <!--1--2--> is not. On the other hand, most HTML writers don't perceive comments as anything other than text delimited with <!-- and -->, which is not quite the same. For example, something like <!------------ hey ----------> works as a valid comment as long as the number of dashes is a multiple of four (!). If not, the comment technically lasts until the next --, which may be at the other end of the document. Because of this, many popular browsers completely ignore the specification and implement what users have come to expect: comments delimited with <!-- and -->. Until version 1.9, Wget interpreted comments strictly, which resulted in missing links in many web pages that displayed fine in browsers, but had the misfortune of containing non-compliant comments. Beginning with version 1.9, Wget has joined the ranks of clients that implement "naive" comments, terminating each comment at the first occurrence of -->. If, for whatever reason, you want strict comment parsing, use this option to turn it on. Recursive Accept/Reject Options -A acclist --accept acclist -R rejlist --reject rejlist Specify comma-separated lists of file name suffixes or patterns to accept or reject. Note that if any of the wildcard characters, *, ?, [ or ], appear in an element of acclist or rejlist, it will be treated as a pattern, rather than a suffix. In this case, you have to enclose the pattern in quotes to prevent your shell from expanding it, like in -A "*.mp3" or -A '*.mp3'. --accept-regex urlregex --reject-regex urlregex Specify a regular expression to accept or reject the complete URL. --regex-type regextype Specify the regular expression type. Possible types are posix or pcre. Note that to be able to use the pcre type, wget has to be compiled with libpcre support. -D domain-list --domains=domain-list Set domains to be followed. domain-list is a comma-separated list of domains. Note that it does not turn on -H. --exclude-domains domain-list Specify the domains that are not to be followed. --follow-ftp Follow FTP links from HTML documents. Without this option, Wget will ignore all the FTP links.
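As a rough illustration (the host names, domain list, and patterns here are invented, not taken from the manual above), these accept/reject and domain options are usually combined with recursive retrieval, for example:
    wget -r -H -D example.com,images.example.com -A "*.jpg,*.png" https://example.com/gallery/
This restricts a host-spanning recursive download to the two listed domains and keeps only files whose names match the quoted patterns.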
--follow-tags=list Wget has an internal table of HTML tag / attribute pairs that it considers when looking for linked documents during a recursive retrieval. If a user wants only a subset of those tags to be considered, however, he or she should specify such tags in a comma-separated list with this option. --ignore-tags=list This is the opposite of the --follow-tags option. To skip certain HTML tags when recursively looking for documents to download, specify them in a comma-separated list. In the past, this option was the best bet for downloading a single page and its requisites, using a command-line like: wget --ignore-tags=a,area -H -k -K -r http://<site>/<document> However, the author of this option came across a page with tags like "<LINK REL="home" HREF="/">" and came to the realization that specifying tags to ignore was not enough. One can't just tell Wget to ignore "<LINK>", because then stylesheets will not be downloaded. Now the best bet for downloading a single page and its requisites is the dedicated --page-requisites option. --ignore-case Ignore case when matching files and directories. This influences the behavior of the -R, -A, -I, and -X options, as well as globbing implemented when downloading from FTP sites. For example, with this option, -A "*.txt" will match file1.txt, but also file2.TXT, file3.TxT, and so on. The quotes in the example are to prevent the shell from expanding the pattern. -H --span-hosts Enable spanning across hosts when doing recursive retrieving. -L --relative Follow relative links only. Useful for retrieving a specific home page without any distractions, not even those from the same hosts. -I list --include-directories=list Specify a comma-separated list of directories you wish to follow when downloading. Elements of list may contain wildcards. -X list --exclude-directories=list Specify a comma-separated list of directories you wish to exclude from download. Elements of list may contain wildcards. -np --no-parent Do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded. ENVIRONMENT Wget supports proxies for both HTTP and FTP retrievals. The standard way to specify the proxy location, which Wget recognizes, is using the following environment variables: http_proxy https_proxy If set, the http_proxy and https_proxy variables should contain the URLs of the proxies for HTTP and HTTPS connections respectively. ftp_proxy This variable should contain the URL of the proxy for FTP connections. It is quite common that http_proxy and ftp_proxy are set to the same URL. no_proxy This variable should contain a comma-separated list of domain extensions that the proxy should not be used for. For instance, if the value of no_proxy is .mit.edu, the proxy will not be used to retrieve documents from MIT. EXIT STATUS Wget may return one of several error codes if it encounters problems. 0 No problems occurred. 1 Generic error code. 2 Parse error---for instance, when parsing command-line options, the .wgetrc or .netrc... 3 File I/O error. 4 Network failure. 5 SSL verification failure. 6 Username/password authentication failure. 7 Protocol errors. 8 Server issued an error response. With the exceptions of 0 and 1, the lower-numbered exit codes take precedence over higher-numbered ones, when multiple types of errors are encountered. In versions of Wget prior to 1.12, Wget's exit status tended to be unhelpful and inconsistent.
Recursive downloads would virtually always return 0 (success), regardless of any issues encountered, and non-recursive fetches only returned the status corresponding to the most recently-attempted download. FILES /usr/local/etc/wgetrc Default location of the global startup file. .wgetrc User startup file. BUGS You are welcome to submit bug reports via the GNU Wget bug tracker. Before actually submitting a bug report, please try to follow a few simple guidelines. 1. Please try to ascertain that the behavior you see really is a bug. If Wget crashes, it's a bug. If Wget does not behave as documented, it's a bug. If things work strangely, but you are not sure about the way they are supposed to work, it might well be a bug, but you might want to double-check the documentation and the mailing lists. 2. Try to repeat the bug in as simple circumstances as possible. E.g. if Wget crashes while downloading wget -rl0 -kKE -t5 --no-proxy http://yoyodyne.com -o /tmp/log, you should try to see if the crash is repeatable, and if it will occur with a simpler set of options. You might even try to start the download at the page where the crash occurred to see if that page somehow triggered the crash. Also, while I will probably be interested to know the contents of your .wgetrc file, just dumping it into the debug message is probably a bad idea. Instead, you should first try to see if the bug repeats with .wgetrc moved out of the way. Only if it turns out that .wgetrc settings affect the bug, mail me the relevant parts of the file. 3. Please start Wget with the -d option and send us the resulting output (or relevant parts thereof). If Wget was compiled without debug support, recompile it---it is much easier to trace bugs with debug support on. Note: please make sure to remove any potentially sensitive information from the debug log before sending it to the bug address. The "-d" won't go out of its way to collect sensitive information, but the log will contain a fairly complete transcript of Wget's communication with the server, which may include passwords and pieces of downloaded data. Since the bug address is publicly archived, you may assume that all bug reports are visible to the public. 4. If Wget has crashed, try to run it in a debugger, e.g. "gdb `which wget` core" and type "where" to get the backtrace. This may not work if the system administrator has disabled core files, but it is safe to try. SEE ALSO This is not the complete manual for GNU Wget. For more complete information, including more detailed explanations of some of the options, and a number of commands available for use with .wgetrc files and the -e option, see the GNU Info entry for wget. AUTHOR Originally written by Hrvoje Niksic. COPYRIGHT Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License". GNU Wget 1.16 2014-11-03 WGET(1) rsync(1) rsync(1) NAME rsync - a fast, versatile, remote (and local) file-copying tool SYNOPSIS Local: rsync [OPTION...] SRC... [DEST] Access via remote shell: Pull: rsync [OPTION...] [USER@]HOST:SRC... [DEST] Push: rsync [OPTION...] SRC...
[USER@]HOST:DEST Access via rsync daemon: Pull: rsync [OPTION...] [USER@]HOST::SRC... [DEST] rsync [OPTION...] rsync://[USER@]HOST[:PORT]/SRC... [DEST] Push: rsync [OPTION...] SRC... [USER@]HOST::DEST rsync [OPTION...] SRC... rsync://[USER@]HOST[:PORT]/DEST Usages with just one SRC arg and no DEST arg will list the source files instead of copying. DESCRIPTION Rsync is a fast and extraordinarily versatile file copying tool. It can copy locally, to/from another host over any remote shell, or to/from a remote rsync daemon. It offers a large number of options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use. Rsync finds files that need to be transferred using a "quick check" algorithm (by default) that looks for files that have changed in size or in last-modified time. Any changes in the other preserved attributes (as requested by options) are made on the destination file directly when the quick check indicates that the file’s data does not need to be updated. Some of the additional features of rsync are: o support for copying links, devices, owners, groups, and permissions o exclude and exclude-from options similar to GNU tar o a CVS exclude mode for ignoring the same files that CVS would ignore o can use any transparent remote shell, including ssh or rsh o does not require super-user privileges o pipelining of file transfers to minimize latency costs o support for anonymous or authenticated rsync daemons (ideal for mirroring) GENERAL Rsync copies files either to or from a remote host, or locally on the current host (it does not support copying files between two remote hosts). There are two different ways for rsync to contact a remote system: using a remote-shell program as the transport (such as ssh or rsh) or contacting an rsync daemon directly via TCP. The remote-shell transport is used whenever the source or destination path contains a single colon (:) separator after a host specification. Contacting an rsync daemon directly happens when the source or destination path contains a double colon (::) separator after a host specification, OR when an rsync:// URL is speci- fied (see also the "USING RSYNC-DAEMON FEATURES VIA A REMOTE-SHELL CONNECTION" section for an exception to this latter rule). As a special case, if a single source arg is specified without a destination, the files are listed in an output format similar to "ls -l". As expected, if neither the source or destination path specify a remote host, the copy occurs locally (see also the --list-only option). Rsync refers to the local side as the "client" and the remote side as the "server". Don’t confuse "server" with an rsync daemon -- a daemon is always a server, but a server can be either a daemon or a remote-shell spawned process. SETUP See the file README for installation instructions. Once installed, you can use rsync to any machine that you can access via a remote shell (as well as some that you can access using the rsync daemon-mode protocol). For remote transfers, a modern rsync uses ssh for its communications, but it may have been configured to use a different remote shell by default, such as rsh or remsh. 
You can also specify any remote shell you like, either by using the -e command line option, or by setting the RSYNC_RSH environment variable. Note that rsync must be installed on both the source and destination machines. USAGE You use rsync in the same way you use rcp. You must specify a source and a destination, one of which may be remote. Perhaps the best way to explain the syntax is with some examples: rsync -t *.c foo:src/ This would transfer all files matching the pattern *.c from the current directory to the directory src on the machine foo. If any of the files already exist on the remote system then the rsync remote-update protocol is used to update the file by sending only the differences. See the tech report for details. rsync -avz foo:src/bar /data/tmp This would recursively transfer all files from the directory src/bar on the machine foo into the /data/tmp/bar directory on the local machine. The files are transferred in "archive" mode, which ensures that symbolic links, devices, attributes, permissions, ownerships, etc. are preserved in the transfer. Additionally, compression will be used to reduce the size of data portions of the transfer. rsync -avz foo:src/bar/ /data/tmp A trailing slash on the source changes this behavior to avoid creating an additional directory level at the destination. You can think of a trailing / on a source as meaning "copy the contents of this directory" as opposed to "copy the directory by name", but in both cases the attributes of the containing directory are transferred to the containing directory on the destination. In other words, each of the following commands copies the files in the same way, including their setting of the attributes of /dest/foo: rsync -av /src/foo /dest rsync -av /src/foo/ /dest/foo Note also that host and module references don’t require a trailing slash to copy the contents of the default directory. For example, both of these copy the remote direc- tory’s contents into "/dest": rsync -av host: /dest rsync -av host::module /dest You can also use rsync in local-only mode, where both the source and destination don’t have a ’:’ in the name. In this case it behaves like an improved copy command. Finally, you can list all the (listable) modules available from a particular rsync daemon by leaving off the module name: rsync somehost.mydomain.com:: And, if Service Location Protocol is available, the following will list the available rsync servers: rsync rsync:// See the following section for even more usage details. ADVANCED USAGE The syntax for requesting multiple files from a remote host is done by specifying additional remote-host args in the same style as the first, or with the hostname omitted. For instance, all these work: rsync -av host:file1 :file2 host:file{3,4} /dest/ rsync -av host::modname/file{1,2} host::modname/file3 /dest/ rsync -av host::modname/file1 ::modname/file{3,4} Older versions of rsync required using quoted spaces in the SRC, like these examples: rsync -av host:'dir1/file1 dir2/file2' /dest rsync host::'modname/dir1/file1 modname/dir2/file2' /dest This word-splitting still works (by default) in the latest rsync, but is not as easy to use as the first method. If you need to transfer a filename that contains whitespace, you can either specify the --protect-args (-s) option, or you’ll need to escape the whitespace in a way that the remote shell will understand. 
For instance: rsync -av host:'file\ name\ with\ spaces' /dest CONNECTING TO AN RSYNC DAEMON It is also possible to use rsync without a remote shell as the transport. In this case you will directly connect to a remote rsync daemon, typically using TCP port 873. (This obviously requires the daemon to be running on the remote system, so refer to the STARTING AN RSYNC DAEMON TO ACCEPT CONNECTIONS section below for information on that.) Using rsync in this way is the same as using it with a remote shell except that: o you either use a double colon :: instead of a single colon to separate the hostname from the path, or you use an rsync:// URL. o the first word of the "path" is actually a module name. o the remote daemon may print a message of the day when you connect. o if you specify no path name on the remote daemon then the list of accessible paths on the daemon will be shown. o if you specify no local destination then a listing of the specified files on the remote daemon is provided. o you must not specify the --rsh (-e) option. An example that copies all the files in a remote module named "src": rsync -av host::src /dest Some modules on the remote daemon may require authentication. If so, you will receive a password prompt when you connect. You can avoid the password prompt by setting the environment variable RSYNC_PASSWORD to the password you want to use or using the --password-file option. This may be useful when scripting rsync. WARNING: On some systems environment variables are visible to all users. On those systems using --password-file is recommended. You may establish the connection via a web proxy by setting the environment variable RSYNC_PROXY to a hostname:port pair pointing to your web proxy. Note that your web proxy’s configuration must support proxy connections to port 873. You may also establish a daemon connection using a program as a proxy by setting the environment variable RSYNC_CONNECT_PROG to the commands you wish to run in place of making a direct socket connection. The string may contain the escape "%H" to represent the hostname specified in the rsync command (so use "%%" if you need a single "%" in your string). For example: export RSYNC_CONNECT_PROG='ssh proxyhost nc %H 873' rsync -av targethost1::module/src/ /dest/ rsync -av rsync://targethost2/module/src/ /dest/ The command specified above uses ssh to run nc (netcat) on a proxyhost, which forwards all data to port 873 (the rsync daemon) on the targethost (%H). USING RSYNC-DAEMON FEATURES VIA A REMOTE-SHELL CONNECTION It is sometimes useful to use various features of an rsync daemon (such as named modules) without actually allowing any new socket connections into a system (other than what is already required to allow remote-shell access). Rsync supports connecting to a host using a remote shell and then spawning a single-use "daemon" server that expects to read its config file in the home dir of the remote user. This can be useful if you want to encrypt a daemon-style transfer’s data, but since the daemon is started up fresh by the remote user, you may not be able to use features such as chroot or change the uid used by the daemon. (For another way to encrypt a daemon transfer, consider using ssh to tunnel a local port to a remote machine and configure a normal rsync daemon on that remote host to only allow connections from "localhost".)
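A minimal sketch of the port-tunneling approach just mentioned (host name, module name, and local port are illustrative, not prescribed by this manual):
    ssh -f -N -L 8730:localhost:873 remotehost
    rsync -av --port=8730 localhost::module /dest/
The ssh command forwards local port 8730 to port 873 on the remote machine's loopback interface; the rsync command then reaches the daemon through that tunnel, so the daemon sees a connection coming from "localhost".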
From the user’s perspective, a daemon transfer via a remote-shell connection uses nearly the same command-line syntax as a normal rsync-daemon transfer, with the only exception being that you must explicitly set the remote shell program on the command-line with the --rsh=COMMAND option. (Setting RSYNC_RSH in the environment will not turn on this functionality.) For example: rsync -av --rsh=ssh host::module /dest If you need to specify a different remote-shell user, keep in mind that the user@ prefix in front of the host is specifying the rsync-user value (for a module that requires user-based authentication). This means that you must give the ’-l user’ option to ssh when specifying the remote-shell, as in this example that uses the short version of the --rsh option: rsync -av -e "ssh -l ssh-user" rsync-user@host::module /dest The "ssh-user" will be used at the ssh level; the "rsync-user" will be used to log in to the "module". STARTING AN RSYNC DAEMON TO ACCEPT CONNECTIONS In order to connect to an rsync daemon, the remote system needs to have a daemon already running (or it needs to have configured something like inetd to spawn an rsync daemon for incoming connections on a particular port). For full information on how to start a daemon that will handle incoming socket connections, see the rsyncd.conf(5) man page -- that is the config file for the daemon, and it contains the full details for how to run the daemon (including stand-alone and inetd configurations). If you’re using one of the remote-shell transports for the transfer, there is no need to manually start an rsync daemon. SORTED TRANSFER ORDER Rsync always sorts the specified filenames into its internal transfer list. This handles the merging together of the contents of identically named directories, makes it easy to remove duplicate filenames, and may confuse someone when the files are transferred in a different order than what was given on the command-line. If you need a particular file to be transferred prior to another, either separate the files into different rsync calls, or consider using --delay-updates (which doesn’t affect the sorted transfer order, but does make the final file-updating phase happen much more rapidly). EXAMPLES Here are some examples of how I use rsync. To back up my wife’s home directory, which consists of large MS Word files and mail folders, I use a cron job that runs rsync -Cavz . arvidsjaur:backup each night over a PPP connection to a duplicate directory on my machine "arvidsjaur". To synchronize my samba source trees I use the following Makefile targets: get: rsync -avuzb --exclude '*~' samba:samba/ . put: rsync -Cavuzb . samba:samba/ sync: get put This allows me to sync with a CVS directory at the other end of the connection. I then do CVS operations on the remote machine, which saves a lot of time as the remote CVS protocol isn’t very efficient. I mirror a directory between my "old" and "new" ftp sites with the command: rsync -az -e ssh --delete ~ftp/pub/samba nimbus:"~ftp/pub/tridge" This is launched from cron every few hours. OPTIONS SUMMARY Here is a short summary of the options available in rsync. Please refer to the detailed description below for a complete description.
-v, --verbose increase verbosity --info=FLAGS fine-grained informational verbosity --debug=FLAGS fine-grained debug verbosity --msgs2stderr special output handling for debugging -q, --quiet suppress non-error messages --no-motd suppress daemon-mode MOTD (see caveat) -c, --checksum skip based on checksum, not mod-time & size -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X) --no-OPTION turn off an implied OPTION (e.g. --no-D) -r, --recursive recurse into directories -R, --relative use relative path names --no-implied-dirs don't send implied dirs with --relative -b, --backup make backups (see --suffix & --backup-dir) --backup-dir=DIR make backups into hierarchy based in DIR --suffix=SUFFIX backup suffix (default ~ w/o --backup-dir) -u, --update skip files that are newer on the receiver --inplace update destination files in-place --append append data onto shorter files --append-verify --append w/old data in file checksum -d, --dirs transfer directories without recursing -l, --links copy symlinks as symlinks -L, --copy-links transform symlink into referent file/dir --copy-unsafe-links only "unsafe" symlinks are transformed --safe-links ignore symlinks that point outside the tree --munge-links munge symlinks to make them safer -k, --copy-dirlinks transform symlink to dir into referent dir -K, --keep-dirlinks treat symlinked dir on receiver as dir -H, --hard-links preserve hard links -p, --perms preserve permissions -E, --executability preserve executability --chmod=CHMOD affect file and/or directory permissions -A, --acls preserve ACLs (implies -p) -X, --xattrs preserve extended attributes -o, --owner preserve owner (super-user only) -g, --group preserve group --devices preserve device files (super-user only) --specials preserve special files -D same as --devices --specials -t, --times preserve modification times -O, --omit-dir-times omit directories from --times -J, --omit-link-times omit symlinks from --times --super receiver attempts super-user activities --fake-super store/recover privileged attrs using xattrs -S, --sparse handle sparse files efficiently --preallocate allocate dest files before writing -n, --dry-run perform a trial run with no changes made -W, --whole-file copy files whole (w/o delta-xfer algorithm) -x, --one-file-system don't cross filesystem boundaries -B, --block-size=SIZE force a fixed checksum block-size -e, --rsh=COMMAND specify the remote shell to use --rsync-path=PROGRAM specify the rsync to run on remote machine --existing skip creating new files on receiver --ignore-existing skip updating files that exist on receiver --remove-source-files sender removes synchronized files (non-dir) --del an alias for --delete-during --delete delete extraneous files from dest dirs --delete-before receiver deletes before xfer, not during --delete-during receiver deletes during the transfer --delete-delay find deletions during, delete after --delete-after receiver deletes after transfer, not during --delete-excluded also delete excluded files from dest dirs --ignore-missing-args ignore missing source args without error --delete-missing-args delete missing source args from destination --ignore-errors delete even if there are I/O errors --force force deletion of dirs even if not empty --max-delete=NUM don't delete more than NUM files --max-size=SIZE don't transfer any file larger than SIZE --min-size=SIZE don't transfer any file smaller than SIZE --partial keep partially transferred files --partial-dir=DIR put a partially transferred file into DIR --delay-updates put all 
updated files into place at end -m, --prune-empty-dirs prune empty directory chains from file-list --numeric-ids don't map uid/gid values by user/group name --usermap=STRING custom username mapping --groupmap=STRING custom groupname mapping --chown=USER:GROUP simple username/groupname mapping --timeout=SECONDS set I/O timeout in seconds --contimeout=SECONDS set daemon connection timeout in seconds -I, --ignore-times don't skip files that match size and time --size-only skip files that match in size --modify-window=NUM compare mod-times with reduced accuracy -T, --temp-dir=DIR create temporary files in directory DIR -y, --fuzzy find similar file for basis if no dest file --compare-dest=DIR also compare received files relative to DIR --copy-dest=DIR ... and include copies of unchanged files --link-dest=DIR hardlink to files in DIR when unchanged -z, --compress compress file data during the transfer --compress-level=NUM explicitly set compression level --skip-compress=LIST skip compressing files with suffix in LIST -C, --cvs-exclude auto-ignore files in the same way CVS does -f, --filter=RULE add a file-filtering RULE -F same as --filter='dir-merge /.rsync-filter' repeated: --filter='- .rsync-filter' --exclude=PATTERN exclude files matching PATTERN --exclude-from=FILE read exclude patterns from FILE --include=PATTERN don't exclude files matching PATTERN --include-from=FILE read include patterns from FILE --files-from=FILE read list of source-file names from FILE -0, --from0 all *from/filter files are delimited by 0s -s, --protect-args no space-splitting; wildcard chars only --address=ADDRESS bind address for outgoing socket to daemon --port=PORT specify double-colon alternate port number --sockopts=OPTIONS specify custom TCP options --blocking-io use blocking I/O for the remote shell --outbuf=N|L|B set out buffering to None, Line, or Block --stats give some file-transfer stats -8, --8-bit-output leave high-bit chars unescaped in output -h, --human-readable output numbers in a human-readable format --progress show progress during transfer -P same as --partial --progress -i, --itemize-changes output a change-summary for all updates -M, --remote-option=OPTION send OPTION to the remote side only --out-format=FORMAT output updates using the specified FORMAT --log-file=FILE log what we're doing to the specified FILE --log-file-format=FMT log updates using the specified FMT --password-file=FILE read daemon-access password from FILE --list-only list the files instead of copying them --bwlimit=RATE limit socket I/O bandwidth --stop-at=y-m-dTh:m Stop rsync at year-month-dayThour:minute --time-limit=MINS Stop rsync after MINS minutes have elapsed --write-batch=FILE write a batched update to FILE --only-write-batch=FILE like --write-batch but w/o updating dest --read-batch=FILE read a batched update from FILE --protocol=NUM force an older protocol version to be used --iconv=CONVERT_SPEC request charset conversion of filenames --checksum-seed=NUM set block/file checksum seed (advanced) -4, --ipv4 prefer IPv4 -6, --ipv6 prefer IPv6 --version print version number (-h) --help show this help (see below for -h comment) Rsync can also be run as a daemon, in which case the following options are accepted: --daemon run as an rsync daemon --address=ADDRESS bind to the specified address --bwlimit=RATE limit socket I/O bandwidth --config=FILE specify alternate rsyncd.conf file -M, --dparam=OVERRIDE override global daemon config parameter --no-detach do not detach from the parent --port=PORT listen on alternate port 
number --log-file=FILE override the "log file" setting --log-file-format=FMT override the "log format" setting --sockopts=OPTIONS specify custom TCP options -v, --verbose increase verbosity -4, --ipv4 prefer IPv4 -6, --ipv6 prefer IPv6 -h, --help show this help (if used after --daemon) OPTIONS Rsync accepts both long (double-dash + word) and short (single-dash + letter) options. The full list of the available options are described below. If an option can be specified in more than one way, the choices are comma-separated. Some options only have a long variant, not a short. If the option takes a parameter, the parameter is only listed after the long variant, even though it must also be specified for the short. When specifying a parameter, you can either use the form --option=param or replace the ’=’ with whitespace. The parameter may need to be quoted in some manner for it to survive the shell’s command-line parsing. Keep in mind that a leading tilde (~) in a filename is substituted by your shell, so --option=~/foo will not change the tilde into your home directory (remove the ’=’ for that). --help Print a short help page describing the options available in rsync and exit. For backward-compatibility with older versions of rsync, the help will also be output if you use the -h option without any other args. --version print the rsync version number and exit. -v, --verbose This option increases the amount of information you are given during the transfer. By default, rsync works silently. A single -v will give you information about what files are being transferred and a brief summary at the end. Two -v options will give you information on what files are being skipped and slightly more informa- tion at the end. More than two -v options should only be used if you are debugging rsync. In a modern rsync, the -v option is equivalent to the setting of groups of --info and --debug options. You can choose to use these newer options in addition to, or in place of using --verbose, as any fine-grained settings override the implied settings of -v. Both --info and --debug have a way to ask for help that tells you exactly what flags are set for each increase in verbosity. --info=FLAGS This option lets you have fine-grained control over the information output you want to see. An individual flag name may be followed by a level number, with 0 mean- ing to silence that output, 1 being the default output level, and higher numbers increasing the output of that flag (for those that support higher levels). Use --info=help to see all the available flag names, what they output, and what flag names are added for each increase in the verbose level. Some examples: rsync -a --info=progress2 src/ dest/ rsync -avv --info=stats2,misc1,flist0 src/ dest/ Note that --info=name’s output is affected by the --out-format and --itemize-changes (-i) options. See those options for more information on what is output and when. This option was added to 3.1.0, so an older rsync on the server side might reject your attempts at fine-grained control (if one or more flags needed to be send to the server and the server was too old to understand them). --debug=FLAGS This option lets you have fine-grained control over the debug output you want to see. An individual flag name may be followed by a level number, with 0 meaning to silence that output, 1 being the default output level, and higher numbers increasing the output of that flag (for those that support higher levels). 
Use --debug=help to see all the available flag names, what they output, and what flag names are added for each increase in the verbose level. Some examples: rsync -avvv --debug=none src/ dest/ rsync -avA --del --debug=del2,acl src/ dest/ Note that some debug messages will only be output when --msgs2stderr is specified, especially those pertaining to I/O and buffer debugging. This option was added to 3.1.0, so an older rsync on the server side might reject your attempts at fine-grained control (if one or more flags needed to be send to the server and the server was too old to understand them). --msgs2stderr This option changes rsync to send all its output directly to stderr rather than to send messages to the client side via the protocol (which normally outputs info messages via stdout). This is mainly intended for debugging in order to avoid changing the data sent via the protocol, since the extra protocol data can change what is being tested. Keep in mind that a daemon connection does not have a stderr channel to send messages back to the client side, so if you are doing any dae- mon-transfer debugging using this option, you should start up a daemon using --no-detach so that you can see the stderr output on the daemon side. This option has the side-effect of making stderr output get line-buffered so that the merging of the output of 3 programs happens in a more readable manner. -q, --quiet This option decreases the amount of information you are given during the transfer, notably suppressing information messages from the remote server. This option is useful when invoking rsync from cron. --no-motd This option affects the information that is output by the client at the start of a daemon transfer. This suppresses the message-of-the-day (MOTD) text, but it also affects the list of modules that the daemon sends in response to the "rsync host::" request (due to a limitation in the rsync protocol), so omit this option if you want to request the list of modules from the daemon. -I, --ignore-times Normally rsync will skip any files that are already the same size and have the same modification timestamp. This option turns off this "quick check" behavior, causing all files to be updated. --size-only This modifies rsync’s "quick check" algorithm for finding files that need to be transferred, changing it from the default of transferring files with either a changed size or a changed last-modified time to just looking for files that have changed in size. This is useful when starting to use rsync after using another mirroring system which may not preserve timestamps exactly. --modify-window When comparing two timestamps, rsync treats the timestamps as being equal if they differ by no more than the modify-window value. This is normally 0 (for an exact match), but you may find it useful to set this to a larger value in some situations. In particular, when transferring to or from an MS Windows FAT filesystem (which represents times with a 2-second resolution), --modify-window=1 is useful (allowing times to differ by up to 1 second). -c, --checksum This changes the way rsync checks if the files have been changed and are in need of a transfer. Without this option, rsync uses a "quick check" that (by default) checks if each file’s size and time of last modification match between the sender and receiver. This option changes this to compare a 128-bit checksum for each file that has a matching size. 
Generating the checksums means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is prior to any reading that will be done to transfer changed files), so this can slow things down significantly. The sending side generates its checksums while it is doing the file-system scan that builds the list of the available files. The receiver generates its checksums when it is scanning for changed files, and will checksum any file that has the same size as the corresponding sender’s file: files with either a changed size or a changed checksum are selected for transfer. Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by checking a whole-file checksum that is generated as the file is transferred, but that automatic after-the-transfer verification has nothing to do with this option’s before-the-transfer "Does this file need to be updated?" check. For protocol 30 and beyond (first supported in 3.0.0), the checksum used is MD5. For older protocols, the checksum used is MD4. -a, --archive This is equivalent to -rlptgoD. It is a quick way of saying you want recursion and want to preserve almost everything (with -H being a notable omission). The only exception to the above equivalence is when --files-from is specified, in which case -r is not implied. Note that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must separately specify -H. --no-OPTION You may turn off one or more implied options by prefixing the option name with "no-". Not all options may be prefixed with a "no-": only options that are implied by other options (e.g. --no-D, --no-perms) or have different defaults in various circumstances (e.g. --no-whole-file, --no-blocking-io, --no-dirs). You may specify either the short or the long option name after the "no-" prefix (e.g. --no-R is the same as --no-relative). For example: if you want to use -a (--archive) but don’t want -o (--owner), instead of converting -a into -rlptgD, you could specify -a --no-o (or -a --no-owner). The order of the options is important: if you specify --no-r -a, the -r option would end up being turned on, the opposite of -a --no-r. Note also that the side-effects of the --files-from option are NOT positional, as it affects the default state of several options and slightly changes the meaning of -a (see the --files-from option for more details). -r, --recursive This tells rsync to copy directories recursively. See also --dirs (-d). Beginning with rsync 3.0.0, the recursive algorithm used is now an incremental scan that uses much less memory than before and begins the transfer after the scan- ning of the first few directories have been completed. This incremental scan only affects our recursion algorithm, and does not change a non-recursive transfer. It is also only possible when both ends of the transfer are at least version 3.0.0. Some options require rsync to know the full file list, so these options disable the incremental recursion mode. These include: --delete-before, --delete-after, --prune-empty-dirs, and --delay-updates. Because of this, the default delete mode when you specify --delete is now --delete-during when both ends of the connection are at least 3.0.0 (use --del or --delete-during to request this improved deletion mode explicitly). See also the --delete-delay option that is a better choice than using --delete-after. 
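As a hedged example (the paths are illustrative), a mirroring run that finds deletions during the transfer but applies them only after it finishes could be invoked as:
    rsync -av --delete-delay /srv/www/ mirrorhost:/srv/www/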
Incremental recursion can be disabled using the --no-inc-recursive option or its shorter --no-i-r alias. -R, --relative Use relative paths. This means that the full path names specified on the command line are sent to the server rather than just the last parts of the filenames. This is particularly useful when you want to send several different directories at the same time. For example, if you used this command: rsync -av /foo/bar/baz.c remote:/tmp/ ... this would create a file named baz.c in /tmp/ on the remote machine. If instead you used rsync -avR /foo/bar/baz.c remote:/tmp/ then a file named /tmp/foo/bar/baz.c would be created on the remote machine, preserving its full path. These extra path elements are called "implied directories" (i.e. the "foo" and the "foo/bar" directories in the above example). Beginning with rsync 3.0.0, rsync always sends these implied directories as real directories in the file list, even if a path element is really a symlink on the sending side. This prevents some really unexpected behaviors when copying the full path of a file that you didn’t realize had a symlink in its path. If you want to duplicate a server-side symlink, include both the symlink via its path, and referent directory via its real path. If you’re dealing with an older rsync on the sending side, you may need to use the --no-implied-dirs option. It is also possible to limit the amount of path information that is sent as implied directories for each path you specify. With a modern rsync on the sending side (beginning with 2.6.7), you can insert a dot and a slash into the source path, like this: rsync -avR /foo/./bar/baz.c remote:/tmp/ That would create /tmp/bar/baz.c on the remote machine. (Note that the dot must be followed by a slash, so "/foo/." would not be abbreviated.) For older rsync versions, you would need to use a chdir to limit the source path. For example, when pushing files: (cd /foo; rsync -avR bar/baz.c remote:/tmp/) (Note that the parens put the two commands into a sub-shell, so that the "cd" command doesn’t remain in effect for future commands.) If you’re pulling files from an older rsync, use this idiom (but only for a non-daemon transfer): rsync -avR --rsync-path="cd /foo; rsync" \ remote:bar/baz.c /tmp/ --no-implied-dirs This option affects the default behavior of the --relative option. When it is specified, the attributes of the implied directories from the source names are not included in the transfer. This means that the corresponding path elements on the destination system are left unchanged if they exist, and any missing implied directories are created with default attributes. This even allows these implied path elements to have big differences, such as being a symlink to a directory on the receiving side. For instance, if a command-line arg or a files-from entry told rsync to transfer the file "path/foo/file", the directories "path" and "path/foo" are implied when --relative is used. If "path/foo" is a symlink to "bar" on the destination system, the receiving rsync would ordinarily delete "path/foo", recreate it as a direc- tory, and receive the file into the new directory. With --no-implied-dirs, the receiving rsync updates "path/foo/file" using the existing path elements, which means that the file ends up being created in "path/bar". Another way to accomplish this link preservation is to use the --keep-dirlinks option (which will also affect symlinks to directories in the rest of the transfer). 
When pulling files from an rsync older than 3.0.0, you may need to use this option if the sending side has a symlink in the path you request and you wish the implied directories to be transferred as normal directories. -b, --backup With this option, preexisting destination files are renamed as each file is transferred or deleted. You can control where the backup file goes and what (if any) suffix gets appended using the --backup-dir and --suffix options. Note that if you don’t specify --backup-dir, (1) the --omit-dir-times option will be implied, and (2) if --delete is also in effect (without --delete-excluded), rsync will add a "protect" filter-rule for the backup suffix to the end of all your existing excludes (e.g. -f "P *~"). This will prevent previously backed-up files from being deleted. Note that if you are supplying your own filter rules, you may need to manually insert your own exclude/protect rule somewhere higher up in the list so that it has a high enough priority to be effective (e.g., if your rules specify a trailing inclusion/exclusion of ’*’, the auto-added rule would never be reached). --backup-dir=DIR In combination with the --backup option, this tells rsync to store all backups in the specified directory on the receiving side. This can be used for incremental backups. You can additionally specify a backup suffix using the --suffix option (otherwise the files backed up in the specified directory will keep their original filenames). Note that if you specify a relative path, the backup directory will be relative to the destination directory, so you probably want to specify either an absolute path or a path that starts with "../". If an rsync daemon is the receiver, the backup dir cannot go outside the module’s path hierarchy, so take extra care not to delete it or copy into it. --suffix=SUFFIX This option allows you to override the default backup suffix used with the --backup (-b) option. The default suffix is a ~ if no --backup-dir was specified, other- wise it is an empty string. -u, --update This forces rsync to skip any files which exist on the destination and have a modified time that is newer than the source file. (If an existing destination file has a modification time equal to the source file’s, it will be updated if the sizes are different.) Note that this does not affect the copying of symlinks or other special files. Also, a difference of file format between the sender and receiver is always consid- ered to be important enough for an update, no matter what date is on the objects. In other words, if the source has a directory where the destination has a file, the transfer would occur regardless of the timestamps. This option is a transfer rule, not an exclude, so it doesn’t affect the data that goes into the file-lists, and thus it doesn’t affect deletions. It just limits the files that the receiver requests to be transferred. --inplace This option changes how rsync transfers a file when its data needs to be updated: instead of the default method of creating a new copy of the file and moving it into place when it is complete, rsync instead writes the updated data directly to the destination file. This has several effects: o Hard links are not broken. This means the new data will be visible through other hard links to the destination file. Moreover, attempts to copy differing source files onto a multiply-linked destination file will result in a "tug of war" with the destination data changing back and forth. 
o In-use binaries cannot be updated (either the OS will prevent this from happening, or binaries that attempt to swap-in their data will misbehave or crash). o The file’s data will be in an inconsistent state during the transfer and will be left that way if the transfer is interrupted or if an update fails. o A file that rsync cannot write to cannot be updated. While a super user can update any file, a normal user needs to be granted write permission for the open of the file for writing to be successful. o The efficiency of rsync’s delta-transfer algorithm may be reduced if some data in the destination file is overwritten before it can be copied to a position later in the file. This does not apply if you use --backup, since rsync is smart enough to use the backup file as the basis file for the transfer. WARNING: you should not use this option to update files that are being accessed by others, so be careful when choosing to use this for a copy. This option is useful for transferring large files with block-based changes or appended data, and also on systems that are disk bound, not network bound. It can also help keep a copy-on-write filesystem snapshot from diverging the entire contents of a file that only has minor changes. The option implies --partial (since an interrupted transfer does not delete the file), but conflicts with --partial-dir and --delay-updates. Prior to rsync 2.6.4 --inplace was also incompatible with --compare-dest and --link-dest. --append This causes rsync to update a file by appending data onto the end of the file, which presumes that the data that already exists on the receiving side is identical with the start of the file on the sending side. If a file needs to be transferred and its size on the receiver is the same or longer than the size on the sender, the file is skipped. This does not interfere with the updating of a file’s non-content attributes (e.g. permissions, ownership, etc.) when the file does not need to be transferred, nor does it affect the updating of any non-regular files. Implies --inplace, but does not conflict with --sparse (since it is always extending a file’s length). --append-verify This works just like the --append option, but the existing data on the receiving side is included in the full-file checksum verification step, which will cause a file to be resent if the final verification step fails (rsync uses a normal, non-appending --inplace transfer for the resend). Note: prior to rsync 3.0.0, the --append option worked like --append-verify, so if you are interacting with an older rsync (or the transfer is using a protocol prior to 30), specifying either append option will initiate an --append-verify transfer. -d, --dirs Tell the sending side to include any directories that are encountered. Unlike --recursive, a directory’s contents are not copied unless the directory name speci- fied is "." or ends with a trailing slash (e.g. ".", "dir/.", "dir/", etc.). Without this option or the --recursive option, rsync will skip all directories it encounters (and output a message to that effect for each one). If you specify both --dirs and --recursive, --recursive takes precedence. The --dirs option is implied by the --files-from option or the --list-only option (including an implied --list-only usage) if --recursive wasn’t specified (so that directories are seen in the listing). Specify --no-dirs (or --no-d) if you want to turn this off. 
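A small sketch of using --dirs to copy only the top level of a directory, with invented paths:
    rsync -dt --links src/ dest/
This transfers the regular files and symlinks found directly in src/ and creates its immediate subdirectories on the destination without copying their contents.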
There is also a backward-compatibility helper option, --old-dirs (or --old-d) that tells rsync to use a hack of "-r --exclude=’/*/*’" to get an older rsync to list a single directory without recursing. -l, --links When symlinks are encountered, recreate the symlink on the destination. -L, --copy-links When symlinks are encountered, the item that they point to (the referent) is copied, rather than the symlink. In older versions of rsync, this option also had the side-effect of telling the receiving side to follow symlinks, such as symlinks to directories. In a modern rsync such as this one, you’ll need to specify --keep-dirlinks (-K) to get this extra behavior. The only exception is when sending files to an rsync that is too old to understand -K -- in that case, the -L option will still have the side-effect of -K on that older receiving rsync. --copy-unsafe-links This tells rsync to copy the referent of symbolic links that point outside the copied tree. Absolute symlinks are also treated like ordinary files, and so are any symlinks in the source path itself when --relative is used. This option has no additional effect if --copy-links was also specified. --safe-links This tells rsync to ignore any symbolic links which point outside the copied tree. All absolute symlinks are also ignored. Using this option in conjunction with --relative may give unexpected results. --munge-links This option tells rsync to (1) modify all symlinks on the receiving side in a way that makes them unusable but recoverable (see below), or (2) to unmunge symlinks on the sending side that had been stored in a munged state. This is useful if you don’t quite trust the source of the data to not try to slip in a symlink to a unexpected place. The way rsync disables the use of symlinks is to prefix each one with the string "/rsyncd-munged/". This prevents the links from being used as long as that direc- tory does not exist. When this option is enabled, rsync will refuse to run if that path is a directory or a symlink to a directory. The option only affects the client side of the transfer, so if you need it to affect the server, specify it via --remote-option. (Note that in a local transfer, the client side is the sender.) This option has no affect on a daemon, since the daemon configures whether it wants munged symlinks via its "munge symlinks" parameter. See also the "munge-sym- links" perl script in the support directory of the source code. -k, --copy-dirlinks This option causes the sending side to treat a symlink to a directory as though it were a real directory. This is useful if you don’t want symlinks to non-directo- ries to be affected, as they would be using --copy-links. Without this option, if the sending side has replaced a directory with a symlink to a directory, the receiving side will delete anything that is in the way of the new symlink, including a directory hierarchy (as long as --force or --delete is in effect). See also --keep-dirlinks for an analogous option for the receiving side. --copy-dirlinks applies to all symlinks to directories in the source. If you want to follow only a few specified symlinks, a trick you can use is to pass them as additional source args with a trailing slash, using --relative to make the paths match up right. 
For example: rsync -r --relative src/./ src/./follow-me/ dest/ This works because rsync calls lstat(2) on the source arg as given, and the trailing slash makes lstat(2) follow the symlink, giving rise to a directory in the file-list which overrides the symlink found during the scan of "src/./". -K, --keep-dirlinks This option causes the receiving side to treat a symlink to a directory as though it were a real directory, but only if it matches a real directory from the sender. Without this option, the receiver’s symlink would be deleted and replaced with a real directory. For example, suppose you transfer a directory "foo" that contains a file "file", but "foo" is a symlink to directory "bar" on the receiver. Without --keep-dirlinks, the receiver deletes symlink "foo", recreates it as a directory, and receives the file into the new directory. With --keep-dirlinks, the receiver keeps the symlink and "file" ends up in "bar". One note of caution: if you use --keep-dirlinks, you must trust all the symlinks in the copy! If it is possible for an untrusted user to create their own symlink to any directory, the user could then (on a subsequent copy) replace the symlink with a real directory and affect the content of whatever directory the symlink ref- erences. For backup copies, you are better off using something like a bind mount instead of a symlink to modify your receiving hierarchy. See also --copy-dirlinks for an analogous option for the sending side. -H, --hard-links This tells rsync to look for hard-linked files in the source and link together the corresponding files on the destination. Without this option, hard-linked files in the source are treated as though they were separate files. This option does NOT necessarily ensure that the pattern of hard links on the destination exactly matches that on the source. Cases in which the destination may end up with extra hard links include the following: o If the destination contains extraneous hard-links (more linking than what is present in the source file list), the copying algorithm will not break them explicitly. However, if one or more of the paths have content differences, the normal file-update process will break those extra links (unless you are using the --inplace option). o If you specify a --link-dest directory that contains hard links, the linking of the destination files against the --link-dest files can cause some paths in the destination to become linked together due to the --link-dest associations. Note that rsync can only detect hard links between files that are inside the transfer set. If rsync updates a file that has extra hard-link connections to files outside the transfer, that linkage will be broken. If you are tempted to use the --inplace option to avoid this breakage, be very careful that you know how your files are being updated so that you are certain that no unintended changes happen due to lingering hard links (and see the --inplace option for more caveats). If incremental recursion is active (see --recursive), rsync may transfer a missing hard-linked file before it finds that another link for that contents exists else- where in the hierarchy. This does not affect the accuracy of the transfer (i.e. which files are hard-linked together), just its efficiency (i.e. copying the data for a new, early copy of a hard-linked file that could have been found later in the transfer in another member of the hard-linked set of files). 
One way to avoid this inefficiency is to disable incremental recursion using the --no-inc-recursive option. -p, --perms This option causes the receiving rsync to set the destination permissions to be the same as the source permissions. (See also the --chmod option for a way to modify what rsync considers to be the source permissions.) When this option is off, permissions are set as follows: o Existing files (including updated files) retain their existing permissions, though the --executability option might change just the execute permission for the file. o New files get their "normal" permission bits set to the source file’s permissions masked with the receiving directory’s default permissions (either the receiving process’s umask, or the permissions specified via the destination directory’s default ACL), and their special permission bits disabled except in the case where a new directory inherits a setgid bit from its parent directory. Thus, when --perms and --executability are both disabled, rsync’s behavior is the same as that of other file-copy utilities, such as cp(1) and tar(1). In summary: to give destination files (both old and new) the source permissions, use --perms. To give new files the destination-default permissions (while leaving existing files unchanged), make sure that the --perms option is off and use --chmod=ugo=rwX (which ensures that all non-masked bits get enabled). If you’d care to make this latter behavior easier to type, you could define a popt alias for it, such as putting this line in the file ~/.popt (the following defines the -Z option, and includes --no-g to use the default group of the destination dir): rsync alias -Z --no-p --no-g --chmod=ugo=rwX You could then use this new option in a command such as this one: rsync -avZ src/ dest/ (Caveat: make sure that -a does not follow -Z, or it will re-enable the two "--no-*" options mentioned above.) The preservation of the destination’s setgid bit on newly-created directories when --perms is off was added in rsync 2.6.7. Older rsync versions erroneously preserved the three special permission bits for newly-created files when --perms was off, while overriding the destination’s setgid bit setting on a newly-created directory. Default ACL observance was added to the ACL patch for rsync 2.6.7, so older (or non-ACL-enabled) rsyncs use the umask even if default ACLs are present. (Keep in mind that it is the version of the receiving rsync that affects these behaviors.) -E, --executability This option causes rsync to preserve the executability (or non-executability) of regular files when --perms is not enabled. A regular file is considered to be executable if at least one ’x’ is turned on in its permissions. When an existing destination file’s executability differs from that of the corresponding source file, rsync modifies the destination file’s permissions as follows: o To make a file non-executable, rsync turns off all its ’x’ permissions. o To make a file executable, rsync turns on each ’x’ permission that has a corresponding ’r’ permission enabled. If --perms is enabled, this option is ignored. -A, --acls This option causes rsync to update the destination ACLs to be the same as the source ACLs. The option also implies --perms. The source and destination systems must have compatible ACL entries for this option to work properly. See the --fake-super option for a way to backup and restore ACLs that are not compatible.
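As a hedged sketch (hypothetical paths; it assumes both ends have compatible ACL support), a copy that carries ACLs along with the usual archive-mode attributes could look like this:

    # -a already implies --perms; -A layers ACL preservation on top of it
    rsync -avA /srv/share/ backuphost:/srv/share/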
-X, --xattrs This option causes rsync to update the destination extended attributes to be the same as the source ones. For systems that support extended-attribute namespaces, a copy being done by a super-user copies all namespaces except system.*. A normal user only copies the user.* namespace. To be able to backup and restore non-user namespaces as a normal user, see the --fake-super option. Note that this option does not copy rsync’s special xattr values (e.g. those used by --fake-super) unless you repeat the option (e.g. -XX). This "copy all xattrs" mode cannot be used with --fake-super. --chmod This option tells rsync to apply one or more comma-separated "chmod" modes to the permissions of the files in the transfer. The resulting value is treated as though it were the permissions that the sending side supplied for the file, which means that this option can seem to have no effect on existing files if --perms is not enabled. In addition to the normal parsing rules specified in the chmod(1) manpage, you can specify an item that should only apply to a directory by prefixing it with a ’D’, or specify an item that should only apply to a file by prefixing it with an ’F’. For example, the following will ensure that all directories get marked set-gid, that no files are other-writable, that both are user-writable and group-writable, and that both have consistent executability across all bits: --chmod=Dg+s,ug+w,Fo-w,+X Using octal mode numbers is also allowed: --chmod=D2775,F664 It is also legal to specify multiple --chmod options, as each additional option is just appended to the list of changes to make. See the --perms and --executability options for how the resulting permission value can be applied to the files in the transfer. -o, --owner This option causes rsync to set the owner of the destination file to be the same as the source file, but only if the receiving rsync is being run as the super-user (see also the --super and --fake-super options). Without this option, the owner of new and/or transferred files is set to the invoking user on the receiving side. The preservation of ownership will associate matching names by default, but may fall back to using the ID number in some circumstances (see also the --numeric-ids option for a full discussion). -g, --group This option causes rsync to set the group of the destination file to be the same as the source file. If the receiving program is not running as the super-user (or if --no-super was specified), only groups that the invoking user on the receiving side is a member of will be preserved. Without this option, the group is set to the default group of the invoking user on the receiving side. The preservation of group information will associate matching names by default, but may fall back to using the ID number in some circumstances (see also the --numeric-ids option for a full discussion). --devices This option causes rsync to transfer character and block device files to the remote system to recreate these devices. This option has no effect if the receiving rsync is not run as the super-user (see also the --super and --fake-super options). --specials This option causes rsync to transfer special files such as named sockets and fifos. -D The -D option is equivalent to --devices --specials. -t, --times This tells rsync to transfer modification times along with the files and update them on the remote system.
Note that if this option is not used, the optimization that excludes files that have not been modified cannot be effective; in other words, a missing -t or -a will cause the next transfer to behave as if it used -I, causing all files to be updated (though rsync’s delta-transfer algorithm will make the update fairly efficient if the files haven’t actually changed, you’re still much better off using -t). -O, --omit-dir-times This tells rsync to omit directories when it is preserving modification times (see --times). If NFS is sharing the directories on the receiving side, it is a good idea to use -O. This option is inferred if you use --backup without --backup-dir. -J, --omit-link-times This tells rsync to omit symlinks when it is preserving modification times (see --times). --super This tells the receiving side to attempt super-user activities even if the receiving rsync wasn’t run by the super-user. These activities include: preserving users via the --owner option, preserving all groups (not just the current user’s groups) via the --group option, and copying devices via the --devices option. This is useful for systems that allow such activities without being the super-user, and also for ensuring that you will get errors if the receiving side isn’t being run as the super-user. To turn off super-user activities, the super-user can use --no-super. --fake-super When this option is enabled, rsync simulates super-user activities by saving/restoring the privileged attributes via special extended attributes that are attached to each file (as needed). This includes the file’s owner and group (if it is not the default), the file’s device info (device & special files are created as empty text files), and any permission bits that we won’t allow to be set on the real file (e.g. the real file gets u-s,g-s,o-t for safety) or that would limit the owner’s access (since the real super-user can always access/change a file, the files we create can always be accessed/changed by the creating user). This option also handles ACLs (if --acls was specified) and non-user extended attributes (if --xattrs was specified). This is a good way to backup data without using a super-user, and to store ACLs from incompatible systems. The --fake-super option only affects the side where the option is used. To affect the remote side of a remote-shell connection, use the --remote-option (-M) option: rsync -av -M--fake-super /src/ host:/dest/ For a local copy, this option affects both the source and the destination. If you wish a local copy to enable this option just for the destination files, specify -M--fake-super. If you wish a local copy to enable this option just for the source files, combine --fake-super with -M--super. This option is overridden by both --super and --no-super. See also the "fake super" setting in the daemon’s rsyncd.conf file. -S, --sparse Try to handle sparse files efficiently so they take up less space on the destination. Conflicts with --inplace because it’s not possible to overwrite data in a sparse fashion. --preallocate This tells the receiver to allocate each destination file to its eventual size before writing data to the file. Rsync will only use the real filesystem-level preallocation support provided by Linux’s fallocate(2) system call or Cygwin’s posix_fallocate(3), not the slow glibc implementation that writes a zero byte into each block. Without this option, larger files may not be entirely contiguous on the filesystem, but with this option rsync will probably copy more slowly.
If the destination is not an extent-supporting filesystem (such as ext4, xfs, NTFS, etc.), this option may have no positive effect at all. -n, --dry-run This makes rsync perform a trial run that doesn’t make any changes (and produces mostly the same output as a real run). It is most commonly used in combination with the -v, --verbose and/or -i, --itemize-changes options to see what an rsync command is going to do before one actually runs it. The output of --itemize-changes is supposed to be exactly the same on a dry run and a subsequent real run (barring intentional trickery and system call failures); if it isn’t, that’s a bug. Other output should be mostly unchanged, but may differ in some areas. Notably, a dry run does not send the actual data for file trans- fers, so --progress has no effect, the "bytes sent", "bytes received", "literal data", and "matched data" statistics are too small, and the "speedup" value is equivalent to a run where no file transfers were needed. -W, --whole-file With this option rsync’s delta-transfer algorithm is not used and the whole file is sent as-is instead. The transfer may be faster if this option is used when the bandwidth between the source and destination machines is higher than the bandwidth to disk (especially when the "disk" is actually a networked filesystem). This is the default when both the source and destination are specified as local paths, but only if no batch-writing option is in effect. -x, --one-file-system This tells rsync to avoid crossing a filesystem boundary when recursing. This does not limit the user’s ability to specify items to copy from multiple filesystems, just rsync’s recursion through the hierarchy of each directory that the user specified, and also the analogous recursion on the receiving side during deletion. Also keep in mind that rsync treats a "bind" mount to the same device as being on the same filesystem. If this option is repeated, rsync omits all mount-point directories from the copy. Otherwise, it includes an empty directory at each mount-point it encounters (using the attributes of the mounted directory because those of the underlying mount-point directory are inaccessible). If rsync has been told to collapse symlinks (via --copy-links or --copy-unsafe-links), a symlink to a directory on another device is treated like a mount-point. Symlinks to non-directories are unaffected by this option. --existing, --ignore-non-existing This tells rsync to skip creating files (including directories) that do not exist yet on the destination. If this option is combined with the --ignore-existing option, no files will be updated (which can be useful if all you want to do is delete extraneous files). This option is a transfer rule, not an exclude, so it doesn’t affect the data that goes into the file-lists, and thus it doesn’t affect deletions. It just limits the files that the receiver requests to be transferred. --ignore-existing This tells rsync to skip updating files that already exist on the destination (this does not ignore existing directories, or nothing would get done). See also --existing. This option is a transfer rule, not an exclude, so it doesn’t affect the data that goes into the file-lists, and thus it doesn’t affect deletions. It just limits the files that the receiver requests to be transferred. This option can be useful for those doing backups using the --link-dest option when they need to continue a backup run that got interrupted. 
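For instance, an interrupted --link-dest backup run might be resumed like this (a hedged sketch; the directory names are hypothetical):

    # re-run the same backup, leaving the already-handled files alone
    rsync -av --ignore-existing --link-dest=/backups/2024-01-01/ src/ /backups/2024-01-02/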
Since a --link-dest run is copied into a new directory hierarchy (when it is used properly), using --ignore-existing will ensure that the already-handled files don’t get tweaked (which avoids a change in permissions on the hard-linked files). This does mean that this option is only looking at the existing files in the destination hierarchy itself. --remove-source-files This tells rsync to remove from the sending side the files (meaning non-directories) that are a part of the transfer and have been successfully duplicated on the receiving side. Note that you should only use this option on source files that are quiescent. If you are using this to move files that show up in a particular directory over to another host, make sure that the finished files get renamed into the source directory, not directly written into it, so that rsync can’t possibly transfer a file that is not yet fully written. If you can’t first write the files into a different directory, you should use a naming idiom that lets rsync avoid transferring files that are not yet finished (e.g. name the file "foo.new" when it is written, rename it to "foo" when it is done, and then use the option --exclude='*.new' for the rsync transfer). Starting with 3.1.0, rsync will skip the sender-side removal (and output an error) if the file’s size or modify time has changed. --delete This tells rsync to delete extraneous files from the receiving side (ones that aren’t on the sending side), but only for the directories that are being synchronized. You must have asked rsync to send the whole directory (e.g. "dir" or "dir/") without using a wildcard for the directory’s contents (e.g. "dir/*") since the wildcard is expanded by the shell and rsync thus gets a request to transfer individual files, not the files’ parent directory. Files that are excluded from the transfer are also excluded from being deleted unless you use the --delete-excluded option or mark the rules as only matching on the sending side (see the include/exclude modifiers in the FILTER RULES section). Prior to rsync 2.6.7, this option would have no effect unless --recursive was enabled. Beginning with 2.6.7, deletions will also occur when --dirs (-d) is enabled, but only for directories whose contents are being copied. This option can be dangerous if used incorrectly! It is a very good idea to first try a run using the --dry-run option (-n) to see what files are going to be deleted. If the sending side detects any I/O errors, then the deletion of any files at the destination will be automatically disabled. This is to prevent temporary filesystem failures (such as NFS errors) on the sending side from causing a massive deletion of files on the destination. You can override this with the --ignore-errors option. The --delete option may be combined with one of the --delete-WHEN options without conflict, as well as --delete-excluded. However, if none of the --delete-WHEN options are specified, rsync will choose the --delete-during algorithm when talking to rsync 3.0.0 or newer, and the --delete-before algorithm when talking to an older rsync. See also --delete-delay and --delete-after. --delete-before Request that the file-deletions on the receiving side be done before the transfer starts. See --delete (which is implied) for more details on file-deletion. Deleting before the transfer is helpful if the filesystem is tight for space and removing extraneous files would help to make the transfer possible.
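As a hedged sketch (hypothetical paths), a space-constrained destination can be cleaned up before any data arrives, ideally after previewing the deletions with a dry run:

    # preview what would be deleted, then do the real run
    rsync -avn --delete-before src/ dest/
    rsync -av --delete-before src/ dest/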
However, it does introduce a delay before the start of the transfer, and this delay might cause the transfer to timeout (if --timeout was specified). It also forces rsync to use the old, non-incremental recursion algorithm that requires rsync to scan all the files in the transfer into memory at once (see --recursive). --delete-during, --del Request that the file-deletions on the receiving side be done incrementally as the transfer happens. The per-directory delete scan is done right before each directory is checked for updates, so it behaves like a more efficient --delete-before, including doing the deletions prior to any per-directory filter files being updated. This option was first added in rsync version 2.6.4. See --delete (which is implied) for more details on file-deletion. --delete-delay Request that the file-deletions on the receiving side be computed during the transfer (like --delete-during), and then removed after the transfer completes. This is useful when combined with --delay-updates and/or --fuzzy, and is more efficient than using --delete-after (but can behave differently, since --delete-after computes the deletions in a separate pass after all updates are done). If the number of removed files overflows an internal buffer, a temporary file will be created on the receiving side to hold the names (it is removed while open, so you shouldn’t see it during the transfer). If the creation of the temporary file fails, rsync will try to fall back to using --delete-after (which it cannot do if --recursive is doing an incremental scan). See --delete (which is implied) for more details on file-deletion. --delete-after Request that the file-deletions on the receiving side be done after the transfer has completed. This is useful if you are sending new per-directory merge files as a part of the transfer and you want their exclusions to take effect for the delete phase of the current transfer. It also forces rsync to use the old, non-incremental recursion algorithm that requires rsync to scan all the files in the transfer into memory at once (see --recursive). See --delete (which is implied) for more details on file-deletion. --delete-excluded In addition to deleting the files on the receiving side that are not on the sending side, this tells rsync to also delete any files on the receiving side that are excluded (see --exclude). See the FILTER RULES section for a way to make individual exclusions behave this way on the receiver, and for a way to protect files from --delete-excluded. See --delete (which is implied) for more details on file-deletion. --ignore-missing-args When rsync is first processing the explicitly requested source files (e.g. command-line arguments or --files-from entries), it is normally an error if the file cannot be found. This option suppresses that error, and does not try to transfer the file. This does not affect subsequent vanished-file errors if a file was initially found to be present and later is no longer there. --delete-missing-args This option takes the behavior of (the implied) --ignore-missing-args option a step farther: each missing arg will become a deletion request of the corresponding destination file on the receiving side (should it exist). If the destination file is a non-empty directory, it will only be successfully deleted if --force or --delete are in effect. Other than that, this option is independent of any other type of delete processing.
The missing source files are represented by special file-list entries which display as a "*missing" entry in the --list-only output. --ignore-errors Tells --delete to go ahead and delete files even when there are I/O errors. --force This option tells rsync to delete a non-empty directory when it is to be replaced by a non-directory. This is only relevant if deletions are not active (see --delete for details). Note for older rsync versions: --force used to still be required when using --delete-after, and it used to be non-functional unless the --recursive option was also enabled. --max-delete=NUM This tells rsync not to delete more than NUM files or directories. If that limit is exceeded, all further deletions are skipped through the end of the transfer. At the end, rsync outputs a warning (including a count of the skipped deletions) and exits with an error code of 25 (unless some more important error condition also occurred). Beginning with version 3.0.0, you may specify --max-delete=0 to be warned about any extraneous files in the destination without removing any of them. Older clients interpreted this as "unlimited", so if you don’t know what version the client is, you can use the less obvious --max-delete=-1 as a backward-compatible way to spec- ify that no deletions be allowed (though really old versions didn’t warn when the limit was exceeded). --max-size=SIZE This tells rsync to avoid transferring any file that is larger than the specified SIZE. The SIZE value can be suffixed with a string to indicate a size multiplier, and may be a fractional value (e.g. "--max-size=1.5m"). This option is a transfer rule, not an exclude, so it doesn’t affect the data that goes into the file-lists, and thus it doesn’t affect deletions. It just limits the files that the receiver requests to be transferred. The suffixes are as follows: "K" (or "KiB") is a kibibyte (1024), "M" (or "MiB") is a mebibyte (1024*1024), and "G" (or "GiB") is a gibibyte (1024*1024*1024). If you want the multiplier to be 1000 instead of 1024, use "KB", "MB", or "GB". (Note: lower-case is also accepted for all values.) Finally, if the suffix ends in either "+1" or "-1", the value will be offset by one byte in the indicated direction. Examples: --max-size=1.5mb-1 is 1499999 bytes, and --max-size=2g+1 is 2147483649 bytes. Note that rsync versions prior to 3.1.0 did not allow --max-size=0. --min-size=SIZE This tells rsync to avoid transferring any file that is smaller than the specified SIZE, which can help in not transferring small, junk files. See the --max-size option for a description of SIZE and other information. Note that rsync versions prior to 3.1.0 did not allow --min-size=0. -B, --block-size=BLOCKSIZE This forces the block size used in rsync’s delta-transfer algorithm to a fixed value. It is normally selected based on the size of each file being updated. See the technical report for details. -e, --rsh=COMMAND This option allows you to choose an alternative remote shell program to use for communication between the local and remote copies of rsync. Typically, rsync is con- figured to use ssh by default, but you may prefer to use rsh on a local network. If this option is used with [user@]host::module/path, then the remote shell COMMAND will be used to run an rsync daemon on the remote host, and all data will be transmitted through that remote shell connection, rather than through a direct socket connection to a running rsync daemon on the remote host. 
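As an illustrative sketch (the host and module names are hypothetical), the daemon syntax combined with -e tunnels the daemon protocol over ssh rather than opening a direct socket to the rsync port:

    # contact the daemon module "backup" on backuphost through an ssh transport
    rsync -av -e ssh backuphost::backup/dir/ /local/dir/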
See the section "USING RSYNC-DAEMON FEATURES VIA A REMOTE-SHELL CONNECTION" above. Command-line arguments are permitted in COMMAND provided that COMMAND is presented to rsync as a single argument. You must use spaces (not tabs or other whitespace) to separate the command and args from each other, and you can use single- and/or double-quotes to preserve spaces in an argument (but not backslashes). Note that doubling a single-quote inside a single-quoted string gives you a single-quote; likewise for double-quotes (though you need to pay attention to which quotes your shell is parsing and which quotes rsync is parsing). Some examples: -e 'ssh -p 2234' -e 'ssh -o "ProxyCommand nohup ssh firewall nc -w1 %h %p"' (Note that ssh users can alternately customize site-specific connect options in their .ssh/config file.) You can also choose the remote shell program using the RSYNC_RSH environment variable, which accepts the same range of values as -e. See also the --blocking-io option which is affected by this option. --rsync-path=PROGRAM Use this to specify what program is to be run on the remote machine to start-up rsync. Often used when rsync is not in the default remote-shell’s path (e.g. --rsync-path=/usr/local/bin/rsync). Note that PROGRAM is run with the help of a shell, so it can be any program, script, or command sequence you’d care to run, so long as it does not corrupt the standard-in & standard-out that rsync is using to communicate. One tricky example is to set a different default directory on the remote machine for use with the --relative option. For instance: rsync -avR --rsync-path="cd /a/b && rsync" host:c/d /e/ -M, --remote-option=OPTION This option is used for more advanced situations where you want certain effects to be limited to one side of the transfer only. For instance, if you want to pass --log-file=FILE and --fake-super to the remote system, specify it like this: rsync -av -M --log-file=foo -M--fake-super src/ dest/ If you want to have an option affect only the local side of a transfer when it normally affects both sides, send its negation to the remote side. Like this: rsync -av -x -M--no-x src/ dest/ Be cautious using this, as it is possible to toggle an option that will cause rsync to have a different idea about what data to expect next over the socket, and that will make it fail in a cryptic fashion. Note that it is best to use a separate --remote-option for each option you want to pass. This makes your usage compatible with the --protect-args option. If that option is off, any spaces in your remote options will be split by the remote shell unless you take steps to protect them. When performing a local transfer, the "local" side is the sender and the "remote" side is the receiver. Note that some versions of the popt option-parsing library have a bug in them that prevents you from using an adjacent arg with an equal in it next to a short option letter (e.g. -M--log-file=/tmp/foo). If this bug affects your version of popt, you can use the version of popt that is included with rsync. -C, --cvs-exclude This is a useful shorthand for excluding a broad range of files that you often don’t want to transfer between systems. It uses a similar algorithm to CVS to determine if a file should be ignored.
The exclude list is initialized to exclude the following items (these initial items are marked as perishable -- see the FILTER RULES section): RCS SCCS CVS CVS.adm RCSLOG cvslog.* tags TAGS .make.state .nse_depinfo *~ #* .#* ,* _$* *$ *.old *.bak *.BAK *.orig *.rej .del-* *.a *.olb *.o *.obj *.so *.exe *.Z *.elc *.ln core .svn/ .git/ .hg/ .bzr/ then, files listed in a $HOME/.cvsignore are added to the list and any files listed in the CVSIGNORE environment variable (all cvsignore names are delimited by whitespace). Finally, any file is ignored if it is in the same directory as a .cvsignore file and matches one of the patterns listed therein. Unlike rsync’s filter/exclude files, these patterns are split on whitespace. See the cvs(1) manual for more information. If you’re combining -C with your own --filter rules, you should note that these CVS excludes are appended at the end of your own rules, regardless of where the -C was placed on the command-line. This makes them a lower priority than any rules you specified explicitly. If you want to control where these CVS excludes get inserted into your filter rules, you should omit the -C as a command-line option and use a combination of --filter=:C and --filter=-C (either on your command-line or by putting the ":C" and "-C" rules into a filter file with your other rules). The first option turns on the per-directory scanning for the .cvsignore file. The second option does a one-time import of the CVS excludes mentioned above. -f, --filter=RULE This option allows you to add rules to selectively exclude certain files from the list of files to be transferred. This is most useful in combination with a recur- sive transfer. You may use as many --filter options on the command line as you like to build up the list of files to exclude. If the filter contains whitespace, be sure to quote it so that the shell gives the rule to rsync as a single argument. The text below also mentions that you can use an underscore to replace the space that separates a rule from its arg. See the FILTER RULES section for detailed information on this option. -F The -F option is a shorthand for adding two --filter rules to your command. The first time it is used is a shorthand for this rule: --filter='dir-merge /.rsync-filter' This tells rsync to look for per-directory .rsync-filter files that have been sprinkled through the hierarchy and use their rules to filter the files in the trans- fer. If -F is repeated, it is a shorthand for this rule: --filter='exclude .rsync-filter' This filters out the .rsync-filter files themselves from the transfer. See the FILTER RULES section for detailed information on how these options work. --exclude=PATTERN This option is a simplified form of the --filter option that defaults to an exclude rule and does not allow the full rule-parsing syntax of normal filter rules. See the FILTER RULES section for detailed information on this option. --exclude-from=FILE This option is related to the --exclude option, but it specifies a FILE that contains exclude patterns (one per line). Blank lines in the file and lines starting with ’;’ or ’#’ are ignored. If FILE is -, the list will be read from standard input. --include=PATTERN This option is a simplified form of the --filter option that defaults to an include rule and does not allow the full rule-parsing syntax of normal filter rules. See the FILTER RULES section for detailed information on this option. 
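A commonly used combination of these rules (shown here as a hedged sketch with hypothetical paths) lets rsync descend into every directory while transferring only files that match one suffix:

    # '*/' lets the recursion descend, '*.c' selects the files, and the final '*' excludes everything else
    rsync -av --include='*/' --include='*.c' --exclude='*' src/ dest/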
--include-from=FILE This option is related to the --include option, but it specifies a FILE that contains include patterns (one per line). Blank lines in the file and lines starting with ’;’ or ’#’ are ignored. If FILE is -, the list will be read from standard input. --files-from=FILE Using this option allows you to specify the exact list of files to transfer (as read from the specified FILE or - for standard input). It also tweaks the default behavior of rsync to make transferring just the specified files and directories easier: o The --relative (-R) option is implied, which preserves the path information that is specified for each item in the file (use --no-relative or --no-R if you want to turn that off). o The --dirs (-d) option is implied, which will create directories specified in the list on the destination rather than noisily skipping them (use --no-dirs or --no-d if you want to turn that off). o The --archive (-a) option’s behavior does not imply --recursive (-r), so specify it explicitly, if you want it. o These side-effects change the default state of rsync, so the position of the --files-from option on the command-line has no bearing on how other options are parsed (e.g. -a works the same before or after --files-from, as does --no-R and all other options). The filenames that are read from the FILE are all relative to the source dir -- any leading slashes are removed and no ".." references are allowed to go higher than the source dir. For example, take this command: rsync -a --files-from=/tmp/foo /usr remote:/backup If /tmp/foo contains the string "bin" (or even "/bin"), the /usr/bin directory will be created as /backup/bin on the remote host. If it contains "bin/" (note the trailing slash), the immediate contents of the directory would also be sent (without needing to be explicitly mentioned in the file -- this began in version 2.6.4). In both cases, if the -r option was enabled, that dir’s entire hierarchy would also be transferred (keep in mind that -r needs to be specified explicitly with --files-from, since it is not implied by -a). Also note that the effect of the (enabled by default) --relative option is to duplicate only the path info that is read from the file -- it does not force the duplication of the source-spec path (/usr in this case). In addition, the --files-from file can be read from the remote host instead of the local host if you specify a "host:" in front of the file (the host must match one end of the transfer). As a short-cut, you can specify just a prefix of ":" to mean "use the remote end of the transfer". For example: rsync -a --files-from=:/path/file-list src:/ /tmp/copy This would copy all the files specified in the /path/file-list file that was located on the remote "src" host. If the --iconv and --protect-args options are specified and the --files-from filenames are being sent from one host to another, the filenames will be translated from the sending host’s charset to the receiving host’s charset. NOTE: sorting the list of files in the --files-from input helps rsync to be more efficient, as it will avoid re-visiting the path elements that are shared between adjacent entries. If the input is not sorted, some path elements (implied directories) may end up being scanned multiple times, and rsync will eventually undupli- cate them after they get turned into file-list elements. -0, --from0 This tells rsync that the rules/filenames it reads from a file are terminated by a null (’\0’) character, not a NL, CR, or CR+LF. 
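As a hedged sketch (hypothetical paths), a null-terminated list produced by find(1) pairs naturally with --from0 and --files-from, so that unusual filenames survive intact:

    # build a NUL-separated list of config files, then transfer exactly those entries
    cd /etc && find . -name '*.conf' -print0 > /tmp/conf.list
    rsync -av --from0 --files-from=/tmp/conf.list /etc/ backuphost:/srv/etc-backup/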
This affects --exclude-from, --include-from, --files-from, and any merged files specified in a --filter rule. It does not affect --cvs-exclude (since all names read from a .cvsignore file are split on whitespace). -s, --protect-args This option sends all filenames and most options to the remote rsync without allowing the remote shell to interpret them. This means that spaces are not split in names, and any non-wildcard special characters are not translated (such as ~, $, ;, &, etc.). Wildcards are expanded on the remote host by rsync (instead of the shell doing it). If you use this option with --iconv, the args related to the remote side will also be translated from the local to the remote character-set. The translation happens before wild-cards are expanded. See also the --files-from option. You may also control this option via the RSYNC_PROTECT_ARGS environment variable. If this variable has a non-zero value, this option will be enabled by default, otherwise it will be disabled by default. Either state is overridden by a manually specified positive or negative version of this option (note that --no-s and --no-protect-args are the negative versions). Since this option was first introduced in 3.0.0, you’ll need to make sure it’s disabled if you ever need to interact with a remote rsync that is older than that. Rsync can also be configured (at build time) to have this option enabled by default (which is overridden by both the environment and the command-line). This option will eventually become a new default setting at some as-yet-undetermined point in the future. -T, --temp-dir=DIR This option instructs rsync to use DIR as a scratch directory when creating temporary copies of the files transferred on the receiving side. The default behavior is to create each temporary file in the same directory as the associated destination file. This option is most often used when the receiving disk partition does not have enough free space to hold a copy of the largest file in the transfer. In this case (i.e. when the scratch directory is on a different disk partition), rsync will not be able to rename each received temporary file over the top of the associated destination file, but instead must copy it into place. Rsync does this by copying the file over the top of the destination file, which means that the destination file will contain truncated data during this copy. If it were not done this way (even if the destination file were first removed, the data locally copied to a temporary file in the destination directory, and then renamed into place) it would be possible for the old file to continue taking up disk space (if someone had it open), and thus there might not be enough room to fit the new version on the disk at the same time. If you are using this option for reasons other than a shortage of disk space, you may wish to combine it with the --delay-updates option, which will ensure that all copied files get put into subdirectories in the destination hierarchy, awaiting the end of the transfer. If you don’t have enough room to duplicate all the arriving files on the destination partition, another way to tell rsync that you aren’t overly concerned about disk space is to use the --partial-dir option with a relative path; because this tells rsync that it is OK to stash off a copy of a single file in a subdir in the destination hierarchy, rsync will use the partial-dir as a staging area to bring over the copied file, and then rename it into place from there.
(Specifying a --partial-dir with an absolute path does not have this side-effect.) -y, --fuzzy This option tells rsync that it should look for a basis file for any destination file that is missing. The current algorithm looks in the same directory as the destination file for either a file that has an identical size and modified-time, or a similarly-named file. If found, rsync uses the fuzzy basis file to try to speed up the transfer. If the option is repeated, the fuzzy scan will also be done in any matching alternate destination directories that are specified via --compare-dest, --copy-dest, or --link-dest. Note that the use of the --delete option might get rid of any potential fuzzy-match files, so either use --delete-after or specify some filename exclusions if you need to prevent this. --compare-dest=DIR This option instructs rsync to use DIR on the destination machine as an additional hierarchy to compare destination files against when doing transfers (if the files are missing in the destination directory). If a file is found in DIR that is identical to the sender’s file, the file will NOT be transferred to the destination directory. This is useful for creating a sparse backup of just files that have changed from an earlier backup. This option is typically used to copy into an empty (or newly created) directory. Beginning in version 2.6.4, multiple --compare-dest directories may be provided, which will cause rsync to search the list in the order specified for an exact match. If a match is found that differs only in attributes, a local copy is made and the attributes updated. If a match is not found, a basis file from one of the DIRs will be selected to try to speed up the transfer. If DIR is a relative path, it is relative to the destination directory. See also --copy-dest and --link-dest. NOTE: beginning with version 3.1.0, rsync will remove a file from a non-empty destination hierarchy if an exact match is found in one of the compare-dest hierarchies (making the end result more closely match a fresh copy). --copy-dest=DIR This option behaves like --compare-dest, but rsync will also copy unchanged files found in DIR to the destination directory using a local copy. This is useful for doing transfers to a new destination while leaving existing files intact, and then doing a flash-cutover when all files have been successfully transferred. Multiple --copy-dest directories may be provided, which will cause rsync to search the list in the order specified for an unchanged file. If a match is not found, a basis file from one of the DIRs will be selected to try to speed up the transfer. If DIR is a relative path, it is relative to the destination directory. See also --compare-dest and --link-dest. --link-dest=DIR This option behaves like --copy-dest, but unchanged files are hard linked from DIR to the destination directory. The files must be identical in all preserved attributes (e.g. permissions, possibly ownership) in order for the files to be linked together. An example: rsync -av --link-dest=$PWD/prior_dir host:src_dir/ new_dir/ If files aren’t linking, double-check their attributes. Also check if some attributes are getting forced outside of rsync’s control, such as a mount option that squishes root to a single user, or one that mounts a removable drive with generic ownership (such as OS X’s "Ignore ownership on this volume" option). Beginning in version 2.6.4, multiple --link-dest directories may be provided, which will cause rsync to search the list in the order specified for an exact match.
If a match is found that differs only in attributes, a local copy is made and the attributes updated. If a match is not found, a basis file from one of the DIRs will be selected to try to speed up the transfer. This option works best when copying into an empty destination hierarchy, as existing files may get their attributes tweaked, and that can affect alternate destina- tion files via hard-links. Also, itemizing of changes can get a bit muddled. Note that prior to version 3.1.0, an alternate-directory exact match would never be found (nor linked into the destination) when a destination file already exists. Note that if you combine this option with --ignore-times, rsync will not link any files together because it only links identical files together as a substitute for transferring the file, never as an additional check after the file is updated. If DIR is a relative path, it is relative to the destination directory. See also --compare-dest and --copy-dest. Note that rsync versions prior to 2.6.1 had a bug that could prevent --link-dest from working properly for a non-super-user when -o was specified (or implied by -a). You can work-around this bug by avoiding the -o option when sending to an old rsync. -z, --compress With this option, rsync compresses the file data as it is sent to the destination machine, which reduces the amount of data being transmitted -- something that is useful over a slow connection. Note that this option typically achieves better compression ratios than can be achieved by using a compressing remote shell or a compressing transport because it takes advantage of the implicit information in the matching data blocks that are not explicitly sent over the connection. See the --skip-compress option for the default list of file suffixes that will not be compressed. --compress-level=NUM Explicitly set the compression level to use (see --compress) instead of letting it default. If NUM is non-zero, the --compress option is implied. --skip-compress=LIST Override the list of file suffixes that will not be compressed. The LIST should be one or more file suffixes (without the dot) separated by slashes (/). You may specify an empty string to indicate that no file should be skipped. Simple character-class matching is supported: each must consist of a list of letters inside the square brackets (e.g. no special classes, such as "[:alpha:]", are supported, and ’-’ has no special meaning). The characters asterisk (*) and question-mark (?) have no special meaning. Here’s an example that specifies 6 suffixes to skip (since 1 of the 5 rules matches 2 suffixes): --skip-compress=gz/jpg/mp[34]/7z/bz2 The default list of suffixes that will not be compressed is this (in this version of rsync): 7z ace avi bz2 deb gpg gz iso jpeg jpg lz lzma lzo mov mp3 mp4 ogg png rar rpm rzip tbz tgz tlz txz xz z zip This list will be replaced by your --skip-compress list in all but one situation: a copy from a daemon rsync will add your skipped suffixes to its list of non-com- pressing files (and its list may be configured to a different default). --numeric-ids With this option rsync will transfer numeric group and user IDs rather than using user and group names and mapping them at both ends. By default rsync will use the username and groupname to determine what ownership to give files. The special uid 0 and the special group 0 are never mapped via user/group names even if the --numeric-ids option is not specified. 
If a user or group has no name on the source system or it has no match on the destination system, then the numeric ID from the source system is used instead. See also the comments on the "use chroot" setting in the rsyncd.conf manpage for information on how the chroot setting affects rsync’s ability to look up the names of the users and groups and what you can do about it. --usermap=STRING, --groupmap=STRING These options allow you to specify users and groups that should be mapped to other values by the receiving side. The STRING is one or more FROM:TO pairs of values separated by commas. Any matching FROM value from the sender is replaced with a TO value from the receiver. You may specify usernames or user IDs for the FROM and TO values, and the FROM value may also be a wild-card string, which will be matched against the sender’s names (wild-cards do NOT match against ID numbers, though see below for why a ’*’ matches everything). You may instead specify a range of ID numbers via an inclusive range: LOW-HIGH. For example: --usermap=0-99:nobody,wayne:admin,*:normal --groupmap=usr:1,1:usr The first match in the list is the one that is used. You should specify all your user mappings using a single --usermap option, and/or all your group mappings using a single --groupmap option. Note that the sender’s name for the 0 user and group are not transmitted to the receiver, so you should either match these values using a 0, or use the names in effect on the receiving side (typically "root"). All other FROM names match those in use on the sending side. All TO names match those in use on the receiving side. Any IDs that do not have a name on the sending side are treated as having an empty name for the purpose of matching. This allows them to be matched via a "*" or using an empty name. For instance: --usermap=:nobody --groupmap=*:nobody When the --numeric-ids option is used, the sender does not send any names, so all the IDs are treated as having an empty name. This means that you will need to specify numeric FROM values if you want to map these nameless IDs to different values. For the --usermap option to have any effect, the -o (--owner) option must be used (or implied), and the receiver will need to be running as a super-user (see also the --fake-super option). For the --groupmap option to have any effect, the -g (--group) option must be used (or implied), and the receiver will need to have permissions to set that group. --chown=USER:GROUP This option forces all files to be owned by USER with group GROUP. This is a simpler interface than using --usermap and --groupmap directly, but it is implemented using those options internally, so you cannot mix them. If either the USER or GROUP is empty, no mapping for the omitted user/group will occur. If GROUP is empty, the trailing colon may be omitted, but if USER is empty, a leading colon must be supplied. If you specify "--chown=foo:bar", this is exactly the same as specifying "--usermap=*:foo --groupmap=*:bar", only easier. --timeout=TIMEOUT This option allows you to set a maximum I/O timeout in seconds. If no data is transferred for the specified time then rsync will exit. The default is 0, which means no timeout. --contimeout This option allows you to set the amount of time that rsync will wait for its connection to an rsync daemon to succeed. If the timeout is reached, rsync exits with an error. --address By default rsync will bind to the wildcard address when connecting to an rsync daemon.
The --address option allows you to specify a specific IP address (or host- name) to bind to. See also this option in the --daemon mode section. --port=PORT This specifies an alternate TCP port number to use rather than the default of 873. This is only needed if you are using the double-colon (::) syntax to connect with an rsync daemon (since the URL syntax has a way to specify the port as a part of the URL). See also this option in the --daemon mode section. --sockopts This option can provide endless fun for people who like to tune their systems to the utmost degree. You can set all sorts of socket options which may make transfers faster (or slower!). Read the man page for the setsockopt() system call for details on some of the options you may be able to set. By default no special socket options are set. This only affects direct socket connections to a remote rsync daemon. This option also exists in the --daemon mode section. --blocking-io This tells rsync to use blocking I/O when launching a remote shell transport. If the remote shell is either rsh or remsh, rsync defaults to using blocking I/O, otherwise it defaults to using non-blocking I/O. (Note that ssh prefers non-blocking I/O.) --outbuf=MODE This sets the output buffering mode. The mode can be None (aka Unbuffered), Line, or Block (aka Full). You may specify as little as a single letter for the mode, and use upper or lower case. The main use of this option is to change Full buffering to Line buffering when rsync’s output is going to a file or pipe. -i, --itemize-changes Requests a simple itemized list of the changes that are being made to each file, including attribute changes. This is exactly the same as specifying --out-for- mat='%i %n%L'. If you repeat the option, unchanged files will also be output, but only if the receiving rsync is at least version 2.6.7 (you can use -vv with older versions of rsync, but that also turns on the output of other verbose messages). The "%i" escape has a cryptic output that is 11 letters long. The general format is like the string YXcstpoguax, where Y is replaced by the type of update being done, X is replaced by the file-type, and the other letters represent attributes that may be output if they are being modified. The update types that replace the Y are as follows: o A < means that a file is being transferred to the remote host (sent). o A > means that a file is being transferred to the local host (received). o A c means that a local change/creation is occurring for the item (such as the creation of a directory or the changing of a symlink, etc.). o A h means that the item is a hard link to another item (requires --hard-links). o A . means that the item is not being updated (though it might have attributes that are being modified). o A * means that the rest of the itemized-output area contains a message (e.g. "deleting"). The file-types that replace the X are: f for a file, a d for a directory, an L for a symlink, a D for a device, and a S for a special file (e.g. named sockets and fifos). The other letters in the string above are the actual letters that will be output if the associated attribute for the item is being updated or a "." for no change. Three exceptions to this are: (1) a newly created item replaces each letter with a "+", (2) an identical item replaces the dots with spaces, and (3) an unknown attribute replaces each letter with a "?" (this can happen when talking to an older rsync). 
The attribute that is associated with each letter is as follows: o A c means either that a regular file has a different checksum (requires --checksum) or that a symlink, device, or special file has a changed value. Note that if you are sending files to an rsync prior to 3.0.1, this change flag will be present only for checksum-differing regular files. o A s means the size of a regular file is different and will be updated by the file transfer. o A t means the modification time is different and is being updated to the sender’s value (requires --times). An alternate value of T means that the modifica- tion time will be set to the transfer time, which happens when a file/symlink/device is updated without --times and when a symlink is changed and the receiver can’t set its time. (Note: when using an rsync 3.0.0 client, you might see the s flag combined with t instead of the proper T flag for this time-setting failure.) o A p means the permissions are different and are being updated to the sender’s value (requires --perms). o An o means the owner is different and is being updated to the sender’s value (requires --owner and super-user privileges). o A g means the group is different and is being updated to the sender’s value (requires --group and the authority to set the group). o The u slot is reserved for future use. o The a means that the ACL information changed. o The x means that the extended attribute information changed. One other output is possible: when deleting files, the "%i" will output the string "*deleting" for each item that is being removed (assuming that you are talking to a recent enough rsync that it logs deletions instead of outputting them as a verbose message). --out-format=FORMAT This allows you to specify exactly what the rsync client outputs to the user on a per-update basis. The format is a text string containing embedded single-charac- ter escape sequences prefixed with a percent (%) character. A default format of "%n%L" is assumed if either --info=name or -v is specified (this tells you just the name of the file and, if the item is a link, where it points). For a full list of the possible escape characters, see the "log format" setting in the rsyncd.conf manpage. Specifying the --out-format option implies the --info=name option, which will mention each file, dir, etc. that gets updated in a significant way (a transferred file, a recreated symlink/device, or a touched directory). In addition, if the itemize-changes escape (%i) is included in the string (e.g. if the --itemize-changes option was used), the logging of names increases to mention any item that is changed in any way (as long as the receiving side is at least 2.6.4). See the --item- ize-changes option for a description of the output of "%i". Rsync will output the out-format string prior to a file’s transfer unless one of the transfer-statistic escapes is requested, in which case the logging is done at the end of the file’s transfer. When this late logging is in effect and --progress is also specified, rsync will also output the name of the file being transferred prior to its progress information (followed, of course, by the out-format output). --log-file=FILE This option causes rsync to log what it is doing to a file. This is similar to the logging that a daemon does, but can be requested for the client side and/or the server side of a non-daemon transfer. If specified as a client option, transfer logging will be enabled with a default format of "%i %n%L". 
See the --log-file-format option if you wish to override this.

Here’s an example command that requests the remote side to log what is happening:

    rsync -av --remote-option=--log-file=/tmp/rlog src/ dest/

This is very useful if you need to debug why a connection is closing unexpectedly.

--log-file-format=FORMAT
This allows you to specify exactly what per-update logging is put into the file specified by the --log-file option (which must also be specified for this option to have any effect). If you specify an empty string, updated files will not be mentioned in the log file. For a list of the possible escape characters, see the "log format" setting in the rsyncd.conf manpage. The default FORMAT used if --log-file is specified and this option is not is ’%i %n%L’.

--stats
This tells rsync to print a verbose set of statistics on the file transfer, allowing you to tell how effective rsync’s delta-transfer algorithm is for your data. This option is equivalent to --info=stats2 if combined with 0 or 1 -v options, or --info=stats3 if combined with 2 or more -v options. The current statistics are as follows:

o Number of files is the count of all "files" (in the generic sense), which includes directories, symlinks, etc. The total count will be followed by a list of counts by filetype (if the total is non-zero). For example: "(reg: 5, dir: 3, link: 2, dev: 1, special: 1)" lists the totals for regular files, directories, symlinks, devices, and special files. If any of the values is 0, it is completely omitted from the list.
o Number of created files is the count of how many "files" (generic sense) were created (as opposed to updated). The total count will be followed by a list of counts by filetype (if the total is non-zero).
o Number of deleted files is the count of how many "files" (generic sense) were deleted. The total count will be followed by a list of counts by filetype (if the total is non-zero). Note that this line is only output if deletions are in effect, and only if protocol 31 is being used (the default for rsync 3.1.x).
o Number of regular files transferred is the count of normal files that were updated via rsync’s delta-transfer algorithm, which does not include dirs, symlinks, etc. Note that rsync 3.1.0 added the word "regular" into this heading.
o Total file size is the total sum of all file sizes in the transfer. This does not count any size for directories or special files, but does include the size of symlinks.
o Total transferred file size is the total sum of all file sizes for just the transferred files.
o Literal data is how much unmatched file-update data we had to send to the receiver for it to recreate the updated files.
o Matched data is how much data the receiver got locally when recreating the updated files.
o File list size is how big the file-list data was when the sender sent it to the receiver. This is smaller than the in-memory size for the file list due to some compressing of duplicated data when rsync sends the list.
o File list generation time is the number of seconds that the sender spent creating the file list. This requires a modern rsync on the sending side for this to be present.
o File list transfer time is the number of seconds that the sender spent sending the file list to the receiver.
o Total bytes sent is the count of all the bytes that rsync sent from the client side to the server side.
o Total bytes received is the count of all non-message bytes that rsync received by the client side from the server side.
"Non-message" bytes means that we don’t count the bytes for a verbose message that the server sent to us, which makes the stats more consistent. -8, --8-bit-output This tells rsync to leave all high-bit characters unescaped in the output instead of trying to test them to see if they’re valid in the current locale and escaping the invalid ones. All control characters (but never tabs) are always escaped, regardless of this option’s setting. The escape idiom that started in 2.6.7 is to output a literal backslash (\) and a hash (#), followed by exactly 3 octal digits. For example, a newline would output as "\#012". A literal backslash that is in a filename is not escaped unless it is followed by a hash and 3 digits (0-9). -h, --human-readable Output numbers in a more human-readable format. There are 3 possible levels: (1) output numbers with a separator between each set of 3 digits (either a comma or a period, depending on if the decimal point is represented by a period or a comma); (2) output numbers in units of 1000 (with a character suffix for larger units -- see below); (3) output numbers in units of 1024. The default is human-readable level 1. Each -h option increases the level by one. You can take the level down to 0 (to output numbers as pure digits) by specifing the --no-human-readable (--no-h) option. The unit letters that are appended in levels 2 and 3 are: K (kilo), M (mega), G (giga), or T (tera). For example, a 1234567-byte file would output as 1.23M in level-2 (assuming that a period is your local decimal point). Backward compatibility note: versions of rsync prior to 3.1.0 do not support human-readable level 1, and they default to level 0. Thus, specifying one or two -h options will behave in a comparable manner in old and new versions as long as you didn’t specify a --no-h option prior to one or more -h options. See the --list-only option for one difference. --partial By default, rsync will delete any partially transferred file if the transfer is interrupted. In some circumstances it is more desirable to keep partially trans- ferred files. Using the --partial option tells rsync to keep the partial file which should make a subsequent transfer of the rest of the file much faster. --partial-dir=DIR A better way to keep partial files than the --partial option is to specify a DIR that will be used to hold the partial data (instead of writing it out to the desti- nation file). On the next transfer, rsync will use a file found in this dir as data to speed up the resumption of the transfer and then delete it after it has served its purpose. Note that if --whole-file is specified (or implied), any partial-dir file that is found for a file that is being updated will simply be removed (since rsync is sending files without using rsync’s delta-transfer algorithm). Rsync will create the DIR if it is missing (just the last dir -- not the whole path). This makes it easy to use a relative path (such as "--partial-dir=.rsync-par- tial") to have rsync create the partial-directory in the destination file’s directory when needed, and then remove it again when the partial file is deleted. If the partial-dir value is not an absolute path, rsync will add an exclude rule at the end of all your existing excludes. This will prevent the sending of any partial-dir files that may exist on the sending side, and will also prevent the untimely deletion of partial-dir items on the receiving side. 
An example: the above --partial-dir option would add the equivalent of "-f '-p .rsync-partial/'" at the end of any other filter rules.

If you are supplying your own exclude rules, you may need to add your own exclude/hide/protect rule for the partial-dir because (1) the auto-added rule may be ineffective at the end of your other rules, or (2) you may wish to override rsync’s exclude choice. For instance, if you want to make rsync clean-up any left-over partial-dirs that may be lying around, you should specify --delete-after and add a "risk" filter rule, e.g. -f 'R .rsync-partial/'. (Avoid using --delete-before or --delete-during unless you don’t need rsync to use any of the left-over partial-dir data during the current run.)

IMPORTANT: the --partial-dir should not be writable by other users or it is a security risk. E.g. AVOID "/tmp".

You can also set the partial-dir value with the RSYNC_PARTIAL_DIR environment variable. Setting this in the environment does not force --partial to be enabled, but rather it affects where partial files go when --partial is specified. For instance, instead of using --partial-dir=.rsync-tmp along with --progress, you could set RSYNC_PARTIAL_DIR=.rsync-tmp in your environment and then just use the -P option to turn on the use of the .rsync-tmp dir for partial transfers. The only times that the --partial option does not look for this environment value are (1) when --inplace was specified (since --inplace conflicts with --partial-dir), and (2) when --delay-updates was specified (see below).

For the purposes of the daemon-config’s "refuse options" setting, --partial-dir does not imply --partial. This is so that a refusal of the --partial option can be used to disallow the overwriting of destination files with a partial transfer, while still allowing the safer idiom provided by --partial-dir.

--delay-updates
This option puts the temporary file from each updated file into a holding directory until the end of the transfer, at which time all the files are renamed into place in rapid succession. This attempts to make the updating of the files a little more atomic. By default the files are placed into a directory named ".~tmp~" in each file’s destination directory, but if you’ve specified the --partial-dir option, that directory will be used instead. See the comments in the --partial-dir section for a discussion of how this ".~tmp~" dir will be excluded from the transfer, and what you can do if you want rsync to cleanup old ".~tmp~" dirs that might be lying around. Conflicts with --inplace and --append.

This option uses more memory on the receiving side (one bit per file transferred) and also requires enough free disk space on the receiving side to hold an additional copy of all the updated files. Note also that you should not use an absolute path to --partial-dir unless (1) there is no chance of any of the files in the transfer having the same name (since all the updated files will be put into a single directory if the path is absolute) and (2) there are no mount points in the hierarchy (since the delayed updates will fail if they can’t be renamed into place).

See also the "atomic-rsync" perl script in the "support" subdir for an update algorithm that is even more atomic (it uses --link-dest and a parallel hierarchy of files).

-m, --prune-empty-dirs
This option tells the receiving rsync to get rid of empty directories from the file-list, including nested directories that have no non-directory children.
This is useful for avoiding the creation of a bunch of useless directories when the sending rsync is recursively scanning a hierarchy of files using include/exclude/filter rules.

Note that the use of transfer rules, such as the --min-size option, does not affect what goes into the file list, and thus does not leave directories empty, even if none of the files in a directory match the transfer rule.

Because the file-list is actually being pruned, this option also affects what directories get deleted when a delete is active. However, keep in mind that excluded files and directories can prevent existing items from being deleted due to an exclude both hiding source files and protecting destination files. See the perishable filter-rule option for how to avoid this.

You can prevent the pruning of certain empty directories from the file-list by using a global "protect" filter. For instance, this option would ensure that the directory "emptydir" was kept in the file-list:

    --filter ’protect emptydir/’

Here’s an example that copies all .pdf files in a hierarchy, only creating the necessary destination directories to hold the .pdf files, and ensures that any superfluous files and directories in the destination are removed (note the hide filter of non-directories being used instead of an exclude):

    rsync -avm --del --include=’*.pdf’ -f ’hide,! */’ src/ dest

If you didn’t want to remove superfluous destination files, the more time-honored options of "--include='*/' --exclude='*'" would work fine in place of the hide-filter (if that is more natural to you).

--progress
This option tells rsync to print information showing the progress of the transfer. This gives a bored user something to watch. With a modern rsync this is the same as specifying --info=flist2,name,progress, but any user-supplied settings for those info flags take precedence (e.g. "--info=flist0 --progress").

While rsync is transferring a regular file, it updates a progress line that looks like this:

    782448 63% 110.64kB/s 0:00:04

In this example, the receiver has reconstructed 782448 bytes or 63% of the sender’s file, which is being reconstructed at a rate of 110.64 kilobytes per second, and the transfer will finish in 4 seconds if the current rate is maintained until the end.

These statistics can be misleading if rsync’s delta-transfer algorithm is in use. For example, if the sender’s file consists of the basis file followed by additional data, the reported rate will probably drop dramatically when the receiver gets to the literal data, and the transfer will probably take much longer to finish than the receiver estimated as it was finishing the matched part of the file.

When the file transfer finishes, rsync replaces the progress line with a summary line that looks like this:

    1,238,099 100% 146.38kB/s 0:00:08 (xfr#5, to-chk=169/396)

In this example, the file was 1,238,099 bytes long in total, the average rate of transfer for the whole file was 146.38 kilobytes per second over the 8 seconds that it took to complete, it was the 5th transfer of a regular file during the current rsync session, and there are 169 more files for the receiver to check (to see if they are up-to-date or not) remaining out of the 396 total files in the file-list.
In an incremental recursion scan, rsync won’t know the total number of files in the file-list until it reaches the end of the scan, but since it starts to transfer files during the scan, it will display a line with the text "ir-chk" (for incremental recursion check) instead of "to-chk" until the point that it knows the full size of the list, at which point it will switch to using "to-chk". Thus, seeing "ir-chk" lets you know that the total count of files in the file list is still going to increase (and each time it does, the count of files left to check will increase by the number of the files added to the list).

-P
The -P option is equivalent to --partial --progress. Its purpose is to make it much easier to specify these two options for a long transfer that may be interrupted.

There is also a --info=progress2 option that outputs statistics based on the whole transfer, rather than individual files. Use this flag without outputting a filename (e.g. avoid -v or specify --info=name0) if you want to see how the transfer is doing without scrolling the screen with a lot of names. (You don’t need to specify the --progress option in order to use --info=progress2.)

--password-file=FILE
This option allows you to provide a password for accessing an rsync daemon via a file or via standard input if FILE is -. The file should contain just the password on the first line (all other lines are ignored). Rsync will exit with an error if FILE is world readable or if a root-run rsync command finds a non-root-owned file.

This option does not supply a password to a remote shell transport such as ssh; to learn how to do that, consult the remote shell’s documentation. When accessing an rsync daemon using a remote shell as the transport, this option only comes into effect after the remote shell finishes its authentication (i.e. if you have also specified a password in the daemon’s config file).

--list-only
This option will cause the source files to be listed instead of transferred. This option is inferred if there is a single source arg and no destination specified, so its main uses are: (1) to turn a copy command that includes a destination arg into a file-listing command, or (2) to be able to specify more than one source arg (note: be sure to include the destination). Caution: keep in mind that a source arg with a wild-card is expanded by the shell into multiple args, so it is never safe to try to list such an arg without using this option. For example:

    rsync -av --list-only foo* dest/

Starting with rsync 3.1.0, the sizes output by --list-only are affected by the --human-readable option. By default they will contain digit separators, but higher levels of readability will output the sizes with unit suffixes. Note also that the column width for the size output has increased from 11 to 14 characters for all human-readable levels. Use --no-h if you want just digits in the sizes, and the old column width of 11 characters.

Compatibility note: when requesting a remote listing of files from an rsync that is version 2.6.3 or older, you may encounter an error if you ask for a non-recursive listing. This is because a file listing implies the --dirs option w/o --recursive, and older rsyncs don’t have that option. To avoid this problem, either specify the --no-dirs option (if you don’t need to expand a directory’s content), or turn on recursion and exclude the content of subdirectories: -r --exclude='/*/*'.
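Spelled out as a full command (the daemon host and module name here are only placeholders), that workaround might look like:

    rsync -r --exclude='/*/*' oldhost::module/path/

The single source arg with no destination makes --list-only implicit, and the exclude keeps the forced recursion from descending below the requested directory.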
--bwlimit=RATE
This option allows you to specify the maximum transfer rate for the data sent over the socket, specified in units per second. The RATE value can be suffixed with a string to indicate a size multiplier, and may be a fractional value (e.g. "--bwlimit=1.5m"). If no suffix is specified, the value will be assumed to be in units of 1024 bytes (as if "K" or "KiB" had been appended). See the --max-size option for a description of all the available suffixes. A value of zero specifies no limit.

For backward-compatibility reasons, the rate limit will be rounded to the nearest KiB unit, so no rate smaller than 1024 bytes per second is possible.

Rsync writes data over the socket in blocks, and this option both limits the size of the blocks that rsync writes, and tries to keep the average transfer rate at the requested limit. Some "burstiness" may be seen where rsync writes out a block of data and then sleeps to bring the average rate into compliance.

Due to the internal buffering of data, the --progress option may not be an accurate reflection of how fast the data is being sent. This is because some files can show up as being rapidly sent when the data is quickly buffered, while others can show up as very slow when the flushing of the output buffer occurs. This may be fixed in a future version.

--stop-at=y-m-dTh:m
This option allows you to specify at what time to stop rsync, in year-month-dayThour:minute numeric format (e.g. 2004-12-31T23:59). You can specify a 2 or 4-digit year. You can also leave off various items and the result will be the next possible time that matches the specified data. For example, "1-30" specifies the next January 30th (at midnight), "04:00" specifies the next 4am, "1" specifies the next 1st of the month at midnight, and ":59" specifies the next 59th minute after the hour. If you prefer, you may separate the date numbers using slashes instead of dashes.

--time-limit=MINS
This option allows you to specify the maximum number of minutes rsync will run for.

--write-batch=FILE
Record a file that can later be applied to another identical destination with --read-batch. See the "BATCH MODE" section for details, and also the --only-write-batch option.

--only-write-batch=FILE
Works like --write-batch, except that no updates are made on the destination system when creating the batch. This lets you transport the changes to the destination system via some other means and then apply the changes via --read-batch.

Note that you can feel free to write the batch directly to some portable media: if this media fills to capacity before the end of the transfer, you can just apply that partial transfer to the destination and repeat the whole process to get the rest of the changes (as long as you don’t mind a partially updated destination system while the multi-update cycle is happening).

Also note that you only save bandwidth when pushing changes to a remote system because this allows the batched data to be diverted from the sender into the batch file without having to flow over the wire to the receiver (when pulling, the sender is remote, and thus can’t write the batch).

--read-batch=FILE
Apply all of the changes stored in FILE, a file previously generated by --write-batch. If FILE is -, the batch data will be read from standard input. See the "BATCH MODE" section for details.

--protocol=NUM
Force an older protocol version to be used. This is useful for creating a batch file that is compatible with an older version of rsync.
For instance, if rsync 2.6.4 is being used with the --write-batch option, but rsync 2.6.3 is what will be used to run the --read-batch option, you should use "--protocol=28" when creating the batch file to force the older protocol version to be used in the batch file (assuming you can’t upgrade the rsync on the reading system).

--iconv=CONVERT_SPEC
Rsync can convert filenames between character sets using this option. Using a CONVERT_SPEC of "." tells rsync to look up the default character-set via the locale setting. Alternately, you can fully specify what conversion to do by giving a local and a remote charset separated by a comma in the order --iconv=LOCAL,REMOTE, e.g. --iconv=utf8,iso88591. This order ensures that the option will stay the same whether you’re pushing or pulling files. Finally, you can specify either --no-iconv or a CONVERT_SPEC of "-" to turn off any conversion. The default setting of this option is site-specific, and can also be affected via the RSYNC_ICONV environment variable.

For a list of what charset names your local iconv library supports, you can run "iconv --list".

If you specify the --protect-args option (-s), rsync will translate the filenames you specify on the command-line that are being sent to the remote host. See also the --files-from option.

Note that rsync does not do any conversion of names in filter files (including include/exclude files). It is up to you to ensure that you’re specifying matching rules that can match on both sides of the transfer. For instance, you can specify extra include/exclude rules if there are filename differences on the two sides that need to be accounted for.

When you pass an --iconv option to an rsync daemon that allows it, the daemon uses the charset specified in its "charset" configuration parameter regardless of the remote charset you actually pass. Thus, you may feel free to specify just the local charset for a daemon transfer (e.g. --iconv=utf8).

-4, --ipv4 or -6, --ipv6
Tells rsync to prefer IPv4/IPv6 when creating sockets. This only affects sockets that rsync has direct control over, such as the outgoing socket when directly contacting an rsync daemon. See also these options in the --daemon mode section.

If rsync was compiled without support for IPv6, the --ipv6 option will have no effect. The --version output will tell you if this is the case.

--checksum-seed=NUM
Set the checksum seed to the integer NUM. This 4-byte checksum seed is included in each block and MD4 file checksum calculation (the more modern MD5 file checksums don’t use a seed). By default the checksum seed is generated by the server and defaults to the current time(). This option is used to set a specific checksum seed, which is useful for applications that want repeatable block checksums, or in the case where the user wants a more random checksum seed. Setting NUM to 0 causes rsync to use the default of time() for checksum seed.

DAEMON OPTIONS

The options allowed when starting an rsync daemon are as follows:

--daemon
This tells rsync that it is to run as a daemon. The daemon you start running may be accessed using an rsync client using the host::module or rsync://host/module/ syntax.

If standard input is a socket then rsync will assume that it is being run via inetd, otherwise it will detach from the current terminal and become a background daemon. The daemon will read the config file (rsyncd.conf) on each connect made by a client and respond to requests accordingly. See the rsyncd.conf(5) man page for more details.
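A standalone daemon, for instance, could be started with something like the following (the config path shown is only an illustration; it is also the normal default, as noted below):

    rsync --daemon --config=/etc/rsyncd.conf

after which clients can reach the modules defined in that file via the host::module or rsync://host/module/ syntax described above.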
--address
By default rsync will bind to the wildcard address when run as a daemon with the --daemon option. The --address option allows you to specify a specific IP address (or hostname) to bind to. This makes virtual hosting possible in conjunction with the --config option. See also the "address" global option in the rsyncd.conf manpage.

--bwlimit=RATE
This option allows you to specify the maximum transfer rate for the data the daemon sends over the socket. The client can still specify a smaller --bwlimit value, but no larger value will be allowed. See the client version of this option (above) for some extra details.

--config=FILE
This specifies an alternate config file than the default. This is only relevant when --daemon is specified. The default is /etc/rsyncd.conf unless the daemon is running over a remote shell program and the remote user is not the super-user; in that case the default is rsyncd.conf in the current directory (typically $HOME).

-M, --dparam=OVERRIDE
This option can be used to set a daemon-config parameter when starting up rsync in daemon mode. It is equivalent to adding the parameter at the end of the global settings prior to the first module’s definition. The parameter names can be specified without spaces, if you so desire. For instance:

    rsync --daemon -M pidfile=/path/rsync.pid

--no-detach
When running as a daemon, this option instructs rsync to not detach itself and become a background process. This option is required when running as a service on Cygwin, and may also be useful when rsync is supervised by a program such as daemontools or AIX’s System Resource Controller. --no-detach is also recommended when rsync is run under a debugger. This option has no effect if rsync is run from inetd or sshd.

--port=PORT
This specifies an alternate TCP port number for the daemon to listen on rather than the default of 873. See also the "port" global option in the rsyncd.conf manpage.

--log-file=FILE
This option tells the rsync daemon to use the given log-file name instead of using the "log file" setting in the config file.

--log-file-format=FORMAT
This option tells the rsync daemon to use the given FORMAT string instead of using the "log format" setting in the config file. It also enables "transfer logging" unless the string is empty, in which case transfer logging is turned off.

--sockopts
This overrides the socket options setting in the rsyncd.conf file and has the same syntax.

-v, --verbose
This option increases the amount of information the daemon logs during its startup phase. After the client connects, the daemon’s verbosity level will be controlled by the options that the client used and the "max verbosity" setting in the module’s config section.

-4, --ipv4 or -6, --ipv6
Tells rsync to prefer IPv4/IPv6 when creating the incoming sockets that the rsync daemon will use to listen for connections. One of these options may be required in older versions of Linux to work around an IPv6 bug in the kernel (if you see an "address already in use" error when nothing else is using the port, try specifying --ipv6 or --ipv4 when starting the daemon).

If rsync was compiled without support for IPv6, the --ipv6 option will have no effect. The --version output will tell you if this is the case.

-h, --help
When specified after --daemon, print a short help page describing the options available for starting an rsync daemon.

FILTER RULES

The filter rules allow for flexible selection of which files to transfer (include) and which files to skip (exclude).
The rules either directly specify include/exclude patterns or they specify a way to acquire more include/exclude patterns (e.g. to read them from a file).

As the list of files/directories to transfer is built, rsync checks each name to be transferred against the list of include/exclude patterns in turn, and the first matching pattern is acted on: if it is an exclude pattern, then that file is skipped; if it is an include pattern then that filename is not skipped; if no matching pattern is found, then the filename is not skipped.

Rsync builds an ordered list of filter rules as specified on the command-line. Filter rules have the following syntax:

    RULE [PATTERN_OR_FILENAME]
    RULE,MODIFIERS [PATTERN_OR_FILENAME]

You have your choice of using either short or long RULE names, as described below. If you use a short-named rule, the ’,’ separating the RULE from the MODIFIERS is optional. The PATTERN or FILENAME that follows (when present) must come after either a single space or an underscore (_). Here are the available rule prefixes:

    exclude, -      specifies an exclude pattern.
    include, +      specifies an include pattern.
    merge, .        specifies a merge-file to read for more rules.
    dir-merge, :    specifies a per-directory merge-file.
    hide, H         specifies a pattern for hiding files from the transfer.
    show, S         files that match the pattern are not hidden.
    protect, P      specifies a pattern for protecting files from deletion.
    risk, R         files that match the pattern are not protected.
    clear, !        clears the current include/exclude list (takes no arg)

When rules are being read from a file, empty lines are ignored, as are comment lines that start with a "#".

Note that the --include/--exclude command-line options do not allow the full range of rule parsing as described above -- they only allow the specification of include/exclude patterns plus a "!" token to clear the list (and the normal comment parsing when rules are read from a file). If a pattern does not begin with "- " (dash, space) or "+ " (plus, space), then the rule will be interpreted as if "+ " (for an include option) or "- " (for an exclude option) were prefixed to the string. A --filter option, on the other hand, must always contain either a short or long rule name at the start of the rule.

Note also that the --filter, --include, and --exclude options take one rule/pattern each. To add multiple ones, you can repeat the options on the command-line, use the merge-file syntax of the --filter option, or the --include-from/--exclude-from options.

INCLUDE/EXCLUDE PATTERN RULES

You can include and exclude files by specifying patterns using the "+", "-", etc. filter rules (as introduced in the FILTER RULES section above). The include/exclude rules each specify a pattern that is matched against the names of the files that are going to be transferred. These patterns can take several forms:

o if the pattern starts with a / then it is anchored to a particular spot in the hierarchy of files, otherwise it is matched against the end of the pathname. This is similar to a leading ^ in regular expressions. Thus "/foo" would match a name of "foo" at either the "root of the transfer" (for a global rule) or in the merge-file’s directory (for a per-directory rule). An unqualified "foo" would match a name of "foo" anywhere in the tree because the algorithm is applied recursively from the top down; it behaves as if each path component gets a turn at being the end of the filename.
Even the unanchored "sub/foo" would match at any point in the hierarchy where a "foo" was found within a directory named "sub". See the section on ANCHORING INCLUDE/EXCLUDE PATTERNS for a full discussion of how to specify a pattern that matches at the root of the transfer.

o if the pattern ends with a / then it will only match a directory, not a regular file, symlink, or device.

o rsync chooses between doing a simple string match and wildcard matching by checking if the pattern contains one of these three wildcard characters: ’*’, ’?’, and ’[’ .

o a ’*’ matches any path component, but it stops at slashes.

o use ’**’ to match anything, including slashes.

o a ’?’ matches any character except a slash (/).

o a ’[’ introduces a character class, such as [a-z] or [[:alpha:]].

o in a wildcard pattern, a backslash can be used to escape a wildcard character, but it is matched literally when no wildcards are present.

o if the pattern contains a / (not counting a trailing /) or a "**", then it is matched against the full pathname, including any leading directories. If the pattern doesn’t contain a / or a "**", then it is matched only against the final component of the filename. (Remember that the algorithm is applied recursively so "full filename" can actually be any portion of a path from the starting directory on down.)

o a trailing "dir_name/***" will match both the directory (as if "dir_name/" had been specified) and everything in the directory (as if "dir_name/**" had been specified). This behavior was added in version 2.6.7.

Note that, when using the --recursive (-r) option (which is implied by -a), every subcomponent of every path is visited from the top down, so include/exclude patterns get applied recursively to each subcomponent’s full name (e.g. to include "/foo/bar/baz" the subcomponents "/foo" and "/foo/bar" must not be excluded). The exclude patterns actually short-circuit the directory traversal stage when rsync finds the files to send. If a pattern excludes a particular parent directory, it can render a deeper include pattern ineffectual because rsync did not descend through that excluded section of the hierarchy. This is particularly important when using a trailing ’*’ rule. For instance, this won’t work:

    + /some/path/this-file-will-not-be-found
    + /file-is-included
    - *

This fails because the parent directory "some" is excluded by the ’*’ rule, so rsync never visits any of the files in the "some" or "some/path" directories. One solution is to ask for all directories in the hierarchy to be included by using a single rule: "+ */" (put it somewhere before the "- *" rule), and perhaps use the --prune-empty-dirs option. Another solution is to add specific include rules for all the parent dirs that need to be visited.
For instance, this set of rules works fine:

    + /some/
    + /some/path/
    + /some/path/this-file-is-found
    + /file-also-included
    - *

Here are some examples of exclude/include matching:

o "- *.o" would exclude all names matching *.o
o "- /foo" would exclude a file (or directory) named foo in the transfer-root directory
o "- foo/" would exclude any directory named foo
o "- /foo/*/bar" would exclude any file named bar which is at two levels below a directory named foo in the transfer-root directory
o "- /foo/**/bar" would exclude any file named bar two or more levels below a directory named foo in the transfer-root directory
o The combination of "+ */", "+ *.c", and "- *" would include all directories and C source files but nothing else (see also the --prune-empty-dirs option)
o The combination of "+ foo/", "+ foo/bar.c", and "- *" would include only the foo directory and foo/bar.c (the foo directory must be explicitly included or it would be excluded by the "*")

The following modifiers are accepted after a "+" or "-":

o A / specifies that the include/exclude rule should be matched against the absolute pathname of the current item. For example, "-/ /etc/passwd" would exclude the passwd file any time the transfer was sending files from the "/etc" directory, and "-/ subdir/foo" would always exclude "foo" when it is in a dir named "subdir", even if "foo" is at the root of the current transfer.
o A ! specifies that the include/exclude should take effect if the pattern fails to match. For instance, "-! */" would exclude all non-directories.
o A C is used to indicate that all the global CVS-exclude rules should be inserted as excludes in place of the "-C". No arg should follow.
o An s is used to indicate that the rule applies to the sending side. When a rule affects the sending side, it prevents files from being transferred. The default is for a rule to affect both sides unless --delete-excluded was specified, in which case default rules become sender-side only. See also the hide (H) and show (S) rules, which are an alternate way to specify sending-side includes/excludes.
o An r is used to indicate that the rule applies to the receiving side. When a rule affects the receiving side, it prevents files from being deleted. See the s modifier for more info. See also the protect (P) and risk (R) rules, which are an alternate way to specify receiver-side includes/excludes.
o A p indicates that a rule is perishable, meaning that it is ignored in directories that are being deleted. For instance, the -C option’s default rules that exclude things like "CVS" and "*.o" are marked as perishable, and will not prevent a directory that was removed on the source from being deleted on the destination.

MERGE-FILE FILTER RULES

You can merge whole files into your filter rules by specifying either a merge (.) or a dir-merge (:) filter rule (as introduced in the FILTER RULES section above).

There are two kinds of merged files -- single-instance (’.’) and per-directory (’:’). A single-instance merge file is read one time, and its rules are incorporated into the filter list in the place of the "." rule. For per-directory merge files, rsync will scan every directory that it traverses for the named file, merging its contents when the file exists into the current list of inherited rules. These per-directory rule files must be created on the sending side because it is the sending side that is being scanned for the available files to transfer.
These rule files may also need to be transferred to the receiving side if you want them to affect what files don’t get deleted (see PER-DIRECTORY RULES AND DELETE below).

Some examples:

    merge /etc/rsync/default.rules
    . /etc/rsync/default.rules
    dir-merge .per-dir-filter
    dir-merge,n- .non-inherited-per-dir-excludes
    :n- .non-inherited-per-dir-excludes

The following modifiers are accepted after a merge or dir-merge rule:

o A - specifies that the file should consist of only exclude patterns, with no other rule-parsing except for in-file comments.
o A + specifies that the file should consist of only include patterns, with no other rule-parsing except for in-file comments.
o A C is a way to specify that the file should be read in a CVS-compatible manner. This turns on ’n’, ’w’, and ’-’, but also allows the list-clearing token (!) to be specified. If no filename is provided, ".cvsignore" is assumed.
o An e will exclude the merge-file name from the transfer; e.g. "dir-merge,e .rules" is like "dir-merge .rules" and "- .rules".
o An n specifies that the rules are not inherited by subdirectories.
o A w specifies that the rules are word-split on whitespace instead of the normal line-splitting. This also turns off comments. Note: the space that separates the prefix from the rule is treated specially, so "- foo + bar" is parsed as two rules (assuming that prefix-parsing wasn’t also disabled).
o You may also specify any of the modifiers for the "+" or "-" rules (above) in order to have the rules that are read in from the file default to having that modifier set (except for the ! modifier, which would not be useful). For instance, "merge,-/ .excl" would treat the contents of .excl as absolute-path excludes, while "dir-merge,s .filt" and ":sC" would each make all their per-directory rules apply only on the sending side. If the merge rule specifies sides to affect (via the s or r modifier or both), then the rules in the file must not specify sides (via a modifier or a rule prefix such as hide).

Per-directory rules are inherited in all subdirectories of the directory where the merge-file was found unless the ’n’ modifier was used. Each subdirectory’s rules are prefixed to the inherited per-directory rules from its parents, which gives the newest rules a higher priority than the inherited rules. The entire set of dir-merge rules are grouped together in the spot where the merge-file was specified, so it is possible to override dir-merge rules via a rule that got specified earlier in the list of global rules. When the list-clearing rule ("!") is read from a per-directory file, it only clears the inherited rules for the current merge file.

Another way to prevent a single rule from a dir-merge file from being inherited is to anchor it with a leading slash. Anchored rules in a per-directory merge-file are relative to the merge-file’s directory, so a pattern "/foo" would only match the file "foo" in the directory where the dir-merge filter file was found.

Here’s an example filter file which you’d specify via --filter=". file":

    merge /home/user/.global-filter
    - *.gz
    dir-merge .rules
    + *.[ch]
    - *.o

This will merge the contents of the /home/user/.global-filter file at the start of the list and also turns the ".rules" filename into a per-directory filter file. All rules read in prior to the start of the directory scan follow the global anchoring rules (i.e. a leading slash matches at the root of the transfer).
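To apply a filter file like the one above, you might run something along these lines (the filter-file path and the source/destination are only placeholders):

    rsync -av --filter='. /home/user/filter-file' src/ dest/

The single-instance merge rule reads the named file once at startup and splices its rules into the filter list at that position.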
If a per-directory merge-file is specified with a path that is a parent directory of the first transfer directory, rsync will scan all the parent dirs from that starting point to the transfer directory for the indicated per-directory file. For instance, here is a common filter (see -F):

    --filter=': /.rsync-filter'

That rule tells rsync to scan for the file .rsync-filter in all directories from the root down through the parent directory of the transfer prior to the start of the normal directory scan of the file in the directories that are sent as a part of the transfer. (Note: for an rsync daemon, the root is always the same as the module’s "path".)

Some examples of this pre-scanning for per-directory files:

    rsync -avF /src/path/ /dest/dir
    rsync -av --filter=': ../../.rsync-filter' /src/path/ /dest/dir
    rsync -av --filter=': .rsync-filter' /src/path/ /dest/dir

The first two commands above will look for ".rsync-filter" in "/" and "/src" before the normal scan begins looking for the file in "/src/path" and its subdirectories. The last command avoids the parent-dir scan and only looks for the ".rsync-filter" files in each directory that is a part of the transfer.

If you want to include the contents of a ".cvsignore" in your patterns, you should use the rule ":C", which creates a dir-merge of the .cvsignore file, but parsed in a CVS-compatible manner. You can use this to affect where the --cvs-exclude (-C) option’s inclusion of the per-directory .cvsignore file gets placed into your rules by putting the ":C" wherever you like in your filter rules. Without this, rsync would add the dir-merge rule for the .cvsignore file at the end of all your other rules (giving it a lower priority than your command-line rules).

DIAGNOSTICS

rsync occasionally produces error messages that may seem a little cryptic. The one that seems to cause the most confusion is "protocol version mismatch -- is your shell clean?". This message is usually caused by your startup scripts or remote shell facility producing unwanted garbage on the stream that rsync is using for its transport. The way to diagnose this problem is to run your remote shell like this:

    ssh remotehost /bin/true > out.dat

then look at out.dat. If everything is working correctly then out.dat should be a zero length file. If you are getting the above error from rsync then you will probably find that out.dat contains some text or data. Look at the contents and try to work out what is producing it. The most common cause is incorrectly configured shell startup scripts (such as .cshrc or .profile) that contain output statements for non-interactive logins.

If you are having trouble debugging filter patterns, then try specifying the -vv option. At this level of verbosity rsync will show why each individual file is included or excluded.

EXIT VALUES

    0     Success
    1     Syntax or usage error
    2     Protocol incompatibility
    3     Errors selecting input/output files, dirs
    4     Requested action not supported: an attempt was made to manipulate 64-bit files on a platform that cannot support them; or an option was specified that is supported by the client and not by the server.
    5     Error starting client-server protocol
    6     Daemon unable to append to log-file
    10    Error in socket I/O
    11    Error in file I/O
    12    Error in rsync protocol data stream
    13    Errors with program diagnostics
    14    Error in IPC code
    20    Received SIGUSR1 or SIGINT
    21    Some error returned by waitpid()
    22    Error allocating core memory buffers
    23    Partial transfer due to error
    24    Partial transfer due to vanished source files
    25    The --max-delete limit stopped deletions
    30    Timeout in data send/receive
    35    Timeout waiting for daemon connection

ENVIRONMENT VARIABLES

CVSIGNORE
The CVSIGNORE environment variable supplements any ignore patterns in .cvsignore files. See the --cvs-exclude option for more details.

RSYNC_ICONV
Specify a default --iconv setting using this environment variable. (First supported in 3.0.0.)
RSYNC_PROTECT_ARGS
Specify a non-zero numeric value if you want the --protect-args option to be enabled by default, or a zero value to make sure that it is disabled by default. (First supported in 3.1.0.)

RSYNC_RSH
The RSYNC_RSH environment variable allows you to override the default shell used as the transport for rsync. Command line options are permitted after the command name, just as in the -e option.

RSYNC_PROXY
The RSYNC_PROXY environment variable allows you to redirect your rsync client to use a web proxy when connecting to an rsync daemon. You should set RSYNC_PROXY to a hostname:port pair.

RSYNC_PASSWORD
Setting RSYNC_PASSWORD to the required password allows you to run authenticated rsync connections to an rsync daemon without user intervention. Note that this does not supply a password to a remote shell transport such as ssh; to learn how to do that, consult the remote shell’s documentation.

USER or LOGNAME
The USER or LOGNAME environment variables are used to determine the default username sent to an rsync daemon. If neither is set, the username defaults to "nobody".

HOME
The HOME environment variable is used to find the user’s default .cvsignore file.

FILES

/etc/rsyncd.conf or rsyncd.conf

SEE ALSO

rsyncd.conf(5)

BUGS

times are transferred as *nix time_t values

When transferring to FAT filesystems rsync may re-sync unmodified files. See the comments on the --modify-window option.

file permissions, devices, etc. are transferred as native numerical values

see also the comments on the --delete option

Please report bugs! See the web site at http://rsync.samba.org/

VERSION

This man page is current for version 3.1.0 of rsync.

INTERNAL OPTIONS

The options --server and --sender are used internally by rsync, and should never be typed by a user under normal circumstances. Some awareness of these options may be needed in certain scenarios, such as when setting up a login that can only run an rsync command. For instance, the support directory of the rsync distribution has an example script named rrsync (for restricted rsync) that can be used with a restricted ssh login.

CREDITS

rsync is distributed under the GNU General Public License. See the file COPYING for details.

A WEB site is available at http://rsync.samba.org/. The site includes an FAQ-O-Matic which may cover questions unanswered by this manual page.

The primary ftp site for rsync is ftp://rsync.samba.org/pub/rsync.

We would be delighted to hear from you if you like this program. Please contact the mailing-list at rsync@lists.samba.org.

This program uses the excellent zlib compression library written by Jean-loup Gailly and Mark Adler.

THANKS

Special thanks go out to: John Van Essen, Matt McCutchen, Wesley W. Terpstra, David Dykstra, Jos Backus, Sebastian Krahmer, Martin Pool, and our gone-but-not-forgotten compadre, J.W. Schultz.

Thanks also to Richard Brent, Brendan Mackay, Bill Waite, Stephen Rothwell and David Bell. I’ve probably missed some people, my apologies if I have.

AUTHOR

rsync was originally written by Andrew Tridgell and Paul Mackerras. Many people have later contributed to it. It is currently maintained by Wayne Davison.

Mailing lists for support and development are available at http://lists.samba.org

28 Sep 2013                                                        rsync(1)