download.file {utils}                                    R Documentation

Download File from the Internet

Description

This function can be used to download a file from the Internet.

Usage

download.file(url, destfile, method, quiet = FALSE, mode = "w",
              cacheOK = TRUE)

Arguments

url       A character string naming the URL of a resource to be downloaded.
destfile  A character string with the name where the downloaded file is saved. Tilde-expansion is performed.
method    Method to be used for downloading files. The available methods are "internal", "wget" and "lynx". By default the first of these, "internal", is chosen. The method can also be set through the option "download.file.method": see options().
quiet     If TRUE, suppress status messages (if any).
mode      character. The mode with which to write the file. Useful values are "w", "wb" (binary), "a" (append) and "ab". Only used for the "internal" method.
cacheOK   logical. Is a server-side cached value acceptable? Implemented for the "internal" and "wget" methods.
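
As an illustration of the mode argument, the following sketch copies a small binary file through a "file://" URL with mode = "wb" and checks that the bytes arrive unchanged. The temporary file names and byte values are made up for the example, and the URL construction assumes a Unix-like absolute path.

```r
# Write a few arbitrary bytes (including CR and LF) to a temporary file.
bytes <- as.raw(c(0x89, 0x50, 0x0d, 0x0a, 0x1a, 0x0a))
src <- tempfile()
writeBin(bytes, src)

# Assumes a Unix-like absolute path, giving a URL of the form file:///tmp/...
src_url <- paste("file://", src, sep = "")

# mode = "wb" requests a binary write, so no line-ending translation occurs.
dest <- tempfile()
download.file(src_url, destfile = dest, mode = "wb", quiet = TRUE)

# Read back slightly more than was written to detect any extra bytes.
copied <- readBin(dest, what = "raw", n = length(bytes) + 10)
identical(copied, bytes)
```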

Details

download.file downloads a single file, identified by url, from the Internet and stores it in destfile. The url must start with a scheme such as "http://", "ftp://" or "file://".
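
A minimal sketch of a call, using a "file://" URL so that it runs without a network connection. The file contents and variable names are invented for the example, and the URL is built on the assumption of a Unix-like absolute path.

```r
# Create a small local file to stand in for the remote resource.
src <- tempfile()
writeLines(c("alpha", "beta"), src)

# The URL must carry a scheme; here "file://" is prepended to the path.
src_url <- paste("file://", src, sep = "")

# Download to a second temporary file, suppressing status messages.
dest <- tempfile()
download.file(src_url, destfile = dest, quiet = TRUE)

# The downloaded copy can then be read like any other local file.
lines_read <- readLines(dest)
lines_read
```

The same call with an "http://" or "ftp://" URL would follow the same pattern, subject to the method and proxy settings described elsewhere on this page.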

cacheOK = FALSE is useful for "http://" URLs, and will attempt to get a copy directly from the site rather than from an intermediate cache. (Not all platforms support it.) It is used by CRAN.packages.

The remaining details apply to method "internal" only.

The timeout for many parts of the transfer can be set by the option timeout, which defaults to 60 seconds.

The level of detail provided during transfer can be set by the quiet argument and the internet.info option. The details depend on the platform and scheme, but setting internet.info to 0 gives all available details, including all server responses. Using 2 (the default) gives only serious messages, and 3 or more suppresses all messages.
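
The timeout and internet.info options can be changed temporarily around a download, as in the sketch below. The values 120 and 0 are arbitrary choices for illustration, a local "file://" URL (Unix-like path assumed) stands in for a remote resource, and how much extra detail is actually printed depends on the platform and scheme.

```r
# Save the current values so they can be restored afterwards.
old <- options(timeout = 120, internet.info = 0)

# A local file stands in for a remote resource (Unix-like path assumed).
src <- tempfile()
writeLines("example", src)
dest <- tempfile()
download.file(paste("file://", src, sep = ""), destfile = dest)

# Restore the previous settings.
options(old)
```

Saving the list returned by options() and passing it back restores both settings in one step.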

A progress bar tracks the transfer. If the file length is known, the full width of the bar is the known length. Otherwise the initial width represents 100 Kbytes and is doubled whenever the current width is exceeded.

An alternative method is available if Internet Explorer 4 or later is installed. If R is started with the flag --internet2, the ‘Internet Options’ of the system are used to choose proxies and so on; these are set in the Control Panel and are those used for Internet Explorer. This version does not support cacheOK = FALSE.

Method "wget" can be used with proxy firewalls which require user/password authentication if proper values are stored in the configuration file for wget.

Setting Proxies

This applies to the internal code only.

Proxies can be specified via environment variables. Setting "no_proxy" stops any proxy being tried. Otherwise the setting of "http_proxy" or "ftp_proxy" (or failing that, the all upper-case version) is consulted and if non-empty used as a proxy site. For FTP transfers, the username and password on the proxy can be specified by "ftp_proxy_user" and "ftp_proxy_password".

The form of "http_proxy" should be "http://proxy.dom.com/" or "http://proxy.dom.com:8080/" where the port defaults to 80 and the trailing slash may be omitted. For "ftp_proxy" use the form "ftp://proxy.dom.com:3128/" where the default port is 21.

These environment variables must be set before the download code is first used: they cannot be altered later by calling Sys.putenv.

Usernames and passwords can be set for HTTP proxy transfers via the environment variable "http_proxy_user" in the form user:passwd. Alternatively, "http_proxy" can be of the form "http://user:pass@proxy.dom.com:8080/" for compatibility with wget. Only the HTTP/1.0 basic authentication scheme is supported. Under Windows, if "http_proxy_user" is set to "ask" then a dialog box will come up for the user to enter the username and password. NB: you will be given only one opportunity to enter this, but if proxy authentication is required and fails there will be one further prompt per download.
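
The lookup order described in this section can be mimicked from within R to check what the current environment would imply. The helper effective_proxy below is hypothetical and written only for this illustration; it mirrors the description above rather than the internal code, and it only reads the variables.

```r
# Hypothetical helper: report the proxy implied by the environment variables,
# following the order described in this section.
effective_proxy <- function(scheme = "http") {
  # A non-empty "no_proxy" stops any proxy being tried.
  if (nchar(as.vector(Sys.getenv("no_proxy"))) > 0) return("")

  # Consult the lower-case variable first, e.g. "http_proxy" ...
  lower <- paste(scheme, "_proxy", sep = "")
  value <- as.vector(Sys.getenv(lower))

  # ... and fall back to the all upper-case version if it is empty.
  if (nchar(value) == 0) value <- as.vector(Sys.getenv(toupper(lower)))
  value
}

# An empty string means that no proxy would be used.
http_setting <- effective_proxy("http")
ftp_setting <- effective_proxy("ftp")
```

Because the variables must be in place before the download code is first used, a check of this kind is most useful at the start of a session.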

Note

Methods "wget" and "lynx" are for historical compatibility. They will block all other activity on the R process.

For methods "wget" and "lynx" a system call is made to the tool given by method, and the respective program must be installed on your system and be in the search path for executables.
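
Since the external methods depend on a program being found in the search path, it can be worth checking for it before choosing a method. The sketch below assumes a version of R that provides Sys.which (it may not exist in the version documented here); the variable names are illustrative only.

```r
# Look for the wget executable in the search path; "" means it was not found.
wget_path <- as.vector(Sys.which("wget"))
have_wget <- nchar(wget_path) > 0

# Choose "wget" only when the program is available, else fall back.
chosen_method <- if (have_wget) "wget" else "internal"
chosen_method
```

The chosen value could then be passed as the method argument of download.file.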

See Also

options to set the timeout and internet.info options.

url for a finer-grained way to read data from URLs.

url.show, CRAN.packages, download.packages for applications.


[Package utils version 2.1.0 Index]