What is Wget in Linux?
Wget is a versatile command-line utility for Linux users to download files and websites. But what exactly is it, and why is it so popular?
Have you ever been in front of a Linux machine that can’t get into its desktop environment, but you need to download a file? Have you ever needed to schedule downloads on a Linux computer? If you have, then chances are that wget was your go-to tool.
Wget (short for World Wide Web Get) is an open-source utility that allows you to retrieve content from web servers via HTTP, HTTPS, and FTP protocols. It is designed for non-interactive downloads, meaning it can operate in the background without user input, making it great for scheduled downloads.
Wget is included by default on most Linux distributions and is a vital tool for systems administrators, developers, and power users. Let's discuss why you should be using wget and how to use it for the best experience.
Why Use Wget?
Other than being able to access the internet directly from the CLI (command line interface), there are a few other benefits that make wget well worth your while:
Fast and lightweight downloading: Wget has no graphical overhead, so it starts instantly and runs well on servers and over slow links. It fetches each file over a single connection, so it is not inherently faster than a browser on the same network, but it is far easier to leave running unattended.
Reliability: Wget automatically retries failed transfers and, with the -c option, resumes a partial download after a network interruption instead of starting over.
Scripting and scheduling: Wget can be easily scripted and scheduled with cron jobs for unattended downloads. This allows automating large batch downloads that would be tedious to do manually.
Recursive downloading: Wget can recursively download web page content and mirror entire websites easily. This makes it useful for website backups and archiving.
Bandwidth throttling: Wget lets you limit download bandwidth usage, which is helpful when you don't want to saturate your network connection and cause slow downloads for everyone else.
Authentication: Wget also supports username and password authentication for downloading protected content on the web.
Customization: Wget is highly customizable with many options available for advanced usage scenarios that might not be possible with an internet browser.
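The scripting and scheduling point above can be sketched as a small shell function. This is a minimal illustration, not a definitive setup: the function name fetch_batch, the file urls.txt, the log file, and the script path in the crontab comment are all hypothetical.

```shell
#!/bin/sh
# Sketch of an unattended batch download. The names and paths used here
# (fetch_batch, urls.txt, wget-batch.log, nightly-fetch.sh) are assumptions.

# fetch_batch URL_LIST DEST_DIR LOG_FILE
#   -c   continue partially downloaded files instead of restarting them
#   -nv  keep the log terse
#   -i   read the URLs to download from a file, one per line
#   -P   save everything under DEST_DIR
#   -a   append messages to LOG_FILE instead of printing them
fetch_batch() {
    url_list=$1
    dest_dir=$2
    log_file=$3
    mkdir -p "$dest_dir" || return 1
    wget -c -nv -i "$url_list" -P "$dest_dir" -a "$log_file"
}

# Example crontab entry (edit with `crontab -e`) that would run a script
# containing the call below every night at 02:30:
#   30 2 * * * /home/user/bin/nightly-fetch.sh
#
# fetch_batch "$HOME/urls.txt" "$HOME/downloads" "$HOME/wget-batch.log"
```

Because the function only wraps a single wget call, the same line can be pasted directly into a script if you prefer not to define a function.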
You might wonder how wget compares to curl. The curl vs wget debate is more about use cases than about which tool is better. curl excels at making individual HTTP requests and interacting with a wide range of protocols, while wget is known for its simplicity and its ability to download recursively and mirror websites. Your choice will come down to the specific tasks you need to accomplish and your familiarity with each tool.
How to Install Wget Command on Linux
The good news is that wget comes pre-installed on most Linux distributions, but you may need to install it yourself, depending on your distro. Luckily, installing it is very easy, and we'll cover some of the most common methods below.
Install wget on Ubuntu/Debian:
sudo apt update
sudo apt install wget
Install wget on CentOS/RHEL:
sudo yum update
sudo yum install wget
Install wget on Fedora:
sudo dnf update
sudo dnf install wget
Install wget on Arch Linux/Manjaro:
sudo pacman -Syu
sudo pacman -S wget
Once installed, you can verify wget is available by typing wget --version, which prints the installed version.
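If you want a script to check for wget before relying on it, something along these lines may help. It is a small sketch; the function name require_tool is made up for this example.

```shell
# require_tool NAME: succeed if NAME is on the PATH, otherwise print a hint.
# (require_tool is a hypothetical helper name, not part of wget.)
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 is not installed; install it with your package manager" >&2
        return 1
    fi
}

# Usage:
#   require_tool wget && wget --version | head -n 1
```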
How to Use wget: The Basics
Using wget is quite straightforward and requires only a few text parameters from the command line. Let's go over some basic examples of downloading files with wget.
How to download a file over HTTP:
wget https://download.com/example.zip
This will download the example.zip file from the URL and save it in the current working directory, which is the directory you launched the wget command from in the CLI. (We cover specifying download locations in our examples later on in the article.)
How to download a file over FTP:
wget ftp://username:password@ftp.site.com/public/file.iso
In this example, we download an ISO file from an FTP server that requires a username and password. Keep in mind that plain FTP sends credentials unencrypted and that a password typed into the URL ends up in your shell history, so avoid this form for sensitive accounts.
You can also specify the output file name and location:
wget -O /home/user/downloads/newfilename.zip https://site.com/file.zip
This will save the downloaded file as newfilename.zip in /home/user/downloads. The directory must already exist, because -O does not create it.
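When you only want to choose the directory and keep the remote file name, the -P (--directory-prefix) option is often a better fit than -O, and it also behaves sensibly with several URLs, whereas -O would write them all into one file. A minimal sketch, with download_into as a hypothetical helper name:

```shell
# download_into DIR URL...: save each URL under DIR, keeping remote file names.
# (download_into is an illustrative name; -P is wget's --directory-prefix.)
download_into() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    wget -P "$dir" "$@"
}

# Usage:
#   download_into "$HOME/downloads" https://site.com/file.zip
```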
How to Use Wget: Advanced Wget Usage
In addition to basic downloading, wget has many advanced features and options. Let's explore how to make the most out of this command line utility.
How to limit download speed and bandwidth
You can cap the download speed when you need to save bandwidth. The value is in bytes per second by default, and the k and m suffixes stand for kilobytes and megabytes:
wget --limit-rate=300k https://site.com/large-file.mpg
This restricts the download to a maximum of 300 kilobytes per second, which is useful on a bandwidth-limited connection where you don't want one download hogging the entire line.
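To get a feel for what a given limit means in practice, you can estimate the transfer time with shell arithmetic. This sketch assumes that the k suffix means 1024 bytes; eta_seconds is a hypothetical helper, and the result is rounded down to whole seconds.

```shell
# eta_seconds SIZE_BYTES RATE_KIB: whole seconds needed at RATE_KIB KiB/s.
# Assumes 1k = 1024 bytes; ignores protocol overhead and slow starts.
eta_seconds() {
    echo $(( $1 / ($2 * 1024) ))
}

# A 1 GiB file (1073741824 bytes) at --limit-rate=300k:
eta_seconds 1073741824 300   # prints 3495, a little over 58 minutes
```

If that is too slow for your needs, the same option accepts larger values such as --limit-rate=2m.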
How to resume interrupted downloads
Wget can resume an interrupted download with the -c option, which is great for larger files and intermittent internet connections:
wget -c https://site.com/big-file.iso
If the download times out or disconnects, run the same command again from the same directory, and wget continues the incomplete file from where it stopped instead of starting over. This works only if the server supports partial (range) requests.
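For flaky connections, -c is commonly combined with wget's retry options so that a single command survives several interruptions. The values below are arbitrary examples, and resume_download is a hypothetical wrapper name.

```shell
# resume_download URL: continue a partial file, retrying on failure.
#   -c             continue the partial file in the current directory
#   --tries=10     give up after 10 attempts
#   --waitretry=5  wait 1s, 2s, ... up to 5 seconds between retries
#   --timeout=30   treat a stalled connection as failed after 30 seconds
resume_download() {
    wget -c --tries=10 --waitretry=5 --timeout=30 "$1"
}

# Usage:
#   resume_download https://site.com/big-file.iso
```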
How to recursively download and mirror websites
If you have ever wanted to back up your blog or recursively download an entire website, you can try the example below on your own site:
wget --recursive --no-clobber --page-requisites --adjust-extension --convert-links --restrict-file-names=windows --domains website.org --no-parent https://www.website.org/tutorials/html/
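This will mirror the website's HTML tutorials section for backup. The options work together: --page-requisites fetches the images and stylesheets each page needs, --convert-links rewrites links so the copy works offline, --no-parent keeps wget from climbing above the starting directory, and --no-clobber is meant to skip files that already exist (recent wget versions may warn that it is ignored when combined with --convert-links). A real gold mine of information is in the man pages, though. Type in: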
man wget
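A gentler variant of the mirroring command above limits the recursion depth and pauses between requests so that the target server is not hammered. This is only a sketch: mirror_section is a hypothetical wrapper, and the depth of 3 and the one-second wait are arbitrary starting points.

```shell
# mirror_section URL DOMAIN: copy one section of a site for offline reading.
#   --level=3       follow links at most three levels deep
#   --wait=1        pause one second between requests
#   --random-wait   vary that pause so requests are less bursty
mirror_section() {
    wget --recursive --level=3 --no-parent \
         --page-requisites --convert-links --adjust-extension \
         --wait=1 --random-wait \
         --domains "$2" "$1"
}

# Usage:
#   mirror_section https://www.website.org/tutorials/html/ website.org
```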
How to Use Wget with Authentication
You may need to access protected resources that require authentication. Here's how to use username and password with wget:
wget --user=username --password=password https://secure.site.com/private-file.pdf
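This will authenticate using the provided credentials and download the private PDF file. A password given on the command line is visible in your shell history and in the process list, so on shared machines consider replacing --password with --ask-password, which prompts for it instead.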
You can also save the credentials in a .netrc file:
machine secure.site.com
login username
password password
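Wget looks for a .netrc file in your home directory automatically, so once the file is in place no extra option is needed:
wget https://secure.site.com/confidential-doc.docx
This reads the credentials for secure.site.com from ~/.netrc, which keeps them off the command line. The connection itself is protected by HTTPS, not by the .netrc file.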
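Since a .netrc file stores the password in plain text, it is worth making it readable by your user only. One way to create it is sketched below; write_netrc is a hypothetical helper, and the host and credentials are the placeholder values from the example above.

```shell
# write_netrc FILE HOST USER PASSWORD: create FILE readable by its owner only.
# (write_netrc is an illustrative name; the values passed in are placeholders.)
write_netrc() {
    (
        umask 077    # new files get no group/other permissions
        printf 'machine %s\nlogin %s\npassword %s\n' "$2" "$3" "$4" > "$1"
    ) && chmod 600 "$1"   # also tighten a file that already existed
}

# Usage:
#   write_netrc "$HOME/.netrc" secure.site.com username password
```

Note that this overwrites the target file, so append the three lines by hand instead if you already keep entries for other hosts there.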
How to Troubleshoot Wget
Sometimes things go wrong. The good news is wget is very well documented, and many of the issues you will face can be fixed quite easily. Here are some common wget issues and fixes if you get stuck:
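Certificate errors: First check that the system clock and the CA certificates package are up to date. Use --no-check-certificate to bypass SSL certificate validation only if you really need to and trust the server, because it removes the protection against impersonation.
Protocol errors: Check the URL scheme and try the other one the server offers, for example HTTPS if plain HTTP fails. To confirm whether a server supports resuming, combine --spider (check the URL without downloading it) with -S (print the server's response headers) and look for an Accept-Ranges: bytes header. Wget downloads each file over a single connection, so there is no parallel or threaded mode to test for.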
Authentication issues: Double-check username and password. For public sites, try without authentication, as some public resources don’t require authentication to access files for downloading.
Connection issues: If your downloads are failing to start, then you might need to check your network, DNS, and firewall settings. If you have a particularly troublesome internet connection, use -c together with --tries and --timeout so that wget retries and resumes rather than giving up on larger downloads.
Permission errors: These mean you don't have write permission for the location you want to save to. Saving to your home directory, or pointing wget at a writable directory with -P, is usually a safer fix than running wget with sudo, which leaves the downloaded files owned by root.
Recursive errors: Fine-tune the recursion options to match the site structure, such as the depth (--level), the hosts wget may follow links to (--domains), and --no-parent. Sometimes a website's file structure is the issue, so starting with a shallow depth and increasing it is a good way to trace the fault.
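Two small helpers can make the troubleshooting steps above easier to script. The first checks for resume support using the --spider and -S options. The second translates wget's documented exit status into a hint. Both function names are hypothetical, and the hints in parentheses are suggestions, not wget output.

```shell
# supports_resume URL: succeed if the server advertises byte-range support.
# --spider checks the URL without saving it; -S prints the response headers
# (wget writes them to stderr, hence the 2>&1).
supports_resume() {
    wget --spider -S "$1" 2>&1 | grep -qi 'Accept-Ranges: *bytes'
}

# explain_wget_status CODE: translate wget's exit status into a hint.
explain_wget_status() {
    case $1 in
        0) echo "success" ;;
        1) echo "generic error" ;;
        2) echo "command-line or .wgetrc parse error" ;;
        3) echo "file I/O error (check write permissions and disk space)" ;;
        4) echo "network failure (check connection, DNS, firewall)" ;;
        5) echo "SSL verification failure (check certificates and clock)" ;;
        6) echo "authentication failure (check username and password)" ;;
        7) echo "protocol error" ;;
        8) echo "server returned an error response" ;;
        *) echo "unknown status $1" ;;
    esac
}

# Usage:
#   supports_resume https://site.com/big-file.iso && echo "resume should work"
#   wget https://site.com/file.zip; explain_wget_status $?
```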
Final Thoughts on Linux Wget
As you can see, Wget is an indispensable tool for Linux power users who want to download files efficiently, or even automatically, from the command line. With its vast range of options and modes of operation, wget can automate and simplify large batch downloads that would be difficult or impossible through a web browser.
Now that you are familiar with the basics and advanced usage of wget, you can start to experiment with it and leverage its capabilities for both simple and complex downloading tasks. Refer to the wget man pages for more details on all available options and see what else you can do with this unassuming utility.