The VBScript Network and Systems Administrator's Cafe

Aug 13 2008   2:29PM GMT

Essential tools: Wget, a command line tool to retrieve web pages

Jerry Lees

There is nothing more annoying than having a web server or site down while IE (or Firefox, for that matter) becomes dog slow or simply gets in the way of troubleshooting the page. Sometimes these browsers even mask the error page the server sends back; IE’s “friendly” HTTP error messages are one example. When it comes right down to fixing the problem, sometimes you need to retrieve just the HTML code a particular web page sends, simply for inspection or analysis. That is where our next essential tool comes in!

Wget is a small (~325K) command line utility that lets you quickly download a file over HTTP, HTTPS, or FTP and save it locally, so you can open it in a text editor, keep a copy in an alternate location, or compare it to what a specific browser renders after download. Wget for Windows can be downloaded here. It’s a powerful tool, and covering all the options in one posting isn’t possible, so let’s start off with a little syntax to get you rolling:

In its simplest form, you can download a specific page by giving its full URL, as shown below:

wget http://www.gamersigs.net
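For troubleshooting, it often helps to see the response headers and to choose the file name the page is saved under. The wrapper below is a minimal sketch of that, assuming a POSIX-style shell (for example Cygwin on Windows) with GNU Wget on the PATH. The function name fetch_page and the default file name page.html are made up for illustration and are not part of Wget.

```shell
# fetch_page URL [OUTFILE]
# Hypothetical helper: download one page and show the HTTP response headers.
fetch_page() {
    url="$1"
    outfile="${2:-page.html}"   # assumed default file name, change to taste
    # --server-response prints the status line and headers the server sent
    # --output-document writes the page body to the named file
    wget --server-response --output-document="$outfile" "$url"
}

# Example (needs network access):
#   fetch_page http://www.gamersigs.net gamersigs.html
```

When the server answers with an error status, Wget normally reports the status and does not keep the body. Newer builds offer a --content-on-error option that saves it, so check wget --help on your version before relying on it.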

Alternatively, you can download a site and all its linked items recursively to a specific number of levels. This is useful to archive a site or to simply grab files that the HTML uses but doesn’t link to directly, Cascading Style Sheets (CSS) for example. The syntax below will recursively get 2 levels of www.msn.com and automatically create a directory called www.msn.com in the current directory.

wget -r --level=2 http://www.msn.com
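If the goal is a copy that still displays properly offline, the recursive download can be combined with Wget's --page-requisites option, which asks Wget to also fetch the files needed to display each page, such as images and style sheets. The sketch below uses the long option names for readability; the function name mirror_site and its default depth of 2 are assumptions for illustration only.

```shell
# mirror_site URL [DEPTH]
# Hypothetical helper: recursive download limited to DEPTH levels.
mirror_site() {
    url="$1"
    depth="${2:-2}"   # assumed default, matching the 2-level example above
    # --recursive        follow links found in the downloaded pages
    # --level            stop after this many levels of links
    # --page-requisites  also fetch images, CSS, and similar supporting files
    wget --recursive --level="$depth" --page-requisites "$url"
}

# Example (needs network access, creates a www.msn.com directory):
#   mirror_site http://www.msn.com 2
```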

If the page links to an HTTPS page, wget will automatically try to negotiate an SSL connection. You can optionally specify the SSL protocol to use by adding --secure-protocol=PR, where PR is either auto, SSLv2, SSLv3, or TLSv1. This is especially helpful for testing that your servers do not respond to the weaker SSLv2 protocol.
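One way to run that kind of check is to loop over the protocol names and see which ones the server accepts. This is only a sketch under a few assumptions: a POSIX-style shell, a Wget build that still recognizes each protocol name (some builds drop the older ones, in which case the attempt fails for that reason instead), and a made-up function name, check_protocols.

```shell
# check_protocols URL
# Hypothetical helper: try each protocol named above and report the result.
check_protocols() {
    url="$1"
    for proto in SSLv2 SSLv3 TLSv1; do
        # --spider checks that the URL can be reached without saving the page
        # --quiet suppresses Wget's normal progress output
        if wget --quiet --spider --secure-protocol="$proto" "$url"; then
            echo "$proto: accepted"
        else
            echo "$proto: refused or unsupported"
        fi
    done
}

# Example (needs network access; the host name is a placeholder):
#   check_protocols https://www.example.com/
```

A failure here only means Wget could not complete the request with that protocol, so treat an "accepted" result for SSLv2 as the signal worth following up on.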

If you deal with websites as a part of your Systems Administration duties, or if you’re just interested in them as a side project at the office, I’m sure you’ll add this tool to your essential tools.

Know of a tool that you think is essential? Post a comment here and if I don’t already have it in my tool belt, I’ll add it and give it a shot. If it makes the grade, I’ll add it to the list of tools to review. The only criteria are:

  1. The tool must be free, or inexpensive with a “Per User” or “site” type license. (No pay per installation licenses, please.)
  2. The tool (or its installation file) must be small enough to fit on a 256 MB flash drive for portability.
  3. Command line run time options are beneficial, but not required.
  4. If it has ads… it needs to be truly INVALUABLE.
  5. It should make the user’s job easier by gathering information or performing a task that a typical Network or Systems Administrator would perform.

Enjoy!

2 Comments on this Post

 
  • Labnuke99
    A nice tool with a graphical interface for Windows that performs similar tasks is WinHTTrack (http://www.httrack.com/). I have used it to copy a website with lots of Excel resources that I keep local for reference when I don't have internet access. I download the website onto a USB drive with lots of space so no worries about cluttering my machine's drive.
  • Jerry Lees
    Labnuke99, Thanks for the comment! I've looked at this tool. Very nice tool to pull down a copy of the HTML rendered by a web site. I'll be playing with it a bit to get a feel for it.
