The Blog of Joshua Blais.

How to Archive the Internet

Josh Blais

This post came out of reading an article by Drew DeVault that I stumbled upon, though I no longer remember how.

You've been there before.

That article that you loved was removed, and you have no way of seeing it again. Or that YouTube video you referenced that helped you solve a problem got taken down because the song in it was copyrighted.

I've had it happen to me a thousand times. Entire websites I enjoyed reading, gone. Maybe their creators wanted to do something else, or maybe the creator is no longer with us.

Today, I'm going to teach you how to save your favorite posts, videos, audio tracks, and more for your own offline viewing. If an article or video resonates with you, you should archive it to revisit it later. This has become my own standard practice.

The thing is that this also requires you to be organised, as well as responsible for your content. You need to save the files for yourself.

If you're fine with that, let's begin.

There are a few resources that you can use absolutely free of charge. Let's talk about saving a website:

Let's say you come across an article you want to review in the future. Perhaps it's a really good recipe for a strudel that you just HAVE to keep in your /Recipes folder on your computer. (I don't remember the last time I ate strudel, but let's go with it.)

The setup depends on which operating system you're using: Windows, Mac, or Linux. Either way, you're going to use the exact same tool: wget.

What is wget?

Wget is a file downloader. Your favorite website is just a collection of files, which means that you can download it. Wget is free and open source software that you can use from any command-line interface. I'll briefly mention below how to download and install it on various operating systems.

Installing Wget on Windows:

Follow this guide below.

Install Wget on Windows

Installing Wget on Mac:

Homebrew is your best friend. Follow this guide.

Installing Wget on Linux:

You probably already have it installed. If not, use your favorite package manager and get it on your machine.
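For reference, the install step usually comes down to a single command. Here is a minimal sketch, assuming you have one of the common package managers; `wget_install_cmd` is a helper name I made up for illustration, and the package is assumed to be called `wget` in each repository:

```shell
# Print the usual command for installing wget with a given package manager.
# wget_install_cmd is a hypothetical helper name, not part of any tool.
wget_install_cmd() {
  case "$1" in
    brew)   echo "brew install wget" ;;      # macOS with Homebrew
    apt)    echo "sudo apt install wget" ;;  # Debian, Ubuntu
    dnf)    echo "sudo dnf install wget" ;;  # Fedora
    pacman) echo "sudo pacman -S wget" ;;    # Arch
    *)      echo "unknown package manager: $1" >&2; return 1 ;;
  esac
}

# Find the first package manager present on this machine and show its command.
for pm in brew apt dnf pacman; do
  if command -v "$pm" >/dev/null 2>&1; then
    wget_install_cmd "$pm"
    break
  fi
done
```

Once it's installed, `wget --version` should print a version number.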

Use of Wget

I have archived entire forums of information with this utility. My favorite site from a few years back, Bold & Determined, shut its doors unexpectedly last year, and I used this utility to archive my favorite articles. This sort of thing happens all the time on the world wide web.

So, if you want to get down to archiving your favorite site - here's how to do it.

Open up a terminal

You know how to do this now, if you followed the above guides.

Go to whatever folder you want the page stored

If, for example, you want the recipe stored in your /Recipes folder, navigate there.

Type in the following:

 wget -O NAMEOF_ARTICLE_YOU_WANT.html https://site-you-are-downloading.com/great-recipe

You'll see the command line freeze for a second, and then you should have the article in your folder.

Simple as that, you have downloaded a site page you want to keep forever.
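One caveat: the `-O` command above saves only the HTML file, so images and stylesheets still load from the live site. If you want those too, or a whole site like the forums I mentioned, wget has flags for that. Here is a rough sketch, assuming GNU wget; `save_page`, `mirror_site`, and the `DRY_RUN` switch are names I invented for this example:

```shell
# save_page, mirror_site and DRY_RUN are hypothetical names for this sketch.
# Set DRY_RUN=1 to print the wget command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Save one page plus the images, CSS and scripts it needs to display offline.
# --convert-links rewrites links so they point at the local copies.
save_page() {
  run wget --page-requisites --convert-links --adjust-extension --no-parent "$1"
}

# Save a whole site (or one section of it), waiting a second between requests
# so you don't hammer the server.
mirror_site() {
  run wget --mirror --page-requisites --convert-links --adjust-extension \
      --no-parent --wait=1 "$1"
}

# Example usage (placeholder URL):
#   mkdir -p ~/Recipes && cd ~/Recipes
#   save_page https://site-you-are-downloading.com/great-recipe
```

Unlike `-O`, these commands pick the file names themselves and create a folder named after the site, so run them from the folder where you want the archive to live.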

What About YouTube?

There's a great application that was built for this exact purpose.

youtube-dl

Edit July 2023: I use yt-dlp now instead.

My two favourite aliases are:

In my .zshrc:

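# 'ba' picks the best audio-only stream and keeps its original format;
# add -x --audio-format mp3 if you want an actual MP3 (needs ffmpeg).
alias ytmp3="yt-dlp -f 'ba'"
# and
alias ytd='yt-dlp'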

Check this page out for your own info as to how to use it. But I will just say that I use it on the regular for music I want to archive, guides to doing things that I want to look up, and much more.
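To give you a feel for it, here is a small sketch of two things I'd expect most people to want: audio as an MP3, and a tidy folder of videos that can be topped up later. It assumes yt-dlp and ffmpeg are installed; `yt_audio`, `yt_keep`, the `DRY_RUN` switch, and the `downloaded.txt` file name are all made up for this example:

```shell
# yt_audio, yt_keep and DRY_RUN are hypothetical names for this sketch.
# Set DRY_RUN=1 to print the yt-dlp command instead of running it.
ytdlp() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "yt-dlp $*"
  else
    yt-dlp "$@"
  fi
}

# Download the audio only and convert it to MP3 (the conversion needs ffmpeg).
yt_audio() {
  ytdlp --extract-audio --audio-format mp3 "$1"
}

# Download a video, channel, or playlist into one folder per uploader.
# --download-archive records finished IDs so reruns skip what you already have.
yt_keep() {
  ytdlp --download-archive downloaded.txt \
        --output '%(uploader)s/%(title)s [%(id)s].%(ext)s' "$1"
}

# Example usage (placeholder URL):
#   yt_audio 'https://www.youtube.com/watch?v=VIDEO_ID'
```

Keeping the video ID in the file name is a design choice: titles change and collide, IDs don't.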

How about SoundCloud?

Same. The same tool handles SoundCloud links.

With storage being so cheap nowadays (you can get a 10 TB external drive for $180 USD), you really don't have any reason not to have a backup of your favorite resources that you can tap into any time you desire.

Go forth and download the internet for yourself!

Until next time.
