Sunday, March 31, 2013

youtube-dl, a YouTube downloader in Python




I came across youtube-dl recently. As the title says, it is a YouTube downloader tool, written in Python.


(Update: I made a typo in the title of this post - I wrote YouRube. It was unintended :-) and is fixed now. The post URL can't be fixed, though, except by deleting and re-entering the post.)

I downloaded the tool (it comes both as a binary and as source) and ran it at the Windows command line to download a favorite video of mine from YouTube. It worked fine.

youtube-dl has various command-line options that you can set to configure its behavior.

Here is the documentation for youtube-dl.

Then I took a look at the source code. The code for the tool is in the public domain.

It is a few thousand lines long, and divided into 5 or 6 Python source files. There is one file called FileDownloader, another called InfoExtractor, and a few other supporting files.

I've only had a short look at the code so far, but I could figure out a few things about the design, and I give an overview of that below. I'm going to check out the code in more detail later. It is a non-trivial app and can serve as a good codebase to study.

Instances of the FileDownloader class and of the InfoExtractor classes register themselves with each other. In fact, a number of InfoExtractors can be registered with a single FileDownloader instance. The FileDownloader then passes the input video URL (the one the user wants to download) to the chain of InfoExtractors, and uses the first one that reports it can handle the request to extract the required info about the video. The FileDownloader itself then does the actual downloading of the file (if the user requested it; it can do things other than download the file, too).

Some of youtube-dl's features are:

- Supports multiple YouTube formats. The youtube-dl developers link to this list of YouTube formats on Wikipedia, so I guess the tool must support at least some of them; I haven't tried formats other than the default, but I may later. Those YouTube formats include HD video (higher resolution) and 3GP (for mobile). If youtube-dl supports 3GP, you could use it to download videos for mobile to your PC and then transfer them to your mobile for offline viewing, provided you can find an offline video player app for your mobile. I'm going to try that out on my Android phone.

- Supports multiple video sites, not just YouTube. Some of them are Vimeo, Google Video (not sure if that is still operational), blip.tv, SoundCloud, and InfoQ.

- The tool youtube-dl can be used to update itself, and they recommend you do it often, with the command:

youtube-dl -U
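
Going back to the design described earlier (a FileDownloader that asks each registered InfoExtractor in turn whether it can handle a URL), here is a minimal sketch of that idea. It is my own illustration, not youtube-dl's actual code: the class names follow the ones mentioned above, but the method names, the two toy extractors, and the URL checks are assumptions made for the example, and nothing is really downloaded.

```python
class InfoExtractor:
    """Base class for site-specific extractors (illustrative only)."""

    def __init__(self):
        self.downloader = None

    def set_downloader(self, downloader):
        # The extractor remembers which downloader it is registered with.
        self.downloader = downloader

    def suitable(self, url):
        # Subclasses report whether they can handle this URL.
        raise NotImplementedError

    def extract(self, url):
        # Subclasses return a dict of info about the video.
        raise NotImplementedError


class YouTubeExtractor(InfoExtractor):
    def suitable(self, url):
        return "youtube.com/watch" in url

    def extract(self, url):
        return {"site": "youtube", "url": url}


class VimeoExtractor(InfoExtractor):
    def suitable(self, url):
        return "vimeo.com/" in url

    def extract(self, url):
        return {"site": "vimeo", "url": url}


class FileDownloader:
    def __init__(self):
        self.extractors = []

    def add_info_extractor(self, extractor):
        # Mutual registration: each side keeps a reference to the other.
        self.extractors.append(extractor)
        extractor.set_downloader(self)

    def download(self, url):
        # Use the first extractor that says it can handle the URL.
        for extractor in self.extractors:
            if extractor.suitable(url):
                info = extractor.extract(url)
                # A real tool would fetch the video file here.
                return info
        raise ValueError("no suitable extractor for %r" % url)


downloader = FileDownloader()
downloader.add_info_extractor(YouTubeExtractor())
downloader.add_info_extractor(VimeoExtractor())
info = downloader.download("https://vimeo.com/12345")
print(info["site"])  # prints: vimeo
```

The order of registration matters in a design like this, since the first extractor that claims the URL wins. Adding support for a new site means writing one more extractor class and registering it; the downloader itself does not change.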

YouTube videos are usually in Flash video format, which you can play on your Windows or Linux PC using VLC or MPlayer.

The video I downloaded was in MP4 format. I played it in Windows Media Player (I didn't have VLC on that machine just then), and it did play, though the audio was low and the picture quality was not as clear as when playing the same video on YouTube via the web client. Windows Media Player gave a message that the MP4 format was unknown, but was able to play it anyway, with the stated limitations.

youtube-dl can use either urllib or urllib2 from the Python standard library, can do download rate-limiting if asked to, and many other things, as part of its features.
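
To give a feel for what download rate-limiting involves, here is a small sketch of one common approach: after each block is copied, sleep just long enough that the average speed stays at or below the limit. This is only my illustration of the general technique, not how youtube-dl implements it; the function names throttle_delay and copy_with_limit, and their parameters, are made up for the example, and an in-memory stream stands in for a network connection.

```python
import io
import time


def throttle_delay(bytes_so_far, elapsed, rate_limit):
    """Seconds to sleep so the average speed is at most rate_limit (bytes/sec)."""
    if rate_limit is None or rate_limit <= 0:
        return 0.0  # no limit requested
    # Minimum time the transfer should have taken at the allowed rate.
    min_elapsed = bytes_so_far / rate_limit
    return max(0.0, min_elapsed - elapsed)


def copy_with_limit(src, dst, rate_limit=None, block_size=1024,
                    clock=time.monotonic, sleep=time.sleep):
    """Copy src to dst block by block, pausing to respect rate_limit."""
    start = clock()
    total = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        dst.write(block)
        total += len(block)
        delay = throttle_delay(total, clock() - start, rate_limit)
        if delay > 0:
            sleep(delay)
    return total


# Usage with in-memory streams and no limit, so it finishes immediately.
source = io.BytesIO(b"x" * 3000)
target = io.BytesIO()
copied = copy_with_limit(source, target, rate_limit=None)
print(copied)  # prints: 3000
```

Passing the clock and sleep functions as parameters is just a convenience that makes the pacing logic easy to try out without waiting for real time to pass.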

An interesting Python tool, overall.

On a related note, a lot of YouTube itself is written in Python. I had seen a video about that by one of the YouTube engineers, some time ago.

Here's the summary of a talk at PyCon 2012 on Scaling YouTube - it mentions that most of YouTube is written in Python, and that YouTube is one of the largest web sites in the world that use Python. I just saw in the Wikipedia entry about YouTube that it may be the third-most viewed site in the world.

2 comments:

Anonymous said...

Y U NO package?

Advise: you shouldn't re-invent download, installation and update methods, that problem was long ago solved and with better methods than yours. You should use setuptools/distutils (packaging), pypi (distribution), and pip (downloader+installer+updater) :)

Vasudev Ram said...


U R FOOL who also cannot spell English correctly?

1. I am not the creator of the package. I said that I came across it, not that I wrote it. Looks like you didn't bother to read the post properly before commenting, like many other fools, who comment on blog posts.

2. "Advise" should be spelt "Advice", in the context in which you wrote it. Free English lesson for you.

Improve both your brains and your spelling before commenting on any more blog posts :-)