FFmpeg made easy


So you've got those expensive headphones you always wanted. You put them on, set your playlist on shuffle, lean back on the recliner, and hit play. And Robbie Williams sounds just as bad as he did on the older cheapo headphones. What gives? Unless you simply aren't a Robbie fan, the music doesn't sound great because it wasn't encoded to sound great.

If you've already read MythTV made easy and want to take your media knowledge to the next level, read on for our guide to audio conversion with FFmpeg, MPlayer, HandBrake and more...

Encoding is the process of converting information from one format to another. When you ripped your audio CDs, you encoded them to MP3 and were amazed at how small the files were. They were small because the ripper axed away a lot of information to, well, make them small. This was good back in the days of expensive disk space and limited bandwidth for internet streaming.

Now with terabyte disks on desktops, would you really mind dedicating 100GB to your music collection, if you knew you could make it sound as crisp as a live performance? And with the best encoding tools sitting in your distribution repository, all you need is a little patience to make your music and video sound like you've never heard them before, even on that puny little USB jog-pod.

Tools we'll be using...

You can encode audio and video in different codecs and pack them in lots of containers, as per the various standards for displaying them in different countries and on different media. Is it really possible for one tool to handle all this? Well yes, and it's called FFmpeg. FFmpeg runs across platforms and can capture and convert audio and video. It's powered by the libavcodec library, and you can trace almost any open source media player back to these two.

The two most popular FFmpeg alternatives are Mencoder and Transcode, and everything we'll do in this tutorial can be replicated on these two. Mencoder is part of the MPlayer media player, and also powers the AcidRip DVD ripper. Similarly, Transcode is used by that other favourite, dvd::rip. Both Mencoder and Transcode are built on top of FFmpeg's libavcodec codec library.

Thanks to its unending list of dependencies, you're best off installing FFmpeg via your distro's repository. While you can use FFmpeg to encode your music, and we'll use it to encode the audio streams in the videos, there are dedicated tools that'll do a much nicer job. So while you're at it, also grab the FLAC, OggEnc (part of vorbis-tools) and Lame encoders.

Acronym busting

A song is a complex stream of data. A constant bitrate, or CBR, spends the same number of bits on every second of audio, so it might not always have enough bits to do justice to the complex passages. That's why you should use a variable bitrate, or VBR, which drops to a minimum or floor bitrate for the simple passages, such as the silent portions, and rises to a higher rate for the complex portions of the song.

Lossy compression algorithms throw some information away for good, so the reconstructed data isn't identical to the original, but it's still usable. JPEG and MP3 are two of the most common lossy formats that people use to keep images and music at manageable sizes. Ogg Vorbis (often just called OGG) is lossy too, but it's free from any patent restrictions.

On the other hand, lossless compression algorithms can be trusted to safeguard all the bits. Popular lossless algorithms are ZIP and PNG. For safeguarding music, there's FLAC, which stands for Free Lossless Audio Codec.

Encoding lossy to lossless is like trying to resuscitate a fossil.

Sweet chin music

If you're encoding audio for keeps, which you should (cheap disk space and all that), you should encode it as FLAC. The files will take up a good deal of disk space, but offer you more encoding options when you need to get that music on to new devices that support newer lossy formats.

Vorbis is good for keeping music for live streaming with software such as Icecast. Plus, there's a long list of music players that'll now play OGG files. But if you must have MP3s, encode them with the Lame encoder.

To get started, grab hold of your audio CD and rip it into the uncompressed WAV format. There are lots of open source CD rippers, and your distro probably already comes with one. In Gnome on Debian Lenny, the default app that's launched when you insert an audio CD is Sound Juicer, which is primarily a ripper that can also play the CDs.

If your CLI fingers are twitching you can also rip out the audio through MPlayer:

$ mplayer -fs cdda://9 -ao pcm:file=track9.wav

This extracts track 9 from the audio CD. Similarly you can also extract audio from a DVD:

$ mplayer dvd://1 -ao pcm:file=gary.wav

This extracts the audio from the first title on the DVD.

Either way, you now have an uncompressed WAV that you can encode. FFmpeg does enable you to encode audio streams, but there are other tools that'll do the job better.

To encode to FLAC, do:

$ flac track9.wav 

This will create a track9.flac file at the default compression level, which comes out at about half the size of the WAV. The compression level ranges from -0 (fast compression) to -8 (best compression) with the default being -5.

Similarly, you can create an OGG, either from the WAV or from the FLAC:

$ oggenc track9.flac 

This will result in an impressively small file size, and that's at the default quality setting. If your Vulcan ears aren't impressed by the quality, turn it to the maximum with the -q 10 switch.

Now for some plain ol' MP3:

$ lame -h -V 6 track9.wav track9.mp3

This creates an MP3 with variable bitrate. The quality setting ranges from 0 to 9, with a smaller number throwing out a slightly larger, higher-quality file. The flac and oggenc commands accept several input files at once, so you can let them rip through your entire collection at one go with the * wildcard character. Lame expects a single input file followed by an optional output name, so wrap it in a shell loop instead. These tools can also read metadata and use tags to intelligently name the newly encoded files.
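
A shell loop is another way to work through a whole directory, and it lets you control the output names. Here's a minimal sketch, assuming your ripped WAVs sit in the current directory. The helper names mp3_name and encode_all_wavs are made up for illustration; the lame switches are the ones used above.

```shell
#!/bin/sh
# mp3_name: hypothetical helper that swaps a .wav suffix for .mp3.
mp3_name() {
    printf '%s\n' "${1%.wav}.mp3"
}

# encode_all_wavs: hypothetical helper, one lame call per WAV file.
encode_all_wavs() {
    for wav in ./*.wav; do
        [ -e "$wav" ] || continue    # the glob matched nothing
        lame -h -V 6 "$wav" "$(mp3_name "$wav")"
    done
}
```

Call encode_all_wavs from the directory that holds the WAVs. Quoting the variables keeps filenames with spaces in one piece.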

In addition to quality you'll also sometimes want to tweak the number of channels or the sampling rate. For example, the right-hand speaker of one of my music players that plays OGG hasn't been able to survive constant hammering of the 60s classics. So to enjoy California Dreamin' I have to merge the two stereo channels down to mono, which is easily done with oggenc's --downmix switch.

While I'd advise you to use the dedicated tools for encoding audio, you can just as easily use FFmpeg, since it too uses the same libraries:

$ ffmpeg -i gary.wav -acodec libmp3lame -ac 1 -ar 22050 -ab 64k gary.mp3

Here we ask FFmpeg to encode the WAV file for my broken music player using the Lame MP3 encoder. The file will be pretty small since it's using just one channel (-ac 1), sampled at 22,050Hz, at a bitrate of 64k.
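
To get a feel for how small, you can work the sizes out by hand. This is a rough sketch that ignores headers and container overhead; the function names are made up for illustration, and the four-minute track length is an assumption.

```shell
#!/bin/sh
# cbr_size_bytes KBPS SECONDS: bytes of audio at a constant bitrate.
# FFmpeg's "64k" means 64,000 bits per second, and 8 bits make a byte.
cbr_size_bytes() {
    echo $(( $1 * 1000 / 8 * $2 ))
}

# pcm_size_bytes RATE CHANNELS BITS SECONDS: bytes of uncompressed PCM.
pcm_size_bytes() {
    echo $(( $1 * $2 * $3 / 8 * $4 ))
}

cbr_size_bytes 64 240            # four minutes of 64k MP3
pcm_size_bytes 44100 2 16 240    # the same four minutes as CD-quality WAV
```

This prints 1920000 and 42336000, so roughly 1.9MB for the MP3 against 42MB for the WAV it came from.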

FFmpeg supports a huge number of formats. To get an idea, type ffmpeg -formats on the command-line.

Probing files

Before we go any further, it's important you learn how to probe your videos and audio files with FFmpeg. Let's say you have an AVI file called green.avi:

$ ffmpeg -i green.avi
FFmpeg version SVN-r13582, Copyright (c) 2000-2008 Fabrice Bellard, et al.
configuration: --prefix=/usr --libdir=${prefix}/lib
libavutil version: 49.7.0
libavcodec version: 51.58.0
libavformat version: 52.16.0
libavdevice version: 52.0.0
libavfilter version: 0.0.0
built on Oct 22 2008 15:22:08, gcc: 4.3.2
Input #0, avi, from 'green.avi':
  Duration: 00:37:02.92, start: 0.000000, bitrate: 1320 kb/s
    Stream #0.0: Video: mpeg4, yuv420p, 624x352 [PAR 1:1 DAR 39:22], 23.98 tb(r)
    Stream #0.1: Audio: mp3, 48000 Hz, stereo, 32 kb/s
Must supply at least one output file

Your output will be a lot more verbose than the above snippet, as it will include all the configuration options that FFmpeg was compiled with. But most importantly keep an eye out for the lines that begin with 'Stream'. These describe the details about the video and audio streams in the file you're probing.

You need to have this information to encode the streams accordingly. Sometimes you'll just tear them as is, sometimes you'll encode them into another format. You should develop the habit of probing your video and audio files before running them past an encoder.
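
If you probe a lot of files, you can let the shell pick out the interesting bit for you. The sketch below is tuned to the report format shown above and may need adjusting for other FFmpeg builds; audio_codec is a made-up helper name.

```shell
#!/bin/sh
# audio_codec: hypothetical helper. Reads FFmpeg's probe report on stdin
# and prints the codec name of the first audio stream it finds.
audio_codec() {
    sed -n 's/.*Stream #[^ ]*: Audio: \([^,]*\),.*/\1/p' | head -n 1
}

# Sample report lines copied from the green.avi probe above.
printf '%s\n' \
    '    Stream #0.0: Video: mpeg4, yuv420p, 624x352 [PAR 1:1 DAR 39:22], 23.98 tb(r)' \
    '    Stream #0.1: Audio: mp3, 48000 Hz, stereo, 32 kb/s' | audio_codec
```

On a real file you'd run ffmpeg -i green.avi 2>&1 | audio_codec, since FFmpeg writes its report to standard error rather than standard output.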

The human brain hasn't evolved enough to comprehend FFmpeg's full potential.

Stripping videos

Er, it's not exactly what you think. Unless you are a sentient video file, in which case this is equally humiliating.

My favourite bandwidth-wasting activity is downloading racing videos from YouTube. Being a true petrolhead I enjoy the sound of a high-revving engine as the car downshifts into a corner, and of the traction control kicking in as it goes up through the gears again. The video in this case is just a distraction. With MPlayer you can easily turn off the video (with -vo null), but what if you want to listen to it on your iPod (other music players are available)?

In this case, you can easily extract the audio from the video with FFmpeg. First probe the video to determine the type of audio under its fold. Let's say we have an FLV video called mcrae.flv that identifies its audio stream as 'Stream #0.1: Audio: mp3, 44100 Hz, stereo, s16, 80 kb/s'.

Now extract the MP3 audio stream as:

$ ffmpeg -i mcrae.flv -vn -acodec copy mcrae-subaru.mp3

With the -vn switch we make our intentions clear and ask FFmpeg not to bother itself with the video. Next we specify copy as the audio codec, which tells FFmpeg to copy the audio stream into the output file as it is, without decoding and re-encoding it. That's quick and loses no quality, and it means the sampling rate and bitrate stay exactly the same as in the source.

With many universities putting up videos from classes online, you can also use this technique to extract the audible bits from these lectures. You can also extract audio from an AVI that has a high-quality audio stream and encode it so that it can be burned on to an audio CD. Just make sure the encoded audio file follows the specifications detailed in the Red Book audio CD standard:

$ ffmpeg -i burns.avi -vn -acodec pcm_s16le -ar 44100 -ac 2 burns.wav

This gives you an uncompressed two-channel audio at 16-bit sampled at 44,100Hz. Similarly, you can extract video from FLV or any other file:

$ ffmpeg -i mcrae.flv -an -vcodec copy mcrae-mute.flv

The -an option is the opposite of -vn and asks FFmpeg to ignore the audio. Again the copy codec simplifies things: FFmpeg copies the video stream across untouched instead of re-encoding it.

Reach out

When was the last time you reached for your handycam and thought you had captured something that people will enjoy watching? If only your camera could upload to YouTube. Until it can, there's FFmpeg to transform your videos and make them fit within YouTube's restrictions.

My camera outputs MPEG videos and I convert them for YouTube with the following:

$ ffmpeg -i MOV0010.mpg -ar 22050 -acodec libmp3lame -ab 32K -r 25 -s 320x240 -vcodec flv throwing-nuts.flv

Here we've encoded the MPEG with the FLV codec as per YouTube specs. Apart from the audio options, which we've covered already, there are a few new switches. The -r switch sets the frame rate for the video, which is either 25 for PAL or 29.97 for NTSC, depending on the region you're in. Finally, with -s we resize the video to YouTube's 320x240 resolution. You can duplicate this for any online video host, as long as you know its video specs.

Another switch you might be interested in is -t, which clips a video and encodes only the specified length of time. For example, -t 10 will encode the first 10 seconds. You can also jump to a specific time in the movie with -ss and encode for a fixed duration from there. For example:

$ ffmpeg -i MOV0010.mpg -acodec copy -r 25 -s 320x240 -vcodec flv -ss 00:10:00 -t 128 throwing-nuts.flv

This encodes 128 seconds of video after skipping the first 10 minutes. You can specify time for both -ss and -t either in seconds or the hh:mm:ss format, which can be more convenient if you don't want to keep multiplying things by 60.
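
If you do want to check how the two notations line up, a few lines of shell will do the multiplying for you. This is only an illustration; to_seconds is a made-up helper name.

```shell
#!/bin/sh
# to_seconds: hypothetical helper that turns hh:mm:ss into plain seconds.
# awk does the arithmetic so that fields like "08" are read as decimal.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

to_seconds 00:10:00    # the -ss value used above
to_seconds 00:02:08    # the same stretch of time as -t 128
```

This prints 600 and 128, so -ss 600 -t 00:02:08 would describe the same clip as the command above.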

If you are more concerned by the size of the video than its length, use the -fs option, specifying the size in bytes, such as -fs 10485760, which encodes until the size of the output video reaches 10MB.

Remember that -fs will not magically make the entire video fit the specified size. It's used to make sure that the encoded video doesn't exceed a particular size, and breaks the encoding process as soon as the specified size is reached. This might seem a little abrupt, but them's the breaks.
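
To see where the byte count comes from, and roughly how much video a given limit buys you, here's a back-of-the-envelope sketch. The function names are made up, and the estimate assumes the encoded file ends up at about the overall bitrate you feed in, which is a simplification.

```shell
#!/bin/sh
# mib_to_bytes MIB: the byte count to hand to -fs.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# seconds_that_fit BYTES KBPS: whole seconds of video that fit in BYTES
# at an overall bitrate of KBPS kilobits per second.
seconds_that_fit() {
    echo $(( $1 * 8 / ($2 * 1000) ))
}

limit=$(mib_to_bytes 10)
echo "$limit"
seconds_that_fit "$limit" 1320    # 1320 kb/s, as in the green.avi probe
```

This prints 10485760 and 63, so at that bitrate a 10MB cap gets you only about a minute of video before the encoder stops.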

How to convert videos for YouTube

Linux Video Converter: The Linux Video Converter is a simple script that needs Mencoder and PyGTK. Use your distro repository to fetch these two and grab the script from http://rudd-o.com/new-projects/linuxvideoconverter, extract it and run it with ./linuxvideoconverter.

Select a video: From the simple interface, under the Source Video File option, select the video you want to convert. The program is based on Mencoder and it supports all formats that Mencoder can recognise, which covers pretty much everything out there.

Select a target format: Currently the program can only convert the source videos for YouTube. Select the AVI For YouTube option from the pull-down list and click on OK. The encoded video is placed next to the source video and has the word 'converted' in the filename.

Preparing for stardom

Resizing video for the internet is one thing, preparing it for TV broadcast is another. Yeah, it's a lot easier. If you're in PAL-land use:

$ ffmpeg -i my-vid.avi -target pal-vcd audition.mpg

That's all. The -target pal-vcd bit does all the hard work (as in, it sets the bitrate and selects the codec) for you. There are lots of preset targets to choose from, like NTSC DVD, or just plain ol' VCD, DVD, DV etc. This will also come in handy when you want to share those holiday and family videos.

There's one more factor that you'll have to take into account before encoding videos - aspect ratio. My camera lets me record in standard 4:3 and widescreen 16:9 format. The 4:3 format records videos at 320x240. You can try resizing the video to a particular resolution using the -s option we used earlier, but then you'll have to deal with distorted images with round heads appearing egg shaped.

The other option is to sacrifice some bits from the video and resize it to the closest 16:9 resolution. So a 320x240 video needs to be resized to 320x[320/(16/9)], or 320x180. This means I have to shave off 60 pixels, like this:

$ ffmpeg -i centre.mpg -croptop 30 -cropbottom 30 -padtop 30 -padbottom 30 -padcolor 000000 -target ntsc-dvd centre-dvd.mpg

This shaves 30 pixels from both top and bottom, and pads them with black strips. The colour of the padding is specified as a hexadecimal value (as also used to specify colours in HTML). To convert a 16:9 video to 4:3, you can crop from the right and left sides with the -cropright and -cropleft options.
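
The crop arithmetic is easy to script for other frame sizes. In the sketch below the function names are made up. The second function is an assumption on my part: the -croptop and -padtop family of switches belongs to the FFmpeg build used here, and as far as I know newer builds express the same thing through the crop and pad video filters passed to -vf.

```shell
#!/bin/sh
# letterbox_crop WIDTH HEIGHT: prints the 16:9 height for WIDTH and the
# number of pixels to crop from the top and from the bottom.
letterbox_crop() {
    target=$(( $1 * 9 / 16 ))
    per_side=$(( ($2 - target) / 2 ))
    echo "$target $per_side"
}

# letterbox_filter WIDTH HEIGHT: the same numbers written as a crop,pad
# filter chain for -vf (assumed syntax for newer FFmpeg builds).
letterbox_filter() {
    crop=$(letterbox_crop "$1" "$2")
    echo "crop=$1:${crop% *}:0:${crop#* },pad=$1:$2:0:${crop#* }:black"
}

letterbox_crop 320 240
letterbox_filter 320 240
```

For the 320x240 clip this prints 180 30, matching the 30 pixels shaved from top and bottom above, followed by crop=320:180:0:30,pad=320:240:0:30:black.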

Amateur sync

With all this tweaking, sometimes you'll end up with videos that don't have synchronised audio and video. To fix these we'll first extract the audio and video streams from the file, and then while merging them together delay the audio or video as required.

First rip the streams:

$ ffmpeg -i break.flv -vcodec mpeg2video break-video.m2v -acodec copy break-audio.mp3

There's nothing here that we haven't gone over earlier. We'll end up with two streams stored in separate audio and video files. To introduce the delay we need to use the -itsoffset option, which offsets the input that follows it so that one stream syncs with the other. Mostly you'll have a delay of milliseconds, but to illustrate the option better, I'll assume there's a 10.2-second delay in the audio. That is, you hear the gunshot well after the bad guy has fired at the hero and accidentally shot his bugle instead.

$ ffmpeg -i break-audio.mp3 -itsoffset 00:00:10.2 -i break-video.m2v bugle-still-dies.flv 

This will create a new FLV file that plays the audio just as it appears in the original file, but adds a 10.2-second delay to the video. Conversely, if your climax is ruined by the premature delivery of the hero's last words, you can delay the audio to sync it with the video. Assuming the difference between the two is 0.3 seconds (300 milliseconds):

$ ffmpeg -i break-video.m2v -itsoffset 00:00:00.3 -i break-audio.mp3 no-saving-the-bugle.flv 
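
It's easy to slip a decimal place when writing small offsets by hand, so here's a small sketch that formats a delay measured in milliseconds as plain seconds, which -itsoffset accepts just as happily as the hh:mm:ss form. The helper name offset_from_ms is made up for illustration.

```shell
#!/bin/sh
# offset_from_ms: hypothetical helper that formats a delay given in
# milliseconds as seconds with three decimals, ready for -itsoffset.
offset_from_ms() {
    printf '%d.%03d\n' $(( $1 / 1000 )) $(( $1 % 1000 ))
}

offset_from_ms 10200    # the 10.2-second example
offset_from_ms 300
offset_from_ms 3
```

This prints 10.200, 0.300 and 0.003, which shows how far apart 300 milliseconds and 3 milliseconds are once they're written as offsets.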

Presto! Now go rub some funk on those home videos!

Make videos portable with HandBrake

Select a video source: In addition to what you've got on your hard disk, HandBrake can also rip videos from DVDs. But remember it's not designed to be a DVD ripper, so don't expect it to break through every copy protection out there. If your DVD has multiple chapters you'll be able to select the one you want to encode.

Select presets: When you've selected a source video, you can choose to convert it to a particular device by selecting one of the presets. HandBrake includes presets for several common devices, including the iPod, iPhone, PSP, Xbox, PS3, and more.

Tweak settings: When you're done with the presets you can customise the various settings, like selecting a preview picture, choosing the audio and video codecs and container. You can also optimise the video for the web. Add the process to the queue and click the Start button when you've added all the videos you want to encode.

First published in Linux Format

Your comments

FLAC albums?

About FLAC... It also supports putting a whole bunch of tracks into one file, like an album (actually, .mka files for Matroska audio do, too). I'm not sure what the commandline options for that are but the manpages would probably get you a long way. :)


need to convert a mplayer song to mp3 so i can play them in my iTunes
