Adjusting Pitch for MP3 Files with FFmpeg

I recently ran into a situation where I needed to adjust the pitch of an MP3 file for a song that I needed to learn. The problem was that the song was recorded in one key, and I needed to play it a half-step away from that key. Of course, rehearsing in the original key and transposing on the fly is pretty trivial, but sometimes I prefer to learn a song in the key in which I will be playing it.

In the past I have always used a tool like Cakewalk Sonar to load the MP3 file, adjust the pitch, and then save out the adjusted audio. But I thought that was far too prosaic an approach; I wanted a way to script the pitch change. This got me thinking about one of my favorite tools: FFmpeg.

I have mentioned FFmpeg in previous blogs, and it's one of my favorite tools; I use it almost every day for one purpose or another, and I have a large collection of batch files to automate various tasks. Unfortunately, I didn't have anything for adjusting audio pitch. That being said, I have done a lot with various FFmpeg audio and video filters, and after a little while of sifting through the various settings I came up with a way to easily change the pitch of an MP3 file. (And if I ever need to process a whole directory of MP3 files, it would be simple to update this script with a loop.)

Here's the secret to the way this works - there are two audio filters that I am using:

  • asetrate - this filter sets the sample rate without resampling the audio; the same samples are simply played back faster or slower, which stretches or shrinks the audio and thereby changes both its pitch and its length.
  • atempo - this filter adjusts the tempo of the audio; altering the tempo will change the length of the audio, without changing the pitch.

So the trick is to use these two filters inversely; in other words:

  • If you multiply the sample rate by 2, then you need to divide the tempo by 2.
  • If you divide the sample rate by 1.5, then you need to multiply the tempo by 1.5.

With that in mind, I pulled out one of my favorite math constants: 2^(1/12), which is roughly 1.0594630943592952645618252949463. You might recall from some of my other blogs that this is the value from which every pitch in Equal Temperament is derived; in other words, multiplying a frequency by that value raises it by one half step, and that is how every note in the chromatic scale is created.

Taking that into account, I looked at the filter settings that were possible for use with FFmpeg:

  • If I assume that MP3 files are using a sample rate of 44.1 kHz, then I need to set the asetrate filter to a sample rate of r*2^(n/12), where:
    • r is the sample rate.
    • n is the number of half steps to raise or lower.
  • The atempo filter accepts values between 0.5 and 2.0, where:
    • 0.5 is half-tempo
    • 1.0 is the original tempo
    • 2.0 is double-tempo
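    With that in mind, I used the inverse of the sample rate formula for the tempo: to raise the pitch by n half steps, the tempo is set to 2^(-n/12), and to lower the pitch by n half steps, it is set to 2^(n/12). That way the tempo change cancels out the change in length caused by asetrate, and only the pitch change remains.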
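As a rough sketch of the math above, here is how the two filter values could be computed. This is Python rather than a batch file, since batch files have no floating-point math; the function name pitch_filter is hypothetical, and the 44.1 kHz default is the same assumption that I made above.

```python
def pitch_filter(half_steps, sample_rate=44100):
    """Build an FFmpeg -af string that shifts the pitch by `half_steps`.

    Positive values raise the pitch, negative values lower it.
    `sample_rate` is assumed to be the rate of the source file.
    """
    ratio = 2.0 ** (half_steps / 12.0)   # frequency ratio for the shift
    new_rate = sample_rate * ratio       # value for asetrate
    tempo = 1.0 / ratio                  # value for atempo (cancels the speed change)
    # Stay inside the 0.5 to 2.0 atempo range described above.
    if not 0.5 <= tempo <= 2.0:
        raise ValueError("shift must be between -12 and +12 half steps")
    return f"asetrate=r={new_rate:.4f},atempo={tempo:.10f}"


if __name__ == "__main__":
    for n in (1, -1, 12):
        print(n, pitch_filter(n))
```

Four decimal places are far more than enough here; the extra digits in a longer constant make no audible difference.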

The math is a little weird, I'll admit - but it's pretty straightforward. And here's the great part for you: I've already done the math, and I've written a batch file which defines a set of constants that can be used to script the raising or lowering of the pitch of an MP3 file.

Here's the code for the batch file:

@echo off

rem Input and output file names
set TMPFILE1=InputFile.mp3
set TMPFILE2=OutputFile.mp3

set RAISE_PITCH_01=asetrate=r=46722.3224612449211671764955071340,atempo=0.94387431268169349664191315666753
set RAISE_PITCH_02=asetrate=r=49500.5763304433484812188074908520,atempo=0.89089871814033930474022620559051
set RAISE_PITCH_03=asetrate=r=52444.0337716199990422417487017170,atempo=0.84089641525371454303112547623321
set RAISE_PITCH_04=asetrate=r=55562.5183003639065662339877809700,atempo=0.79370052598409973737585281963615
set RAISE_PITCH_05=asetrate=r=58866.4375688985154890396859602340,atempo=0.74915353843834074939964036601490
set RAISE_PITCH_06=asetrate=r=62366.8181006534916521544727376480,atempo=0.70710678118654752440084436210485
set RAISE_PITCH_07=asetrate=r=66075.3420902616540970482802825140,atempo=0.66741992708501718241541594059223
set RAISE_PITCH_08=asetrate=r=70004.3863917975968365502186919090,atempo=0.62996052494743658238360530363911
set RAISE_PITCH_09=asetrate=r=74167.0638253776226953452670037700,atempo=0.59460355750136053335874998528024
set RAISE_PITCH_10=asetrate=r=78577.2669399779266780879513330830,atempo=0.56123102415468649071676652483959
set RAISE_PITCH_11=asetrate=r=83249.7143785253664038167404180770,atempo=0.52973154717964763228091264747317
set RAISE_PITCH_12=asetrate=r=88200.0000000000000000000000000000,atempo=0.50000000000000000000000000000000

set LOWER_PITCH_01=asetrate=r=41624.8571892626832019083702090380,atempo=1.05946309435929526456182529494630
set LOWER_PITCH_02=asetrate=r=39288.6334699889633390439756665420,atempo=1.12246204830937298143353304967920
set LOWER_PITCH_03=asetrate=r=37083.5319126888113476726335018850,atempo=1.18920711500272106671749997056050
set LOWER_PITCH_04=asetrate=r=35002.1931958987984182751093459540,atempo=1.25992104989487316476721060727820
set LOWER_PITCH_05=asetrate=r=33037.6710451308270485241401412570,atempo=1.33483985417003436483083188118450
set LOWER_PITCH_06=asetrate=r=31183.4090503267458260772363688240,atempo=1.41421356237309504880168872420970
set LOWER_PITCH_07=asetrate=r=29433.2187844492577445198429801170,atempo=1.49830707687668149879928073202980
set LOWER_PITCH_08=asetrate=r=27781.2591501819532831169938904850,atempo=1.58740105196819947475170563927230
set LOWER_PITCH_09=asetrate=r=26222.0168858099995211208743508580,atempo=1.68179283050742908606225095246640
set LOWER_PITCH_10=asetrate=r=24750.2881652216742406094037454260,atempo=1.78179743628067860948045241118100
set LOWER_PITCH_11=asetrate=r=23361.1612306224605835882477535670,atempo=1.88774862536338699328382631333510
set LOWER_PITCH_12=asetrate=r=22050.0000000000000000000000000000,atempo=2.00000000000000000000000000000000

rem Change the constant in the -af parameter to choose the pitch shift
ffmpeg -y -i "%TMPFILE1%" -af "%RAISE_PITCH_01%" "%TMPFILE2%"

The only parts that you need to configure are:

  • TMPFILE1 - set this variable to the name of your original input MP3 file.
  • TMPFILE2 - set this variable to the name of your adjusted pitch output MP3 file.
  • Specify whether to raise or lower the pitch in the FFmpeg command by choosing one of the constants defined in the batch file; for example:
    • RAISE_PITCH_02 would raise the pitch of the original audio file by two half-steps (or one whole step).
    • LOWER_PITCH_05 would lower the pitch of the original audio file by five half-steps (or 2½ whole steps).
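I mentioned earlier that it would be simple to process a whole directory with a loop. Here is a minimal sketch of that idea in Python; the function name build_pitch_commands and the "_shifted" suffix are hypothetical, and the sketch only builds the FFmpeg commands so that you can inspect them before running anything.

```python
from pathlib import Path


def build_pitch_commands(folder, audio_filter, suffix="_shifted"):
    """Return one FFmpeg argument list per MP3 file found in `folder`."""
    commands = []
    for src in sorted(Path(folder).glob("*.mp3")):
        # Skip files that an earlier run of this script already produced.
        if src.stem.endswith(suffix):
            continue
        dst = src.with_name(src.stem + suffix + ".mp3")
        commands.append(
            ["ffmpeg", "-y", "-i", str(src), "-af", audio_filter, str(dst)]
        )
    return commands


if __name__ == "__main__":
    # RAISE_PITCH_01 from the batch file, shortened to a few decimal places.
    raise_one = "asetrate=r=46722.3225,atempo=0.9438743127"
    for command in build_pitch_commands(".", raise_one):
        print(" ".join(command))
```

To actually run the commands, each list could be passed to subprocess.run from the Python standard library.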

There are, of course, hundreds of other parameters which you can pass to FFmpeg in order to customize how FFmpeg processes the audio, but those are way out of scope for this blog.

With that in mind, that's it for now; have fun!

Fixing Underwater Videos with FFmpeg

I ran into an interesting predicament: I couldn't get the color adjustment settings in my video editor to correct some underwater videos from a scuba diving trip. After much trial and error, I came up with an alternative method. I have been able to successfully edit underwater photos to restore their color, so I used FFmpeg to export all of the frames from the source video as individual images, used a script to automate my photo editor to batch process all of the images, and then used FFmpeg to reassemble the finished results into a new MP4 file.

The following video of a Goliath Triggerfish in Bora Bora shows a before and after of what that looks like. Overall, I think the results are promising, albeit via a weird and somewhat time-consuming hack.

Exporting Videos as Images with FFmpeg

Here is the basic syntax for automating FFmpeg to export the individual frames:

ffmpeg.exe -i "input.mp4" -r 60 -s hd1080 "C:\path\%6d.png"

Where the following items are defined:

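  • -i "input.mp4" specifies the source MP4 file.
  • -r 60 sets the frame rate of the exported frames to 60fps; if the source video uses a different frame rate, FFmpeg duplicates or drops frames to match.
  • -s hd1080 specifies 1920x1080 resolution (there are others).
  • "C:\path\%6d.png" specifies the directory for storing the images, and specifies PNG images with file names which are numerically sequenced with a width of 6 digits (e.g. 000000.png to 999999.png). If you put this command in a batch file, double the percent sign (%%6d), because the batch interpreter would otherwise treat %6 as a script argument.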

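If you would rather drive the export from a script than from the command line, here is a minimal sketch in Python that builds the same command as an argument list. The function name export_frames_command is hypothetical; the switches are the ones described above.

```python
import os


def export_frames_command(source, out_dir, fps=60, size="hd1080", digits=6):
    """Build the argument list for exporting the frames of `source` as PNGs."""
    # The same numbered pattern as above, e.g. %6d.png for six digits.
    pattern = os.path.join(out_dir, "%" + str(digits) + "d.png")
    return ["ffmpeg", "-i", source, "-r", str(fps), "-s", size, pattern]


if __name__ == "__main__":
    print(export_frames_command("input.mp4", "frames"))
```

Because the arguments are passed as a list, the percent sign in the file name pattern reaches FFmpeg untouched when the list is handed to subprocess.run.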
Combining Images as a Video with FFmpeg

Here is the basic syntax for automating FFmpeg to combine the individual frames back into an MP4 file:

ffmpeg.exe -framerate 60 -i "C:\path\%6d.png" -c:v libx264 -f mp4 -pix_fmt yuv420p "output.mp4"

Where the following items are defined:

  • -framerate 60 specifies the frame rate at which the images are read, which becomes the frame rate of the output video (note that specifying a different frame rate than you used for exporting will alter the playback speed of the final video).
  • -i "C:\path\%6d.png" specifies the directory where the images are stored, and specifies PNG images with file names which are numerically sequenced with a width of 6 digits (e.g. 000000.png to 999999.png).
  • -c:v libx264 specifies the H.264 codec.
  • -f mp4 specifies an MP4 file.
  • -pix_fmt yuv420p specifies the pixel format; you could also specify "rgb24" instead of "yuv420p".
  • "output.mp4" specifies the final MP4 file.

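To illustrate the note about playback speed, here is a small sketch of the arithmetic in Python; both function names are hypothetical. The idea is simply that the number of frames is fixed once they are exported, so the frame rate used for reassembly decides how long the video runs.

```python
def playback_speed(export_fps, assemble_fps):
    """Speed of the reassembled video relative to the original (1.0 = same)."""
    return assemble_fps / export_fps


def assembled_duration(frame_count, assemble_fps):
    """Length in seconds of a video built from `frame_count` images."""
    return frame_count / assemble_fps


if __name__ == "__main__":
    # A 10-second clip exported at 60fps yields 600 images.
    frames = 10 * 60
    print(playback_speed(60, 30))          # 0.5, i.e. half speed
    print(assembled_duration(frames, 30))  # 20.0 seconds
```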
How to Merge a Folder of MP4 Files with FFmpeg (Revisited)

I ran into an interesting situation the other day: I had a bunch of H.264 MP4 files which I had created with Handbrake that I needed to combine, and I didn't want to use my normal video editor (Sony Vegas) to perform the merge. I'm a big fan of FFmpeg, so I figured that there was some way to automate the merge without having to use an editor.

I did some searching around the Internet, and I couldn't find anyone who was doing exactly what I was doing, so I wrote my own batch file that combines some tricks that I have used to automate FFmpeg in the past with some ideas that I found through some video hacking forums. Here is the resulting batch file, which will combine all of the MP4 files in a directory into a single MP4 file named "ffmpeg_merge.mp4", which can be renamed to something else:

@echo off

if exist ffmpeg_merge.mp4 del ffmpeg_merge.mp4
if exist ffmpeg_merge.tmp del ffmpeg_merge.tmp
if exist *.ts del *.ts

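rem Remux each MP4 file into an MPEG transport stream (no re-encoding)
for /f "usebackq delims=|" %%a in (`dir /on /b *.mp4`) do (
ffmpeg.exe -i "%%a" -c copy -bsf:v h264_mp4toannexb -f mpegts "%%a.ts"
)

rem List the transport streams, in name order, for the concat demuxer
for /f "usebackq delims=|" %%a in (`dir /on /b *.ts`) do (
echo file '%%a'>>ffmpeg_merge.tmp
)

rem Join the transport streams into one MP4 file; -safe 0 allows file names with spaces
ffmpeg.exe -f concat -safe 0 -i ffmpeg_merge.tmp -c copy -bsf:a aac_adtstoasc ffmpeg_merge.mp4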

if exist ffmpeg_merge.tmp del ffmpeg_merge.tmp
if exist *.ts del *.ts

The merging process in this batch file is performed in two steps:

  • First, all of the individual MP4 files are remuxed into individual transport streams
  • Second, all of the individual transport streams are remuxed into a merged MP4 file
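The list file that sits between those two steps is just a text file with one "file" line per transport stream. As a minimal sketch of how that list could be built with proper quoting, here is a hypothetical Python helper named concat_list_lines; it wraps each name in single quotes and escapes any apostrophes, which is the quoting style that the concat demuxer understands.

```python
def concat_list_lines(names):
    """Return the lines of a concat list file for the given file names."""
    lines = []
    for name in sorted(names):
        # Inside single quotes, an apostrophe is written as '\''
        escaped = name.replace("'", "'\\''")
        lines.append("file '" + escaped + "'")
    return lines


if __name__ == "__main__":
    for line in concat_list_lines(["part 2.mp4.ts", "part 1.mp4.ts"]):
        print(line)
```

Note that FFmpeg may still reject names containing spaces or other unusual characters as unsafe unless -safe 0 is passed to the concat demuxer.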

Here are the URLs for the official documentation on each of the FFmpeg switches and parameters that I used:

By the way, I realize that there may be better ways to do this with FFmpeg, so I am open to suggestions. ;-]