Adjusting Pitch for MP3 Files with FFmpeg

I recently ran into a situation where I needed to adjust the pitch of an MP3 file for a song that I needed to learn. The problem was that song was recorded in a specific key, and I needed to play the song a half-step different. Of course, rehearsing in the original key and transposing on-the-fly is pretty trivial, but sometimes I prefer to learn a song in the key which I will be playing.

In the past I have always used a tool like Cakewalk Sonar to load the MP3 file, adjust the pitch, and then save out the adjusted audio. But I thought that was far too prosaic of an approach; I wanted a way to script the pitch change. This got me thinking about one of my favorite tools: FFmpeg.

I have mentioned FFmpeg in previous blogs, and it's one of my favorite tools; I use it almost every day for one purpose or other, and I have a large collection of batch files to automate various tasks. But unfortunately, I didn't have anything for adjusting audio pitch. That being said, I have done a lot with various FFmpeg audio and video filters, and after a little while of sifting through some of the various settings I came up with a way to easily change the pitch for an MP3 file. (And if I ever need to automate a whole directory of MP3 files, it would be simple to update this script with a loop.)

Here's the secret to the way this works - there are two audio filters that I am using:

  • asetrate - this filter adjusts the sample rate; altering the sample rate will stretch or shrink the audio, thereby changing the pitch and length of the audio.
  • atempo - this filter adjusts the tempo of the audio; altering the tempo will change the length of the audio, without changing the pitch.

So the trick is to use these two filters inversely; in other words:

  • If you increase the sample rate by 2, then you need to decrease the tempo by 2.
  • If you decrease the sample rate by 1.5, then you need to increase the tempo by 1.5.

With that in mind, I pulled out one of my favorite math constants: 2^(1/12), which is roughly 1.0594630943592952645618252949463. You might recall from some of my other blogs that this is the value by which every pitch in Equal Temperament is derived; in other words, that value is used to create every note in the chromatic scale which is used throughout the planet.

Taking that into account, I looked at the filter settings that were possible for use with FFmpeg:

  • If I assume that MP3 files are using a sample rate of 44.1khz, then I need to use values for the asetrate filter which raise or lower the sample rate by r*2^(n/12), where:
    • r is the sample rate.
    • n is the number of half steps to raise or lower.
  • The atempo can be values between 0.5 and 2.0, where:
    • 0.5 is half-tempo
    • 1.0 is the original tempo
    • 2.0 is double-tempo
    With that in mind, I used a similar formula to increase or decrease the tempo by 2^(n/12), where n is the number of half steps to raise or lower.

The math is a little weird, I'll admit - but it's pretty straight-forward. And here's the great part for you: I've already done the math, and I've written a batch file which defines a set of constants that can be used in batch files to script the raising or lowering the pitch of an MP3 file.

Here's the code for the batch file:

@echo off

set TMPFILE1=InputFile.mp3
set TMPFILE2=OutputFile.mp3

set RAISE_PITCH_01=asetrate=r=46722.3224612449211671764955071340,atempo=0.94387431268169349664191315666753
set RAISE_PITCH_02=asetrate=r=49500.5763304433484812188074908520,atempo=0.89089871814033930474022620559051
set RAISE_PITCH_03=asetrate=r=52444.0337716199990422417487017170,atempo=0.84089641525371454303112547623321
set RAISE_PITCH_04=asetrate=r=55562.5183003639065662339877809700,atempo=0.79370052598409973737585281963615
set RAISE_PITCH_05=asetrate=r=58866.4375688985154890396859602340,atempo=0.74915353843834074939964036601490
set RAISE_PITCH_06=asetrate=r=62366.8181006534916521544727376480,atempo=0.70710678118654752440084436210485
set RAISE_PITCH_07=asetrate=r=66075.3420902616540970482802825140,atempo=0.66741992708501718241541594059223
set RAISE_PITCH_08=asetrate=r=70004.3863917975968365502186919090,atempo=0.62996052494743658238360530363911
set RAISE_PITCH_09=asetrate=r=74167.0638253776226953452670037700,atempo=0.59460355750136053335874998528024
set RAISE_PITCH_10=asetrate=r=78577.2669399779266780879513330830,atempo=0.56123102415468649071676652483959
set RAISE_PITCH_11=asetrate=r=83249.7143785253664038167404180770,atempo=0.52973154717964763228091264747317
set RAISE_PITCH_12=asetrate=r=88200.0000000000000000000000000000,atempo=0.50000000000000000000000000000000

set LOWER_PITCH_01=asetrate=r=41624.8571892626832019083702090380,atempo=1.05946309435929526456182529494630
set LOWER_PITCH_02=asetrate=r=39288.6334699889633390439756665420,atempo=1.12246204830937298143353304967920
set LOWER_PITCH_03=asetrate=r=37083.5319126888113476726335018850,atempo=1.18920711500272106671749997056050
set LOWER_PITCH_04=asetrate=r=35002.1931958987984182751093459540,atempo=1.25992104989487316476721060727820
set LOWER_PITCH_05=asetrate=r=33037.6710451308270485241401412570,atempo=1.33483985417003436483083188118450
set LOWER_PITCH_06=asetrate=r=31183.4090503267458260772363688240,atempo=1.41421356237309504880168872420970
set LOWER_PITCH_07=asetrate=r=29433.2187844492577445198429801170,atempo=1.49830707687668149879928073202980
set LOWER_PITCH_08=asetrate=r=27781.2591501819532831169938904850,atempo=1.58740105196819947475170563927230
set LOWER_PITCH_09=asetrate=r=26222.0168858099995211208743508580,atempo=1.68179283050742908606225095246640
set LOWER_PITCH_10=asetrate=r=24750.2881652216742406094037454260,atempo=1.78179743628067860948045241118100
set LOWER_PITCH_11=asetrate=r=23361.1612306224605835882477535670,atempo=1.88774862536338699328382631333510
set LOWER_PITCH_12=asetrate=r=22050.0000000000000000000000000000,atempo=2.00000000000000000000000000000000

ffmpeg -y -i "%TMPFILE1%" -af "%RAISE_PITCH_01%" "%TMPFILE2%"

The only parts that you need to configure are:

  • TMPFILE1 - set this variable to the name of your original input MP3 file.
  • TMPFILE2 - set this variable to the name of your adjusted pitch output MP3 file.
  • Specify whether to raise or lower the pitch in the FFmpeg command by choosing one of the constants defined in the batch file; for example:
    • RAISE_PITCH_02 would raise the pitch of the original audio file by two half-steps (or one whole step).
    • LOWER_PITCH_05 would lower the pitch of the original audio file by five half-steps (or 2½ whole steps).

There are, of course, hundreds of other parameters which you can pass to FFmpeg in order to customize how FFmpeg processes the audio, but those are way out of scope for this blog.

With that in mind, that's it for now; have fun!