Audio Format Comparison
What's the best way to store your music? MP3, Windows Media Audio, and Ogg Vorbis are three of the most popular formats. Which of these is the most efficient? This objective study attempts to determine which format preserves sound the best while taking up the least amount of disk space.
Lots of audio codecs are available for you to use to store your music. The most popular is undoubtedly MP3, but Ogg Vorbis (*.ogg) and Windows Media (*.wma) are also frequently used. So which of these is the best? Here is my attempt at finding out.
Introduction
MP3 (MPEG-1 Audio Layer III), Windows Media Audio, and Ogg Vorbis are all lossy audio compression formats. This means that when encoding occurs, some data is lost and cannot be recovered. The purpose of this page is to determine, using human ears as the judge, which of the three formats loses the least perceptible data. The analysis will be made as objective as possible by using WinABX, a program designed to provide the equivalent of a double-blind experimental environment.
As most of us know, MP3 is the most widely used audio compression scheme in the world. But many of us don't realize that it is not a free technology. It is licensed and patented, and in September 1998, the Fraunhofer Institute began attempting to charge money for MP3 encoders and decoders. Besides this problem, MP3 has several others. For example, it performs very poorly at low bitrates.
WMA was likely created by Microsoft to avoid MP3 licensing issues, but personally, I think that Microsoft just likes to make everything Microsoft and create a share for itself in every market imaginable. WMA often does perform better than MP3, but not to the extent that Microsoft claims. For example, I once read that the quality of a WMA file is equal to that of an MP3 with twice the bitrate. As will be shown, this is absurd.
Ogg Vorbis (what an odd name) avoids all licensing issues. Ogg Vorbis is completely free and open-source, and will remain this way. There can never be any charge for using it. It is reputed to sound better than MP3 and WMA at equal bit rates, but of course, there is only one way to find out. That's why this page is here.
Before We Begin
Let me try to clear up some confusion regarding the words I use.
- Codec - Method used to compress and decompress files; MP3, Ogg, and WMA are all codecs.
- Compression scheme - Used interchangeably with "codec" here.
- Encoding - Processing a file using a particular algorithm; in this case, a codec. An encoded file here is one that has been compressed.
- Original - This refers to the original wave file, copied directly from the original music CD, and can be considered basically flawless.
- Transparency - A condition in which the encoded file, to a human ear, sounds exactly the same as the original. (It is "transparent" because you can't "see" the compression surrounding it.)
- Nominal bitrate - The bitrate that a file "should" be, based on the settings chosen at the time of encoding.
- Actual bitrate - May or may not be the same as nominal bitrate. This is the file size in bits divided by the length of the song in seconds.
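The "actual bitrate" definition above can be written out as a short calculation. This is only a minimal sketch in Python: the function name and the example file size and duration are made up for illustration and are not taken from my test files.

```python
def actual_bitrate_kbps(file_size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate in kilobits per second (1 kbps = 1000 bits per second)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    bits = file_size_bytes * 8  # file size in bits
    return bits / duration_seconds / 1000


# Hypothetical example: a 4,700,000-byte file that plays for 200 seconds.
print(round(actual_bitrate_kbps(4_700_000, 200)))  # 188
```

With VBR encoding the encoder is free to spend more or fewer bits than the nominal setting, which is why the two numbers can differ.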
Now here is my disclaimer. I am neither a student of statistics nor an audiophile who knows what he's talking about. I'm just a curious kid. The possibility that I have used an unscientific procedure, that I have misinterpreted my data, etc. does exist, and I welcome anyone to correct me.
Also, note that these results are based solely on the judgment of my ears. I know my ears are not perfect. In fact, I can't hear frequencies above about 15 kHz, which anyone my age should be able to hear perfectly fine. There are probably other problems with my hearing as well, but since the purpose was for me to determine the best format for my music collection, nothing could work better than my own ears.
Equipment
I know that my equipment could be [a lot] better, but I'm poor. Also, this system was not intended for audiophile use when I purchased it almost two years ago.
- nForce2-based ASUS A7N8X Deluxe motherboard with onboard nForce2 6-channel audio (only left/right were used for the test)
- Yamaha HPE-160 headphones
- WinABX 0.42
- dBpowerAMP Music Converter Release 11 (dMC)
- LAME MP3 3.96
- WMA 9.1
- Ogg Vorbis 1.0 (Note: While I have relatively recent versions of the other two codecs, my Ogg Vorbis encoder is over two years old. There wasn't an Ogg Vorbis 1.1 plugin available for dBpowerAMP, though 1.1 was released quite a while ago.)
- Binomial Calculator
ABX Testing
What is it?
ABX testing is a method that can be used to determine whether two audio files sound the same. A known file A and a known file B are selected. The program randomly sets X to be either A or B, but the user does not know if X is A or B and must determine this by listening.
In my procedure, rather than ABX, I used a variant, ABXY. As before, A and B are selected and their identities known, while the computer randomly picks X to be either A or B and Y to be the other; that is, either X = A and Y = B, or X = B and Y = A.
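The random assignment described above can be sketched in a few lines of Python. This is an illustration of the idea only, not how WinABX is actually implemented; the function names are hypothetical.

```python
import random


def assign_xy(rng: random.Random) -> dict:
    """Randomly hide A and B behind the labels X and Y (one is A, the other is B)."""
    x = rng.choice(["A", "B"])
    y = "B" if x == "A" else "A"
    return {"X": x, "Y": y}


def run_trial(rng: random.Random, guess_for_x: str) -> bool:
    """Return True if the listener's guess for X matches the hidden assignment."""
    hidden = assign_xy(rng)
    return guess_for_x == hidden["X"]


# A listener who cannot hear any difference is effectively flipping a coin.
rng = random.Random(0)
trials = 10
correct = sum(run_trial(rng, rng.choice(["A", "B"])) for _ in range(trials))
print(f"{correct} of {trials} correct by pure guessing")
```

Because Y is always the opposite of X, identifying X also identifies Y, so each ABXY trial still counts as a single right-or-wrong answer.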
The purpose of lossy encoders is to lose as little audible information as possible. Therefore, the objective is to make an encoded file and an unencoded file sound identical. This can be tested with WinABX if a high-quality original is encoded (compressed) at various bitrates with encoders for various codecs. The encoded files can then be tested against the original to determine whether audible differences exist.
Read the ABX article at Hydrogenaudio for a more detailed explanation.
Statistical Meaning
You may have noticed that this is a hypothesis test with the null hypothesis being that the two sound files are indistinguishable. The p-values in the data tables are the result of a one-tailed binomial calculation, and 0.05 was the significance level.
Okay...
Basically, all you have to know for the data here to make sense to you is that if p falls below 0.05, then the results are statistically significant and the files are distinguishable. In other words, if p < 0.05, then the encoded file's quality is not high enough to sound the same as the original.
Warning: p > 0.05 does not prove that the files sound the same; this merely indicates that the results are statistically insignificant and that no conclusion can be drawn.
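For anyone who wants to check the numbers, the one-tailed binomial calculation can be sketched as follows. This is a minimal Python version written for illustration (I used a binomial calculator, not this code); it assumes that a listener who hears no difference gets each trial right with probability 0.5.

```python
from math import comb


def abx_p_value(successes: int, attempts: int) -> float:
    """One-tailed binomial p-value: chance of at least `successes` correct by guessing."""
    if not 0 <= successes <= attempts:
        raise ValueError("successes must be between 0 and attempts")
    # Count every outcome with `successes` or more correct answers.
    favorable = sum(comb(attempts, k) for k in range(successes, attempts + 1))
    return favorable / 2 ** attempts


print(round(abx_p_value(10, 10), 3))  # 0.001
print(round(abx_p_value(9, 10), 3))   # 0.011
print(round(abx_p_value(5, 10), 3))   # 0.623
```

So 9 correct out of 10 is below the 0.05 cutoff, while 5 out of 10 is exactly what guessing would be expected to produce.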
Procedure
Summary
I ripped two songs (Darude - Sandstorm, Jessica Simpson - Forever in Your Eyes) from the original CDs into wave files on my hard drive. Then I converted these wave files into each of the three lossy formats at various bitrates.
After initial setup, I proceeded to test. For each test, I set A as the original perfect CD-quality wave file, and B as the encoded version. For each codec, I began with low bit rates (for example, 96 kbps), and worked my way up until I could no longer hear any differences between the two files. This is where the testing for that particular song and codec ended, and I repeated the process with the other codecs and then the other song.
All the Gory Details
Two CD tracks were ripped into two 44.1 kHz, 16-bit wave files. They are both songs that I am familiar with, so I know how they sound. "Sandstorm" has no vocals and is pure techno. "Forever in Your Eyes" is a girly pop song with plenty of singing (though many may choose to not call it "singing").
These were then converted to MP3, Ogg, and WMA files at various bitrates. For Ogg Vorbis, I used the native VBR mode, which allows you to select a quality and has a single nominal bitrate. For the LAME MP3 encoder, I used VBR also, but it requires a range rather than just one fixed bitrate (hence nominal values like 128~160 in the results). The WMA encoder was set to VBR mode, dual-pass. Apart from the bitrate (and of course, the 44.1 kHz, 16-bit format), every setting on each of the three encoders was set for the highest possible quality.
Next, all encoded files were converted back into wave files as the WinABX program will only accept *.wav files. They were named appropriately so I could easily determine which encoder was used on each of them.
The ABX test was then run between the original and the encoded/decoded file. I tested each pair for ten trials, because this seemed like a reasonably large number with which I could obtain accurate data, while still small enough to not cause unnecessary fatigue.
If, after ten trials, 0.05 < p < 0.20, then I ran it for ten more trials and combined the results because I thought this range was rather marginal. After recalculation, if 0.03 < p < 0.10, then I continued repeating the process. Once p fell outside this range (0.03 to 0.10), I stopped. Anything < 0.20 seems quite marginal, so more trials shouldn't hurt.
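The stopping rule above can be sketched in Python. This is a rough illustration with hypothetical function names, assuming the scores arrive in blocks of ten trials; I applied the rule by hand during testing.

```python
from math import comb


def p_value(successes: int, attempts: int) -> float:
    """One-tailed binomial p-value for guessing with probability 0.5."""
    return sum(comb(attempts, k) for k in range(successes, attempts + 1)) / 2 ** attempts


def needs_more_trials(p: float, blocks_done: int) -> bool:
    """After the first block the marginal range is 0.05-0.20; afterwards 0.03-0.10."""
    if blocks_done == 1:
        return 0.05 < p < 0.20
    return 0.03 < p < 0.10


def run_session(block_scores):
    """block_scores: successes in each 10-trial block, in the order they were run."""
    successes = attempts = 0
    p = 1.0
    for blocks_done, score in enumerate(block_scores, start=1):
        successes += score
        attempts += 10
        p = p_value(successes, attempts)
        if not needs_more_trials(p, blocks_done):
            break  # p is no longer marginal, so testing stops here
    return successes, attempts, p


s, n, p = run_session([8, 8])  # 8 of 10 is marginal, so a second block is run
print(s, n, round(p, 3))  # 16 20 0.006
```

Note that extending a test only when the result is marginal is not a textbook fixed-sample test, so the combined p-values should be read with that in mind.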
The results are shown here.
Results
Darude - Sandstorm

| Codec | Nominal bitrate (kbps) | Actual bitrate (kbps) | ABX successes | ABX attempts | p |
|---|---|---|---|---|---|
| MP3 | 48~80 | 79 | 10 | 10 | 0.001 |
| MP3 | 80~112 | 111 | 10 | 10 | 0.001 |
| MP3 | 96~128 | 125 | 10 | 10 | 0.001 |
| MP3 | 128~160 | 158 | 16 | 20 | 0.006 |
| MP3 | 160~192 | 188 | 5 | 10 | 0.623 |
| WMA | 64 | 78 | 10 | 10 | 0.001 |
| WMA | 96 | 110 | 16 | 20 | 0.006 |
| WMA | 128 | 142 | 15 | 20 | 0.021 |
| WMA | 160 | 176 | 21 | 30 | 0.021 |
| WMA | 192 | 208 | 5 | 10 | 0.623 |
| Ogg | 96 | 88 | 10 | 10 | 0.001 |
| Ogg | 128 | 123 | 10 | 10 | 0.001 |
| Ogg | 160 | 159 | 10 | 10 | 0.001 |
| Ogg | 192 | 196 | 17 | 20 | 0.001 |
| Ogg | 224 | 234 | 50 | 80 | 0.016 |
| Ogg | 256 | 275 | 3 | 10 | 0.945 |
Jessica Simpson - Forever in Your Eyes

| Codec | Nominal bitrate (kbps) | Actual bitrate (kbps) | ABX successes | ABX attempts | p |
|---|---|---|---|---|---|
| MP3 | 48~80 | 79 | 10 | 10 | 0.001 |
| MP3 | 80~112 | 111 | 9 | 10 | 0.011 |
| MP3 | 96~128 | 127 | 5 | 10 | 0.623 |
| WMA | 64 | 72 | 10 | 10 | 0.001 |
| WMA | 96 | 105 | 4 | 10 | 0.828 |
| Ogg | 45 | 52 | 10 | 10 | 0.001 |
| Ogg | 64 | 65 | 9 | 10 | 0.011 |
| Ogg | 80 | 79 | 10 | 10 | 0.001 |
| Ogg | 96 | 90 | 8 | 10 | 0.055 |
| Ogg | 112 | 111 | 8 | 10 | 0.055 |
| Ogg | 128 | 127 | 9 | 10 | 0.011 |
| Ogg | 160 | 159 | 9 | 10 | 0.011 |
| Ogg | 192 | 191 | 18 | 30 | 0.181 |
| Ogg | 224 | 215 | 16 | 20 | 0.006 |
| Ogg | 256 | 247 | 12 | 20 | 0.252 |
| Ogg | 320 | 312 | 5 | 10 | 0.623 |
A p value greater than the level of significance (0.05) means I could not reliably distinguish that file from the original; the corresponding bitrates are the ones discussed in the analysis.
Analysis
Before the analysis of the data, let me emphasize that the test was to determine whether or not the original and the compressed file sound identical. It does not measure what "sounds good." It is purely objective, and even if the compressed file sounds better, it fails because the purpose of a codec is to preserve data.
p is the result of the hypothesis test. It is the probability of getting at least as many correct answers as I did if I (the test subject) were only guessing the identities of X and Y at random. If p < 0.05, then the results are statistically significant. In other words, the null hypothesis is rejected (the two files do not sound the same).
We expect low-bitrate files to be audibly different from the original, and we expect that as bitrate increases, the encoded file and the original will sound more and more similar until the two become indistinguishable. (There are no "degrees of indistinguishability" in this hypothesis test. It's either the same or it's not.) This is clearly shown above, except for the interesting 191 kbps Ogg file for the Jessica Simpson song. For some reason, it sounded the same to me as the original, while the 215 kbps file sounded audibly different. Weird. But that's ok, because no one listens to Jessica Simpson anyway.
The bitrate required to obtain a quality that is identical to the original is greater than or equal to the lowest bitrate that has p > 0.05. Two files cannot be proven to sound the same by hypothesis testing. So for example, if I want to use Ogg Vorbis, then I know that encoding at a nominal bitrate below 256 kbps will cause audible sound changes, while encoding at or above 256 kbps may or may not produce files that sound the same as the original.
Having said that, these minimum bitrates for MP3, WMA, and Ogg for "Sandstorm" are 188, 208, and 275, respectively, and for "Forever in Your Eyes," 127, 105, and 247, respectively. The nominal bitrates associated with these numbers, respectively, are 160~192, 192, 256; and 96~128, 96, 256.
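The rule "lowest bitrate that has p > 0.05" is easy to apply mechanically. Here is a small Python sketch using the "Sandstorm" numbers (actual bitrate and p) from the results; the function and variable names are mine and purely illustrative.

```python
SIGNIFICANCE = 0.05

# (actual bitrate in kbps, p) pairs copied from the "Sandstorm" results.
sandstorm = {
    "MP3": [(79, 0.001), (111, 0.001), (125, 0.001), (158, 0.006), (188, 0.623)],
    "WMA": [(78, 0.001), (110, 0.006), (142, 0.021), (176, 0.021), (208, 0.623)],
    "Ogg": [(88, 0.001), (123, 0.001), (159, 0.001), (196, 0.001),
            (234, 0.016), (275, 0.945)],
}


def lowest_nonsignificant_bitrate(results):
    """Lowest bitrate whose p exceeds the significance level, or None if there is none."""
    passing = [kbps for kbps, p in results if p > SIGNIFICANCE]
    return min(passing) if passing else None


for codec, results in sandstorm.items():
    print(codec, lowest_nonsignificant_bitrate(results))
# MP3 188
# WMA 208
# Ogg 275
```

Applied literally to the Ogg rows for "Forever in Your Eyes," the same rule would return 90 rather than 247, because the 90, 111, and 191 kbps files also have p > 0.05. The 247 figure quoted above appears to be the lowest bitrate above which every tested file had p > 0.05.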
Conclusion
Objective Conclusion
The data above is pretty thorough. Feel free to draw your own conclusions, but be aware that numbers can easily be misinterpreted or interpreted in invalid ways. I hope you at least have a basic understanding of statistics. If "null hypothesis" means nothing to you, then making your own conclusions may not be the best idea.
I ran this test with hopes that Ogg Vorbis would win. This appears not to be the case. This is especially apparent in the Jessica Simpson song, where a 127 kbps MP3 and a 105 kbps WMA seemed to sound the same as the original, while even a 215 kbps Ogg could not match this quality.
Basically, Ogg Vorbis 1.0 is not a match for MP3 or WMA. Perhaps Ogg Vorbis 1.1 does a better job.
Between MP3 and WMA, however, the results are inconclusive. WMA seems to beat MP3 in the Jessica Simpson song, but does not hold any noticeable advantage for "Sandstorm." But do remember that WMA is produced by Microsoft and has encryption schemes, copy protection, and whatnot built in, so in my opinion, WMA should be avoided in favor of MP3.
Thinking back to the claim that WMA performs equally to an MP3 with twice the bitrate -- what a load of bull. The data for both songs totally reject this statement.
It is also interesting to note that a far higher bitrate was required to achieve transparency in the techno song than in the girly pop song. I think this is because these codecs were engineered to preserve mid-range frequencies (like voice) very well, while "Sandstorm" covers a wide range of frequencies and is more difficult to preserve without a noticeable loss of quality. In casual listening, one's ears are more focused on the singer than on anything else, and are thus more likely to notice flaws in the voice.
What My Ears Say
Objectively, MP3 and WMA completely destroyed Ogg Vorbis in this test. But subjectively (which, of course, is how we judge music when we listen to it), the situation is different. Note that anything in this section can be and probably is biased.
I noticed that as bit rate decreases, MP3 adds distortion and kind of "warps" the sound, while Ogg Vorbis and WMA alter the tone. I'm sure you know what I mean by distortion in an MP3 -- kind of like your speakers are underwater. In Ogg and WMA files at low bit rates, the tone of the music shifts a little. They did not sound as rich as the original, but they also did not sound "worse" than the original in terms of musical, ear-pleasing quality. I can stand listening to 96 kbps Ogg or WMA files every day, because although they are not identical to the originals, they sound very similar. However, listening to 96 kbps MP3s is a pain, and they sound terrible.
MP3 at low bitrates, like 80 kbps, is unbearable. I can tell this apart from the original without having to listen to the original at all, and even when I'm just listening to music in the background while doing other things, I sometimes get annoyed because I can hear the massive artifacts in a 64 or 80 kbps MP3. I have to strain to even figure out what the singer is singing for some MP3s at 64 or 80 kbps. (Do note that older MP3 encoders may perform quite poorly compared to the fairly recent LAME encoder I used.) On the other hand, Ogg and WMA are not so bad. To my ears, they perform about equally well at low bit rates, though I think Ogg has an edge here. I could not hear any defects in a 47 kbps Ogg file, and could not even spot any differences from the original until I actually listened to them side by side. Again, this is because Ogg seems to shift the tone a little, and my brain does not remember tone perfectly. This type of distortion is doubtlessly preferable to the MP3-style "underwater" distortion.
The End
At high bitrates, the crown goes to WMA or MP3, especially if an audio track has vocals. At low bitrates, WMA still performs superbly, but Ogg Vorbis is a notable competitor, especially with the open-source advantage. There are dozens of other audio formats out there, including lossless formats, such as Monkey's Audio, which perfectly preserve data and will always sound identical to the original. MP3, WMA, and Ogg Vorbis are only three of the most popular formats, and each has its own advantages.
The best solution, of course, is to buy a bigger hard drive and use lossless compression.
Last updated January 6, 2005.
Comments
what the hell is that OGG crap
if u hate MP3s then just use WMA
Bill on Tuesday, February 1, 2005 at 9:11 PM
If you don't know what OGG is then do a search for it. Did he say he hates mp3s? One cannot argue that at low bitrates they suck.
Good site by the way, I was trying to decide what to use to rip cds, to get the most songs on my DAP with good quality. I think I'll go 128 WMA and use the equalizer to adjust any shifts in sound.
stu on Monday, February 7, 2005 at 3:58 PM
This is just the data that I was looking for!!!
Thanks for shareing with everyone.
I'd like to read more studies that you've done.
Thanks again
Stewrt on Wednesday, March 2, 2005 at 11:15 AM
who are these n00bs?
Will on Tuesday, April 12, 2005 at 10:51 PM
Thank you,
I supposed that WMA 192kbs had the same audio quality of 320kbs. Now I'm going to use MP3 320kbs.
Stefano on Sunday, April 17, 2005 at 6:21 PM
The best music is silence.
Chang Liu on Tuesday, April 19, 2005 at 5:38 AM