Harmonic errors in equal tempered musical scales
An early version of this document was e-mailed to staff at Interval Research Corporation in 3 parts, as a puzzle, between 21 Jun and 7 Sep 1994.
Many thanks to Paul H Erlich for his comments, and for pointing out several errors in my earlier description of the historical evolution of musical scales. The history below is still drastically oversimplified but I hope it is no longer blatantly wrong.
Background
The equal tempered scale usually refers to the musical scale with 12 equal divisions of the octave, but here I take an equal-tempered scale to be any scale where the frequency of each note is related to the next by a constant multiplier. For example, in the standard 12-note equal-tempered scale (which I will call E12 from now on) we go up by one degree of the scale (called a semitone) by multiplying the frequency by the 12th root of 2 (approx 1.059).
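As a small illustration of the constant multiplier, here is a minimal Python sketch. The function name `et_ratio` and its parameters are illustrative assumptions, not part of the original description.

```python
def et_ratio(degree, notes_per_octave=12):
    """Frequency ratio of a scale degree relative to the starting note.

    Each degree multiplies the frequency by the same constant,
    2 ** (1 / notes_per_octave), so `degree` steps give 2 ** (degree / N).
    """
    return 2.0 ** (degree / notes_per_octave)


semitone = et_ratio(1)          # the 12th root of 2
print(round(semitone, 3))       # 1.059
print(et_ratio(12))             # twelve semitones make an octave: 2.0
```

The same function covers any other equal-tempered scale by changing `notes_per_octave`, for example `et_ratio(1, 19)` for one step of E19.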
Our pitch sense is logarithmic with respect to frequency. An octave corresponds to a doubling of frequency. There are 12 semitones to an octave, and for fine measurements of pitch, a semitone is further divided into 100 cents. The best human piano tuners have an average error of ±2 cents. So with regard to musical tuning I take the terms just or true to mean "within about 2 cents".
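Since an octave is 1200 cents and pitch is logarithmic in frequency, converting a frequency ratio to cents takes one line. The sketch below is only an illustration; the names `cents` and `is_just` are assumptions of mine, and the 2 cent tolerance is the working definition of "just" given above.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


def is_just(error_in_cents, tolerance=2.0):
    """True if an error is within the 'about 2 cents' tolerance."""
    return abs(error_in_cents) <= tolerance


print(cents(2))                  # an octave: 1200.0
print(round(cents(3 / 2), 2))    # a just fifth, about 701.96
```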
Equal-tempered scales may be considered as approximating just or true scales. In a just scale, pairs of notes have their frequencies related by small-whole-number ratios such as the following.
1:2 (an octave) | 1:3 (a twelfth) | 1:4 (2 octaves) | 1:5 | 1:6 | 1:7 | 1:8 (3 octaves)
2:3 (a fifth) | 2:5 (a major tenth) | 2:7 (an augmented 13th)
3:4 (a fourth) | 3:5 (a major sixth) | 3:7 (an augmented ninth) | 3:8 (an eleventh)
4:5 (a major third) | 4:7 (an augmented sixth)
5:6 (a minor third) | 5:7 (an augmented fourth) | 5:8 (a minor sixth)
The quality of a just interval, i.e. how harmonious or consonant it is, is some function of how small the numerator and denominator are when the ratio is in lowest terms. In fact the product of numerator and denominator gives a rough measure of dissonance (opposite of consonance) for typical timbres. We would generally not be interested in ratios whose product is more than about 42 or possibly 72. It depends on the spectrum of the particular timbre being used. It is related to the coincidence or otherwise of the sine wave partials that make up the timbre. These normally occur at whole-number multiples of the fundamental frequency and are then called harmonics. Some timbres, such as square waves, have partials only at odd multiples of the fundamental (odd harmonics) and so none of the ratios with an even number in them would be of any particular interest with such timbres. With a pair of pure sine waves, nothing happens at these ratios at all. The only special ratio for them is the unison (1:1) which is of course consonant for all timbres. However we should also note that, at all but low sound levels, harmonics (particularly the second) are also generated by non-linearities in the ear.
The most common intervals also have names, as you will see from the table above. These are based on the number of white notes included in the span when played on a keyboard (and including the two notes played). This naming scheme is an unfortunate case of counting the fence-posts (e.g. 8 per octave) instead of the gaps (7 per octave). Note that if you want to know the length of a fence, you multiply the gap between two adjacent posts by the number of gaps, not the number of posts. But we're stuck with this way of naming intervals and so have to deal with its strange arithmetic, for example, a fifth plus a fourth is an octave (5+4=8), and two thirds make a fifth (3+3=5). Subtract 1 after adding. Add 1 after subtracting.
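The "subtract 1 after adding, add 1 after subtracting" rule can be written down directly. This is a minimal sketch with made-up function names, working only on the interval numbers (fifth = 5, octave = 8) and ignoring qualities such as major or minor.

```python
def add_intervals(a, b):
    """Stack two intervals given by their traditional numbers.

    The names count fence-posts (notes), not gaps (steps), so the shared
    note in the middle is counted twice and must be subtracted once.
    """
    return a + b - 1


def subtract_intervals(a, b):
    """Remove interval b from interval a (both as traditional numbers)."""
    return a - b + 1


print(add_intervals(5, 4))       # a fifth plus a fourth: 8, an octave
print(add_intervals(3, 3))       # two thirds: 5, a fifth
print(subtract_intervals(8, 5))  # an octave less a fifth: 4, a fourth
```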
Here are the most common intervals up to an octave in size, in approximate order of increasing dissonance. I have included the name of the upper note required to form that interval if the lower note is a C.
Ratio | Upper note | Interval name
1:1 | C | unison
1:2 | C | octave
2:3 | G | fifth
3:4 | F | fourth
3:5 | A | major sixth
4:5 | E | major third
5:6 | Eb | minor third
4:7 | A# (lower than Bb) | augmented sixth (not available in E12)
5:7 | F# | augmented fourth
5:8 | Ab | minor sixth
5:9 (or 9:16) | Bb | minor seventh
I included the augmented sixth (4:7) above although it is not at all common. There is no good approximation to it in E12. Some authors confuse it with the minor seventh, when in fact it is about 30 cents lower (about a third of a semitone) and considerably more consonant.
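The rough dissonance measure mentioned earlier (numerator times denominator, in lowest terms) can be compared with the ordering of the table above. The following is a minimal Python sketch; the function name `dissonance` and the list of ratios typed in by hand are illustrative choices.

```python
from math import gcd


def dissonance(a, b):
    """Product of numerator and denominator after reducing to lowest terms."""
    g = gcd(a, b)
    return (a // g) * (b // g)


# The intervals from the table above, as (lower, upper) pairs.
ratios = [(1, 1), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5),
          (5, 6), (4, 7), (5, 7), (5, 8), (5, 9)]

for a, b in sorted(ratios, key=lambda r: dissonance(*r)):
    print(f"{a}:{b}  product = {dissonance(a, b)}")
```

On this measure 4:7 (product 28) ranks slightly ahead of 5:6 (product 30), which fits the remark that the table is only in approximate order.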
The problem with these just ratios is that, while singers naturally tend towards them, most instruments can only be tuned to one such just scale and cannot transpose up or down to accommodate different human voice ranges for example.
Certain pentatonic scales (5 notes per octave), based on just fifths, are among the most ancient just scales, occurring in the folk music of many cultures. Western heptatonic scales (7 notes per octave), corresponding to the white keys on keyboard instruments, almost certainly evolved from these.
The steps of the heptatonic scales were unequal and were referred to as whole-tones and semitones. Eventually the 5 "black" notes per octave were added within the whole tones in such a way that one could start a heptatonic scale on any note. This is referred to as "playing in different keys". However, as one departed more from the scales with all "white" notes, one became more likely to encounter intervals that sounded "off". This is one meaning of the musical term wolf tones, as in "there's that wolf howling in the organ pipes again". Apart from the octave, which is essentially universal, these scales still only used ratios involving powers of the prime numbers 2 and 3 (fourths and fifths). These are called Pythagorean scales.
Then we come to mean-tone tuning which introduced just major thirds and minor sixths, which involve the next prime, 5. This caused the fifths and fourths to be out by about 5 cents (barely noticeable except to a trained ear). Any system for distributing the errors caused by the interaction of the various ratios is called a temperament. Mean-tone temperament allowed one to play in 6 major and 3 minor keys with equally good intervals. Unfortunately, if one strayed the slightest bit outside of these keys it was like falling off a cliff, musically speaking.
Where does the name mean-tone come from? The just heptatonic scales not only had whole-tones and semitones, but those that attempted just major thirds (4:5) actually had two slightly different sizes of whole-tone step, called major tones and minor tones. The new system made them all the same size: the mean of the two.
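To put numbers on this, the sketch below assumes the usual just values of 8:9 for the major tone and 9:10 for the minor tone (the latter is not stated in the text above, but 8:9 followed by 9:10 gives exactly 4:5). The variable names are illustrative.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200.0 * math.log2(ratio)


major_tone = cents(9 / 8)     # assumed just major tone, 8:9
minor_tone = cents(10 / 9)    # assumed just minor tone, 9:10
mean_tone = (major_tone + minor_tone) / 2

print(round(major_tone, 1))   # 203.9
print(round(minor_tone, 1))   # 182.4
print(round(mean_tone, 1))    # 193.2

# Two mean tones stacked give a just major third (4:5).
print(abs(2 * mean_tone - cents(5 / 4)) < 1e-9)   # True
```

Because cents are logarithmic, averaging in cents corresponds to taking the geometric mean of the two frequency ratios.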
The augmented sixth (4:7), and other intervals that involve the prime number 7, also became available to Western music for a brief period during the mean-tone era. To attempt to overcome the worst limitation of only 3 minor keys, some organ keyboards were constructed having some of the black keys split into two, so that for example, A# and Bb were both available.
Such keyboards proved too difficult to play and went the way of the dinosaur when someone found that by making even the white notes slightly off you could start a scale anywhere with usable results. No matter where you started, it was always slightly off, but not enough that you couldn't get used to it. This led first to various "well temperaments" and then to equal-temperament. These allowed composers to modulate (play in different keys in the one piece) as widely and as wildly as they liked. This came at a cost to the harmonies involving the number 5, namely the thirds and sixths, both major and minor. In 12 note equal temperament they are all out by about 15 cents.
For a more complete history of tuning, including well temperaments, see Margo Schulter's article Pythagorean Tuning and Medieval Polyphony. See also David Bartlet's The Mathematics of Tuning and Temperament (with audio examples).
Here are the first eleven powers of the 12th root of 2 (the twelfth is exactly 2, the octave). Notice how remarkably close most of them are to the aforementioned ratios of small whole numbers. If the distribution were random we would expect an average error magnitude of 25 cents. Instead the average error is only 10 cents.
Degree | From C | E12 ratio | Just ratio (decimal) | Just ratio | Interval name | Error (cents)
1 | C# | 1.059 | n/a | n/a | semitone | n/a
2 | D | 1.122 | 1.125 | 8:9 | tone | 4
3 | Eb | 1.189 | 1.200 | 5:6 | minor third | 16
4 | E | 1.260 | 1.250 | 4:5 | major third | 14
5 | F | 1.335 | 1.333 | 3:4 | fourth | 2
6 | F# | 1.414 | 1.400 | 5:7 | augmented fourth | 17
7 | G | 1.498 | 1.500 | 2:3 | fifth | 2
8 | Ab | 1.587 | 1.600 | 5:8 | minor sixth | 14
9 | A | 1.682 | 1.667 | 3:5 | major sixth | 16
10 | Bb | 1.782 | 1.778 | 9:16 | minor seventh | 4
11 | B | 1.888 | 1.875 | 8:15 | major seventh | 12
Thanks to Micah Caudle for correcting an error in the above table.
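The error column of the table can be recomputed with a few lines of Python. This is a sketch only: the dictionary `JUST` simply copies the just ratios from the table, and the names are illustrative.

```python
import math

# Just ratio (lower, upper) paired with each E12 degree, as in the table.
JUST = {2: (8, 9), 3: (5, 6), 4: (4, 5), 5: (3, 4), 6: (5, 7),
        7: (2, 3), 8: (5, 8), 9: (3, 5), 10: (9, 16), 11: (8, 15)}


def e12_error(degree):
    """Absolute error in cents between an E12 degree and its just ratio."""
    lower, upper = JUST[degree]
    just_cents = 1200.0 * math.log2(upper / lower)
    return abs(100.0 * degree - just_cents)   # each E12 degree is 100 cents


errors = {degree: e12_error(degree) for degree in JUST}
for degree, err in errors.items():
    lower, upper = JUST[degree]
    print(degree, round(2 ** (degree / 12), 3), f"{lower}:{upper}", round(err, 1))

average = sum(errors.values()) / len(errors)
print(round(average))   # 10
```

The average is taken over the ten degrees that have a just ratio listed; the semitone row is left out, as in the table.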
The problem
Is there something really special about the number 12, or are there other numbers, smaller or greater than 12, which give an equal-tempered octave whose notes are close to many small-whole-number ratios? Limit the search to fewer than 60 notes per octave because with 60 notes per octave any frequency would fall within ±10 cents of some note.* Remember that the ratios with smaller numbers in them are more important.
A side note: The 12 note equal-tempered octave is not big on ratios with 7 in them. Perhaps we can find a system that treats the seventh harmonic well.
Feel free to skip the next section which is heavily mathematical and go straight to the section, "The solution".
* [Unfortunately, by limiting the search to 60 notes we miss out on E72 which I later learned is very useful, particularly since it is a multiple of the standard E12. However 72 notes per octave is rather daunting and one is more likely to use an unequally spaced subset, such as Paul Erlich's and my linearly tempered Blackjack scale, which is 5,2,5,2,5,2,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2 in steps of 72-ET, or my 19-note planar tempering of Rami Vitale's Byzantine-superset scale, which is 5,2,5,4,3,4,3,4,5,2,5,5,2,5,4,3,4,3,4.]
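As a quick check of the two step patterns quoted in this footnote, the following sketch converts steps of 72-ET into cents above the starting note. The constant and function names are illustrative assumptions; only the step lists come from the text.

```python
from itertools import accumulate

BLACKJACK_STEPS = [5, 2, 5, 2, 5, 2, 2, 5, 2, 5, 2, 5, 2, 5, 2, 5, 2, 5, 2, 5, 2]
BYZANTINE_STEPS = [5, 2, 5, 4, 3, 4, 3, 4, 5, 2, 5, 5, 2, 5, 4, 3, 4, 3, 4]


def scale_cents(steps, divisions=72):
    """Cents above the starting note for each note of a step pattern."""
    return [round(total * 1200 / divisions, 2) for total in accumulate(steps)]


for name, steps in (("Blackjack", BLACKJACK_STEPS), ("Byzantine", BYZANTINE_STEPS)):
    # Each pattern should have the stated number of notes and span one octave.
    print(name, len(steps), "notes,", sum(steps), "steps of 72-ET")
    print(scale_cents(steps))
```

Both patterns add up to 72 steps, so the last entry of each list is the octave at 1200 cents.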
A method
A first pass at finding equal tempered scales that work is to compare powers of 2 with powers of the other small primes and look for near matches. I used a spreadsheet as follows. Name the first column "Notes per octave" and put the integers from 1 to 60 in it. Then name six more columns "3rd harmonic", "3rd harmonic error", "5th harmonic", "5th harmonic error", "7th harmonic", "7th harmonic error". Unless they are tuned very accurately, ratios involving the 11th harmonic (the next prime) are too dissonant to distinguish themselves from the general dissonance in the region between the simpler ratios on either side.
I'll use 'N' to mean Notes per octave. In the "3rd harmonic" column calculate which power of 2 is closest to the Nth power of 3 by
M = Round(N * log(3) / log(2))
(the base of the logarithm doesn't matter)
Then in the "3rd harmonic error" column calculate the error in cents between these two powers.
e3 = Abs(1200 * log(3^N/2^M) / log(2) / N)
This will be the error in the intervals called a fifth (2:3) and a fourth (3:4). Since these are so important we should probably reject any scale with an error greater than 10 cents in these harmonies.
Do the same thing in the "5th harmonic" and "7th harmonic" columns, replacing 3 with 5 or 7 in the formulas.
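For readers without a spreadsheet to hand, the columns described above can be sketched in Python. This is not the original spreadsheet; the function name `harmonic_error` is an assumption, and the error formula is rearranged as 1200 * (N * log2(h) - M) / N, which is algebraically the same as the one given but avoids very large powers.

```python
import math


def harmonic_error(n, harmonic):
    """Error in cents of the best E-n approximation to a harmonic.

    m is the power of 2 closest to the n-th power of the harmonic
    (the "M" of the text); the result is the error per scale of n notes.
    """
    exact = n * math.log2(harmonic)
    m = round(exact)
    return abs(1200.0 * (exact - m) / n)


# One row per "Notes per octave", with the three error columns.
table = [(n, harmonic_error(n, 3), harmonic_error(n, 5), harmonic_error(n, 7))
         for n in range(1, 61)]

for n, e3, e5, e7 in table:
    if n in (12, 19, 26, 31, 53):
        print(n, round(e3, 2), round(e5, 2), round(e7, 2))
```

For N = 12 this gives roughly 1.96, 13.69 and 31.17 cents for the 3rd, 5th and 7th harmonics.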
We should probably reject any scales with a 5th harmonic error worse than that of E12, i.e. greater than 14 cents. This corresponds to the error in the harmony called a major third (4:5).
Since we're happy with our E12 and it does such a lousy job on 7th harmonics (a 31 cent error) I guess we can't eliminate a scale just because it doesn't do 7th harmonics (e.g. augmented sixths) very well, but we'd certainly like to find some that do.
We'd like a single number that lets us directly compare the usefulness of the different scales. I decided on a somewhat ad hoc (or contrived?) "uselessness" metric which is a weighted average of the errors. The weights I used for the errors were the squares of the reciprocals of their harmonic number. This may give excessive weight to the low numbers but certainly the first power does not give enough. This becomes obvious if you have an "11th harmonic error" column.
e = (e3/9 + e5/25 + e7/49)/(1/9 + 1/25 + 1/49)
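The weighted average can be sketched in Python as follows. The function names are illustrative assumptions; the weights are the squared reciprocals of the harmonic numbers, as described above, and the error calculation is repeated here so that the block runs on its own.

```python
import math


def harmonic_error(n, harmonic):
    """Error in cents of the best E-n approximation to a harmonic."""
    exact = n * math.log2(harmonic)
    return abs(1200.0 * (exact - round(exact)) / n)


def uselessness(n, harmonics=(3, 5, 7)):
    """Weighted average of harmonic errors, with weights 1 / harmonic**2."""
    weights = [1.0 / h ** 2 for h in harmonics]
    errors = [harmonic_error(n, h) for h in harmonics]
    return sum(w * e for w, e in zip(weights, errors)) / sum(weights)


print(round(uselessness(12), 2))   # about 8.17 cents for E12

# Scales ranked from least to most "useless" on this metric alone,
# ignoring any penalty for having more notes per octave.
ranking = sorted(range(1, 61), key=uselessness)
print(ranking[:10])
```

Passing `harmonics=(3, 5, 7, 11)` adds an 11th harmonic column, should one want to try it.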
Note also that we are simplifying by not treating separately those ratios involving two odd primes, such as 3:5, 5:6, 5:7, 6:7. It is quite possible that a scale may have good approximations for the two odd primes involved but have a bad approximation for the combined interval because the errors in the primes are in opposite directions and so their magnitudes add. This simplification is reasonable when the harmonics are weighted as they are here. But a more even weighting, such as the first power mentioned above, may need to treat these separately. Consistency of adding the intervals in a triad is another issue brought to my attention by Paul Erlich.
The solution
If there were no disadvantage in having more notes per octave, E12 would be only slightly special with regard to harmonies. Several tunings with fewer than 60 notes do better. See the chart below.
[Chart: harmonic errors in cents against number of notes per octave]
Of those, E24 does fifths (2:3) as well as E12 (hardly surprising since 24 is an exact multiple of 12), while E29 does them better. E24 and E29 do major thirds (4:5) just as well as E12. E19 and E22 do them better.
They all do augmented sixths (4:7) much better than E12. E26 has just augmented sixths (0.4 of a cent error).
When reasonable consideration is given to the inconvenience of having more keys on a keyboard, E12 is easily the most useful, followed by E22 and E24 and then E19 and E26. A classical scale in India has 22 notes per octave but they are not equally spaced.
Of those with fewer than 12 keys, only E10 has anything at all to recommend it, but it doesn't have very good fifths or thirds. If you want to hear something particularly horrid, try E11.
If you really don't care about the number of keys then E53 is extremely good, having an error of only 0.07 cents in its fifths, 1.4 cents in its major thirds and 5 cents in its augmented sixths. It is much better than any other until we reach E72.
Of course we expect the errors to get smaller as the number of divisions increases and hence their size decreases. The chart below shows the errors as a proportion of the scale division rather than as absolute cents.
The only ones equal to or better than E12 on this criterion are E31, E41 and E53.
Most MIDI synthesisers can accept tuning tables to retune them to weird scales like these. The design of keyboards to play in these scales is rather daunting but has been done for some of them. Guitar fretboards are a better candidate since most of the chord patterns should still work, with only minor adjustments of finger position.
One should note that harmonies that are too true (say less than a 1 cent error) can sound more like a single note of a different timbre than a harmony. So one probably shouldn't attempt to eliminate the errors entirely, and on that basis alone we might reject E53.
Unlike many electronically generated sounds, the partials of the timbres of real musical instruments may not be exact multiples of the fundamental (not phase-locked). The series is sometimes "stretched" or "compressed". For example, the 11th harmonic might differ from 11 times the frequency of the fundamental by up to 12 cents. So in selecting scales we really should consider the particular timbres we expect to play in them. However, bowed strings, brass and reed instruments, the human voice and many synthetic tones have exact harmonics for tuning purposes, and so E31 warrants more investigation.
It turns out that mean-tone tuning is essentially 12 notes chosen from E31. And both E19 and E31 were known hundreds of years ago as "the cycle of 19" and "the cycle of 31". They form part of a Fibonacci-like sequence of scales: E5, E7, E12, E19, E31. I say Fibonacci-like because adding two numbers in sequence gives the next; however, the sequence essentially stops at 31, since E50 is not very good considering how small its divisions are. These scales have the advantage that their notes can be consistently named in the standard Western manner, i.e. the letters A to G and sharps and flats (including double sharps and double flats). This is because they all have a single chain of fifths and agree with the approximation that 4 fifths is the same as 2 octaves and a major third. (3/2)^4 ~= 2^2 * 5/4. The actual difference between these is about 21.5 cents and is called a syntonic comma. Other interesting scales either do not agree with this approximation (E22, E41, E53), or have multiple interleaved chains of fifths (E24), and so cannot be named consistently in the standard Western manner. Note that 19 + 22 = 41 and 31 + 22 = 53.
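The "4 fifths equals 2 octaves plus a major third" approximation can be tested for any scale by counting steps. The sketch below assumes that each scale uses its closest approximation to the fifth (2:3) and the major third (4:5); the function names are illustrative.

```python
import math


def best_steps(n, ratio):
    """Number of steps of E-n closest to the given frequency ratio."""
    return round(n * math.log2(ratio))


def agrees_with_approximation(n):
    """True if 4 fifths equal 2 octaves plus a major third in E-n."""
    fifth = best_steps(n, 3 / 2)
    third = best_steps(n, 5 / 4)
    return 4 * fifth == 2 * n + third


# The just difference between the two sides: the syntonic comma.
comma = 1200.0 * math.log2((3 / 2) ** 4 / (2 ** 2 * 5 / 4))
print(round(comma, 1))   # 21.5

for n in (5, 7, 12, 19, 31, 22, 41, 53):
    print(n, agrees_with_approximation(n))
```

The check says nothing about whether the fifths form a single chain, so E24 passes it even though its notes fall into two interleaved chains.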
Note that while E24 appears, from the above analysis, to be significantly better than E12, particularly in having an almost passable treatment of the 7th harmonic, in reality it is not. This is because the errors in the 5th and 7th harmonics, while individually being barely acceptable, are in opposite directions. This results in a 5:7 harmony with an unacceptable error. Interestingly both E12 and E24 do contain a reasonable approximation to a 5:7 but it is a different size to the one produced by playing the best E24 approximations to 4:5 and 4:7 from the same root (the note representing 4). This is the consistency issue raised by Paul Erlich.
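This particular inconsistency is easy to demonstrate numerically. The following sketch only covers the 4:5, 4:7 and 5:7 case described above, not a general definition of consistency, and its function names are illustrative assumptions.

```python
import math


def best_steps(n, ratio):
    """Number of steps of E-n closest to the given frequency ratio."""
    return round(n * math.log2(ratio))


def five_seven_report(n):
    """Compare the direct 5:7 with the one built from 4:5 and 4:7."""
    step = 1200.0 / n
    just = 1200.0 * math.log2(7 / 5)
    direct = best_steps(n, 7 / 5)
    via_root = best_steps(n, 7 / 4) - best_steps(n, 5 / 4)
    return {
        "direct_error": direct * step - just,       # signed, in cents
        "via_root_error": via_root * step - just,   # signed, in cents
        "consistent": direct == via_root,
    }


for n in (12, 24, 31):
    print(n, five_seven_report(n))
```

For E24 the direct 5:7 is 12 steps (about 17.5 cents sharp), while the best 4:7 minus the best 4:5 is 11 steps (about 32.5 cents flat).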
For similar evaluations of equal tempered scales, using some different criteria, including those important to melody, see Paul Erlich's article Tuning, Tonality, and Twenty-Two-Tone Equal Temperament and Georg Hajdu's Low Energy and Equal Spacing; the Multifactorial Evolution of Tuning Systems, Interface, 22, (1993), 319-333.
John Starret's Microtonality Music page has links to many wonderful resources relating to this field.
Book References
Helmholtz, "The sensations of tone"
Alexander Wood, "The physical basis of sound"
Llewellyn S. Lloyd, "Music and sound"