System and method for interpretation and visualization of acoustic spectra, particularly to discover the pitch and timbre of musical sounds

US 6,725,108 B1
Filed: 01/28/1999
Issued: 04/20/2004
Est. Priority Date: 01/28/1999
Status: Expired due to Fees

Abstract

The invention comprises a transducer, computer hardware, and software. The computer hardware contains a waveform-input device. The transducer (such as a microphone) converts a signal (such as sound waves) into a time-varying voltage. The waveform-input device periodically samples this voltage and digitizes each sample, thereby producing an array of N numbers in the memory of the computer that represent a small snippet of the signal measured over a time interval Δt_meas. Snippets are typically measured one after the other at a repetition rate that is inversely related to Δt_meas. The software, also stored in the memory of the computer, and executed using its central processing unit, includes a spectral-analysis process that analyzes the frequency content of each snippet and produces an associated spectrum. The software also includes a novel note-analysis process that analyzes the spectrum and extracts from it the pitch and timbre of the principal musical note contained therein. The process works for any spectrum, including cases where the fundamental frequency of the note is missing. The software further includes novel processes to visualize graphically the pitch and the timbre.

19 Claims

1. A system for analyzing an acoustic spectrum comprising:
- a computer with one or more memories and one or more central processing units;
  
  an audio input device that acquires an audio waveform in the time domain;
  
  means for conducting a spectral analysis process that evaluates the frequency content of the audio waveform at one or more discrete evaluation frequencies, the spectral-analysis process determining at each evaluation frequency a spectral amplitude representing the power spectral density of the waveform at the respective frequency, this set of spectral amplitudes versus frequency being called a power spectrum;
  
  a note analysis process that identifies a set of peaks in the power spectrum, finds low-integer relationships between the frequency of the peaks, and thereby determines which of the peaks belongs to a note contained in the audio waveform;
  
  wherein said note analysis process comprises the following steps;
  
  a. selecting the largest peak into a set of overtones comprising the note;
  
  b. sequentially comparing a candidate peak not yet in the set to those already in the set, said sequential comparisons being done in order of decreasing amplitude of the candidate peaks;
  
  c. for each of the comparisons, selecting the candidate peak into the set of overtones if and only if the candidate peak's frequency as well as the frequencies of all peaks already in the set are low-integer multiples of a common fundamental frequency, within a tolerance.
2. The system as in claim 1 where each of the overtones is given an integer overtone number that specifies approximately the ratio between the overtone's frequency and the common fundamental frequency.
3. A system, as in claim 2, where a pitch of the note is determined as a weighted average of the estimates which the various overtones in the note make of the note's fundamental frequency, this estimate being an overtone's frequency divided by its overtone number.
4. A system, as in claim 3, where the average is weighted by the spectral amplitudes.
5. A system, as in claim 4, where the average is given by:
   - $Pitch \equiv \frac{\sum_{j=0}^{L-1} \Phi_{j} \left(\frac{f_{j}}{n_{j}}\right)}{\sum_{j=0}^{L-1} \Phi_{j}},$
6. A system, as in claim 3, further comprising a computer display unit, upon which values of pitch obtained sequentially in time are displayed in the manner of a strip-chart recording, each value of pitch being presented as a data point on a graph of logarithmically scaled frequency versus time, and the data points being accumulated on the displayed graph as time progresses, so that a user can see a history of pitch versus time.
7. A system, as in claim 3, further comprising a computer display unit, upon which the vector of numbers representing the timbre of the note is displayed in the manner of a bar-chart, the height of the i-th bar representing the amplitude of overtone i, such that a user can see directly, for the sampled waveform most recently acquired, the overtone content of the sound, and by observing the bar chart in real time, can see how the overtone content of the sound changes over time.
8. A system, as in claim 3, further comprising a computer display unit, upon which values of pitch are displayed as notes on a musical staff, each note's pitch to the nearest semitone being indicated by the note's location on the staff, as in standard musical notation, and the note's exact pitch within the semitone being indicated by the color of the note as displayed on the computer display unit.
9. A system, as in claim 3, further comprising a computer display unit, upon which values of pitch are displayed as notes on a musical staff, each note's pitch to the nearest semitone being indicated by the note's location on the staff, as in standard musical notation, and the note's exact pitch within the semitone being indicated by the shape of the note as displayed on the computer display unit.
10. The system as in claim 1 where for each of the comparisons, the candidate peak is selected into the set only if the low-integer multiples can be found in the range 1 through 24.
11. The system as in claim 1 where a timbre of the note is determined from the amplitudes and overtone numbers of the peaks in the set of overtones.
12. A system, as in claim 11, where the timbre is given by a vector of numbers whose i-th element is non-zero only if the overtone number i appears in the set of overtones, and in that case the element of the vector is equal to the overtone's amplitude divided by the largest amplitude in the set of overtones.
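
The note-analysis steps of claim 1, the amplitude-weighted pitch of claims 3 to 5, and the timbre vector of claim 12 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claims do not say how the common fundamental is searched for or how the tolerance is measured, so the search strategy in `common_fundamental`, the 1% relative tolerance, the reading of Φ_j, f_j and n_j as amplitude, frequency and overtone number, and all function names are assumptions made for illustration. The limit of 24 on overtone numbers follows claim 10.

```python
from typing import List, Optional, Tuple

Peak = Tuple[float, float]  # (frequency in Hz, spectral amplitude)


def common_fundamental(freqs: List[float], max_n: int = 24,
                       tol: float = 0.01) -> Optional[List[int]]:
    """Return overtone numbers if all freqs fit one fundamental, else None.

    Assumed search strategy: try f0 = freqs[0] / n for n = 1..max_n,
    largest f0 first, and accept the first f0 for which every frequency
    lies within a relative tolerance of an integer multiple in 1..max_n.
    """
    for n_ref in range(1, max_n + 1):
        f0 = freqs[0] / n_ref
        numbers = []
        for f in freqs:
            n = round(f / f0)
            if n < 1 or n > max_n or abs(f - n * f0) > tol * n * f0:
                break  # this f0 does not explain every peak
            numbers.append(n)
        else:
            return numbers
    return None


def select_overtones(peaks: List[Peak], max_n: int = 24,
                     tol: float = 0.01) -> Tuple[List[Peak], List[int]]:
    """Greedy selection following steps a to c of claim 1."""
    ordered = sorted(peaks, key=lambda p: p[1], reverse=True)
    chosen = [ordered[0]]          # step a: the largest peak starts the set
    numbers = [1]
    for cand in ordered[1:]:       # step b: decreasing amplitude
        freqs = [f for f, _ in chosen] + [cand[0]]
        fit = common_fundamental(freqs, max_n, tol)
        if fit is not None:        # step c: keep only if all peaks still fit
            chosen.append(cand)
            numbers = fit
    return chosen, numbers


def weighted_pitch(chosen: List[Peak], numbers: List[int]) -> float:
    """Amplitude-weighted average of f_j / n_j (claims 3 to 5)."""
    total = sum(a for _, a in chosen)
    return sum(a * f / n for (f, a), n in zip(chosen, numbers)) / total


def timbre_vector(chosen: List[Peak], numbers: List[int]) -> List[float]:
    """Element i-1 holds overtone i's amplitude over the largest amplitude."""
    top = max(a for _, a in chosen)
    vec = [0.0] * max(numbers)
    for (_, a), n in zip(chosen, numbers):
        vec[n - 1] = a / top
    return vec


# Made-up peaks: three near-harmonic peaks of a 220 Hz note plus one outlier.
peaks = [(441.0, 0.5), (220.0, 1.0), (660.0, 0.3), (319.0, 0.2)]
overtones, overtone_numbers = select_overtones(peaks)
pitch = weighted_pitch(overtones, overtone_numbers)
timbre = timbre_vector(overtones, overtone_numbers)
```

With these made-up peaks the 319 Hz peak is rejected, and the remaining peaks receive overtone numbers 1, 2 and 3. Because the search divides the strongest peak's frequency by successive integers, it can also assign overtone numbers when no peak sits at the fundamental itself, which is the situation the abstract describes as a missing fundamental.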

13. A computer system comprising:
- one or more central processing units and one or more memories, said computer system further comprising;
  
  an audio input device that acquires an audio waveform in the time domain and samples the audio waveform periodically at a sampling rate to produce one or more sampled wave forms in a temporal sequence, each sampled waveform comprising one or more discrete samples at a respective sample time;
  
  means for conducting a spectral analysis process that evaluates a power spectral density of each waveform at a set of one or more discrete evaluation frequencies, the evaluation being evenly and logarithmically distributed over a frequency range, the spectral-analysis process determining at each evaluation frequency a spectral amplitude representing the power spectral density of the waveform at the respective evaluation frequency, this set of spectral amplitudes versus frequency being called a power spectrum;
  
  means for conducting a note analysis process that identifies a set of peaks in the power spectrum, which finds low-integer relationships between the frequency of the peaks, and thereby determines which of the peaks belongs to a note contained in the audio waveform;
  
  wherein said note analysis process comprises the following steps;
  
  a. selecting the largest peak into a set of overtones comprising the note;
  
  b. sequentially comparing a candidate peak not yet in the set to those already in the set, said sequential comparisons being done in order of decreasing amplitude of the candidate peaks;
  
  c. for each of the comparisons, selecting the candidate peak into the set of overtones if and only if the candidate peak's frequency as well as the frequencies of all peaks already in the set are low-integer multiples of a common fundamental frequency, within a tolerance.
14. A system, as in claim 13, where the number of evaluation frequencies is fewer than the number of discrete samples.
15. The system as in claim 13, where the number of evaluation frequencies in said set of discrete evaluation frequencies is greater than or equal to the number of discrete samples.
16. A system, as in claim 13, where the evaluation frequencies are given by
17. A system as in claim 16, where $f_0$ is a "true" musical note on an equally tempered scale.
18. A system, as in claim 16, where $f_0 = (440)\,2^{s/12}$ for some integer s, where s is any one of the following values: a positive value, a negative value, and a zero value.
19. A system, as in claim 18, where $M = 12m$ for some positive integer m, where m specifies the number of evaluation frequencies per half-step in a 12-tone system of music.
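
The logarithmically spaced evaluation frequencies of claims 13 to 19 can be sketched in Python as follows. The formula of claim 16 is missing from this text, so the geometric spacing f_k = f_0 · 2^(k/M) used below is an assumption, chosen only because it is consistent with claim 13 (even logarithmic distribution), claim 18 (f_0 = 440 · 2^(s/12)) and claim 19 (M = 12m). The direct evaluation of the discrete Fourier sum in `power_at`, its normalization, and all function names are likewise illustrative choices, not the method of the specification.

```python
import cmath
import math
from typing import List, Sequence


def evaluation_frequencies(s: int, m: int, count: int) -> List[float]:
    """Assumed grid: f_k = f0 * 2**(k / M), f0 = 440 * 2**(s / 12), M = 12 * m."""
    f0 = 440.0 * 2.0 ** (s / 12)
    steps_per_octave = 12 * m  # M in claim 19
    return [f0 * 2.0 ** (k / steps_per_octave) for k in range(count)]


def power_at(samples: Sequence[float], sample_rate: float, freq: float) -> float:
    """Squared magnitude of the Fourier sum at one arbitrary frequency."""
    n_samples = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / sample_rate)
              for n, x in enumerate(samples))
    return abs(acc) ** 2 / n_samples ** 2


def power_spectrum(samples: Sequence[float], sample_rate: float,
                   freqs: Sequence[float]) -> List[float]:
    """Spectral amplitude at every evaluation frequency."""
    return [power_at(samples, sample_rate, f) for f in freqs]


# Made-up test signal: 0.1 s of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000.0
samples = [math.sin(2 * math.pi * 440.0 * n / sample_rate) for n in range(800)]

# Two octaves upward from 220 Hz (s = -12), two evaluation points per half-step.
freqs = evaluation_frequencies(s=-12, m=2, count=49)
spectrum = power_spectrum(samples, sample_rate, freqs)
peak_index = max(range(len(spectrum)), key=spectrum.__getitem__)
```

Because each evaluation frequency is computed independently, the number of evaluation frequencies is not tied to the number of samples. It may be smaller, as in claim 14, or larger, as in claim 15.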

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Hall, Shawn Anthony
Primary Examiner(s): Harvey, Minsun Oh
Assistant Examiner(s): Jacobson, Tony M.
Application Number: US09/239,324
Time in Patent Office: 1,909 Days
Field of Search: 381/56, 844/77.R, 700/94
US Class Current: 700/94
CPC Class Codes:
G10G 7/00 Other auxiliary devices or ...
H04R 29/004 for microphones H04R29/007 ...
