System and methods for analyzing and critiquing a vocal performance

US 6,182,044 B1
Filed: 09/01/1998
Issued: 01/30/2001
Est. Priority Date: 09/01/1998
Status: Expired due to Term

Abstract

System and methods for analyzing a vocal performance by automatically critiquing pitch, rhythm and pronunciation or diction of a singer in accordance with pre-programmed criteria. In one aspect, a method for analyzing a vocal performance comprises the steps of capturing the acoustic utterances of a user's vocal performance (singing a song); extracting pitch information from each frame of the acoustic utterances; extracting phonetic information from each frame of the acoustic utterances; combining the extracted pitch information and phonetic information of corresponding frames to generate an encoded representation of the current vocal performance; comparing the encoded representation of the current vocal performance with an encoded reference vocal performance (of the same user or a different person) having pitch and phonetic information associated therewith to determine if a variation between either the pitch information, the phonetic information, or both, of the encoded current vocal performance and of the encoded reference vocal performance is within a predetermined, user-specified tolerance; and critiquing the user's current vocal performance if the variation is determined to exceed the predetermined tolerance.
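
The combining step in the abstract can be illustrated with a minimal sketch. It assumes that per-frame pitch values and phoneme labels have already been produced by some upstream extractors, which this listing does not name. The `Frame` type, the `encode_performance` function, the use of Hz for pitch, and the use of `None` for unvoiced frames are all hypothetical choices made for illustration, not the patented implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Frame:
    """One encoded frame: a hypothetical pairing of pitch and phoneme label."""
    index: int
    pitch_hz: Optional[float]  # None marks an unvoiced or silent frame (assumption)
    phoneme: str


def encode_performance(
    pitches: Sequence[Optional[float]], phonemes: Sequence[str]
) -> List[Frame]:
    """Pair the pitch and phoneme of corresponding frames into one sequence."""
    if len(pitches) != len(phonemes):
        raise ValueError("pitch and phoneme streams must cover the same frames")
    return [
        Frame(index, pitch, phoneme)
        for index, (pitch, phoneme) in enumerate(zip(pitches, phonemes))
    ]
```

Calling `encode_performance([220.0, None], ["AH", "SIL"])` would yield two `Frame` records in frame order, which is one way to picture the time-varying sequence the claims refer to.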

26 Claims

1. A system for analyzing a vocal performance, comprising:
- means for receiving input utterances corresponding to a current vocal performance of a user;
  
  means for extracting pitch information from each frame of said input utterances of said current vocal performance;
  
  means for extracting phonetic information from each frame of said input utterances of said current vocal performance;
  
  means for combining said pitch information and said phonetic information of corresponding frames to generate an encoded representation of said current vocal performance; and
  
  means for outputting said encoded representation of said current vocal performance.
2. The system of claim 1, further comprising audio processing means for providing musical accompaniment during said current vocal performance.
3. The system of claim 1, wherein said encoded representation comprises a time-varying sequence of the extracted pitch information and phonetic information.
4. The system of claim 1, wherein said encoding means averages the extracted pitch information of a plurality of successive frames when said encoding means determines that a change in pitch information in the successive frames is below a specified threshold.
5. The system of claim 1, further comprising:
6. The system of claim 1, further comprising:
- means for storing parameters associated with a song style; and
  
  means for comparing said parameters of said song style with one of said pitch information, phonetic information, and both, of said encoded current vocal performance to determine if a variation of the singing style of said current vocal performance and said song style is within a predetermined tolerance.
7. The system of claim 5, wherein said comparing means compares said encoded current vocal performance with said encoded reference vocal performance by comparing the encoded pitch information and phonetic information in corresponding frames of said encoded representations.
8. The system of claim 7, wherein said comparison means determines if a variation between the rhythm of said current vocal performance and the rhythm of said corresponding reference vocal performance is within a corresponding predetermined tolerance by comparing a timing of change of said phonetic information of said encoded current vocal performance and said encoded reference vocal performance on a frame-by-frame basis.
9. The system of claim 5, wherein said comparing means provides critique data when the variation between one of said pitch information and phonetic information of said encoded current vocal performance and of said encoded reference performance exceeds the corresponding predetermined tolerance.
10. The system of claim 5, wherein said corresponding tolerances are user-programmable parameters.
11. The system of claim 9, wherein said critique data is one of successively provided during said current vocal performance and as batch data at the conclusion of said current vocal performance.
15. The method of claim 4, wherein said comparing step further includes the step of:
- determining if a variation between the timing of said current vocal performance and the timing of said reference vocal performance is within a corresponding predetermined tolerance by comparing a timing of change of said phonetic information of said encoded current vocal performance and of said reference vocal performance on a frame-by-frame basis.
21. The system of claim 1, wherein the means for extracting phonetic information comprises a speech recognition system and wherein the means for extracting the pitch information comprises a frequency extraction system.
22. The system of claim 21, wherein the speech recognition system and frequency extraction system operate in parallel to extract the phonetic and pitch information, respectively, from the current vocal performance.
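
Two of the dependent claims above describe small procedures that a sketch may make easier to picture: averaging pitch over successive frames when the change is below a threshold (claim 4), and judging rhythm by the timing of phoneme changes (claim 8). The following is an illustrative sketch only. The function names, the Hz threshold, the frame-count tolerance, and the rule that a different number of phoneme changes counts as out of tolerance are assumptions, not details taken from the patent.

```python
from typing import List


def smooth_pitch(pitches_hz: List[float], threshold_hz: float) -> List[float]:
    """Replace each run of near-constant pitch with the mean of that run.

    A run continues while the change from the previous frame stays
    below threshold_hz.
    """
    smoothed: List[float] = []
    run: List[float] = []
    for pitch in pitches_hz:
        if run and abs(pitch - run[-1]) >= threshold_hz:
            smoothed.extend([sum(run) / len(run)] * len(run))
            run = []
        run.append(pitch)
    if run:
        smoothed.extend([sum(run) / len(run)] * len(run))
    return smoothed


def phoneme_change_frames(phonemes: List[str]) -> List[int]:
    """Return the frame indices at which the phoneme label changes."""
    return [i for i in range(1, len(phonemes)) if phonemes[i] != phonemes[i - 1]]


def rhythm_within_tolerance(
    current: List[str], reference: List[str], tolerance_frames: int
) -> bool:
    """Compare when phoneme changes occur in the two performances."""
    cur = phoneme_change_frames(current)
    ref = phoneme_change_frames(reference)
    if len(cur) != len(ref):
        return False  # assumption: a missing or extra change is out of tolerance
    return all(abs(c - r) <= tolerance_frames for c, r in zip(cur, ref))
```

Averaging within a run keeps the frame count unchanged, so the smoothed pitch stream can still be paired frame by frame with the phonetic stream.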

12. A method for analyzing a vocal performance, comprising the steps of:
- providing acoustic utterances corresponding to a current vocal performance of a user;
  
  extracting pitch information from each frame of said acoustic utterances;
  
  extracting phonetic information from each frame of said acoustic utterances;
  
  combining said extracted pitch information and phonetic information of corresponding frames to generate an encoded representation of said current vocal performance;
  
  comparing said encoded representation of said current vocal performance with a corresponding encoded reference vocal performance having pitch and phonetic information associated therewith to determine if a variation between one of said pitch information, said phonetic information, and both, of said encoded current vocal performance and of said encoded reference vocal performance is within a corresponding predetermined tolerance; and
  
  critiquing the user's current vocal performance if the variation is determined to exceed a corresponding predetermined tolerance.
13. The method of claim 12, wherein said step of critiquing occurs one of during said current vocal performance and after said current vocal performance is completed.
14. The method of claim 12, wherein said step of comparing said encoded current performance with said encoded reference performance comprises comparing the encoded pitch information and phonetic information in corresponding frames of said encoded representations.
23. The method of claim 12, wherein said corresponding tolerances are user-programmable parameters.
24. The method of claim 12, wherein the steps of extracting pitch and phonetic information are performed simultaneously.
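
The comparing and critiquing steps of claim 12 could look roughly like the sketch below. It is a hedged illustration under several assumptions: each encoded frame is a `(pitch_hz, phoneme)` tuple, pitch deviation is measured in cents (1200 times the base-2 logarithm of the frequency ratio), any phoneme mismatch counts as exceeding tolerance, and the default tolerance of 50 cents is an arbitrary placeholder for the user-programmable value. None of these names or numbers come from the patent.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical encoded frame: (pitch in Hz or None if unvoiced, phoneme label)
EncodedFrame = Tuple[Optional[float], str]


def cents_between(current_hz: float, reference_hz: float) -> float:
    """Signed pitch deviation in cents; positive means the singer is sharp."""
    return 1200.0 * math.log2(current_hz / reference_hz)


def critique(
    current: List[EncodedFrame],
    reference: List[EncodedFrame],
    pitch_tolerance_cents: float = 50.0,
) -> List[str]:
    """Compare corresponding frames and collect remarks that exceed tolerance."""
    remarks: List[str] = []
    for i, ((cur_hz, cur_ph), (ref_hz, ref_ph)) in enumerate(zip(current, reference)):
        if cur_ph != ref_ph:
            remarks.append(f"frame {i}: expected phoneme {ref_ph!r}, heard {cur_ph!r}")
        if cur_hz is not None and ref_hz is not None:
            deviation = cents_between(cur_hz, ref_hz)
            if abs(deviation) > pitch_tolerance_cents:
                direction = "sharp" if deviation > 0 else "flat"
                remarks.append(f"frame {i}: {abs(deviation):.0f} cents {direction}")
    return remarks
```

Returning the remarks as a list leaves the caller free to show them one at a time during the performance or all together afterward, the two delivery modes named in claim 13.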

16. A method for analyzing a vocal performance, comprising the steps of:
- providing acoustic utterances corresponding to a current vocal performance of a user;
  
  extracting pitch information from each frame of said acoustic utterances;
  
  extracting phonetic information from each frame of said acoustic utterances;
  
  combining said extracted pitch information and phonetic information of corresponding frames to generate an encoded representation of said current vocal performance;
  
  comparing said encoded representation of said current vocal performance with parameters of a preselected song style to determine if a variation between one of said pitch information, said phonetic information, and both, of said encoded current vocal performance and of said parameters of said song style is within a corresponding predetermined tolerance; and
  
  critiquing the user's current vocal performance if the variation is determined to exceed a corresponding predetermined tolerance.
17. The method of claim 16, wherein said step of critiquing occurs one of during said current vocal performance and after said current vocal performance is completed.
18. The method of claim 16, wherein said parameters include phonetic information associated with said preselected song style and pitch difference between successive notes of said preselected song style.
19. The method of claim 18, wherein said step of comparing said encoded current performance with said parameters of said preselected song style includes the steps of:
20. The method of claim 18, wherein said comparing step includes the step of:
- comparing said phonetic information of said encoded current vocal performance with said phonetic information parameter of said preselected song style to determine if said phonetic information of said encoded current vocal performance varies from said phonetic information parameter associated with said preselected song style within a corresponding predetermined tolerance.
25. The method of claim 16, wherein said corresponding tolerances are user-programmable parameters.
26. The method of claim 16, wherein the steps of extracting pitch and phonetic information are performed simultaneously.
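
Claim 18 names the pitch difference between successive notes as a song-style parameter. The sketch below shows one way such a parameter might be checked. It is an assumption-laden illustration and not a reconstruction of claim 19, whose steps are not shown in this listing. The idea of a style-specific maximum interval (`style_max_interval`), the semitone unit, and the default tolerance are hypothetical.

```python
import math
from typing import List


def intervals_semitones(note_pitches_hz: List[float]) -> List[float]:
    """Signed pitch difference, in semitones, between each pair of successive notes."""
    return [
        12.0 * math.log2(later / earlier)
        for earlier, later in zip(note_pitches_hz, note_pitches_hz[1:])
    ]


def intervals_outside_style(
    note_pitches_hz: List[float],
    style_max_interval: float,  # hypothetical style parameter, in semitones
    tolerance: float = 0.5,     # hypothetical user-programmable tolerance
) -> List[int]:
    """Return positions of note-to-note steps larger than the style allows."""
    return [
        i
        for i, step in enumerate(intervals_semitones(note_pitches_hz))
        if abs(step) > style_max_interval + tolerance
    ]
```

For example, with notes at 220 Hz, 440 Hz and 440 Hz and a style limit of 7 semitones, only the first step (an octave) would be flagged.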

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Strother, Nelson B.; Fong, Philip W.
Primary Examiner(s): Zele, Krista
Assistant Examiner(s): Opsasnick, Michael N.
Application Number: US09/145,322
Time in Patent Office: 882 Days
Field of Search: 434/307, 844/77, 846/09, 846/10, 846/12, 704/270, 704/278, 704/246
US Class Current: 704/270
CPC Class Codes:
G10L 25/48 specially adapted for parti...
G10L 25/90 Pitch determination of spee...
