System and methods for analyzing and critiquing a vocal performance
Abstract
System and methods for analyzing a vocal performance by automatically critiquing pitch, rhythm and pronunciation or diction of a singer in accordance with pre-programmed criteria. In one aspect, a method for analyzing a vocal performance comprises the steps of capturing the acoustic utterances of a user's vocal performance (singing a song); extracting pitch information from each frame of the acoustic utterances; extracting phonetic information from each frame of the acoustic utterances; combining the extracted pitch information and phonetic information of corresponding frames to generate an encoded representation of the current vocal performance; comparing the encoded representation of the current vocal performance with an encoded reference vocal performance (of the same user or a different person) having pitch and phonetic information associated therewith to determine whether a variation in the pitch information, the phonetic information, or both, between the encoded current vocal performance and the encoded reference vocal performance is within a predetermined, user-specified tolerance; and critiquing the user's current vocal performance if the variation is determined to exceed the predetermined tolerance.
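The combine, compare and critique steps of the abstract can be illustrated with a minimal Python sketch. It assumes per-frame pitch values (in Hz) and phoneme labels have already been extracted; the names `Frame`, `encode`, `critique` and `pitch_tol_hz`, and the choice of a fixed Hz tolerance, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Frame:
    """One frame of the encoded representation (hypothetical layout)."""
    pitch_hz: Optional[float]  # None for unvoiced or silent frames
    phoneme: str               # label assumed to come from a recognizer


def encode(pitches: Sequence[Optional[float]], phonemes: Sequence[str]) -> List[Frame]:
    """Combine pitch and phonetic information of corresponding frames."""
    if len(pitches) != len(phonemes):
        raise ValueError("pitch and phoneme tracks must have the same length")
    return [Frame(p, ph) for p, ph in zip(pitches, phonemes)]


def critique(current: List[Frame], reference: List[Frame],
             pitch_tol_hz: float = 10.0) -> List[str]:
    """Compare frame by frame and report variations beyond the tolerance."""
    notes = []
    for i, (cur, ref) in enumerate(zip(current, reference)):
        if cur.phoneme != ref.phoneme:
            notes.append(
                f"frame {i}: expected phoneme {ref.phoneme!r}, heard {cur.phoneme!r}")
        if cur.pitch_hz is not None and ref.pitch_hz is not None:
            delta = cur.pitch_hz - ref.pitch_hz
            if abs(delta) > pitch_tol_hz:
                direction = "sharp" if delta > 0 else "flat"
                notes.append(f"frame {i}: {abs(delta):.1f} Hz {direction}")
    return notes
```

In this sketch the tolerance is an ordinary function argument, which mirrors the "user-specified tolerance" wording without claiming anything about how the patented system stores it.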
26 Claims
-
1. A system for analyzing a vocal performance, comprising:
-
means for receiving input utterances corresponding to a current vocal performance of a user;
means for extracting pitch information from each frame of said input utterances of said current vocal performance;
means for extracting phonetic information from each frame of said input utterances of said current vocal performance;
means for combining said pitch information and said phonetic information of corresponding frames to generate an encoded representation of said current vocal performance; and
means for outputting said encoded representation of said current vocal performance.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 15, 21, 22
means for storing an encoded representation of a reference vocal performance comprising pitch information and phonetic information; and
means for comparing one of said pitch information, phonetic information, and both, of said encoded current vocal performance and of said encoded reference vocal performance to determine if a variation between one of said pitch information, phonetic information, and both, of said current vocal performance and of said reference vocal performance is within a corresponding predetermined tolerance.
-
-
6. The system of claim 1, further comprising:
-
means for storing parameters associated with a song style; and
means for comparing said parameters of said song style with one of said pitch information, phonetic information, and both, of said encoded current vocal performance to determine if a variation of the singing style of said current vocal performance and said song style is within a predetermined tolerance.
-
-
7. The system of claim 5, wherein said comparing means compares said encoded current vocal performance with said encoded reference vocal performance by comparing the encoded pitch information and phonetic information in corresponding frames of said encoded representations.
-
8. The system of claim 7, wherein said comparison means determines if a variation between the rhythm of said current vocal performance and the rhythm of said corresponding reference vocal performance is within a corresponding predetermined tolerance by comparing a timing of change of said phonetic information of said encoded current vocal performance and said encoded reference vocal performance on a frame-by-frame basis.
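One way to read the rhythm comparison of claim 8 is to locate the frames at which the phonetic label changes and compare those positions between the two performances. The Python sketch below is an illustration under that assumption; `phoneme_onsets`, `rhythm_deviations` and `tol_frames` are hypothetical names, and the one-to-one pairing of onsets is a simplification.

```python
from typing import List, Tuple


def phoneme_onsets(phonemes: List[str]) -> List[Tuple[int, str]]:
    """Return (frame index, label) for every frame where the label changes."""
    onsets = []
    previous = None
    for index, label in enumerate(phonemes):
        if label != previous:
            onsets.append((index, label))
            previous = label
    return onsets


def rhythm_deviations(current: List[str], reference: List[str],
                      tol_frames: int = 2) -> List[Tuple[str, int]]:
    """List phonemes whose onset is early (negative) or late (positive)
    by more than tol_frames relative to the reference."""
    deviations = []
    for (cur_idx, cur_label), (ref_idx, ref_label) in zip(
            phoneme_onsets(current), phoneme_onsets(reference)):
        if cur_label != ref_label:
            break  # sequences diverged: a diction issue rather than a rhythm one
        offset = cur_idx - ref_idx
        if abs(offset) > tol_frames:
            deviations.append((ref_label, offset))
    return deviations
```

A positive offset means the singer reached that phoneme later than the reference did, measured in frames.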
-
9. The system of claim 5, wherein said comparing means provides critique data when the variation between one of said pitch information and phonetic information of said encoded current vocal performance and of said encoded reference performance exceeds the corresponding predetermined tolerance.
-
10. The system of claim 5, wherein said corresponding tolerances are user-programmable parameters.
-
11. The system of claim 9, wherein said critique data is one of successively provided during said current vocal performance and as batch data at the conclusion of said current vocal performance.
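The two delivery modes of claim 11 (critique data provided successively during the performance, or as a batch at its conclusion) map naturally onto a generator and a list in Python. This is only a sketch: the input is assumed to be pairs of (current pitch, reference pitch) in Hz, and `critique_stream` and `critique_batch` are hypothetical names.

```python
from typing import Iterable, Iterator, List, Tuple


def critique_stream(pairs: Iterable[Tuple[float, float]],
                    tol_hz: float) -> Iterator[str]:
    """Yield a critique as soon as an out-of-tolerance frame is seen."""
    for index, (current_hz, reference_hz) in enumerate(pairs):
        if abs(current_hz - reference_hz) > tol_hz:
            yield f"frame {index}: off by {current_hz - reference_hz:+.1f} Hz"


def critique_batch(pairs: Iterable[Tuple[float, float]],
                   tol_hz: float) -> List[str]:
    """Collect every critique and return them together at the end."""
    return list(critique_stream(pairs, tol_hz))
```

The streaming form can be driven by frames as they arrive, while the batch form simply drains the same generator once the performance is over.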
-
15. The method of claim 4, wherein said comparing step further includes the step of:
determining if a variation between the timing of said current vocal performance and the timing of said reference vocal performance is within a corresponding predetermined tolerance by comparing a timing of change of said phonetic information of said encoded current vocal performance and of said reference vocal performance on a frame-by-frame basis.
-
21. The system of claim 1, wherein the means for extracting phonetic information comprises a speech recognition system and wherein the means for extracting the pitch information comprises a frequency extraction system.
-
22. The system of claim 21, wherein the speech recognition system and frequency extraction system operate in parallel to extract the phonetic and pitch information, respectively, from the current vocal performance.
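Claims 21 and 22 describe a frequency extraction system and a speech recognition system running in parallel. The Python sketch below illustrates that arrangement under several assumptions: pitch is estimated with a plain autocorrelation search (a swapped-in textbook technique, not necessarily the patent's), `label_phoneme` is a hypothetical stand-in for a real recognizer, and the two extractors are submitted to a thread pool. Python threads show the structure only; they do not guarantee a speedup for CPU-bound work.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Sequence, Tuple


def sine_frame(freq_hz: float, sample_rate: int, length: int) -> List[float]:
    """Synthesize a test frame containing a pure tone."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(length)]


def estimate_pitch(frame: Sequence[float], sample_rate: int,
                   fmin: float = 80.0, fmax: float = 500.0) -> Optional[float]:
    """Pick the lag with the largest autocorrelation inside [fmin, fmax]."""
    if not any(frame):
        return None  # silent frame: no pitch
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def label_phoneme(frame: Sequence[float]) -> str:
    """Hypothetical placeholder for a speech recognizer's per-frame label."""
    return "AA" if any(frame) else "SIL"


def analyze(frames: List[Sequence[float]],
            sample_rate: int) -> List[Tuple[Optional[float], str]]:
    """Run both extractors concurrently and pair their per-frame outputs."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        pitch_job = pool.submit(lambda: [estimate_pitch(f, sample_rate) for f in frames])
        phone_job = pool.submit(lambda: [label_phoneme(f) for f in frames])
        return list(zip(pitch_job.result(), phone_job.result()))
```

Because both jobs iterate over the same frame list, their outputs stay aligned and can be zipped into per-frame (pitch, phoneme) pairs.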
-
12. A method for analyzing a vocal performance, comprising the steps of:
-
providing acoustic utterances corresponding to a current vocal performance of a user;
extracting pitch information from each frame of said acoustic utterances;
extracting phonetic information from each frame of said acoustic utterances;
combining said extracted pitch information and phonetic information of corresponding frames to generate an encoded representation of said current vocal performance;
comparing said encoded representation of said current vocal performance with a corresponding encoded reference vocal performance having pitch and phonetic information associated therewith to determine if a variation between one of said pitch information, said phonetic information, and both, of said encoded current vocal performance and of said encoded reference vocal performance is within a corresponding predetermined tolerance; and
critiquing the user's current vocal performance if the variation is determined to exceed a corresponding predetermined tolerance.
Dependent claims: 13, 14, 23, 24
-
-
16. A method for analyzing a vocal performance, comprising the steps of:
-
providing acoustic utterances corresponding to a current vocal performance of a user;
extracting pitch information from each frame of said acoustic utterances;
extracting phonetic information from each frame of said acoustic utterances;
combining said extracted pitch information and phonetic information of corresponding frames to generate an encoded representation of said current vocal performance;
comparing said encoded representation of said current vocal performance with parameters of a preselected song style to determine if a variation between one of said pitch information, said phonetic information, and both, of said encoded current vocal performance and of said parameters of said song style is within a corresponding predetermined tolerance; and
critiquing the user's current vocal performance if the variation is determined to exceed a corresponding predetermined tolerance.
Dependent claims: 17, 18, 19, 20, 25, 26
comparing said pitch information of said encoded current vocal performance with said pitch difference parameter of said preselected song style to determine if said difference in pitch information between successive notes of said encoded current vocal performance varies from said pitch difference parameter associated with said preselected song style within a corresponding predetermined tolerance.
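The comparison of pitch differences between successive notes against a song-style parameter can be sketched as follows. The sketch assumes note pitches are positive values in Hz, measures each step in semitones, and treats the style's pitch difference parameter as a typical step size; `semitone_steps`, `style_violations`, `style_step_semitones` and `tol_semitones` are hypothetical names, and the semitone unit is an illustrative choice rather than something stated in the claims.

```python
import math
from typing import List, Sequence, Tuple


def semitone_steps(note_pitches_hz: Sequence[float]) -> List[float]:
    """Signed interval, in semitones, between each pair of successive notes."""
    return [12.0 * math.log2(b / a)
            for a, b in zip(note_pitches_hz, note_pitches_hz[1:])]


def style_violations(note_pitches_hz: Sequence[float],
                     style_step_semitones: float,
                     tol_semitones: float = 0.5) -> List[Tuple[int, float]]:
    """Return (step index, step size) for steps that differ from the style's
    typical step size by more than the tolerance."""
    violations = []
    for index, step in enumerate(semitone_steps(note_pitches_hz)):
        if abs(abs(step) - style_step_semitones) > tol_semitones:
            violations.append((index, round(step, 2)))
    return violations
```

With a style parameter of two semitones, a melody that moves by whole tones passes, while a leap of a fifth is flagged.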
-
-
20. The method of claim 18, wherein said comparing step includes the step of:
comparing said phonetic information of said encoded current vocal performance with said phonetic information parameter of said preselected song style to determine if said phonetic information of said encoded current vocal performance varies from said phonetic information parameter associated with said preselected song style within a corresponding predetermined tolerance.
-
25. The method of claim 16, wherein said corresponding tolerances are user-programmable parameters.
-
26. The method of claim 16, wherein the steps of extracting pitch and phonetic information are performed simultaneously.
Specification