Speech recognition using unequally-weighted subvector error measures for determining a codebook vector index to represent plural speech parameters

US 6,389,389 B1
Filed: 10/13/1999
Issued: 05/14/2002
Est. Priority Date: 10/13/1998
Status: Expired due to Term

Abstract

Quantization unit (108) comprises evaluator (120) and comparator (122) in signal processing for identifying an utterance in system (100). The evaluator (120) weights a first intermediate result of an operation on a first set of a plurality of speech parameters (104) differently than a second intermediate result of an operation on a second set of the plurality of speech parameters (104) in a weighted representation of the plurality of speech parameters (104). The comparator (122) employs the weighted representation of the plurality of speech parameters (104) to determine a vector index to represent the plurality of speech parameters (104). The quantization unit (108), in one example, can employ split vector quantization in conjunction with the weighted representation to determine a vector index to represent the plurality of speech parameters (104).
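The evaluator and comparator roles described in the abstract can be illustrated with a minimal sketch. It assumes the weighted representation is a weighted squared-error distortion with one weight per parameter, which is only one of the forms the claims cover. The function names, the toy codebook, and the weight values are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def weighted_distortion(x: Sequence[float], q: Sequence[float],
                        weights: Sequence[float]) -> float:
    """Evaluator role: sum the per-parameter squared errors (the intermediate
    results), each scaled by its own weight."""
    return sum(w * (xi - qi) ** 2 for xi, qi, w in zip(x, q, weights))


def nearest_index(x: Sequence[float], codebook: List[Sequence[float]],
                  weights: Sequence[float]) -> int:
    """Comparator role: return the index of the codebook entry with the
    smallest weighted distortion."""
    distortions = [weighted_distortion(x, q, weights) for q in codebook]
    return min(range(len(codebook)), key=distortions.__getitem__)


# Hypothetical two-parameter vector and three-entry codebook.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
x = [0.7, 0.6]

equal_index = nearest_index(x, codebook, [1.0, 1.0])    # both errors count equally
unequal_index = nearest_index(x, codebook, [1.0, 4.0])  # second error counts 4x
print(equal_index, unequal_index)
```

With equal weights the search selects entry 1. Weighting the second parameter's error four times as heavily moves the selection to entry 2, which shows how unequal weighting can change the vector index chosen for the same input.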

50 Claims

1. A method of determining a vector index to represent a plurality of speech parameters in signal processing for identifying an utterance, the method comprising the steps of:
- weighting a first intermediate result of an operation on a first set of the plurality of speech parameters differently than a second intermediate result of an operation on a second set of the plurality of speech parameters in a weighted representation of the plurality of speech parameters; and
  
  employing the weighted representation of the plurality of speech parameters to determine the vector index.
  - 2. The method of claim 1 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of selecting the first set of the plurality of speech parameters to comprise a first type of parameter different from a second type of parameter that comprises the second set of the plurality of speech parameters.
  - 3. The method of claim 1 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the steps of:
4. The method of claim 1 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the steps of:
- selecting the weighted representation to comprise a weighted distortion measure, and weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted distortion measure.
5. The method of claim 4 wherein the step of employing the weighted representation to determine the vector index comprises the step of employing the weighted distortion measure to determine the vector index.
6. The method of claim 4 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of selecting the weighted distortion measure to employ a covariance matrix, and wherein the step of employing the weighted representation to determine the vector index comprises the step of employing the covariance matrix in the weighted distortion measure to determine the vector index.
7. The method of claim 4 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of selecting the weighted distortion measure to employ a diagonal inverse variance matrix, and wherein the step of employing the weighted representation to determine the vector index comprises the step of employing the diagonal inverse variance matrix in the weighted distortion measure to determine the vector index.
8. The method of claim 4 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of selecting the weighted distortion measure to employ an empirically determined weight matrix, and wherein the step of employing the weighted representation to determine the vector index comprises the step of employing the empirically determined weight matrix in the weighted distortion measure to determine the vector index.
11. The method of claim 1 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of selecting the first set of the plurality of speech parameters and the second set of the plurality of speech parameters to comprise scalar parameters of the plurality of speech parameters, and wherein the step of employing the weighted representation to determine the vector index comprises the step of determining the vector index to represent the scalar parameters.
12. The method of claim 11 wherein the step of selecting the first set of the plurality of speech parameters and the second set of the plurality of speech parameters to comprise the scalar parameters comprises the step of selecting the first set of the plurality of speech parameters and the second set of the plurality of speech parameters to comprise different representations of speech signal energy, and wherein the step of determining the vector index to represent the scalar parameters comprises the step of determining the vector index to represent the different representations of speech signal energy.
13. The method of claim 12 wherein the step of selecting the first set of the plurality of speech parameters and the second set of the plurality of speech parameters to comprise different representations of speech signal energy comprises the steps of:
- selecting the first set of the plurality of speech parameters to comprise log frame energy, and selecting the second set of the plurality of speech parameters to comprise real cepstrum energy, wherein the step of determining the vector index to represent the different representations of speech signal energy comprises the step of determining the vector index to represent the log frame energy and the real cepstrum energy.
14. The method of claim 1 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of employing weight parameters to weight the first intermediate result and the second intermediate result to determine the weighted representation.
15. The method of claim 14 wherein the step of employing the weight parameters to determine the weighted representation comprises the step of deriving the weight parameters from at least one speech sample employed in a determination of a quantization table employed to determine the vector index.
16. The method of claim 14 wherein the step of employing the weight parameters to determine the weighted representation comprises the step of deriving the weight parameters from at least one speech sample that serves as a basis for pattern recognition of a signal based on the vector index.
17. The method of claim 14 wherein the step of employing the weight parameters to determine the weighted representation comprises the step of deriving the weight parameters from a first speech sample different from a second speech sample employed in a determination of a quantization table employed to determine the vector index and different from a third speech sample serving as a basis for pattern recognition of a signal based on the vector index.
18. The method of claim 14 wherein the step of employing the weight parameters to determine the weighted representation comprises the step of empirically determining the weight parameters.
19. The method of claim 1 wherein the vector index comprises a first vector index, wherein the plurality of speech parameters comprise a first plurality of speech parameters, wherein the weighted representation comprises a first weighted representation, in combination with a method of determining a second vector index to represent a second plurality of speech parameters in signal processing for identifying an utterance, further comprising the step of determining the first plurality of speech parameters and the second plurality of speech parameters based on a same speech input.
20. The method of claim 19 further comprising the steps of:
- weighting a first intermediate result of an operation on a first set of the second plurality of speech parameters differently than a second intermediate result of an operation on a second set of the second plurality of speech parameters in a second weighted representation of the second plurality of speech parameters, and employing the second weighted representation of the second plurality of speech parameters to determine the second vector index.
21. The method of claim 20 wherein the step of weighting the first intermediate result of the operation on the first set of the first plurality of speech parameters differently than the second intermediate result of the operation on the second set of the first plurality of speech parameters in the first weighted representation and the step of employing the first weighted representation to determine the first vector index comprises the step of employing a first distortion measure to determine the first vector index, and wherein the step of weighting the first intermediate result of the operation on the first set of the second plurality of speech parameters differently than the second intermediate result of the operation on the second set of the second plurality of speech parameters in the second weighted representation and the step of employing the second weighted representation to determine the second vector index comprise the step of employing a second distortion measure different from the first distortion measure to determine the second vector index.
22. The method of claim 20 in combination with a method of vector quantization further comprising the steps of:
- employing a codebook to quantize the first plurality of speech parameters to determine the first vector index, and employing a codebook to quantize the second plurality of speech parameters to determine the second vector index.
23. The method of claim 22 wherein the step of employing the codebook to quantize the first plurality of speech parameters to determine the first vector index comprises the step of employing a first codebook to quantize the first plurality of speech parameters to determine the first vector index, and wherein the step of employing the codebook to quantize the second plurality of speech parameters to determine the second vector index comprises the step of employing a second codebook different from the first codebook to quantize the second plurality of speech parameters to determine the second vector index.
24. The method of claim 22 wherein the step of employing the codebook to quantize the first plurality of speech parameters to determine the first vector index and the step of employing the codebook to quantize the second plurality of speech parameters to determine the second vector index comprise the step of employing split vector quantization to quantize the first plurality of speech parameters to determine the first vector index and quantize the second plurality of speech parameters to determine the second vector index.
25. The method of claim 1 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of causing an increased effect of the first intermediate result in the weighted representation.

9. The method of claim 4 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of selecting the weighted distortion measure to employ a weight matrix scaled such that at least one matrix element is equal to one, and wherein the step of employing the weighted representation to determine the vector index comprises the step of employing the weight matrix in the weighted distortion measure to determine the vector index.

10. The method of claim 4 wherein the step of weighting the first intermediate result of the operation on the first set of the plurality of speech parameters differently than the second intermediate result of the operation on the second set of the plurality of speech parameters in the weighted representation comprises the step of selecting the weighted distortion measure to comprise a weight matrix that is symmetric, and wherein the step of employing the weighted representation to determine the vector index comprises the step of employing the weight matrix in the weighted distortion measure to determine the vector index.
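Several of the method claims above name specific weight matrices: a diagonal inverse variance matrix, a matrix scaled so that at least one element equals one, and weights derived from speech samples. The following sketch shows one plausible way such weights might be computed. It assumes the diagonal is stored as a plain list, that the variances are estimated from a small set of training vectors, and that every parameter has non-zero variance. The helper names and the training data are hypothetical.

```python
from statistics import pvariance
from typing import List, Sequence


def inverse_variance_weights(samples: Sequence[Sequence[float]]) -> List[float]:
    """Return the diagonal of an inverse variance weight matrix: one weight
    per parameter, equal to 1 / (population variance of that parameter).
    Assumes every parameter varies across the samples (non-zero variance)."""
    columns = list(zip(*samples))  # transpose: one tuple per parameter
    return [1.0 / pvariance(col) for col in columns]


def scale_to_unit_element(weights: Sequence[float], ref: int = 0) -> List[float]:
    """Rescale the weights so that weights[ref] equals one. A common positive
    scale factor does not change which codebook entry has the smallest
    weighted distortion."""
    return [w / weights[ref] for w in weights]


# Hypothetical training vectors with two parameters each.
training = [[0.0, 10.0], [2.0, 14.0], [4.0, 18.0], [2.0, 14.0]]

raw = inverse_variance_weights(training)  # variances are 2.0 and 8.0
scaled = scale_to_unit_element(raw)
print(raw, scaled)
```

Here the second parameter has four times the variance of the first, so after scaling its squared error is weighted by 0.25 relative to a weight of one on the first parameter. A diagonal matrix is also symmetric, so this sketch is consistent with the symmetric weight matrix of claim 10, although a full (non-diagonal) symmetric matrix would need a different representation.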

26. A system used in determining a vector index to represent a plurality of speech parameters in signal processing for identifying an utterance, the system comprising:
- an evaluator that weights a first intermediate result of an operation on a first set of the plurality of speech parameters differently than a second intermediate result of an operation on a second set of the plurality of speech parameters in a weighted representation of the plurality of speech parameters; and
  
  a comparator that employs the weighted representation of the plurality of speech parameters to determine the vector index.
  - 27. The system of claim 26 wherein the first set of the plurality of speech parameters comprises a first type of parameter different from a second type of parameter that comprises the second set of the plurality of speech parameters.
  - 28. The system of claim 26 wherein the first set of the plurality of speech parameters comprises a vector speech parameter of the plurality of speech parameters, and wherein the second set of the plurality of speech parameters comprises a scalar speech parameter of the plurality of speech parameters.
  - 29. The system of claim 26 wherein the weighted representation comprises a weighted distortion measure.
  - 30. The system of claim 29 wherein the evaluator weights the first intermediate result differently than the second intermediate result in the weighted distortion measure, and wherein the comparator employs the weighted distortion measure to determine the vector index.
  - 31. The system of claim 29 wherein the weighted distortion measure employs a covariance matrix.
  - 32. The system of claim 29 wherein the weighted distortion measure employs a diagonal inverse variance matrix.
  - 33. The system of claim 29 wherein the weighted distortion measure employs an empirically determined weight matrix.
  - 34. The system of claim 29 wherein the weighted distortion measure employs a weight matrix scaled such that at least one matrix element is equal to one.
  - 35. The system of claim 29 wherein the weighted distortion measure comprises a weight matrix that is symmetric.
  - 36. The system of claim 26 wherein the first set of the plurality of speech parameters and the second set of the plurality of speech parameters comprise scalar parameters of the plurality of speech parameters.
  - 37. The system of claim 36 wherein the scalar parameters comprise different representations of speech signal energy.
  - 38. The system of claim 37 wherein the first set of the plurality of speech parameters comprises log frame energy, and wherein the second set of the plurality of speech parameters comprises real cepstrum energy.
  - 39. The system of claim 26 wherein the evaluator employs weight parameters to weight the first intermediate result and the second intermediate result to determine the weighted representation.
  - 40. The system of claim 39 wherein the weight parameters are derived from at least one speech sample employed in a determination of a quantization table employed to determine the vector index.
  - 41. The system of claim 39 wherein the weight parameters are derived from at least one speech sample that serves as a basis for pattern recognition of a signal based on the vector index.
  - 42. The system of claim 39 wherein the weight parameters are derived from a first speech sample different from a second speech sample employed in a determination of a quantization table employed to determine the vector index and different from a third speech sample that serves as a basis for pattern recognition of a signal based on the vector index.
  - 43. The system of claim 39 wherein the weight parameters comprise empirically derived weight parameters.
  - 44. The system of claim 26 wherein the vector index comprises a first vector index, wherein the plurality of speech parameters comprise a first plurality of speech parameters, wherein the weighted representation comprises a first weighted representation, and further comprising an extractor that determines the first plurality of speech parameters and a second plurality of speech parameters based on a same speech input.
  - 45. The system of claim 44 wherein the evaluator weights a first intermediate result of an operation on a first set of the second plurality of speech parameters differently than a second intermediate result of an operation on a second set of the second plurality of speech parameters in a second weighted representation of the second plurality of speech parameters, and wherein the comparator employs the second weighted representation of the second plurality of speech parameters to determine a second vector index.
  - 46. The system of claim 45 wherein the evaluator and the comparator comprise a quantization unit that employs a first distortion measure to determine the first vector index, and wherein the quantization unit employs a second distortion measure different from the first distortion measure to determine the second vector index.
  - 47. The system of claim 45 wherein the quantization unit employs a codebook to quantize the first plurality of speech parameters to determine the first vector index, and wherein the quantization unit employs a codebook to quantize the second plurality of speech parameters to determine the second vector index.
  - 48. The system of claim 47 wherein the quantization unit employs split vector quantization to quantize the first plurality of speech parameters to determine the first vector index and quantize the second plurality of speech parameters to determine the second vector index.
  - 49. The system of claim 45 wherein the quantization unit employs a first codebook to quantize the first plurality of speech parameters to determine the first vector index, and wherein the quantization unit employs a second codebook different from the first codebook to quantize the second plurality of speech parameters to determine the second vector index.
  - 50. The system of claim 26 wherein the evaluator increases an effect of the first intermediate result in the weighted representation of the plurality of speech parameters.
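Claims 44 through 49 describe a quantization unit that derives two pluralities of speech parameters from the same input and quantizes each with its own codebook and, optionally, its own distortion measure, using split vector quantization. A minimal sketch of that arrangement follows. It assumes each subvector is a contiguous slice of one feature vector and that each distortion measure is a weighted squared error. The feature layout, codebooks, and weights are hypothetical placeholders rather than values from the patent.

```python
from typing import List, Sequence, Tuple

# One split: (slice of the feature vector, codebook, per-parameter weights).
Split = Tuple[slice, List[Sequence[float]], Sequence[float]]


def weighted_distortion(x: Sequence[float], q: Sequence[float],
                        weights: Sequence[float]) -> float:
    """Weighted sum of squared per-parameter errors."""
    return sum(w * (xi - qi) ** 2 for xi, qi, w in zip(x, q, weights))


def nearest_index(x: Sequence[float], codebook: List[Sequence[float]],
                  weights: Sequence[float]) -> int:
    """Index of the codebook entry with the smallest weighted distortion."""
    distortions = [weighted_distortion(x, q, weights) for q in codebook]
    return min(range(len(codebook)), key=distortions.__getitem__)


def split_vq(features: Sequence[float], splits: Sequence[Split]) -> List[int]:
    """Quantize each subvector independently and return one index per split."""
    return [nearest_index(features[sl], cb, w) for sl, cb, w in splits]


# Hypothetical layout: two cepstral coefficients followed by two energy terms.
features = [0.9, 0.1, 0.7, 0.6]
cepstral_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
energy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

indices = split_vq(features, [
    (slice(0, 2), cepstral_codebook, [1.0, 1.0]),  # first distortion measure
    (slice(2, 4), energy_codebook, [1.0, 4.0]),    # second, unequally weighted
])
print(indices)
```

Each subvector yields its own index, so the full feature vector is represented by a short list of indices rather than by a single index into one large codebook. The two codebooks are given the same entries here only to keep the example small.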

Current Assignee: Google Technology Holdings LLC (Alphabet Inc.)
Original Assignee: Motorola, Inc. (Motorola Solutions, Inc.)
Inventors: Meunier, Jeffrey A.; Pearce, David John; Kushner, William M.
Primary Examiner(s): Šmits, Tālivaldis Ivars
Application Number: US09/417,371
Time in Patent Office: 944 Days
Field of Search: 704/221, 704/222, 704/251
US Class Current: 704/222
CPC Class Codes:
G10L 15/02 Feature extraction for spee...
G10L 15/10 using distance or distortio...
