Method and apparatus for rapid adapt via cumulative distribution function matching for continuous speech

US 6,470,314 B1
Filed: 04/06/2000
Issued: 10/22/2002
Est. Priority Date: 04/06/2000
Status: Expired due to Fees

Abstract

A method of adapting a speech recognition system to one or more acoustic conditions comprises the steps of: (i) computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system; (ii) computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system; (iii) computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and (iv) applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data.
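
Claim 27 of this patent writes the mapping as (F_T)^−1 F_A. The short derivation below is a sketch of why a mapping of that form pulls test features toward the training distribution. It assumes the per-dimension cumulative distribution functions are continuous and strictly increasing, which the abstract does not state. The symbols X_A (a scalar test feature) and Y (its transformed value) are introduced here for illustration only.

```latex
\begin{align*}
\phi(x) &= F_T^{-1}\bigl(F_A(x)\bigr), \qquad Y = \phi(X_A), \\
P(Y \le y) &= P\bigl(F_A(X_A) \le F_T(y)\bigr) = F_T(y).
\end{align*}
```

The last equality holds because, under the stated assumptions, F_A(X_A) is uniformly distributed on [0, 1], so the transformed feature Y has the training CDF F_T.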

26 Citations

37 Claims

1. A method of adapting a speech recognition system to one or more acoustic conditions, the method comprising the steps of:
- computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data.
- Dependent claims:
  - 2. The method of claim 1, wherein the step of computing the cumulative distribution functions associated with the training speech data further comprises the step of determining, for each dimension, a maximum value and a minimum value across the training speech data.
  - 3. The method of claim 1, further comprising the step of performing one or more model based adaptation techniques in accordance with the speech recognition system.
  - 4. The method of claim 1, further comprising the step of performing one or more other feature based transformation adaptation techniques in accordance with the speech recognition system.
  - 5. The method of claim 1, wherein the steps of computing the cumulative distribution functions comprise using a nonparametric histogram approach when sufficient adaptation data is available.
  - 6. The method of claim 1, wherein the steps of computing the cumulative distribution functions comprise using a parametric density form when insufficient adaptation data is available.
  - 7. The method of claim 1, wherein the nonlinear transformation mapping is applied to one of each dimension and multiple dimensions of each speech vector associated with the test speech data.
  - 8. The method of claim 1, wherein the speech recognition system is a continuous speech recognition system.
  - 9. The method of claim 1, wherein the cumulative distribution functions are piece-wise linear function approximations.
  - 10. The method of claim 1, wherein the speech recognition system is associated with a telephony application.
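
Claim 6 mentions a parametric density form for cases with little adaptation data but does not name a specific density. The following is a minimal sketch assuming a single Gaussian per feature dimension; the function names and the sample values are hypothetical and not taken from the patent.

```python
from statistics import NormalDist, fmean, stdev


def fit_gaussian(samples):
    """Fit a one-dimensional Gaussian (an assumed parametric form) to samples."""
    return NormalDist(mu=fmean(samples), sigma=stdev(samples))


def gaussian_cdf_match(x, train_dist, test_dist):
    """Map a test value x through F_T^{-1}(F_A(x)) using Gaussian CDFs."""
    p = test_dist.cdf(x)
    # Clamp away from 0 and 1, where inv_cdf is undefined.
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return train_dist.inv_cdf(p)


train_dim = [0.8, 1.1, 0.9, 1.3, 1.0, 0.9]   # made-up training values
test_dim = [2.9, 3.4, 3.1, 3.6, 3.0]         # made-up test values

F_T = fit_gaussian(train_dim)
F_A = fit_gaussian(test_dim)
mapped = [gaussian_cdf_match(x, F_T, F_A) for x in test_dim]
```

Under this Gaussian assumption the mapping reduces to the linear form mu_T + sigma_T (x - mu_A) / sigma_A; the nonlinearity named in the claims appears once the CDFs are estimated nonparametrically, for example from histograms.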

11. Apparatus for adapting a speech recognition system to one or more acoustic conditions, the apparatus comprising:
- at least one processing device operative to:
  
  (i) compute cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  (ii) compute cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  (iii) compute a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  (iv) apply the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data.
- Dependent claims:
  - 12. The apparatus of claim 11, wherein the operation of computing the cumulative distribution functions associated with the training speech data further comprises the operation of determining, for each dimension, a maximum value and a minimum value across the training speech data.
  - 13. The apparatus of claim 11, wherein the at least one processing device is further operative to perform one or more model based adaptation techniques in accordance with the speech recognition system.
  - 14. The apparatus of claim 11, wherein the at least one processing device is further operative to perform one or more other feature based transformation adaptation techniques in accordance with the speech recognition system.
  - 15. The apparatus of claim 11, wherein the operations of computing the cumulative distribution functions comprise using a nonparametric histogram approach when sufficient adaptation data is available.
  - 16. The apparatus of claim 11, wherein the operations of computing the cumulative distribution functions comprise using a parametric density form when insufficient adaptation data is available.
  - 17. The apparatus of claim 11, wherein the nonlinear transformation mapping is applied to one of each dimension and multiple dimensions of each speech vector associated with the test speech data.
  - 18. The apparatus of claim 11, wherein the speech recognition system is a continuous speech recognition system.
  - 19. The apparatus of claim 11, wherein the cumulative distribution functions are piece-wise linear function approximations.
  - 20. The apparatus of claim 11, wherein the speech recognition system is associated with a telephony application.
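
Claims 9 and 19 describe the cumulative distribution functions as piece-wise linear approximations. As a hedged illustration, the sketch below evaluates such a CDF and its inverse by linear interpolation between knots. The helper name `interp` and the knot values are hypothetical; the inverse lookup simply swaps the roles of the two knot lists.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Piecewise-linear interpolation through (xs[i], ys[i]); xs must be ascending.

    Values outside [xs[0], xs[-1]] are clamped to the end points.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1          # xs[i] <= x < xs[i + 1]
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


# Hypothetical CDF knots for one feature dimension.
knots_x = [-2.0, 0.0, 1.0, 4.0]
knots_f = [0.0, 0.5, 0.8, 1.0]

cdf_at_half = interp(0.5, knots_x, knots_f)      # evaluate F(0.5)
quantile_09 = interp(0.9, knots_f, knots_x)      # evaluate F^{-1}(0.9)
```

Storing only the knots keeps each per-dimension CDF small, and the same routine serves both F and its inverse.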

21. An article of manufacture for use in adapting a speech recognition system to one or more acoustic conditions, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
- computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data.
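
The four steps recited above can be sketched end to end as a small program. This sketch substitutes rank-based empirical CDFs over sorted samples for the histogram construction described in other claims, purely to keep the example short; all function names and data values are hypothetical.

```python
from bisect import bisect_right


def match_dimension(train_sorted, test_sorted, x):
    """Return F_T^{-1}(F_A(x)) using empirical (sorted-sample) CDFs."""
    n_train, n_test = len(train_sorted), len(test_sorted)
    rank = bisect_right(test_sorted, x)       # F_A(x) = rank / n_test
    # Smallest i with (i + 1) / n_train >= rank / n_test, in integer arithmetic.
    idx = -(-rank * n_train // n_test) - 1
    return train_sorted[min(max(idx, 0), n_train - 1)]


def adapt_vectors(train_vectors, test_vectors):
    """Apply the per-dimension mapping to every test vector before recognition."""
    dims = len(train_vectors[0])
    train_cols = [sorted(v[d] for v in train_vectors) for d in range(dims)]
    test_cols = [sorted(v[d] for v in test_vectors) for d in range(dims)]
    return [
        [match_dimension(train_cols[d], test_cols[d], v[d]) for d in range(dims)]
        for v in test_vectors
    ]


# Made-up two-dimensional feature vectors, purely for illustration.
train = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]]
test = [[5.0, 1.0], [7.0, 3.0], [6.0, 2.0]]
adapted = adapt_vectors(train, test)
```

Each dimension is treated independently here, which corresponds to the "each dimension" alternative in claims 7 and 17.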

22. A method of adapting a speech recognition system to one or more acoustic conditions, the method comprising the steps of:
- computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data;
  
  wherein the step of computing the cumulative distribution functions associated with the training speech data further comprises the steps of determining, for each dimension, a maximum value and a minimum value across the training speech data, and uniformly dividing a range associated with the minimum value and the maximum value into non-overlapping intervals.
- Dependent claims:
  - 23. The method of claim 22, wherein the step of computing the cumulative distribution functions associated with the training speech data further comprises the step of constructing a histogram on each interval using the training speech data.
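
The construction in claims 22 and 23 (minimum and maximum per dimension, uniform non-overlapping intervals, a histogram on each interval) can be sketched as follows for a single feature dimension. The function name, the bin count, and the sample values are assumptions made for illustration; the claims do not specify how many intervals to use.

```python
def histogram_cdf(samples, num_bins):
    """Return (edges, cdf) for one feature dimension.

    edges has num_bins + 1 uniformly spaced points from min to max;
    cdf[i] is the fraction of samples falling in the first i intervals.
    """
    lo, hi = min(samples), max(samples)
    if hi == lo:
        raise ValueError("need at least two distinct sample values")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for s in samples:
        # The maximum value is assigned to the last interval.
        idx = min(int((s - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    cdf = [0.0]
    running = 0
    for c in counts:
        running += c
        cdf.append(running / len(samples))
    return edges, cdf


# Made-up values for one dimension of the training speech vectors.
edges, cdf = histogram_cdf([0.0, 1.0, 1.5, 2.0, 3.5, 4.0], num_bins=4)
```

The returned edge and CDF lists can be read as the knots of a piece-wise linear CDF, and the same routine would apply unchanged to test data.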

24. A method of adapting a speech recognition system to one or more acoustic conditions, the method comprising the steps of:
- computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data;
  
  wherein the step of computing the cumulative distribution functions associated with the test speech data further comprises the step of determining, for each dimension, a maximum value and a minimum value across the test speech data.
- Dependent claims:
  - 25. The method of claim 24, wherein the step of computing the cumulative distribution functions associated with the test speech data further comprises the step of uniformly dividing a range associated with the minimum value and the maximum value into non-overlapping intervals.
  - 26. The method of claim 25, wherein the step of computing the cumulative distribution functions associated with the test speech data further comprises the step of constructing a histogram on each interval using the test speech data.

27. A method of adapting a speech recognition system to one or more acoustic conditions, the method comprising the steps of:
- computing cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  computing cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  computing a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  applying the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data;
  
  wherein the nonlinear transformation mapping is represented as $(F_T)^{-1} F_A$, where $F_T$ represents the cumulative distribution functions associated with the training speech data and $F_A$ represents the cumulative distribution functions associated with the test speech data.
- Dependent claims:
  - 28. The method of claim 27, wherein the step of computing the nonlinear transformation mapping comprises the steps of constructing a table represented as $(x_T^k, f_T^k)$ and $(x_A^k, f_A^k)$, wherein $x_T$ are training samples, $x_A$ are adaptation samples, $f_T$ are training functions, $f_A$ are adaptation functions and $k$ refers to a $k^{\text{th}}$ increment, such that $f_A^k$ and $f_T^k$ are equally spaced values in $[0, 1]$, and constructing a table $(x_A^k, x_T^k)$.
  - 29. The method of claim 28, wherein, for a given value $x$, a binary search is conducted on $x_A^k$ to obtain the corresponding value $x_T^k$ as the nonlinear transformation mapping $\phi(x)$.
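
The table construction and binary-search lookup of claims 28 and 29 can be sketched as below. The sketch assumes each CDF is available as piece-wise linear knots (`edges`, `cdf`) ending at 1.0, and that the lookup returns the first table entry at or above the query; the claims do not say how values between table entries are handled, so interpolating between neighbours would be an equally plausible refinement. All names and numbers are hypothetical.

```python
from bisect import bisect_left


def quantiles(edges, cdf, num_steps):
    """Invert a piecewise-linear CDF at f^k = k / num_steps, k = 0..num_steps."""
    out = []
    j = 0
    for k in range(num_steps + 1):
        f = k / num_steps
        # Advance to the segment whose CDF range contains f.
        while j < len(cdf) - 2 and cdf[j + 1] < f:
            j += 1
        rise = cdf[j + 1] - cdf[j]
        t = 0.0 if rise == 0 else (f - cdf[j]) / rise
        out.append(edges[j] + t * (edges[j + 1] - edges[j]))
    return out


def build_table(train_edges, train_cdf, test_edges, test_cdf, num_steps):
    """Pair x_A^k with x_T^k at the same equally spaced CDF values."""
    x_T = quantiles(train_edges, train_cdf, num_steps)
    x_A = quantiles(test_edges, test_cdf, num_steps)
    return x_A, x_T


def phi(x, x_A, x_T):
    """Binary-search x_A for x and return the paired x_T entry."""
    k = min(bisect_left(x_A, x), len(x_A) - 1)
    return x_T[k]


# Made-up knots for one dimension of training and test data.
x_A, x_T = build_table([0.0, 2.0, 4.0], [0.0, 0.5, 1.0],
                       [10.0, 11.0, 14.0], [0.0, 0.5, 1.0], num_steps=4)
mapped = phi(11.0, x_A, x_T)
```

Because the table is sorted on x_A, each lookup costs O(log K) for K table entries, which fits the rapid adaptation aim stated in the title.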

30. Apparatus for adapting a speech recognition system to one or more acoustic conditions, the apparatus comprising:
- at least one processing device operative to:
  
  (i) compute cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  (ii) compute cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  (iii) compute a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  (iv) apply the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data;
  
  wherein the operation of computing the cumulative distribution functions associated with the training speech data further comprises the operations of determining, for each dimension, a maximum value and a minimum value across the training speech data, and uniformly dividing a range associated with the minimum value and the maximum value into non-overlapping intervals.
- Dependent claims:
  - 31. The apparatus of claim 30, wherein the operation of computing the cumulative distribution functions associated with the training speech data further comprises the operation of constructing a histogram on each interval using the training speech data.

32. Apparatus for adapting a speech recognition system to one or more acoustic conditions, the apparatus comprising:
- at least one processing device operative to:
  
  (i) compute cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  (ii) compute cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  (iii) compute a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  (iv) apply the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data;
  
  wherein the operation of computing the cumulative distribution functions associated with the test speech data further comprises the operation of determining, for each dimension, a maximum value and a minimum value across the test speech data.
- Dependent claims:
  - 33. The apparatus of claim 32, wherein the operation of computing the cumulative distribution functions associated with the test speech data further comprises the operation of uniformly dividing a range associated with the minimum value and the maximum value into non-overlapping intervals.
  - 34. The apparatus of claim 33, wherein the operation of computing the cumulative distribution functions associated with the test speech data further comprises the operation of constructing a histogram on each interval using the test speech data.

35. Apparatus for adapting a speech recognition system to one or more acoustic conditions, the apparatus comprising:
- at least one processing device operative to:
  
  (i) compute cumulative distribution functions based on dimensions of speech vectors associated with training speech data provided to the speech recognition system;
  
  (ii) compute cumulative distribution functions based on dimensions of speech vectors associated with test speech data provided to the speech recognition system;
  
  (iii) compute a nonlinear transformation mapping based on the cumulative distribution functions associated with the training speech data and the cumulative distribution functions associated with the test speech data; and
  
  (iv) apply the nonlinear transformation mapping to speech vectors associated with the test speech data prior to recognition, wherein the speech vectors transformed in accordance with the nonlinear transformation mapping are substantially similar to speech vectors associated with the training speech data;
  
  wherein the nonlinear transformation mapping is represented as $(F_T)^{-1} F_A$, where $F_T$ represents the cumulative distribution functions associated with the training speech data and $F_A$ represents the cumulative distribution functions associated with the test speech data.
- Dependent claims:
  - 36. The apparatus of claim 35, wherein the operation of computing the nonlinear transformation mapping comprises the operations of constructing a table represented as $(x_T^k, f_T^k)$ and $(x_A^k, f_A^k)$, wherein $x_T$ are training samples, $x_A$ are adaptation samples, $f_T$ are training functions, $f_A$ are adaptation functions and $k$ refers to a $k^{\text{th}}$ increment, such that $f_A^k$ and $f_T^k$ are equally spaced values in $[0, 1]$, and constructing a table $(x_A^k, x_T^k)$.
  - 37. The apparatus of claim 36, wherein, for a given value $x$, a binary search is conducted on $x_A^k$ to obtain the corresponding value $x_T^k$ as the nonlinear transformation mapping $\phi(x)$.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Dharanipragada, Satyanarayana; Padmanabhan, Mukund
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Nolan, Daniel A
Application Number: US09/543,794
Time in Patent Office: 929 Days
Field of Search: 704/230-233, 704/254-256, 704/9, 704/246, 704/239, 704/243
US Class Current: 704/231
CPC Class Codes: G10L 15/07 (adaptation to the speaker)
