System and method of mu-law or A-law compression of bark amplitudes for speech recognition

US 6,694,294 B1
Filed: 10/31/2000
Issued: 02/17/2004
Est. Priority Date: 10/31/2000
Status: Active Grant

Abstract

A method and system improve voice recognition by improving the voice recognizer of a voice recognition system. Mu-law compression of bark amplitudes is used to reduce the effect of additive noise and thus improve the accuracy of the voice recognition system. A-law compression of bark amplitudes is used to improve the accuracy of the voice recognizer. Both mu-law compression and mu-law expansion can be used in the voice recognizer to improve the accuracy of the voice recognizer. Both A-law compression and A-law expansion can be used in the voice recognizer to improve the accuracy of the voice recognizer.

54 Claims

1. A voice recognizer of a distributed voice recognition system, comprising:
- a bark amplitude generation module configured to convert a digitized speech signal to bark amplitudes;
  
  a mu-log compression module coupled to the bark amplitude generation module, the mu-log compression module configured to perform mu-log compression of the bark amplitudes;
  
  a RASTA filtering module coupled to the mu-log compression module, the RASTA filtering module configured to RASTA filter the mu-log bark amplitudes; and
  
  a cepstral transformation module coupled to the RASTA filtering module, the cepstral transformation module configured to generate j static cepstral coefficients and j dynamic cepstral coefficients.
- - 2. The voice recognizer of claim 1 further comprising a backend configured to process the j static cepstral coefficients and j dynamic cepstral coefficients and produce a recognition hypothesis.
  - 3. The voice recognizer of claim 1, wherein the mu-log compression is G.711 mu-log compression.
  - 4. The voice recognizer of claim 1, wherein the bark amplitude generation module is configured to convert a digitized speech signal to k bark amplitudes once every T milliseconds.
  - 5. The voice recognizer of claim 4, wherein the cepstral transformation module is configured to generate j static cepstral coefficients and j dynamic cepstral coefficients every T milliseconds.
  - 6. The voice recognizer of claim 5, wherein T equals 10.
  - 7. The voice recognizer of claim 4, wherein k equals 16.
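
The compression step of claim 1 can be illustrated with the continuous mu-law curve. This is a minimal sketch, not the patented implementation: claim 3 names G.711, which is an 8-bit quantized form of the curve, whereas the code below uses the unquantized formula with mu = 255. The function names and the `peak` normalization constant are hypothetical; the claims do not say how bark amplitudes are scaled before compression.

```python
import math

MU = 255.0  # mu value associated with G.711; assumed for this sketch


def mu_law_compress(x, mu=MU):
    """Continuous mu-law curve for x in [-1, 1]: sgn(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def compress_bark_frame(bark, peak):
    """Compress one frame of non-negative bark amplitudes.

    `peak` is a hypothetical full-scale value used to normalize into [0, 1];
    amplitudes above it are clipped to 1.0.
    """
    return [mu_law_compress(min(b / peak, 1.0)) for b in bark]


frame = [0.0, 10.0, 100.0, 1000.0]  # illustrative values only
print(compress_bark_frame(frame, peak=1000.0))
```

With this curve, an amplitude at 1% of full scale maps to roughly 0.23, so small values are expanded relative to large ones.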

8. A voice recognizer of a distributed voice recognition system, comprising:
- a bark amplitude generation module configured to convert a digitized speech signal to bark amplitudes;
  
  an A-log compression module coupled to the bark amplitude generation module, the A-log compression module configured to perform A-log compression of the bark amplitudes;
  
  a RASTA filtering module coupled to the A-log compression module, the RASTA filtering module configured to RASTA filter the A-log bark amplitudes; and
  
  a cepstral transformation module coupled to the RASTA filtering module, the cepstral transformation module configured to generate j static cepstral coefficients and j dynamic cepstral coefficients.
- - 9. The voice recognizer of claim 8 further comprising a backend configured to process the j static cepstral coefficients and j dynamic cepstral coefficients and produce a recognition hypothesis.
  - 10. The voice recognizer of claim 8, wherein the A-log compression is G.711 A-log compression.
  - 11. The voice recognizer of claim 8, wherein the bark amplitude generation module is configured to convert a digitized speech signal to k bark amplitudes once every T milliseconds.
  - 12. The voice recognizer of claim 11, wherein the cepstral transformation module is configured to generate j static cepstral coefficients and j dynamic cepstral coefficients every T milliseconds.
  - 13. The voice recognizer of claim 12, wherein T equals 10.
  - 14. The voice recognizer of claim 11, wherein k equals 16.
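
The A-log compression of claim 8 can be sketched with the continuous A-law curve, which is linear near zero and logarithmic above 1/A. As an assumption, the code uses A = 87.6, the value associated with G.711 A-law (claim 10), and omits the 8-bit quantization that G.711 itself applies. The function name is hypothetical and inputs are assumed to be normalized to [-1, 1].

```python
import math

A = 87.6  # A value associated with G.711 A-law; assumed for this sketch


def a_law_compress(x, a=A):
    """Continuous A-law curve for x in [-1, 1]."""
    ax = abs(x)
    denom = 1.0 + math.log(a)
    if ax < 1.0 / a:
        y = a * ax / denom                    # linear segment near zero
    else:
        y = (1.0 + math.log(a * ax)) / denom  # logarithmic segment
    return math.copysign(y, x)
```

The two segments meet at |x| = 1/A, where both expressions evaluate to 1 / (1 + ln A).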

15. A voice recognizer of a distributed voice recognition system, comprising:
- a bark amplitude generation module configured to convert a digitized speech signal to bark amplitudes;
  
  a mu-log compression module coupled to the bark amplitude generation module, the mu-log compression module configured to perform mu-log compression of the bark amplitudes;
  
  a RASTA filtering module coupled to the mu-log compression module, the RASTA filtering module configured to RASTA filter the mu-log bark amplitudes; and
  
  a mu-log expansion module coupled to the RASTA filtering module, the mu-log expansion module configured to perform mu-log expansion of the filtered mu-log bark amplitudes.
- - 16. The voice recognizer of claim 15 further comprising a backend configured to process the expanded bark amplitudes and produce a recognition hypothesis.
  - 17. The voice recognizer of claim 15, wherein the mu-log compression and expansion is G.711 mu-log compression and expansion.
  - 18. The voice recognizer of claim 15, wherein the bark amplitude generation module is configured to convert a digitized speech signal to k bark amplitudes once every T milliseconds.
  - 19. The voice recognizer of claim 18, wherein the mu-log expansion module is configured to expand the filtered mu-log bark amplitudes into k expanded bark amplitudes.
  - 20. The voice recognizer of claim 19, wherein T equals 10.
  - 21. The voice recognizer of claim 18, wherein k equals 16.
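
The compress, filter, expand chain of claim 15 can be sketched for a single bark band as follows. Several things here are assumptions rather than details from the claims: the continuous mu-law formulas stand in for G.711, the filter taps and the 0.98 pole follow the commonly published RASTA band-pass filter, samples before the start of the track are treated as zero, and all function names are hypothetical.

```python
import math

MU = 255.0  # assumed mu value


def mu_law_compress(x, mu=MU):
    """sgn(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress: sgn(y) * ((1 + mu)^|y| - 1) / mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def rasta_filter(track, pole=0.98):
    """Band-pass filter the time trajectory of one band.

    Difference equation (assumed, from the commonly published RASTA filter):
    y[n] = pole*y[n-1] + 0.1*(2x[n] + x[n-1] - x[n-3] - 2x[n-4])
    """
    padded = [0.0] * 4 + list(track)  # zero history before the first frame
    out, prev = [], 0.0
    for n in range(4, len(padded)):
        fir = 0.1 * (2 * padded[n] + padded[n - 1] - padded[n - 3] - 2 * padded[n - 4])
        prev = pole * prev + fir
        out.append(prev)
    return out


def process_band(amplitudes):
    """Compress, RASTA-filter, then expand one band (inputs assumed in [0, 1])."""
    compressed = [mu_law_compress(a) for a in amplitudes]
    return [mu_law_expand(v) for v in rasta_filter(compressed)]
```

Because the filter taps sum to zero, a band whose amplitude stays constant decays toward zero at the output, so only changes over time survive the filtering stage in this sketch.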

22. A voice recognizer of a distributed voice recognition system, comprising:
- a bark amplitude generation module configured to convert a digitized speech signal to bark amplitudes;
  
  an A-log compression module coupled to the bark amplitude generation module, the A-log compression module configured to perform A-log compression of the bark amplitudes;
  
  a RASTA filtering module coupled to the A-log compression module, the RASTA filtering module configured to RASTA filter the A-log bark amplitudes; and
  
  an A-log expansion module coupled to the RASTA filtering module, the A-log expansion module configured to perform A-log expansion of the filtered A-log bark amplitudes.
- - 23. The voice recognizer of claim 22 further comprising a backend configured to process the expanded bark amplitudes and produce a recognition hypothesis.
  - 24. The voice recognizer of claim 22, wherein the A-log compression and expansion is G.711 A-log compression and expansion.
  - 25. The voice recognizer of claim 22, wherein the bark amplitude generation module is configured to convert a digitized speech signal to k bark amplitudes once every T milliseconds.
  - 26. The voice recognizer of claim 25, wherein the A-log expansion module is configured to expand the filtered A-log bark amplitudes into k expanded bark amplitudes.
  - 27. The voice recognizer of claim 25, wherein k equals 16.
  - 28. The voice recognizer of claim 27, wherein T equals 10.
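
The A-log expansion of claim 22 is the inverse of the A-law curve. The sketch below assumes the continuous (unquantized) form with A = 87.6 rather than the 8-bit G.711 tables named in claim 24, and the function name is hypothetical.

```python
import math

A = 87.6  # assumed A value


def a_law_expand(y, a=A):
    """Inverse of the continuous A-law curve for y in [-1, 1]."""
    ay = abs(y)
    scale = 1.0 + math.log(a)
    if ay < 1.0 / scale:
        x = ay * scale / a                  # undo the linear segment
    else:
        x = math.exp(ay * scale - 1.0) / a  # undo the logarithmic segment
    return math.copysign(x, y)
```

A compressed value of 1 / (1 + ln A) expands back to 1/A, the point where the linear and logarithmic segments meet. The claims shown here do not say how RASTA-filtered values outside [-1, 1] are handled before expansion.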

29. A method of voice recognizer processing for voice recognition, comprising:
- converting a digitized speech signal to bark amplitudes;
  
  mu-log compressing the bark amplitudes;
  
  RASTA-filtering the mu-log bark amplitudes; and
  
  transforming cepstrally the mu-log bark amplitudes to j static cepstral coefficients and j dynamic cepstral coefficients.
- - 30. The method of claim 29, wherein the mu-log compressing is G.711 mu-log compressing.
  - 31. The method of claim 29, wherein the converting includes converting the digitized speech signal to k bark amplitudes once every T milliseconds.
  - 32. The method of claim 31, wherein the transforming includes transforming cepstrally the mu-log bark amplitudes to j static cepstral coefficients and j dynamic cepstral coefficients every T milliseconds.
  - 33. The method of claim 32, wherein T equals 10.
  - 34. The method of claim 31, wherein k equals 16.
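
The final step of claim 29, producing j static and j dynamic cepstral coefficients from each frame, can be sketched as below. The claims do not name the transform, so this is an assumption throughout: a DCT-II is used for the static coefficients because it is the conventional choice for cepstra, and the dynamic coefficients are a simple two-frame difference. The function names, the edge handling, and any particular value of j are hypothetical.

```python
import math


def static_cepstra(bands, j):
    """First j DCT-II coefficients of one frame of filtered log-domain band values."""
    k = len(bands)
    return [
        sum(v * math.cos(math.pi * n * (m + 0.5) / k) for m, v in enumerate(bands))
        for n in range(j)
    ]


def dynamic_cepstra(frames):
    """Per-frame deltas: half the difference between the next and previous frame.

    Edge frames reuse themselves as the missing neighbor.
    """
    last = len(frames) - 1
    deltas = []
    for t in range(len(frames)):
        nxt, prv = frames[min(t + 1, last)], frames[max(t - 1, 0)]
        deltas.append([(a - b) / 2.0 for a, b in zip(nxt, prv)])
    return deltas
```

With k = 16 bands per frame (claim 34), calling `static_cepstra(frame, j)` once per frame and then `dynamic_cepstra` over the resulting list yields 2j values per frame for a backend to consume.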

35. A method of voice recognition, comprising:
- converting a digitized speech signal to bark amplitudes;
  
  mu-log compressing the bark amplitudes;
  
  RASTA-filtering the mu-log bark amplitudes;
  
  transforming cepstrally the mu-log bark amplitudes to j static cepstral coefficients and j dynamic cepstral coefficients; and
  
  producing a recognition hypothesis based on the j static cepstral coefficients and j dynamic cepstral coefficients.

36. A method of voice recognition, comprising:
- converting a digitized speech signal to bark amplitudes;
  
  A-log compressing the bark amplitudes;
  
  RASTA-filtering the A-log bark amplitudes; and
  
  transforming cepstrally the A-log bark amplitudes to j static cepstral coefficients and j dynamic cepstral coefficients.
- - 37. The method of claim 36, wherein the A-log compressing is G.711 A-log compressing.
  - 38. The method of claim 36, wherein the converting includes converting the digitized speech signal to k bark amplitudes once every T milliseconds.
  - 39. The method of claim 38, wherein the transforming includes transforming cepstrally the A-log bark amplitudes to j static cepstral coefficients and j dynamic cepstral coefficients every T milliseconds.
  - 40. The method of claim 38, wherein k equals 16.
  - 41. The method of claim 39, wherein T equals 10.
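
To give a feel for the numbers in claims 38, 40 and 41 (k = 16 bark amplitudes every T = 10 ms), the sketch below maps frequencies onto a bark axis and counts samples per frame. It rests on assumptions that do not come from the claims: an 8 kHz sampling rate, the Zwicker-Terhardt approximation of the bark scale, and bands of equal width on the bark axis. The claims do not describe how the bark amplitudes are actually computed, and all function names are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the bark scale."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def band_index(f_hz, sample_rate_hz=8000, k=16):
    """Map a frequency up to Nyquist to one of k equal-width bands on the bark axis."""
    top = hz_to_bark(sample_rate_hz / 2.0)
    return min(int(k * hz_to_bark(f_hz) / top), k - 1)


def samples_per_frame(sample_rate_hz=8000, t_ms=10):
    """Number of new samples that arrive every T milliseconds."""
    return sample_rate_hz * t_ms // 1000
```

Under these assumptions, 4 kHz sits at a little over 17 bark, so 16 bands span the telephone bandwidth at roughly one bark each, and each 10 ms frame covers 80 new samples.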

42. A method of voice recognition, comprising:
- converting a digitized speech signal to bark amplitudes;
  
  A-log compressing the bark amplitudes;
  
  RASTA-filtering the A-log bark amplitudes;
  
  transforming cepstrally the A-log bark amplitudes to j static cepstral coefficients and j dynamic cepstral coefficients; and
  
  producing a recognition hypothesis based on the j static cepstral coefficients and j dynamic cepstral coefficients.

43. A method of voice recognition, comprising:
- converting a digitized speech signal to bark amplitudes;
  
  mu-log compressing the bark amplitudes;
  
  RASTA-filtering the mu-log bark amplitudes; and
  
  mu-log expanding the filtered mu-log bark amplitudes.
- - 44. The method of claim 43, wherein the mu-log compressing is G.711 mu-log compressing.
  - 45. The method of claim 43, wherein the converting includes converting the digitized speech signal to k bark amplitudes once every T milliseconds.
  - 46. The method of claim 45, wherein k equals 16.
  - 47. The method of claim 46, wherein T equals 10.

48. A method of voice recognition, comprising:
- converting a digitized speech signal to bark amplitudes;
  
  mu-log compressing the bark amplitudes;
  
  RASTA-filtering the mu-log bark amplitudes;
  
  mu-log expanding the filtered mu-log bark amplitudes; and
  
  producing a recognition hypothesis based on the expanded mu-log bark amplitudes.

49. A method of voice recognition, comprising:
- converting a digitized speech signal to bark amplitudes;
  
  A-log compressing the bark amplitudes;
  
  RASTA-filtering the A-log bark amplitudes; and
  
  A-log expanding the filtered A-log bark amplitudes.
- - 50. The method of claim 49, wherein the A-log compressing is G.711 A-log compressing.
  - 51. The method of claim 49, wherein the converting includes converting the digitized speech signal to k bark amplitudes once every T milliseconds.
  - 52. The method of claim 51, wherein k equals 16.
  - 53. The method of claim 52, wherein T equals 10.

54. A method of voice recognition, comprising:
- converting a digitized speech signal to bark amplitudes;
  
  A-log compressing the bark amplitudes;
  
  RASTA-filtering the A-log bark amplitudes;
  
  A-log expanding the filtered A-log bark amplitudes; and
  
  producing a recognition hypothesis based on the expanded A-log bark amplitudes.

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: Garudadri, Harinath
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Storm, Donald L.
Application Number: US09/703,191
Time in Patent Office: 1,204 Days
Field of Search: 704/234, 704/200.1, 704/211, 704/221, 704/226, 704/227
US Class Current: 704/234
CPC Class Codes: G10L 15/20 Speech recognition techniqu...
