SYSTEM AND METHOD FOR MERGING AUDIO DATA STREAMS FOR USE IN SPEECH RECOGNITION APPLICATIONS
Abstract
A system and method for merging audio data streams receive audio data streams from separate inputs, independently transform each data stream from the time to the frequency domain, and generate separate feature data sets for the transformed data streams. Feature data from each of the separate feature data sets is selected to form a merged feature data set that is output to a decoder for recognition purposes. The separate inputs can include an ear microphone and a mouth microphone.
20 Claims
1. A method for merging at least a first and second audio data stream for use in a speech recognition application, the method comprising:
- transforming the first audio data stream from a time domain to a frequency domain;
- transforming the second audio data stream from the time domain to the frequency domain;
- determining a first feature data set for the first transformed audio stream for a first range of frequencies;
- determining a second feature data set for the second transformed audio stream for a second range of frequencies, at least some of which frequencies are different from the first range of frequencies; and
- combining predetermined feature data from the first and second feature data sets to form a merged feature data set.

Dependent claims: 2-11.
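The steps of claim 1 can be sketched in code: transform each stream to the frequency domain, take features from a different frequency range per stream, and concatenate. This is a minimal illustration only; the frame length, sample rate, band edges, and the log-power feature are assumptions, not details taken from the patent.

```python
# Minimal sketch of claim 1's merge method. The 0-2 kHz / 2-4 kHz band
# split, 256-sample frames, and log-power features are illustrative
# assumptions; the patent does not specify this front end.
import numpy as np

def band_features(frames, sample_rate, f_lo, f_hi):
    """FFT each frame, keep bins in [f_lo, f_hi), return log power."""
    spectra = np.abs(np.fft.rfft(frames, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[-1], d=1.0 / sample_rate)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return np.log(spectra[..., band] + 1e-10)

def merge_features(stream_a, stream_b, sample_rate=8000, frame_len=256):
    # Slice each time-domain stream into fixed frames (no overlap, for brevity).
    n = min(len(stream_a), len(stream_b)) // frame_len * frame_len
    frames_a = stream_a[:n].reshape(-1, frame_len)
    frames_b = stream_b[:n].reshape(-1, frame_len)
    # First feature set from one frequency range, second from a different one.
    feats_a = band_features(frames_a, sample_rate, 0.0, 2000.0)
    feats_b = band_features(frames_b, sample_rate, 2000.0, 4000.0)
    # Combine selected feature data into one merged feature vector per frame.
    return np.concatenate([feats_a, feats_b], axis=-1)
```

Concatenation is one plausible reading of "combining predetermined feature data"; the claim also covers other selection schemes.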
12. A method of merging audio data streams from an ear microphone and a mouth microphone for use in a speech recognition application, the method comprising:
- transforming the ear microphone audio data stream from a time domain to a frequency domain;
- transforming the mouth microphone audio data stream from the time domain to the frequency domain;
- determining a first feature data set for the transformed ear microphone audio stream for a first range of frequencies;
- determining a second feature data set for the transformed mouth microphone audio stream for a second range of frequencies including frequencies higher than the first range of frequencies; and
- combining predetermined feature data from the first and second feature data sets to form a merged feature data set.

Dependent claims: 13-18.
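Claim 12 specializes the method to the two microphone types: low-frequency features come from the ear microphone and features extending to higher frequencies come from the mouth microphone. A hedged sketch, assuming a 1.5 kHz crossover, 16 kHz sampling, and Hamming-windowed 50%-overlap framing (none of which are stated in the patent):

```python
# Illustrative sketch of claim 12. The band edges (1.5 kHz / 4 kHz),
# frame length, hop, and window are assumptions, not the patent's values.
import numpy as np

def framed_band_logpow(signal, rate, f_lo, f_hi, frame_len=400, hop=200):
    """Windowed overlapping frames -> log power in one frequency band."""
    window = np.hamming(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / rate)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return np.log(power[:, band] + 1e-10)

def merge_ear_mouth(ear, mouth, rate=16000):
    # Ear-canal audio is band-limited, so take its features from the low
    # band; the mouth microphone supplies the higher-frequency detail.
    ear_feats = framed_band_logpow(ear, rate, 0.0, 1500.0)
    mouth_feats = framed_band_logpow(mouth, rate, 1500.0, 4000.0)
    n = min(len(ear_feats), len(mouth_feats))
    return np.concatenate([ear_feats[:n], mouth_feats[:n]], axis=-1)
```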
19. A speech recognition system comprising:
- a computer system having at least one processor and machine-readable memory configured to execute a front end module adapted to receive first and second audio data streams, transform the first and second audio data streams from a time domain to a frequency domain, independently determine respective first and second feature data sets for the first and second audio data streams in the frequency domain, generate a merged feature data set from the first and second feature data sets, and output the merged feature data set for use by a decoder.

Dependent claim: 20.
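The front end module of claim 19 can be pictured as a small component that takes per-stream feature functions and a decoder as dependencies, merges the two feature sets, and passes the result on. The class and parameter names below are illustrative; the patent does not define this API.

```python
# Hypothetical sketch of the claim-19 front end module: receive two
# streams, determine features independently, merge, output to a decoder.
import numpy as np
from typing import Callable

class FrontEnd:
    def __init__(self, featurize_a: Callable, featurize_b: Callable,
                 decoder: Callable):
        self.featurize_a = featurize_a   # first stream -> feature data set
        self.featurize_b = featurize_b   # second stream -> feature data set
        self.decoder = decoder           # consumes the merged feature set

    def process(self, stream_a: np.ndarray, stream_b: np.ndarray):
        feats_a = self.featurize_a(stream_a)   # determined independently
        feats_b = self.featurize_b(stream_b)
        merged = np.concatenate([feats_a, feats_b], axis=-1)
        return self.decoder(merged)            # output for recognition
```

Injecting the feature functions keeps the module agnostic to whether the two streams use the same or different frequency ranges, matching the claim's "independently determine" language.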
Specification