SYSTEM AND METHOD FOR MERGING AUDIO DATA STREAMS FOR USE IN SPEECH RECOGNITION APPLICATIONS
Abstract
A system and method for merging audio data streams receive audio data streams from separate inputs, independently transform each data stream from the time to the frequency domain, and generate separate feature data sets for the transformed data streams. Feature data from each of the separate feature data sets is selected to form a merged feature data set that is output to a decoder for recognition purposes. The separate inputs can include an ear microphone and a mouth microphone.
20 Claims
1. A method for merging at least a first and second audio data stream for use in a speech recognition application, the method comprising:
- transforming the first audio data stream from a time domain to a frequency domain;
- transforming the second audio data stream from the time domain to the frequency domain;
- determining a first feature data set for the first transformed audio stream for a first range of frequencies;
- determining a second feature data set for the second transformed audio stream for a second range of frequencies, at least some of which frequencies are different from the first range of frequencies; and
- combining predetermined feature data from the first and second feature data sets to form a merged feature data set.

Dependent claims: 2-11.
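The steps of claim 1 can be sketched in code: transform each stream to the frequency domain, take features from a different frequency range per stream, and concatenate. This is a minimal illustration only; the frame length, sample rate, band edges, and the log-power feature are assumptions, not details taken from the patent.

```python
# Minimal sketch of claim 1's merge method. The 0-2 kHz / 2-4 kHz band
# split, 256-sample frames, and log-power features are illustrative
# assumptions; the patent does not specify this front end.
import numpy as np

def band_features(frames, sample_rate, f_lo, f_hi):
    """FFT each frame, keep bins in [f_lo, f_hi), return log power."""
    spectra = np.abs(np.fft.rfft(frames, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[-1], d=1.0 / sample_rate)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return np.log(spectra[..., band] + 1e-10)

def merge_features(stream_a, stream_b, sample_rate=8000, frame_len=256):
    # Slice each time-domain stream into fixed frames (no overlap, for brevity).
    n = min(len(stream_a), len(stream_b)) // frame_len * frame_len
    frames_a = stream_a[:n].reshape(-1, frame_len)
    frames_b = stream_b[:n].reshape(-1, frame_len)
    # First feature set from one frequency range, second from a different one.
    feats_a = band_features(frames_a, sample_rate, 0.0, 2000.0)
    feats_b = band_features(frames_b, sample_rate, 2000.0, 4000.0)
    # Combine selected feature data into one merged feature vector per frame.
    return np.concatenate([feats_a, feats_b], axis=-1)
```

Concatenation is one plausible reading of "combining predetermined feature data"; the claim also covers other selection schemes.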
12. A method of merging audio data streams from an ear microphone and a mouth microphone for use in a speech recognition application, the method comprising:
- transforming the ear microphone audio data stream from a time domain to a frequency domain;
- transforming the mouth microphone audio data stream from the time domain to the frequency domain;
- determining a first feature data set for the transformed ear microphone audio stream for a first range of frequencies;
- determining a second feature data set for the transformed mouth microphone audio stream for a second range of frequencies including frequencies higher than the first range of frequencies; and
- combining predetermined feature data from the first and second feature data sets to form a merged feature data set.

Dependent claims: 13-18.
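Claim 12 specializes the method to the two microphone types: low-frequency features come from the ear microphone and features extending to higher frequencies come from the mouth microphone. A hedged sketch, assuming a 1.5 kHz crossover, 16 kHz sampling, and Hamming-windowed 50%-overlap framing (none of which are stated in the patent):

```python
# Illustrative sketch of claim 12. The band edges (1.5 kHz / 4 kHz),
# frame length, hop, and window are assumptions, not the patent's values.
import numpy as np

def framed_band_logpow(signal, rate, f_lo, f_hi, frame_len=400, hop=200):
    """Windowed overlapping frames -> log power in one frequency band."""
    window = np.hamming(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / rate)
    band = (freqs >= f_lo) & (freqs < f_hi)
    return np.log(power[:, band] + 1e-10)

def merge_ear_mouth(ear, mouth, rate=16000):
    # Ear-canal audio is band-limited, so take its features from the low
    # band; the mouth microphone supplies the higher-frequency detail.
    ear_feats = framed_band_logpow(ear, rate, 0.0, 1500.0)
    mouth_feats = framed_band_logpow(mouth, rate, 1500.0, 4000.0)
    n = min(len(ear_feats), len(mouth_feats))
    return np.concatenate([ear_feats[:n], mouth_feats[:n]], axis=-1)
```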
19. A speech recognition system comprising:
- a computer system having at least one processor and machine-readable memory configured to execute a front end module adapted to receive first and second audio data streams, transform the first and second audio data streams from a time domain to a frequency domain, independently determine respective first and second feature data sets for the first and second audio data streams in the frequency domain, generate a merged feature data set from the first and second feature data sets, and output the merged feature data set for use by a decoder.

Dependent claim: 20.
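The front end module of claim 19 can be pictured as a small component that takes per-stream feature functions and a decoder as dependencies, merges the two feature sets, and passes the result on. The class and parameter names below are illustrative; the patent does not define this API.

```python
# Hypothetical sketch of the claim-19 front end module: receive two
# streams, determine features independently, merge, output to a decoder.
import numpy as np
from typing import Callable

class FrontEnd:
    def __init__(self, featurize_a: Callable, featurize_b: Callable,
                 decoder: Callable):
        self.featurize_a = featurize_a   # first stream -> feature data set
        self.featurize_b = featurize_b   # second stream -> feature data set
        self.decoder = decoder           # consumes the merged feature set

    def process(self, stream_a: np.ndarray, stream_b: np.ndarray):
        feats_a = self.featurize_a(stream_a)   # determined independently
        feats_b = self.featurize_b(stream_b)
        merged = np.concatenate([feats_a, feats_b], axis=-1)
        return self.decoder(merged)            # output for recognition
```

Injecting the feature functions keeps the module agnostic to whether the two streams use the same or different frequency ranges, matching the claim's "independently determine" language.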
Specification