Systems and methods of interpreting speech data
First Claim
1. An uncontrolled environment-based speech recognition system comprising:
- one or more audio data filters to each generate a set of processed audio data based on raw audio data received from one or more computing devices, the one or more audio data filters comprises:
a first audio data filter to apply a first filter process to the raw audio data to generate a first processed audio data, and
a second audio data filter to apply a second filter process to the raw audio data to generate a second processed audio data,
the first audio data filter being different from the second audio data filter, the one or more audio data filters comprising at least one audio data filter appropriate for the uncontrolled environment;
a translator, operable by a processor, to provide:
a first set of translation results based on the first processed audio data for the raw audio data, each translation result of the first set of translation results comprising a first text data and a first confidence level associated with that first text data; and
a second set of translation results based on the second processed audio data for the raw audio data, each translation result of the second set of translation results comprising a second text data and a second confidence level associated with that second text data; and
in response to receiving the first and second sets of translation results, a decision controller is operable by the processor to identify at least one translation result to represent the raw audio data, the decision controller is operable to:
identify at least one translation result that includes the text data associated with the confidence level that exceeds a confidence threshold;
determine whether the identified at least one translation result comprises more than one translation result;
in response to determining the identified at least one translation result comprises more than one translation result, determine an occurrence frequency for each text data of the identified at least one translation result and select the text data based on the occurrence frequency, the occurrence frequency representing a number of times that the text data appears in the set of translation results; and
generate an output signal associated with the identification of the at least one translation result.
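The decision controller steps recited above (threshold check, then occurrence frequency when more than one result qualifies) can be illustrated with a minimal sketch. It is not the patented implementation: the names `TranslationResult` and `select_text`, the threshold value 0.8, counting occurrences across both result sets, picking the most frequent text, and the first-seen tie-break are all assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple, Optional, Sequence


class TranslationResult(NamedTuple):
    text: str          # text data produced by the translator
    confidence: float  # confidence level for that text (assumed to lie in [0, 1])


def select_text(
    first_set: Sequence[TranslationResult],
    second_set: Sequence[TranslationResult],
    confidence_threshold: float = 0.8,  # hypothetical value, not taken from the claim
) -> Optional[str]:
    """Return the text chosen to represent the raw audio, or None if nothing qualifies."""
    all_results = list(first_set) + list(second_set)

    # Identify results whose confidence level exceeds the threshold.
    identified = [r for r in all_results if r.confidence > confidence_threshold]
    if not identified:
        return None

    # Only one result qualified: use it directly.
    if len(identified) == 1:
        return identified[0].text

    # More than one result qualified: count how often each text appears
    # (assumption: counted across both result sets) and keep the most frequent.
    counts = Counter(r.text for r in all_results)
    best = identified[0].text
    for r in identified[1:]:
        # Strict comparison, so ties go to the text seen first (assumption).
        if counts[r.text] > counts[best]:
            best = r.text
    return best


first = [TranslationResult("open chart", 0.90), TranslationResult("open cart", 0.85)]
second = [TranslationResult("open chart", 0.95), TranslationResult("open tart", 0.40)]
chosen = select_text(first, second)
```

Here "open chart" qualifies twice and "open cart" once, so the frequency step selects "open chart". The output signal of the final step is left out; in this sketch the returned string stands in for it.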
Abstract
Methods and systems are provided for interpreting speech data. A method and system for recognizing speech involve a filter module to generate a set of processed audio data based on raw audio data; a translation module to provide a set of translation results for the raw audio data; and a decision module to select the text data that represents the raw audio data. A method for minimizing noise in audio signals received by a microphone array is also described. A method and system for automatic entry of data into one or more data fields involve receiving processed audio data and operating a processing module to: search in a trigger dictionary for a field identifier that corresponds to the trigger identifier; identify a data field associated with a data field identifier corresponding to the field identifier; and provide content data associated with the trigger identifier to the identified data field.
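The automatic data entry flow in the abstract (look up a trigger in a trigger dictionary, resolve the matching data field, write the content data into it) can be sketched as follows. This is an illustration under stated assumptions, not the patented method: it assumes the processed audio data has already been converted to text, that the trigger identifier is spoken before the content data, and that a form is a plain dictionary. `TRIGGER_DICTIONARY`, `split_trigger`, `fill_field`, and the field names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical trigger dictionary: spoken trigger identifier -> field identifier.
TRIGGER_DICTIONARY: Dict[str, str] = {
    "patient name": "name",
    "blood pressure": "bp",
}


def split_trigger(text: str, triggers: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (trigger identifier, content data) if the text starts with a known trigger."""
    stripped = text.strip()
    lowered = stripped.lower()
    # Try longer triggers first so a longer phrase wins over a shorter prefix.
    for trigger in sorted(triggers, key=len, reverse=True):
        if lowered == trigger or lowered.startswith(trigger + " "):
            return trigger, stripped[len(trigger):].strip()
    return None


def fill_field(
    text: str,
    form: Dict[str, str],
    triggers: Dict[str, str] = TRIGGER_DICTIONARY,
) -> bool:
    """Write the content data into the form field tied to the spoken trigger."""
    found = split_trigger(text, triggers)
    if found is None:
        return False  # no trigger identifier recognized
    trigger, content = found
    field_id = triggers[trigger]  # field identifier from the trigger dictionary
    if field_id not in form:
        return False  # no data field with a matching data field identifier
    form[field_id] = content
    return True


form = {"name": "", "bp": ""}
filled = fill_field("Blood pressure 120 over 80", form)
```

After the call, `form["bp"]` holds "120 over 80" and the other field is untouched. Matching on a lowercase prefix is a simplification chosen for the sketch.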
18 Claims
1. An uncontrolled environment-based speech recognition system comprising:
- one or more audio data filters to each generate a set of processed audio data based on raw audio data received from one or more computing devices, the one or more audio data filters comprises:
a first audio data filter to apply a first filter process to the raw audio data to generate a first processed audio data, and
a second audio data filter to apply a second filter process to the raw audio data to generate a second processed audio data,
the first audio data filter being different from the second audio data filter, the one or more audio data filters comprising at least one audio data filter appropriate for the uncontrolled environment;
a translator, operable by a processor, to provide:
a first set of translation results based on the first processed audio data for the raw audio data, each translation result of the first set of translation results comprising a first text data and a first confidence level associated with that first text data; and
a second set of translation results based on the second processed audio data for the raw audio data, each translation result of the second set of translation results comprising a second text data and a second confidence level associated with that second text data; and
in response to receiving the first and second sets of translation results, a decision controller is operable by the processor to identify at least one translation result to represent the raw audio data, the decision controller is operable to:
identify at least one translation result that includes the text data associated with the confidence level that exceeds a confidence threshold;
determine whether the identified at least one translation result comprises more than one translation result;
in response to determining the identified at least one translation result comprises more than one translation result, determine an occurrence frequency for each text data of the identified at least one translation result and select the text data based on the occurrence frequency, the occurrence frequency representing a number of times that the text data appears in the set of translation results; and
generate an output signal associated with the identification of the at least one translation result.
Dependent claims: 2, 3, 4, 5, 6, 7, 15, 16
8. A computer-implemented method of operating an uncontrolled environment-based recognition system for recognizing speech, the method comprising:
- operating one or more audio data filters to each generate a set of processed audio data based on raw audio data received from one or more computing devices, the one or more audio data filters comprises:
a first audio data filter to apply a first filter process to the raw audio data to generate a first processed audio data, and
a second audio data filter to apply a second filter process to the raw audio data to generate a second processed audio data,
the first audio data filter being different from the second audio data filter, the one or more audio data filters comprising at least one audio data filter appropriate for the uncontrolled environment;
operating a translator to provide:
a first set of translation results based on the first processed audio data for the raw audio data, each translation result of the first set of translation results comprising a first text data and a first confidence level associated with that first text data; and
a second set of translation results based on the second processed audio data for the raw audio data, each translation result of the second set of translation results comprising a second text data and a second confidence level associated with that second text data; and
in response to receiving the first and second sets of translation results, operating a decision controller to identify at least one translation result to represent the raw audio data, wherein the decision controller is operable to:
identify at least one translation result that includes the text data associated with the confidence level that exceeds a confidence threshold;
determine whether the identified at least one translation result comprises more than one translation result;
in response to determining the identified at least one translation result comprises more than one translation result, determine an occurrence frequency for each text data of the identified at least one translation result and select the text data based on the occurrence frequency, the occurrence frequency representing a number of times that the text data appears in the set of translation results; and
generate an output signal associated with the identification of the at least one translation result.
Dependent claims: 9, 10, 11, 12, 13, 14, 17, 18
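The first two steps of the method (operating two different filters on the same raw audio, then operating a translator on each processed output) might look like the sketch below. The claim does not name specific filter processes, so the moving average and the noise gate are stand-ins chosen for illustration. `run_filters` and its `translate` callback are hypothetical; a real system would pass a speech recognition engine where the example passes a stub.

```python
from typing import Callable, List, Sequence, Tuple

Samples = List[float]
Result = Tuple[str, float]  # (text data, confidence level)


def moving_average_filter(raw: Samples, window: int = 3) -> Samples:
    """First filter process (assumed): trailing moving average, a simple low-pass."""
    out = []
    for i in range(len(raw)):
        chunk = raw[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def noise_gate_filter(raw: Samples, floor: float = 0.1) -> Samples:
    """Second filter process (assumed): zero out samples quieter than a floor."""
    return [s if abs(s) >= floor else 0.0 for s in raw]


def run_filters(
    raw: Samples,
    filters: Sequence[Callable[[Samples], Samples]],
    translate: Callable[[Samples], List[Result]],
) -> List[List[Result]]:
    """Apply every filter to the same raw audio and translate each processed output.

    Returns one set of translation results per filter, in filter order.
    """
    return [translate(f(raw)) for f in filters]


def stub_translate(processed: Samples) -> List[Result]:
    """Placeholder translator: a real engine would return recognized text here."""
    return [("n=%d" % len(processed), 1.0)]


raw_audio = [0.0, 0.05, 0.5, -0.4, 0.02]
result_sets = run_filters(
    raw_audio, [moving_average_filter, noise_gate_filter], stub_translate
)
```

Each filter receives the unmodified raw audio rather than the other filter's output, which mirrors the claim language that both filter processes are applied to the raw audio data. The resulting sets would then be handed to the decision controller.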
Specification