Systems and methods for combining subword detection and word detection for processing a spoken input
Abstract
A computer-based detection (e.g., speech recognition) system combines a word decoder and a subword decoder to detect words (or phrases) in a spoken input provided by a user into a microphone connected to the detection system. The word decoder detects words by comparing an input pattern (e.g., of hypothetical word matches) to reference patterns (e.g., words). The subword decoder compares an input pattern (e.g., hypothetical word matches based on subword or phoneme recognition) to reference patterns (e.g., words) based on a word pronunciation distance measure that indicates how close each input pattern is to matching each reference pattern. The word decoder and subword decoder each provide an N-best list of hypothetical matches to the spoken input. A list fusion module of the detection system selectively combines the two N-best lists to produce a final or combined N-best list.
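The list fusion step can be illustrated with a minimal sketch. The abstract does not state the selection rule, so everything here is an assumption made for illustration: the function name `fuse_n_best`, the idea that both decoders emit comparable scores where higher is better, and the rule of keeping the best score per word.

```python
from typing import Dict, List, Tuple

# (word, score) pair; "higher score is better" is an assumption of this sketch.
Hypothesis = Tuple[str, float]


def fuse_n_best(word_list: List[Hypothesis],
                subword_list: List[Hypothesis],
                n: int) -> List[Hypothesis]:
    """Merge two N-best lists into one list of at most n hypotheses.

    Hypothetical rule: keep the best score seen for each word,
    then rank by score (ties broken alphabetically).
    """
    best: Dict[str, float] = {}
    for word, score in word_list + subword_list:
        if word not in best or score > best[word]:
            best[word] = score
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Made-up example lists, one per decoder.
word_n_best = [("boston", 0.82), ("austin", 0.61), ("bostic", 0.40)]
subword_n_best = [("austin", 0.77), ("boston", 0.70), ("houston", 0.35)]

combined = fuse_n_best(word_n_best, subword_n_best, n=3)
print(combined)
```

In practice the two decoders may score on different scales, in which case a rank-based or normalized merge would be a more reasonable choice than comparing raw scores.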
28 Claims
1. A method for determining a plurality of hypothetical matches to a spoken input, comprising the computer-implemented steps of:
detecting subword units in the spoken input to generate a first set of hypothetical matches to the spoken input;
detecting words in the spoken input to generate a second set of hypothetical matches to the spoken input; and
combining the first set of hypothetical matches with the second set of hypothetical matches to produce a combined set of hypothetical matches to the spoken input, the combined set having a predefined number of hypothetical matches.
Dependent claims: 2, 3, 4, 5, 6
7. A computer system for determining a plurality of hypothetical matches to a spoken input, comprising:
a subword decoder for detecting subword units in the spoken input to generate a first set of hypothetical matches to the spoken input;
a word decoder for detecting words in the spoken input to generate a second set of hypothetical matches to the spoken input; and
a list fusion module for combining the first set of hypothetical matches with the second set of hypothetical matches to produce a combined set of hypothetical matches to the spoken input, the combined set having a predefined number of hypothetical matches.
Dependent claims: 8, 9, 10, 11, 12
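The three components recited in claim 7 can be wired together as in the following sketch. The class name `DetectionSystem`, the stub decoders, and the alternating `interleave` fusion rule are all hypothetical; the claim only requires that the combined set have a predefined number of hypothetical matches.

```python
from itertools import zip_longest
from typing import Callable, List

# A decoder maps raw audio to a ranked list of candidate words (best first).
Decoder = Callable[[bytes], List[str]]


def interleave(first: List[str], second: List[str], n: int) -> List[str]:
    """Hypothetical fusion rule: alternate between lists, skip repeats, stop at n."""
    combined: List[str] = []
    for pair in zip_longest(first, second):
        for word in pair:
            if word is not None and word not in combined and len(combined) < n:
                combined.append(word)
    return combined


class DetectionSystem:
    """Subword decoder + word decoder + list fusion module (illustrative only)."""

    def __init__(self, subword_decoder: Decoder, word_decoder: Decoder, n: int):
        self.subword_decoder = subword_decoder
        self.word_decoder = word_decoder
        self.n = n  # predefined size of the combined set

    def recognize(self, audio: bytes) -> List[str]:
        first = self.subword_decoder(audio)   # first set of hypothetical matches
        second = self.word_decoder(audio)     # second set of hypothetical matches
        return interleave(first, second, self.n)


# Stub decoders returning fixed, made-up lists in place of real recognizers.
def fake_subword_decoder(audio: bytes) -> List[str]:
    return ["austin", "boston", "houston"]


def fake_word_decoder(audio: bytes) -> List[str]:
    return ["boston", "austin", "bostic"]


system = DetectionSystem(fake_subword_decoder, fake_word_decoder, n=3)
print(system.recognize(b""))
```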
13. A computer program product comprising:
a computer usable medium for determining a plurality of hypothetical matches to a spoken input; and
a set of computer program instructions embodied on the computer usable medium, including instructions to:
detect subword units in the spoken input to generate a first set of hypothetical matches to the spoken input;
detect words in the spoken input to generate a second set of hypothetical matches to the spoken input; and
combine the first set of hypothetical matches with the second set of hypothetical matches to produce a combined set of hypothetical matches to the spoken input, the combined set having a predefined number of hypothetical matches.
14. A method for determining a plurality of hypothetical matches to a spoken input by detecting subword units in the spoken input, comprising the computer-implemented steps of:
detecting the subword units in the spoken input based on an acoustic model of the subword units and a language model of the subword units;
generating pattern comparisons between (i) an input pattern corresponding to the subword units in the spoken input and (ii) a source set of reference patterns based on a pronunciation dictionary, each generated pattern comparison based on the input pattern and one of the reference patterns; and
generating a set of the hypothetical matches by sorting the source set of reference patterns based on a closeness of each reference pattern to correctly matching the input pattern based on an evaluation of each generated pattern comparison, each evaluation determining a word pronunciation distance measure that indicates how close each input pattern is to matching each reference pattern.
Dependent claims: 15, 16, 17, 18, 19, 20
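The compare-and-sort steps of claim 14 can be sketched as follows. The claim does not define the word pronunciation distance measure, so a unit-cost edit distance over phoneme symbols is swapped in here as a stand-in. The function names and the small ARPAbet-style dictionary are made up for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def phoneme_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance over phoneme symbols (stand-in distance measure)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete pa
                            curr[j - 1] + 1,      # insert pb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def rank_vocabulary(decoded: Sequence[str],
                    dictionary: Dict[str, List[str]],
                    n: int) -> List[Tuple[str, int]]:
    """Compare the input pattern to every reference pattern, sort by closeness."""
    scored = [(word, phoneme_edit_distance(decoded, pron))
              for word, pron in dictionary.items()]
    scored.sort(key=lambda item: (item[1], item[0]))  # smaller distance first
    return scored[:n]


# Illustrative pronunciation dictionary (entries are assumptions, not from the patent).
PRONUNCIATIONS = {
    "boston": ["B", "AO", "S", "T", "AH", "N"],
    "austin": ["AO", "S", "T", "AH", "N"],
    "houston": ["HH", "Y", "UW", "S", "T", "AH", "N"],
}

# Phoneme string as a subword decoder might emit it, with one misrecognized unit.
decoded = ["P", "AO", "S", "T", "AH", "N"]
print(rank_vocabulary(decoded, PRONUNCIATIONS, n=2))
```

A weighted distance, for example one that charges less for confusing acoustically similar phonemes, would fit the same structure by replacing the fixed substitution cost.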
21. A computer system for determining a plurality of hypothetical matches to a spoken input by detecting subword units in the spoken input, comprising:
a subword decoder for detecting the subword units in the spoken input based on an acoustic model of the subword units and a language model of the subword units; and
a subword detection vocabulary look up module for generating pattern comparisons between (i) an input pattern corresponding to the subword units in the spoken input and (ii) a source set of reference patterns based on a pronunciation dictionary, each generated pattern comparison based on the input pattern and one of the reference patterns;
the subword detection vocabulary look up module generating a set of the hypothetical matches by sorting the source set of reference patterns based on a closeness of each reference pattern to correctly matching the input pattern based on an evaluation of each generated pattern comparison, each evaluation determining a word pronunciation distance measure that indicates how close each input pattern is to matching each reference pattern.
Dependent claims: 22, 23, 24, 25, 26, 27
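The subword decoder of claim 21 relies on both an acoustic model and a language model of the subword units. One common way to combine the two is to add log-probabilities, as in the sketch below. This combination rule, the bigram form of the language model, the `lm_weight` parameter, and all numeric values are assumptions for illustration and are not taken from the claims.

```python
import math
from typing import Dict, List, Tuple

# A candidate is a list of (phoneme, acoustic log-likelihood) pairs.
Candidate = List[Tuple[str, float]]
BigramLM = Dict[Tuple[str, str], float]  # (previous, current) -> log-probability


def combined_score(candidate: Candidate,
                   bigram_logprob: BigramLM,
                   lm_weight: float = 1.0,
                   unseen_logprob: float = math.log(1e-4)) -> float:
    """Acoustic log-likelihood plus weighted bigram language-model log-probability."""
    phonemes = [p for p, _ in candidate]
    acoustic = sum(lp for _, lp in candidate)
    history = ["<s>"] + phonemes[:-1]  # "<s>" marks the start of the utterance
    language = sum(bigram_logprob.get(pair, unseen_logprob)
                   for pair in zip(history, phonemes))
    return acoustic + lm_weight * language


def best_subword_sequence(candidates: List[Candidate],
                          bigram_logprob: BigramLM,
                          lm_weight: float = 1.0) -> List[str]:
    """Return the phoneme sequence of the highest-scoring candidate."""
    best = max(candidates,
               key=lambda c: combined_score(c, bigram_logprob, lm_weight))
    return [p for p, _ in best]


# Made-up scores: "P AO" fits the audio slightly better, "B AO" fits the LM better.
candidates = [
    [("B", -1.0), ("AO", -1.0)],
    [("P", -0.8), ("AO", -1.0)],
]
bigram_lm = {
    ("<s>", "B"): math.log(0.5), ("B", "AO"): math.log(0.5),
    ("<s>", "P"): math.log(0.1), ("P", "AO"): math.log(0.1),
}

print(best_subword_sequence(candidates, bigram_lm))
```

Setting `lm_weight=0.0` makes the choice purely acoustic, which in this toy example flips the winner to the "P AO" candidate.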
28. A computer program product comprising:
a computer usable medium for determining a plurality of hypothetical matches to a spoken input by detecting subwords in the spoken input; and
a set of computer program instructions embodied on the computer usable medium, including instructions to:
detect the subword units in the spoken input based on an acoustic model of the subword units and a language model of the subword units;
generate pattern comparisons between (i) an input pattern corresponding to the subword units in the spoken input and (ii) a source set of reference patterns based on a pronunciation dictionary, each generated pattern comparison based on the input pattern and one of the reference patterns; and
generate a set of the hypothetical matches by sorting the source set of reference patterns based on a closeness of each reference pattern to correctly matching the input pattern based on an evaluation of each generated pattern comparison, each evaluation determining a word pronunciation distance measure that indicates how close each input pattern is to matching each reference pattern.
Specification