Common word graph based multimodal input
2 Assignments
0 Petitions
Abstract
Multiple input modalities are selectively used by a user or process to prune a word graph. Pruning initiates rescoring in order to generate a new word graph with a revised best path.
29 Citations
14 Claims
1. A method for processing input received by a computing device comprising one or more processors, the method comprising:
- decoding input from a first input modality to produce posterior probabilities for words along paths in a common word graph;
- recording a decoding front for each of a plurality of possible input modalities, each decoding front comprising a set of nodes in the common word graph that define an end of a last word along a path in the common word graph that was assigned a probability by decoding an input from the respective input modality;
- receiving input from a second input modality after recording the decoding front for the second input modality;
- using the nodes of the recorded decoding front for the second input modality to determine where in the common word graph to begin rescoring and pruning the common word graph based on the input from the second input modality;
- using one or more of the processors, rescoring and pruning the common word graph based on averaging posterior probabilities from decoding input from the first input modality and decoding input from the second input modality; and
- outputting a hypothesis for the input based on the common word graph.
Dependent claims: 2, 3, 4, 5, 6, 7
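The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Edge` type, the function names, the pruning threshold, the lookup of second-modality posteriors by word, and the assumption that node ids follow a topological order are all hypothetical choices made for illustration. The sketch averages posteriors only on edges that start at or after the recorded decoding front for the second modality, prunes edges whose averaged score falls below the threshold, and then reads off the best path.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    src: int          # start node of the word
    dst: int          # end node of the word
    word: str
    posterior: float  # posterior assigned by decoding the first modality


def rescore_and_prune(edges, front, second_posteriors, threshold):
    """Average posteriors on edges at or after `front`, then prune low scores.

    `front` is the recorded decoding front for the second modality (a set of
    node ids). `second_posteriors` maps word -> posterior from the second
    modality; keying by word is a simplifying assumption of this sketch.
    """
    # Find every node reachable from the decoding front.
    reachable = set(front)
    changed = True
    while changed:
        changed = False
        for e in edges:
            if e.src in reachable and e.dst not in reachable:
                reachable.add(e.dst)
                changed = True

    kept = []
    for e in edges:
        if e.src not in reachable:
            kept.append(e)  # before the front: left untouched
            continue
        averaged = (e.posterior + second_posteriors.get(e.word, 0.0)) / 2.0
        if averaged >= threshold:  # prune edges below the threshold
            kept.append(Edge(e.src, e.dst, e.word, averaged))
    return kept


def best_path(edges, start, end):
    """Highest-scoring word sequence; assumes node ids are topologically ordered."""
    best = {start: (1.0, [])}
    for e in sorted(edges, key=lambda edge: edge.src):
        if e.src in best:
            score, words = best[e.src]
            candidate = (score * e.posterior, words + [e.word])
            if e.dst not in best or candidate[0] > best[e.dst][0]:
                best[e.dst] = candidate
    return best.get(end, (0.0, []))[1]


# Speech (first modality) prefers "John"; a second modality prefers "Joan".
graph = [Edge(0, 1, "call", 0.9), Edge(1, 2, "John", 0.6), Edge(1, 2, "Joan", 0.4)]
rescored = rescore_and_prune(graph, front={1},
                             second_posteriors={"Joan": 0.9, "John": 0.1},
                             threshold=0.4)
print(best_path(rescored, 0, 2))  # ['call', 'Joan']
```

In this toy graph the edge for "call" lies before the decoding front and keeps its original score, while the averaged score for "John" drops to 0.35 and is pruned.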
8. A computer storage medium having computer-executable instructions that when executed by a computer perform steps to process input received by the computer comprising the steps of:
- receiving input using a first modality;
- modifying a word graph based on the input; and
- rendering a hypothesis to a user for the input based on the word graph, and repeating the following steps until a desired hypothesis is obtained;
- modifying the word graph based on complementary information received using a second modality, the complementary information corresponding to at least a portion of the input, wherein the second modality is different from the first modality, in which modifying the word graph includes rescoring the word graph based on averaging posterior probabilities from the modalities of the input and the complementary information; and
- rendering a new hypothesis to the user for the input based on the word graph.
Dependent claims: 9, 10
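The render-and-repeat loop of claim 8 can be sketched as follows. This is only an illustration under stated assumptions: the word graph is flattened into a list of slots (one dictionary of word posteriors per position), the names `best_hypothesis`, `apply_correction`, and `refine` are hypothetical, and the `desired` argument stands in for the user's judgment that the rendered hypothesis is acceptable.

```python
def best_hypothesis(slots):
    """Pick the highest-posterior word in each slot."""
    return [max(slot, key=slot.get) for slot in slots]


def apply_correction(slots, position, complementary):
    """Rescore one slot by averaging its posteriors with the second modality's."""
    slot = slots[position]
    words = set(slot) | set(complementary)
    slots[position] = {
        w: (slot.get(w, 0.0) + complementary.get(w, 0.0)) / 2.0 for w in words
    }


def refine(slots, corrections, desired):
    """Render a hypothesis, then apply corrections until `desired` is reached.

    `corrections` is a sequence of (position, posteriors) pairs from the second
    modality. Returns every hypothesis that would have been rendered.
    """
    hypothesis = best_hypothesis(slots)
    rendered = [hypothesis]
    for position, complementary in corrections:
        if hypothesis == desired:
            break
        apply_correction(slots, position, complementary)
        hypothesis = best_hypothesis(slots)
        rendered.append(hypothesis)
    return rendered


slots = [{"call": 0.9, "tall": 0.1}, {"John": 0.6, "Joan": 0.4}]
corrections = [(1, {"Joan": 0.9, "John": 0.1})]
print(refine(slots, corrections, desired=["call", "Joan"]))
# [['call', 'John'], ['call', 'Joan']]
```

Each correction targets only the portion of the input it corresponds to, so the first slot keeps its original scores while the second is rescored.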
11. A computing device comprising:
- a first component configured to provide input into the computing device using a first modality;
- a second component configured to provide input into the computing device using a second modality; and
- a recognizer configured to receive input from the first component and the second component and configured to modify a common word graph based on input from the first component and input from the second component, wherein modifying a common word graph based on input from the second component comprises rescoring words in the common word graph beginning with words that occur after nodes set in a recorded decoding front for the second modality based on the input from the second component.
Dependent claims: 12, 13, 14
Specification