SPEECH TRANSCRIPTION INCLUDING WRITTEN TEXT
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for transcribing utterances into written text are disclosed. The methods, systems, and apparatus include actions of obtaining a lexicon model that maps phones to spoken text and obtaining a language model that assigns probabilities to written text. The actions further include generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of the written text to an item of the spoken text. Additionally, the actions include constructing a decoding network for transcribing utterances into written text, by composing the lexicon model, the inverse of the transducer, and the language model.
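The composition described in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes each model is a tiny weighted relation over whole utterances, stored as a Python dict, whereas a real system would use finite-state transducers over phones and words. The names LEXICON, VERBALIZER, LANGUAGE_MODEL, invert, compose, and transcribe, and all of the toy data and probabilities, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def invert(rel):
    """Swap the input and output sides of a weighted relation."""
    return {(out, inp): w for (inp, out), w in rel.items()}


def compose(r1, r2):
    """Relational composition: match r1 outputs to r2 inputs, multiplying weights."""
    by_input = defaultdict(list)
    for (mid, out), w in r2.items():
        by_input[mid].append((out, w))
    result = defaultdict(float)
    for (inp, mid), w1 in r1.items():
        for out, w2 in by_input[mid]:
            result[(inp, out)] += w1 * w2
    return dict(result)


# Hypothetical lexicon model: phones -> spoken text.
LEXICON = {("t eh n d aa l er z", "ten dollars"): 1.0}

# Hypothetical transducer: written text -> spoken text.
# Several written forms map to the same spoken form.
VERBALIZER = {
    ("$10", "ten dollars"): 1.0,
    ("10 dollars", "ten dollars"): 1.0,
    ("ten dollars", "ten dollars"): 1.0,
}

# Hypothetical language model: probabilities over written text,
# stored as an identity relation so that it can be composed.
LANGUAGE_MODEL = {
    ("$10", "$10"): 0.6,
    ("10 dollars", "10 dollars"): 0.3,
    ("ten dollars", "ten dollars"): 0.1,
}

# Decoding network: lexicon, then inverse of the transducer, then language model.
DECODING_NETWORK = compose(compose(LEXICON, invert(VERBALIZER)), LANGUAGE_MODEL)


def transcribe(phones):
    """Return the highest-weighted written form for a phone string, or None."""
    candidates = {
        out: w for (inp, out), w in DECODING_NETWORK.items() if inp == phones
    }
    return max(candidates, key=candidates.get) if candidates else None


print(transcribe("t eh n d aa l er z"))  # prints: $10
```

In this toy setting, inverting the transducer turns the single spoken form "ten dollars" into three written candidates, and the language model weights select among them. Because the language model is defined over written text, the network outputs written forms directly, with no separate rewriting pass after recognition.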
20 Claims
1. A computer-implemented method comprising:
obtaining a lexicon model that maps phones to spoken text;
obtaining a language model that assigns probabilities to written text;
generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of written text to an item of the spoken text; and
constructing a decoding network for transcribing utterances into written text, by composing the lexicon model, the inverse of the transducer, and the language model.
9. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
obtaining a lexicon model that maps phones to spoken text;
obtaining a language model that assigns probabilities to written text;
generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of written text to an item of the spoken text; and
constructing a decoding network for transcribing utterances into written text, by composing the lexicon model, the inverse of the transducer, and the language model.
17. A computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
obtaining a lexicon model that maps phones to spoken text;
obtaining a language model that assigns probabilities to written text;
generating a transducer that maps the written text to the spoken text, the transducer mapping multiple items of written text to an item of the spoken text; and
constructing a decoding network, for transcribing utterances into written text, by composing the lexicon model, the inverse of the transducer, and the language model.
Specification