Speech recognition system for natural language translation
First Claim
1. A speech recognition system comprising:
means for displaying a source text comprising one or more words in a source language;
an acoustic processor for generating a sequence of coded representations of an utterance to be recognized, said utterance comprising one or more words in a target language different from the source language;
means for generating a set of one or more speech hypotheses, each speech hypothesis comprising one or more words from the target language;
means for generating an acoustic model of each speech hypothesis;
means for generating an acoustic match score for each speech hypothesis, each acoustic match score comprising an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance;
means for generating a translation match score for each speech hypothesis, each translation match score comprising an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text;
means for generating a hypothesis score for each hypothesis, each hypothesis score comprising a combination of the acoustic match score and the translation match score for the hypothesis;
means for storing a subset of one or more speech hypotheses, from the set of speech hypotheses, having the best hypothesis scores; and
means for outputting at least one word of one or more of the speech hypotheses in the subset of speech hypotheses having the best hypothesis scores.
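As an illustration of the scoring elements recited above, the following minimal Python sketch combines an acoustic match score and a translation match score into a single hypothesis score. The claim requires only "a combination" of the two scores; the weighted sum of log probabilities used here is an assumption made for illustration, as are the names `Hypothesis`, `hypothesis_score`, and `translation_weight` and the example probabilities.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Hypothesis:
    """One candidate word sequence in the target language (hypothetical type)."""
    words: Tuple[str, ...]
    acoustic_prob: float      # stand-in for the acoustic match score, in (0, 1]
    translation_prob: float   # stand-in for P(hypothesis | source text), in (0, 1]


def hypothesis_score(h: Hypothesis, translation_weight: float = 1.0) -> float:
    """Combine both scores in the log domain (an assumed combination rule)."""
    return math.log(h.acoustic_prob) + translation_weight * math.log(h.translation_prob)


# Two acoustically similar candidates; the values are made up for the example.
candidates = [
    Hypothesis(("the", "house"), acoustic_prob=0.20, translation_prob=0.60),
    Hypothesis(("the", "mouse"), acoustic_prob=0.25, translation_prob=0.01),
]

# The hypothesis with the highest combined score is treated as the best one.
best = max(candidates, key=hypothesis_score)
```

In this toy case the second candidate has the slightly better acoustic score, but the translation match score pulls the combined score toward the candidate that is a more plausible translation of the displayed source text.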
Abstract
A speech recognition system displays a source text of one or more words in a source language. The system has an acoustic processor for generating a sequence of coded representations of an utterance to be recognized. The utterance comprises a series of one or more words in a target language different from the source language. A set of one or more speech hypotheses, each comprising one or more words from the target language, is produced. Each speech hypothesis is modeled with an acoustic model. An acoustic match score for each speech hypothesis comprises an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance. A translation match score for each speech hypothesis comprises an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text. A hypothesis score for each hypothesis comprises a combination of the acoustic match score and the translation match score. At least one word of one or more speech hypot
This invention was made with Government support under Contract Number N00014-91-C-0135 awarded by the Office of Naval Research. The Government has certain rights in this invention.
36 Claims
1. A speech recognition system comprising:
means for displaying a source text comprising one or more words in a source language;
an acoustic processor for generating a sequence of coded representations of an utterance to be recognized, said utterance comprising one or more words in a target language different from the source language;
means for generating a set of one or more speech hypotheses, each speech hypothesis comprising one or more words from the target language;
means for generating an acoustic model of each speech hypothesis;
means for generating an acoustic match score for each speech hypothesis, each acoustic match score comprising an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance;
means for generating a translation match score for each speech hypothesis, each translation match score comprising an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text;
means for generating a hypothesis score for each hypothesis, each hypothesis score comprising a combination of the acoustic match score and the translation match score for the hypothesis;
means for storing a subset of one or more speech hypotheses, from the set of speech hypotheses, having the best hypothesis scores; and
means for outputting at least one word of one or more of the speech hypotheses in the subset of speech hypotheses having the best hypothesis scores.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. A speech recognition method comprising:
displaying a source text comprising one or more words in a source language;
generating a sequence of coded representations of an utterance to be recognized, said utterance comprising one or more words in a target language different from the source language;
generating a set of one or more speech hypotheses, each speech hypothesis comprising one or more words from the target language;
generating an acoustic model of each speech hypothesis;
generating an acoustic match score for each speech hypothesis, each acoustic match score comprising an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance;
generating a translation match score for each speech hypothesis, each translation match score comprising an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text;
generating a hypothesis score for each hypothesis, each hypothesis score comprising a combination of the acoustic match score and the translation match score for the hypothesis;
storing a subset of one or more speech hypotheses, from the set of speech hypotheses, having the best hypothesis scores; and
outputting at least one word of one or more of the speech hypotheses in the subset of speech hypotheses having the best hypothesis scores.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36
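The final two steps of the method, storing the best-scoring subset and outputting at least one word from it, might be sketched as follows. This is only an illustration under stated assumptions: it treats a higher hypothesis score as better, represents each hypothesis as a `(score, words)` pair, and uses the hypothetical helper names `store_best` and `output_words`; none of these details are taken from the claims.

```python
import heapq
from typing import List, Sequence, Tuple

# Assumed representation: (hypothesis score, words of the hypothesis).
ScoredHypothesis = Tuple[float, Tuple[str, ...]]


def store_best(scored: Sequence[ScoredHypothesis], n: int) -> List[ScoredHypothesis]:
    """Keep the n hypotheses with the highest scores, best first."""
    return heapq.nlargest(n, scored, key=lambda item: item[0])


def output_words(subset: Sequence[ScoredHypothesis], count: int = 1) -> List[str]:
    """Return the first `count` words of the top-ranked stored hypothesis."""
    if not subset:
        return []
    _, words = subset[0]
    return list(words[:count])


# Made-up scores (for example, log-domain values) for three candidates.
scored = [
    (-2.1, ("the", "house")),
    (-6.0, ("the", "mouse")),
    (-3.4, ("the", "horse")),
]

subset = store_best(scored, n=2)           # the stored subset of best hypotheses
first_word = output_words(subset, count=1)  # at least one word is output
```

Keeping a small ranked subset rather than a single winner leaves room for the remaining hypotheses to be extended or re-scored as more of the utterance is processed.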
Specification