Word dependent N-best search method
Abstract
As a step in finding the one most likely word sequence in a spoken language system, an N-best search is conducted to find the N most likely sentence hypotheses. During the search, word theories are distinguished based only on the one previous word. At each state within a word, the total probability is calculated for each of a few previous words. At the end of each word, the probability score is recorded for each previous word theory, together with the name of the previous word. At the end of the sentence, a recursive traceback is performed to derive the list of the N best sentences.
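The recursive traceback mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a hypothetical record layout in which each word ending, keyed by (word, end frame), lists the previous-word theories that reached it as (previous word, previous end frame, log-score increment) triples. The names `records`, `START`, and `n_best_sentences` are illustrative and do not come from the patent.

```python
import heapq

# Hypothetical start-of-utterance node (not from the patent text).
START = ("<s>", 0)

def n_best_sentences(records, final_nodes, n):
    """Return up to n (log_score, word_tuple) pairs, best first.

    records: dict mapping (word, end_frame) to a list of
             (prev_word, prev_end_frame, log_score_increment) entries,
             one entry per recorded previous-word theory (assumed layout).
    final_nodes: (word, end_frame) nodes that may end the utterance.
    """
    memo = {}

    def best(node):
        # Base case: the empty hypothesis at the start of the utterance.
        if node == START:
            return [(0.0, ())]
        if node in memo:
            return memo[node]
        merged = {}
        for prev_word, prev_end, inc in records.get(node, ()):
            # Recurse into the previous word's own n-best list.
            for score, words in best((prev_word, prev_end)):
                hyp = words + (node[0],)
                total = score + inc
                # Keep one score per distinct word sequence.
                if total > merged.get(hyp, float("-inf")):
                    merged[hyp] = total
        memo[node] = heapq.nlargest(n, ((s, h) for h, s in merged.items()))
        return memo[node]

    pooled = {}
    for node in final_nodes:
        for score, hyp in best(node):
            if score > pooled.get(hyp, float("-inf")):
                pooled[hyp] = score
    return heapq.nlargest(n, ((s, h) for h, s in pooled.items()))

# Toy records with made-up scores, for illustration only.
records = {
    ("show", 10): [("<s>", 0, -5.0)],
    ("so", 8): [("<s>", 0, -6.0)],
    ("flights", 30): [("show", 10, -7.0), ("so", 8, -9.0)],
    ("lights", 30): [("show", 10, -8.0)],
}
finals = [("flights", 30), ("lights", 30)]
top2 = n_best_sentences(records, finals, 2)
```

With these toy records, `top2` holds "show flights" followed by "show lights". Memoizing each node's list keeps the recursion from re-expanding shared prefixes, which is one plausible reason to record scores per previous word rather than per full sentence.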
20 Claims
1. A method of producing N-most likely sentence hypotheses defined as word sequences of one or more words from a limited vocabulary speech signal, each word having a set of states including a distinguished first and last state, said method comprising the steps of:
a. dividing the speech signal of an utterance into frames and generating for each frame at least one vector that characterizes the speech signal;
b. computing for each frame for selected states in selected words, the probability of a sequence of vectors up to each such frame, given a most likely partial sentence hypothesis that begins with the utterance and ends with that state at that frame;
c. at each of said selected states accumulating a separate probability score for each of m most likely different partial sentence hypotheses that begin with the utterance and end at this state at that frame, but that differ in the previous word to the word to which this state belongs so as to provide m previous-word theories having respective identities, wherein m is an integer;
d. recording at each frame for the last state of each word the accumulated probability scores together with the identities of the respective previous-word theories;
e. starting the first state of each word with the probability score of each of n most likely respective previous-word theories and each said word according to a grammar model, wherein n is an integer; and
f. at the end of the utterance reassembling N likely different sentence hypotheses that have the highest accumulated scores using the recorded probability scores and previous-word theories recorded in step d so as to provide the N-most likely sentence hypotheses, wherein N is an integer.
Dependent claims: 2-19.
20. A system for producing N-most likely sentence hypotheses defined as word sequences of one or more words from a limited vocabulary speech signal, each word having a set of states including a distinguished first and last state, said system comprising:
a. means for dividing the speech signal of an utterance into frames and generating for each frame at least one vector that characterizes the speech signal;
b. means for computing for each frame for selected states in selected words, the probability of a sequence of vectors up to each such frame, given a most likely partial sentence hypothesis that begins with the utterance and ends with that state at that frame;
c. means for accumulating at each of said selected states a separate probability score for each of m most likely different partial sentence hypotheses that begin with the utterance and end at this state at that frame, but that differ in the previous word to the word to which this state belongs so as to provide m previous-word theories having respective identities, wherein m is an integer;
d. means for recording at each frame for the last state of each word the accumulated probability scores together with the identities of the respective previous-word theories;
e. means for starting the first state of each word with the probability score of each of n most likely respective previous-word theories and each said word according to a grammar model, wherein n is an integer; and
f. means for reassembling at the end of the utterance the N likely different sentence hypotheses that have the highest accumulated scores using the recorded probability scores and previous-word theories recorded in step d so as to provide the N-most likely sentence hypotheses, wherein N is an integer.
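Elements c and e of the claims can be illustrated with a small Python sketch. It is an interpretation under stated assumptions, not the claimed implementation: scores are kept as plain probabilities for readability (a practical system would more likely work with log scores), scores arriving for the same previous word are summed, and the grammar model is assumed to be a bigram table. The function names `update_state` and `start_word` and all parameter names are hypothetical.

```python
import heapq

def update_state(predecessors, emission_prob, m):
    """Element c (sketch): keep the m best previous-word theories at a state.

    predecessors: list of (theories, transition_prob) pairs, where
                  theories maps previous_word -> probability score.
    emission_prob: probability of the current frame's vector at this state.
    """
    totals = {}
    for theories, trans in predecessors:
        for prev_word, score in theories.items():
            # Assumption: paths sharing a previous word are summed.
            totals[prev_word] = totals.get(prev_word, 0.0) + score * trans
    kept = heapq.nlargest(m, totals.items(), key=lambda kv: kv[1])
    return {w: s * emission_prob for w, s in kept}

def start_word(word, word_end_scores, bigram, n):
    """Element e (sketch): seed a word's first state from n previous words.

    word_end_scores: previous_word -> score at that word's last state.
    bigram: (previous_word, word) -> grammar probability (assumed model).
    """
    cands = {
        prev: score * bigram.get((prev, word), 0.0)
        for prev, score in word_end_scores.items()
    }
    kept = heapq.nlargest(n, cands.items(), key=lambda kv: kv[1])
    # Drop theories the grammar rules out entirely.
    return {prev: s for prev, s in kept if s > 0.0}

# Toy numbers, for illustration only.
self_loop = {"show": 0.5, "so": 0.25, "go": 0.125}
from_prev_state = {"show": 0.5, "so": 0.25}
state = update_state([(self_loop, 0.5), (from_prev_state, 0.5)], 0.5, m=2)

seeds = start_word(
    "flights",
    {"show": 0.5, "so": 0.25, "go": 0.125},
    {("show", "flights"): 0.5, ("so", "flights"): 0.25},
    n=2,
)
```

In this toy run the theory whose previous word is "go" is pruned at the state because only m = 2 theories are kept, and it never seeds "flights" because the assumed bigram table gives it zero probability.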
Specification