N-gram spotting followed by matching continuation tree forward and backward from a spotted n-gram
Abstract
A speech recognition method obtains a list of target speech element sequences, each containing at least one speech element. For each target speech element sequence, a forward sequence extension model and a backward sequence extension model are obtained. At least one spotted target speech element sequence is found in a set of acoustic observations by matching its sequence of speech element models against the set of acoustic observations. From the set of acoustic observations, the set of acoustic observations preceding and the set of acoustic observations following the at least one spotted target speech element sequence are obtained. At least one hypothesis of a longer speech element sequence is obtained; the longer sequence contains the at least one spotted speech element sequence as a proper subsequence and is consistent with at least one of the forward sequence extension model and the backward sequence extension model. The hypothesis of a longer speech element sequence is evaluated based on the degree of acoustic match between the longer speech element sequence and at least one of the set of preceding acoustic observations and the set of following acoustic observations.
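The spotting step and the split into preceding and following observations can be illustrated with a minimal sketch. It is not the patented implementation: the scalar templates in `ELEMENT_MODELS`, the one-frame-per-element alignment, the squared-distance score, and the `threshold` value are all assumptions chosen to keep the example short.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy acoustic models: one scalar template per speech element.
ELEMENT_MODELS: Dict[str, float] = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}


def match_score(elements: Sequence[str], frames: Sequence[float]) -> float:
    """Negative squared distance between frames and templates (higher is better)."""
    return -sum((f - ELEMENT_MODELS[e]) ** 2 for e, f in zip(elements, frames))


def spot(target: Sequence[str], observations: Sequence[float],
         threshold: float = -0.1) -> List[Tuple[int, int, float]]:
    """Slide the target's element models over the observations.

    Assumes one frame per element; returns (start, end, score) for every
    window whose score reaches the (hypothetical) threshold.
    """
    n = len(target)
    hits = []
    for start in range(len(observations) - n + 1):
        score = match_score(target, observations[start:start + n])
        if score >= threshold:
            hits.append((start, start + n, score))
    return hits


def split_context(observations: Sequence[float], start: int,
                  end: int) -> Tuple[List[float], List[float]]:
    """Return the observations preceding and following a spotted span."""
    return list(observations[:start]), list(observations[end:])


observations = [1.1, 2.0, 3.1, 4.0]
hits = spot(["b", "c"], observations)
start, end, score = hits[0]
preceding, following = split_context(observations, start, end)
print(start, end, preceding, following)
```

A real recognizer would align a variable number of frames to each element, for example with hidden Markov models, rather than assuming one frame per element.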
14 Citations
25 Claims
1. A speech recognition method comprising:
obtaining a set of acoustic observations;
obtaining a list of target speech element sequences each containing at least one speech element;
for each target speech element sequence obtaining a forward sequence extension model and a backward sequence extension model;
spotting at least one spotted target speech element sequence by matching the sequence of speech element models against the set of acoustic observations;
obtaining from the set of acoustic observations the set of acoustic observations preceding the said at least one spotted target speech element sequence and the set of acoustic observations following the said at least one spotted target speech element sequence;
obtaining at least one hypothesis of a longer speech element sequence containing the said at least one spotted speech element sequence as a proper subsequence in which said at least one longer speech element sequence is consistent with at least one of said forward sequence extension model and said backward sequence extension model for said at least one spotted speech element sequence; and
evaluating said at least one hypothesis of a longer speech element sequence based on the degree of acoustic match between said longer speech element sequence and at least one of said set of acoustic observations preceding the said at least one spotted target speech element sequence and the set of acoustic observations following the said at least one spotted target speech element sequence.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
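The last two steps of claim 1, hypothesizing longer sequences and evaluating them against the surrounding observations, might be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: the extension models are represented as plain lists of allowed element sequences, `TEMPLATES` is a toy acoustic model, and `UNEXPLAINED_PENALTY` is a hypothetical cost per context frame left uncovered, added so that short extensions are not favored automatically.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Hypothetical toy acoustic models: one scalar template per speech element.
TEMPLATES = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 5.0}
# Assumed cost for each context frame that the hypothesis does not cover.
UNEXPLAINED_PENALTY = 1.0


def acoustic_match(elements: Sequence[str], frames: Sequence[float]) -> float:
    """Negative squared distance, one frame per element (higher is better)."""
    return -sum((f - TEMPLATES[e]) ** 2 for e, f in zip(elements, frames))


def extensions(backward_ext: List[List[str]],
               forward_ext: List[List[str]]) -> List[Tuple[tuple, tuple]]:
    """All (prefix, suffix) pairs allowed by the two extension models.

    The empty/empty pair is excluded, so the spotted sequence is always a
    proper subsequence of the resulting hypothesis.
    """
    prefixes = [()] + [tuple(p) for p in backward_ext]
    suffixes = [()] + [tuple(s) for s in forward_ext]
    return [(p, s) for p, s in product(prefixes, suffixes) if p or s]


def evaluate(prefix: tuple, suffix: tuple,
             preceding: Sequence[float], following: Sequence[float]) -> float:
    """Score a hypothesis against the observations around the spotted span."""
    if len(prefix) > len(preceding) or len(suffix) > len(following):
        return float("-inf")
    pre = preceding[len(preceding) - len(prefix):]  # frames just before the span
    post = following[:len(suffix)]                  # frames just after the span
    uncovered = (len(preceding) - len(prefix)) + (len(following) - len(suffix))
    return (acoustic_match(prefix, pre) + acoustic_match(suffix, post)
            - UNEXPLAINED_PENALTY * uncovered)


spotted = ("b", "c")
preceding, following = [1.1], [4.0]
candidates = extensions(backward_ext=[["a"], ["d"]], forward_ext=[["d"], ["e"]])
scored = [(evaluate(p, s, preceding, following), p + spotted + s)
          for p, s in candidates]
best_score, best_sequence = max(scored)
print(best_sequence)
```

Hypotheses that extend in only one direction are kept, which mirrors the "at least one of" wording for the forward and backward models.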
11. A speech recognition system, comprising:
means for obtaining a list of target speech element sequences from a set of acoustic observations, each said target speech element sequence containing at least one speech element;
means for obtaining, for each said target speech element sequence, a forward sequence extension model and a backward sequence extension model;
means for spotting at least one spotted target speech element sequence by matching the sequence of speech element models against the set of acoustic observations;
means for obtaining, from the set of acoustic observations, the set of acoustic observations preceding the said at least one spotted target speech element sequence and the set of acoustic observations following the said at least one spotted target speech element sequence;
means for obtaining at least one hypothesis of a longer speech element sequence containing the said at least one spotted speech element sequence as a proper subsequence in which said at least one longer speech element sequence is consistent with at least one of said forward sequence extension model and said backward sequence extension model for said at least one spotted speech element sequence; and
means for evaluating said at least one hypothesis of a longer speech element sequence based on the degree of acoustic match between said longer speech element sequence and at least one of said set of acoustic observations preceding the said at least one spotted target speech element sequence and the set of acoustic observations following the said at least one spotted target speech element sequence.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A program product having machine readable code for performing speech recognition, the program code, when executed, causing a machine to perform the following steps:
obtaining a list of target speech element sequences each containing at least one speech element;
for each target speech element sequence obtaining a forward sequence extension model and a backward sequence extension model;
spotting at least one spotted target speech element sequence in a set of acoustic observations by matching the sequence of speech element models against the set of acoustic observations;
obtaining from the set of acoustic observations the set of acoustic observations preceding the said at least one spotted target speech element sequence and the set of acoustic observations following the said at least one spotted target speech element sequence;
obtaining at least one hypothesis of a longer speech element sequence containing the said at least one spotted speech element sequence as a proper subsequence in which said at least one longer speech element sequence is consistent with at least one of said forward sequence extension model and said backward sequence extension model for said at least one spotted speech element sequence; and
evaluating said at least one hypothesis of a longer speech element sequence based on the degree of acoustic match between said longer speech element sequence and at least one of said set of acoustic observations preceding the said at least one spotted target speech element sequence and the set of acoustic observations following the said at least one spotted target speech element sequence.
Dependent claims: 22, 23, 24
25. A speech recognition method, comprising:
receiving a set of acoustic observations, and performing a speech recognition on the set of acoustic observations;
at the same time the speech recognition is being performed, determining whether or not an n-gram of speech elements occurs in the set of acoustic observations, wherein n is an integer greater than or equal to one;
if the determination is that an n-gram occurs, then performing at least one of a backward search and a forward search using a continuation tree that represents allowable continuations in a grammar that may precede or follow the spotted n-gram; and
determining a best matching path in the continuation tree with respect to the set of acoustic observations.
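The continuation-tree search of claim 25 can be pictured with a small sketch. Everything here is an assumption made for illustration: the grammar is a list of allowed element sequences, each tree is a nested dictionary (a trie) with a hypothetical `END` marker, the backward tree stores preceding elements in reverse order, and the acoustic score is a toy squared distance with one frame per element.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical toy acoustic models: one scalar template per speech element.
TEMPLATES = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 5.0}
END = "$"  # assumed marker for "a continuation may stop here"


def insert(tree: Dict, path: Sequence[str]) -> None:
    """Add one continuation to a trie of nested dictionaries."""
    node = tree
    for element in path:
        node = node.setdefault(element, {})
    node[END] = {}


def build_trees(grammar: List[List[str]],
                ngram: Sequence[str]) -> Tuple[Dict, Dict]:
    """Build forward and backward continuation trees for a spotted n-gram."""
    forward: Dict = {}
    backward: Dict = {}
    n = len(ngram)
    for sentence in grammar:
        for i in range(len(sentence) - n + 1):
            if list(sentence[i:i + n]) == list(ngram):
                insert(forward, sentence[i + n:])
                insert(backward, list(reversed(sentence[:i])))  # walked backward in time
    return forward, backward


def best_path(tree: Dict,
              frames: Sequence[float]) -> Optional[Tuple[float, List[str]]]:
    """Depth-first search for the best complete path that consumes all frames."""
    best: Optional[Tuple[float, List[str]]] = None

    def walk(node: Dict, depth: int, score: float, path: List[str]) -> None:
        nonlocal best
        if depth == len(frames):
            if END in node and (best is None or score > best[0]):
                best = (score, list(path))
            return
        for element, child in node.items():
            if element == END:
                continue
            path.append(element)
            walk(child, depth + 1,
                 score - (frames[depth] - TEMPLATES[element]) ** 2, path)
            path.pop()

    walk(tree, 0, 0.0, [])
    return best


grammar = [["d", "a", "b", "c", "e"], ["e", "a", "b", "c", "d"], ["a", "b", "c", "d"]]
spotted = ["b", "c"]
preceding, following = [4.0, 1.1], [4.9]

forward_tree, backward_tree = build_trees(grammar, spotted)
fwd = best_path(forward_tree, following)
bwd = best_path(backward_tree, list(reversed(preceding)))  # search backward in time
sentence = list(reversed(bwd[1])) + spotted + fwd[1]
print(sentence)
```

Requiring a path to consume every frame is a simplification; a practical search would allow variable-duration elements and prune low-scoring branches.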
Specification