Time-anchored posterior indexing of speech
Abstract
A computer-implemented method of indexing a speech lattice for search of audio corresponding to the speech lattice is provided. The method includes identifying at least two speech recognition hypotheses for a word which have time ranges satisfying a criteria. The method further includes merging the at least two speech recognition hypotheses to generate a merged speech recognition hypothesis for the word.
77 Citations
16 Claims
1. A computer-implemented method of indexing a speech lattice for search of audio corresponding to the speech lattice, the method comprising:
- using a processor to identify at least two speech recognition hypotheses for a particular word which have time ranges satisfying a criteria, each of the at least two speech recognition hypotheses for the particular word having an associated start time, an associated end time, and an associated probability, at least some of the at least two speech recognition hypotheses having different associated start and/or associated finish times from each other, and satisfying the criteria requires that the at least two speech recognition hypotheses for the particular word have start times that are within a predetermined range of each other, and end times that are within a predetermined range of each other;
- using the processor to merge the at least two speech recognition hypotheses, at least some of which having different associated start and/or associated finish times from each other, to generate a merged speech recognition hypothesis for the particular word such that start and end times for the merged speech recognition hypothesis are the same as start and end times for a best of the at least two speech recognition hypotheses, wherein merging the at least two speech recognition hypotheses to generate the merged speech recognition hypothesis for the particular word further comprises combining the associated probabilities of the at least two speech recognition hypotheses for the particular word which have time ranges satisfying the criteria; and
- storing an index entry to represent the merged speech recognition hypothesis for the particular word.
Dependent claims: 2, 3, 4, 5, 6, 7
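The three steps of claim 1 (identify, merge, store) can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Hypothesis`, `satisfies_criteria`, `merge`, and `store` are hypothetical, "best" is assumed to mean the highest-probability hypothesis, and "combining" is assumed to mean summing the probabilities (capped at 1.0), since the claim does not fix either choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One word hypothesis (hypothetical structure for illustration)."""
    word: str
    start: float  # start time in seconds
    end: float    # end time in seconds
    prob: float   # associated probability


def satisfies_criteria(a, b, start_range, end_range):
    """Same word, start times within start_range, end times within end_range."""
    return (a.word == b.word
            and abs(a.start - b.start) <= start_range
            and abs(a.end - b.end) <= end_range)


def merge(hyps):
    """Merge hypotheses that are already known to satisfy the criteria."""
    # Assumption: the "best" hypothesis is the one with the highest probability.
    best = max(hyps, key=lambda h: h.prob)
    # Assumption: probabilities are combined by summing, capped at 1.0.
    combined = min(1.0, sum(h.prob for h in hyps))
    # The merged hypothesis inherits the best hypothesis's time anchors.
    return Hypothesis(best.word, best.start, best.end, combined)


def store(index, merged):
    """Store one index entry per merged hypothesis, keyed by word."""
    index.setdefault(merged.word, []).append(
        (merged.start, merged.end, merged.prob))


# Usage: two hypotheses for the same word with slightly different times.
a = Hypothesis("speech", 1.20, 1.65, 0.5)
b = Hypothesis("speech", 1.25, 1.70, 0.25)
index = {}
if satisfies_criteria(a, b, start_range=0.1, end_range=0.1):
    store(index, merge([a, b]))
```

In this sketch the merged entry keeps the time range of `a`, the more probable hypothesis, while its probability reflects both inputs.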
8. A computer-implemented method comprising:
- accessing a speech lattice representing a plurality of speech recognition hypotheses for a portion of speech data, the plurality of speech recognition hypotheses including a plurality of word hypotheses for a plurality of words in the portion of speech data, each word hypothesis of the plurality of word hypotheses including an n-tuple representing a start time associated with the word hypothesis, an end time associated with the word hypothesis, a word ID that identifies a particular word represented by the word hypothesis, and an associated probability for the word hypothesis;
- selecting a set of word hypotheses, from the plurality of word hypotheses, that are hypotheses for a same word in the portion of speech data and that have start and end times that satisfy a criteria, the set of word hypotheses being selected using the word IDs, start times, and end times of the n-tuples for the plurality of word hypotheses, wherein each word hypothesis in the set that satisfy the criteria has an associated start time within a first predetermined range of the start times of all other word hypotheses in the set and has an associated end time within a second predetermined range of the end times of all other word hypotheses in the set, and at least two word hypotheses in the set have different associated start times and/or different associated end times from each other; and
- generating, using a processor of a computer, a merged word hypothesis for the same word in the portion of speech data by merging the set of word hypotheses, wherein generating comprises:
  - merging the at least two word hypotheses in the set having different associated start times and/or different associated end times from each other;
  - assigning start and end times to the merged word hypothesis that are the same as the start and end times associated with the word hypothesis in the set having a highest probability; and
  - assigning a probability to the merged word hypothesis by combining the associated probabilities of the merged set of word hypotheses; and
- storing an index entry to represent the merged word hypothesis.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16
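The selection step of claim 8 requires every hypothesis in a set to lie within range of all other members, not just of one neighbour. The sketch below shows one way such sets might be formed from the lattice n-tuples. The greedy grouping strategy and the names `WordHyp` and `select_sets` are assumptions made for illustration; the claim does not prescribe how the sets are found.

```python
from collections import defaultdict
from typing import NamedTuple


class WordHyp(NamedTuple):
    """n-tuple for one word hypothesis (hypothetical field layout)."""
    start: float   # start time in seconds
    end: float     # end time in seconds
    word_id: int   # identifies the word
    prob: float    # associated probability


def select_sets(lattice, start_range, end_range):
    """Greedily group same-word hypotheses so that every member of a set
    is within start_range / end_range of all other members of that set."""
    by_word = defaultdict(list)
    for h in lattice:
        by_word[h.word_id].append(h)

    sets = []
    for hyps in by_word.values():
        groups = []
        for h in sorted(hyps, key=lambda x: (x.start, x.end)):
            for g in groups:
                # Check the candidate against ALL current members of the set.
                if all(abs(h.start - m.start) <= start_range
                       and abs(h.end - m.end) <= end_range for m in g):
                    g.append(h)
                    break
            else:
                # No compatible set found, so start a new one.
                groups.append([h])
        sets.extend(groups)
    return sets


# Usage: three hypotheses for word 7 and one for word 9.
lattice = [
    WordHyp(1.00, 1.50, 7, 0.5),
    WordHyp(1.05, 1.55, 7, 0.25),
    WordHyp(3.00, 3.40, 7, 0.125),
    WordHyp(1.02, 1.48, 9, 0.5),
]
sets = select_sets(lattice, start_range=0.1, end_range=0.1)
```

Sets with two or more members are candidates for merging; single-member sets would simply be indexed as they are. A greedy pass like this can depend on processing order, so a production system might prefer a different grouping strategy.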
Specification