Systems and methods for speech indexing
Abstract
A speech index for a recording or other representation of an audio signal containing speech is generated using a phonetic automatic voice recognition engine. A second speech index is also generated using a more accurate, but slower, automatic voice recognition engine such as a large vocabulary speech recognition (LVSR) engine. These two speech indexes are compared. The results of the comparison are then used to adjust certain parameters used by the phonetic engine while generating a speech index. The results may also be used to correct all or parts of the speech index generated by the phonetic automatic speech recognition engine.
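The compare, adjust, and correct loop described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `Entry` type, the nearest-position alignment, the `max_gap` tolerance, and the treatment of the tunable parameter as a single number nudged by a fixed `step` are all assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    """One speech index entry (hypothetical shape): a word and its time in seconds."""
    word: str
    position: float


def align(phonetic, lvsr, max_gap=0.5):
    """Pair each phonetic entry with the nearest LVSR entry within max_gap seconds."""
    pairs = []
    for p in phonetic:
        nearest = min(lvsr, key=lambda v: abs(v.position - p.position), default=None)
        if nearest is not None and abs(nearest.position - p.position) <= max_gap:
            pairs.append((p, nearest))
    return pairs


def adjust_parameter(parameter, pairs, step=0.05):
    """Assumed rule: raise the parameter when most aligned pairs disagree, else lower it."""
    if not pairs:
        return parameter
    mismatches = sum(p.word != v.word for p, v in pairs)
    if mismatches * 2 > len(pairs):
        return min(1.0, parameter + step)
    return max(0.0, parameter - step)


def correct_index(phonetic, pairs):
    """Replace phonetic words that the LVSR index contradicts; keep everything else."""
    fixes = {p: v.word for p, v in pairs if p.word != v.word}
    return [replace(p, word=fixes[p]) if p in fixes else p for p in phonetic]


phonetic_index = [Entry("cancel", 1.0), Entry("a count", 3.0), Entry("bill", 5.0)]
lvsr_index = [Entry("cancel", 1.1), Entry("account", 3.2)]

pairs = align(phonetic_index, lvsr_index)
new_parameter = adjust_parameter(0.5, pairs)
corrected = correct_index(phonetic_index, pairs)
```

Here the entry at 5.0 seconds has no LVSR counterpart within the tolerance, so it is neither compared nor corrected.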
14 Claims
1. A method of indexing speech, comprising:
associating a first phonetic sequence with a first position in an audio signal using a phonetic recognizer;
associating said first phonetic sequence to a first linguistic element based on a first parameter;
associating a second linguistic element with a second position in said audio signal using a large vocabulary speech recognizer (LVSR);
comparing said first position and said second position to determine a phrase window;
comparing said first linguistic element to said second linguistic element if said phrase window meets a first criteria; and
adjusting said first parameter based upon a result of said step of comparing said first linguistic element
wherein said step of associating said second linguistic element is performed on a lesser portion of said audio signal than said step of associating said first phonetic sequence with said first position;
wherein said step of associating said first phonetic sequence to said first linguistic element also associates said first linguistic element with a confidence value and said lesser portion of said audio signal is selected to correspond to said first linguistic element based upon said confidence value.
Dependent claims: 2, 3, 4, 5.
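Two steps of claim 1 lend themselves to a short illustration: selecting the lesser portion of the audio by confidence value, and testing the phrase window. The sketch below is one possible reading, not the claimed method itself. It assumes each phonetic entry is a `(position, duration, confidence)` tuple in seconds, that "low confidence" means falling below a hypothetical `min_confidence`, and that the phrase window is the time gap between the two positions, which meets the criterion when it does not exceed a hypothetical `max_window`.

```python
def low_confidence_spans(entries, min_confidence=0.6, padding=0.25):
    """Return merged (start, end) spans, in seconds, around low-confidence entries.

    Only these spans would be handed to the slower LVSR engine.
    """
    spans = []
    for position, duration, confidence in sorted(entries):
        if confidence >= min_confidence:
            continue  # the phonetic result is trusted; skip LVSR here
        start = max(0.0, position - padding)
        end = position + duration + padding
        if spans and start <= spans[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans


def window_meets_criterion(first_position, second_position, max_window=0.5):
    """Assumed criterion: the two positions lie within max_window seconds of each other."""
    return abs(first_position - second_position) <= max_window


entries = [(1.0, 0.4, 0.9), (3.0, 0.5, 0.4), (3.6, 0.3, 0.3), (10.0, 0.2, 0.5)]
spans = low_confidence_spans(entries)
```

With these sample values, the two adjacent low-confidence entries merge into one span and the high-confidence entry at 1.0 seconds is never sent to the LVSR engine.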
6. A system for indexing speech, comprising:
a phonetic decoder that associates audio features of an audio signal with a first phonetic sequence at a first position in said audio signal;
a lexical interpreter that associates said first phonetic sequence with a first linguistic element based on a first parameter;
a large vocabulary speech recognizer that associates a second linguistic element with a second position in said audio signal;
a speech index comparator that compares said first position and said second position to determine a phrase window; and, said speech index comparator also compares said first linguistic element to said second linguistic element if said phrase window meets a first criteria; and
a parameter adjuster that adjusts said first parameter based upon a result of said speech index comparator
wherein said large vocabulary speech recognizer performs said association on a lesser portion of said audio signal than said phonetic decoder;
wherein said lexical interpreter also associates said first linguistic element with a confidence value and said lesser portion of said audio signal is selected to correspond to said first linguistic element based upon said confidence value.
Dependent claims: 7, 8, 9, 10.
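To make the division of labor among the components of claim 6 concrete, the sketch below wires them together with stand-in callables. Every name, signature, and default value here is hypothetical, and the engines are stubs that return canned results; the claim does not prescribe an interface, an update rule, or that a mismatch overwrites the phonetic result.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SpeechIndexer:
    # audio -> [(phonetic sequence, position in seconds)]
    phonetic_decoder: Callable[[bytes], List[Tuple[str, float]]]
    # (phonetic sequence, parameter) -> (linguistic element, confidence)
    lexical_interpreter: Callable[[str, float], Tuple[str, float]]
    # (audio, position) -> (linguistic element, position in seconds)
    lvsr: Callable[[bytes, float], Tuple[str, float]]
    parameter: float = 0.5       # the tunable "first parameter" (assumed numeric)
    min_confidence: float = 0.6  # below this, the span is sent to the LVSR
    max_window: float = 0.5      # assumed phrase-window criterion, in seconds
    step: float = 0.05           # assumed adjustment size

    def index(self, audio: bytes) -> List[Tuple[str, float, float]]:
        entries = []
        for phones, position in self.phonetic_decoder(audio):
            word, confidence = self.lexical_interpreter(phones, self.parameter)
            if confidence < self.min_confidence:
                # The LVSR runs only on the low-confidence portion of the audio.
                lvsr_word, lvsr_position = self.lvsr(audio, position)
                # Comparator: check the phrase window, then the elements.
                if abs(position - lvsr_position) <= self.max_window and lvsr_word != word:
                    # Parameter adjuster, plus correction of the index entry.
                    self.parameter = min(1.0, self.parameter + self.step)
                    word = lvsr_word
            entries.append((word, position, confidence))
        return entries


# Stub engines with canned output, for illustration only.
def fake_decoder(audio):
    return [("k ae n s ah l", 1.0), ("ah k aw n t", 3.0)]


def fake_interpreter(phones, parameter):
    return ("cancel", 0.9) if phones.startswith("k") else ("a count", 0.4)


def fake_lvsr(audio, position):
    return ("account", position + 0.1)


indexer = SpeechIndexer(fake_decoder, fake_interpreter, fake_lvsr)
result = indexer.index(b"")
```

Passing the engines in as callables keeps the comparator and adjuster logic independent of any particular recognition engine.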
11. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for indexing speech, comprising:
associating a first phonetic sequence with a first position in an audio signal using a phonetic recognizer;
associating said first phonetic sequence to a first linguistic element based on a first parameter;
associating a second linguistic element with a second position in said audio signal using a large vocabulary speech recognizer (LVSR);
comparing said first position and said second position to determine a phrase window;
comparing said first linguistic element to said second linguistic element if said phrase window meets a first criteria; and
adjusting said first parameter based upon a result of said step of comparing said first linguistic element
wherein said step of associating said second linguistic element is performed on a lesser portion of said audio signal than said step of associating said first phonetic sequence with said first position;
wherein said step of associating said first phonetic sequence to said first linguistic element also associates said first linguistic element with a confidence value and said lesser portion of said audio signal is selected to correspond to said first linguistic element based upon said confidence value.
Dependent claims: 12, 13, 14.
Specification