AUTOMATIC SPEECH RECOGNITION WITH TEXTUAL CONTENT INPUT
Abstract
A method of recognizing speech includes extracting textual content from a visual content time segment associated with a rich media presentation. A textual content input comprising a word from the extracted textual content is created. The textual content input is provided to an automatic speech recognition algorithm such that there is an increased probability that the automatic speech recognition algorithm recognizes the word within an audio content time segment associated with the rich media presentation.
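As a minimal sketch of the idea in the abstract, assume the recognizer exposes an N-best list of scored hypotheses. The source does not say how the probability is increased, so the rescoring below is only one illustrative possibility: it adds a fixed log-score bonus for each hypothesis word that also appears in the textual content input. The names `build_textual_input` and `rescore`, the `boost` value, and the sample scores are all hypothetical.

```python
import re


def build_textual_input(extracted_text):
    """Turn text extracted from visual content (e.g. slide OCR) into a set of lowercase words."""
    return set(re.findall(r"[a-z0-9']+", extracted_text.lower()))


def rescore(nbest, textual_input, boost=2.0):
    """Pick the best transcript after boosting words found in textual_input.

    nbest: list of (transcript, log_score) pairs from a hypothetical recognizer.
    Each hypothesis word present in textual_input adds `boost` to the log score.
    """
    def boosted(item):
        transcript, log_score = item
        hits = sum(1 for w in transcript.lower().split() if w in textual_input)
        return log_score + boost * hits

    return max(nbest, key=boosted)[0]


# Illustrative data only: text taken from a slide and two competing hypotheses.
slide_text = "Quarterly results: EBITDA margin"
nbest = [
    ("our e bit the margin grew", -10.0),
    ("our ebitda margin grew", -11.5),
]
best = rescore(nbest, build_textual_input(slide_text))
```

With an empty textual content input the first hypothesis keeps the higher score; with the slide words supplied, the hypothesis containing "ebitda" is preferred instead.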
31 Claims
1. A method of recognizing speech, the method comprising:
(a) extracting textual content from a visual content time segment associated with a rich media presentation;
(b) creating a textual content input comprising a word from the extracted textual content; and
(c) providing the textual content input to an automatic speech recognition algorithm such that there is an increased probability that the automatic speech recognition algorithm recognizes the word within an audio content time segment associated with the rich media presentation.
Dependent claims: 2-25

26. A computer-readable medium having computer-readable instructions stored thereon that, upon execution by a processor, cause the processor to recognize speech, the instructions configured to:
(a) create a textual content input comprising a word, wherein the word is obtained from textual content extracted from a visual content time segment associated with a rich media presentation; and
(b) provide the textual content input to an automatic speech recognition algorithm such that there is an increased probability that the automatic speech recognition algorithm recognizes the word within an audio content time segment associated with the rich media presentation.
Dependent claims: 27

28. A method of recognizing speech, the method comprising:
(a) creating a textual content input comprising a word obtained from textual metadata content associated with a rich media presentation; and
(b) providing the textual content input to an automatic speech recognition algorithm such that there is an increased probability that the automatic speech recognition algorithm recognizes the word within an audio content time segment associated with the rich media presentation.
Dependent claims: 29

30. A system for recognizing speech comprising:
(a) an automatic speech recognition application, wherein the automatic speech recognition application comprises computer code configured to receive a textual content input comprising a word, wherein the word is obtained from textual content extracted from a visual content time segment associated with a rich media presentation; and use the textual content input to increase a probability that the word is recognized within an audio content time segment associated with the rich media presentation;
(b) a memory configured to store the automatic speech recognition application; and
(c) a processor coupled to the memory, wherein the processor is configured to execute the automatic speech recognition application.

31. A method of recognizing speech, the method comprising:
(a) extracting textual content from audiovisual content;
(b) creating a textual content input comprising a word from the extracted textual content; and
(c) providing the textual content input to an automatic speech recognition algorithm such that there is an increased probability that the automatic speech recognition algorithm recognizes the word within audio from the audiovisual content.
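
Several of the claims above tie the extracted words to time segments. One way to picture this, assuming each visual content time segment (for example, a slide) carries start and end times along with its extracted text, is to gather the words from every visual segment that overlaps a given audio content time segment. The `VisualSegment` type, the overlap rule, and the function name are hypothetical; the claims do not prescribe how segments are matched.

```python
from dataclasses import dataclass


@dataclass
class VisualSegment:
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    text: str     # text extracted from this segment, e.g. slide OCR output


def words_for_audio_segment(audio_start, audio_end, visual_segments):
    """Collect lowercase words from visual segments overlapping [audio_start, audio_end)."""
    words = set()
    for seg in visual_segments:
        # Two half-open intervals overlap when each starts before the other ends.
        if seg.start < audio_end and audio_start < seg.end:
            words.update(seg.text.lower().split())
    return words


# Illustrative data only: two slides shown back to back.
slides = [
    VisualSegment(0.0, 30.0, "Welcome agenda"),
    VisualSegment(30.0, 90.0, "Phoneme lattice"),
]
hints = words_for_audio_segment(40.0, 50.0, slides)
```

An audio segment that straddles a slide change would receive the words from both slides, which may or may not be desirable depending on how the recognizer uses the input.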
Specification