Speech recognition for internet video search and navigation
First Claim
1. A television comprising:
an audio video receiver;
a display;
circuitry configured to:
receive speech signals representing a video site or video subject;
implement speech recognition on received speech signals to generate recognized speech data representing a video site or video subject;
using the recognized speech data representing the video site or video subject, access at least one database including indices derived from at least digitized voice soundtracks that accompany video, or at least descriptive text that is associated with video, or at least both digitized voice soundtracks that accompany video and descriptive text that is associated with video, the indices being associated with the at least one database; and
at least one index in the indices being correlated with the recognized speech and identified as at least one matching index element from the at least one database, the matching index element being useful for providing video to the display.
Abstract
Speech representing a desired video site or video subject is detected and digitized at a TV remote, and then sent to a TV. The TV, or in some embodiments an Internet server communicating with the TV, uses speech recognition principles to recognize the speech, enters a database using the recognized speech as the entering argument, and returns a link to an Internet site hosting the desired video. The link can be displayed on the TV for selection by a user to retrieve the video.
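The lookup step described in the abstract can be illustrated with a minimal sketch. It assumes the speech has already been recognized into text, and it uses a simple word-level inverted index built from descriptive text and soundtrack transcripts. The catalogue entries, the URLs, and the names `build_index` and `lookup` are hypothetical and are not taken from the patent, which does not prescribe a particular index structure or ranking rule.

```python
from collections import defaultdict

# Hypothetical catalogue: each entry pairs a link with descriptive text and
# a transcript of the soundtrack. All URLs and text here are made up.
VIDEOS = [
    {"link": "http://video.example/cooking/101",
     "description": "easy pasta recipes",
     "transcript": "today we boil pasta and make tomato sauce"},
    {"link": "http://video.example/sports/202",
     "description": "soccer highlights",
     "transcript": "a great goal in the final minute"},
]

def build_index(videos):
    """Map each word in the description or transcript to the links using it."""
    index = defaultdict(set)
    for video in videos:
        text = video["description"] + " " + video["transcript"]
        for word in text.lower().split():
            index[word].add(video["link"])
    return index

def lookup(index, recognized_speech):
    """Use the recognized speech as the entering argument into the index.

    Returns links ordered by how many spoken words matched (an assumed,
    deliberately simple ranking), with ties broken alphabetically.
    """
    scores = defaultdict(int)
    for word in recognized_speech.lower().split():
        for link in index.get(word, ()):
            scores[link] += 1
    return sorted(scores, key=lambda link: (-scores[link], link))

INDEX = build_index(VIDEOS)
```

Calling `lookup(INDEX, "pasta sauce")` returns a list of candidate links that a TV could display for selection. A real system would need to tolerate recognition errors, which this exact-word match does not.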
32 Citations
26 Claims
1. A television comprising:
an audio video receiver;
a display;
circuitry configured to:
receive speech signals representing a video site or video subject;
implement speech recognition on received speech signals to generate recognized speech data representing a video site or video subject;
using the recognized speech data representing the video site or video subject, access at least one database including indices derived from at least digitized voice soundtracks that accompany video, or at least descriptive text that is associated with video, or at least both digitized voice soundtracks that accompany video and descriptive text that is associated with video, the indices being associated with the at least one database; and
at least one index in the indices being correlated with the recognized speech and identified as at least one matching index element from the at least one database, the matching index element being useful for providing video to the display.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A television comprising:
an audio video device (AVD);
a remote control;
wherein the remote control comprises circuitry to digitize received speech and send the digitized speech to the AVD;
wherein the AVD comprises circuitry to:
generate wireless commands to an audio video device (AVD);
receive digitized speech and generate recognized speech from the digitized speech, the recognized speech being associated with a video;
using the recognized speech as entering argument, access a data structure correlating speech associated with video to computer storage locations of stored video, the data structure comprising at least one index derived from at least digitized voice soundtracks that accompany video, or at least descriptive text that is associated with video, or at least both digitized voice soundtracks that accompany video and descriptive text that is associated with video; and
retrieving, from the data structure, at least an identification of at least one video correlated to a match of the recognized speech.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
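Claim 11 has the remote control digitize the received speech before sending it to the AVD. The claim does not name an encoding, so the following is only a sketch under the assumption of 16-bit signed little-endian PCM; the function names `digitize` and `undigitize` are hypothetical.

```python
import struct

def digitize(samples):
    """Quantize analog-style samples in [-1.0, 1.0] to 16-bit PCM bytes.

    Assumption: little-endian signed 16-bit samples. Out-of-range input is
    clipped rather than wrapped.
    """
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack("<%dh" % len(ints), *ints)

def undigitize(payload):
    """Recover the integer sample values on the receiving (AVD) side."""
    count = len(payload) // 2
    return list(struct.unpack("<%dh" % count, payload))
```

The byte string returned by `digitize` stands in for the payload the remote would transmit; the transport itself is outside the scope of this sketch.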
22. A machine-executed method comprising:
receiving speech signals representing a video site or video subject;
implementing speech recognition on received speech signals representing a video site or video subject to generate recognized speech;
using the recognized speech representing the video site or video subject, access at least one database including at least one index derived from at least digitized voice soundtracks that accompany video, or at least descriptive text that is associated with video, or at least both digitized voice soundtracks that accompany video and descriptive text that is associated with video; and
correlating the recognized speech with at least one element of the index identified by the accessing to identify at least one matching index element from the at least one database, the matching index element being useful for providing video to the AVD.
Dependent claims: 23, 24, 25
26. A computer-implemented method comprising:
recognizing digitized speech representing a video and generating recognized speech in response;
using the recognized speech representing a video as entering argument, access a data structure correlating speech associated with video to computer storage locations of stored video, the data structure comprising at least one index derived from at least digitized voice soundtracks that accompany video, or at least descriptive text that is associated with video, or at least both digitized voice soundtracks that accompany video and descriptive text that is associated with video;
retrieving, from the data structure, at least an identification of at least one video correlated to a match of the recognized speech.
Specification