System, method and apparatus providing collateral information for a video/audio stream
Abstract
A system and method is disclosed for performing Automatic Stream Analysis for Broadcast Information which takes speech audio as input, converts the audio stream into text using a speech recognition system, applies a variety of analyzers to the text stream to identify information elements, automatically generates queries from these information elements, and extracts data from search results that is relevant to a current program. The data is multiplexed into the broadcast signal and transmitted along with the original audio/video program. The system is fully automatic and operates in real time, allowing broadcasters to add relevant collateral information to live programming.
309 Citations
15 Claims
1. A method for providing collateral information for inclusion with an information stream, comprising steps of:
examining the information stream to recognize a presence of events that occur in the information stream, wherein said events are derived from the information stream based on one or more predetermined taxonomies, wherein the step of examining the information stream comprises the steps of automatically extracting text from the information stream, segmenting the text into sentences and a step of operating on the sentences to identify topics that correspond to topic taxonomies of the predetermined taxonomies and the presence of names of entities;
assembling a list comprised of an identified topic having a start time and an end time, as well as any named entities that occur between the start time and the end time;
assembling a query object comprised of named entities that occur between the start time and the end time of the identified topic;
searching at least one database to identify a first set of stored documents that correspond to the topic;
identifying a subset of the first set of documents that contain the named entities;
identifying a second set of documents that correspond to words found in the text;
scoring the returned documents based on a plurality of criteria and ranking the documents based on their scores;
automatically generating database queries from said derived events; and
analyzing results of said database queries so as to rank and select said results to be inserted into the information stream as information that is collateral to said derived events.
Dependent claims: 2, 3, 4, 5, 6
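The list-assembly and query-object steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Sentence` and `TopicSegment` types, the function names, and the rule that consecutive sentences sharing a topic label form one segment are all assumptions made for illustration. The sketch also assumes the upstream analyzers have already attached a topic label and named entities to each timestamped sentence.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sentence:
    """A recognized sentence with hypothetical analyzer output attached."""
    start: float                      # seconds from the start of the stream
    end: float
    topic: Optional[str]              # taxonomy topic label, None if unclassified
    entities: List[str] = field(default_factory=list)


@dataclass
class TopicSegment:
    """One entry of the assembled list: a topic with a start and end time."""
    topic: str
    start: float
    end: float
    entities: List[str] = field(default_factory=list)


def assemble_topic_list(sentences):
    """Build timed topic segments, then attach entities by time window."""
    segments = []
    # Pass 1: extend the current segment while the topic label repeats.
    # Unclassified sentences are skipped, so they do not break a run.
    for s in sentences:
        if s.topic is None:
            continue
        if segments and segments[-1].topic == s.topic:
            segments[-1].end = s.end
        else:
            segments.append(TopicSegment(s.topic, s.start, s.end))
    # Pass 2: collect every entity spoken between a segment's start and end,
    # including entities from unclassified sentences inside the window.
    for seg in segments:
        for s in sentences:
            if seg.start <= s.start and s.end <= seg.end:
                for name in s.entities:
                    if name not in seg.entities:
                        seg.entities.append(name)
    return segments


def build_query_object(segment):
    """Package a segment's topic and in-window entities as a query object."""
    return {
        "topic": segment.topic,
        "named_entities": list(segment.entities),
        "start": segment.start,
        "end": segment.end,
    }
```

Collecting entities in a second pass, by time window rather than by sentence label, mirrors the claim wording that the list holds any named entities occurring between the start time and the end time of the topic.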
7. A method for providing collateral information for multiplexing with an information stream, comprising steps of:
converting the information stream into text;
analyzing the text to identify information elements based on one or more predetermined taxonomies;
automatically generating queries from the information elements for searching at least one database;
extracting data from database search results that is relevant to the information stream, wherein the step of extracting comprises a step of ranking extracted document information based on a score derived from a free text search of a document database using the text, on a number of named entities extracted from the text that are found in the documents, and on a taxonomy path score, where the taxonomy path score represents an amount of relatedness between a taxonomy-related information element identified in the text and a tree of the predetermined taxonomies; and
multiplexing the data into the information stream for presentation at a destination of the information stream.
Dependent claims: 8, 9
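The three ranking criteria recited in claim 7 (a free-text search score, the number of named entities found in a document, and a taxonomy path score) can be combined as in the following Python sketch. The claims do not give a formula, so everything numeric here is an assumption: the weighted sum, the default weights, the normalization of the entity count, and in particular the definition of `taxonomy_path_score` as the shared-prefix length of two taxonomy paths divided by the longer path length. The `Candidate` type and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    """A returned document with the inputs needed for scoring (hypothetical)."""
    doc_id: str
    text_score: float              # free-text search score, assumed in [0, 1]
    entities: List[str]            # named entities found in the document
    taxonomy_path: Sequence[str]   # e.g. ("sports", "baseball")


def taxonomy_path_score(topic_path, doc_path):
    """Illustrative relatedness: shared prefix length over longer path length."""
    shared = 0
    for a, b in zip(topic_path, doc_path):
        if a != b:
            break
        shared += 1
    longest = max(len(topic_path), len(doc_path))
    return shared / longest if longest else 0.0


def rank(candidates, text_entities, topic_path, weights=(0.5, 0.3, 0.2)):
    """Return (score, doc_id) pairs, best first. Weights are assumptions."""
    w_text, w_ent, w_tax = weights
    wanted = set(text_entities)
    scored = []
    for c in candidates:
        # Fraction of the entities extracted from the text that the document contains.
        matched = len(wanted & set(c.entities))
        ent_score = matched / len(wanted) if wanted else 0.0
        total = (
            w_text * c.text_score
            + w_ent * ent_score
            + w_tax * taxonomy_path_score(topic_path, c.taxonomy_path)
        )
        scored.append((total, c.doc_id))
    # Sort by descending score; ties fall back to doc_id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored
```

With these assumed weights, a document that matches the topic path and the named entities can outrank one with a higher free-text score alone, which is the kind of trade-off a multi-criteria ranking is meant to allow.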
10. A system for providing collateral information for inclusion with an information stream, said system operating in real time or substantially real time and comprising:
a subsystem for examining the information stream to recognize a presence of events that occur in the information stream, wherein said events are derived from the information stream based on one or more predetermined taxonomies;
a subsystem, having an input coupled to an output of said examination subsystem, for automatically generating database queries from said derived events;
a database for receiving said database queries; and
a subsystem, having an input coupled to an output of said database, for analyzing results of said database queries so as to rank and select said results to be inserted into the information stream as information that is collateral to said derived events, wherein the analyzing subsystem employs ranking criteria comprised of a score derived from a free text search of the database using text that is automatically extracted from the information stream, on a number of named entities appearing in the text and in the database query results, and on a taxonomy path score, where the taxonomy path score represents an amount of relatedness between a taxonomy-related information element found in the text and a tree of the predetermined taxonomies, and wherein the query generation subsystem generates queries based on information corresponding to a list that identifies topics in the text that is automatically extracted from the information stream, where the topics correspond to elements of the taxonomy tree.
Dependent claims: 11, 12, 13
14. A system for providing collateral information for inclusion with an information stream, said system operating in real time or substantially real time and comprising:
a subsystem for examining the information stream to recognize a presence of events that occur in the information stream, wherein said events are derived from the information stream based on one or more predetermined taxonomies, wherein said examining subsystem comprises at least one unit for automatically extracting text from the information stream, a unit for segmenting the text into sentences and at least one unit for operating on the sentences to identify topics that correspond to topic taxonomies of the predetermined taxonomies;
a subsystem, having an input coupled to an output of said examination subsystem, for automatically generating database queries from said derived events, wherein said query generation subsystem automatically generates database queries based at least in part on identified topics;
a database for receiving said database queries;
a subsystem, having an input coupled to an output of said database, for analyzing results of said database queries so as to rank and select said results to be inserted into the information stream as information that is collateral to said derived event; and
a unit for operating on the sentences to identify the presence of names of entities, and further comprising a unit for assembling a list comprised of an identified topic having a start time and an end time, as well as any named entities that occur between the start time and the end time, and where the query generation subsystem assembles a query object comprised of named entities that occur between the start time and the end time of the identified topic for searching said database to identify a first set of stored documents that correspond to the topic, a subset of the first set of documents that contain the named entities, a second set of documents that correspond to words found in the text; and
where said analyzing subsystem scores the returned documents based on a plurality of criteria and ranks the documents based on their scores.
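The three document sets named in claim 14 (documents matching the topic, the subset of those containing the named entities, and a second set matching words found in the text) can be sketched against a small in-memory store. This Python sketch rests on several assumptions: the store layout with `topic` and `body` fields, the choice to treat a document as containing the named entities when it mentions at least one of them, and word overlap with a `min_overlap` threshold as a stand-in for a free-text search. None of these details come from the claims.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(docs, topic, entities, text, min_overlap=2):
    """Return (first_set, entity_subset, second_set) of document ids.

    docs maps a document id to {"topic": str, "body": str} (hypothetical layout).
    """
    # First set: stored documents filed under the identified topic.
    first = {doc_id for doc_id, doc in docs.items() if doc["topic"] == topic}

    # Subset of the first set: documents mentioning at least one named entity
    # (case-insensitive substring match, an assumption of this sketch).
    subset = {
        doc_id
        for doc_id in first
        if any(name.lower() in docs[doc_id]["body"].lower() for name in entities)
    }

    # Second set: documents sharing enough words with the extracted text,
    # regardless of topic.
    words = tokenize(text)
    second = {
        doc_id
        for doc_id, doc in docs.items()
        if len(words & tokenize(doc["body"])) >= min_overlap
    }
    return first, subset, second
```

Because the second set is searched independently of topic, it can surface documents that the topic classifier would miss, and the scoring step can then rank the combined results.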
15. A computer readable media having recorded thereon a program for providing collateral information for inclusion with an information stream, the program comprising instructions for:
examining the information stream to recognize a presence of events that occur in the information stream, wherein the events are derived from the information stream based on one or more predetermined taxonomies, wherein the instruction for examining the information stream comprises instructions for automatically extracting text from the information stream, for segmenting the text into sentences and for operating on the sentences to identify topics that correspond to topic taxonomies of the predetermined taxonomies and the presence of names of entities;
assembling a list comprised of an identified topic having a start time and an end time, as well as any named entities that occur between the start time and the end time;
assembling a query object comprised of named entities that occur between the start time and the end time of the identified topic;
searching at least one database to identify a first set of stored documents that correspond to the topic;
identifying a subset of the first set of documents that contain the named entities;
identifying a second set of documents that correspond to words found in the text;
scoring the returned documents based on a plurality of criteria and ranking the documents based on their scores;
automatically generating database queries from said derived events; and
analyzing results of said database queries so as to rank and select said results to be inserted into the information stream as information that is collateral to said derived events.
Specification