Labeling events in historic news
Abstract
A system identifies a set of documents from a corpus of documents that are relevant to a word, phrase, or sentence and that were published in approximately the same time period, where each document of the set includes news content and has an associated headline. The system extracts headlines from the set of documents and derives a score for each extracted headline based on how many times selected words in that headline occur among all of the extracted headlines.
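The headline scoring described in the abstract can be illustrated with a minimal Python sketch. Several details here are assumptions, not taken from the patent: "selected words" are taken to be non-stop-word tokens, `STOP_WORDS` is a hypothetical list, each word is counted once per headline, and a headline's score is the plain sum of its words' counts. The function names are likewise illustrative.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say how words are selected.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and", "as", "at", "by", "is"}


def selected_words(headline):
    """Return the distinct lowercase tokens of a headline, minus stop words."""
    tokens = re.findall(r"[a-z0-9']+", headline.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def score_headlines(headlines):
    """Score each headline by summing, over its selected words, the number of
    extracted headlines in which each word appears (an assumed scoring rule)."""
    counts = Counter()
    for headline in headlines:
        counts.update(selected_words(headline))
    return {h: sum(counts[w] for w in selected_words(h)) for h in headlines}


headlines = [
    "Senate passes budget bill",
    "Budget bill clears Senate after long debate",
    "Local team wins championship",
]
scores = score_headlines(headlines)
best = max(scores, key=scores.get)
```

With this unnormalized sum, longer headlines tend to score higher; a real system would likely normalize by headline length or weight words differently, but the abstract does not specify this.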
20 Claims
1. A method comprising:
- determining, by one or more processors, a plurality of documents relating to a query, the plurality of documents being associated with a plurality of timestamps;
- determining, by the one or more processors, a first group of documents, of the plurality of documents, associated with a first timestamp of the plurality of timestamps;
- determining, by the one or more processors, a particular quantity of times that one or more forms of the query are included in the first group of documents;
- identifying, by the one or more processors, a first document of the first group of documents;
- labeling, by the one or more processors, a first point on a graph by using a first headline associated with the first document, the first point corresponding to the first timestamp and a value based on the particular quantity of times, and the value based on the particular quantity of times satisfying a particular threshold;
- determining, by the one or more processors, a second group of documents, of the plurality of documents, associated with a second timestamp of the plurality of timestamps;
- identifying, by the one or more processors, a second document of the second group of documents;
- labeling, by the one or more processors, a second point on the graph by using a second headline associated with the second document, the second point corresponding to the second timestamp, the graph including a plurality of points, the plurality of points including the first point, the second point, and two or more other points, and the two or more other points being below the first point and the second point on the graph; and
- providing, by the one or more processors, the graph as a response to the query.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system comprising:
one or more processors to:
- determine a plurality of documents relating to a query, the plurality of documents being associated with a plurality of timestamps;
- determine a first group of documents, of the plurality of documents, associated with a first timestamp of the plurality of timestamps;
- determine a particular quantity of times that one or more forms of the query are included in the first group of documents;
- identify a first document of the first group of documents;
- label a first point on a graph by using a first headline associated with the first document, the first point corresponding to the first timestamp and a value based on the particular quantity of times, and the value based on the particular quantity of times satisfying a particular threshold;
- determine a second group of documents, of the plurality of documents, associated with a second timestamp of the plurality of timestamps;
- identify a second document of the second group of documents;
- label a second point on the graph by using a second headline associated with the second document, the second point corresponding to the second timestamp, the graph including a plurality of points, the plurality of points including the first point, the second point, and two or more other points, and the two or more other points being below the first point and the second point on the graph; and
- provide the graph as a response to the query.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer-readable medium storing instructions, the instructions comprising:
one or more instructions that, when executed by one or more processors, cause the one or more processors to:
- determine a plurality of documents relating to a query, the plurality of documents being associated with a plurality of timestamps;
- determine a first group of documents, of the plurality of documents, associated with a first timestamp of the plurality of timestamps;
- determine a particular quantity of times that one or more forms of the query are included in the first group of documents;
- identify a first document of the first group of documents;
- label a first point on a graph by using a first headline associated with the first document, the first point corresponding to the first timestamp and a value based on the particular quantity of times, and the value based on the particular quantity of times satisfying a particular threshold;
- determine a second group of documents, of the plurality of documents, associated with a second timestamp of the plurality of timestamps;
- identify a second document of the second group of documents;
- label a second point on the graph by using a second headline associated with the second document, the second point corresponding to the second timestamp, the graph including a plurality of points, the plurality of points including the first point, the second point, and two or more other points, and the two or more other points being below the first point and the second point on the graph; and
- provide the graph as a response to the query.
Dependent claims: 16, 17, 18, 19, 20
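The steps recited in the independent claims (grouping documents by timestamp, counting occurrences of forms of the query, and labeling qualifying points with a headline) can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the `Document` type, the function names, whole-word matching of query forms, the choice of the document with the most matches as the labeling document, and applying the same threshold to every labeled point are all hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Document:
    timestamp: str  # assumed granularity: one bucket per month, e.g. "2004-02"
    headline: str
    text: str


def count_forms(text, query_forms):
    """Count whole-word occurrences of any form of the query in the text."""
    forms = {f.lower() for f in query_forms}
    return sum(1 for tok in re.findall(r"[a-z0-9]+", text.lower()) if tok in forms)


def build_labeled_graph(documents, query_forms, threshold):
    """Return one point per timestamp; points whose value meets the threshold
    are labeled with a headline from that timestamp's group of documents."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc.timestamp].append(doc)

    points = []
    for timestamp in sorted(groups):
        group = groups[timestamp]
        value = sum(count_forms(d.text, query_forms) for d in group)
        label = None
        if value >= threshold:
            # Assumption: label with the document that mentions the query most.
            top = max(group, key=lambda d: count_forms(d.text, query_forms))
            label = top.headline
        points.append({"timestamp": timestamp, "value": value, "label": label})
    return points


docs = [
    Document("2004-01", "Quiet month for markets", "The election was barely mentioned."),
    Document("2004-02", "Primary elections begin",
             "Elections open today. The election season has begun, and election officials are ready."),
    Document("2004-02", "Weather delays travel", "Snow closed roads."),
    Document("2004-03", "Sports roundup", "No news on the election."),
    Document("2004-04", "Candidates hold final election debate",
             "The election debate drew viewers. Elections are near; election day is Tuesday."),
]
graph = build_labeled_graph(docs, ["election", "elections"], threshold=3)
```

In this toy data, two points meet the threshold and receive headlines, while the two remaining points sit below them unlabeled, mirroring the arrangement of points the claims describe.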
Specification