Event detection
First Claim
1. A machine readable medium storing a program which when executed by at least one processing unit of a research system identifies an event for a category, the program comprising sets of instructions for:
for each of a plurality of different pre-identified categories for which the research system stores retrievable data, classifying a first set of documents from a current time period and a second set of documents from a background time period as relevant to the category, wherein the current time period and the background time period are separated by a buffer time period in order to isolate the current time period from the background time period;
for each of the plurality of categories, calculating a score for the category for the current time period that quantifies a relative difference in a size of the first set of documents from the current time period and a size of the second set of documents from the background time period;
when the calculated score for a particular category is above a threshold, determining the occurrence of an event for the category in the current time period; and
storing data in the research system indicating the occurrence of an event for each of the plurality of categories for which the calculated score is above the threshold.
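The steps recited in this claim can be illustrated with a minimal sketch. It is not the patented implementation: the `Document` type, the window lengths, the per-day rate comparison used as the "relative difference", and the default threshold are all assumptions chosen for illustration, and the classification step is taken as already done (each document simply carries its assigned categories).

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Document:
    """Hypothetical document record; `categories` holds the classifier's output."""
    published: date
    categories: frozenset


def count_in_window(docs, category, start, end):
    """Count documents relevant to `category` published in [start, end)."""
    return sum(
        1 for d in docs
        if category in d.categories and start <= d.published < end
    )


def detect_events(docs, categories, today,
                  current_days=7, buffer_days=7, background_days=28,
                  threshold=1.0):
    """Return {category: score} for categories whose score exceeds `threshold`.

    Timeline (oldest to newest): background | buffer | current.
    All window lengths and the threshold are illustrative assumptions.
    """
    current_start = today - timedelta(days=current_days)
    background_end = current_start - timedelta(days=buffer_days)
    background_start = background_end - timedelta(days=background_days)

    events = {}
    for category in categories:
        current = count_in_window(docs, category, current_start, today)
        background = count_in_window(
            docs, category, background_start, background_end
        )
        # Compare per-day rates so windows of different lengths are comparable.
        current_rate = current / current_days
        background_rate = background / background_days
        # Assumed scoring formula: relative difference of the two rates.
        # The floor avoids division by zero when the background is empty.
        floor = 1 / background_days
        score = (current_rate - background_rate) / max(background_rate, floor)
        if score > threshold:
            events[category] = score  # stands in for "storing data"
    return events
```

Documents published during the buffer period are counted in neither window, which is what keeps the current period isolated from the background period in this sketch.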
Abstract
Some embodiments provide a method for identifying an event for a particular category. The method classifies several documents as relevant to several different categories. The method identifies a number of documents relevant to the particular category for a current time period and a background time period. Based on a comparison of the number of documents from the current time period relevant to the particular category and the number of documents from the background time period relevant to the particular category, the method identifies an event for the category for the current time period. Some embodiments calculate a score for the event, and normalize the score based on an average number of documents relevant to each of a set of related categories including the particular category.
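The abstract does not give the normalization formula. As one possible reading, offered only as an assumption, the sketch below divides a category's change in document count by the average background count across its related categories, so that a small category with a jump from one document to three does not outscore a large category with a substantial rise. The function name, the dictionary inputs, and the formula itself are hypothetical, and the two periods are assumed to be of equal length.

```python
from statistics import mean


def normalized_score(category, current_counts, background_counts, related):
    """Change in document count for `category`, scaled by the group average.

    `current_counts` and `background_counts` map category -> document count.
    `related` is the set of related categories and must include `category`.
    The formula is an illustrative assumption, not taken from the patent.
    """
    if category not in related:
        raise ValueError("related categories must include the category itself")
    group_average = mean(background_counts[c] for c in related)
    if group_average == 0:
        return 0.0  # no background volume anywhere in the group
    change = current_counts[category] - background_counts[category]
    return change / group_average


# Usage: a small and a large category from the same related group.
current = {"small": 3, "large": 120}
background = {"small": 1, "large": 99}
group = ["small", "large"]
small_score = normalized_score("small", current, background, group)
large_score = normalized_score("large", current, background, group)
```

Under this assumed formula the small category's tripling yields a lower score than the large category's rise, because both are measured against the same group average.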
17 Claims
1. A machine readable medium storing a program which when executed by at least one processing unit of a research system identifies an event for a category, the program comprising sets of instructions for:
for each of a plurality of different pre-identified categories for which the research system stores retrievable data, classifying a first set of documents from a current time period and a second set of documents from a background time period as relevant to the category, wherein the current time period and the background time period are separated by a buffer time period in order to isolate the current time period from the background time period;
for each of the plurality of categories, calculating a score for the category for the current time period that quantifies a relative difference in a size of the first set of documents from the current time period and a size of the second set of documents from the background time period;
when the calculated score for a particular category is above a threshold, determining the occurrence of an event for the category in the current time period; and
storing data in the research system indicating the occurrence of an event for each of the plurality of categories for which the calculated score is above the threshold.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A machine-implemented method for identifying an event for a category, the method comprising:
for each of a plurality of different pre-identified categories for which a research system stores retrievable data, classifying a first set of documents from a current time period and a second set of documents from a background time period as relevant to the category, wherein the current time period and the background time period are separated by a buffer time period in order to isolate the current time period from the background time period;
for each of the plurality of categories, calculating a score for the category for the current time period that quantifies a relative difference in a size of the first set of documents from the current time period and a size of the second set of documents from the background time period;
when the calculated score for a particular category is above a threshold, determining the occurrence of an event for the category in the current time period; and
storing data in the research system indicating the occurrence of an event for each of the plurality of categories for which the calculated score is above the threshold.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
Specification