Event detection through text analysis using dynamic self evolving/learning module
First Claim
1. A computer-implemented method comprising:
identifying, by a computer, a plurality of features in a data stream associated with a data source;
assigning, by the computer, an initial confidence score to each respective feature of the plurality of features;
determining, by the computer, a candidate score of one or more features of the plurality of features using the initial confidence score of each respective feature in the one or more features, based upon a number of occurrences of each respective feature identified in the one or more features;
identifying, by the computer, an event candidate when the candidate score of the one or more features satisfies a predetermined threshold, wherein the event candidate is defined by the one or more features;
automatically determining, by the computer, whether the one or more features identified in the data stream as the event candidate satisfy one or more event models in a categorization table, based upon the computer comparing the one or more features of the data stream against the one or more event models, wherein an event concept store comprises a non-transitory machine-readable memory storing the one or more event models; and
responsive to the computer determining that the one or more features from the data stream fail to satisfy at least one event model in at least one categorization table stored in the event concept store;
comparing, by the computer, the one or more features against one or more uncategorized event models in an uncategorized event table stored in the event concept store wherein the uncategorized event table store records associated with new unknown event models;
storing, by the computer, the one or more features as a new uncategorized event model in the uncategorized event table, in response to determining the one or more features fail to satisfy at least one uncategorized event model;
generating, by the computer, an increased confidence score of an uncategorized event model of the one or more uncategorized event models in the uncategorized event table when the one or more features satisfy the uncategorized event model of the one or more uncategorized event models in the uncategorized event table;
calculating, by the computer, a probability score for the event candidate based on a likelihood that the one or more features represent a new event model;
comparing, by the computer, the increased confidence score of the event candidate and the probability score of the event candidate with a pre-determined threshold score of the uncategorized event model of the one or more uncategorized event models in the uncategorized event table; and
storing, by the computer, the uncategorized event model in the categorization table when the increased confidence score and the probability score is higher than or matches the pre-determined threshold score.
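The flow recited in claim 1 can be sketched roughly as follows. This is only an illustrative reading of the claim language, not the patented implementation: the names `EventModel`, `candidate_score`, `satisfies`, and `process_candidate`, the Jaccard-overlap matching rule, the fixed `boost` increment, and all numeric values are assumptions introduced here. The claim does not specify how a feature set "satisfies" a model or how the probability score is computed, so the latter is passed in as a plain number.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class EventModel:
    features: frozenset      # features that define the model
    confidence: float = 0.0  # grows each time the model is matched again
    threshold: float = 0.5   # hypothetical promotion threshold


def candidate_score(features, initial_confidence):
    """Weight each feature's initial confidence by its number of occurrences."""
    counts = Counter(features)
    return sum(initial_confidence.get(f, 0.0) * n for f, n in counts.items())


def satisfies(features, model, min_overlap=0.5):
    """Assumed matching rule: Jaccard overlap of at least min_overlap."""
    union = features | model.features
    return bool(union) and len(features & model.features) / len(union) >= min_overlap


def process_candidate(features, categorized, uncategorized,
                      new_event_probability, boost=0.25):
    """Route an event candidate through the categorized and uncategorized tables."""
    features = frozenset(features)

    # Step 1: does the candidate satisfy a known (categorized) event model?
    if any(satisfies(features, m) for m in categorized):
        return "known"

    # Step 2: otherwise compare against the uncategorized event models.
    for model in uncategorized:
        if satisfies(features, model):
            model.confidence += boost  # the "increased confidence score"
            # Promote when both scores are higher than or match the threshold.
            if (model.confidence >= model.threshold
                    and new_event_probability >= model.threshold):
                uncategorized.remove(model)
                categorized.append(model)
                return "promoted"
            return "reinforced"

    # Step 3: no match anywhere, so store a new uncategorized event model.
    uncategorized.append(EventModel(features))
    return "stored"
```

Under these assumptions, feeding the same feature set repeatedly first stores it as an uncategorized model, then reinforces it, then promotes it to the categorization table, after which later sightings match as a known event.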
Abstract
A system and method for detecting events based on input data from a plurality of sources. The system may receive input from a plurality of sources containing information about possible events. A method for event detection involves pre-processing and normalizing a data input from a plurality of sources, extracting and disambiguating events and entities, associating events and entities, correlating the associated events and entities from one data input with results from a different data source to determine whether an event has occurred, and storing the detected events in a data storage.
11 Claims
1. A computer-implemented method comprising the steps recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6.
7. An event extraction system comprising:
an event concept store that comprises a non-transitory machine-readable memory storing one or more event models, one or more event categorization tables, and an uncategorized event table, wherein each event model of the one or more event models is associated with an event candidate and further comprises a threshold event score and a set of one or more features, wherein each event categorization table comprises one or more known event models, and wherein the at least one uncategorized event table comprises a set of one or more features, a set of one or more entities, and a set of one or more topics, each associated with one or more uncategorized event models; and
an event category validation processor communicatively coupled to a network and executes an event extractor module configured to;
receive a set of extracted features, a set of extracted entities, and a set of extracted topics from a normalized pre-processed data stream associated with a data source, wherein at least one feature of the set of extracted features, the set of extracted entities, and the set of extracted topics is an event candidate;
assign an initial confidence score to the set of extracted features, the set of extracted entities, and the set of extracted topics;
calculate a score using the initial confidence score of the set of extracted features, the set of extracted entities, and the set of extracted topics that repeat themselves or overlap in different data sources;
determine that an event candidate occurred when the calculated score is greater than a predetermined threshold;
compare each of the sets in which the at least one feature is the event candidate with the one or more event categorization tables to determine whether the extracted features, entities, and topics, correspond to a known event model; and
compare each of the sets with the uncategorized event table to determine whether the extracted features, entities, and topics, correspond with at least one uncategorized event model, wherein the event extractor module is further configured to increase a confidence score corresponding to the at least one uncategorized event model when the set of extracted features are similar to the at least one uncategorized event model in the uncategorized event table, wherein the uncategorized event table store records associated with new unknown event models and a probability score indicating features representing new event models, and wherein the event extractor module is further configured to calculate a probability score indicating features, wherein the probability score is determined based on a likelihood that the features represents new event models; and
store the at least one uncategorized event model as a new event model in the one or more event categorization tables responsive to determining the confidence score and the probability score corresponding to the at least one uncategorized event model satisfies a threshold score.
Dependent claims: 8, 9, 10, 11.
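Claim 7 differs from claim 1 mainly in that the score is calculated from features, entities, and topics that repeat or overlap across different data sources. A minimal sketch of that scoring step follows, under stated assumptions: the function names `cross_source_score` and `is_event_candidate`, the single uniform `initial_confidence` value, and the per-source weighting are all hypothetical choices made for illustration. The claim itself does not prescribe a formula.

```python
from collections import defaultdict


def cross_source_score(extractions, initial_confidence=0.25):
    """Score items (features, entities, topics) that overlap across sources.

    extractions maps a data-source name to the set of items extracted from it.
    Assumed formula: each overlapping item contributes its initial confidence
    once per source in which it appears.
    """
    sources_by_item = defaultdict(set)
    for source, items in extractions.items():
        for item in items:
            sources_by_item[item].add(source)

    # Keep only items seen in more than one data source.
    overlapping = {item: srcs for item, srcs in sources_by_item.items()
                   if len(srcs) > 1}
    score = sum(initial_confidence * len(srcs) for srcs in overlapping.values())
    return score, set(overlapping)


def is_event_candidate(score, threshold):
    """Claim 7 requires the score to be greater than the threshold."""
    return score > threshold


# Usage with made-up data: "flood" appears in three sources, "jakarta" in two.
extractions = {
    "news": {"flood", "jakarta", "rain"},
    "social": {"flood", "jakarta"},
    "blog": {"flood"},
}
score, items = cross_source_score(extractions)
print(score, sorted(items), is_event_candidate(score, threshold=1.0))
```

Items seen in only one source ("rain" here) contribute nothing in this sketch, which is one possible way to reflect the claim's emphasis on repetition across sources.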
Specification