Systems and methods for detecting and coordinating changes in lexical items
First Claim
1. A method for detecting and coordinating change events in a data stream comprising:
monitoring, by a processor, over time, a probability of occurrence of lexical items in a data stream comprising a plurality of lexical items and a metavalue associated therewith, according to a lexical occurrence model, to detect a plurality of change events in the data stream;
applying, by the processor, a significance test to the change events to determine if the change events are statistically significant;
applying, by the processor, an interestingness test to the change events to determine a measure of interest (I) indicating whether the change events are likely to be of interest to a user, the interestingness test defined using conditional mutual information between the lexical items (W) and the lexical occurrence model (M) given a time span (T) as provided by a relationship:
I(W; M|T) = H(W|T) − H(W|M,T)
where H represents conditional entropy; and
grouping the change events across the lexical items and the metavalue to summarize the change events that are synchronous in time, the grouping forming a set of grouped change events.
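The relationship I(W; M|T) = H(W|T) − H(W|M,T) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that W, M, and T are discrete variables and that a joint distribution over (w, m, t) triples is already available as a dictionary; the function names `conditional_entropy` and `interestingness` are hypothetical and are not taken from the patent, which does not specify how the distribution is estimated.

```python
import math
from collections import defaultdict


def conditional_entropy(joint, target_idx, given_idx):
    """Return H(target | given) in bits.

    `joint` maps outcome tuples to probabilities; `target_idx` is the
    position of the target variable and `given_idx` the positions of the
    conditioning variables within each tuple.
    """
    p_target_given = defaultdict(float)  # p(target, given)
    p_given = defaultdict(float)         # p(given)
    for outcome, p in joint.items():
        g = tuple(outcome[i] for i in given_idx)
        p_target_given[(outcome[target_idx], g)] += p
        p_given[g] += p
    return -sum(
        p * math.log2(p / p_given[g])
        for (_, g), p in p_target_given.items()
        if p > 0
    )


def interestingness(joint):
    """I(W; M | T) = H(W | T) - H(W | M, T) for joint[(w, m, t)] = probability."""
    h_w_given_t = conditional_entropy(joint, 0, (2,))
    h_w_given_mt = conditional_entropy(joint, 0, (1, 2))
    return h_w_given_t - h_w_given_mt


# Hypothetical toy data: within one time span, the model state fully
# determines the lexical item, so knowing M removes all uncertainty about W.
determined = {("a", "m1", "t"): 0.5, ("b", "m2", "t"): 0.5}
# Here W and M are independent, so M carries no information about W.
independent = {(w, m, "t"): 0.25 for w in "ab" for m in ("m1", "m2")}
```

Under these assumptions the measure is large when the model explains much of the uncertainty in which lexical item occurs within a time span, and zero when the model explains none of it.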
Abstract
Systems and methods for efficiently detecting and coordinating step changes, trends, cycles, and bursts affecting lexical items within data streams are provided. Data streams can be sourced from documents that can optionally be labeled with metadata. Changes can be grouped across lexical and/or metavalue vocabularies to summarize the changes that are synchronous in time. The methods described herein can be applied either retrospectively to a corpus of data or in a streaming mode.
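The idea of grouping changes that are synchronous in time can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how synchrony is decided, so the sketch uses a simple time-gap rule, and the `ChangeEvent` type, the `group_synchronous` function, and the `tolerance` parameter are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvent:
    item: str    # lexical item or metavalue on which the change was detected
    time: float  # estimated time of the change (units chosen by the caller)


def group_synchronous(events, tolerance):
    """Group events by time, chaining events whose gap is at most `tolerance`.

    Assumption: two changes count as synchronous when the gap between
    consecutive change times does not exceed `tolerance`.
    """
    groups = []
    for event in sorted(events, key=lambda e: e.time):
        if groups and event.time - groups[-1][-1].time <= tolerance:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups


# Hypothetical events: one word and one metadata value change together,
# and another word changes much later.
events = [
    ChangeEvent("election", 40.0),
    ChangeEvent("flood", 10.0),
    ChangeEvent("region=coast", 10.5),
]
groups = group_synchronous(events, tolerance=1.0)
```

Because lexical items and metavalues share one event type here, a single group can summarize a change that shows up in both vocabularies at about the same time.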
9 Citations
20 Claims
1. A method for detecting and coordinating change events in a data stream comprising:
monitoring, by a processor, over time, a probability of occurrence of lexical items in a data stream comprising a plurality of lexical items and a metavalue associated therewith, according to a lexical occurrence model, to detect a plurality of change events in the data stream;
applying, by the processor, a significance test to the change events to determine if the change events are statistically significant;
applying, by the processor, an interestingness test to the change events to determine a measure of interest (I) indicating whether the change events are likely to be of interest to a user, the interestingness test defined using conditional mutual information between the lexical items (W) and the lexical occurrence model (M) given a time span (T) as provided by a relationship:
I(W; M|T) = H(W|T) − H(W|M,T)
where H represents conditional entropy; and
grouping the change events across the lexical items and the metavalue to summarize the change events that are synchronous in time, the grouping forming a set of grouped change events.
(Dependent claims: 2, 3, 4, 5, 6, 7)
8. A non-transitory computer readable storage medium comprising computer readable instructions that, when executed by a processor, cause the processor to perform operations comprising:
monitoring, over time, a probability of occurrence of lexical items in a data stream comprising a plurality of lexical items and a metavalue associated therewith, according to a lexical occurrence model, to detect a plurality of change events in the data stream;
applying a significance test to the change events to determine if the change events are statistically significant;
applying an interestingness test to the change events to determine a measure of interest (I) indicating whether the change events are likely to be of interest to a user; the interestingness test defined using conditional mutual information between the lexical items (W) and the lexical occurrence model (M) given a time span (T) as provided by a relationship:
I(W; M|T) = H(W|T) − H(W|M,T)
where H represents conditional entropy; and
grouping the change events across the lexical items and the metavalue to summarize the change events that are synchronous in time, the grouping forming a set of grouped change events.
(Dependent claims: 9, 10, 11, 12, 13, 14)
15. A system for detecting and coordinating change events in a data stream, comprising:
a processor;
a memory in communication with the processor, the memory having stored thereon instructions, executable by the processor to cause the processor to perform operations comprising:
monitoring, over time, a probability of occurrence of lexical items in a data stream comprising a plurality of lexical items and a metavalue associated therewith, according to a lexical occurrence model, to detect a plurality of change events in the data stream;
applying a significance test to the change events to determine if the change events are statistically significant;
applying an interestingness test to the change events to determine a measure of interest (I) indicating whether the change events are likely to be of interest to a user, the interestingness test defined using conditional mutual information between the lexical items (W) and the lexical occurrence model (M) given a time span (T) as provided by a relationship:
I(W; M|T) = H(W|T) − H(W|M,T)
where H represents conditional entropy; and
grouping the change events across the lexical items and the metavalue to summarize the change events that are synchronous in time, the grouping forming a set of grouped change events.
(Dependent claims: 16, 17, 18, 19, 20)
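The independent claims recite a significance test on detected change events without naming a particular test. As one possible illustration, the sketch below checks a candidate step change in a lexical item's probability of occurrence with a likelihood-ratio (G) test comparing two binomial rates. The choice of test, the function names `_binomial_loglik` and `step_change_pvalue`, and the count-based inputs are assumptions made here, not details taken from the claims.

```python
import math


def _binomial_loglik(k, n):
    """Binomial log-likelihood at the maximum-likelihood rate k/n.

    Terms of the form 0 * log(0) are treated as 0.
    """
    if n == 0:
        return 0.0
    p = k / n
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1.0 - p)
    return total


def step_change_pvalue(k_before, n_before, k_after, n_after):
    """p-value for 'the occurrence rate differs before and after the change'.

    k_* are counts of the lexical item and n_* are total token counts in the
    windows before and after the candidate change point. The G statistic is
    compared against a chi-square distribution with one degree of freedom,
    whose survival function is erfc(sqrt(x / 2)).
    """
    g = 2.0 * (
        _binomial_loglik(k_before, n_before)
        + _binomial_loglik(k_after, n_after)
        - _binomial_loglik(k_before + k_after, n_before + n_after)
    )
    g = max(g, 0.0)  # guard against tiny negative values from rounding
    return math.erfc(math.sqrt(g / 2.0))


# Hypothetical counts: the item rises from 10 to 60 occurrences per 1000 tokens.
p_change = step_change_pvalue(10, 1000, 60, 1000)
# Identical rates in both windows should give no evidence of a change.
p_flat = step_change_pvalue(10, 1000, 10, 1000)
```

A caller would compare the returned p-value with a chosen threshold, ideally corrected for the number of lexical items tested, before passing the event on to the interestingness test.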
Specification