Dynamically processing an event using an extensible data model
Abstract
Systems and methods of dynamically processing an event using an extensible data model are disclosed. One embodiment includes specifying attributes of the event in a data model, the data model being extensible to add properties to the event as the dataset is streamed from the source to the sink.
20 Claims
1. A method for dynamically processing an event including a dataset that is streamed from a source to a sink via nodes, the event being raw data generated by a machine, the method comprising:
recording, at a respective node, the event including the dataset in a memory of the respective node using a data model; and
annotating the event by adding or updating one or more attributes associated with the event in the data model, the annotating performed based on the respective node reading at least a portion of the dataset and determining annotation in accordance with one or more functions configured to generate an analytical result, the data model having a number of fields for representing select raw data in the event, and the data model being extensible to add additional attributes to the event by a subsequent node configured to further process the dataset, including performing a query on the dataset, as the event is streamed from the source to the sink,
wherein said annotating includes specifying, based on the dataset, one or more fields of the event in the data model so as to enable the subsequent node (1) to process the event based on the annotation and/or (2) to route the event based on the annotation,
wherein the dataset includes at least one of: a timestamp, a source machine, a body, or a priority.
(Dependent claims: 2, 3, 4, 5)
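The event structure recited in claim 1 can be sketched in code: a record with fixed fields for the select raw data (timestamp, source machine, body, priority) plus an open-ended attributes map that a node populates to enable downstream processing and routing. This is a minimal illustrative sketch, not the patented implementation; all names, values, and the annotation rule are hypothetical.

```python
from dataclasses import dataclass, field
import time

@dataclass
class Event:
    """Illustrative event: fixed fields for select raw data plus an
    extensible attributes map that subsequent nodes may add to."""
    body: bytes
    timestamp: float = field(default_factory=time.time)
    source_machine: str = "unknown"
    priority: str = "INFO"
    attributes: dict = field(default_factory=dict)  # extensible annotations

def annotate(event: Event) -> Event:
    """A node reads a portion of the dataset and adds/updates attributes
    in accordance with a function producing an analytical result."""
    if b"ERROR" in event.body:                   # hypothetical analysis rule
        event.attributes["severity"] = "high"    # analytical result
        event.attributes["route"] = "alerts"     # enables downstream routing
    return event

def route(event: Event) -> str:
    """A subsequent node routes the event based on the annotation."""
    return event.attributes.get("route", "default")

e = annotate(Event(body=b"ERROR: disk full", source_machine="host-1"))
print(route(e))  # -> alerts
```

The key design point the claim turns on: the fixed fields never change shape, while the attributes map lets any later node extend the event without altering the data model itself.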
6. A method for dynamically processing an event including a dataset that is streamed from a source to a sink via nodes, the event being raw data generated by a machine, the method comprising:
recording, at a respective node, the event including the dataset in a memory of the respective node using a data model; and
annotating the event by adding or updating one or more attributes associated with the event in the data model, the annotating performed based on the respective node reading at least a portion of the dataset and determining annotation in accordance with one or more functions configured to generate an analytical result, the data model having a number of fields for representing select raw data in the event, and the data model being extensible to add additional attributes to the event by a subsequent node configured to further process the dataset, including performing a query on the dataset, as the event is streamed from the source to the sink,
wherein the one or more attributes include: (1) a map from a string attribute name to an array of bytes, (2) a route to one or more storage locations for the event, or (3) one or more formats for outputting of the dataset at the sink.
(Dependent claims: 7, 8)
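The three attribute forms enumerated in claim 6 can be illustrated together in one map from string names to byte arrays. This is a sketch only; the attribute names, storage URIs, and format value are assumptions made up for the example.

```python
# Illustrative attributes covering the three forms named in the claim
# (all names and values are hypothetical):
attributes = {
    # (1) a map from a string attribute name to an array of bytes
    "host": b"host-1",
    # (2) a route to one or more storage locations for the event
    "route": b"hdfs://logs/2024/,s3://archive/",
    # (3) one or more formats for outputting the dataset at the sink
    "output.format": b"json",
}

def storage_locations(attrs: dict) -> list:
    """Decode the route attribute into the sink's storage locations."""
    route = attrs.get("route")
    return route.decode().split(",") if route else []

print(storage_locations(attributes))  # -> ['hdfs://logs/2024/', 's3://archive/']
```

Keeping values as raw bytes lets heterogeneous nodes attach arbitrary annotations without the data model prescribing their types; each consumer decodes only the attributes it understands.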
9. A non-transitory computer readable medium storing a plurality of instructions which, upon execution by a processor, cause the processor to perform a method for dynamically processing an event including a dataset that is streamed from a source to a sink via nodes, the event being raw data generated by a machine, the method comprising:
recording, at a respective node, the event including the dataset in a memory of the respective node using a data model; and
annotating the event by adding or updating one or more attributes associated with the event in the data model, the annotating performed based on the respective node reading at least a portion of the dataset and determining annotation in accordance with one or more functions configured to generate an analytical result, the data model having a number of fields for representing select raw data in the event, and the data model being extensible to add additional attributes to the event by a subsequent node which is configured to further process the dataset, including performing a query on the dataset, as the event is streamed from the source to the sink,
wherein said annotating includes specifying, based on the dataset, one or more fields of the event in the data model so as to enable the subsequent node (1) to process the event based on the annotation and/or (2) to route the event based on the annotation,
wherein the dataset includes at least one of: a timestamp, a source machine, a body, or a priority.
(Dependent claims: 10, 11, 12, 13)
14. A system having a processor and a memory, the memory storing a plurality of instructions which, when executed by the processor, cause the processor to perform a method for dynamically processing an event including a dataset that is streamed from a source to a sink via nodes, the event being raw data generated by a machine, the method comprising:
recording, at a respective node, the event including the dataset in a memory of the respective node using a data model; and
annotating the event by adding or updating one or more attributes associated with the event in the data model, the annotating performed based on the respective node reading at least a portion of the dataset and determining annotation in accordance with one or more functions configured to generate an analytical result, the data model having a number of fields for representing select raw data in the event, and the data model being extensible to add additional attributes to the event by a subsequent node configured to further process the dataset, including performing a query on the dataset, as the event is streamed from the source to the sink,
wherein said annotating includes specifying, based on the dataset, one or more fields of the event in the data model so as to enable the subsequent node (1) to process the event based on the annotation and/or (2) to route the event based on the annotation,
wherein the dataset includes at least one of: a timestamp, a source machine, a body, or a priority.
(Dependent claims: 15, 16, 17)
18. A computer system configured for collecting and aggregating datasets for storage in a file system with fault tolerance, the file system including an agent node, a collector node, and a master node, the computer system having a processor configured to perform a method comprising:
collecting the datasets via an agent node operating in a remote machine, wherein the datasets include a batch of messages written by the remote machine, and wherein the batch of messages is processed by the agent node when a size or a lapsed time of the batch of messages reaches a select threshold;
generating, by the agent node, a batch identifier (ID) for the batch of messages;
assigning, by the agent node, an event tag to the batch of messages;
computing, by the agent node, a checksum for the batch of messages;
writing, by the agent node, the batch of messages along with the batch ID, the event tag, and the checksum as an entry in a write-ahead-log (WAL) storage maintained by the agent node on the remote machine;
transmitting, by the agent node, the datasets to a collector machine, wherein the datasets are transmitted in a data model as an event, the data model being extensible to add additional attributes to the event by a subsequent node which is configured to further process the dataset as the event is streamed from a source to a sink;
verifying the checksum by a collector node operating in the collector machine;
upon the checksum being verified, adding, by the collector node, a tag to a map of tags, wherein the map of tags is associated with multiple tags assigned to multiple batches of messages from the datasets;
writing, by the collector node, the datasets to a destination location; and
based on whether the batch of messages has been successfully written to the destination location, selectively publishing, by the collector node, the tag to the master node.
(Dependent claims: 19, 20)
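The agent/collector pipeline recited in claim 18 can be sketched end to end: the agent batches messages, generates a batch ID, assigns a tag, computes a checksum, logs a WAL entry, and transmits; the collector verifies the checksum, records the tag in a map of tags, and writes to the destination before the tag may be acknowledged. This is an illustrative sketch under simplifying assumptions (in-memory WAL and destination, size threshold only, SHA-256 as the checksum); it is not the claimed system.

```python
import hashlib
import json
import uuid

class AgentNode:
    """Sketch of the agent-side steps: batch messages, generate a batch ID,
    assign an event tag, compute a checksum, log to a WAL, then transmit."""

    def __init__(self, size_threshold=3):
        self.size_threshold = size_threshold
        self.batch = []
        self.wal = []  # write-ahead log, kept in memory for illustration

    def collect(self, message):
        """Buffer a message; flush once the batch size reaches the threshold.
        (The claim also allows a lapsed-time threshold, omitted here.)"""
        self.batch.append(message)
        if len(self.batch) >= self.size_threshold:
            return self.flush()
        return None

    def flush(self):
        batch_id = uuid.uuid4().hex                 # batch identifier (ID)
        tag = "tag-" + batch_id[:8]                 # event tag for this batch
        payload = "\n".join(self.batch).encode()
        checksum = hashlib.sha256(payload).hexdigest()
        entry = {"batch_id": batch_id, "tag": tag,
                 "checksum": checksum, "messages": self.batch}
        self.wal.append(json.dumps(entry))          # WAL entry before sending
        self.batch = []
        return entry                                # transmitted to collector

class CollectorNode:
    """Sketch of the collector-side steps: verify the checksum, record the
    tag in a map of tags, write to the destination, then publish the tag."""

    def __init__(self):
        self.tag_map = {}       # map of tags across received batches
        self.destination = []   # stand-in for the destination location

    def receive(self, entry):
        payload = "\n".join(entry["messages"]).encode()
        if hashlib.sha256(payload).hexdigest() != entry["checksum"]:
            return None                             # verification failed
        self.tag_map[entry["tag"]] = entry["batch_id"]
        self.destination.extend(entry["messages"])  # write to destination
        return entry["tag"]                         # tag published to master
```

The WAL entry is the fault-tolerance hook: because the agent persists the batch (with its ID, tag, and checksum) before transmitting, an unacknowledged batch can be replayed from the log, and the collector's checksum check rejects corrupted replays.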
Specification