Database system and methods for domain-tailored detection of outliers, patterns, and events in data streams
Abstract
Methods and apparatus are provided for domain-tailored detection of outliers, patterns, and/or events in data streams. An exemplary method comprises obtaining a domain-dependent definition of (i) data outliers based on predefined outlier criteria; (ii) data patterns based on predefined pattern criteria; and/or (iii) data events based on predefined event criteria; obtaining time series measurement data from a plurality of sensors; determining, substantially simultaneously with the obtaining, whether individual samples satisfy the domain-dependent definitions of the data outliers, data patterns and/or data events; and storing the individual samples with an indication of whether the individual samples satisfy the domain-dependent definitions of the data outliers, data patterns and/or data events. The domain-dependent definitions are optionally specified using a declarative command language. Queries comprising one or more declarative statements that reference and/or manipulate the data outliers, data patterns and/or data events are optionally processed.
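The flow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Sample` and `TaggedStore` names, the representation of a definition as a plain predicate function, and the threshold values are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Sample:
    sensor_id: str
    timestamp: float
    value: float


# Hypothetical representation: a definition is a named predicate over a sample.
Definition = Callable[[Sample], bool]


@dataclass
class TaggedStore:
    """Toy in-memory stand-in for the database system."""
    definitions: Dict[str, Definition]
    rows: List[dict] = field(default_factory=list)

    def ingest(self, sample: Sample) -> dict:
        # Evaluate every definition as the sample arrives, then store the
        # sample together with the resulting indications.
        flags = {name: bool(pred(sample)) for name, pred in self.definitions.items()}
        row = {
            "sensor_id": sample.sensor_id,
            "timestamp": sample.timestamp,
            "value": sample.value,
            "flags": flags,
        }
        self.rows.append(row)
        return row


# Illustrative, made-up criteria for a temperature-like domain.
store = TaggedStore(definitions={
    "outlier": lambda s: s.value > 100.0,
    "event": lambda s: s.value < 0.0,
})

for s in [Sample("t1", 0.0, 20.0), Sample("t1", 1.0, 150.0), Sample("t2", 1.0, -5.0)]:
    store.ingest(s)

# Stored indications can be read back without re-evaluating the criteria.
flagged = [r["timestamp"] for r in store.rows if r["flags"]["outlier"]]
```

Because each sample is tagged at ingestion, later reads only inspect the stored flags; that is the design point the abstract emphasizes, shown here in its simplest form.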
20 Claims
-
1. A method, comprising the steps of:
-
obtaining a domain-dependent definition of one or more of data outliers based on one or more predefined outlier criteria, data patterns based on one or more predefined pattern criteria, and data events based on one or more predefined event criteria;
obtaining, into a database system, real-time time series measurement data from a plurality of sensors;
determining, substantially simultaneously with said obtaining said real-time time series measurement data, using at least one processing device, whether individual samples in the measurement data satisfy said domain-dependent definitions of said one or more of said data outliers, said data patterns and said data events; and
storing said individual samples in the database system using the at least one processing device, with an indication of whether said individual samples satisfy said domain-dependent definitions of said one or more of said data outliers, said data patterns and said data events.
-
-
2. The method of claim 1, wherein one or more users specify said domain-dependent definitions using a declarative command language.
-
3. The method of claim 2, wherein said domain-dependent definitions using said declarative command language can include one or more of:
- system calls to custom functions, defined by said one or more users, that implement necessary constraints that enable said determining;
one or more constraints based on statistical metrics of said measurement data; and
system calls to machine learning algorithms to train a specialized model from said measurement data.
-
4. The method of claim 2, wherein said domain-dependent definitions comprise a specification of a sampling selection criterion that specifies one or more of which and how many of said individual samples from said measurement data are used to compute one or more of said statistical metrics and said specialized model.
-
5. The method of claim 2, wherein one or more users specify said domain-dependent definition at a substantially same time as said one or more users create one or more tables that will store said measurement data.
-
6. The method of claim 1, wherein said indication comprises a bitmap, and wherein said bitmap is updated substantially simultaneously with said obtaining.
-
7. The method of claim 1, further comprising the step of receiving an update to said domain-dependent definition from one or more users.
-
8. The method of claim 5, wherein said update further comprises an indication of a specific time from which said update applies to said measurement data.
-
9. The method of claim 1, further comprising the step of receiving a query from one or more users, wherein said query comprises one or more declarative statements that reference and manipulate one or more or a logical combination of said data outliers, said data patterns and said data events.
-
10. A computer program product comprising a tangible machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed perform the following steps:
-
obtaining a domain-dependent definition of one or more of data outliers based on one or more predefined outlier criteria, data patterns based on one or more predefined pattern criteria, and data events based on one or more predefined event criteria;
obtaining, into a database system, real-time time series measurement data from a plurality of sensors;
determining, substantially simultaneously with said obtaining said real-time time series measurement data, using at least one processing device, whether individual samples in the measurement data satisfy said domain-dependent definitions of said one or more of said data outliers, said data patterns and said data events; and
storing said individual samples in the database system using the at least one processing device, with an indication of whether said individual samples satisfy said domain-dependent definitions of said one or more of said data outliers, said data patterns and said data events.
-
-
11. The computer program product of claim 10, wherein said indication comprises a bitmap, and wherein said bitmap is updated substantially simultaneously with said obtaining.
-
12. The computer program product of claim 10, further comprising the step of receiving an update to said domain-dependent definition from one or more users.
-
13. The computer program product of claim 12, wherein said update further comprises an indication of a specific time from which said update applies to said measurement data.
-
14. The computer program product of claim 10, further comprising the step of receiving a query from one or more users, wherein said query comprises one or more declarative statements that reference and manipulate one or more or a logical combination of said data outliers, said data patterns and said data events.
-
15. A system, comprising:
-
a memory; and
at least one processing device, coupled to the memory, operative to implement the following steps:
obtaining a domain-dependent definition of one or more of data outliers based on one or more predefined outlier criteria, data patterns based on one or more predefined pattern criteria, and data events based on one or more predefined event criteria;
obtaining, into a database system, real-time time series measurement data from a plurality of sensors;
determining, substantially simultaneously with said obtaining said real-time time series measurement data, using the at least one processing device, whether individual samples in the measurement data satisfy said domain-dependent definitions of said one or more of said data outliers, said data patterns and said data events; and
storing said individual samples in the database system using the at least one processing device, with an indication of whether said individual samples satisfy said domain-dependent definitions of said one or more of said data outliers, said data patterns and said data events.
-
-
16. The system of claim 15, wherein one or more users specify said domain-dependent definitions using a declarative command language.
-
17. The system of claim 15, wherein said indication comprises a bitmap, and wherein said bitmap is updated substantially simultaneously with said obtaining.
-
18. The system of claim 15, further comprising the step of receiving an update to said domain-dependent definition from one or more users.
-
19. The system of claim 18, wherein said update further comprises an indication of a specific time from which said update applies to said measurement data.
-
20. The system of claim 15, further comprising the step of receiving a query from one or more users, wherein said query comprises one or more declarative statements that reference and manipulate one or more or a logical combination of said data outliers, said data patterns and said data events.
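Several dependent claims above recite a bitmap indication updated as data arrives, an update to a definition that applies from a specific time, and queries over logical combinations of outliers, patterns and events. The following Python sketch shows one way these three ideas could fit together. It is an illustration under stated assumptions, not the patented design: `VersionedDefinition`, `BitmapIndex`, `rows_of`, the use of a Python integer as the bitmap, and all thresholds are hypothetical.

```python
from bisect import bisect_right
from typing import Callable, List


class VersionedDefinition:
    """A rule that can be replaced from a given timestamp onward (hypothetical)."""

    def __init__(self, rule: Callable[[float], bool]):
        self._starts: List[float] = [float("-inf")]
        self._rules: List[Callable[[float], bool]] = [rule]

    def update(self, rule: Callable[[float], bool], effective_from: float) -> None:
        # Assumption: updates arrive in increasing order of effective_from.
        self._starts.append(effective_from)
        self._rules.append(rule)

    def matches(self, timestamp: float, value: float) -> bool:
        # Pick the latest rule whose start time is <= timestamp.
        idx = bisect_right(self._starts, timestamp) - 1
        return self._rules[idx](value)


class BitmapIndex:
    """One bit per stored row; bit i is set if row i satisfied the definition."""

    def __init__(self) -> None:
        self.bits = 0
        self.size = 0

    def append(self, flag: bool) -> None:
        if flag:
            self.bits |= 1 << self.size
        self.size += 1


def rows_of(bits: int) -> List[int]:
    """Return the row numbers whose bit is set (bits must be non-negative)."""
    out, i = [], 0
    while bits:
        if bits & 1:
            out.append(i)
        bits >>= 1
        i += 1
    return out


outlier_def = VersionedDefinition(lambda v: v > 100.0)
# The update applies only to samples with timestamp >= 10.0.
outlier_def.update(lambda v: v > 200.0, effective_from=10.0)
event_def = VersionedDefinition(lambda v: v > 240.0)

outliers, events = BitmapIndex(), BitmapIndex()
stream = [(0.0, 50.0), (5.0, 120.0), (10.0, 120.0), (15.0, 250.0)]
for ts, value in stream:
    # Both bitmaps are extended as each sample is ingested.
    outliers.append(outlier_def.matches(ts, value))
    events.append(event_def.matches(ts, value))

# Logical combinations reduce to bitwise operations on the bitmaps.
outlier_not_event = rows_of(outliers.bits & ~events.bits)
outlier_or_event = rows_of(outliers.bits | events.bits)
```

In this sketch the value 120.0 is an outlier at time 5.0 but not at time 10.0, because the updated rule takes effect from 10.0. A query such as "outliers that are not events" is answered with a single bitwise expression over the stored indications, without rescanning the measurement values.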
Specification