System and method for detecting sensitivity content in time-series data
First Claim
1. A method for detecting sensitivity content in time-series data, the method comprising:
receiving, by a processor in a server, the time-series data from a source in real-time, wherein the time-series data is received for one or more instances, and wherein an instance of the one or more instances is associated with a value of the time-series data, and wherein the source comprises one or more sensors, wherein the one or more sensors measure the value of the time-series data and convert the value into one or more signals, wherein the one or more sensors include one or more wireless sensors to transmit the time-series data to the server in real-time;
detecting, by the processor, the sensitivity content in the time-series data, wherein the sensitivity content indicates presence of an anomaly and the sensitivity content is defined as a minute statistical anomaly indicative of existence of private information in the time-series data, wherein the sensitivity content is detected for a value corresponding to each instance from the one or more instances in the time-series data, wherein the detecting comprises:
determining a kurtosis value corresponding to each instance of the one or more instances associated with the value of the time-series data in a data distribution of the time-series data, wherein the value of the time-series data includes a plurality of time stamps associated with the one or more instances;
comparing the kurtosis value with a reference value;
determining a data distribution of the time-series data based upon the comparison, wherein the data distribution is one of a platykurtic distribution when the kurtosis value is less than the reference value, a mesokurtic distribution when the kurtosis value is equal to the reference value, and a leptokurtic distribution when the kurtosis value is greater than the reference value;
processing the time-series data using a Hampel filter and a median-based Rosner filter, wherein the Hampel filter is used when the data distribution of the time-series data is either of the platykurtic distribution or the mesokurtic distribution, and wherein the median-based Rosner filter is used when the data distribution of the time-series data is the leptokurtic distribution, wherein the Hampel filter is used to minimize masking effect when detecting the sensitivity content by choosing a higher value for a breakdown point, and wherein the median-based Rosner filter is used, when a number of outliers is unknown, to provide optimal swamping breakdown point;
identifying, by the processor, a density of the detected sensitivity content, wherein the density of the detected sensitivity content indicates presence of the anomaly in at least two successive instances of the one or more instances, wherein the Hampel filter and the median-based Rosner filter minimize false positive and false negative alarm rates while detecting the sensitivity content in the time-series data.
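The Hampel step recited in the claim can be illustrated with a short sketch. This is not the patented implementation: the sliding-window form, the parameter names `half_window` and `n_sigmas`, the cutoff of 3, and the 1.4826 scaling constant are all assumptions chosen for illustration, since the claim does not fix them. The sketch relies on the median and the median absolute deviation (MAD), which stay stable even when a large share of the window is contaminated; that robustness is what a high breakdown point refers to.

```python
from statistics import median

# Assumption: scale factor that makes the MAD comparable to a standard
# deviation for Gaussian data. The claim does not specify a constant.
MAD_SCALE = 1.4826


def hampel_flags(values, half_window=3, n_sigmas=3.0):
    """Return one boolean per instance: True where the value is flagged.

    Each value is compared with the median of a window centred on it.
    A value is flagged when it deviates from that median by more than
    n_sigmas scaled MADs. The window is truncated at the series edges.
    """
    flags = []
    n = len(values)
    for i, x in enumerate(values):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        # If the MAD is zero the threshold is zero, so any value that
        # differs from the window median is flagged.
        flags.append(abs(x - med) > n_sigmas * MAD_SCALE * mad)
    return flags


readings = [10, 10.1, 9.9, 10, 25, 10.2, 9.8, 10, 10.1, 9.9]
flagged = [i for i, f in enumerate(hampel_flags(readings)) if f]
print(flagged)
```

Only the instance holding the value 25 is flagged in this example; the ordinary fluctuations around 10 fall well inside the threshold.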
Abstract
A system and method for detecting sensitivity content in time-series data is disclosed. The method comprises receiving the time-series data from a source. The data is received for one or more instances. The method further comprises detecting the sensitivity content in the time-series data. The sensitivity content indicates presence of an anomaly. The detecting comprises determining a kurtosis value corresponding to the time-series data. The detecting further comprises comparing the kurtosis value with a reference value. The detecting further comprises processing the data using a first filtering means or a second filtering means. The first filtering means is used when the data distribution of the time-series data is either of a platykurtic distribution or a mesokurtic distribution. The second filtering means is used when the data distribution of the time-series data is a leptokurtic distribution.
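The kurtosis comparison and the choice between the two filtering means can be sketched as follows. This is a hedged illustration only: the abstract speaks of "a reference value" without giving it, so the default of 3.0 (the kurtosis of a normal distribution under the non-excess convention) is an assumption, as are the function names and the optional tolerance `tol`.

```python
from statistics import fmean


def pearson_kurtosis(values):
    """Population (non-excess) kurtosis: fourth central moment / variance squared."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])
    m4 = fmean([(v - mean) ** 4 for v in values])
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / (m2 * m2)


def classify_distribution(kurtosis, reference=3.0, tol=0.0):
    """Label the distribution by comparing kurtosis with the reference value.

    reference=3.0 and tol are assumptions; tol widens the band that
    counts as "equal to the reference" when working with noisy estimates.
    """
    if kurtosis < reference - tol:
        return "platykurtic"
    if kurtosis > reference + tol:
        return "leptokurtic"
    return "mesokurtic"


def choose_filter(shape):
    """First filtering means for platykurtic/mesokurtic data, second for leptokurtic."""
    return "rosner" if shape == "leptokurtic" else "hampel"


flat = list(range(1, 11))   # evenly spread values, light tails
spiky = [0] * 9 + [10]      # one extreme value, heavy tail
flat_shape = classify_distribution(pearson_kurtosis(flat))
spiky_shape = classify_distribution(pearson_kurtosis(spiky))
print(flat_shape, choose_filter(flat_shape))
print(spiky_shape, choose_filter(spiky_shape))
```

In practice an estimated kurtosis almost never equals the reference exactly, so a nonzero `tol` is one way to make the mesokurtic case reachable; whether the patented method does this is not stated.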
9 Claims
1. A method for detecting sensitivity content in time-series data, the method comprising the steps recited in full under First Claim above.
Dependent claims: 2, 3, 4.
5. A system for detecting sensitivity content in time-series data, the system implemented on a server comprising:
a processor;
a memory coupled to the processor, wherein the processor executes a plurality of modules stored in the memory, and wherein the plurality of modules comprises:
a reception module to receive the time-series data from a source, wherein the data is received for one or more instances, and wherein an instance of the one or more instances is associated with a value of the time-series data, and wherein the source comprises one or more sensors, wherein the one or more sensors measure the value of the time-series data and convert the value into one or more signals, wherein the one or more sensors include one or more wireless sensors to transmit the time-series data to the server in real-time;
a detection module to detect the sensitivity content in the time-series data, wherein the sensitivity content indicates presence of an anomaly and the sensitivity content is defined as a minute statistical anomaly indicative of existence of private information in the time-series data, wherein the sensitivity content is detected for a value corresponding to each instance from the one or more instances in the time-series data, wherein the detection further comprises:
determining a kurtosis value corresponding to each instance of the one or more instances associated with the value of the time-series data in a data distribution of the time-series data, wherein the value of the time-series data includes a plurality of time stamps associated with the one or more instances;
comparing the kurtosis value with a reference value;
determining a data distribution of the time-series data based upon the comparison, wherein the data distribution is one of a platykurtic distribution when the kurtosis value is less than the reference value, a mesokurtic distribution when the kurtosis value is equal to the reference value, and a leptokurtic distribution when the kurtosis value is greater than the reference value; and
processing the time-series data using a Hampel filter and a median-based Rosner filter, wherein the Hampel filter is used when the data distribution of the time-series data is either of the platykurtic distribution or the mesokurtic distribution, and wherein the median-based Rosner filter is used when the data distribution of the time-series data is the leptokurtic distribution, wherein the Hampel filter is used to minimize masking effect when detecting the sensitivity content by choosing a higher value for a breakdown point, and wherein the median-based Rosner filter is used, when a number of outliers is unknown, to provide optimal swamping breakdown point; and
identifying a density of the detected sensitivity content, wherein the density of the detected sensitivity content indicates presence of the anomaly in at least two successive instances of the one or more instances, wherein the Hampel filter and the median-based Rosner filter minimize false positive and false negative alarm rates while detecting the sensitivity content in the time-series data.
Dependent claims: 6, 7, 8.
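The claims name a "median-based Rosner filter" without defining it. One plausible reading, offered here purely as an assumption, is Rosner's iterative outlier procedure with the mean and standard deviation replaced by the median and the scaled median absolute deviation (MAD). The sketch below follows that reading in simplified form: the fixed cutoff `threshold` stands in for the critical values of the original test, and `max_outliers` is a hypothetical upper bound on how many candidates to examine, which suits the case where the true number of outliers is unknown.

```python
from statistics import median

MAD_SCALE = 1.4826  # assumption: Gaussian consistency constant for the MAD


def median_rosner_flags(values, max_outliers, threshold=3.0):
    """Return sorted indices judged to be outliers.

    Repeatedly removes the point farthest from the median of the
    remaining data and records its robust z-score. The number of
    outliers is the largest step whose score exceeds the threshold,
    so a weaker early candidate cannot hide a stronger later one.
    """
    remaining = list(range(len(values)))
    removed = []  # (index, robust z-score) in removal order
    for _ in range(max_outliers):
        if len(remaining) < 3:
            break
        sample = [values[i] for i in remaining]
        med = median(sample)
        scale = MAD_SCALE * median([abs(v - med) for v in sample])
        if scale == 0:
            break  # no spread left to measure deviations against
        idx = max(remaining, key=lambda i: abs(values[i] - med))
        removed.append((idx, abs(values[idx] - med) / scale))
        remaining.remove(idx)
    count = 0
    for step, (_, score) in enumerate(removed, start=1):
        if score > threshold:
            count = step
    return sorted(idx for idx, _ in removed[:count])


readings = [10, 10.1, 9.9, 10, 25, 10.2, 9.8, 10, 30, 9.9]
outliers = median_rosner_flags(readings, max_outliers=3)
print(outliers)
```

With `max_outliers=3` the procedure examines three candidates but reports only the two extreme readings (25 and 30); the third candidate scores well below the cutoff and is kept.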
9. A non-transitory computer readable medium embodying a program executable in a server for detecting sensitivity content in time-series data, the program comprising:
a program code for receiving the time-series data from a source, wherein the time-series data is received for one or more instances, and wherein an instance of the one or more instances is associated with a value of the time-series data, and wherein the source comprises one or more sensors, wherein the one or more sensors measure the value of the time-series data and convert the value into one or more signals, wherein the one or more sensors include one or more wireless sensors to transmit the time-series data to the server in real-time;
a program code for detecting the sensitivity content in the time-series data, wherein the sensitivity content indicates presence of an anomaly and the sensitivity content is defined as a minute statistical anomaly indicative of existence of private information in the time-series data, wherein the sensitivity content is detected for a value corresponding to each instance from the one or more instances in the time-series data, wherein the program code for detecting comprises:
a program code for determining a kurtosis value corresponding to each instance of the one or more instances associated with the value of the time-series data in a data distribution of the time-series data, wherein the value of the time-series data includes a plurality of time stamps associated with the one or more instances;
a program code for comparing the kurtosis value with a reference value;
a program code for determining a data distribution of the time-series data based upon the comparison, wherein the data distribution is one of a platykurtic distribution when the kurtosis value is less than the reference value, a mesokurtic distribution when the kurtosis value is equal to the reference value, and a leptokurtic distribution when the kurtosis value is greater than the reference value; and
a program code for processing the time-series data using a Hampel filter and a median-based Rosner filter, wherein the Hampel filter is used when the data distribution of the time-series data is either of the platykurtic distribution or the mesokurtic distribution, and wherein the median-based Rosner filter is used when the data distribution of the time-series data is the leptokurtic distribution, wherein the Hampel filter is used to minimize masking effect when detecting the sensitivity content by choosing a higher value for a breakdown point, and wherein the median-based Rosner filter is used, when a number of outliers is unknown, to provide optimal swamping breakdown point; and
a program code for identifying a density of the detected sensitivity content, wherein the density of the detected sensitivity content indicates presence of the anomaly in at least two successive instances of the one or more instances, wherein the Hampel filter and the median-based Rosner filter minimize false positive and false negative alarm rates while detecting the sensitivity content in the time-series data.
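The final element of each independent claim, identifying a density of detected sensitivity content across at least two successive instances, can be pictured as a search for runs of consecutive flags. The sketch below is an illustration under assumptions: the function name `dense_runs`, the `min_run` parameter, and the representation of detections as a list of booleans are hypothetical and not taken from the patent.

```python
def dense_runs(flags, min_run=2):
    """Return (start, end) index pairs for runs of consecutive flagged instances.

    Only runs of at least min_run flagged instances are reported, which
    mirrors the "at least two successive instances" wording when
    min_run is 2. Indices are inclusive.
    """
    runs = []
    start = None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i  # a new run begins here
        elif not flagged and start is not None:
            if i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the series.
    if start is not None and len(flags) - start >= min_run:
        runs.append((start, len(flags) - 1))
    return runs


detections = [False, True, True, False, True, False, True, True, True]
runs = dense_runs(detections)
print(runs)
```

The isolated flag at index 4 is ignored, while the two runs of successive flags are reported. The boolean list could come from either filter's output.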
Specification