Systems and methods for adaptively identifying and mitigating statistical outliers in aggregated data
First Claim
1. An apparatus, comprising:
a storage device; and
at least one processor coupled to the storage device, the storage device storing software instructions that are executable by the at least one processor, and the at least one processor being operative with the software instructions and configured to:
obtain aggregated data collected by a plurality of first communications devices associated with corresponding surveyors, the aggregated data comprising aggregated survey response data collected by the first communications devices;
detect at least one data outlier within the aggregated survey response data by applying a numerical technique comprising at least one of a linear regression analysis, conditional or robust regressions, an interquartile outlier check, or a standard deviation analysis, the data outlier corresponding to at least one element of the aggregated survey response data;
determine a magnitude by which the at least one data outlier exceeds a validation limit based on results from the numerical technique;
transmit information identifying the data outlier and at least a portion of the aggregated data that includes the data outlier to a second communications device associated with a survey manager, the information instructing the second communications device to:
generate a first graphical user interface displaying the portion of the aggregated survey response data that includes the data outlier, the data outlier being displayed visually distinguishable within the presented aggregated survey response data portion and being associated with a hyperlink, and
generate a second graphical user interface when the survey manager selects the hyperlink, the second graphical user interface comprising an editable box and displaying the magnitude by which the data outlier exceeds a validation limit, the second graphical user interface being different from the first graphical user interface;
in response to the transmitted information, receive a request to modify the aggregated data from the second communications device, the request comprising a value entered in the editable box;
perform operations that modify at least a portion of the aggregated data in accordance with the received request; and
generate metadata associated with the aggregated data, the metadata comprising:
effected modifications to the portion of the aggregated data; and
sources of the effected modifications.
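As a rough illustration of the "interquartile outlier check" and the step of determining a magnitude by which an outlier exceeds a validation limit, the following Python sketch flags values outside the conventional interquartile fences and reports how far each one lies beyond the nearer fence. The function name `iqr_outliers`, the multiplier `k=1.5`, the sample prices, and the choice to treat the fences as the validation limit are all assumptions made for illustration; the claim does not specify them.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (index, value, magnitude) for each value outside the IQR fences.

    Assumption: the "validation limit" is taken to be the fence
    Q1 - k*IQR or Q3 + k*IQR; the claim text does not define it.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    flagged = []
    for i, v in enumerate(values):
        if v > upper:
            flagged.append((i, v, v - upper))  # amount above the upper fence
        elif v < lower:
            flagged.append((i, v, lower - v))  # amount below the lower fence
    return flagged


# Hypothetical survey responses (e.g. prices recorded by surveyors).
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 100.0]
print(iqr_outliers(prices))  # [(8, 100.0, 78.0)]
```

Here the quartiles are 12.0 and 16.0, so the upper fence is 22.0 and the response of 100.0 exceeds it by 78.0.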
Abstract
The disclosed embodiments include computerized methods and systems that facilitate automated detection and precision correction of aggregated data collected by multiple, geographically dispersed mobile communications devices. In one embodiment, an apparatus may detect a data outlier within portions of the aggregated data having numerical and/or categorical values. The apparatus may transmit information identifying the data outlier and a portion of the aggregated data that includes the data outlier to an additional communications device, which may present the aggregated data portion to a user in a manner that visually distinguishes the data outlier from other elements of the aggregated data. In response to a request from the additional communications device, the apparatus may modify portions of the aggregated data in an effort to mitigate the data outlier.
15 Citations
20 Claims
1. An apparatus, comprising:
a storage device; and
at least one processor coupled to the storage device, the storage device storing software instructions that are executable by the at least one processor, and the at least one processor being operative with the software instructions and configured to:
obtain aggregated data collected by a plurality of first communications devices associated with corresponding surveyors, the aggregated data comprising aggregated survey response data collected by the first communications devices;
detect at least one data outlier within the aggregated survey response data by applying a numerical technique comprising at least one of a linear regression analysis, conditional or robust regressions, an interquartile outlier check, or a standard deviation analysis, the data outlier corresponding to at least one element of the aggregated survey response data;
determine a magnitude by which the at least one data outlier exceeds a validation limit based on results from the numerical technique;
transmit information identifying the data outlier and at least a portion of the aggregated data that includes the data outlier to a second communications device associated with a survey manager, the information instructing the second communications device to:
generate a first graphical user interface displaying the portion of the aggregated survey response data that includes the data outlier, the data outlier being displayed visually distinguishable within the presented aggregated survey response data portion and being associated with a hyperlink, and
generate a second graphical user interface when the survey manager selects the hyperlink, the second graphical user interface comprising an editable box and displaying the magnitude by which the data outlier exceeds a validation limit, the second graphical user interface being different from the first graphical user interface;
in response to the transmitted information, receive a request to modify the aggregated data from the second communications device, the request comprising a value entered in the editable box;
perform operations that modify at least a portion of the aggregated data in accordance with the received request; and
generate metadata associated with the aggregated data, the metadata comprising:
effected modifications to the portion of the aggregated data; and
sources of the effected modifications.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
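The "standard deviation analysis" named as an alternative numerical technique could look something like the sketch below, which flags responses lying more than `k` sample standard deviations from the mean and reports the excess. The function name `stddev_outliers`, the threshold `k=2.0`, the sample data, and the interpretation of the validation limit as `k` standard deviations are assumptions for illustration only, not details drawn from the claims.

```python
from statistics import mean, stdev


def stddev_outliers(values, k=2.0):
    """Flag values whose distance from the mean exceeds k sample standard deviations.

    Assumption: the "validation limit" is k * sigma around the mean;
    the claim text leaves the limit unspecified.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    limit = k * sigma
    flagged = []
    for i, v in enumerate(values):
        deviation = abs(v - mu)
        if deviation > limit:
            flagged.append(
                {"index": i, "value": v, "exceeds_by": deviation - limit}
            )
    return flagged


# Hypothetical aggregated survey responses with one suspicious entry.
responses = [20.0, 21.0, 19.0, 20.0, 22.0, 20.0, 21.0, 19.0, 20.0, 60.0]
for item in stddev_outliers(responses):
    print(item["index"], item["value"], round(item["exceeds_by"], 2))
```

Because a large outlier inflates both the mean and the standard deviation, this check is less sensitive on small samples than a quartile-based one, which may be one reason the claim lists several techniques as alternatives.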
12. A computer-implemented method, comprising:
obtaining, by at least one processor, aggregated data collected by a plurality of first communications devices associated with corresponding surveyors, the aggregated data comprising aggregated survey response data collected by the first communications devices;
detecting, by the at least one processor, at least one data outlier within the aggregated survey response data by applying a numerical technique comprising at least one of a linear regression analysis, conditional or robust regressions, an interquartile outlier check, or a standard deviation analysis, the data outlier corresponding to at least one element of the aggregated survey response data;
determining a magnitude by which the at least one data outlier exceeds a validation limit based on results from the numerical technique;
generating, by the at least one processor, an electronic command to transmit information identifying the data outlier and at least a portion of the aggregated data that includes the data outlier to a second communications device associated with a survey manager, the information instructing the second communications device to:
generate a first graphical user interface displaying the portion of the aggregated survey response data that includes the data outlier, the data outlier being displayed visually distinguishable within the presented aggregated survey response data portion, and being associated with a hyperlink, and
generate a second graphical user interface when the survey manager selects the hyperlink, the second graphical user interface comprising an editable box and displaying the magnitude by which the data outlier exceeds a validation limit, the second graphical user interface being different from the first graphical user interface;
in response to the transmitted information, receiving, by at least one processor, a request to modify the aggregated data from the second communications device, the request comprising a value entered in the editable box;
performing, by at least one processor, operations that modify at least a portion of the aggregated data in accordance with the received request; and
generating metadata associated with the aggregated data, the metadata comprising:
effected modifications to the portion of the aggregated data; and
sources of the effected modifications.
Dependent claims: 13, 14, 15, 16, 17, 18, 19
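The final steps of the method (modifying the aggregated data according to the value entered in the editable box, then generating metadata recording the effected modification and its source) might be sketched as follows. The class `AggregatedData`, the function `apply_modification`, the field names, the `"manager-01"` identifier, and the added timestamp are hypothetical; the claim only requires that the metadata include the effected modifications and their sources.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AggregatedData:
    """Hypothetical container for survey responses plus modification metadata."""
    responses: list
    audit_log: list = field(default_factory=list)


def apply_modification(data, index, new_value, source):
    """Overwrite one response and record what changed and who requested it."""
    old_value = data.responses[index]
    data.responses[index] = new_value
    entry = {
        "index": index,
        "old_value": old_value,   # the effected modification: before ...
        "new_value": new_value,   # ... and after
        "source": source,         # source of the modification
        # Timestamp is an illustrative extra, not required by the claim.
        "modified_at": datetime.now(timezone.utc).isoformat(),
    }
    data.audit_log.append(entry)
    return entry


# Hypothetical request: a survey manager replaces the flagged value 100.0.
data = AggregatedData(responses=[10.0, 11.0, 100.0])
entry = apply_modification(data, index=2, new_value=12.0, source="manager-01")
print(data.responses)                        # [10.0, 11.0, 12.0]
print(entry["old_value"], entry["source"])   # 100.0 manager-01
```

Keeping the metadata in an append-only log alongside the data is one possible design; it preserves the original value so that a correction can be reviewed or reversed later.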
20. A tangible, non-transitory computer-readable medium storing instructions that, when executed by at least one processor, perform a method comprising:
obtaining aggregated data collected by a plurality of first communications devices associated with corresponding surveyors, the aggregated data comprising aggregated survey response data collected by the first communications devices;
detecting at least one data outlier within the aggregated survey response data by applying a numerical technique comprising at least one of a linear regression analysis, conditional or robust regression, and an interquartile check, or a standard deviation analysis, the data outlier corresponding to at least one element of the aggregated survey response data;
determining a magnitude by which the at least one data outlier exceeds a validation limit based on results from the numerical technique;
generating an electronic command to transmit information identifying the data outlier and at least a portion of the aggregated data that includes the data outlier to a second communications device associated with a survey manager, the information instructing the second communications device to:
generate a first graphical user interface displaying the portion of the aggregated survey response data that includes the data outlier, the data outlier being displayed visually distinguishable within the presented aggregated survey response data portion, and being associated with a hyperlink, and
generate a second graphical user interface when the survey manager selects the hyperlink, the second graphical user interface comprising an editable box and displaying the magnitude by which the data outlier exceeds a validation limit, the second graphical user interface being different from the first graphical user interface;
in response to the transmitted information, receiving a request to modify the aggregated data from the second communications device, the request comprising a value entered in the editable box;
performing operations that modify at least a portion of the aggregated data in accordance with the received request; and
generating metadata associated with the aggregated data, the metadata comprising:
effected modifications to the portion of the aggregated data; and
sources of the effected modifications.
Specification