Data discovery nodes
Abstract
A framework and interface for invoking and assimilating external algorithms and interacting with said algorithms in-session and real-time are described herein. An example embodiment also includes reproducible, updatable nodes that can be leveraged for data-driven analysis whereby the data itself can direct the algorithm choice, variables, and presentation leading to iteration and optimization in an analysis workflow. With example embodiments, an entire discovery or diagnosis process may be executed on a particular data set, thereby divorcing the discovery or diagnosis process from a specific data set such that the same discovery or diagnosis process, phenotype identification, and visualizations may be repeated on future experiments, published, validated, or shared with another investigator.
24 Claims
1. A computer program product for processing scientific data according to a model that is independent of any specific data set, the computer program product comprising:
a data discovery node data structure resident on a non-transitory computer-readable storage medium, the data discovery node data structure comprising
(1) a specification of scientific data to be subjected to an iterative scientific data analysis,
(2) a specification of an output format for the iterative scientific data analysis, and
(3) a specification of a plurality of operational variables for controlling the iterative scientific data analysis, the specified operational variables comprising
(i) a specification of an algorithm to be performed on the specified scientific data as part of the iterative scientific data analysis,
(ii) a specification of metadata, the specified metadata configured to define conditions under which the specified algorithm will be applied to the specified scientific data, and
(iii) a specification of a satisfaction variable, the specified satisfaction variable configured to control how many iterations are performed as part of the iterative scientific data analysis; and
a plurality of processor-executable instructions that are resident on a non-transitory computer-readable storage medium, wherein the instructions are configured, upon execution by a processor of a computer, to cause the computer to:
read and invoke the data discovery data structure to perform the iterative scientific data analysis on a specific data set corresponding to the specified scientific data according to the specified operational variables,
determine whether the metadata meets a metadata rule criteria specified by one of the plurality of operational variables, and
generate a result in the specified output format,
wherein the step of determining whether the metadata meets a metadata rule criteria comprises testing the metadata against the metadata rule criteria according to a mode selected from the group consisting of a loose mode, a moderate mode, and a strict mode,
wherein the loose mode specifies the metadata has no requirements to meet the metadata rule criteria,
wherein the moderate mode specifies that the metadata must meet a number of criteria of the metadata rule criteria over a user-set threshold, and
wherein the strict mode specifies that the metadata must meet all criteria of the metadata rule criteria.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
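For illustration only, the loose, moderate, and strict modes recited in claim 1 can be sketched as follows. This is a minimal sketch, not the patented implementation: the names Mode, Criterion, and metadata_meets_rules are hypothetical, each criterion is assumed to be a predicate over a metadata dictionary, and "over a user-set threshold" is assumed to mean strictly more criteria met than the threshold.

```python
from enum import Enum
from typing import Callable, Mapping, Sequence

# Assumption: one rule criterion is a predicate over a metadata mapping.
Criterion = Callable[[Mapping[str, object]], bool]


class Mode(Enum):
    LOOSE = "loose"
    MODERATE = "moderate"
    STRICT = "strict"


def metadata_meets_rules(
    metadata: Mapping[str, object],
    criteria: Sequence[Criterion],
    mode: Mode,
    threshold: int = 0,
) -> bool:
    """Test metadata against the rule criteria under the selected mode."""
    if mode is Mode.LOOSE:
        # Loose mode: no requirements, so the check always passes.
        return True
    met = sum(1 for criterion in criteria if criterion(metadata))
    if mode is Mode.MODERATE:
        # Moderate mode: the number of criteria met must exceed the
        # user-set threshold (strict inequality is an assumption).
        return met > threshold
    # Strict mode: every criterion must be met.
    return met == len(criteria)


# Hypothetical metadata and criteria, invented for this example.
metadata = {"instrument": "cytometer", "channels": 8}
criteria = [
    lambda m: m.get("instrument") == "cytometer",  # met
    lambda m: m.get("channels", 0) >= 10,          # not met
]

loose_ok = metadata_meets_rules(metadata, criteria, Mode.LOOSE)
moderate_ok = metadata_meets_rules(metadata, criteria, Mode.MODERATE, threshold=0)
strict_ok = metadata_meets_rules(metadata, criteria, Mode.STRICT)
```

With one of two criteria met, the loose and moderate checks pass (the latter because one criterion exceeds a threshold of zero) and the strict check fails.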
11. A method for analyzing scientific data comprising:
applying a data discovery node data structure to a data file, the data file comprising scientific data collected by an acquisition instrument, the data file having metadata associated therewith, wherein the applying step comprises:
loading a plurality of operational variables associated with the data discovery node and the metadata associated with the data file into memory;
determining whether the metadata meets a metadata rule criteria specified by one of the plurality of operational variables; and
in response to a determination that the metadata meets the metadata rule criteria:
loading the scientific data associated with the data file into memory;
executing a first analysis algorithm on the scientific data associated with the data file, wherein one of the plurality of operational variables specifies the first analysis algorithm;
creating a temporary data object that defines a satisfaction variable;
determining whether the temporary data object's satisfaction variable satisfies a satisfaction threshold specified by one of the plurality of operational variables; and
in response to a determination that the temporary data object's satisfaction variable does not satisfy the satisfaction threshold, (1) executing either the first analysis algorithm or a second analysis algorithm on a full set or a subset of the scientific data associated with the data file, wherein one of the plurality of operational variables defines whether to apply the first analysis algorithm or the second analysis algorithm to the full set or the subset of the raw data, and (2) updating the temporary data object based on the executing of the first analysis algorithm or the second analysis algorithm; and
repeatedly performing the steps of (1) determining whether the temporary data object's satisfaction variable satisfies the satisfaction threshold, (2) executing either the first analysis algorithm or the second analysis algorithm, and (3) updating the temporary data object until the updated temporary data object's satisfaction variable satisfies the satisfaction threshold,
wherein the step of determining whether the metadata meets a metadata rule criteria comprises testing the metadata against the metadata rule criteria according to a mode selected from the group consisting of a loose mode, a moderate mode, and a strict mode,
wherein the loose mode specifies the metadata has no requirements to meet the metadata rule criteria,
wherein the moderate mode specifies that the metadata must meet a number of criteria of the metadata rule criteria over a user-set threshold,
wherein the strict mode specifies that the metadata must meet all criteria of the metadata rule criteria, and
wherein the method steps are performed by a processor.
Dependent claims: 12, 13, 14, 15, 16
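The iterate-until-satisfied loop recited in claim 11 might look like the sketch below. Everything here is an assumption made for illustration: the names TempObject, run_node, summarize, and trim_extreme are hypothetical, the satisfaction variable is a made-up score (10 minus the value spread), "satisfies the threshold" is taken to mean score >= threshold, and max_iterations is a safety cap that the claim does not recite.

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, List, Sequence, Tuple


@dataclass
class TempObject:
    """Temporary data object holding the satisfaction variable."""
    values: List[float]   # data retained so far (full set or a subset)
    satisfaction: float   # hypothetical score; higher is better


Algorithm = Callable[[Sequence[float]], TempObject]


def score(values: Sequence[float]) -> float:
    # Made-up satisfaction measure: a tighter spread scores higher.
    return 10 - (max(values) - min(values))


def summarize(values: Sequence[float]) -> TempObject:
    """Hypothetical first algorithm: keep everything and score it."""
    kept = list(values)
    return TempObject(kept, score(kept))


def trim_extreme(values: Sequence[float]) -> TempObject:
    """Hypothetical second algorithm: drop the value farthest from the median."""
    kept = list(values)
    if len(kept) > 1:
        center = median(kept)
        kept.remove(max(kept, key=lambda v: abs(v - center)))
    return TempObject(kept, score(kept))


def run_node(
    data: Sequence[float],
    first: Algorithm,
    second: Algorithm,
    threshold: float,
    retry_with_second: bool = True,   # operational variable: which algorithm
    retry_on_subset: bool = True,     # operational variable: subset or full set
    max_iterations: int = 50,         # safety cap, not part of the claim
) -> Tuple[TempObject, int]:
    temp = first(data)                # initial run creates the temporary object
    iterations = 1
    while temp.satisfaction < threshold and iterations < max_iterations:
        algorithm = second if retry_with_second else first
        target = temp.values if retry_on_subset else data
        temp = algorithm(target)      # update the temporary object
        iterations += 1
    return temp, iterations


result, passes = run_node([1, 2, 3, 2, 50], summarize, trim_extreme, threshold=8)
# First pass scores 10 - 49 = -39; one trim removes 50, giving 10 - 2 = 8.
```

In this example the first pass fails the threshold, a single trimming pass removes the outlier, and the loop stops after two passes with the values [1, 2, 3, 2].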
17. A computer program product comprising:
a plurality of processor-executable instructions that are resident on a non-transitory computer-readable storage medium, wherein the instructions are configured for execution by the processor to analyze scientific data by causing the computer to:
apply a node data structure to a data file, the data file comprising scientific data collected by an acquisition instrument, the data file having metadata associated therewith, wherein the apply operation is configured to:
load a plurality of operational variables associated with the data discovery node and the metadata associated with the data file into memory;
determine whether the metadata meets a metadata rule criteria specified by one of the plurality of operational variables; and
in response to a determination that the metadata meets the metadata rule criteria:
load the scientific data associated with the data file into memory;
execute a first analysis algorithm on the scientific data associated with the data file, wherein one of the plurality of operational variables specifies the first analysis algorithm;
create a temporary data object that defines a satisfaction variable;
determine whether the temporary data object's satisfaction variable satisfies a satisfaction threshold specified by one of the plurality of operational variables; and
in response to a determination that the temporary data object's satisfaction variable does not satisfy the satisfaction threshold, (1) execute either the first analysis algorithm or a second analysis algorithm on a full set or a subset of the scientific data associated with the data file, wherein one of the plurality of operational variables defines whether to apply the first analysis algorithm or the second analysis algorithm to the full set or the subset of the raw data, and (2) update the temporary data object based on the executing of the first analysis algorithm or the second analysis algorithm; and
repeatedly perform the (1) determination operation as whether the temporary data object's satisfaction variable satisfies the satisfaction threshold, (2) the first analysis algorithm or the second analysis algorithm execution operation, and (3) the update operation until the updated temporary data object's satisfaction variable satisfies the satisfaction threshold,
wherein the step of determining whether the metadata meets a metadata rule criteria comprises testing the metadata against the metadata rule criteria according to a mode selected from the group consisting of a loose mode, a moderate mode, and a strict mode,
wherein the loose mode specifies the metadata has no requirements to meet the metadata rule criteria,
wherein the moderate mode specifies that the metadata must meet a number of criteria of the metadata rule criteria over a user-set threshold, and
wherein the strict mode specifies that the metadata must meet all criteria of the metadata rule criteria.
18. A method for analyzing scientific data comprising:
receiving a specification of a plurality of operational variables, wherein the specification comprises (1) a specification of a satisfaction criteria, (2) a specification of a first analysis algorithm, (3) a specification of a second analysis algorithm, and (4) a specification of metadata specifying conditions under which the first and second analysis algorithms are to be applied to the scientific data;
executing the first analysis algorithm on at least a portion of the scientific data based on the operational variable that specifies the first analysis algorithm and the operational variable that specifies the conditions under which the first analysis algorithm is to be applied to the scientific data; and
repeatedly executing the first analysis algorithm or a second analysis algorithm on at least a portion of the scientific data based on the results of the executing step and the operational variables until the satisfaction criteria is met; and
determining whether the metadata meets a metadata rule criteria specified by one of the plurality of operational variables,
wherein the step of determining whether the metadata meets a metadata rule criteria comprises testing the metadata against the metadata rule criteria according to a mode selected from the group consisting of a loose mode, a moderate mode, and a strict mode,
wherein the loose mode specifies the metadata has no requirements to meet the metadata rule criteria,
wherein the moderate mode specifies that the metadata must meet a number of criteria of the metadata rule criteria over a user-set threshold,
wherein the strict mode specifies that the metadata must meet all criteria of the metadata rule criteria, and
wherein the method steps are performed by a processor.
Dependent claims: 19, 20, 21, 22, 23
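As a rough illustration of the "receiving a specification" step of claim 18, the four recited parts of the specification could be carried in a small record. The claim does not name a serialization format; JSON, the field names, and the algorithm names "kmeans" and "hierarchical" below are all assumptions chosen for the example.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class OperationalVariables:
    """Hypothetical container for the four parts recited in claim 18."""
    satisfaction_criteria: float          # (1) when to stop iterating
    first_algorithm: str                  # (2) name of the first algorithm
    second_algorithm: str                 # (3) name of the second algorithm
    metadata_conditions: Dict[str, str]   # (4) conditions for applying them


def receive_specification(text: str) -> OperationalVariables:
    """Parse a JSON specification (the JSON format is an assumption)."""
    raw = json.loads(text)
    return OperationalVariables(
        satisfaction_criteria=float(raw["satisfaction_criteria"]),
        first_algorithm=raw["first_algorithm"],
        second_algorithm=raw["second_algorithm"],
        metadata_conditions=dict(raw["metadata_conditions"]),
    )


# Invented example specification.
spec_text = """
{
  "satisfaction_criteria": 0.9,
  "first_algorithm": "kmeans",
  "second_algorithm": "hierarchical",
  "metadata_conditions": {"instrument": "cytometer"}
}
"""
spec = receive_specification(spec_text)
```

A missing field raises KeyError at parse time, so an incomplete specification is rejected before any analysis algorithm runs.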
24. A computer program product comprising:
a plurality of processor-executable instructions that are resident on a non-transitory computer-readable storage medium, wherein the instructions are configured for execution by the processor to analyze scientific data by causing the computer to:
receive a specification of a plurality of operational variables, wherein the specification comprises (1) a specification of a satisfaction criteria, (2) a specification of a first analysis algorithm, (3) a specification of a second analysis algorithm, and (4) a specification of metadata specifying conditions under which the first and second analysis algorithms are to be applied to the scientific data;
execute the first analysis algorithm on at least a portion of the scientific data based on the operational variable that specifies the first analysis algorithm and the operational variable that specifies the conditions under which the first analysis algorithm is to be applied to the scientific data; and
repeatedly execute the first analysis algorithm or a second analysis algorithm on at least a portion of the scientific data based on the results of the executing step and the operational variables until the satisfaction criteria is met;
determining whether the metadata meets a metadata rule criteria specified by one of the plurality of operational variables,
wherein the step of determining whether the metadata meets a metadata rule criteria comprises testing the metadata against the metadata rule criteria according to a mode selected from the group consisting of a loose mode, a moderate mode, and a strict mode,
wherein the loose mode specifies the metadata has no requirements to meet the metadata rule criteria,
wherein the moderate mode specifies that the metadata must meet a number of criteria of the metadata rule criteria over a user-set threshold, and
wherein the strict mode specifies that the metadata must meet all criteria of the metadata rule criteria.
Specification