System, method and software arrangement utilizing a multi-strip procedure that can be applied to gene characterization using DNA-array data
Abstract
A system, method and software arrangement are provided that use a fast adaptive multiscale procedure to characterize a random set of points spanning a high dimensional Euclidean space, and concentrated around special lower dimensional subsets. The procedure can be adapted to analyze gene expression data from microarray experiments, and may be applied generally to existing datasets without regard to whether a particular model exists to otherwise describe the dataset. The procedure accordingly can be used for identifying and mathematically isolating stable sets of data points in a given dataset from those in the same dataset that deviate from a stable model under various conditions.
32 Claims
1. A process for determining statistically-outlying data points present in at least one dataset, comprising:
a) receiving the at least one dataset;
b) determining at least one interval associated with the dataset;
c) using a hardware processing arrangement which comprises a processor, determining a plurality of subintervals of the at least one interval by iteratively dividing at least one of the at least one interval more than once until at least one predetermined criteria is met; and
d) determining the statistically-outlying data points present in the at least one dataset, wherein each data point is associated with a particular subinterval of the subintervals, and wherein the determination is performed based on information related to the subintervals and as a function of the length of the particular subinterval of the subintervals associated with the particular data point.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13.
14. A non-transitory storage medium which includes thereon a software arrangement to be executed by a hardware processing arrangement for determining statistically-outlying data points present in at least one dataset, the software arrangement comprising:
a) a first set of instructions operable to configure the processing arrangement to receive the at least one dataset;
b) a second set of instructions operable to configure the processing arrangement to determine at least one interval associated with the dataset;
c) a third set of instructions operable to configure the processing arrangement to determine a plurality of subintervals of the at least one interval by iteratively dividing at least one of the at least one interval more than once until at least one predetermined criteria is met; and
d) a fourth set of instructions operable to configure the processing arrangement to determine the statistically-outlying data points present in the at least one dataset, wherein each data point is associated with a particular subinterval of the subintervals, and wherein the determination is performed based on information related to the subintervals and as a function of the length of the particular subinterval of the subintervals associated with the particular data point.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26.
27. A system for determining statistically-outlying data points present in at least one dataset, comprising:
a hardware processing arrangement which includes a processor and which is operably configured to;
a) receive the at least one dataset;
b) determine at least one interval associated with the dataset;
c) determine a plurality of subintervals of the at least one interval by iteratively dividing at least one of the at least one interval more than once until at least one predetermined criteria is met; and
d) determine the statistically-outlying data points present in the at least one dataset, wherein each data point is associated with a particular subinterval of the subintervals, and wherein the determination is performed based on information related to the subintervals and as a function of the length of the particular subinterval of the subintervals associated with the particular data point.
Dependent claims: 28, 29, 30, 31, 32.
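Steps a) through d) of the independent claims can be illustrated with a minimal one-dimensional sketch. This is not the procedure disclosed in the specification. It is one possible reading under stated assumptions: the interval is taken as the data range, division is by bisection, the "predetermined criteria" is a point-count limit with a depth cap, and a point is flagged when the subinterval holding it is much longer than the typical subinterval. The names `split_interval`, `find_outliers`, `max_points`, `max_depth`, and `length_factor` are hypothetical.

```python
from statistics import median


def split_interval(points, lo, hi, max_points=2, max_depth=12, depth=0):
    """Step c (assumed form): bisect [lo, hi] repeatedly.

    Division stops when a subinterval holds at most `max_points` points
    or when `max_depth` is reached. Both stopping rules are assumptions
    standing in for the claim's "predetermined criteria".
    Returns a list of (lo, hi, points_in_subinterval) tuples.
    """
    if len(points) <= max_points or depth >= max_depth:
        return [(lo, hi, points)]
    mid = (lo + hi) / 2.0
    left = [p for p in points if p < mid]
    right = [p for p in points if p >= mid]
    return (split_interval(left, lo, mid, max_points, max_depth, depth + 1)
            + split_interval(right, mid, hi, max_points, max_depth, depth + 1))


def find_outliers(data, max_points=2, length_factor=8.0):
    """Steps a, b and d (assumed form) for a 1-D dataset.

    A point is flagged when its subinterval is longer than
    `length_factor` times the median length of the non-empty
    subintervals. This threshold rule is an illustrative assumption.
    """
    if not data:
        return []
    # Step b: take the interval to be the range of the data.
    lo, hi = min(data), max(data)
    if lo == hi:
        return []  # all points coincide, nothing to subdivide
    leaves = split_interval(sorted(data), lo, hi, max_points)
    # Only subintervals that actually contain points are compared.
    occupied = [leaf for leaf in leaves if leaf[2]]
    typical = median(b - a for a, b, _ in occupied)
    outliers = []
    for a, b, pts in occupied:
        # Step d: the decision depends on the subinterval's length.
        if (b - a) > length_factor * typical:
            outliers.extend(pts)
    return sorted(outliers)


sample = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 10.0]
flagged = find_outliers(sample)
print(flagged)
```

In this sketch, densely populated regions are divided many times and end up in short subintervals, while an isolated point remains in a long one, so subinterval length acts as a rough inverse measure of local density.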
Specification