Method and Apparatus for Representing Multidimensional Data
First Claim
1. A program storage device readable by a machine, said device tangibly embodying at least one program of instructions executable by the machine to cause the machine to perform steps for a method of representing data at multiple resolutions, said method comprising:
a. providing a data set;
b. representing said data in a multidimensional space;
c. dividing said multidimensional space into discrete data bins; and
d. subdividing data from each bin into finer resolution bins, wherein for at least one current bin, the subdividing comprises;
i. determining the direction of maximum variance of data contained within the current bin;
ii. rotating the coordinates of the data space in the direction of maximum variance, wherein the first axis of the rotated coordinates is parallel to the direction of maximum variance;
iii. determining the median value of the first coordinate in the rotated coordinate system for the collection of data comprising the selected bin;
iv. splitting the data comprising the current bin into two finer resolution bins, the first portion of the selected, split bin being comprised of events with a first coordinate less than or equal to the median, the second portion of the selected, split bin being comprised of events with a value of the first coordinate greater than the median; and
v. recording the rotation and median value (split value) associated with the current, split bin to a storage device.
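Steps i through v above can be sketched in a few lines. The sketch below is illustrative only and is not taken from the patent. It assumes two-dimensional events stored as `(x, y)` tuples, so that the direction of maximum variance can be found with the closed-form principal-axis angle of a 2x2 covariance matrix. The names `principal_angle`, `first_coordinate`, and `split_bin` are hypothetical, and the "storage device" of step v is stood in for by a returned dictionary.

```python
import math
from statistics import mean, median

def principal_angle(events):
    """Angle (radians) of the direction of maximum variance (2-D assumption)."""
    mx = mean(x for x, _ in events)
    my = mean(y for _, y in events)
    sxx = sum((x - mx) ** 2 for x, _ in events)
    syy = sum((y - my) ** 2 for _, y in events)
    sxy = sum((x - mx) * (y - my) for x, y in events)
    # Closed form for the principal axis of a symmetric 2x2 scatter matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)

def first_coordinate(event, angle):
    """First coordinate of an event after rotating the axes by `angle`."""
    x, y = event
    return x * math.cos(angle) + y * math.sin(angle)

def split_bin(events):
    """Split one bin at the median along its direction of maximum variance."""
    angle = principal_angle(events)                                   # step i
    coords = [first_coordinate(e, angle) for e in events]             # step ii
    split_value = median(coords)                                      # step iii
    lower = [e for e, c in zip(events, coords) if c <= split_value]   # step iv
    upper = [e for e, c in zip(events, coords) if c > split_value]
    record = {"angle": angle, "split_value": split_value}             # step v
    return record, lower, upper

# Hypothetical events lying along the line y = x.
record, lower, upper = split_bin([(0, 0), (1, 1), (2, 2), (3, 3)])
print(record, lower, upper)
```

In higher dimensions the same idea would normally use the leading eigenvector of the covariance matrix in place of the single angle; the two-dimensional form is used here only to keep the sketch dependency-free.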
Abstract
The present invention relates to methods for representing multidimensional data. The methods of the present invention are well suited, but not limited, to representing multidimensional data in a way that enables the comparison and differentiation of data sets. For example, the invention may be applied to the representation of flow cytometric data. The invention further relates to a program storage device having instructions for controlling a computer system to perform the methods, and to a program storage device containing data structures used in the practice of the methods.
20 Claims
1. A program storage device readable by a machine, said device tangibly embodying at least one program of instructions executable by the machine to cause the machine to perform steps for a method of representing data at multiple resolutions, said method comprising:
a. providing a data set;
b. representing said data in a multidimensional space;
c. dividing said multidimensional space into discrete data bins; and
d. subdividing data from each bin into finer resolution bins, wherein for at least one current bin, the subdividing comprises;
i. determining the direction of maximum variance of data contained within the current bin;
ii. rotating the coordinates of the data space in the direction of maximum variance, wherein the first axis of the rotated coordinates is parallel to the direction of maximum variance;
iii. determining the median value of the first coordinate in the rotated coordinate system for the collection of data comprising the selected bin;
iv. splitting the data comprising the current bin into two finer resolution bins, the first portion of the selected, split bin being comprised of events with a first coordinate less than or equal to the median, the second portion of the selected, split bin being comprised of events with a value of the first coordinate greater than the median; and
v. recording the rotation and median value (split value) associated with the current, split bin to a storage device.
Dependent claims: 2 through 18.
19. A program storage device readable by a machine, said device tangibly embodying at least one program of instructions executable by the machine to cause the machine to perform steps for a method of representing data at multiple resolutions, said method comprising:
a. providing a first data set;
b. representing said data in a multidimensional space;
c. dividing said multidimensional space into discrete data bins;
d. subdividing data from each bin into finer resolution bins;
e. determining the direction of maximum variance of data contained within at least one bin;
f. rotating the coordinates of the data space in the direction of maximum variance, wherein the first axis of the rotated coordinates is parallel to the direction of maximum variance, further wherein the rotation is based on the data from said first data set;
g. determining the median value of the first coordinate in the rotated coordinate system for the collection of data comprising the selected bin;
h. splitting the data comprising the selected bin into two bins at the next hierarchical resolution level, the first portion of the selected, split bin being comprised of events with a first coordinate value less than or equal to the median, the second portion of the selected, split bin being comprised of events with a first coordinate value greater than the median;
i. recording the rotation matrix and median value (split value) associated with the current, split bin to a storage device,
j. representing a second data set in a second multidimensional space;
k. dividing said second multidimensional space into a second set of discrete data bins;
l. subdividing data from each of said second bins into finer resolution bins;
m. rotating the coordinates of the second data space based on the corresponding rotation matrix from said first data set;
n. in a selected second bin, splitting the data comprising the second selected bin into two bins at the next hierarchical resolution level, the first portion of the second selected, split bin being comprised of events with a first coordinate value less than or equal to the median of the corresponding bin determined for said first data set in step g.), the second portion of the second selected, split bin being comprised of events with a first coordinate value greater than the median of the corresponding bin determined for said first data set in step g.); and
o. determining one-dimensional lists of numbers comprising fingerprints for a set of instances relative to the representation of a multidimensional data set processed by the binning procedure, the method comprising forming a template instance by combining the events from a set of instances into a single data set.
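The relationship between the first (template) data set and the second data set in steps a through o can be illustrated with a short sketch. It is not the patented implementation. It assumes two-dimensional events as `(x, y)` tuples, and it assumes that a fingerprint is the list of event counts per finest-resolution bin, which the claim does not state. The names `build_tree`, `fingerprint`, `project`, and `principal_angle` are hypothetical.

```python
import math
from statistics import mean, median

def principal_angle(events):
    """Angle of the direction of maximum variance (2-D assumption)."""
    mx = mean(x for x, _ in events)
    my = mean(y for _, y in events)
    sxx = sum((x - mx) ** 2 for x, _ in events)
    syy = sum((y - my) ** 2 for _, y in events)
    sxy = sum((x - mx) * (y - my) for x, y in events)
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)

def project(event, angle):
    """First coordinate of an event in the rotated coordinate system."""
    return event[0] * math.cos(angle) + event[1] * math.sin(angle)

def build_tree(events, depth):
    """Steps e to i: recursively split the template and record each split."""
    if depth == 0 or len(events) < 2:
        return None  # finest-resolution bin; nothing further to record
    angle = principal_angle(events)
    split_value = median(project(e, angle) for e in events)
    lower = [e for e in events if project(e, angle) <= split_value]
    upper = [e for e in events if project(e, angle) > split_value]
    return {"angle": angle, "split_value": split_value,
            "lower": build_tree(lower, depth - 1),
            "upper": build_tree(upper, depth - 1)}

def fingerprint(tree, events):
    """Steps m to o: reuse the recorded splits; count events per final bin."""
    if tree is None:
        return [len(events)]
    angle, split_value = tree["angle"], tree["split_value"]
    lower = [e for e in events if project(e, angle) <= split_value]
    upper = [e for e in events if project(e, angle) > split_value]
    return fingerprint(tree["lower"], lower) + fingerprint(tree["upper"], upper)

# Two hypothetical instances; the template combines their events (step o).
instance_a = [(0, 0), (1, 1), (2, 2), (3, 3)]
instance_b = [(2, 2), (3, 3), (4, 4), (5, 5)]
tree = build_tree(instance_a + instance_b, depth=1)
print(fingerprint(tree, instance_a), fingerprint(tree, instance_b))
```

Because every instance is routed through the same recorded rotations and split values, the resulting lists have the same length and bin order, which is what makes them comparable across data sets.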
20. A computing environment providing a device readable by a machine, said device tangibly embodying at least one program of instructions executable by the machine to cause the machine to perform steps for a method of representing data at multiple resolutions, said method comprising:
a. providing a data set;
b. representing said data in a multidimensional space;
c. dividing said multidimensional space into discrete data bins;
d. subdividing data from each bin into finer resolution bins;
e. determining the direction of maximum variance of data contained within at least one bin;
f. rotating the coordinates of the data space in the direction of maximum variance, wherein the first axis of the rotated coordinates is parallel to the direction of maximum variance;
g. determining the median value of the first coordinate in the rotated coordinate system for the collection of data comprising the selected bin;
h. splitting the data comprising the selected bin into two bins at the next hierarchical resolution level, the first portion of the selected, split bin being comprised of events with a first coordinate value less than or equal to the median, the second portion of the selected, split bin being comprised of events with a first coordinate value greater than the median; and
i. recording the rotation matrix and median value (split value) associated with the current, split bin to a storage device.
Specification