SPANNING-TREE PROGRESSION ANALYSIS OF DENSITY-NORMALIZED EVENTS (SPADE)
First Claim
1. A computer implemented method of analyzing and sorting feature data from a large number of samples comprising:
detecting features of said samples using a feature detecting system;
determining numerical feature values representing said detected features;
storing said numerical feature values in an initial sample database in a digital memory at a computer system, said initial sample database comprising an array with dimensions roughly equal to the number of said samples by the number of different feature values stored for each sample;
density-dependent downsampling said sample database using executable logic at said computer system by determining a local density value for samples in said array and removing a portion of samples in dense regions of said array;
storing a downsampled sample database comprising a downsampled array in said digital memory at said computer system;
clustering samples in said downsampled array by agglomerative clustering using executable logic at said computer system to determine a plurality of sample clusters;
storing data regarding said sample clusters in said digital memory at said computer system;
determining one or more progression trees connecting said clusters using said executable logic at said computer system;
storing data regarding said progression trees at said computer system, and
said computer system outputting to a user multiple representations of a progression tree of said clusters, a topology of said representations indicating a progression or hierarchy of said clusters, and color or other indicators of said representations indicating different feature values of said clusters.
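The density-dependent downsampling step recited above can be illustrated with a minimal sketch. The claim does not specify how the local density value is computed or how samples are selected for removal, so the choices here are assumptions made for illustration only: density is a neighbor count within a fixed `radius`, and each sample is kept with probability `target_density / density`. The function names and parameters are hypothetical and are not taken from the patent.

```python
import math
import random


def local_densities(points, radius):
    """For each point, count the points within `radius` of it (itself included)."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]


def density_dependent_downsample(points, radius, target_density, seed=0):
    """Keep sparse-region points; thin dense regions toward `target_density`.

    Assumed rule (not from the claim): keep each point with probability
    min(1, target_density / local_density).
    """
    rng = random.Random(seed)
    kept = []
    for point, density in zip(points, local_densities(points, radius)):
        keep_probability = min(1.0, target_density / density)
        if rng.random() < keep_probability:
            kept.append(point)
    return kept


# Toy data: one dense blob of 200 samples plus three isolated "rare" samples.
data_rng = random.Random(1)
dense_blob = [(data_rng.gauss(0, 0.05), data_rng.gauss(0, 0.05)) for _ in range(200)]
rare_samples = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
downsampled = density_dependent_downsample(
    dense_blob + rare_samples, radius=0.5, target_density=10
)
```

Under this assumed rule, isolated samples have a keep probability of 1 and therefore always survive, while most of the dense blob is removed.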
Abstract
Methods and systems for determining progression and other characteristics of microarray expression levels and similar information, alternatively using a network or communications medium or tangible storage medium or logic processor.
12 Citations
64 Claims
1. A computer implemented method of analyzing and sorting feature data from a large number of samples comprising:
detecting features of said samples using a feature detecting system;
determining numerical feature values representing said detected features;
storing said numerical feature values in an initial sample database in a digital memory at a computer system, said initial sample database comprising an array with dimensions roughly equal to the number of said samples by the number of different feature values stored for each sample;
density-dependent downsampling said sample database using executable logic at said computer system by determining a local density value for samples in said array and removing a portion of samples in dense regions of said array;
storing a downsampled sample database comprising a downsampled array in said digital memory at said computer system;
clustering samples in said downsampled array by agglomerative clustering using executable logic at said computer system to determine a plurality of sample clusters;
storing data regarding said sample clusters in said digital memory at said computer system;
determining one or more progression trees connecting said clusters using said executable logic at said computer system;
storing data regarding said progression trees at said computer system, and
said computer system outputting to a user multiple representations of a progression tree of said clusters, a topology of said representations indicating a progression or hierarchy of said clusters, and color or other indicators of said representations indicating different feature values of said clusters.
Dependent claims: 4, 5, 7, 8, 13, 18, 19, 29, 31, 32, 40, 45, 52, 54, 63
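As an illustration of the agglomerative clustering step in claim 1, the following is a minimal sketch that starts with every sample in its own cluster and repeatedly merges the two closest clusters. The claim does not name a linkage criterion or a stopping rule, so single linkage and a fixed `n_clusters` are assumptions here, and the function names are hypothetical.

```python
import math


def single_linkage_distance(cluster_a, cluster_b):
    """Smallest Euclidean distance between any pair of points across two clusters."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerative_clusters(points, n_clusters):
    """Merge the closest pair of clusters until only `n_clusters` remain."""
    clusters = [[p] for p in points]  # every sample starts as its own cluster
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]
    return clusters


samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
clusters = agglomerative_clusters(samples, n_clusters=3)
```

This naive version recomputes all pairwise cluster distances on every merge, which is only practical for small inputs; that cost is one reason a downsampling step before clustering is useful.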
2. (canceled)
3. A computer implemented method for clustering and visualization of multicolor flow cytometry data comprising:
receiving cell samples from one or more subjects;
analyzing the samples using a flow cytometer, thereby yielding a multi-dimensional data set;
estimating a density function for cell sample points in said multi-dimensional data set;
creating a down-sampled array by removing a portion of samples in dense regions of said array;
clustering cell samples in said downsampled array by agglomerative clustering to determine a plurality of sample clusters;
estimating one or more progression trees in a Euclidean space having a dimensionality of three or less representing progression or hierarchy of said clusters, where the steps of creating, clustering, and estimating are executed by a processor of a computing device; and
graphically displaying relationships between clusters using data in the Euclidean space on a display of the computing device.
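The progression-tree step can be sketched as a spanning tree over cluster centroids. The claim itself does not say how the tree is estimated; a minimum spanning tree is assumed here because the title refers to spanning-tree progression analysis, and Prim's algorithm with Euclidean distance between centroids is an illustrative choice. The function names are hypothetical.

```python
import math


def centroid(cluster):
    """Coordinate-wise mean of the points in one cluster."""
    dims = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dims))


def minimum_spanning_tree(nodes):
    """Prim's algorithm; returns edges (i, j) where i was already in the tree."""
    if not nodes:
        return []
    in_tree = {0}  # grow the tree from the first node
    edges = []
    while len(in_tree) < len(nodes):
        best = None  # (distance, tree_node, new_node) of the cheapest edge so far
        for i in in_tree:
            for j in range(len(nodes)):
                if j in in_tree:
                    continue
                d = math.dist(nodes[i], nodes[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        edges.append((i, j))
        in_tree.add(j)
    return edges


clusters = [[(0, 0), (0, 2)], [(3, 1)], [(10, 1)], [(3, 5)]]
centroids = [centroid(c) for c in clusters]
tree_edges = minimum_spanning_tree(centroids)
```

The resulting edge list gives the tree topology; laying the nodes out in two or three dimensions and coloring them by a feature value would be a separate display step that this sketch does not cover.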
6. (canceled)
9-12. (canceled)
14-17. (canceled)
20-28. (canceled)
30. (canceled)
33-39. (canceled)
41-44. (canceled)
46-51. (canceled)
55-60. (canceled)
61. A system for flow cytometry or biologic analysis or diagnosis comprising:
an input component reading sample data comprising multiple feature values for each sample;
a density dependent downsampling component able to reduce the density of samples in a large dataset while preserving rare-samples and overall dataset shape;
a clustering component and processor clustering samples into a number of sample clusters;
a feature progression and differentiation determining component determining underlying sample cluster progression and differentiation using one or more of said feature values;
a progression tree output and analysis module providing output and analysis of progression trees to determine hierarchy, progression, or differentiation of said samples.
Dependent claims: 62
64-73. (canceled)
Specification