Indexing of large scale patient set
First Claim
1. A non-transitory computer readable storage medium comprising a computer readable program for indexing data, wherein the computer readable program when executed on a computer causes the computer to perform the steps of:
formulating an objective function to index a dataset, a portion of the dataset including supervision information;
determining a data property component and a supervised component for grouping data of the dataset, the determining the data property component including maximizing a variance of the dataset by applying a principal component analysis (PCA) and maximum variance unfolding (MVU), the MVU comprising forming an inner product of a data matrix and a learned partition hyperplane and maximizing overall pairwise distances in a projected space to reduce computational costs during the indexing, and the PCA comprising determining an eigenvector from a data covariance matrix with a largest corresponding eigenvalue; and
optimizing, using a processor, the objective function based upon the data property component and the supervised component to partition a node into a plurality of child nodes.
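The PCA step recited in the claim can be illustrated with a minimal sketch. It assumes NumPy and a single binary split; the function name `pca_split` and the zero threshold on the centered projection are illustrative assumptions and are not taken from the patent. The partition direction is the eigenvector of the data covariance matrix with the largest eigenvalue, and each point is routed to one of two child nodes by the sign of its projection.

```python
import numpy as np

def pca_split(X):
    """Split the rows of X into two child nodes along the top principal direction.

    Hypothetical helper for illustration only; the zero threshold is an assumption.
    """
    Xc = X - X.mean(axis=0)                  # center the data
    cov = Xc.T @ Xc / len(X)                 # data covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues returned in ascending order
    w = eigvecs[:, -1]                       # eigenvector with the largest eigenvalue
    proj = Xc @ w                            # inner product of data matrix and hyperplane normal
    left = np.flatnonzero(proj <= 0)         # indices sent to the first child node
    right = np.flatnonzero(proj > 0)         # indices sent to the second child node
    return w, left, right

# Synthetic data whose variance is dominated by the first coordinate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])
w, left, right = pca_split(X)
```

Applying the same split recursively to each child node would yield a tree-shaped index; the claim itself only recites partitioning a node into a plurality of child nodes.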
Abstract
Systems and methods for indexing data include formulating an objective function to index a dataset, a portion of the dataset including supervision information. A data property component of the objective function is determined, which utilizes a property of the dataset to group data of the dataset. A supervised component of the objective function is determined, which utilizes the supervision information to group data of the dataset. The objective function is optimized using a processor based upon the data property component and the supervised component to partition a node into a plurality of child nodes.
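One way to picture how a data property component and a supervised component might be combined in a single objective is sketched below. Everything specific here is an assumption made for illustration: the function name `supervised_split`, the pairwise matrix `S` (+1 for labeled pairs with the same label, -1 otherwise), the trade-off weight `lam`, and the quadratic form J(w) = wᵀ(X_lᵀ S X_l + lam · XᵀX)w maximized over unit vectors w. The abstract does not give the objective in this form.

```python
import numpy as np

def supervised_split(X, labeled_idx, labels, lam=1.0):
    """Binary partition of a node from a combined objective (illustrative assumption).

    Maximizes J(w) = w^T (Xl^T S Xl + lam * Xc^T Xc) w subject to ||w|| = 1,
    whose maximizer is the top eigenvector of the symmetric matrix M below.
    """
    Xc = X - X.mean(axis=0)                  # centered data
    Xl = Xc[labeled_idx]                     # rows that carry supervision information
    labels = np.asarray(labels)
    # Pairwise supervision: +1 if two labeled points share a label, -1 otherwise.
    S = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
    M = Xl.T @ S @ Xl + lam * (Xc.T @ Xc)    # supervised term + data property (variance) term
    M = (M + M.T) / 2                        # guard against floating-point asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)     # ascending eigenvalues
    w = eigvecs[:, -1]                       # direction maximizing the objective
    side = (Xc @ w) > 0                      # child-node assignment for every point
    return w, side

# Synthetic data: the classes separate along axis 1, but axis 0 has the larger variance.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X = np.column_stack([
    rng.normal(0.0, 3.0, size=100),
    np.where(y == 1, 2.0, -2.0) + rng.normal(0.0, 0.3, size=100),
])
labeled_idx = np.r_[0:20, 50:70]             # only 40 of the 100 points are labeled
w, side = supervised_split(X, labeled_idx, y[labeled_idx], lam=0.1)
```

In this toy setup the supervised term pulls the partition direction toward the axis that separates the labels, whereas a purely variance-based split would favor the noisier axis. The weight `lam` controls how strongly the unlabeled data influence the result.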
25 Citations
8 Claims
2. A system for indexing data, comprising:
a processor operatively coupled to a memory, the processor being configured to: formulate an objective function to index a dataset stored on a computer readable storage medium, a portion of the dataset including supervision information; determine a data property component and a supervision module configured to determine a supervised component of the objective function for grouping data of the dataset, the data property component being determined by maximizing a variance of the dataset by applying a principal component analysis (PCA) and maximum variance unfolding (MVU), MVU comprising forming an inner product of a data matrix and a learned partition hyperplane and maximizing overall pairwise distances in a projected space to reduce computational costs during the indexing, and the PCA comprising determining an eigenvector from a data covariance matrix with a largest corresponding eigenvalue; and optimize the objective function based upon the data property component and the supervised component to partition a node into a plurality of child nodes.
Dependent claims: 3, 4, 5, 6, 7, 8
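The phrase "maximizing overall pairwise distances in a projected space" can be related to variance by a standard identity. Assuming the projected space is the one-dimensional projection p = Xw (an interpretation made for this sketch, not a statement of the patent's method), the sum of squared pairwise distances over all ordered pairs equals 2n² times the variance of p. The check below uses NumPy and synthetic data.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))      # synthetic data matrix (illustrative)
w = rng.normal(size=4)
w /= np.linalg.norm(w)            # unit-norm projection direction

p = X @ w                         # projected points
n = len(p)

# Direct evaluation: sum over all ordered pairs (i, j), quadratic in n.
pairwise = ((p[:, None] - p[None, :]) ** 2).sum()

# Equivalent evaluation through the population variance, linear in n.
via_variance = 2 * n**2 * p.var()
```

Under this reading, maximizing the pairwise-distance sum for a single linear projection is the same as maximizing the projected variance, and the sum can be evaluated without enumerating all pairs.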
Specification