Indexing of large scale patient set
First Claim
1. A non-transitory computer readable storage medium comprising a computer readable program for indexing data, wherein the computer readable program when executed on a computer causes the computer to perform the steps of:
formulating an objective function to index a dataset, a portion of the dataset including supervision information;
determining a data property component of the objective function which utilizes a property of the dataset to group data of the dataset, wherein determining the data property component includes maximizing a variance of the dataset, the maximizing the variance including applying maximum variance unfolding by forming a projection as an inner product of a data matrix and a learned partition hyperplane and maximizing overall pairwise distances in a projected space;
determining a supervised component of the objective function which utilizes the supervision information to group data of the dataset; and
optimizing the objective function using a processor based upon the data property component and the supervised component to partition a node into a plurality of child nodes.
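The claim equates maximizing variance with maximizing overall pairwise distances in the projected space. A short worked identity shows why the two coincide. The symbols here are assumptions introduced for illustration and do not appear in the claim text: $X \in \mathbb{R}^{d \times n}$ is the data matrix with one record per column, $w \in \mathbb{R}^{d}$ is the partition hyperplane, and $p = X^{\top} w$ is the projection formed as their inner product.

```latex
\begin{aligned}
\sum_{i=1}^{n}\sum_{j=1}^{n} (p_i - p_j)^2
  &= 2n \sum_{i=1}^{n} p_i^2 \;-\; 2\Big(\sum_{i=1}^{n} p_i\Big)^2 \\
  &= 2n \sum_{i=1}^{n} (p_i - \bar{p})^2,
  \qquad \bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i \\
  &= 2n^2 \,\operatorname{Var}(p)
\end{aligned}
```

Under these assumptions, the sum of squared pairwise distances between projected points is a constant multiple of the (population) variance of the projection, so a hyperplane that maximizes one also maximizes the other.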
Abstract
Systems and methods for indexing data include formulating an objective function to index a dataset, a portion of the dataset including supervision information. A data property component of the objective function is determined, which utilizes a property of the dataset to group data of the dataset. A supervised component of the objective function is determined, which utilizes the supervision information to group data of the dataset. The objective function is optimized using a processor based upon the data property component and the supervised component to partition a node into a plurality of child nodes.
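The following is a minimal sketch of how such a two-part objective could be optimized to split one node, not the patented implementation. It assumes a quadratic form for both components, a pairwise supervision matrix `S` (+1 for records known to belong together, -1 for records known to differ, 0 where no supervision exists), a trade-off weight `lam`, a unit-norm constraint on the hyperplane, and a median threshold for the split. None of these specifics, nor the function names, come from the source.

```python
import numpy as np


def pairwise_supervision(labels):
    """Build a hypothetical pairwise supervision matrix from partial labels.

    labels: length-n integer array; -1 marks a record with no supervision.
    Returns S with +1 (same label), -1 (different label), 0 (unknown pair).
    """
    labels = np.asarray(labels)
    known = labels >= 0
    both_known = known[:, None] & known[None, :]
    same = labels[:, None] == labels[None, :]
    S = np.where(both_known, np.where(same, 1.0, -1.0), 0.0)
    np.fill_diagonal(S, 0.0)  # ignore self-pairs
    return S


def learn_partition_hyperplane(X, S, lam=1.0):
    """Maximize w^T (data_term + lam * supervised_term) w subject to ||w|| = 1.

    X: (d, n) data matrix, one record per column.
    The projection of the data is the inner product p = X^T w.
    """
    n = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)  # center the records
    # Data property component: the sum of squared pairwise distances in the
    # projected space equals 2n * w^T Xc Xc^T w.
    data_term = 2.0 * n * (Xc @ Xc.T)
    # Supervised component (assumed form): reward pairs marked +1 for
    # projecting to the same side, and pairs marked -1 to opposite sides.
    supervised_term = Xc @ S @ Xc.T
    M = data_term + lam * supervised_term
    M = (M + M.T) / 2.0  # guard against floating-point asymmetry
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, -1]  # unit eigenvector of the largest eigenvalue


def partition_node(X, w):
    """Split a node's records into two child nodes at the median projection."""
    p = w @ X
    threshold = np.median(p)
    left = np.where(p <= threshold)[0]
    right = np.where(p > threshold)[0]
    return left, right
```

Calling `learn_partition_hyperplane` and then `partition_node` recursively on each child would yield a tree-structured index. With the quadratic forms assumed here, the optimization reduces to a single symmetric eigenvalue problem per node.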
38 Citations
17 Claims
1. A non-transitory computer readable storage medium comprising a computer readable program for indexing data, wherein the computer readable program when executed on a computer causes the computer to perform the steps of:
formulating an objective function to index a dataset, a portion of the dataset including supervision information;
determining a data property component of the objective function which utilizes a property of the dataset to group data of the dataset, wherein determining the data property component includes maximizing a variance of the dataset, the maximizing the variance including applying maximum variance unfolding by forming a projection as an inner product of a data matrix and a learned partition hyperplane and maximizing overall pairwise distances in a projected space;
determining a supervised component of the objective function which utilizes the supervision information to group data of the dataset; and
optimizing the objective function using a processor based upon the data property component and the supervised component to partition a node into a plurality of child nodes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system for indexing data, comprising:
a formulation module configured to formulate an objective function to index a dataset stored on a non-transitory computer readable storage medium, a portion of the dataset including supervision information;
a data property module configured to determine a data property component of the objective function which utilizes a property of the dataset to group data of the dataset, the data property module including a variance maximization module configured to maximize a variance of the dataset, the maximizing the variance including applying maximum variance unfolding by forming a projection as an inner product of a data matrix and a learned partition hyperplane and maximizing overall pairwise distances in a projected space;
a supervision module configured to determine a supervised component of the objective function which utilizes the supervision information to group data of the dataset; and
an optimization module configured to optimize the objective function based upon the data property component and the supervised component to partition a node into a plurality of child nodes.
Dependent claims: 12, 13, 14, 15, 16, 17
Specification