Systems and methods for capture of relationships within information
First Claim
1. A system comprising:
- memory;
at least one processor;
an input module configured to control the at least one processor to receive a multidimensional dataset, each data point in the multidimensional dataset having multiple dimensions, each dimension having a value;
a landmark module configured to control the at least one processor to choose a set of landmarks from the data points, the set of landmarks being a subset of the multidimensional dataset;
an analysis module configured to control the at least one processor to map each landmark of the set of landmarks into a finite metric space based on the values of the dimensions of each landmark;
a nearest neighbor module configured to control the at least one processor to compute, for each landmark, a predetermined number of nearest neighbor landmarks in the set of landmarks, distance between every two landmarks being based on dimensions of each of the two landmarks and the finite metric space;
a graph construction module configured to control the at least one processor to identify at least one pair of landmarks that are nearest neighbors to each other relative to other pairs of landmarks;
an edge generator module configured to control the at least one processor to add an edge between the at least one pair of landmarks; and
a non-landmark projection module configured to control the at least one processor to, for each data point in the multidimensional dataset that is not a member of the set of landmarks, determine distances of each of the data points that is not a member of the set of landmarks to the finite metric space to at least one of the landmarks and project each data point that is not a member of the set of landmarks to the finite metric space based on the determined distances, thereby enabling at least one shape to indicate relationships in the data.
5 Assignments
0 Petitions
Accused Products
Abstract
Exemplary systems and methods to improve capture of relationships within information are provided. In various embodiments, a system comprises a landmark module configured to choose a set of landmarks from data in a finite metric space, the set of landmarks being a subset of points in the finite metric space, a nearest neighbor module configured to compute, for each landmark, a predetermined number of nearest neighbor landmarks in the set of landmarks, a graph construction module configured to identify at least one pair of landmarks that are nearest neighbors to each other, an edge generator module configured to add an edge between the at least one pair of landmarks, and a non-landmark projection module configured to project non-landmark points based on the landmarks and one or more edges thereby enabling at least one shape to indicate relationships in the data.
25 Citations
21 Claims
-
1. A system comprising:
-
memory; at least one processor; an input module configured to control the at least one processor to receive a multidimensional dataset, each data point in the multidimensional dataset having multiple dimensions, each dimension having a value; a landmark module configured to control the at least one processor to choose a set of landmarks from the data points, the set of landmarks being a subset of the multidimensional dataset; an analysis module configured to control the at least one processor to map each landmark of the set of landmarks into a finite metric space based on the values of the dimensions of each landmark; a nearest neighbor module configured to control the at least one processor to compute, for each landmark, a predetermined number of nearest neighbor landmarks in the set of landmarks, distance between every two landmarks being based on dimensions of each of the two landmarks and the finite metric space; a graph construction module configured to control the at least one processor to identify at least one pair of landmarks that are nearest neighbors to each other relative to other pairs of landmarks; an edge generator module configured to control the at least one processor to add an edge between the at least one pair of landmarks; and a non-landmark projection module configured to control the at least one processor to, for each data point in the multidimensional dataset that is not a member of the set of landmarks, determine distances of each of the data points that is not a member of the set of landmarks to the finite metric space to at least one of the landmarks and project each data point that is not a member of the set of landmarks to the finite metric space based on the determined distances, thereby enabling at least one shape to indicate relationships in the data. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. A method comprising:
-
receiving a multidimensional dataset, each data point in the multidimensional dataset having multiple dimensions, each dimension having a value; selecting a set of landmarks from the data points, the set of landmarks being a subset of the multidimensional dataset; mapping each landmark of the set of landmarks into a finite metric space based on the values of the dimensions of each landmark; computing, for each landmark, a predetermined number of nearest neighbor landmarks in the set of landmarks, distance between every two landmarks being based on dimensions of each of the two landmarks and the finite metric space; identifying at least one pair of landmarks that are nearest neighbors to each other relative to other pairs of landmarks; adding an edge between the at least one pair of landmarks; and for each data point in the multidimensional dataset that is not a member of the set of landmarks; determining distances of each of the data points that is not a member of the set of landmarks to the finite metric space to at least one of the landmarks, and projecting each data point that is not a member of the set of landmarks to the finite metric space based on the determined distances, thereby enabling at least one shape to indicate relationships in the data. - View Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20)
-
-
21. A non-transitory computer readable medium, the non-transitory computer readable medium comprising processing instructions executable by a processor to perform a method, the method comprising:
-
receiving a multidimensional dataset, each data point in the multidimensional dataset having multiple dimensions, each dimension having a value; selecting a set of landmarks from the data points in a finite metric space, the set of landmarks being a subset of the multidimensional dataset points in the finite metric space; mapping each landmark of the set of landmarks into a finite metric space based on the values of the dimensions of each landmark; computing, for each landmark, a predetermined number of nearest neighbor landmarks in the set of landmarks, distance between every two landmarks being based on dimensions of each of the two landmarks and the finite metric space; identifying at least one pair of landmarks that are nearest neighbors to each other relative to other pairs of landmarks; adding an edge between the at least one pair of landmarks; and for each data point in the multidimensional dataset that is not a member of the set of landmarks; determining distances of each of the data points that is not a member of the set of landmarks to the finite metric space to at least one of the landmarks, and projecting each data point that is not a member of the set of landmarks to the finite metric space based on the determined distances, thereby enabling at least one shape to indicate relationships in the data.
-
Specification