Systems and methods for reducing data density in large datasets
First Claim
1. A computer-implemented method, comprising:
obtaining a reference data point;
determining first representative data associated with the reference data point, the first representative data indicating a relationship between a vector associated with the reference data point and a plurality of projected vectors;
removing a number of bits associated with the reference data point from storage in response to determining the first representative data;
obtaining an unknown data point;
determining second representative data associated with the unknown data point, the second representative data indicating a relationship between a vector associated with the unknown data point and the plurality of projected vectors; and
identifying, using the first representative data and the second representative data, one or more candidate data points for matching the unknown data point, wherein identifying the one or more candidate data points includes comparing the unknown data point to the reference data point.
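The steps of the claim can be illustrated with a minimal sketch. The claim does not say how the "relationship" to the projected vectors is encoded, so this example assumes one bit per projected vector, set according to the sign of the dot product with the data point. The function names, the random generation of projected vectors, and the `max_diff` threshold are all hypothetical. Keeping only the compact bit signature for each reference point is one possible reading of the "removing a number of bits" step, not necessarily the patented one.

```python
import random

def make_projected_vectors(count, dim, seed=0):
    # Assumption: projected vectors are drawn at random from a Gaussian,
    # which gives directions spread uniformly around the origin.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]

def representative_bits(point, projected):
    # One bit per projected vector: 1 if the point lies on the
    # non-negative side of that vector, 0 otherwise.
    bits = 0
    for i, vec in enumerate(projected):
        dot = sum(p * q for p, q in zip(point, vec))
        if dot >= 0:
            bits |= 1 << i
    return bits

def differing_vectors(bits_a, bits_b):
    # Number of projected vectors on which the two points disagree.
    return bin(bits_a ^ bits_b).count("1")

def candidates(unknown_bits, reference_bits, max_diff):
    # reference_bits maps a reference name to its stored bit signature.
    return [name for name, bits in reference_bits.items()
            if differing_vectors(unknown_bits, bits) <= max_diff]
```

In this sketch the full coordinates of a reference point are needed only once, to compute its signature; afterwards the signature alone is enough to shortlist candidates for an unknown point.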
Abstract
Techniques and systems are provided for identifying unknown content. For example, a number of vectors out of a plurality of vectors projected from an origin point can be determined that are between a reference data point and an unknown data point. The number of vectors can be used to estimate an angle between a first vector (from the origin point to a reference data point) and a second vector (from the origin point to an unknown data point). A distance between the reference data point and the unknown data point can then be determined. Using the determined distance, candidate data points can be determined from a set of reference data points. The candidate data points can be analyzed to identify the unknown data point.
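The angle and distance estimates described in the abstract can be sketched as follows. The abstract does not give the formulas, so two assumptions are made here: the projected vectors are spread uniformly in direction, so that the fraction of vectors lying between the two points approximates the angle divided by pi, and the distance is obtained from the law of cosines using the lengths of the two vectors. All function and parameter names are hypothetical.

```python
import math

def estimate_angle(num_between, num_projected):
    # Assumption: the fraction of projected vectors that separate the two
    # points approximates angle / pi.
    return math.pi * num_between / num_projected

def estimate_distance(len_ref, len_unknown, angle):
    # Law of cosines (an assumption; the abstract only says a distance
    # "can then be determined").
    sq = (len_ref ** 2 + len_unknown ** 2
          - 2.0 * len_ref * len_unknown * math.cos(angle))
    return math.sqrt(max(sq, 0.0))  # guard against tiny negative rounding

def select_candidates(len_unknown, references, num_projected, max_distance):
    # references maps a name to (vector length, number of projected vectors
    # between that reference point and the unknown point).
    selected = []
    for name, (len_ref, num_between) in references.items():
        angle = estimate_angle(num_between, num_projected)
        if estimate_distance(len_ref, len_unknown, angle) <= max_distance:
            selected.append(name)
    return selected
```

For example, with 10 projected vectors of which 5 lie between the two points, the estimated angle is a right angle, and vectors of lengths 3 and 4 then give an estimated distance of approximately 5.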
20 Claims
1. A computer-implemented method, comprising:
obtaining a reference data point;
determining first representative data associated with the reference data point, the first representative data indicating a relationship between a vector associated with the reference data point and a plurality of projected vectors;
removing a number of bits associated with the reference data point from storage in response to determining the first representative data;
obtaining an unknown data point;
determining second representative data associated with the unknown data point, the second representative data indicating a relationship between a vector associated with the unknown data point and the plurality of projected vectors; and
identifying, using the first representative data and the second representative data, one or more candidate data points for matching the unknown data point, wherein identifying the one or more candidate data points includes comparing the unknown data point to the reference data point.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A system comprising:
one or more processors;
one or more non-transitory machine-readable storage media containing instructions which, when executed on the one or more processors, cause the one or more processors to perform operations including:
obtaining a reference data point;
determining first representative data associated with the reference data point, the first representative data indicating a relationship between a vector associated with the reference data point and a plurality of projected vectors;
removing a number of bits associated with the reference data point from storage in response to determining the first representative data;
obtaining an unknown data point;
determining second representative data associated with the unknown data point, the second representative data indicating a relationship between a vector associated with the unknown data point and the plurality of projected vectors; and
identifying, using the first representative data and the second representative data, one or more candidate data points for matching the unknown data point, wherein identifying the one or more candidate data points includes comparing the unknown data point to the reference data point.
Dependent claims: 19, 20
Specification