Determining documents that match a query
First Claim
Patent Images
1. A method for determining documents that are nearest to a query, the method comprising:
- constructing, by a computer processor, a vantage point tree based on a plurality of document vectors;
searching, by a computer processor, the vantage point tree to determine a plurality of nearest neighbor document vectors to a query vector by removing a portion of the plurality of document vectors from the vantage point tree based on one or more vantage points for each of a plurality of nodes in the vantage point tree and a specified search radius centered about the query vector;
removing any of the plurality of document vectors belonging to document clusters that do not intersect a hypersphere of the specified search radius centered about the query vector;
removing any of the plurality of document vectors that do not satisfy a triangle inequality condition for the specified search radius between one of the one or more vantage points, the document vector, and the query vector; and
adjusting the specified search radius such that only a specified number of nearest neighbor document vectors are remaining after document vectors that are outside the hypersphere of the specified search radius for the query vector and document vectors that do not satisfy the triangle inequality condition have been removed.
2 Assignments
0 Petitions
Accused Products
Abstract
A computer-implemented method and system for determining documents that are nearest to a query are provided herein. The method includes constructing a vantage point tree based on a number of document vectors. The method also includes searching the vantage point tree to determine a number of nearest neighbor document vectors to a query vector by removing a portion of the document vectors from the vantage point tree based on one or more vantage points for each of a number of nodes in the vantage point tree and a specified search radius centered about the query vector.
5 Citations
20 Claims
-
1. A method for determining documents that are nearest to a query, the method comprising:
-
constructing, by a computer processor, a vantage point tree based on a plurality of document vectors; searching, by a computer processor, the vantage point tree to determine a plurality of nearest neighbor document vectors to a query vector by removing a portion of the plurality of document vectors from the vantage point tree based on one or more vantage points for each of a plurality of nodes in the vantage point tree and a specified search radius centered about the query vector; removing any of the plurality of document vectors belonging to document clusters that do not intersect a hypersphere of the specified search radius centered about the query vector; removing any of the plurality of document vectors that do not satisfy a triangle inequality condition for the specified search radius between one of the one or more vantage points, the document vector, and the query vector; and adjusting the specified search radius such that only a specified number of nearest neighbor document vectors are remaining after document vectors that are outside the hypersphere of the specified search radius for the query vector and document vectors that do not satisfy the triangle inequality condition have been removed. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 20)
-
-
11. A computing system for determining documents that are nearest to a query, comprising:
-
a processor that is adapted to execute stored instructions; and a system memory, wherein the system memory comprises code configured to; construct a vantage point tree based on a plurality of document vectors; traverse the vantage point tree using one or more vantage points for each of a plurality of nodes in the vantage point tree by removing any of the plurality of document vectors that are outside a hypersphere of a specified search radius centered about a query vector and remove any of the plurality of document vectors that do not satisfy a triangle inequality condition for the specified search radius between a vantage point, the document vector, and the query vector; determine a plurality of nearest neighbor document vectors to the query vector based on a distance between each remaining document vector and the query vector; and adjust the specified search radius such that only a specified number of nearest neighbor document vectors are remaining after document vectors that are outside the hypersphere of the specified search radius for the query vector and document vectors that do not satisfy the triangle inequality condition have been removed. - View Dependent Claims (12, 13, 14, 15, 16)
-
-
17. One or more computer-readable storage devices for storing computer-readable instructions, the computer-readable instructions providing a system for determining documents that are nearest to a query when executed by one or more processing devices, the computer-readable instructions comprising code configured to:
-
construct a vantage point tree based on a plurality of document vectors; traverse the vantage point tree using one or more vantage points for each of a plurality of nodes in the vantage point tree by removing a portion of the plurality of document vectors from the vantage point tree based on a specified search radius centered about a query vector and a triangle inequality condition; search the vantage point tree to determine a specified number of nearest neighbor document vectors to the query vector; and remove a portion of the plurality of document vectors that are outside a hypersphere of the specified search radius for the query vector; remove a portion of the plurality of document vectors that do not satisfy the triangle inequality condition; and adjust the specified search radius such that only a specified number of nearest neighbor document vectors are remaining after document vectors that are outside the hypersphere of the specified search radius for the query vector and document vectors that do not satisfy the triangle inequality condition have been removed. - View Dependent Claims (18, 19)
-
Specification