Similarity search system with compact data structures
Abstract
A content-addressable and searchable storage system for managing and exploring massive amounts of feature-rich data such as images, audio or scientific data, is shown. The system comprises a segmentation and feature extraction unit for segmenting data corresponding to an object into a plurality of data segments and generating a feature vector for each data segment; a sketch construction component for converting a feature vector into a compact bit-vector corresponding to the object; a similarity index comprising a plurality of compact bit-vectors corresponding to a plurality of objects; and an index insertion component for inserting a compact bit-vector corresponding to an object into the similarity index. The system may further comprise an indexing unit for identifying a candidate set of objects from said similarity index based upon a compact bit-vector corresponding to a query object. Still further, the system may additionally comprise a similarity ranking component for ranking objects in said candidate set by estimating their distances to the query object.
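The sketch construction and index insertion components described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a feature vector becomes a compact bit-vector, so this example assumes one common technique, the sign of random-hyperplane projections. The names `make_hyperplanes`, `sketch`, and `SimilarityIndex`, the bit width, and the sample feature vectors are all hypothetical.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes (assumed sketch parameters)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def sketch(vector, hyperplanes):
    """Pack one sign bit per hyperplane into an int, used here as the compact bit-vector."""
    bits = 0
    for i, plane in enumerate(hyperplanes):
        dot = sum(v * p for v, p in zip(vector, plane))
        if dot >= 0.0:
            bits |= 1 << i
    return bits

class SimilarityIndex:
    """Toy index: maps an object id to the sketches of its data segments."""
    def __init__(self):
        self.sketches = {}

    def insert(self, object_id, segment_sketches):
        self.sketches[object_id] = list(segment_sketches)

planes = make_hyperplanes(dim=4, n_bits=16)
index = SimilarityIndex()
# Made-up feature vectors, one per data segment of a single object.
segments = [[0.9, 0.1, 0.0, 0.2], [0.1, 0.8, 0.3, 0.0]]
index.insert("image-1", [sketch(s, planes) for s in segments])
```

Each 4-dimensional feature vector is reduced to a 16-bit integer here; a real system would choose the bit width to trade index size against accuracy.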
35 Claims
1. A method of searching a plurality of stored objects comprising the steps of:
generating a collection of multi-dimensional vectors representing each said object, each of said multi-dimensional vectors having an associated weight;
defining a similarity distance between said objects using a distance function; and
finding objects closest to a query object based upon said distance function.
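The steps of claim 1 can be sketched as follows. The claim leaves the distance function open, so this example assumes a simple symmetric, weight-averaged nearest-segment L1 distance. It is an illustrative choice, not the distance the patent defines. The helper names and sample data are hypothetical.

```python
import heapq

def l1(u, v):
    """L1 distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def one_way(a, b):
    """Weighted average, over the vectors of a, of the distance to the nearest vector of b."""
    total_weight = sum(w for w, _ in a)
    return sum(w * min(l1(va, vb) for _, vb in b) for w, va in a) / total_weight

def object_distance(a, b):
    """Symmetric distance between two objects (assumed form)."""
    return 0.5 * (one_way(a, b) + one_way(b, a))

def nearest(query, stored, k=1):
    """Return the ids of the k stored objects closest to the query."""
    return heapq.nsmallest(k, stored, key=lambda oid: object_distance(query, stored[oid]))

# Each object is a collection of (weight, vector) pairs.
stored = {
    "a": [(0.7, [1.0, 0.0]), (0.3, [0.0, 1.0])],
    "b": [(0.5, [5.0, 5.0]), (0.5, [6.0, 4.0])],
}
query = [(0.6, [0.9, 0.1]), (0.4, [0.1, 0.9])]
closest = nearest(query, stored, k=1)
```

This brute-force scan compares the query against every stored object, which is workable only for small collections.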
7. A search system comprising:
means for inputting data;
a segmentation and feature extraction unit for segmenting data and generating feature vectors representing segmented data; and
a similarity search engine comprising:
a sketch construction unit for converting feature vectors into sketches;
a similarity index;
an indexing unit for identifying a candidate set of objects in said similarity index; and
a similarity ranking component for ranking objects in the candidate set.
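The division of labor between the indexing unit and the similarity ranking component in claim 7 can be illustrated as a two-stage lookup. This is a sketch under stated assumptions: candidates are taken from a bucket keyed by the low bits of the sketch, and ranking uses Hamming distance between sketches. The claim prescribes neither choice, and `PREFIX_BITS`, `SketchIndex`, and `rank` are hypothetical names.

```python
from collections import defaultdict

PREFIX_BITS = 4  # assumed width of the bucket key

def hamming(a, b):
    """Number of differing bits between two sketches stored as ints."""
    return bin(a ^ b).count("1")

class SketchIndex:
    """Toy similarity index: sketches are grouped by their low PREFIX_BITS bits."""
    def __init__(self):
        self.buckets = defaultdict(list)

    def _key(self, sketch):
        return sketch & ((1 << PREFIX_BITS) - 1)

    def insert(self, object_id, sketch):
        self.buckets[self._key(sketch)].append((object_id, sketch))

    def candidates(self, query_sketch):
        """Indexing step: return only objects that share the query's bucket."""
        return list(self.buckets.get(self._key(query_sketch), []))

def rank(candidates, query_sketch):
    """Ranking step: order candidates by estimated distance to the query."""
    return sorted(candidates, key=lambda item: hamming(item[1], query_sketch))

index = SketchIndex()
index.insert("x", 0b1010_0011)
index.insert("y", 0b0101_0011)
index.insert("z", 0b1010_1100)  # different bucket, never a candidate below
query_sketch = 0b1011_0011
ranked_ids = [oid for oid, _ in rank(index.candidates(query_sketch), query_sketch)]
```

Filtering first keeps the ranking step from touching every stored object, at the cost of missing near neighbors that fall into a different bucket.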
13. A method of processing data comprising the steps of:
segmenting said data into a plurality of segments;
extracting a feature vector from each of said plurality of segments;
converting each of said feature vectors into a segment sketch;
calculating a segment weight for each of said segments; and
embedding a plurality of said segment sketches and weights into a composite data feature vector.
15. A method of comparing a search image to a first plurality of stored images comprising the steps of:
segmenting the search image into a plurality of search image regions;
extracting a region feature vector from each of said search image regions;
converting each of said region feature vectors into a region sketch;
storing said region sketches;
calculating a region weight for each of said search image regions;
embedding all of said region sketches and region weights into a composite search image feature vector;
storing said composite search image feature vector; and
selecting a second plurality of images from said database using said composite search image feature vector, wherein said second plurality of images comprises a subset of said first plurality of images.
25. A method of processing an image comprising the steps of:
segmenting said image into a plurality of regions;
extracting a feature vector from each of said regions;
converting each of said feature vectors into a region bit vector;
storing each of said region bit vectors;
embedding all of said region bit vectors into a composite image feature vector;
converting said composite image feature vector into an image bit vector;
storing said image bit vector.
26. A method for performing a similarity search:
segmenting input data;
extracting input data feature vectors from said segmented input data;
constructing an input data sketch from said feature vectors;
indexing said input data based upon said sketch;
segmenting query data;
extracting query data feature vectors from said segmented query data;
constructing a query data sketch from said query data feature vectors; and
comparing said query data sketch to a plurality of input data sketches.
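The sequence of steps in claim 26 can be walked through with a toy example. Everything concrete here is an assumption made for illustration: fixed-size windows stand in for segmentation, mean and range stand in for feature extraction, threshold bits stand in for the sketch, and Hamming distance stands in for the comparison. All function names and sample values are hypothetical.

```python
def segment(samples, size):
    """Split a sequence into fixed-size windows (stand-in for real segmentation)."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def features(window):
    """Toy feature vector: mean and peak-to-peak range of the window."""
    return [sum(window) / len(window), max(window) - min(window)]

def data_sketch(samples, size=4, thresholds=(0.5, 0.5)):
    """One bit per feature per segment: set when the feature exceeds its threshold."""
    bits = 0
    pos = 0
    for window in segment(samples, size):
        for value, limit in zip(features(window), thresholds):
            if value > limit:
                bits |= 1 << pos
            pos += 1
    return bits

def hamming(a, b):
    """Number of differing bits between two sketches."""
    return bin(a ^ b).count("1")

index = {}

def add(name, samples):
    """Index input data by its sketch."""
    index[name] = data_sketch(samples)

def compare(query_samples):
    """Sketch the query the same way and order indexed items by Hamming distance."""
    q = data_sketch(query_samples)
    return sorted(index, key=lambda name: hamming(index[name], q))

add("low-flat", [0.1] * 8)
add("high-flat", [0.9] * 8)
add("spiky", [0.0, 1.0, 0.0, 1.0] * 2)
order = compare([0.8, 0.9, 0.85, 0.9, 0.9, 0.95, 0.9, 0.8])
```

The point of the example is that input data and query data pass through the same segment, extract, and sketch steps, so the final comparison operates only on compact sketches.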
27. A system for performing similarity searches on data comprising:
a segmentation and feature extraction unit for segmenting data corresponding to an object into a plurality of data segments and generating a feature vector for each data segment;
a sketch construction component for converting a feature vector into a compact bit-vector corresponding to said object;
a similarity index comprising a plurality of compact bit-vectors corresponding to a plurality of objects; and
an index insertion component for inserting a compact bit-vector corresponding to an object into said similarity index.
30. A system for performing similarity searches on data comprising:
a first segmentation and feature extraction unit for segmenting data corresponding to a first type of object into a plurality of data segments and generating a feature vector for each data segment;
a second segmentation and feature extraction unit for segmenting data corresponding to a second type of object into a plurality of data segments and generating a feature vector for each data segment;
a sketch construction component for converting a feature vector into a compact bit-vector corresponding to an object;
a similarity index comprising a plurality of compact bit-vectors corresponding to a plurality of objects; and
an index insertion component for inserting a compact bit-vector corresponding to an object into said similarity index.
Specification