Two-dimensional indexes for quick multiple attribute search in a catalog system
First Claim
1. A method for generating an index for searching data containers in a storage system, comprising a processor and a memory, based on selected metadata attributes, the method comprising:
- mapping, for a given data container of a plurality of data containers, a pair of values corresponding to a pair of attributes associated with metadata describing the given data container that stores data, wherein the pair of values represent coordinates for the given data container in a two-dimensional (2D) space;
imposing a grid comprising a plurality of cells on the 2D space so that each cell in the grid represents coordinates of a single data container of the plurality of data containers in the 2D space;
computing a space filling curve value for each cell in the grid;
generating a searching index wherein each computed space filling curve value is used as a key in the searching index to search for the plurality of data containers in the storage system;
receiving a query to search for one or more particular data containers based on ranges of metadata values of selected attributes;
computing particular space-filling curve values based on the ranges of metadata values of selected attributes associated with the query; and
searching the searching index to identify the one or more particular data containers based on the particular space-filling curve values.
1 Assignment
0 Petitions
Accused Products
Abstract
Embodiments of the present invention provide mechanisms that overcome limitations of existing indexes by creating two-dimensional (2D) spatial indexes to quickly locate data containers that match two or more predicates. This is accomplished by representing metadata attributes describing a data container as dimensions in a 2D space so that a data container can be expressed as a point or a cell in a 2D space with coordinates being a pair of values of the selected attributes. A space filling curve is used to traverse the 2D space and convert each pair of the 2D coordinates to a single space filling curve value. A 2D spatial index is then created based on the computed space filling curve values so that one value can be associated with one or more points (data containers) in the index. Advantageously, the created spatial index provides for searching and processing fewer metadata entries, thereby decreasing the time typically used to search for data.
-
Citations
20 Claims
-
1. A method for generating an index for searching data containers in a storage system, comprising a processor and a memory, based on selected metadata attributes, the method comprising:
-
mapping, for a given data container of a plurality of data containers, a pair of values corresponding to a pair of attributes associated with metadata describing the given data container that stores data, wherein the pair of values represent coordinates for the given data container in a two-dimensional (2D) space; imposing a grid comprising a plurality of cells on the 2D space so that each cell in the grid represents coordinates of a single data container of the plurality of data containers in the 2D space; computing a space filling curve value for each cell in the grid; generating a searching index wherein each computed space filling curve value is used as a key in the searching index to search for the plurality of data containers in the storage system; receiving a query to search for one or more particular data containers based on ranges of metadata values of selected attributes; computing particular space-filling curve values based on the ranges of metadata values of selected attributes associated with the query; and searching the searching index to identify the one or more particular data containers based on the particular space-filling curve values. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A method for computerized searching of data containers on a storage system having a processor and a memory, comprising:
-
maintaining metadata describing a plurality of attributes associated with each data container of the data containers that stores information, wherein each attribute has an associated value; representing a pair of values corresponding to a pair of attributes of the plurality of attributes associated with a given data container as dimensions in a two-dimensional (2D) space so that the pair of the selected values corresponding to the pair of attributes represents coordinates for a given data container in the 2D space; imposing a grid comprising a plurality of cells on the 2D space so that each cell in the grid represents coordinates of a single data container in the 2D space; computing a space filling curve value for each cell in the grid; generating a searching index wherein each computed space filling curve value is used as a key in the searching index; receiving a query to search for one or more data containers based on ranges of metadata values of selected attributes; and searching the searching index for the one or more data containers based on a match between particular space-filling curve values associated with the range of metadata values of selected attributes and one or more keys in the searching index.
-
-
12. A catalog system for generating an index for searching data containers stored in a computerized system, comprising a processor and a memory, the method comprising:
-
an attribute collecting module configured to collect values associated with selected metadata attributes from a data structure storing metadata for a plurality of data containers; an index generation module configured to represent the selected attributes as dimensions in a two-dimensional (2D)space so that a pair of values corresponding to a pair of selected metadata attributes represent coordinates of a given data container in the (2D) (2D) space having a plurality of cells where each cell represents a single data container in the 2D space, to compute a space filling curve value for each cell; and
to generate a searching index wherein each computed space filling curve value is used as a key in the searching index, wherein a match between one or more keys in the searching index and particular space-filling curve values associated with a range of metadata values of selected attributes identifies one or more particular data containers. - View Dependent Claims (13, 14)
-
-
15. A non-transitory computer-readable medium containing executable program instructions executed by a processor, comprising:
-
program instructions that map, for a given data container of the plurality of data containers, a pair of values corresponding to a pair of attributes associated with metadata describing the given data container that stores data, wherein the pair of values represent coordinates for the given data container in a two-dimensional (2D) space; program instructions that impose a grid comprising a plurality of cells on the 2D space so that each cell in the grid represents coordinates of a single data container in the 2D space; program instructions that compute a space filling curve value for each cell in the grid; program instructions that generate a searching index wherein each computed space filling curve value is used as key in the searching index to search for one or more data containers; program instructions that receive a query to search for the one or more data containers based on ranges of metadata values of selected attributes; program instructions that compute particular space-filling curve values based on the ranges of metadata values of selected attributes associated with the query; and searching the searching index to identify the one or more data containers based on the particular space-filling curve values. - View Dependent Claims (16, 17, 18, 19, 20)
-
Specification