Method and system for creating indices and loading key-value pairs for NoSQL databases
Abstract
Systems and methods are provided for creating indices and loading key-value pairs for NoSQL databases. Attributes are created that correspond to records in a NoSQL database based on corresponding record fields. An index is created based on the attributes. A memory is loaded with attributes that correspond to a subset of the index as keys in a key-value pair and identifiers that correspond to records that correspond to the attributes as values in the key-value pair. The attributes that correspond to the subset of the index are sorted in the memory. Any duplicate attributes are identified from the sorted attributes in the memory. Any identifiers that correspond to any duplicate attributes also identify records in the NoSQL database to be evaluated as potential duplicate records.
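The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. Several details are assumptions made only for illustration: records are modeled as a plain dictionary rather than a real NoSQL store, an attribute is built by joining lowercased field values with `|`, and the "subset of the index" is selected by a caller-supplied predicate. The names `make_attribute`, `build_index`, and `find_duplicate_candidates` are hypothetical.

```python
from itertools import groupby


def make_attribute(record, fields):
    """Build one attribute from several record fields.

    Assumption: values are trimmed, lowercased, and joined with '|'.
    """
    return "|".join(str(record.get(f, "")).strip().lower() for f in fields)


def build_index(records, fields):
    """Create an index as a list of (attribute, record_id) entries."""
    return [(make_attribute(rec, fields), rec_id) for rec_id, rec in records.items()]


def find_duplicate_candidates(index, in_subset):
    """Load a subset of the index in memory, sort it, and report repeated attributes.

    Returns a dict mapping each duplicated attribute to the identifiers of the
    records that share it. Those records still need to be evaluated to decide
    whether they are true duplicates.
    """
    # Load: attribute is the key, record identifier is the value.
    pairs = [(attr, rec_id) for attr, rec_id in index if in_subset(attr)]
    # Sort by attribute so equal attributes become adjacent.
    pairs.sort(key=lambda kv: kv[0])
    # Identify: any run of more than one equal attribute is a duplicate group.
    candidates = {}
    for attr, group in groupby(pairs, key=lambda kv: kv[0]):
        ids = [rec_id for _, rec_id in group]
        if len(ids) > 1:
            candidates[attr] = ids
    return candidates


# Hypothetical stand-in for records held in a NoSQL database.
records = {
    "r1": {"first": "Ann", "last": "Lee", "email": "ann@example.com"},
    "r2": {"first": "ann ", "last": "LEE", "email": "a.lee@example.com"},
    "r3": {"first": "Bob", "last": "Ray", "email": "bob@example.com"},
}

index = build_index(records, fields=("first", "last"))
# Assumption: the subset is every attribute starting with the letter "a".
candidates = find_duplicate_candidates(index, lambda attr: attr.startswith("a"))
print(candidates)
```

Sorting before scanning means duplicates can be found in a single pass over adjacent entries, and processing one subset at a time keeps the in-memory working set smaller than the full index.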
20 Claims
1. An apparatus for creating indices and loading key-value pairs for NoSQL databases, the apparatus comprising:
a processor; and
one or more stored sequences of instructions which, when executed by the processor, cause the processor to carry out the steps of:
creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields;
creating an index based on the plurality of attributes;
loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair;
sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and
identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.
Dependent claims: 2, 3, 4, 5
6. A non-transitory machine-readable medium carrying one or more sequences of instructions for creating indices and loading key-value pairs for NoSQL databases, which instructions, when executed by one or more processors, cause the one or more processors to carry out the steps of:
creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields;
creating an index based on the plurality of attributes;
loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair;
sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and
identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.
Dependent claims: 7, 8, 9, 10
11. A method for creating indices and loading key-value pairs for NoSQL databases, the method comprising:
creating a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields;
creating an index based on the plurality of attributes;
loading, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair;
sorting, in the memory, the plurality of attributes that correspond to the subset of the index; and
identifying, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.
Dependent claims: 12, 13, 14, 15
16. A method for transmitting code for creating indices and loading key-value pairs for NoSQL databases on a transmission medium, the method comprising:
transmitting code to create a plurality of attributes that correspond to a plurality of records in a NoSQL database, wherein each attribute of the plurality of attributes comprises data from a corresponding plurality of record fields;
transmitting code to create an index based on the plurality of attributes;
transmitting code to load, in a memory, a plurality of attributes that correspond to a subset of the index as keys in a key-value pair and a plurality of identifiers that correspond to a plurality of records that correspond to the plurality of attributes as values in the key-value pair;
transmitting code to sort, in the memory, the plurality of attributes that correspond to the subset of the index; and
transmitting code to identify, in the memory, any duplicate attributes from the sorted plurality of attributes, wherein any identifiers that correspond to the any duplicate attributes also identify records in the NoSQL database to be evaluated as to whether the identified records are duplicates.
Dependent claims: 17, 18, 19, 20
Specification