Generating a schema of a not-only-structured-query-language database
First Claim
1. A method comprising:
determining a record schema of a set of entry(ies) in a Not-only-Structured-Query-Language (NoSQL) type database, the record schema being a treelike relationship between key hierarchies with a key as a node in each entry of the set of entry(ies), wherein the key corresponds to an attribute name;
determining a node in which there is an attribute name variation based on a similarity of sub-nodes in the treelike relationship, and updating a corresponding record schema according to the determined result;
CinP-encoding the record schema of each entry of the set of entry(ies);
parsing the encoded record schema in the tree structure into a path set, the path including the codes of all nodes which are traversed from a leaf node to a root node;
putting the paths into buckets according to the P-Code of a certain tier;
in the buckets, determining a parent node in which there is an attribute name variation according to the similarity of the sub-nodes in the tier;
updating a source P-Code of part of the determined parent node in which there is an attribute name variation to a target P-Code, and updating the C-Code of a parent node corresponding to the updated parent node; and
generating a schema for the NoSQL type database based on the updated record schema;
wherein:
the CinP-code of each node includes a code P-Code of the node itself and a code C-Code of the sub-nodes thereof; and
the determination of a node in which there is an attribute name variation includes determining a node in which there is an attribute name variation based on a similarity of sub-nodes in the CinP-encoded record schema.
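The encoding and path-parsing steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not say how codes are assigned, so everything below is an assumption made for illustration: P-Codes are sequential integers per tier, a C-Code is a sequential integer assigned to each distinct set of child P-Codes at a tier (0 for a leaf), and each entry hangs under a virtual `$root` node. The names `CodeMap`, `record_schema` and `encode_paths` are hypothetical.

```python
class CodeMap:
    """Assigns sequential integer codes to distinct items (assumed scheme)."""

    def __init__(self):
        self.codes = {}

    def code(self, item):
        # Reuse the existing code, or hand out the next free integer.
        return self.codes.setdefault(item, len(self.codes) + 1)


def record_schema(entry):
    """Keep only the key hierarchy of an entry; values are dropped."""
    return {
        key: record_schema(value) if isinstance(value, dict) else {}
        for key, value in entry.items()
    }


def encode_paths(key, children, tier, p_maps, c_map):
    """Return leaf-to-root paths; each step is a (P-Code, C-Code) pair."""
    while len(p_maps) < tier + 2:
        p_maps.append(CodeMap())
    p_code = p_maps[tier].code(key)
    # The C-Code summarises the set of child P-Codes of this node.
    child_p = frozenset(p_maps[tier + 1].code(k) for k in children)
    c_code = c_map.code((tier, child_p)) if children else 0
    node = (p_code, c_code)
    if not children:
        return [[node]]
    paths = []
    for child_key, grandchildren in children.items():
        for path in encode_paths(child_key, grandchildren, tier + 1, p_maps, c_map):
            paths.append(path + [node])  # leaf first, root last
    return paths


if __name__ == "__main__":
    schema = record_schema({"name": "Ann", "addr": {"city": "X", "zip": "1"}})
    for path in encode_paths("$root", schema, 0, [], CodeMap()):
        print(path)
```

Because each step carries both codes, paths can then be grouped by the P-Code at a chosen tier and the C-Codes within a group compared, which is one plausible reading of the bucketing step.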
Abstract
Generation of a schema of a NoSQL type database, in which a record schema of a set of entry(ies) of the NoSQL type database is determined, the record schema being a treelike relationship between key hierarchies with a key as a node in the entry, wherein the key corresponds to an attribute name. For at least one record schema, a node in which there is an attribute name variation is determined based on a similarity of sub-nodes in the treelike relationship, and the record schema is updated according to the determined result. The schema of the NoSQL type database is generated based on the updated record schema.
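The idea of spotting an attribute name variation from the similarity of sub-nodes can be sketched as follows. The abstract does not name a similarity measure, so this sketch assumes Jaccard similarity over child key names with an arbitrary threshold of 0.8, compares top-level keys only, and treats the less frequent name as the variant. The function names are hypothetical, and chained or conflicting renames are not handled.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the source)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_name_variations(schemas, threshold=0.8):
    """Map each variant top-level key to the more frequent key it resembles."""
    children = {}
    counts = Counter()
    for schema in schemas:
        for key, sub in schema.items():
            children.setdefault(key, set()).update(sub)
            counts[key] += 1
    renames = {}
    for a, b in combinations(sorted(children), 2):
        if not children[a] or not children[b]:
            continue  # leaves have no sub-nodes to compare
        if jaccard(children[a], children[b]) >= threshold:
            # The rarer name is the source, the more common one the target.
            source, target = sorted((a, b), key=lambda k: (counts[k], k))
            renames[source] = target
    return renames


def apply_renames(schema, renames):
    """Rewrite variant top-level keys in one record schema."""
    return {renames.get(key, key): sub for key, sub in schema.items()}


if __name__ == "__main__":
    schemas = [
        {"name": {}, "address": {"city": {}, "zip": {}}},
        {"name": {}, "address": {"city": {}, "zip": {}}},
        {"name": {}, "addr": {"city": {}, "zip": {}}},
    ]
    renames = find_name_variations(schemas)
    print(renames)
    print([apply_renames(s, renames) for s in schemas])
```

Here `addr` and `address` share the same sub-nodes, so the rarer `addr` is rewritten to `address` before a database-level schema is derived.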
9 Claims
3. The method according to claim 2, wherein the generation of the schema of the NoSQL type database includes:
in the path code, extracting a sub-sequence containing only the P-Code, counting the frequency at which the sub-sequence occurs, and merging the same P-Code sub-sequences; and
retaining the P-Code sub-sequence having a frequency exceeding a threshold value G2, and decoding the P-Code sub-sequence according to the code maps of the tiers to generate the schema of the NoSQL type database.
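The frequency filter of claim 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions: each path is a leaf-to-root list of (P-Code, C-Code) pairs, G2 is treated as an absolute count, and `tier_names` is a hypothetical per-tier code map from P-Code back to attribute name, with tier 0 holding a virtual `$root`.

```python
from collections import Counter


def generate_schema(paths, tier_names, g2):
    """Keep P-Code sub-sequences seen more than g2 times and decode them."""
    # Dropping the C-Codes merges paths that differ only in their sub-node sets.
    p_sequences = Counter(tuple(p for p, _c in path) for path in paths)
    schema = []
    for seq, freq in p_sequences.items():
        if freq > g2:
            depth = len(seq)
            # seq runs leaf-to-root, so position i sits at tier depth - 1 - i.
            names = [tier_names[depth - 1 - i][p] for i, p in enumerate(seq)]
            schema.append(".".join(reversed(names)))
    return sorted(schema)


if __name__ == "__main__":
    tier_names = [
        {1: "$root"},
        {1: "name", 2: "addr", 3: "nick"},
        {1: "city", 2: "zip"},
    ]
    paths = (
        [[(1, 0), (1, 1)]] * 3            # $root.name, seen three times
        + [[(1, 0), (2, 2), (1, 1)]] * 2  # $root.addr.city, seen twice
        + [[(3, 0), (1, 3)]]              # $root.nick, seen once
    )
    print(generate_schema(paths, tier_names, g2=1))
```

A relative threshold (a fraction of all entries) would be an equally plausible reading of G2; only the comparison in the `if` would change.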
4. A device comprising:
a determiner configured to determine a record schema of a set of entry(ies) in a NoSQL (Not-only-Structured-Query-Language) type database, the record schema being a treelike relationship between key hierarchies with a key as a node in each entry of the set of entry(ies), wherein the key corresponds to an attribute name;
an updater configured to determine a node in which there is an attribute name variation based on a similarity of sub-nodes in the treelike relationship, and to update a corresponding record schema according to the determined result, the updater including:
a module configured to parse the encoded record schema in the tree structure into a path set, the path comprising the codes of all nodes which are traversed from a leaf node to a root node,
a module configured to put the paths into buckets according to the P-Code of a certain tier,
a module configured to, in the buckets, determine a parent node in which there is an attribute name variation according to the similarity of the sub-nodes in the tier, and
a module configured to update a source P-Code of part of the determined parent node in which there is an attribute name variation to a target P-Code, and update the C-Code of a parent node corresponding to the updated parent node; and
a generator configured to generate a schema of the NoSQL type database based on the updated record schema;
wherein:
the CinP-code of each node includes a code P-Code of the node itself and a code C-Code of the sub-nodes thereof; and
the determination of a node in which there is an attribute name variation includes determining a node in which there is an attribute name variation based on a similarity of sub-nodes in the CinP-encoded record schema.
7. A computer program product comprising a computer readable storage medium having stored thereon:
first program instructions programmed to determine a record schema of a set of entry(ies) in a Not-only-Structured-Query-Language (NoSQL) type database, the record schema being a treelike relationship between key hierarchies with a key as a node in each entry of the set of entry(ies), wherein the key corresponds to an attribute name;
second program instructions programmed to determine a node in which there is an attribute name variation based on a similarity of sub-nodes in the treelike relationship, and update a corresponding record schema according to the determined result, the second program instructions including:
third program instructions programmed to parse the encoded record schema in the tree structure into a path set, the path including the codes of all nodes which are traversed from a leaf node to a root node,
fourth program instructions programmed to put the paths into buckets according to the P-Code of a certain tier,
fifth program instructions programmed to, in the buckets, determine a parent node in which there is an attribute name variation according to the similarity of the sub-nodes in the tier, and
sixth program instructions programmed to update a source P-Code of part of the determined parent node in which there is an attribute name variation to a target P-Code, and update the C-Code of a parent node corresponding to the updated parent node;
seventh program instructions programmed to generate a schema for the NoSQL type database based on the updated record schema;
eighth program instructions programmed to CinP-encode the record schema of each entry of the set of entry(ies);
wherein:
the CinP-code of each node includes a code P-Code of the node itself and a code C-Code of the sub-nodes thereof; and
the second program instructions include ninth program instructions programmed to determine a node in which there is an attribute name variation based on a similarity of sub-nodes in the CinP-encoded record schema.
Specification