Indexing and searching JSON objects
First Claim
1. A method of encoding JavaScript Object Notation (JSON) documents in an inverted index, said method comprising:
- generating a tree representation of a JSON document;
- shredding said JSON document into a list of <value, path, type, jdewey> tuples for each atom node, n, in said tree, where value is a label associated with n, path is a concatenation of node labels associated with ancestors of n, starting from a root of said tree, type is a description of a type of value, and jdewey of n is a partial Dewey code of its closest ancestor array node when an ancestor array node exists and jdewey of n is empty when no closest ancestor array node exists; and
- building an inverted index using <path, type, value> as index term, and jdewey as payload, said inverted index is organized as a list of ordered index terms, with each term in said list of ordered index terms pointing to a posting list, and each post is a <d, plist> pair, wherein d is the document ID and plist is an ordered list of positions within said JSON document and jdewey is stored in payload of each position.
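The shredding step of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the function names `shred` and `type_name`, the "." path separator, and the type labels are hypothetical, and jdewey is approximated here as the tuple of element positions within the enclosing arrays, which is only one possible reading of "partial Dewey code".

```python
import json
from typing import Any, List, Tuple

# (value, path, type, jdewey), in the order the claim lists them.
AtomTuple = Tuple[Any, str, str, Tuple[int, ...]]


def type_name(value: Any) -> str:
    """Return a coarse type label for an atom (labels are an assumption)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def shred(node: Any, path: str = "", jdewey: Tuple[int, ...] = ()) -> List[AtomTuple]:
    """Walk the parsed JSON tree and emit one tuple per atom node."""
    tuples: List[AtomTuple] = []
    if isinstance(node, dict):
        # Object node: extend the path with each field label.
        for key, child in node.items():
            child_path = f"{path}.{key}" if path else key
            tuples.extend(shred(child, child_path, jdewey))
    elif isinstance(node, list):
        # Array node: record the element position (assumed jdewey encoding).
        for position, child in enumerate(node):
            tuples.extend(shred(child, path, jdewey + (position,)))
    else:
        # Atom node: jdewey stays empty when no ancestor array exists.
        tuples.append((node, path, type_name(node), jdewey))
    return tuples


doc = json.loads('{"title": "JSON", "authors": [{"name": "Ann"}, {"name": "Bob"}], "year": 2012}')
shredded = shred(doc)
for atom in shredded:
    print(atom)
```

Under these assumptions the two author names share the path `authors.name` and differ only in jdewey, `(0,)` versus `(1,)`, while `title` and `year` carry an empty jdewey because they have no ancestor array.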
Abstract
Disclosed is a method of encoding JavaScript Object Notation (JSON) documents in an inverted index, wherein a tree representation of a JSON document is first generated, and, next, the JSON document is shredded into a list of <value, path, type, jdewey> tuples for each atom node, n, in the tree, where value is a label associated with n, path is a concatenation of node labels associated with ancestors of n, type is a description of a type of value, and jdewey of n is a partial Dewey code of its closest ancestor array node, if one exists, or empty, otherwise. Lastly, an inverted index is built using <path, type, value> as index term, and jdewey as payload. A method is also described to search the inverted index.
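The index-building step summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: `build_index` is a hypothetical name, the input is assumed to be already shredded into <value, path, type, jdewey> tuples, a position is taken to be the ordinal of the atom within its document, and jdewey is modeled as a tuple of array element positions.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

AtomTuple = Tuple[Any, str, str, Tuple[int, ...]]   # (value, path, type, jdewey)
Term = Tuple[str, str, Any]                         # (path, type, value)
Position = Tuple[int, Tuple[int, ...]]              # (position, jdewey payload)
Post = Tuple[int, List[Position]]                   # (d, plist)


def build_index(shredded_docs: Dict[int, List[AtomTuple]]) -> List[Tuple[Term, List[Post]]]:
    """Build an ordered list of index terms, each pointing to a posting list."""
    postings: Dict[Term, Dict[int, List[Position]]] = defaultdict(dict)
    for doc_id in sorted(shredded_docs):
        for position, (value, path, typ, jdewey) in enumerate(shredded_docs[doc_id]):
            term = (path, typ, value)
            # Positions are appended in document order, so each plist stays ordered.
            postings[term].setdefault(doc_id, []).append((position, jdewey))
    # Order the terms, and order each posting list by document ID.
    return [(term, sorted(postings[term].items())) for term in sorted(postings)]


# Hypothetical pre-shredded input for two documents.
shredded_docs = {
    1: [("JSON", "title", "string", ()),
        ("Ann", "authors.name", "string", (0,)),
        ("Bob", "authors.name", "string", (1,))],
    2: [("Ann", "authors.name", "string", (0,))],
}
index = build_index(shredded_docs)
for term, posting_list in index:
    print(term, posting_list)
```

Each entry pairs a <path, type, value> term with its posts, and every position in a plist carries its jdewey as payload. How the disclosed search method uses that payload is not stated in this section, so the sketch stops at index construction.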
17 Citations
16 Claims
1. A method of encoding JavaScript Object Notation (JSON) documents in an inverted index, said method comprising:
- generating a tree representation of a JSON document;
- shredding said JSON document into a list of <value, path, type, jdewey> tuples for each atom node, n, in said tree, where value is a label associated with n, path is a concatenation of node labels associated with ancestors of n, starting from a root of said tree, type is a description of a type of value, and jdewey of n is a partial Dewey code of its closest ancestor array node when an ancestor array node exists and jdewey of n is empty when no closest ancestor array node exists; and
- building an inverted index using <path, type, value> as index term, and jdewey as payload, said inverted index is organized as a list of ordered index terms, with each term in said list of ordered index terms pointing to a posting list, and each post is a <d, plist> pair, wherein d is the document ID and plist is an ordered list of positions within said JSON document and jdewey is stored in payload of each position.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. An article of manufacture having non-transitory computer usable medium storing computer readable program code implementing a computer-based method of encoding JavaScript Object Notation (JSON) documents in an inverted index, said medium comprising:
- computer readable program code generating a tree representation of a JSON document;
- computer readable program code shredding said JSON document into a list of <value, path, type, jdewey> tuples for each atom node, n, in said tree, where value is a label associated with n, path is a concatenation of node labels associated with ancestors of n, starting from a root of said tree, type is a description of a type of value, and jdewey of n is a partial Dewey code comprising a Dewey code of its closest ancestor array node when an ancestor array node exists and jdewey of n is empty when no closest ancestor array node exists; and
- computer readable program code building an inverted index using <path, type, value> as index term, and jdewey as payload, said inverted index is organized as a list of ordered index terms, with each term in said list of ordered index terms pointing to a posting list, and each post is a <d, plist> pair, wherein d is the document ID and plist is an ordered list of positions within said JSON document and jdewey is stored in payload of each position.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
Specification