Storing semi-structured data

  • US 9,754,048 B1
  • Filed: 10/06/2014
  • Issued: 09/05/2017
  • Est. Priority Date: 10/06/2014
  • Status: Active Grant
First Claim

1. A method comprising:

  • maintaining a plurality of schemas, wherein each schema is associated with one or more encoded data items stored in a first data format in a data item repository, wherein each encoded data item stores a respective value at each of one or more locations in the encoded data item, and wherein each schema maps each of the locations in the data items associated with the schema to a respective key to which the value stored at the location in the data items associated with the schema corresponds;

    receiving a first semi-structured data item, wherein the first semi-structured data item is in a semi-structured data format, and wherein the first semi-structured data item comprises one or more first key/value pairs;

    determining that i) a first subset of the first key/value pairs of the first semi-structured data item do not match any of the schemas in the plurality of schemas and that ii) a second subset of the first key/value pairs of the first semi-structured data item match a first schema of the plurality of schemas; and

    in response to determining that i) a first subset of the first key/value pairs of the first semi-structured data item do not match any of the schemas in the plurality of schemas and that ii) a second subset of the first key/value pairs of the first semi-structured data item match a first schema of the plurality of schemas;

    generating a new schema that i) for a first subset of locations in a data item associated with the new schema, maps the locations to a respective key to which the value that is stored at the location corresponds, and that ii) for a second subset of locations in the data item associated with the new schema, identifies the respective key to which the value that is stored at the location corresponds by reference to the first schema,

    encoding, in accordance with the new schema, the first semi-structured data item in the first data format to generate a first new encoded data item by i) storing values corresponding to values from the first subset of the first key/value pairs at respective locations in the first new encoded data item, and by ii) storing values corresponding to values from the second subset of the key/value pairs in corresponding locations in the second subset of locations that are identified by the first schema,

    storing the first new encoded data item in the data item repository, and

    associating the first new encoded data item with the new schema.
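
The steps recited in claim 1 can be illustrated with a small Python sketch. It is not the patented implementation, only one possible reading under stated assumptions: a semi-structured item is a flat `dict`, the "first data format" is a plain list of values, a schema "matches" a subset of pairs when all of its keys appear in the item, and the schema covering the most keys is chosen as the first schema. The names `Schema`, `Repository`, `own_keys`, `base`, `store`, and `decode` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Schema:
    schema_id: int
    own_keys: list                    # keys this schema maps directly, one per location
    base: Optional["Schema"] = None   # schema referenced for the remaining locations

    def keys(self):
        """Full location -> key mapping: own locations first, then the base's."""
        inherited = self.base.keys() if self.base else []
        return list(self.own_keys) + inherited


class Repository:
    """Hypothetical data item repository that also maintains the schemas."""

    def __init__(self):
        self.schemas = []
        self.items = []   # encoded items as (schema_id, values) tuples

    def _add_schema(self, own_keys, base=None):
        schema = Schema(len(self.schemas), list(own_keys), base)
        self.schemas.append(schema)
        return schema

    def _best_match(self, item):
        """Assumed matching rule: every schema key is in the item; prefer most keys."""
        candidates = [s for s in self.schemas if set(s.keys()) <= set(item)]
        return max(candidates, key=lambda s: len(s.keys()), default=None)

    def store(self, item):
        matched = self._best_match(item)
        matched_keys = matched.keys() if matched else []
        # First subset: key/value pairs that no existing schema accounts for.
        unmatched = [k for k in item if k not in matched_keys]
        if matched is not None and not unmatched:
            schema = matched   # an existing schema already covers the whole item
        else:
            # New schema maps the unmatched keys itself and refers to the
            # matched schema for the rest.
            schema = self._add_schema(unmatched, base=matched)
        # Encode: values only, in the location order the schema defines.
        values = [item[k] for k in schema.keys()]
        self.items.append((schema.schema_id, values))
        return schema

    def decode(self, index):
        schema_id, values = self.items[index]
        return dict(zip(self.schemas[schema_id].keys(), values))
```

In this sketch, storing `{"name": "Ada", "age": 36}` and then `{"name": "Bob", "age": 41, "email": "b@x.org"}` produces a second schema that maps only `email` itself and refers to the first schema for `name` and `age`. The encoded item holds values alone, so keys are stored once per schema rather than once per item.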
