
Indexing and querying semi-structured data

  • US 9,507,848 B1
  • Filed: 09/23/2010
  • Issued: 11/29/2016
  • Est. Priority Date: 09/25/2009
  • Status: Active Grant
First Claim

1. A method for generating an inverted index comprising:

  • receiving a file comprising data, wherein at least a portion of the data is unstructured or semi-structured data;

  • parsing the data to extract structure from at least a portion of the data, wherein the parsing comprises (i) dividing the data into logical groupings, (ii) determining whether each of the logical groupings matches any of a plurality of stored parsing expressions, (iii) applying each of the matching parsing expressions to a corresponding one of the logical groupings to produce elements of data; (iv) applying a general parser to any of the logical groupings that do not match any of the stored parsing expressions to produce elements of data, and (v) identifying a data type associated with each of the elements of data produced in (iii) and (iv);

  • generating the inverted index using the extracted structure, wherein each item in the inverted index includes one of the elements of data produced in (iii) and (iv), a location identifier, a position identifier, and a data type identifier for one or more entries of the inverted index, wherein the location identifier specifies an identifier of the file that contains the element of data, the position identifier specifies a position of the element of data within the file, and the data type identifier specifies the data type of the element of data, and the inverted index is utilized to respond to queries to identify patterns within the data.
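
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, only one way the pieces might fit together under several assumptions: logical groupings are taken to be lines, the stored parsing expressions are regular expressions with named groups, the general parser is a whitespace tokenizer, and the position identifier is a character offset. The names `Posting`, `STORED_EXPRESSIONS`, `identify_type`, `parse_grouping`, and `build_index`, along with the sample log format, are hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    file_id: str    # location identifier: which file holds the element
    position: int   # position identifier: character offset within the file
    data_type: str  # data type identifier for the element


# Hypothetical "stored parsing expressions": one pattern for a toy access-log line.
STORED_EXPRESSIONS = [
    re.compile(
        r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) (?P<method>GET|POST) "
        r"(?P<path>\S+) (?P<status>\d{3})$"
    ),
]


def identify_type(element):
    """Step (v): assign a coarse data type to an element (assumed type set)."""
    if re.fullmatch(r"\d+", element):
        return "integer"
    if re.fullmatch(r"\d{1,3}(?:\.\d{1,3}){3}", element):
        return "ip_address"
    return "string"


def parse_grouping(line):
    """Steps (ii)-(iv): return (element, offset-in-line) pairs for one grouping."""
    matches = [m for m in (expr.match(line) for expr in STORED_EXPRESSIONS) if m]
    if matches:
        # (iii) apply every matching expression to the grouping.
        return [
            (m.group(name), m.start(name))
            for m in matches
            for name in m.re.groupindex
        ]
    # (iv) general parser fallback: split on whitespace.
    return [(t.group(), t.start()) for t in re.finditer(r"\S+", line)]


def build_index(file_id, text):
    """Build an inverted index mapping each element to its postings."""
    index = defaultdict(list)
    offset = 0
    # (i) divide the data into logical groupings (here, lines).
    for raw_line in text.splitlines(keepends=True):
        line = raw_line.rstrip("\r\n")
        for element, start in parse_grouping(line):
            posting = Posting(file_id, offset + start, identify_type(element))
            index[element].append(posting)
        offset += len(raw_line)
    return index


if __name__ == "__main__":
    sample = "10.0.0.1 GET /index.html 200\nhello world 42\n"
    for element, postings in build_index("access.log", sample).items():
        print(element, postings)
```

Because each posting carries a data type alongside the file and position, a query could filter on type (for example, every `ip_address` element in a given file) without re-parsing the original data, which is one plausible reading of how the index supports pattern queries.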
