
Systems and methods for high-speed searching and filtering of large datasets

  • US 9,171,054 B1
  • Filed: 01/04/2013
  • Issued: 10/27/2015
  • Est. Priority Date: 01/04/2012
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

    (a) receiving at one or more computer processors, from a first computer-readable storage medium operatively coupled to the one or more computer processors, first electronic indicia of a dataset comprising a multitude of alphanumeric data records, each data record including data strings for multiple corresponding defined data fields;

    (b) using the one or more computer processors, the one or more computer processors being programmed therefor, generating second electronic indicia of the dataset, the second electronic indicia comprising (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and

    (c) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on the first computer-readable storage medium or on a second computer-readable storage medium operatively coupled to the one or more computer processors, wherein:

    (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;

    (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;

    (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;

    (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;

    (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;

    (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;

    (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and

    (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
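
The arrangement recited in elements (e) through (k) can be illustrated with a minimal sketch. It is not the patented implementation, and everything named in it is an assumption made for illustration: the record layout (region, year, month, note), the byte formats FIRST and SECOND, and the functions build and search are hypothetical. For brevity each segment encodes a single value rather than a subrange of data strings, and the auxiliary data structure is a plain Python list rather than a set of binary string segments.

```python
import struct
from itertools import groupby

# Hypothetical record layout: (region, year, month, note)
#   region -> clumped field, listed in the clump header table
#   year   -> first-level filterable field
#   month  -> second-level filterable field
#   note   -> non-filterable field, kept in an auxiliary structure

FIRST = struct.Struct("<HH")   # year, count of second-level segments that follow
SECOND = struct.Struct("<BI")  # month, index into the auxiliary list


def build(records):
    """Return (clump_table, inline_tree, auxiliary) for an iterable of records."""
    records = sorted(records, key=lambda r: r[:3])
    clump_table = []    # one (region, offset) entry per distinct region
    tree = bytearray()  # each first-level segment is followed by its children
    auxiliary = []      # non-filterable strings, addressed by index
    for region, by_region in groupby(records, key=lambda r: r[0]):
        # The clump record points at the clump's first first-level segment.
        clump_table.append((region, len(tree)))
        for year, by_year in groupby(by_region, key=lambda r: r[1]):
            children = list(by_year)
            tree += FIRST.pack(year, len(children))
            for _, _, month, note in children:
                tree += SECOND.pack(month, len(auxiliary))
                auxiliary.append(note)
    return clump_table, bytes(tree), auxiliary


def search(clump_table, tree, auxiliary, region, year_range, month_range):
    """Scan one clump's part of the tree, skipping subtrees that fail the filter."""
    hits = []
    for i, (name, start) in enumerate(clump_table):
        if name != region:
            continue
        # A clump ends where the next clump starts, or at the end of the tree.
        end = clump_table[i + 1][1] if i + 1 < len(clump_table) else len(tree)
        pos = start
        while pos < end:
            year, n_children = FIRST.unpack_from(tree, pos)
            pos += FIRST.size
            if year_range[0] <= year <= year_range[1]:
                for k in range(n_children):
                    month, aux = SECOND.unpack_from(tree, pos + k * SECOND.size)
                    if month_range[0] <= month <= month_range[1]:
                        hits.append((name, year, month, auxiliary[aux]))
            # Step over the children whether or not they were inspected.
            pos += n_children * SECOND.size
    return hits
```

In this sketch a query first selects clump records by the clumped field, jumps to the indicated offset, and then reads the tree sequentially. A first-level segment that fails the filter lets the scan step over all of its second-level segments without decoding them, and the non-filterable strings are touched only for records that match.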
