Systems and methods for high-speed searching and filtering of large datasets

US 9,171,054 B1
Filed: 01/04/2013
Issued: 10/27/2015
Est. Priority Date: 01/04/2012
Status: Active Grant

Abstract

A data structure comprises a clump header table, an inline tree data structure, and one or more auxiliary data structures. Each clump header record includes an indicator of a location in the inline tree data structure of corresponding binary string segments. Clump header records or auxiliary header records include indicators of corresponding locations in the corresponding auxiliary data structure. Each auxiliary data structure can be altered without necessarily altering the inline tree or clump header table. A dedicated, specifically adapted conversion program generates the clump header file, the inline tree data structure, and the one or more auxiliary data structures. The data structure can be stored on any computer-readable medium, and can be read entirely into RAM to be searched (with or without filtering on one or more filter data fields). A dedicated, specifically adapted search and filter program is employed, which can list or enumerate the retrieved data records.
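
As a rough illustration of the three structures named in the abstract, the following Python sketch converts a handful of records into a clump header table, an inline tree of packed binary segments, and one auxiliary structure. It is a minimal sketch under stated assumptions, not the conversion program described by the patent: the field names (`region`, `street_no`, `unit_no`, `owner`), the segment layouts `FIRST` and `SECOND`, and the function `convert` are all hypothetical.

```python
import struct
from itertools import groupby

# Hypothetical records: (region, street_no, unit_no, owner)
#   region    -> clumped field (one clump header record per distinct value)
#   street_no -> first-level filterable field
#   unit_no   -> second-level filterable field
#   owner     -> auxiliary field, kept outside the inline tree
RECORDS = [
    ("north", 10, 1, "Ames"),
    ("north", 10, 2, "Bell"),
    ("north", 12, 1, "Cole"),
    ("south", 7, 1, "Dunn"),
    ("south", 7, 3, "Eads"),
]

# Assumed segment layouts (little-endian unsigned 16-bit integers).
FIRST = struct.Struct("<HH")   # (street_no, number of second-level segments)
SECOND = struct.Struct("<H")   # (unit_no,)


def convert(records):
    """Build (clump_headers, inline_tree, auxiliary) from the records."""
    records = sorted(records)
    clump_headers = []         # (region, byte offset into tree, first-level count)
    inline_tree = bytearray()  # each first-level segment, then its second-level segments
    auxiliary = []             # owner strings, in the same order as the tree's leaves
    for region, clump in groupby(records, key=lambda r: r[0]):
        offset = len(inline_tree)
        n_first = 0
        for street, group in groupby(clump, key=lambda r: r[1]):
            units = list(group)
            inline_tree += FIRST.pack(street, len(units))
            for _, _, unit, owner in units:
                inline_tree += SECOND.pack(unit)
                auxiliary.append(owner)
            n_first += 1
        clump_headers.append((region, offset, n_first))
    return clump_headers, bytes(inline_tree), auxiliary


headers, tree, aux = convert(RECORDS)
```

Each clump header carries a byte offset into the inline tree, so a search can jump straight to the segments of a matching clump, and the auxiliary list can be rewritten without touching the other two structures.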

24 Claims

1. A computer-implemented method comprising:
- (a) receiving at one or more computer processors, from a first computer-readable storage medium operatively coupled to the one or more computer processors, first electronic indicia of a dataset comprising a multitude of alphanumeric data records, each data record including data strings for multiple corresponding defined data fields;
  
  (b) using the one or more computer processors, the one or more computer processors being programmed therefor, generating second electronic indicia of the dataset, the second electronic indicia comprising (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and
  
  (c) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on the first computer-readable storage medium or on a second computer-readable storage medium operatively coupled to the one or more computer processors, wherein:
  
  (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
  
  (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
  
  (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
  
  (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
  
  (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
  
  (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
  
  (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
  
  (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
- - 2. The method of claim 1 wherein each first-level binary string segment and one or more corresponding second-level binary string segments form a substantially contiguous portion within the inline tree data structure.
  - 3. The method of claim 1 further comprising altering stored electronic indicia of at least one of the one or more auxiliary data structures.
  - 4. The method of claim 3 wherein the altering of stored electronic indicia of the auxiliary data structure is performed without altering the clump header table or the inline tree data structure.
  - 5. The method of claim 1 wherein at least a portion of the electronic indicia of at least one of the one or more auxiliary data structures correspond to altered data strings in one or more of the defined data fields of the corresponding auxiliary set.
  - 6. The method of claim 1 wherein at least a portion of the electronic indicia of at least one of the one or more auxiliary data structures correspond to replacement data strings for one or more of the defined data fields of the corresponding auxiliary set.
  - 7. The method of claim 1 wherein:
    - (l) a third set of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first, second, and third sets, which subranges correspond to first-level, second-level, and third-level subsets, respectively, of the data records of the dataset;
      
      (m) the inline tree data structure further comprises a subset of one or more corresponding third-level binary string segments following each second-level binary string segment; and
      
      (n) each third-level binary string segment encodes the range of data strings in the third set of data fields of a corresponding one of the third-level subsets of the data records.
  - 8. The method of claim 7 wherein each second-level binary string segment and one or more corresponding third-level binary string segments form a substantially contiguous portion within the inline tree data structure.
  - 9. The method of claim 1 wherein at least one of the one or more auxiliary data structures includes a corresponding auxiliary clump header table, wherein the auxiliary clump header table includes, for each clump data record, an indicator of a location, in the corresponding set of auxiliary binary string segments or in the corresponding auxiliary inline tree structure, of electronic indicia of the corresponding auxiliary set of data fields of data records of the corresponding first-level subset of data records.
  - 10. The method of claim 1 wherein each clump data record includes an indicator of a location, in at least one of the corresponding sets of auxiliary binary string segments or in at least one of the corresponding auxiliary inline tree structures, of electronic indicia of the corresponding auxiliary set of data fields of data records of the corresponding first-level subset of data records.
  - 11. The method of claim 10 wherein at least one of the corresponding auxiliary inline tree structures or at least one of the auxiliary sets of binary string segments is arranged in an ordered sequence that corresponds to an ordered sequence of arrangement of the first-level and second-level binary string segments in the inline tree data structure.
  - 12. The method of claim 11 wherein the indicator of the location in each set of auxiliary binary string segments or in each auxiliary inline tree structure comprises a total number of data records represented by preceding clump data records.
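
Claims 4 and 10 to 12 can be pictured with a small Python sketch. It assumes the auxiliary segments are stored in the same order as the records of the inline tree, so the location indicator for a clump is simply the total number of records in all preceding clumps. The names `clump_counts`, `auxiliary`, `aux_index`, and `aux_for_clump` are hypothetical, and the sample values are illustrative only.

```python
from itertools import accumulate

# Hypothetical clump header table: (clump key, number of data records in the clump).
clump_counts = [("north", 3), ("south", 2), ("west", 4)]

# Auxiliary string segments, one per data record, in inline-tree order (assumed).
auxiliary = ["Ames", "Bell", "Cole", "Dunn", "Eads", "Ford", "Gray", "Hale", "Ives"]

# Location indicator per clump: total records represented by preceding clumps.
starts = [0, *accumulate(n for _, n in clump_counts)][:-1]
aux_index = {key: (start, n) for (key, n), start in zip(clump_counts, starts)}


def aux_for_clump(key):
    """Return the auxiliary entries belonging to one clump."""
    start, n = aux_index[key]
    return auxiliary[start:start + n]


# Altering an auxiliary entry leaves the clump counts and offsets unchanged,
# which is the kind of independence that claim 4 describes.
auxiliary[3] = "Dunne"
```

Because the offsets depend only on record counts, replacing a string in `auxiliary` requires no change to `clump_counts` or `aux_index`.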

13. A computer system comprising one or more computer processors and one or more computer-readable non-transitory storage media structured and connected to perform a method, the method comprising:
- (a) receiving at the one or more computer processors, from a first one of the one or more computer-readable storage media operatively coupled to the one or more computer processors, first electronic indicia of a dataset comprising a multitude of alphanumeric data records, each data record including data strings for multiple corresponding defined data fields;
  
  (b) using the one or more computer processors, the one or more computer processors being programmed therefor, generating second electronic indicia of the dataset, the second electronic indicia comprising (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and
  
  (c) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on the first one of the one or more computer-readable storage media or on a second one of the one or more computer-readable storage media operatively coupled to the one or more computer processors, wherein:
  
  (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
  
  (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
  
  (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
  
  (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
  
  (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
  
  (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
  
  (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
  
  (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.

14. An article comprising one or more tangible, non-transitory program-storage media encoding computer-readable instructions that, when applied to a computer system, instruct the computer system to perform a method, the method comprising:
- (a) receiving at one or more computer processors, from a first computer-readable data-storage medium operatively coupled to the one or more computer processors, first electronic indicia of a dataset comprising a multitude of alphanumeric data records, each data record including data strings for multiple corresponding defined data fields;
  
  (b) using the one or more computer processors, the one or more computer processors being programmed therefor, generating second electronic indicia of the dataset, the second electronic indicia comprising (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and
  
  (c) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on the first computer-readable data-storage medium or on a second computer-readable data-storage medium operatively coupled to the one or more computer processors, wherein:
  
  (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
  
  (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
  
  (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
  
  (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
  
  (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
  
  (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
  
  (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
  
  (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.

15. An article comprising one or more tangible, non-transitory computer-readable media encoded to store an alphanumeric or binary clump header table, an inline tree data structure, and one or more auxiliary data structures, wherein:
- (a) the alphanumeric or binary clump header table, the inline tree data structure, and the one or more auxiliary data structures are derived from a dataset comprising a multitude of alphanumeric data records;
  
  (b) each data record includes data strings for multiple corresponding defined data fields;
  
  (c) the alphanumeric or binary clump header table comprises a plurality of clump data records;
  
  (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
  
  (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
  
  (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
  
  (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
  
  (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
  
  (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
  
  (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
  
  (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
- - 16. The article of claim 15 wherein one or more of the computer-readable media encoded to store the inline tree data structure is directly accessible to a computer processor.
  - 17. The article of claim 15 wherein one or more of the computer-readable media encoded to store at least one of the sets of auxiliary binary string segments or at least one of the auxiliary inline tree structures is directly accessible to a computer processor.
  - 18. The article of claim 17 wherein one or more of the media directly accessible to the computer processor comprise random access memory.

19. A computer-implemented method for searching a clump header table, an inline tree data structure, and one or more auxiliary data structures stored on one or more tangible, non-transitory computer-readable media, wherein:
- (a) the alphanumeric or binary clump header table, the inline tree data structure, and the one or more auxiliary data structures are derived from a dataset comprising a multitude of alphanumeric data records;
  
  (b) each data record includes data strings for multiple corresponding defined data fields;
  
  (c) the alphanumeric or binary clump header table comprises a plurality of clump data records;
  
  (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
  
  (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
  
  (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
  
  (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
  
  (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
  
  (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
  
  (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
  
  (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments, the method comprising:
  
  (A) receiving, at one or more computer processors programmed for performing the method and operatively coupled to the one or more computer-readable media, an electronic query for data records, or an enumeration thereof, having data strings in one or more specified clumped, filterable, or auxiliary data fields that fall within corresponding specified filter subranges for those data fields;
  
  (B) in response to the query of part (A), with the one or more computer processors, automatically electronically interrogating the clump header table to identify one or more clump data records that correspond to data strings in specified clump data fields that fall within the specified filter subranges according to the query of part (A);
  
  (C) automatically electronically interrogating, with the one or more computer processors, those first-level binary string segments indicated by the clump data records identified in part (B), to identify one or more first-level binary string segments that indicate one or more data records that have data strings in specified filterable data fields within the specified filter subranges according to the query of part (A);
  
  (D) automatically electronically interrogating, with the one or more computer processors, those second-level binary string segments corresponding to the first-level binary string segments identified in part (C), to identify one or more second-level binary string segments that indicate one or more data records in specified filterable data fields that have data strings within the specified filter subranges according to the query of part (A);
  
  (E) in response to the query of part (A), with the one or more computer processors, automatically electronically interrogating the one or more auxiliary data structures to identify one or more data records that correspond to data strings in specified auxiliary data fields that fall within the specified filter subranges according to the query of part (A); and
  
  (F) automatically generating, with the one or more computer processors, a list or an enumeration of one or more data records that correspond to the clump data records identified in part (B), the first-level binary string segments identified in part (C), the second-level binary string segments identified in part (D), or the data records identified in part (E).
- - 20. The method of claim 19 wherein the inline tree data structure is stored in one or more computer-readable media that are directly accessible to the computer processor of part (C), (D), or (E).
  - 21. The method of claim 19 wherein at least one of the one or more auxiliary data structures is stored in one or more computer-readable media that are directly accessible to the computer processor of part (C), (D), or (E).
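
Parts (A) through (F) of claim 19 can be sketched in Python as a single pass over the stored structures. This is a minimal sketch under stated assumptions, not the search and filter program of the patent: the segment layouts `FIRST` and `SECOND`, the four-element clump header tuples, the sample data, and the function `search` are hypothetical, and the sketch combines all filters with a logical AND, which is only one of the combinations the claim language permits.

```python
import struct

# Assumed segment layouts (little-endian unsigned 16-bit integers).
FIRST = struct.Struct("<HH")   # (first-level value, count of second-level segments)
SECOND = struct.Struct("<H")   # (second-level value,)

# Hypothetical pre-built structures.
# Clump header: (clump value, tree byte offset, first-level count, first record index)
clump_headers = [("north", 0, 2, 0), ("south", 14, 1, 3)]
inline_tree = b"".join([
    FIRST.pack(10, 2), SECOND.pack(1), SECOND.pack(2),
    FIRST.pack(12, 1), SECOND.pack(1),
    FIRST.pack(7, 2), SECOND.pack(1), SECOND.pack(3),
])
auxiliary = ["Ames", "Bell", "Cole", "Dunn", "Eads"]  # one entry per record, tree order


def search(clumps, first_range, second_range, aux_ok=lambda s: True):
    """Part (A): the arguments are the query. Returns the list of part (F)."""
    hits = []
    for clump, pos, n_first, rec in clump_headers:
        if clump not in clumps:                      # part (B): clump header filter
            continue
        for _ in range(n_first):
            first, n_second = FIRST.unpack_from(inline_tree, pos)
            pos += FIRST.size
            if not first_range[0] <= first <= first_range[1]:   # part (C)
                pos += n_second * SECOND.size        # skip this segment's children
                rec += n_second
                continue
            for _ in range(n_second):
                (second,) = SECOND.unpack_from(inline_tree, pos)
                pos += SECOND.size
                if (second_range[0] <= second <= second_range[1]    # part (D)
                        and aux_ok(auxiliary[rec])):                # part (E)
                    hits.append((clump, first, second, auxiliary[rec]))
                rec += 1
    return hits


result = search({"north", "south"}, (0, 100), (1, 1), lambda s: s.startswith("D"))
```

A first-level segment that fails its filter lets the loop skip all of its second-level segments in one step, which is where a layout of this kind saves work when everything is held in memory.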

22. An article comprising one or more tangible, non-transitory computer-readable data-output media encoded to store electronic indicia of a list or an enumeration of data records generated by a computer-implemented method for searching a clump header table, an inline tree data structure, and one or more auxiliary data structures encoded onto one or more tangible, non-transitory computer-readable data-storage media, wherein:
- (a) the alphanumeric or binary clump header table, the inline tree data structure, and the one or more auxiliary data structures are derived from a dataset comprising a multitude of alphanumeric data records;
  
  (b) each data record includes data strings for multiple corresponding defined data fields;
  
  (c) the alphanumeric or binary clump header table comprises a plurality of clump data records;
  
  (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
  
  (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
  
  (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
  
  (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
  
  (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
  
  (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
  
  (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
  
  (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments, and the method comprises:
  
  (A) receiving, at one or more computer processors programmed for performing the method and operatively coupled to the one or more data-storage media and the one or more data-output media, an electronic query for data records, or an enumeration thereof, having data strings in one or more specified clumped, filterable, or auxiliary data fields that fall within corresponding specified filter subranges for those data fields;
  
  (B) in response to the query of part (A), with the one or more computer processors, automatically electronically interrogating the clump header table to identify one or more clump data records that correspond to data strings in specified clump data fields that fall within the specified filter subranges according to the query of part (A);
  
  (C) automatically electronically interrogating, with the one or more computer processors, those first-level binary string segments indicated by the clump data records identified in part (B), to identify one or more first-level binary string segments that indicate one or more data records that have data strings in specified filterable data fields within the specified filter subranges according to the query of part (A);
  
  (D) automatically electronically interrogating, with the one or more computer processors, those second-level binary string segments corresponding to the first-level binary string segments identified in part (C), to identify one or more second-level binary string segments that indicate one or more data records in specified filterable data fields that have data strings within the specified filter subranges according to the query of part (A);
  
  (E) in response to the query of part (A), with the one or more computer processors, automatically electronically interrogating the one or more auxiliary data structures to identify one or more data records that correspond to data strings in specified auxiliary data fields that fall within the specified filter subranges according to the query of part (A); and
  
  (F) automatically generating and encoding onto the one or more data-output media, with the one or more computer processors, the list or the enumeration of one or more data records, the listed or enumerated data records corresponding to the clump data records identified in part (B), the first-level binary string segments identified in part (C), the second-level binary string segments identified in part (D), or the data records identified in part (E).
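
Steps (A) through (F) can be read as a staged filter: the clump header table narrows the search to a region of the inline tree, first-level segments gate their second-level segments, and the auxiliary data supplies a final check. The Python sketch below is only an illustration under stated assumptions, not the patented implementation. The field names (`region`, `street`, `age`, `score`), the integer coding of data strings, the tuple layout of tree entries, and the choice to require all four filters at once are all hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]   # inclusive (low, high) filter subrange
Fields = Dict[str, int]   # field name -> integer-coded data string (assumed coding)


def matches(fields: Fields, filters: Dict[str, Range]) -> bool:
    """True when every filtered field lies inside its inclusive subrange."""
    return all(lo <= fields[name] <= hi for name, (lo, hi) in filters.items())


def search(clumps, tree, aux, query) -> List[int]:
    """Return record ids passing the clump, first-level, second-level and aux filters."""
    hits: List[int] = []
    for i, clump in enumerate(clumps):
        # (B) interrogate the clump header table
        if not matches(clump["fields"], query.get("clump", {})):
            continue
        # The clump's region ends where the next clump's region starts.
        end = clumps[i + 1]["start"] if i + 1 < len(clumps) else len(tree)
        first_ok = False
        for level, fields, record_id in tree[clump["start"]:end]:
            if level == 1:
                # (C) interrogate a first-level segment
                first_ok = matches(fields, query.get("first", {}))
            elif first_ok and matches(fields, query.get("second", {})):
                # (D) second-level segment passed; (E) check auxiliary data
                if matches(aux[record_id], query.get("aux", {})):
                    hits.append(record_id)
    return hits  # (F) the list of matching records


# Hypothetical inline tree: each first-level entry is followed by its second-level entries.
tree = [
    (1, {"street": 10}, None), (2, {"age": 34}, 0), (2, {"age": 71}, 1),
    (1, {"street": 20}, None), (2, {"age": 45}, 2),
    (1, {"street": 10}, None), (2, {"age": 38}, 3),
]
# Each clump record points at the location of its first first-level entry.
clumps = [{"fields": {"region": 1}, "start": 0}, {"fields": {"region": 2}, "start": 5}]
aux = [{"score": 5}, {"score": 9}, {"score": 7}, {"score": 2}]

query = {"clump": {"region": (1, 1)}, "first": {"street": (10, 15)},
         "second": {"age": (30, 40)}, "aux": {"score": (0, 10)}}
result = search(clumps, tree, aux, query)  # [0]
```

This sketch still visits every entry inside a matching clump and merely ignores second-level entries under a rejected first-level entry. Skipping such entries outright would need additional information in each first-level entry, which the sketch does not model.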

23. A computer system structured and connected to perform a method for searching a clump header table, an inline tree data structure, and one or more auxiliary data structures stored on one or more tangible, non-transitory, computer-readable data-storage media, the computer system comprising one or more programmed computer processors, the one or more data-storage media operatively coupled to the one or more processors, and one or more tangible, non-transitory, computer-readable data-output media operatively coupled to the one or more processors, wherein:
- (a) the alphanumeric or binary clump header table, the inline tree data structure, and the one or more auxiliary data structures are derived from a dataset comprising a multitude of alphanumeric data records;
  
  (b) each data record includes data strings for multiple corresponding defined data fields;
  
  (c) the alphanumeric or binary clump header table comprises a plurality of clump data records;
  
  (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
  
  (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
  
  (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
  
  (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
  
  (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
  
  (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
  
  (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
  
  (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments, and the method comprises:
  
  (A) receiving at the one or more computer processors an electronic query for data records, or an enumeration thereof, having data strings in one or more specified clumped, filterable, or auxiliary data fields that fall within corresponding specified filter subranges for those data fields;
  
  (B) in response to the query of part (A), with the one or more computer processors, automatically electronically interrogating the clump header table to identify one or more clump data records that correspond to data strings in specified clump data fields that fall within the specified filter subranges according to the query of part (A);
  
  (C) automatically electronically interrogating, with the one or more computer processors, those first-level binary string segments indicated by the clump data records identified in part (B), to identify one or more first-level binary string segments that indicate one or more data records that have data strings in specified filterable data fields within the specified filter subranges according to the query of part (A);
  
  (D) automatically electronically interrogating, with the one or more computer processors, those second-level binary string segments corresponding to the first-level binary string segments identified in part (C), to identify one or more second-level binary string segments that indicate one or more data records in specified filterable data fields that have data strings within the specified filter subranges according to the query of part (A);
  
  (E) in response to the query of part (A), with the one or more computer processors, automatically electronically interrogating the one or more auxiliary data structures to identify one or more data records that correspond to data strings in specified auxiliary data fields that fall within the specified filter subranges according to the query of part (A); and
  
  (F) automatically generating and storing on the one or more data-output media, with the one or more computer processors, a list or an enumeration of one or more data records that correspond to the clump data records identified in part (B), the first-level binary string segments identified in part (C), the second-level binary string segments identified in part (D), or the data records identified in part (E).
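
Elements (d), (e), (h), and (i) of claim 23 describe how the structures relate: one clump record per distinct combination of clumped field values, each pointing to a location in a flat sequence where every first-level segment is followed by its second-level segments. A minimal sketch of building such a layout from raw records follows. It assumes a single field per level, keeps plain strings instead of binary segments, and uses hypothetical field names (`region`, `street`, `unit`).

```python
from itertools import groupby


def build(records, clump_key, first_key, second_key):
    """Return (clump_table, inline_tree) for records given as dicts."""
    ordered = sorted(records, key=lambda r: (r[clump_key], r[first_key], r[second_key]))
    clump_table, tree = [], []
    for clump_val, in_clump in groupby(ordered, key=lambda r: r[clump_key]):
        # One clump record per distinct clumped value, holding the tree location
        # of its first first-level entry.
        clump_table.append({"value": clump_val, "start": len(tree)})
        for first_val, in_first in groupby(in_clump, key=lambda r: r[first_key]):
            tree.append((1, first_val))           # first-level entry
            for r in in_first:
                tree.append((2, r[second_key]))   # its second-level entries follow inline
    return clump_table, tree


records = [
    {"region": "B", "street": "Oak", "unit": "2"},
    {"region": "A", "street": "Elm", "unit": "1"},
    {"region": "A", "street": "Elm", "unit": "3"},
    {"region": "A", "street": "Fir", "unit": "1"},
]
clump_table, tree = build(records, "region", "street", "unit")
```

Because the tree is a single flat sequence, a search that starts at a clump's `start` index can walk forward through contiguous entries rather than following pointers between nodes.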

24. An article comprising one or more tangible, non-transitory program-storage media encoding computer-readable instructions that, when applied to a computer system, instruct the computer system to perform a method for searching a clump header table, an inline tree data structure, and one or more auxiliary data structures stored on a tangible, non-transitory computer-readable data-storage medium, wherein:
- (a) the alphanumeric or binary clump header table, the inline tree data structure, and the one or more auxiliary data structures are derived from a dataset comprising a multitude of alphanumeric data records;
  
  (b) each data record includes data strings for multiple corresponding defined data fields;
  
  (c) the alphanumeric or binary clump header table comprises a plurality of clump data records;
  
  (d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
  
  (e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
  
  (f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
  
  (g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
  
  (h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
  
  (i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
  
  (j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
  
  (k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments, and the method comprises:
  
  (A) receiving, at one or more computer processors of the computer system programmed for performing the method and operatively coupled to the one or more data-storage media, an electronic query for data records, or an enumeration thereof, having data strings in one or more specified clumped, filterable, or auxiliary data fields that fall within corresponding specified filter subranges for those data fields;
  
  (B) in response to the query of part (A), with the one or more computer processors, automatically electronically interrogating the clump header table to identify one or more clump data records that correspond to data strings in specified clump data fields that fall within the specified filter subranges according to the query of part (A);
  
  (C) automatically electronically interrogating, with the one or more computer processors, those first-level binary string segments indicated by the clump data records identified in part (B), to identify one or more first-level binary string segments that indicate one or more data records that have data strings in specified filterable data fields within the specified filter subranges according to the query of part (A);
  
  (D) automatically electronically interrogating, with the one or more computer processors, those second-level binary string segments corresponding to the first-level binary string segments identified in part (C), to identify one or more second-level binary string segments that indicate one or more data records in specified filterable data fields that have data strings within the specified filter subranges according to the query of part (A);
  
  (E) in response to the query of part (A), with the one or more computer processors, automatically electronically interrogating the one or more auxiliary data structures to identify one or more data records that correspond to data strings in specified auxiliary data fields that fall within the specified filter subranges according to the query of part (A); and
  
  (F) automatically generating, with the one or more computer processors, a list or an enumeration of one or more data records that correspond to the clump data records identified in part (B), the first-level binary string segments identified in part (C), the second-level binary string segments identified in part (D), or the data records identified in part (E).
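
Elements (f) and (g) state that each binary string segment encodes a subrange of data strings for filterable fields only. One way to picture this is a fixed-width packed record that is compared against a filter subrange without ever being expanded back into text. The byte layout below (a one-byte level tag followed by two 16-bit codes for the low and high ends of the subrange) is purely an assumption for illustration; the claims do not specify any particular layout.

```python
import struct

# Hypothetical layout: level tag (1 byte), low code (2 bytes), high code (2 bytes).
SEGMENT = struct.Struct("<BHH")


def encode_segment(level: int, low: int, high: int) -> bytes:
    """Pack one segment that encodes the coded subrange [low, high]."""
    return SEGMENT.pack(level, low, high)


def scan(buffer: bytes, q_low: int, q_high: int):
    """Return indices of segments whose subrange overlaps the filter subrange."""
    return [
        i
        for i, (_level, low, high) in enumerate(SEGMENT.iter_unpack(buffer))
        if low <= q_high and q_low <= high
    ]


# Three segments stored back to back in a single byte string.
buffer = (
    encode_segment(1, 100, 199)
    + encode_segment(2, 120, 120)
    + encode_segment(1, 200, 299)
)
selected = scan(buffer, 150, 210)  # [0, 2]
```

Non-filterable fields never enter the packed record in this sketch, which keeps each segment small; in the claims such fields may instead be represented in the auxiliary data structures.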

Current Assignee: Moonshadow Mobile, Inc.
Original Assignee: Moonshadow Mobile, Inc.
Inventors: Ward, Roy W.
Primary Examiner(s): ROBINSON, GRETA LEE
Application Number: US13/733,890
Time in Patent Office: 1,026 Days
Field of Search: 707/693, 707/795, 707/713, 707/715, 707/716, 707/793, 707/797, 707/800, 707/798, 707/812, 707/E17.039, 707/754, 707/791, 707/796, 707/802, 707/803, 707/769, 707/778, 707/811, 711/117, 711/154
US Class Current: 1/1
CPC Class Codes:
G06F 16/2246   Trees, e.g. B+trees
G06F 16/2453   Query optimisation
G06F 16/282   Hierarchical databases, e.g...
