Systems and methods for high-speed searching and filtering of large datasets
First Claim
1. A computer system comprising one or more computer processors and one or more computer-readable non-transitory storage media structured and connected to perform a method comprising:
(a) generating, from a multitude of alphanumeric data records, using one or more of the computer processors programmed therefor, (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and
(b) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on one of the computer-readable storage media, wherein:
(c) the multitude of alphanumeric data records represent a dataset, each alphanumeric data record includes data strings for multiple corresponding defined data fields, and the clump header table, the inline tree data structure, and the one or more auxiliary data structures also represent said dataset;
(d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
(i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
(j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
(k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
Abstract
A data structure comprises a clump header table, an inline tree data structure, and one or more auxiliary data structures. Each clump header record includes an indicator of a location in the inline tree data structure of corresponding binary string segments. Clump header records or auxiliary header records include indicators of corresponding locations in the corresponding auxiliary data structure. Each auxiliary data structure can be altered without necessarily altering the inline tree or clump header table. A dedicated, specifically adapted conversion program generates the clump header file, the inline tree data structure, and the one or more auxiliary data structures. The data structure can be stored on any computer-readable medium, and can be read entirely into RAM to be searched (with or without filtering on one or more filter data fields). A dedicated, specifically adapted search and filter program is employed, which can list or enumerate the retrieved data records.
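The three-part layout described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the field names (`city` as the clumped field, `street_type` as the filterable first-level field, `floor` as the filterable second-level field, `owner` as the auxiliary field), the one-byte encodings, and the function name `build`. Street and unit names are used only for grouping and are left out of the tree, standing in for the non-filterable fields.

```python
import struct
from itertools import groupby

def build(records):
    """Hypothetical conversion step.

    records: iterable of (city, street, street_type, unit, floor, owner).
    Returns (clump header table, inline tree bytes, auxiliary owner list).
    """
    records = sorted(records)
    tree = bytearray()   # inline tree: first-level segment, then its second-level segments
    clumps = []          # (city, tree offset, first-level count, first auxiliary index)
    aux_owner = []       # auxiliary strings, one per second-level segment, in tree order
    for city, city_rows in groupby(records, key=lambda r: r[0]):
        offset, first_leaf, n_first = len(tree), len(aux_owner), 0
        by_street = groupby(city_rows, key=lambda r: (r[1], r[2]))
        for (_street, street_type), rows in by_street:
            rows = list(rows)
            # First-level segment: filterable field plus child count (assumed < 256).
            tree += struct.pack("BB", street_type, len(rows))
            for _city, _st, _type, _unit, floor, owner in rows:
                tree += struct.pack("B", floor)   # second-level segment
                aux_owner.append(owner)
            n_first += 1
        clumps.append((city, offset, n_first, first_leaf))
    return clumps, bytes(tree), aux_owner

records = [
    ("Bray", "Ash", 1, "A", 5, "Ann"),
    ("Avon", "Elm", 1, "B", 3, "Lee"),
    ("Avon", "Elm", 1, "A", 2, "Kim"),
    ("Avon", "Oak", 2, "A", 1, "Roy"),
]
clumps, tree, aux_owner = build(records)
print(clumps)      # [('Avon', 0, 2, 0), ('Bray', 7, 1, 3)]
print(len(tree))   # 10

# The auxiliary structure can change without rebuilding the tree or the header table.
aux_owner[0] = "Kay"
```

In this sketch the auxiliary list is kept parallel to the second-level segments, so a clump record only needs one extra index to locate its auxiliary entries. That is one possible arrangement, not necessarily the one the patent uses.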
24 Claims
-
1. A computer system comprising one or more computer processors and one or more computer-readable non-transitory storage media structured and connected to perform a method comprising:
-
(a) generating, from a multitude of alphanumeric data records, using one or more of the computer processors programmed therefor, (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and
(b) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on one of the computer-readable storage media, wherein:
(c) the multitude of alphanumeric data records represent a dataset, each alphanumeric data record includes data strings for multiple corresponding defined data fields, and the clump header table, the inline tree data structure, and the one or more auxiliary data structures also represent said dataset;
(d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
(i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
(j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
(k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
-
-
2. An article comprising one or more tangible, non-transitory program-storage media encoding computer-readable instructions that, when applied to a computer system comprising one or more programmed electronic processors operatively coupled to one or more computer-readable storage media, instruct the computer system to perform a method comprising:
-
(a) generating, from a multitude of alphanumeric data records, using one or more of the computer processors programmed therefor, (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and
(b) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on one of the computer-readable storage media, wherein:
(c) the multitude of alphanumeric data records represent a dataset, each alphanumeric data record includes data strings for multiple corresponding defined data fields, and the clump header table, the inline tree data structure, and the one or more auxiliary data structures also represent said dataset;
(d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
(i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
(j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
(k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
-
-
3. An article comprising one or more tangible, non-transitory computer-readable data-storage media encoded to store the clump header table, the inline tree data structure, and the one or more auxiliary data structures generated by a method implemented using a computer system comprising one or more programmed electronic processors operatively coupled to one or more computer-readable storage media, the method comprising:
-
(a) generating, from a multitude of alphanumeric data records, using one or more of the computer processors programmed therefor, (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and
(b) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on one of the computer-readable storage media, wherein:
(c) the multitude of alphanumeric data records represent a dataset, each alphanumeric data record includes data strings for multiple corresponding defined data fields, and the clump header table, the inline tree data structure, and the one or more auxiliary data structures also represent said dataset;
(d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
(i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
(j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
(k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
View Dependent Claims (4, 5, 6)
-
-
7. A method, implemented using a computer system comprising one or more programmed electronic processors operatively coupled to one or more computer-readable storage media, for searching an alphanumeric or binary clump header table, an inline tree data structure, and one or more auxiliary data structures stored on one or more tangible, non-transitory computer-readable data-storage media operatively coupled to one or more of the one or more programmed electronic processors, wherein the clump header table, the inline tree data structure, and the one or more auxiliary data structures represent a dataset that is also represented by a multitude of alphanumeric data records, the method comprising:
-
(A) receiving an electronic query for data records, or an enumeration thereof, having data strings in one or more specified clumped, filterable, or auxiliary data fields that fall within corresponding specified filter subranges for those data fields;
(B) in response to the query of part (A), with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, automatically electronically interrogating the clump header table to identify one or more clump data records that correspond to data strings in specified clump data fields that fall within the specified filter subranges according to the query of part (A);
(C) automatically electronically interrogating, with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, those first-level binary string segments indicated by the clump data records identified in part (B), to identify one or more first-level binary string segments that indicate one or more data records that have data strings in specified filterable data fields within the specified filter subranges according to the query of part (A);
(D) automatically electronically interrogating, with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, those second-level binary string segments corresponding to the first-level binary string segments identified in part (C), to identify one or more second-level binary string segments that indicate one or more data records in specified filterable data fields that have data strings within the specified filter subranges according to the query of part (A);
(E) in response to the query of part (A), with one of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, automatically electronically interrogating the one or more auxiliary data structures to identify one or more data records that correspond to data strings in specified auxiliary data fields that fall within the specified filter subranges according to the query of part (A); and
(F) automatically generating, with one of the computer processors programmed therefor, a list or an enumeration of one or more data records that correspond to the clump data records identified in part (B), the first-level binary string segments identified in part (C), the second-level binary string segments identified in part (D), or the data records identified in part (E), wherein:
(a) each alphanumeric data record includes data strings for multiple corresponding defined data fields;
(b) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(c) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(d) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(e) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(f) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
(g) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
(h) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
(i) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
View Dependent Claims (8, 9)
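Steps (A) through (F) of the search method can be pictured with a short sketch over a small hand-built structure. The byte layout (a one-byte filterable value plus a one-byte child count for each first-level segment, one byte for each second-level segment), the field meanings, and the names `CLUMPS`, `TREE`, `AUX_OWNER`, and `search` are all hypothetical. The sketch also combines every filter conjunctively, which is only one way to read the claim language.

```python
# Hypothetical pre-built structures (illustrative values only).
# Clump record: (city, tree offset, number of first-level segments, first auxiliary index)
CLUMPS = [("Avon", 0, 2, 0), ("Bray", 7, 1, 3)]
# Inline tree: [street_type, child_count] followed by one floor byte per child.
TREE = bytes([1, 2, 2, 3,   2, 1, 1,   1, 1, 5])
# Auxiliary alphanumeric strings, parallel to the second-level segments.
AUX_OWNER = ["Kim", "Lee", "Roy", "Ann"]

def search(clumps, tree, aux, cities, street_types, floors, owner_prefix=""):
    """Return matching (city, street_type, floor, owner) tuples."""
    hits = []
    for city, pos, n_first, leaf in clumps:
        if city not in cities:                    # (B) filter on the clumped field
            continue
        for _ in range(n_first):
            street_type, n_second = tree[pos], tree[pos + 1]
            pos += 2
            if street_type not in street_types:   # (C) first-level filter
                pos += n_second                   # skip the whole subtree
                leaf += n_second
                continue
            for _ in range(n_second):
                floor = tree[pos]                 # (D) second-level filter
                owner = aux[leaf]                 # (E) auxiliary filter
                if floor in floors and owner.startswith(owner_prefix):
                    hits.append((city, street_type, floor, owner))
                pos += 1
                leaf += 1
    return hits                                   # (F) list of matching records

# (A) a query: city Avon, street type 1 or 2, floor 2 or 3
print(search(CLUMPS, TREE, AUX_OWNER, {"Avon"}, range(1, 3), range(2, 4)))
# [('Avon', 1, 2, 'Kim'), ('Avon', 1, 3, 'Lee')]
```

Because each first-level segment is followed directly by its second-level segments, a failed first-level test lets the scan jump past the whole subtree with a single addition. Returning `len(hits)` instead of the list would give an enumeration rather than a listing.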
-
-
10. An article comprising one or more tangible, non-transitory computer-readable data-output media encoded to store electronic indicia of a list or enumeration of data records, wherein the list or enumeration is generated by a method, implemented using a computer system comprising one or more programmed electronic processors operatively coupled to one or more computer-readable storage media, for searching an alphanumeric or binary clump header table, an inline tree data structure, and one or more auxiliary data structures stored on one or more tangible, non-transitory computer-readable data-storage media operatively coupled to one or more of the one or more programmed electronic processors, wherein the clump header table, the inline tree data structure, and the one or more auxiliary data structures represent a dataset that is also represented by a multitude of alphanumeric data records, the method comprising:
-
(A) receiving an electronic query for data records, or an enumeration thereof, having data strings in one or more specified clumped, filterable, or auxiliary data fields that fall within corresponding specified filter subranges for those data fields;
(B) in response to the query of part (A), with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, automatically electronically interrogating the clump header table to identify one or more clump data records that correspond to data strings in specified clump data fields that fall within the specified filter subranges according to the query of part (A);
(C) automatically electronically interrogating, with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, those first-level binary string segments indicated by the clump data records identified in part (B), to identify one or more first-level binary string segments that indicate one or more data records that have data strings in specified filterable data fields within the specified filter subranges according to the query of part (A);
(D) automatically electronically interrogating, with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, those second-level binary string segments corresponding to the first-level binary string segments identified in part (C), to identify one or more second-level binary string segments that indicate one or more data records in specified filterable data fields that have data strings within the specified filter subranges according to the query of part (A);
(E) in response to the query of part (A), with one of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, automatically electronically interrogating the one or more auxiliary data structures to identify one or more data records that correspond to data strings in specified auxiliary data fields that fall within the specified filter subranges according to the query of part (A); and
(F) automatically generating, with one of the computer processors programmed therefor, a list or an enumeration of one or more data records that correspond to the clump data records identified in part (B), the first-level binary string segments identified in part (C), the second-level binary string segments identified in part (D), or the data records identified in part (E), wherein:
(a) each alphanumeric data record includes data strings for multiple corresponding defined data fields;
(b) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(c) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(d) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(e) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(f) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
(g) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
(h) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
(i) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
-
-
11. A computer system comprising one or more computer processors, one or more tangible, non-transitory computer-readable data-storage media, and one or more tangible, non-transitory computer-readable data-output media structured and connected to perform a method for searching an alphanumeric or binary clump header table, an inline tree data structure, and one or more auxiliary data structures stored on one or more tangible, non-transitory computer-readable data-storage media operatively coupled to one or more of the one or more programmed electronic processors, wherein the clump header table, the inline tree data structure, and the one or more auxiliary data structures represent a dataset that is also represented by a multitude of alphanumeric data records, the method comprising:
-
(A) receiving an electronic query for data records, or an enumeration thereof, having data strings in one or more specified clumped, filterable, or auxiliary data fields that fall within corresponding specified filter subranges for those data fields;
(B) in response to the query of part (A), with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, automatically electronically interrogating the clump header table to identify one or more clump data records that correspond to data strings in specified clump data fields that fall within the specified filter subranges according to the query of part (A);
(C) automatically electronically interrogating, with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, those first-level binary string segments indicated by the clump data records identified in part (B), to identify one or more first-level binary string segments that indicate one or more data records that have data strings in specified filterable data fields within the specified filter subranges according to the query of part (A);
(D) automatically electronically interrogating, with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, those second-level binary string segments corresponding to the first-level binary string segments identified in part (C), to identify one or more second-level binary string segments that indicate one or more data records in specified filterable data fields that have data strings within the specified filter subranges according to the query of part (A);
(E) in response to the query of part (A), with one of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, automatically electronically interrogating the one or more auxiliary data structures to identify one or more data records that correspond to data strings in specified auxiliary data fields that fall within the specified filter subranges according to the query of part (A); and
(F) automatically generating, with one of the computer processors programmed therefor, a list or an enumeration of one or more data records that correspond to the clump data records identified in part (B), the first-level binary string segments identified in part (C), the second-level binary string segments identified in part (D), or the data records identified in part (E), wherein:
(a) each alphanumeric data record includes data strings for multiple corresponding defined data fields;
(b) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(c) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(d) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(e) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(f) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
(g) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
(h) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
(i) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
-
-
12. An article comprising one or more tangible, non-transitory program-storage media encoding computer-readable instructions that, when applied to a computer system comprising one or more programmed electronic processors operatively coupled to one or more computer-readable storage media, instruct the computer system to perform a method for searching an alphanumeric or binary clump header table, an inline tree data structure, and one or more auxiliary data structures stored on one or more tangible, non-transitory computer-readable data-storage media operatively coupled to one or more of the one or more programmed electronic processors, wherein the clump header table, the inline tree data structure, and the one or more auxiliary data structures represent a dataset that is also represented by a multitude of alphanumeric data records, the method comprising:
-
(A) receiving an electronic query for data records, or an enumeration thereof, having data strings in one or more specified clumped, filterable, or auxiliary data fields that fall within corresponding specified filter subranges for those data fields;
(B) in response to the query of part (A), with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, automatically electronically interrogating the clump header table to identify one or more clump data records that correspond to data strings in specified clump data fields that fall within the specified filter subranges according to the query of part (A);
(C) automatically electronically interrogating, with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, those first-level binary string segments indicated by the clump data records identified in part (B), to identify one or more first-level binary string segments that indicate one or more data records that have data strings in specified filterable data fields within the specified filter subranges according to the query of part (A);
(D) automatically electronically interrogating, with one or more of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, those second-level binary string segments corresponding to the first-level binary string segments identified in part (C), to identify one or more second-level binary string segments that indicate one or more data records in specified filterable data fields that have data strings within the specified filter subranges according to the query of part (A);
(E) in response to the query of part (A), with one of the computer processors programmed therefor and linked to the one or more computer-readable data-storage media, automatically electronically interrogating the one or more auxiliary data structures to identify one or more data records that correspond to data strings in specified auxiliary data fields that fall within the specified filter subranges according to the query of part (A); and
(F) automatically generating, with one of the computer processors programmed therefor, a list or an enumeration of one or more data records that correspond to the clump data records identified in part (B), the first-level binary string segments identified in part (C), the second-level binary string segments identified in part (D), or the data records identified in part (E), wherein:
(a) each alphanumeric data record includes data strings for multiple corresponding defined data fields;
(b) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(c) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(d) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(e) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(f) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of
clump data records of the clump header table; (g) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment; (h) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and (i) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
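For orientation only, and not as part of the claim language, the interrogation steps (A) through (F) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for the example: the fixed-width byte layout (`FIRST`, `SECOND`), the dictionary keys of the clump header table, the one-byte-per-record auxiliary structure, and the function name `query`. The sketch also requires a record to pass every filter, which is only one possible way of combining the results of steps (B) through (E).

```python
import struct

# Hypothetical layouts (assumptions, not taken from the claims):
FIRST = struct.Struct("<BH")   # first-level segment: field value, number of children
SECOND = struct.Struct("<B")   # second-level segment: field value

# Inline tree: each first-level segment is followed by its second-level segments.
tree = b"".join([
    FIRST.pack(3, 2), SECOND.pack(25), SECOND.pack(61),   # clump "A", group 1
    FIRST.pack(7, 1), SECOND.pack(40),                    # clump "A", group 2
    FIRST.pack(3, 2), SECOND.pack(33), SECOND.pack(70),   # clump "B", group 1
])

# Auxiliary structure: one byte per second-level record, in tree order.
aux = bytes([1, 0, 1, 1, 0])

# Clump header table: each entry points at its first first-level segment.
clump_table = [
    {"key": "A", "offset": 0, "n_first": 2, "first_record": 0},
    {"key": "B", "offset": 2 * FIRST.size + 3 * SECOND.size,
     "n_first": 1, "first_record": 3},
]

def query(clump_table, tree, aux, clump_range, first_range, second_range, aux_range):
    """Return indices of second-level records that pass all four range filters."""
    hits = []
    for clump in clump_table:
        # Step (B): filter on the clumped field via the header table only.
        if not (clump_range[0] <= clump["key"] <= clump_range[1]):
            continue
        pos, rec = clump["offset"], clump["first_record"]
        for _ in range(clump["n_first"]):
            first_value, n_second = FIRST.unpack_from(tree, pos)
            pos += FIRST.size
            # Step (C): filter on the first-level filterable field.
            if first_range[0] <= first_value <= first_range[1]:
                for i in range(n_second):
                    (second_value,) = SECOND.unpack_from(tree, pos + i * SECOND.size)
                    # Steps (D) and (E): second-level and auxiliary filters.
                    if (second_range[0] <= second_value <= second_range[1]
                            and aux_range[0] <= aux[rec + i] <= aux_range[1]):
                        hits.append(rec + i)
            # Skip to the next first-level segment whether or not it matched.
            pos += n_second * SECOND.size
            rec += n_second
    return hits  # Step (F): the enumeration of matching records

print(query(clump_table, tree, aux, ("A", "A"), (3, 3), (18, 64), (1, 1)))  # [0]
```

In this sketch a rejected clump or first-level segment is skipped without reading its second-level segments, which is the kind of pruning the hierarchical layout makes possible.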
13. A method implemented using a computer system comprising one or more programmed electronic processors operatively coupled to one or more computer-readable storage media, the method comprising:
(a) generating, from a multitude of alphanumeric data records, using one or more of the computer processors programmed therefor, (1) an alphanumeric or binary clump header table comprising a plurality of clump data records, (2) an inline tree data structure, and (3) one or more auxiliary data structures; and
(b) storing the clump header table, the inline tree data structure, and the one or more auxiliary data structures on one of the computer-readable storage media, wherein:
(c) the multitude of alphanumeric data records represent a dataset, each alphanumeric data record includes data strings for multiple corresponding defined data fields, and the clump header table, the inline tree data structure, and the one or more auxiliary data structures also represent said dataset;
(d) first and second sets of the one or more data fields among the defined data fields define a hierarchical tree relationship among subranges of data strings of the data fields of the first and second sets, which subranges correspond to first-level and second-level subsets, respectively, of the data records of the dataset;
(e) the inline tree data structure comprises a sequence of (1) multiple first-level binary string segments, each followed by (2) a subset of one or more corresponding second-level binary string segments;
(f) each first-level binary string segment encodes a subrange of data strings in a selected filterable subset of the first set of data fields of a corresponding one of the first-level subsets of the data records, and excludes a non-filterable subset of the first set of data fields;
(g) each second-level binary string segment encodes a subrange of data strings in a selected filterable subset of the second set of data fields of a corresponding one of the second-level subsets of the data records, and excludes a non-filterable subset of the second set of data fields;
(h) for a clumped set of the defined data fields, which clumped set excludes data fields of the first and second sets, each combination of specific data strings that occurs in the dataset is indicated by a corresponding one of the plurality of clump data records of the clump header table;
(i) each clump data record in the clump header table includes an indicator of a location in the inline tree data structure of a corresponding first-level binary string segment;
(j) each of the one or more auxiliary data structures comprises electronic indicia of a corresponding auxiliary set of data fields, which auxiliary set of data fields comprises (1) one or more of the defined data fields or (2) one or more additional data fields that are not among the defined data fields; and
(k) the electronic indicia of each one of the one or more auxiliary data structures comprise a corresponding set of auxiliary binary string segments, a corresponding auxiliary inline tree data structure, or a corresponding set of auxiliary alphanumeric string segments.
- View Dependent Claims (14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24)
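As a reading aid only, and not as part of the claim language, the generating and storing steps (a) and (b) of claim 13 might look roughly like the Python sketch below. The record field names (`region`, `household`, `size`, `age`, `name`, `flag`), the byte layouts, the file names, and the functions `generate` and `store` are all hypothetical choices for the example; the claims do not prescribe any of them.

```python
import json
import struct
from itertools import groupby

# Hypothetical fixed-width layouts (assumptions for this sketch):
FIRST = struct.Struct("<BH")   # first-level segment: field value, number of children
SECOND = struct.Struct("<B")   # second-level segment: field value

def generate(records):
    """Build (clump_table, inline_tree, aux) from flat alphanumeric records.

    Assumed roles: 'region' is the clumped field, 'household' groups records
    at the first level with filterable field 'size', 'age' is the filterable
    second-level field, 'name' is non-filterable and is left out of the tree,
    and 'flag' goes into a separate auxiliary structure.
    """
    records = sorted(records, key=lambda r: (r["region"], r["household"]))
    tree, aux, table = bytearray(), bytearray(), []
    for region, in_region in groupby(records, key=lambda r: r["region"]):
        # One clump data record per clumped value, pointing into the tree.
        entry = {"region": region, "offset": len(tree), "n_first": 0}
        for _, members in groupby(in_region, key=lambda r: r["household"]):
            members = list(members)
            tree += FIRST.pack(int(members[0]["size"]), len(members))
            for r in members:
                tree += SECOND.pack(int(r["age"]))   # 'name' is not encoded
                aux.append(int(r["flag"]))
            entry["n_first"] += 1
        table.append(entry)
    return table, bytes(tree), bytes(aux)

def store(path_prefix, table, tree, aux):
    """Write the three structures to storage (file names are assumptions)."""
    with open(path_prefix + ".clumps.json", "w") as f:
        json.dump(table, f)
    with open(path_prefix + ".tree.bin", "wb") as f:
        f.write(tree)
    with open(path_prefix + ".aux.bin", "wb") as f:
        f.write(aux)
```

The one-byte field widths limit this toy encoding to values from 0 to 255; a real encoding would pick widths to suit the subranges being represented.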
Specification