System for iterated generation from an array of records of a posting file with row segments based on column entry value ranges
Abstract
A query processing system for processing queries in connection with a document text base which has entries each identifying a document and a word in the document. The query processing system includes a plurality of processing elements for processing data in response to commands, and a control arrangement for controlling the processing elements in parallel. The control arrangement first enables the processing elements to generate a segmented posting file having entries, at least some of which have a word identifier and a document identifier. The entries form an array all of whose entries with the same document identifier are contained within one column. The rows of the segmented posting file are aggregated into segments each having a selected number of rows with each segment containing entries having word identifiers within an identified word identifier range. Thereafter, the control arrangement enables the processing elements to use the segmented posting file to process, in parallel, a query in a series of iterations each with respect to a query word. In each iteration, the processing elements receive respective portions of columns comprising a segment of the segmented posting file associated with the word identifier range containing the query word, then identify entries in the segment whose word identifiers correspond to the query word, and finally modify a score maintained for the document identified in the identified entry. Those documents which have a selected score at the end of the series of iterations have the required relationship to the query.
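The query phase described in the abstract can be sketched as follows. This is a minimal, sequential illustration and not the patented implementation: the `SEGMENTS` layout, the function names, the choice to increment the score by one per match, and the "score at least `required_score`" selection rule are all assumptions made for the example. The abstract says only that a score is "modified" and that documents with a "selected score" qualify.

```python
from collections import defaultdict

# Hypothetical layout: each segment is (low, high, rows). Every row has one
# slot per column, holding (word_id, doc_id) or None for an empty slot.
# All entries for a given document live in a single column.
SEGMENTS = [
    (1, 4, [
        [(1, "d0"), (2, "d1"), (1, "d2")],
        [(3, "d0"), (4, "d1"), None],
    ]),
    (5, 9, [
        [(5, "d0"), (7, "d1"), (5, "d2")],
        [(9, "d0"), None, (8, "d2")],
    ]),
]


def find_segment(segments, word_id):
    """Return the rows of the segment whose word range contains word_id."""
    for low, high, rows in segments:
        if low <= word_id <= high:
            return rows
    return None


def score_documents(segments, query_word_ids):
    """Run one iteration per query word and bump each matching document."""
    scores = defaultdict(int)
    for word_id in query_word_ids:
        rows = find_segment(segments, word_id)
        if rows is None:
            continue
        # In the described system each column portion would go to its own
        # processing element; here the columns are simply visited in turn.
        for col in range(len(rows[0])):
            for row in rows:
                entry = row[col]
                if entry is not None and entry[0] == word_id:
                    scores[entry[1]] += 1  # assumed scoring rule
    return dict(scores)


def select_documents(scores, required_score):
    """Assumed selection rule: keep documents scoring at least required_score."""
    return sorted(doc for doc, s in scores.items() if s >= required_score)
```

Because a document's entries never leave its column, each column's score updates touch a disjoint set of documents, which is what would let the per-column work run in parallel without coordination.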
46 Claims
1. A segmented posting file generating system for generating a segmented posting file in response to a record base comprising a plurality of record base entries represented by an array comprising a plurality of columns and rows, the entries within each column having an order according to their respective word identifiers, the segmented posting file generating system generating said segmented posting file in a series of segment generation iterations, the segmented posting file generating system comprising:
A. a computer for performing processing operations in response to commands;
B. a segmented posting file generation control including:
i. a segment word value identifier determination portion for providing commands to enable said computer to
(a) select, for each segment generation iteration, an entry in each column of said record base as a segment word value determiner entry, and
(b) identify as a segment word value identifier range value a value having a selected value relation to the word identifier values in the segment word value determiner entries; and
ii. a segment establishment portion for, during each segment generation iteration, providing commands to enable said computer to generate a series of rows of the segmented posting file in each of a series of segment row generation iterations, the commands enabling the computer to, for each column of the record base,
(a) determine whether a record base entry of the column of the record base contains a word identifier having a value having a selected relation to the segment word identifier determination value, and
(b) in response to a positive determination, copy the entry in the record base to the entry in a corresponding column of the segmented posting file and select the next entry of the record base for the next segment row generation iteration.
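The iterated generation recited in claim 1 can be sketched as below. The claim leaves the "selected value relation" open, so this example picks one plausible reading and labels it as an assumption: the determiner entry in each column is the last entry that could still fit in a segment of `rows_per_segment` rows, the range value is the smallest determiner word identifier across columns, and an entry is copied when its word identifier is less than or equal to that range value. The function name, the `(word_id, doc_id)` tuples, and the requirement that word identifiers strictly increase within a column are likewise illustrative.

```python
import math


def generate_segmented_posting_file(columns, rows_per_segment):
    """Build segments from per-column entry lists.

    columns: list of lists of (word_id, doc_id), each list sorted by
    strictly increasing word_id (an assumption of this sketch).
    Returns a list of (range_value, rows); each row has one slot per column.
    """
    pointers = [0] * len(columns)  # next unconsumed entry in each column
    segments = []
    while any(p < len(col) for p, col in zip(pointers, columns)):
        # (i)(a) Determiner entry per column: the last entry that would
        # still fit in this segment, or "no limit" if the column is short.
        determiners = []
        for p, col in zip(pointers, columns):
            last = p + rows_per_segment - 1
            determiners.append(col[last][0] if last < len(col) else math.inf)
        # (i)(b) Range value: the smallest determiner word identifier.
        range_value = min(determiners)

        # (ii) One segment row generation iteration per row of the segment.
        rows = []
        for _ in range(rows_per_segment):
            row = []
            for c, col in enumerate(columns):
                p = pointers[c]
                if p < len(col) and col[p][0] <= range_value:
                    row.append(col[p])   # copy the entry into the segment
                    pointers[c] = p + 1  # select the next entry
                else:
                    row.append(None)     # leave this slot empty
            rows.append(row)
        segments.append((range_value, rows))
    return segments
```

Under these assumptions every column has at most `rows_per_segment` entries at or below the range value, so each segment holds all entries in its word identifier range and no entry is split across segments.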
4. A query processing system for processing queries in connection with a record base including a plurality of record base entries each of which includes a record identifier and a word identifier, each query containing at least one query word, the query processing system identifying records having a selected relationship between the query and the word identifiers contained in the record base, said query processing system comprising:
A. a plurality of processing elements, each for performing processing operations in response to commands;
B. a control arrangement including:
i. a segmented posting file generation control portion for providing commands to enable said processing elements to generate, in response to said record base, a segmented posting file, said segmented posting file having a plurality of segmented posting file entries, at least some of said segmented posting file entries having a word identifier and a record identifier, said segmented posting file entries being represented by an array comprising a plurality of columns and rows, said rows being aggregated into segments each having a selected number of rows with each segment containing entries having word identifiers within an identified word identifier range, the segmented posting file generation control portion generating commands so as to enable said processing elements to generate the columns of said segmented posting file in parallel; and
ii. a query processing control portion for providing, in a series of iterations each with respect to a query word in the query, commands to enable said processing elements to, in parallel:
(a) receive respective portions of columns comprising a segment of the segmented posting file associated with the word identifier range containing the query word, each processing element receiving said portion of one of said columns,
(b) identify entries in the segment whose word identifiers correspond to the query word, and
(c) modify a score maintained for the record identified in the identified entry,
the records having a selected score at the end of the series of iterations being determined to have the selected relationship to the query.
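Claim 4 refers to the segment "associated with the word identifier range containing the query word" but does not say how that segment is located. One simple possibility, shown here purely as an assumption, is to keep an ascending list of each segment's upper word identifier bound and binary-search it. The name `segment_index_for` and the `upper_bounds` list are hypothetical.

```python
from bisect import bisect_left


def segment_index_for(upper_bounds, word_id):
    """Return the index of the segment whose range contains word_id.

    upper_bounds: ascending list holding the largest word identifier of
    each segment, so segment i covers (upper_bounds[i-1], upper_bounds[i]].
    Returns None when word_id lies beyond the last segment.
    """
    i = bisect_left(upper_bounds, word_id)
    return i if i < len(upper_bounds) else None
```

With upper bounds `[3, 7, 12]`, a query word with identifier 4 falls in segment 1, and only that segment's rows would need to be delivered to the processing elements for that iteration.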
14. A segmented posting file generating system for generating a segmented posting file in response to a record base comprising a plurality of record base entries represented by an array comprising a plurality of columns and rows, the entries within each column having an order according to their respective word identifiers, the segmented posting file generating system enabling said processing elements to establish the segmented posting file in response to the record base in a series of segment generation iterations, the segmented posting file generating system comprising:
A. a plurality of processing elements, each for performing processing operations in response to commands;
B. a segmented posting file generation control including:
i. a segment word value identifier determination portion for providing commands to enable said processing elements to, in parallel,
(a) select an entry in said record base as a segment word value determiner entry, and
(b) identify as a segment word value identifier range value a value having a selected value relation to the word identifier values in the segment word value determiner entries of all of said processing elements; and
ii. a segment establishment portion for, during each of said segment generation iterations, providing commands to enable said processing elements to, in parallel, generate a series of rows of the segmented posting file in each of a series of segment row generation iterations, the commands enabling the processing elements to
(a) determine whether a record base entry of the record base contains a word identifier having a value having a selected value relation to the segment word value identifier determination value, and
(b) in response to a positive determination, copy the record base entry to the segmented posting file entry in the respective column of the segmented posting file and select the next entry of the record base for the next segment row generation iteration.
17. A control arrangement for generating commands for controlling a plurality of processing elements to facilitate the processing of queries in connection with a record base including a plurality of record base entries each of which includes a record identifier and a word identifier, each query containing at least one query word, the query processing system identifying records having a selected relationship between the query and the word identifiers contained in the record base, said control arrangement including:
A. a segmented posting file generation control portion for providing commands to enable said processing elements to generate, in response to said record base, a segmented posting file, said segmented posting file having a plurality of segmented posting file entries, at least some of said segmented posting file entries having a word identifier and a record identifier, said segmented posting file entries being represented by an array comprising a plurality of columns and rows, said rows being aggregated into segments each having a selected number of rows with each segment containing segmented posting file entries having word identifiers within an identified word identifier range, the segmented posting file generation control portion generating commands so as to enable said processing elements to generate the columns of said segmented posting file in parallel; and
B. a query processing control portion for providing, in a series of iterations each with respect to a query word in the query, commands to enable said processing elements to, in parallel:
(i) receive respective portions of columns comprising a segment of the segmented posting file associated with the word identifier range containing the query word, each processing element receiving said portion of one of said columns,
(ii) identify entries in the segment whose word identifiers correspond to the query word, and
(iii) modify a score maintained for the record identified in the identified entry,
records having a selected score at the end of the series of iterations being determined to have the selected relationship to the query.
26. A method of controlling a computer to process queries in connection with a record base including a plurality of record base entries each of which includes a record identifier and a word identifier, each query containing at least one query word, the query processing system identifying records having a selected relationship between the query and the word identifiers contained in the record base, said method comprising:
A. a segmented posting file generation step in which commands are provided to enable said computer to generate, in response to said record base, a segmented posting file, said segmented posting file having a plurality of segmented posting file entries, at least some of said segmented posting file entries having a word identifier and a record identifier, said segmented posting file entries being represented by an array comprising a plurality of columns and rows, said rows being aggregated into segments each having a selected number of rows with each segment containing entries having word identifiers within an identified word identifier range; and
B. a query processing step in which, in a series of iterations each with respect to a query word in the query, commands are provided to enable said computer to:
(i) receive respective portions of columns comprising a segment of the segmented posting file associated with the word identifier range containing the query word,
(ii) identify entries in the segment whose word identifiers correspond to the query word, and
(iii) modify a score maintained for the record identified in the identified entry,
records having a selected score at the end of the series of iterations being determined to have the selected relationship to the query.
34. A method of controlling a computer to enable the generation of a segmented posting file in response to a record base comprising a plurality of record base entries represented by an array comprising a plurality of columns and rows, the entries within each column having an order according to their respective word identifiers, the method enabling said computer to establish the segmented posting file in response to the record base in a series of segment generation iterations, during each segment generation iteration the method including:
A. a segment word value identifier determination step during which commands are provided to enable said computer to
i. select for the segment generation iteration a record base entry in said record base as a segment word value determiner entry, and
ii. identify as a segment word value identifier range value a value having a selected value relation to the word identifier values in the segment word value determiner entries; and
B. a segment establishment step during which commands are provided to enable said computer to, in a series of segment row generation iterations, establish a series of rows thereby to form a segment of said segmented posting file, for each row of the segment the commands enabling said computer to
i. determine, for each column of the record base, whether a record base entry of the record base contains a word identifier having a value having a selected value relation to the segment word value identifier determination value, and
ii. in response to a positive determination, copy the record base entry to a segmented posting file entry in the associated column of the segmented posting file and select the next record base entry of the record base for the next segment row generation iteration.
37. A query processing system for processing queries in connection with a record base including a plurality of record base entries each of which includes a record identifier and a word identifier, each query containing at least one query word, the query processing system identifying records having a selected relationship between the query and the word identifiers contained in the record base, said query processing system comprising:
A. a computer for performing processing operations in response to commands;
B. a control arrangement including:
i. a segmented posting file generation control portion for providing commands to enable said computer to generate, in response to said record base, a segmented posting file, said segmented posting file having a plurality of segmented posting file entries, at least some of said segmented posting file entries having a word identifier and a record identifier, said segmented posting file entries being represented by an array comprising a plurality of columns and rows, said rows being aggregated into segments each having a selected number of rows with each segment containing entries having word identifiers within an identified word identifier range; and
ii. a query processing control portion for providing, in a series of iterations each with respect to a query word in the query, commands to:
(a) enable said computer to receive respective portions of columns comprising a segment of the segmented posting file associated with the word identifier range containing the query word,
(b) identify entries in the segment whose word identifiers correspond to the query word, and
(c) modify a score maintained for the record identified in the identified entry,
records having a selected score at the end of the series of iterations being determined to have the selected relationship to the query.
Specification