Real-time search of vertically partitioned, inverted indexes

US 9,152,697 B2
Filed: 07/13/2011
Issued: 10/06/2015
Est. Priority Date: 07/13/2011
Status: Active Grant

Abstract

Provided are techniques for processing a query. A query including constraints for at least two vertically partitioned, inverted indexes is received. The constraints in the query are separated based on the vertically partitioned, inverted indexes. A document identifier iterator is obtained for each of the constraints, wherein each document identifier iterator is associated with a posting list, and wherein each posting list is ordered by document identifier order. A run-time join of the posting lists is performed to obtain a final result set.
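
The abstract's run-time join can be sketched with document identifier iterators over sorted posting lists. This is a minimal illustration, not the patented implementation: the class name `DocIdIterator`, the function `runtime_join`, and the choice to keep only document identifiers present in every posting list are assumptions made for the example.

```python
from typing import List, Optional


class DocIdIterator:
    """Cursor over one posting list sorted by document identifier (hypothetical helper)."""

    def __init__(self, postings: List[int]) -> None:
        self._postings = postings
        self._pos = 0

    def current(self) -> Optional[int]:
        """Return the doc ID under the cursor, or None when exhausted."""
        if self._pos < len(self._postings):
            return self._postings[self._pos]
        return None

    def step(self) -> Optional[int]:
        """Move to the next posting."""
        self._pos += 1
        return self.current()

    def advance_to(self, target: int) -> Optional[int]:
        """Move forward to the first doc ID >= target."""
        while self._pos < len(self._postings) and self._postings[self._pos] < target:
            self._pos += 1
        return self.current()


def runtime_join(iterators: List[DocIdIterator]) -> List[int]:
    """Join posting lists at query time by walking all iterators in doc ID order.

    Assumption: each posting list holds unique doc IDs in ascending order, and
    the join keeps a doc ID only if every list contains it.
    """
    results: List[int] = []
    if not iterators:
        return results
    while True:
        currents = [it.current() for it in iterators]
        if any(c is None for c in currents):
            return results  # one list is exhausted, so no further matches exist
        high = max(currents)
        if all(c == high for c in currents):
            results.append(high)
            for it in iterators:
                it.step()
        else:
            # Skip every cursor forward to the largest doc ID seen so far.
            for it in iterators:
                it.advance_to(high)


matches = runtime_join([DocIdIterator([1, 3, 5, 8]), DocIdIterator([3, 4, 8, 9])])
print(matches)
```

Because every list is ordered by document identifier, each cursor only moves forward, so the join makes a single pass over each posting list.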

36 Citations

12 Claims

1. A computer system for processing a query, comprising:
- a processor; and
  
  a storage device connected to the processor, wherein the storage device has stored thereon a program, and wherein the processor is configured to execute instructions of the program to perform operations, wherein the operations comprise:
  
  receiving the query that includes a document constraint and an annotation constraint;
  
  parsing the query to separate the document constraint from the annotation constraint and to create a query parse tree with a primary query processor at a lowest level of the query parse tree and with an auxiliary query processor at the lowest level of the query parse tree;
  
  processing the document constraint with the primary query processor to generate a first posting list that is ordered by document identifier;
  
  processing the annotation constraint with the auxiliary query processor to generate a second posting list that is ordered by annotation identifier and that includes the document identifier associated with each annotation that is identified by the annotation identifier and that is re-ordered by the document identifier;
  
  evaluating the query parse tree with the primary query processor and with the auxiliary query processor by iterating through the first posting list and the second posting list; and
  
  performing a run-time join of the first posting list and the second posting list to obtain a final result set that is returned in response that combines documents and annotations that have a same document identifier with a union operation.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The computer system of claim 1, wherein the operations further comprise:
    - mapping the document constraint to the primary query processor and mapping the annotation constraint to an auxiliary query processor.
  - 3. The computer system of claim 1, wherein the operations further comprise:
    - storing document identifiers in the primary index storing document data; and
      
      at index construction time, storing the document identifiers in the auxiliary index storing annotation data, wherein the document identifiers comprise foreign keys.
  - 4. The computer system of claim 1, wherein a text document is categorized in a category, and wherein the category is stored in the auxiliary index for search and analysis in real-time.
  - 5. The computer system of claim 1, wherein the operations further comprise:
    - creating vertically partitioned, inverted indexes, with each of the vertically partitioned, inverted indexes associated with one or more constraints.
  - 6. The computer system of claim 1, wherein the operations further comprise:
    - returning the final result set in response to the received query that includes the document constraint and the annotation constraint.
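
The operations recited in claim 1 can be sketched end to end as follows. This is an illustrative reading under stated assumptions, not the claimed system: the toy index contents, the `doc=... AND ann=...` query syntax, the dictionary used as a parse tree, and all function names are hypothetical. The join step is interpreted here as grouping each matching document with the annotations that share its document identifier.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data. Primary index: term -> document IDs in ascending order.
PRIMARY_INDEX: Dict[str, List[int]] = {"engine": [1, 2, 4], "brake": [2, 3]}
# Auxiliary index: annotation value -> (annotation_id, document_id) pairs,
# stored in annotation ID order.
AUXILIARY_INDEX: Dict[str, List[Tuple[int, int]]] = {
    "category:recall": [(10, 4), (11, 1), (12, 4)],
}


def parse_query(query: str) -> dict:
    """Separate the document constraint from the annotation constraint.

    Returns a small parse tree whose two leaves name the processor that
    handles each constraint (assumed query syntax: 'doc=<term> AND ann=<value>').
    """
    constraints = {}
    for part in query.split(" AND "):
        field, _, value = part.strip().partition("=")
        constraints[field] = value
    return {
        "op": "JOIN",
        "leaves": [
            {"processor": "primary", "constraint": constraints.get("doc")},
            {"processor": "auxiliary", "constraint": constraints.get("ann")},
        ],
    }


def primary_processor(term: str) -> List[int]:
    """First posting list: document IDs in ascending order."""
    return PRIMARY_INDEX.get(term, [])


def auxiliary_processor(value: str) -> List[Tuple[int, int]]:
    """Second posting list: annotation postings re-ordered by document ID."""
    postings = AUXILIARY_INDEX.get(value, [])
    return sorted(postings, key=lambda p: (p[1], p[0]))


def join(doc_postings: List[int], ann_postings: List[Tuple[int, int]]) -> List[dict]:
    """Merge both lists in doc ID order, combining documents with their annotations."""
    results = []
    i = j = 0
    while i < len(doc_postings) and j < len(ann_postings):
        doc_id, ann_doc_id = doc_postings[i], ann_postings[j][1]
        if doc_id < ann_doc_id:
            i += 1
        elif doc_id > ann_doc_id:
            j += 1
        else:
            ann_ids = []
            while j < len(ann_postings) and ann_postings[j][1] == doc_id:
                ann_ids.append(ann_postings[j][0])
                j += 1
            results.append({"doc_id": doc_id, "annotations": ann_ids})
            i += 1
    return results


def process_query(query: str) -> List[dict]:
    """Evaluate the parse tree leaves, then join their posting lists."""
    tree = parse_query(query)
    doc_leaf, ann_leaf = tree["leaves"]
    return join(
        primary_processor(doc_leaf["constraint"]),
        auxiliary_processor(ann_leaf["constraint"]),
    )


print(process_query("doc=engine AND ann=category:recall"))
```

Re-ordering the auxiliary postings by document identifier is what lets the two lists be merged in one forward pass, even though the auxiliary index stores them in annotation identifier order.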

7. A computer program product for processing a query, the computer program product comprising:
- a non-transitory computer readable storage medium having computer readable program code embodied therewith, wherein the computer readable program code, when executed by a processor of a computer, is configured to perform:
  
  receiving the query that includes a document constraint and an annotation constraint;
  
  parsing the query to separate the document constraint from the annotation constraint and to create a query parse tree with a primary query processor at a lowest level of the query parse tree and with an auxiliary query processor at the lowest level of the query parse tree;
  
  processing the document constraint with the primary query processor to generate a first posting list that is ordered by document identifier;
  
  processing the annotation constraint with the auxiliary query processor to generate a second posting list that is ordered by annotation identifier and that includes the document identifier associated with each annotation that is identified by the annotation identifier and that is re-ordered by the document identifier;
  
  evaluating the query parse tree with the primary query processor and with the auxiliary query processor by iterating through the first posting list and the second posting list; and
  
  performing a run-time join of the first posting list and the second posting list to obtain a final result set that is returned in response that combines documents and annotations that have a same document identifier with a union operation.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The computer program product of claim 7, wherein the computer readable program code, when executed by the processor of the computer, is configured to perform:
    - mapping the document constraint to the primary query processor and mapping the annotation constraint to the auxiliary query processor.
  - 9. The computer program product of claim 7, wherein the computer readable program code, when executed by the processor of the computer, is configured to perform:
    - storing document identifiers in the primary index storing document data; and
      
      at index construction time, storing the document identifiers in the auxiliary index storing annotation data, wherein the document identifiers comprise foreign keys.
  - 10. The computer program product of claim 7, wherein a text document is categorized in a category, and wherein the category is stored in the auxiliary index for search and analysis in real-time.
  - 11. The computer program product of claim 7, wherein the computer readable program code, when executed by the processor of the computer, is configured to perform:
    - creating vertically partitioned, inverted indexes, with each of the vertically partitioned, inverted indexes associated with one or more constraints.
  - 12. The computer program product of claim 7, wherein the computer readable program code, when executed by the processor of the computer, is configured to perform:
    - returning the final result set in response to the received query that includes the document constraint and the annotation constraint.
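
The index construction described in claims 3 to 5 and 9 to 11 (document identifiers stored in the primary index and repeated in the auxiliary index as foreign keys, with categories kept in the auxiliary index) can be sketched as below. The class `PartitionedIndex`, its method names, and the whitespace tokenizer are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class PartitionedIndex:
    """Two vertically partitioned inverted indexes sharing document IDs (hypothetical)."""

    def __init__(self) -> None:
        # Primary index: term -> document IDs.
        self.primary: DefaultDict[str, List[int]] = defaultdict(list)
        # Auxiliary index: category -> (annotation_id, document_id) postings.
        # The document_id acts as a foreign key into the primary index.
        self.auxiliary: DefaultDict[str, List[Tuple[int, int]]] = defaultdict(list)
        self._next_annotation_id = 0

    def add_document(self, doc_id: int, text: str) -> None:
        """Index document text. Assumes documents arrive in ascending doc ID order."""
        for term in sorted(set(text.lower().split())):
            self.primary[term].append(doc_id)

    def add_annotation(self, doc_id: int, category: str) -> int:
        """Record a category for a document without touching the primary index."""
        annotation_id = self._next_annotation_id
        self._next_annotation_id += 1
        self.auxiliary[category].append((annotation_id, doc_id))
        return annotation_id


index = PartitionedIndex()
index.add_document(1, "brake failure")
index.add_document(2, "engine brake")
index.add_annotation(2, "recall")
index.add_annotation(1, "recall")
print(index.primary["brake"], index.auxiliary["recall"])
```

In this sketch a new category only appends to the auxiliary index, so annotations can be added after the document text has been indexed and still be matched to documents through the shared document identifier.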

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Busch, Michael; Desai, Rajesh M.; Foyle, Robert A.; Jayapandian, Magesh
Primary Examiner(s): Morrison, Jay
Assistant Examiner(s): Gortayo, Dangelino N
Application Number: US13/181,891
Publication Number: US 20130018916A1
Time in Patent Office: 1,546 Days
Field of Search: 707/100, 707/102, 707/104, 707/736, 707/741, 707/742, 707/759, 707/763, 707/771, 707/776, 707/779
US Class Current: 1/1
CPC Class Codes:
G06F 16/313   Selection or weighting of t...
G06F 16/319   Inverted lists
G06F 16/3331   Query processing
