EFFICIENT MULTIFACETED SEARCH IN INFORMATION RETRIEVAL SYSTEMS

  • US 20080133473A1
  • Filed: 11/30/2006
  • Published: 06/05/2008
  • Est. Priority Date: 11/30/2006
  • Status: Active Grant
First Claim

1. A computer-implemented method of querying multifaceted information in an information retrieval system, comprising:

  • constructing, by said information retrieval (IR) system, an inverted index having a plurality of unique indexed tokens associated with a plurality of posting lists in a one-to-one correspondence, each posting list including one or more documents of a plurality of documents, wherein an indexed token of said plurality of unique indexed tokens is one of a facet token included as an annotation in a document of said plurality of documents and a path prefix of said facet token, wherein said annotation indicates a path within a tree structure representing a facet that includes said document, said tree structure including a plurality of nodes representing a category and one or more sub-categories that categorize said document, wherein said constructing said inverted index includes:

    generating a full path token and a full path token posting list associated therewith by said inverted index, said full path token posting list including a plurality of identifiers representing said plurality of documents, wherein an identifier of said plurality of identifiers represents said document and includes a payload value, said payload value identifying a full path of said document in said tree structure, and said payload value including a set of full path indicators provided by a Dewey labeling scheme that uniquely labels each sibling node of said tree structure;

    receiving, by said IR system, a query that includes a plurality of constraints on said plurality of documents, said plurality of constraints being associated with multiple indexed tokens of said plurality of unique indexed tokens and multiple posting lists corresponding to said multiple indexed tokens, wherein said plurality of constraints includes one or more facet constraints and one or more free-text constraints; and

    executing said query by said IR system, said executing including:

    identifying said multiple posting lists via a utilization of said plurality of constraints and said inverted index, and

    intersecting said multiple posting lists to obtain a result of said query,

    wherein said identifying said multiple posting lists includes:

    identifying, via said inverted index, a first set of one or more indexed tokens associated with said one or more facet constraints in a one-to-one correspondence and a second set of one or more indexed tokens associated with said one or more free-text constraints in a one-to-one correspondence, said first set and said second set of one or more indexed tokens included in said plurality of unique indexed tokens, and

    identifying, via said inverted index, a first group of one or more posting lists and a second group of one or more posting lists, said one or more posting lists of said first group associated with said one or more indexed tokens of said first set in a one-to-one correspondence and said one or more posting lists of said second group associated with said one or more indexed tokens of said second set in a one-to-one correspondence.
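
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes facet paths are written as slash-separated strings, that free text is tokenized on whitespace, and that posting lists can be modeled as in-memory sets. The names FacetIndex, path_prefixes, add, and search, and the "facet:" and "text:" token prefixes, are hypothetical and chosen only for this example. The payload is simplified to a tuple of Dewey-style sibling ordinals stored per document, rather than inside a dedicated full path token posting list.

```python
from collections import defaultdict


def path_prefixes(path):
    """Yield every prefix of a facet path: 'a/b/c' -> 'a', 'a/b', 'a/b/c'."""
    parts = path.split("/")
    for i in range(1, len(parts) + 1):
        yield "/".join(parts[:i])


class FacetIndex:
    """Toy inverted index over free-text tokens and facet path tokens."""

    def __init__(self):
        self.postings = defaultdict(set)    # indexed token -> set of document ids
        self.children = defaultdict(list)   # parent path -> child names in first-seen order
        self.payloads = defaultdict(list)   # document id -> Dewey-style labels of its full paths

    def _dewey(self, path):
        """Label a full path by the 1-based ordinal of each node among its siblings."""
        label, parent = [], ""
        for part in path.split("/"):
            siblings = self.children[parent]
            if part not in siblings:
                siblings.append(part)
            label.append(siblings.index(part) + 1)
            parent = f"{parent}/{part}" if parent else part
        return tuple(label)

    def add(self, doc_id, text, facet_paths):
        """Index a document under its words, its facet paths, and every path prefix."""
        for word in text.lower().split():
            self.postings["text:" + word].add(doc_id)
        for path in facet_paths:
            for prefix in path_prefixes(path):
                self.postings["facet:" + prefix].add(doc_id)
            self.payloads[doc_id].append(self._dewey(path))

    def search(self, facet_constraints, text_constraints):
        """Map each constraint to one token, fetch its posting list, and intersect."""
        tokens = ["facet:" + p for p in facet_constraints]
        tokens += ["text:" + w.lower() for w in text_constraints]
        lists = [self.postings.get(t, set()) for t in tokens]
        if not lists:
            return set()
        lists.sort(key=len)                 # intersect starting from the shortest list
        result = set(lists[0])
        for posting in lists[1:]:
            result &= posting
        return result


idx = FacetIndex()
idx.add(1, "compact digital camera", ["electronics/cameras/digital"])
idx.add(2, "smart phone with camera", ["electronics/phones"])
idx.add(3, "film camera", ["electronics/cameras/film"])

# One facet constraint and one free-text constraint, resolved by intersection.
print(sorted(idx.search(["electronics/cameras"], ["camera"])))  # [1, 3]
print(idx.payloads[3])  # [(1, 1, 2)]
```

Because every path prefix is indexed as its own token, a constraint on a category such as electronics/cameras resolves to a single posting list lookup, and the facet and free-text constraints are combined by the same intersection step.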
