Systems and methods for authoritativeness grading, estimation and sorting of documents in large heterogeneous document collections
Abstract
Systems and methods for determining the authoritativeness of a document based on textual, non-topical cues. The authoritativeness of a document is determined by evaluating a set of document content features contained within each document to determine a set of document content feature values, processing the set of document content feature values through a trained document textual authority model, and determining a textual authoritativeness value and/or textual authority class for each document evaluated using the predictive models included in the trained document textual authority model. Estimates of a document's textual authoritativeness value and/or textual authority class can be used to re-rank documents previously retrieved by a search, to expand and improve document query searches, to provide a more complete and robust determination of a document's authoritativeness, and to improve the aggregation of rank-ordered lists with numerically-ordered lists.
78 Citations
27 Claims
1. A method for re-ranking a set of relevant documents identified following a search, the method comprising:
determining a set of document content feature values for each relevant document identified;
determining one or more of a textual authoritativeness value or a textual authority class for each relevant document using a trained document textual authority model based on the determined set of document content feature values; and
rearranging the set of relevant documents in an order selected using at least the determined textual authoritativeness value or the textual authority class for the documents to re-rank the set of relevant documents.
Dependent claims: 2, 3, 4, 5, 6
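The three steps of claim 1 (feature values, model score, reordering) can be illustrated with a minimal sketch. The feature names, the `WEIGHTS` table, and the linear scorer are all hypothetical stand-ins chosen for illustration; the claim does not specify them, and a real trained textual authority model would replace `authority_value`.

```python
from typing import Dict, List

def content_features(text: str) -> Dict[str, float]:
    """Hypothetical document content features (illustrative, not from the patent)."""
    words = text.split()
    n = max(len(words), 1)
    return {
        "length": float(len(words)),
        "avg_word_len": sum(len(w) for w in words) / n,
        "exclamations": float(text.count("!")),
    }

# Assumed stand-in for a trained model: a fixed linear scorer.
WEIGHTS = {"length": 0.01, "avg_word_len": 0.5, "exclamations": -1.0}

def authority_value(features: Dict[str, float]) -> float:
    """Map feature values to a single textual authoritativeness value."""
    return sum(WEIGHTS[name] * value for name, value in features.items())

def rerank(docs: List[str]) -> List[str]:
    """Reorder already-retrieved documents by descending authority value."""
    return sorted(docs, key=lambda d: authority_value(content_features(d)), reverse=True)

docs = [
    "Buy now!!! Best deal!!!",
    "The clinical trial enrolled patients across several hospitals.",
]
reranked = rerank(docs)
```

Under these assumed weights, the second document moves ahead of the first because the exclamation-mark penalty outweighs the other features.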
7. A method for determining the authoritativeness of a document having a plurality of document content features, the method comprising:
selecting a predetermined number of top-ordered documents identified following a topic search of a large document collection;
evaluating a link structure of each top-ordered document;
determining one or more of a textual authoritativeness value or a textual authority class of each top-ordered document; and
determining a weighted social authority rank for each top-ordered document based on the one or more of a textual authoritativeness value or a textual authority class of each top-ordered document.
Dependent claims: 8, 9, 10, 11, 12, 13
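One possible reading of claim 7 is sketched below: each in-link among the top-ordered documents counts as a vote, and the vote is weighted by the textual authoritativeness of the linking document. This weighting scheme is an assumption made for illustration; the claim does not state how link structure and textual authority are combined, and the function and variable names are hypothetical.

```python
from typing import Dict, List

def weighted_social_authority(
    links: Dict[str, List[str]],
    textual_authority: Dict[str, float],
) -> Dict[str, float]:
    """Score each top-ordered document by its in-links, weighting every
    in-link by the linking document's textual authority (assumed scheme)."""
    scores = {doc: 0.0 for doc in textual_authority}
    for src, targets in links.items():
        if src not in textual_authority:
            continue  # ignore links from outside the top-ordered set
        for tgt in set(targets):  # count each source at most once per target
            if tgt in scores and tgt != src:
                scores[tgt] += textual_authority[src]
    return scores

# Hypothetical link structure among three top-ordered documents.
links = {"a": ["c"], "b": ["c", "a"], "c": []}
textual_authority = {"a": 0.9, "b": 0.2, "c": 0.5}
scores = weighted_social_authority(links, textual_authority)
ranking = sorted(scores, key=lambda d: -scores[d])
```

Here document `c` ranks first because it is linked from both `a` and `b`, and the link from the highly authoritative `a` carries most of the weight.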
14. A method for expanding a search query based on a textual authoritativeness of a document, the method comprising:
identifying a first set of relevant documents using an initial set of query terms;
determining a textual authoritativeness value for each document of the first set of relevant documents;
identifying a second set of relevant documents from the first set of relevant documents based on the textual authoritativeness values determined for at least some of the first set of documents;
defining a candidate set of query expansion terms from the second set of relevant documents; and
selecting at least one query expansion term from the candidate set of query expansion terms.
Dependent claims: 15, 16, 17, 18, 19
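The five steps of claim 14 can be sketched as follows. The authority threshold, the use of document frequency to rank candidate terms, and all names are assumptions for illustration; the claim leaves the selection criteria open.

```python
from collections import Counter
from typing import Dict, List

def expand_query(
    query_terms: List[str],
    first_set: Dict[int, str],      # doc id -> text, retrieved with the initial query
    authority: Dict[int, float],    # doc id -> textual authoritativeness value
    threshold: float = 0.5,         # assumed cut-off for the second set
    k: int = 2,                     # number of expansion terms to select
) -> List[str]:
    # Second set: documents from the first set with high textual authority.
    second_set = [text for doc_id, text in first_set.items()
                  if authority[doc_id] >= threshold]
    # Candidate terms: count how many second-set documents contain each term.
    counts: Counter = Counter()
    for text in second_set:
        counts.update(set(text.lower().split()))
    existing = {t.lower() for t in query_terms}
    candidates = [(term, c) for term, c in counts.items() if term not in existing]
    # Select the k most widespread terms, breaking ties alphabetically.
    candidates.sort(key=lambda tc: (-tc[1], tc[0]))
    return list(query_terms) + [term for term, _ in candidates[:k]]

first_set = {
    1: "aspirin dosage guidelines",
    2: "aspirin dosage trial",
    3: "cheap aspirin pills",
}
authority = {1: 0.9, 2: 0.8, 3: 0.1}
expanded = expand_query(["aspirin"], first_set, authority, k=1)
```

Document 3 falls below the assumed threshold, so its terms never enter the candidate set, and `dosage` is selected because it appears in both remaining documents.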
20. A method of combining at least two sets of rank orderings to produce an aggregate set ordering that is closest in some distance to each of the at least two sets of rank orderings, the method comprising:
determining a first set rank ordering of relevant documents;
determining a textual authoritativeness for each document in the first set rank ordering of relevant documents;
determining a second set rank ordering of relevant documents from the first set rank ordering based on the textual authoritativeness determined for at least some of the first set of documents; and
combining the first set rank ordering of relevant documents and the second set rank ordering of relevant documents using a rank aggregation algorithm model or method.
Dependent claims: 21, 22, 23
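The combination step of claim 20 can be illustrated with a Borda count, a simple and well-known rank aggregation method. It is used here only as a stand-in: the claim does not name a specific algorithm or distance, and the document ids and authority values below are hypothetical.

```python
from typing import Dict, List

def borda_aggregate(orderings: List[List[str]]) -> List[str]:
    """Combine rank orderings with a Borda count: a document at position p
    in a list of n items earns n - p points; higher totals rank first."""
    scores: Dict[str, int] = {}
    for ordering in orderings:
        n = len(ordering)
        for position, doc in enumerate(ordering):
            scores[doc] = scores.get(doc, 0) + (n - position)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

# First ordering: relevance order returned by the search (hypothetical).
first = ["d1", "d2", "d3"]
# Second ordering: the same documents sorted by textual authoritativeness.
authority = {"d1": 0.2, "d2": 0.9, "d3": 0.5}
second = sorted(first, key=lambda d: -authority[d])
aggregate = borda_aggregate([first, second])
```

In this example `d2` leads the aggregate ordering because it is second by relevance and first by authority, giving it the highest combined score.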
24. A method for simultaneously estimating an aggregate rank and aggregate weights to be assigned to two or more ranked document lists or document rank orderings, the method comprising:
determining a first set rank ordering of relevant documents;
determining a textual authoritativeness value for each document in the first set rank ordering of relevant documents;
determining a second set rank ordering of relevant documents from the first set rank ordering based on the textual authoritativeness values determined for at least some of the first set of documents; and
combining the first set rank ordering of relevant documents and the second set rank ordering of relevant documents using a rank aggregation algorithm model or method.
Dependent claims: 25, 26, 27
Specification