Systems and methods for authoritativeness grading, estimation and sorting of documents in large heterogeneous document collections
Abstract
Systems and methods for determining the authoritativeness of a document based on textual, non-topical cues. The authoritativeness of a document is determined by evaluating a set of document content features contained within each document to determine a set of document content feature values, processing the set of document content feature values through a trained document textual authority model, and determining a textual authoritativeness value and/or textual authority class for each document evaluated using the predictive models included in the trained document textual authority model. Estimates of a document's textual authoritativeness value and/or textual authority class can be used to re-rank documents previously retrieved by a search, to expand and improve document query searches, to provide a more complete and robust determination of a document's authoritativeness, and to improve the aggregation of rank-ordered lists with numerically-ordered lists.
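The pipeline summarized in the abstract (content features, then feature values, then a model that yields a value and a class) can be illustrated with a minimal sketch. The three cues, the weights, the bias, and the function names below are hypothetical stand-ins chosen for illustration; the abstract does not say which features or which kind of predictive model the trained document textual authority model uses.

```python
import math
import re


def content_feature_values(text: str) -> dict[str, float]:
    """Compute a few illustrative, non-topical cue values for one document.

    The specific cues are assumptions made for this sketch only.
    """
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    first_person = sum(w.lower() in {"i", "me", "my"} for w in words)
    return {
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "exclamation_rate": text.count("!") / n_words,
        "first_person_rate": first_person / n_words,
    }


# Hypothetical weights standing in for a trained model's parameters.
WEIGHTS = {
    "avg_sentence_len": 0.15,
    "exclamation_rate": -20.0,
    "first_person_rate": -10.0,
}
BIAS = -1.5


def textual_authority(text: str) -> tuple[float, str]:
    """Return (authoritativeness value in (0, 1), authority class label)."""
    feats = content_feature_values(text)
    z = BIAS + sum(WEIGHTS[name] * value for name, value in feats.items())
    value = 1.0 / (1.0 + math.exp(-z))  # logistic squashing (assumption)
    authority_class = "high" if value >= 0.5 else "low"
    return value, authority_class


formal = ("The committee reviewed the clinical evidence and published "
          "detailed guidance for practitioners in the field.")
casual = "I love this! My cat is great! Buy it!"
formal_value, formal_class = textual_authority(formal)
casual_value, casual_class = textual_authority(casual)
```

With these made-up weights the formal sentence lands in the "high" class and the casual one in the "low" class; a real model would learn its parameters from labeled documents.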
111 Citations
30 Claims
1. A method for re-ranking a set of relevant documents identified following a search, the method comprising:
determining a set of document content feature values for each relevant document identified;
generating a textual authoritativeness value and determining a textual authority class for each relevant document using a trained document textual authority model based on the determined set of document content feature values, the trained document textual authority model reclassifying the documents by assigning the document content feature values to a textual authority rank;
rearranging the set of relevant documents in an order selected based on the textual authority rank by using the generated textual authoritativeness value and the determined textual authority class for the documents to rearrange the set of relevant; and
displaying the rearranged set of relevant documents;
wherein the textual authoritativeness value is generated for each document using a document textual authority framework model that considers at least one of an author's background, a targeted audience, an author's institutional affiliation, or whether the document has been reviewed by others.
View Dependent Claims (2, 3, 4, 5, 6)
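The rearranging step of claim 1 can be sketched as a sort over documents that already carry a textual authoritativeness value and a textual authority class. The `ScoredDoc` type, the integer encoding of the class (higher meaning more authoritative), and the choice to order by class first and value second are assumptions for illustration; the claim only requires that both quantities inform the order.

```python
from dataclasses import dataclass


@dataclass
class ScoredDoc:
    doc_id: str
    authority_value: float  # textual authoritativeness value
    authority_class: int    # textual authority class; higher = more authoritative (assumption)


def rerank(docs: list[ScoredDoc]) -> list[ScoredDoc]:
    """Order by authority class, then by authoritativeness value.

    sorted() is stable, so documents that tie on both keys keep the
    order in which the original search returned them.
    """
    return sorted(docs, key=lambda d: (-d.authority_class, -d.authority_value))


search_results = [
    ScoredDoc("a", 0.4, 1),
    ScoredDoc("b", 0.9, 2),
    ScoredDoc("c", 0.7, 1),
    ScoredDoc("d", 0.6, 2),
]
reranked_ids = [d.doc_id for d in rerank(search_results)]
```

Here the two class-2 documents move ahead of the class-1 documents, and within each class the higher value comes first.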
7. A method for determining the authoritativeness of a document having a plurality of document content features, the method comprising:
selecting a predetermined number of top-ordered documents identified following a topic search of a large document collection;
evaluating a link structure of each top-ordered document;
determining a textual authority from a textual authoritativeness value and a textual authority class of each top-ordered document;
determining a weighted social authority rank for each top-ordered document based on the textual authority of each top-ordered document by associating a set of hyper-linked pages with a set of corresponding nodes using an adjacency matrix; and
displaying the weighted social authority ranked documents,
wherein the textual authoritativeness value is generated for each document using a document textual authority framework model that considers at least one of an author's background, a targeted audience, an author's institutional affiliation, and whether the document has been reviewed by others.
View Dependent Claims (8, 9, 10, 11, 12, 13, 28)
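One way to read the weighted social authority step of claim 7 is as a link analysis over an adjacency matrix in which each link is scaled by the textual authority of the page that makes it. The sketch below uses a hub/authority style power iteration under that reading; the weighting scheme, the normalization, and the function name `weighted_authority` are assumptions, since the claim does not spell out the computation.

```python
def weighted_authority(adj: list[list[int]],
                       text_auth: list[float],
                       iterations: int = 50) -> list[float]:
    """Return one link-based authority score per page.

    adj[i][j] == 1 means page i links to page j. Each link is weighted by
    the textual authority of the linking page i (an assumption).
    """
    n = len(adj)
    w = [[adj[i][j] * text_auth[i] for j in range(n)] for i in range(n)]
    auth = [1.0] * n
    hub = [1.0] * n
    for _ in range(iterations):
        # A page is authoritative if good hubs point to it.
        auth = [sum(w[i][j] * hub[i] for i in range(n)) for j in range(n)]
        # A page is a good hub if it points to authoritative pages.
        hub = [sum(w[i][j] * auth[j] for j in range(n)) for i in range(n)]
        a_total = sum(auth) or 1.0
        h_total = sum(hub) or 1.0
        auth = [a / a_total for a in auth]
        hub = [h / h_total for h in hub]
    return auth


# Pages 1 and 2 each receive exactly one link: page 1 from a textually
# authoritative page (0), page 2 from a weak one (3).
adjacency = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
textual_authority = [0.9, 0.5, 0.5, 0.1]
scores = weighted_authority(adjacency, textual_authority)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

Although pages 1 and 2 have the same number of in-links, page 1 ranks first because its in-link comes from the page with the higher textual authority.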
14. A method for expanding a search query based on a textual authoritativeness of a document, the method comprising:
identifying a first set of relevant documents using an initial set of query terms;
generating a textual authoritativeness value for each document of the first set of relevant documents;
identifying a second set of relevant documents from the first set of relevant documents based on the textual authoritativeness values generated for at least some of the first set of documents;
defining a candidate set of query expansion terms from the second set of relevant documents by evaluating and extracting at least one term most frequently present in the second set of relevant documents;
selecting at least one query expansion term from the candidate set of query expansion terms; and
displaying the at least one query expansion term;
wherein the textual authoritativeness value is generated for each document using a document textual authority framework model that considers at least one of an author's background, a targeted audience, an author's institutional affiliation, and whether the document has been reviewed by others.
View Dependent Claims (15, 16, 17, 18, 19)
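The query expansion steps of claim 14 can be sketched as follows, assuming the documents are already tokenized and already carry textual authoritativeness values. The function name `expand_query`, the `top_k_docs` cutoff used to form the second set, and the use of raw term counts as the frequency measure are illustrative assumptions.

```python
from collections import Counter


def expand_query(query_terms: list[str],
                 docs: dict[str, list[str]],
                 authority: dict[str, float],
                 top_k_docs: int = 2,
                 n_terms: int = 1) -> list[str]:
    """Pick expansion terms from the most authoritative retrieved documents.

    docs maps a document id to its tokens (the first set of relevant
    documents); authority maps a document id to its authoritativeness value.
    """
    # Second set: the top_k_docs most authoritative documents of the first set.
    second_set = sorted(docs, key=lambda d: authority[d], reverse=True)[:top_k_docs]
    # Candidate terms: everything in the second set not already in the query.
    counts = Counter()
    for doc_id in second_set:
        counts.update(t for t in docs[doc_id] if t not in query_terms)
    # Select the most frequent candidates.
    return [term for term, _ in counts.most_common(n_terms)]


first_set = {
    "d1": ["heart", "disease", "statin", "statin"],
    "d2": ["heart", "statin", "trial"],
    "d3": ["heart", "miracle", "miracle", "miracle", "cure"],
}
values = {"d1": 0.9, "d2": 0.8, "d3": 0.1}
expansion = expand_query(["heart"], first_set, values)
```

The low-authority document d3 is excluded from the second set, so its heavily repeated term never becomes a candidate, and "statin" is selected instead.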
20. A method of combining at least two sets of rank orderings to produce an aggregate set ordering that is closest in some distance to each of the at least two sets of rank orderings, the method comprising:
determining first set rank ordering of relevant documents;
generating a textual authoritativeness value for each document in the first set rank ordering of relevant documents;
determining a second set rank ordering of relevant documents from the first set rank ordering based on the textual authoritativeness value generated for at least some of the first set of documents; and
combining the first set rank ordering of relevant documents and the second set rank ordering of relevant documents using a rank aggregation algorithm model or method based on a Markov chain, further including;
assigning a current state to a first page from the first set rank ordering,
selecting at least one second page uniformly chosen in an ordered list from at least one of the first set rank of ordering and the second set rank of ordering,
determining a first rank for the first page,
determining at least one second rank for the at least one second page in the ordered list,
reassigning the current state to the second page if the second page is ranked higher than the first page by the majority of the ordered lists, and
displaying the combined set of relevant documents,
wherein the textual authoritativeness value is generated for each document using a document textual authority framework model that considers at least one of an author's background, a targeted audience, an author's institutional affiliation, and whether the document has been reviewed by others.
View Dependent Claims (21, 22, 23, 29)
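The Markov chain recited in claim 20 (stay on the current page, pick a second page uniformly, and move to it only if a majority of the ordered lists rank it higher) can be sketched by building the transition matrix and estimating its stationary distribution. The small uniform jump controlled by `damping`, the fixed iteration count, and the function name `aggregate_ranks` are assumptions added so that the sketch converges to a unique distribution; they are not part of the claim.

```python
def aggregate_ranks(lists: list[list[str]],
                    iterations: int = 200,
                    damping: float = 0.95) -> list[str]:
    """Aggregate ranked lists via a majority-vote Markov chain."""
    pages = sorted({p for lst in lists for p in lst})
    n = len(pages)
    # Position of each page in each list (0 = best).
    positions = [{p: rank for rank, p in enumerate(lst)} for lst in lists]

    def majority_prefers(q: str, p: str) -> bool:
        """True if most lists that rank both pages place q above p."""
        votes = [pos[q] < pos[p] for pos in positions if p in pos and q in pos]
        return sum(votes) * 2 > len(votes)

    # trans[i][j]: probability of moving from page i to page j in one step.
    trans = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(pages):
        for j, q in enumerate(pages):
            if i != j and majority_prefers(q, p):
                trans[i][j] = 1.0 / n  # q is chosen uniformly, then accepted
        trans[i][i] = 1.0 - sum(trans[i])  # otherwise stay on the current page

    # Power iteration with a small uniform jump (assumption) for ergodicity.
    dist = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [0.0] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += dist[i] * (damping * trans[i][j] + (1.0 - damping) / n)
        dist = nxt

    # Pages where the chain spends more time rank higher.
    index = {p: i for i, p in enumerate(pages)}
    return sorted(pages, key=lambda p: -dist[index[p]])


orderings = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
combined = aggregate_ranks(orderings)
```

In this example a majority of lists place "a" above both other pages and "b" above "c", so the chain concentrates on "a", then "b", then "c".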
24. A method for simultaneously estimating an aggregate rank and aggregate weights to be assigned to two or more ranked document lists or document rank orderings, the method comprising:
determining a first set rank ordering of relevant documents;
generating a textual authoritativeness value for each document in the first set rank ordering of relevant documents;
determining a second set rank ordering of relevant documents from the first set rank ordering based on the textual authoritativeness values generated for at least some of the first set of documents; and
combining the first set rank ordering of relevant documents and the second set rank ordering of relevant documents using a rank aggregation algorithm model or method based on a Markov chain, further including;
assigning a current state to a first page from the first set rank ordering,
selecting at least one second page uniformly chosen in an ordered list from at least one of the first set rank of ordering and the second set rank of ordering,
determining a first rank for the first page,
determining at least one second rank for the at least one second page in the ordered list, and
reassigning the current state to the second page if the second page is ranked higher than the first page by the majority of the ordered lists, and
displaying the combined set of relevant documents,
wherein the textual authoritativeness value is generated for each document using a document textual authority framework model that considers at least one of an author's background, a targeted audience, an author's institutional affiliation, and whether the document has been reviewed by others.
View Dependent Claims (25, 26, 27, 30)
Specification