Ranking content using content and content authors
Abstract
Methods, systems, and apparatus, including computer program products, for identifying original content. In one aspect, a method is described that includes identifying a first document in a collection of documents. The first document contains a content piece, and the content piece does not occur in any earlier document in the collection. The first document is associated with a first author, and the first author is associated with a first rank. The first rank of the first author is determined using a score of the content piece. The score is a figure of merit of the content piece.
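The abstract's idea of crediting an author for content that appears nowhere earlier in the collection can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Document` fields, the integer `timestamp` ordering, the exact-match comparison of pieces, and the `piece_score` callable (standing in for the "figure of merit") are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    author: str
    timestamp: int  # hypothetical ordering key; smaller means earlier
    pieces: tuple   # content pieces, e.g. sentences or passages (assumed exact-match)


def author_ranks(docs, piece_score):
    """Credit each author with the scores of the pieces they published first."""
    seen = set()   # pieces that occurred in some earlier document
    ranks = {}
    for doc in sorted(docs, key=lambda d: d.timestamp):
        ranks.setdefault(doc.author, 0.0)
        for piece in set(doc.pieces):
            if piece not in seen:
                # The piece does not occur in any earlier document: original.
                ranks[doc.author] += piece_score(piece)
        seen.update(doc.pieces)
    return ranks


docs = [
    Document("alice", 1, ("p1", "p2")),
    Document("bob", 2, ("p1", "p3")),
    Document("carol", 3, ("p2", "p3")),
]
# Assumed figure of merit: every piece is worth 1.0.
ranks = author_ranks(docs, piece_score=lambda piece: 1.0)
```

Here `alice` is credited for two original pieces, `bob` for one, and `carol` for none. A real figure of merit might instead reflect, for example, how widely a piece is later copied; that choice is left open here.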
20 Claims
1. A computer-implemented method comprising:
- accessing, by one or more processors, a corpus of documents;
- determining, by the one or more processors, that a particular document in the corpus of documents includes multiple content pieces that each occur in at least one other earlier document in the corpus of documents;
- determining, by the one or more processors, an extent to which the entire content of the particular document is made up of the multiple content pieces that each occur in the at least one other earlier document in the corpus of documents;
- adjusting, by the one or more processors, a rank of a source of the particular document in relation to a source of other documents in the corpus of documents based at least in part on the extent to which the entire content of the particular document is made up of the multiple content pieces that each occur in the at least one other earlier document in the corpus of documents; and
- configuring a web crawling or search result ranking process for the source of the particular document based at least in part on the adjusted rank.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A non-transitory computer readable medium having stored thereon instructions, which, when executed by one or more processors, cause the one or more processors to perform the operations comprising:
- accessing, by one or more processors, a corpus of documents;
- determining, by the one or more processors, that a particular document in the corpus of documents includes multiple content pieces that each occur in at least one other earlier document in the corpus of documents;
- determining, by the one or more processors, an extent to which the entire content of the particular document is made up of the multiple content pieces that each occur in the at least one other earlier document in the corpus of documents;
- adjusting, by the one or more processors, a rank of a source of the particular document in relation to a source of other documents in the corpus of documents based at least in part on the extent to which the entire content of the particular document is made up of the multiple content pieces that each occur in the at least one other earlier document in the corpus of documents; and
- configuring a web crawling or search result ranking process for the source of the particular document based at least in part on the adjusted rank.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A system comprising:
- one or more processors; and
- a computer readable medium coupled to the data processing apparatus, having instructions stored thereon which, when executed by the data processing apparatus, cause the data processing apparatus to perform operations comprising:
  - accessing, by one or more processors, a corpus of documents;
  - determining, by the one or more processors, that a particular document in the corpus of documents includes multiple content pieces that each occur in at least one other earlier document in the corpus of documents;
  - determining, by the one or more processors, an extent to which the entire content of the particular document is made up of the multiple content pieces that each occur in the at least one other earlier document in the corpus of documents;
  - adjusting, by the one or more processors, a rank of a source of the particular document in relation to a source of other documents in the corpus of documents based at least in part on the extent to which the entire content of the particular document is made up of the multiple content pieces that each occur in the at least one other earlier document in the corpus of documents; and
  - configuring a web crawling or search result ranking process for the source of the particular document based at least in part on the adjusted rank.
Dependent claims: 16, 17, 18, 19, 20
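The operations shared by independent claims 1, 8, and 15 can be illustrated with a short Python sketch. It is only one possible reading, under stated assumptions: the "extent" is measured as the fraction of a document's pieces that occur in an earlier document, the rank adjustment is a linear penalty, and the crawl policy maps rank to a revisit interval. The names `Doc`, `copied_extent`, `adjust_source_rank`, `crawl_interval_hours`, the parameters `penalty`, `base_hours`, and `min_rank`, and the `.example` sources are all hypothetical; the claims do not prescribe any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    source: str
    timestamp: int  # hypothetical ordering key; smaller means earlier
    pieces: tuple   # content pieces, compared by exact match (assumption)


def copied_extent(doc, corpus):
    """Fraction of doc's pieces that also occur in an earlier document."""
    earlier = set()
    for other in corpus:
        if other.timestamp < doc.timestamp:
            earlier.update(other.pieces)
    if not doc.pieces:
        return 0.0
    copied = sum(1 for piece in doc.pieces if piece in earlier)
    return copied / len(doc.pieces)


def adjust_source_rank(rank, extent, penalty=0.5):
    """Assumed linear rule: the more copied content, the lower the rank."""
    return rank * (1.0 - penalty * extent)


def crawl_interval_hours(rank, base_hours=24.0, min_rank=0.05):
    """Hypothetical crawl policy: lower-ranked sources are revisited less often."""
    return base_hours / max(rank, min_rank)


corpus = [
    Doc("news.example", 1, ("a", "b")),
    Doc("blog.example", 2, ("c",)),
    Doc("aggregator.example", 3, ("a", "b", "c", "d")),
]
extent = copied_extent(corpus[2], corpus)    # three of four pieces appeared earlier
new_rank = adjust_source_rank(1.0, extent)   # starting rank of 1.0 is an assumption
hours = crawl_interval_hours(new_rank)
```

In this example, three quarters of the aggregator's content occurred earlier, so its source rank drops and its crawl interval lengthens. A search result ranking process could consume `new_rank` in the same way that the crawl policy does.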
Specification