System and method for determining originality of data content
Abstract
The present invention provides systems and methods for determining the originality of data content. In one embodiment, the determined originality of a particular item (e.g., a book) as compared to one or more other items can be used as a factor in recommending the item to a user. For example, in one embodiment, upon a user's selection of an item (e.g., a book), one or more items that have content most diverse from the selected item are determined and provided to the user. In another embodiment, various versions of an item are compared to each other to determine how content in each version differs from that in another version. In another embodiment, content in a collection of items is compared against content from publicly (freely) available sources (e.g., web pages) to determine the originality of the content in the collection of items.
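The recommendation embodiment described in the abstract can be illustrated with a minimal sketch. The abstract does not say how content is compared, so the word-shingle Jaccard distance used here is an assumption, and the names `shingles`, `originality`, and `most_diverse` are hypothetical.

```python
def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles in the text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def originality(a: str, b: str, k: int = 3) -> float:
    """Jaccard distance between shingle sets: 0.0 = identical, 1.0 = disjoint.

    Assumption: the patent text does not name a comparison technique;
    Jaccard distance is only one plausible choice.
    """
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def most_diverse(selected: str, catalog: dict[str, str], n: int = 1) -> list[str]:
    """Return the n catalog titles whose content differs most from `selected`."""
    ranked = sorted(
        catalog,
        key=lambda title: originality(selected, catalog[title]),
        reverse=True,
    )
    return ranked[:n]
```

Given a selected item and a small catalog of candidate texts, `most_diverse(selected, catalog, n=1)` returns the title with the highest distance from the selection. A production system would likely need a scalable comparison method, which this sketch does not attempt.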
12 Claims
1. A computer-implemented method comprising:
as implemented by one or more computing devices configured with specific executable instructions,
for each of one or more other versions of an item that are different than a first version of the item, generating an originality score for a pairing including the other version of the item and the first version of the item, the originality score indicating a degree to which content of the other version of the item is diverse from content of the first version of the item, wherein the originality score is generated based at least in part on a comparison of the content of the other version of the item with the content of the first version of the item; and
determining a diversity measure for the first version of the item that indicates a degree to which the first version of the item differs from the one or more other versions of the item, wherein the diversity measure is determined based at least in part on the one or more generated originality scores, wherein the diversity measure indicates a percentage or amount of content of the first version of the item that is different than content of the one or more other versions of the item.
Dependent claims: 2, 3, 4, 5
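One possible reading of the two steps of claim 1 is sketched below. The claim does not specify the unit of comparison or how the pairwise scores feed the diversity measure, so everything here is an assumption: content is compared sentence by sentence, each pairing records which first-version sentences are missing from the other version, and the diversity measure is the percentage of first-version sentences missing from every other version. The names `sentence_set`, `Pairing`, `compare`, and `diversity_measure` are hypothetical.

```python
import re
from dataclasses import dataclass


def sentence_set(text: str) -> frozenset[str]:
    """Split text into normalized sentences (lowercased, whitespace collapsed)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return frozenset(" ".join(p.lower().split()) for p in parts if p.strip())


@dataclass(frozen=True)
class Pairing:
    score: float            # fraction of first-version sentences absent from the other version
    unique: frozenset[str]  # the first-version sentences that are absent


def compare(first: str, other: str) -> Pairing:
    """Originality score for one (other version, first version) pairing."""
    first_set = sentence_set(first)
    unique = first_set - sentence_set(other)
    score = len(unique) / len(first_set) if first_set else 0.0
    return Pairing(score, unique)


def diversity_measure(first: str, others: list[str]) -> tuple[list[float], float]:
    """Return the pairwise scores and the percentage of first-version
    sentences that appear in none of the other versions (assumed reading)."""
    first_set = sentence_set(first)
    pairings = [compare(first, other) for other in others]
    scores = [p.score for p in pairings]
    if not first_set or not pairings:
        return scores, 0.0
    unique_everywhere = frozenset.intersection(*(p.unique for p in pairings))
    return scores, 100.0 * len(unique_everywhere) / len(first_set)
```

Under this reading the diversity measure can never exceed the smallest pairwise score expressed as a percentage, because a sentence counts only if every other version lacks it.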
6. A system comprising:
a data store configured to store content of items; and
a computing device, comprising one or more processors, in communication with the data store that is configured to:
retrieve, from the data store, content of a first version of an item;
retrieve, from the data store, content of one or more other versions of the item that are different than the first version of the item;
for each of the one or more other versions of the item, generate an originality score for a pairing including the other version of the item and the first version of the item, the originality score indicating a degree to which content of the other version of the item is diverse from content of the first version of the item, wherein the originality score is generated based at least in part on a comparison of the content of the other version of the item with the content of the first version of the item; and
determine a diversity measure for the first version of the item that indicates a degree to which the first version of the item differs from the one or more other versions of the item, wherein the diversity measure is determined based at least in part on the one or more generated originality scores, wherein the diversity measure indicates a percentage or amount of content of the first version of the item that is different than content of the one or more other versions of the item.
Dependent claims: 7, 8
9. A computer-readable, non-transitory storage medium storing computer-executable instructions that, when executed by a computer system, configure the computer system to perform operations comprising:
retrieving, from an electronic data store, content of a first version of an item and content of two or more other versions of the item;
for each of the two or more other versions of the item, generating an originality score for a pairing including the other version of the item and the first version of the item, the originality score indicating a degree to which content of the other version of the item is diverse from content of the first version of the item, wherein the originality score is generated based at least in part on a comparison of the content of the other version of the item with the content of the first version of the item; and
determining a diversity measure for the first version of the item that indicates a degree to which the first version of the item differs from the two or more other versions of the item, wherein the diversity measure is determined based at least in part on the two or more generated originality scores.
Dependent claims: 10, 11, 12
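Claim 9 leaves open how two or more originality scores are combined into one diversity measure. The sketch below shows two simple candidates, the mean and the minimum, purely as assumptions. The pairwise score is taken as one minus the word-level similarity ratio from Python's standard `difflib.SequenceMatcher`. The function names and the `how` parameter are hypothetical.

```python
from difflib import SequenceMatcher
from statistics import mean


def originality_score(first: str, other: str) -> float:
    """1 minus the word-level similarity ratio: 0.0 = same word sequence."""
    a, b = first.split(), other.split()
    if not a and not b:
        return 0.0
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def diversity_from_scores(scores: list[float], how: str = "mean") -> float:
    """Combine pairwise scores. The choice of aggregate is an assumption:
    'mean' reflects average difference, 'min' the closest other version."""
    if len(scores) < 2:
        raise ValueError("claim 9 recites two or more other versions")
    if how == "mean":
        return mean(scores)
    if how == "min":
        return min(scores)
    raise ValueError(f"unknown aggregate: {how}")
```

The minimum is the more conservative choice, since a single near-identical version drives the measure toward zero, while the mean smooths over such outliers.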
Specification