Organizing books by series
First Claim
1. A computer-implemented method of identifying a book series, comprising:
receiving book information describing one or more books from a book information server;
creating book records identifying the books described by the book information, a book record having fields with values describing attributes of a given book in the books derived from the book information;
clustering the book records into a plurality of clusters based on the values of a subset of the fields of the book records;
identifying a set containing a plurality of related clusters of book records by determining similarity of values of fields of the clustered book records other than the subset of fields on which the records were clustered;
identifying a separate series name candidate for each cluster in the set to produce a plurality of series name candidates;
selecting a name of the book series from among the plurality of series name candidates by comparing the plurality of series name candidates to identify a predominant series name candidate;
storing information describing the selected name of the book series in a repository; and
identifying a set of books in the book series having the selected name.
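The steps of the claim can be sketched in code. The following is only an illustrative reading, not the patented implementation: the claim does not say which fields are used, how similarity is measured, or how a candidate name is derived. The field names (`author`, `publisher`, `series_text`), the Jaccard overlap of title words, the 0.2 threshold, and the use of the first cluster as a seed are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BookRecord:
    # All field names are hypothetical; the claim only requires "fields with values".
    title: str
    author: str
    publisher: str
    series_text: str = ""  # assumed raw series string taken from the book information


def cluster_records(records, key_fields=("author", "publisher")):
    """Cluster records on a subset of fields (assumed here: author and publisher)."""
    clusters = defaultdict(list)
    for rec in records:
        key = tuple(getattr(rec, f).strip().lower() for f in key_fields)
        clusters[key].append(rec)
    return list(clusters.values())


def title_tokens(cluster):
    """Collect the lower-cased title words of every record in a cluster."""
    return {word for rec in cluster for word in rec.title.lower().split()}


def related_clusters(clusters, seed, threshold=0.2):
    """Relate clusters using a field NOT used for clustering (assumed: title words)."""
    seed_tokens = title_tokens(seed)
    related = []
    for cluster in clusters:
        tokens = title_tokens(cluster)
        union = seed_tokens | tokens
        jaccard = len(seed_tokens & tokens) / len(union) if union else 0.0
        if jaccard >= threshold:
            related.append(cluster)
    return related


def series_name_candidate(cluster):
    """One candidate per cluster: its most common non-empty series string."""
    counts = Counter(
        r.series_text.strip().lower() for r in cluster if r.series_text.strip()
    )
    return counts.most_common(1)[0][0] if counts else None


def select_series_name(related):
    """Compare the candidates and keep the predominant (most frequent) one."""
    candidates = [series_name_candidate(c) for c in related]
    counts = Counter(c for c in candidates if c is not None)
    return counts.most_common(1)[0][0] if counts else None


def identify_series(records):
    """Run the sketch end to end; returns (series name, sorted titles in the series)."""
    clusters = cluster_records(records)
    if not clusters:
        return None, []
    related = related_clusters(clusters, seed=clusters[0])
    name = select_series_name(related)
    titles = sorted({rec.title for cluster in related for rec in cluster})
    return name, titles


records = [
    BookRecord("Dune", "Frank Herbert", "Ace", "Dune Chronicles"),
    BookRecord("Dune Messiah", "Frank Herbert", "Ace", "Dune Chronicles"),
    BookRecord("Dune", "Frank Herbert", "Chilton", "Dune"),
    BookRecord("Children of Dune", "Frank Herbert", "Berkley", "Dune Chronicles"),
    BookRecord("Emma", "Jane Austen", "Penguin"),
]
name, titles = identify_series(records)
print(name, titles)  # dune chronicles ['Children of Dune', 'Dune', 'Dune Messiah']
```

In this sample data, three clusters share title words with the seed cluster and two of their three candidates agree, so "dune chronicles" is chosen as the predominant name. Storing the result in a repository is omitted from the sketch.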
Abstract
Book information describing a plurality of books is analyzed to identify the plurality of books described in the book information and create book records for the respective ones of the identified books. A given book record contains fields describing attributes of a respective one of the plurality of books derived from the book information. The book records are clustered into a plurality of clusters based on the values of the fields of the book records. One or more clusters are analyzed to identify a name of a book series based on the book records therein. The book records in a cluster may further be placed in buckets representing individual books in the series and, in turn, the buckets are described based on the book information therein and organized based on their description. The identified series name, bucket descriptions, and organization thereof may be stored in a repository and presented to users.
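The bucketing step described in the abstract can be illustrated with a short sketch. It is an assumption-laden example rather than the method of the specification: records are plain dictionaries, the `title` and `volume` keys are hypothetical, buckets are formed by normalized title, and buckets are ordered by volume number.

```python
from collections import Counter, defaultdict


def normalize_title(title):
    """Lower-case a title and drop punctuation so editions of one book match."""
    kept = "".join(ch for ch in title.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def bucket_records(cluster):
    """Place the records of a cluster in buckets, one bucket per individual book."""
    buckets = defaultdict(list)
    for rec in cluster:
        buckets[normalize_title(rec["title"])].append(rec)
    return list(buckets.values())


def describe_bucket(bucket):
    """Describe a bucket by its most common title and volume number."""
    title = Counter(r["title"] for r in bucket).most_common(1)[0][0]
    volumes = [r["volume"] for r in bucket if r.get("volume") is not None]
    volume = Counter(volumes).most_common(1)[0][0] if volumes else None
    return {"title": title, "volume": volume, "editions": len(bucket)}


def organize(cluster):
    """Order bucket descriptions by volume; buckets without a volume go last."""
    descriptions = [describe_bucket(b) for b in bucket_records(cluster)]
    return sorted(
        descriptions,
        key=lambda d: (d["volume"] is None, d["volume"] or 0, d["title"]),
    )


cluster = [
    {"title": "Dune Messiah", "volume": 2},
    {"title": "Dune", "volume": 1},
    {"title": "dune", "volume": 1},
    {"title": "Dune", "volume": None},
    {"title": "Children of Dune", "volume": 3},
]
for description in organize(cluster):
    print(description)
```

Here the three "Dune" records fall into one bucket because their normalized titles match, and the resulting descriptions are listed in volume order.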
18 Claims
1. A computer-implemented method of identifying a book series, comprising:
receiving book information describing one or more books from a book information server;
creating book records identifying the books described by the book information, a book record having fields with values describing attributes of a given book in the books derived from the book information;
clustering the book records into a plurality of clusters based on the values of a subset of the fields of the book records;
identifying a set containing a plurality of related clusters of book records by determining similarity of values of fields of the clustered book records other than the subset of fields on which the records were clustered;
identifying a separate series name candidate for each cluster in the set to produce a plurality of series name candidates;
selecting a name of the book series from among the plurality of series name candidates by comparing the plurality of series name candidates to identify a predominant series name candidate;
storing information describing the selected name of the book series in a repository; and
identifying a set of books in the book series having the selected name.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A non-transitory computer-readable storage medium storing executable computer program instructions for identifying a book series, the instructions performing steps comprising:
receiving book information describing one or more books from a book information server;
creating book records identifying the books described by the book information, a book record having fields with values describing attributes of a given book in the books derived from the book information;
clustering the book records into a plurality of clusters based on a subset of the values of the fields of the book records;
identifying a set containing a plurality of related clusters of book records by determining similarity of values of fields of the clustered book records other than the subset of fields on which the records were clustered;
identifying a separate series name candidate for each cluster in the set to produce a plurality of series name candidates;
selecting a name of the book series from among the plurality of series name candidates by comparing the plurality of series name candidates to identify a predominant series name candidate;
storing information describing the identified name of the book series in a repository; and
identifying a set of books in the book series having the selected name.
Dependent claims: 9, 10, 11, 12, 13
14. A computer system for identifying a book series, the computer system comprising:
a non-transitory computer-readable storage medium storing computer program instructions executable to perform steps comprising:
receiving book information describing one or more books from a book information server;
creating book records identifying the books described by the book information, a book record having fields with values describing attributes of a given book in the books derived from the book information;
clustering the book records into a plurality of clusters based on the values of a subset of the fields of the book records;
identifying a set containing a plurality of related clusters of book records by determining similarity of values of fields of the clustered book records other than the subset of fields on which the records were clustered;
identifying a separate series name candidate for each cluster in the set to produce a plurality of series name candidates;
selecting a name of the book series from among the plurality of series name candidates by comparing the plurality of series name candidates to identify a predominant series name candidate;
storing information describing the identified name of the book series in a repository; and
identifying a set of books in the book series having the selected name; and
a processor for executing the computer program instructions.
Dependent claims: 15, 16, 17, 18
Specification