Clustering web pages on a search engine results page
Abstract
Methods, systems, and media are provided for delivering clustered search results for recent and non-recent events by maintaining the identification (ID) numbers of the respective clustered documents beyond the “fresh” life span of the clustered documents. When clusters are formed according to similar content, an ID number and associated attributes are assigned to each of the clusters. This provides a mechanism to track and retrieve the respective clusters for subsequent delivery of search results. The respective ID numbers of the clusters are maintained, even after the documents are no longer considered “fresh.” These similar-content clusters are further subdivided according to publication date. This provides individual subdivided clusters for similar content events that occurred at different time spans, which are delivered along with individual non-clustered search results in a SERP.
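The flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify a similarity measure, an ID scheme, or a date window, so everything concrete here is an assumption made for illustration: the names `Document`, `Cluster`, `cluster_by_content`, and `subdivide_by_date` are hypothetical, word-set Jaccard overlap stands in for "similar content", and a 30-day gap stands in for "different time spans". The one property taken from the text is that a cluster's ID is assigned once and never expires when its documents stop being "fresh".

```python
from dataclasses import dataclass, field
from datetime import date
from itertools import count


@dataclass
class Document:
    url: str
    text: str
    published: date


@dataclass
class Cluster:
    cluster_id: int  # assigned once; never reassigned or expired
    documents: list = field(default_factory=list)


# Hypothetical ID source: a simple increasing counter.
_next_cluster_id = count(1)


def jaccard(a, b):
    """Word-set overlap between two texts (assumed similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cluster_by_content(docs, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed document is similar enough."""
    clusters = []
    for doc in docs:
        for cluster in clusters:
            if jaccard(doc.text, cluster.documents[0].text) >= threshold:
                cluster.documents.append(doc)
                break
        else:
            clusters.append(Cluster(next(_next_cluster_id), [doc]))
    return clusters


def subdivide_by_date(cluster, max_gap_days=30):
    """Split one cluster into date-ordered groups wherever the gap exceeds max_gap_days."""
    ordered = sorted(cluster.documents, key=lambda d: d.published)
    groups = []
    for doc in ordered:
        if groups and (doc.published - groups[-1][-1].published).days <= max_gap_days:
            groups[-1].append(doc)
        else:
            groups.append([doc])
    return groups
```

Under these assumptions, two reports of an earthquake published in January 2010 and a similarly worded report from June 2012 fall into one content cluster with one ID, and `subdivide_by_date` then separates the 2010 pair from the 2012 report.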
20 Claims
1. A method of clustering documents for search results, the method comprising:
accessing a database that is associated with a search engine, wherein the database includes a plurality of stored documents retrievable by the search engine;
clustering some of the stored documents into one or more clusters based on content similarity;
subdividing each of the one or more clusters into one or more subdivided clusters according to publication date;
assigning an identifier to each of the clusters of the stored documents, wherein the identifier is assigned during a life span of each of the clustered stored documents, and wherein the identifier of each of the clusters remains persistent throughout the life span of each of the clustered stored documents;
responsive to a search query, generating search results for presentation on a search results page, the search results comprising one or more of the subdivided clusters; and
presenting the search results on the search result page, wherein each subdivided cluster presented on the search results page includes a synopsis of the subdivided cluster and links to documents contained within the subdivided cluster.
Dependent claims: 2, 3, 4, 5, 6
7. A system for generating search results comprising clustered documents, comprising:
one or more memory storage devices configured to store a database that includes a plurality of stored documents;
one or more computing devices configured to:
access the database that includes the plurality of stored documents,
cluster some of the stored documents into one or more clusters based on content similarity,
assign an identifier to each of the clusters of the stored documents, wherein the identifier of each of the clusters is assigned during a life span of each of the clustered stored documents, and remains persistent throughout the life span of each of the clustered stored documents,
subdividing each of the one or more clusters into one or more subdivided clusters according to publication date, and
responsive to a search query, generating search results for presentation on a search results page, wherein the search results are organized into one or more subdivided clusters.
Dependent claims: 8, 9, 10, 11, 12
13. A computer-implemented method of generating search results comprising clustered documents using a computing device having processor, memory, and data storage subsystems, the computer-implemented method comprising:
grouping a plurality of documents stored in a database based on page content similarity to form one or more clusters;
assigning an identifier and one or more respective related attributes to each of the one or more clusters;
maintaining the assigned identifiers and the respective related attributes for each of the one or more clusters, wherein the identifier of each of the clusters remains persistent throughout an entire life span of each of the clustered stored documents;
subdividing each of the one or more clusters into one or more subdivided clusters according to publication date; and
responsive to a search query, generating search results for presentation on a search results page, the search results comprising one or more of the clusters.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
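The final step of the independent claims, generating search results in response to a query, can be sketched as follows. This is only an illustration under stated assumptions: `SubdividedCluster` and `generate_results` are hypothetical names, the `period` label and the dictionary layout of each result are invented for the example, and plain term overlap stands in for whatever retrieval and ranking the search engine actually uses. What it takes from the text is that each subdivided cluster carries its parent cluster's persistent identifier, is presented with a synopsis and links to its documents, and can appear alongside individual non-clustered results.

```python
from dataclasses import dataclass


@dataclass
class SubdividedCluster:
    cluster_id: int  # persistent identifier of the parent content cluster
    period: str      # hypothetical label for the publication-date span
    synopsis: str
    links: list      # URLs of the documents in this subdivided cluster


def generate_results(query, subclusters, single_documents):
    """Return result entries for a query: matching subdivided clusters, then single documents.

    single_documents is a list of (title, url) pairs that belong to no cluster.
    Matching is naive term overlap, an assumption made for this sketch.
    """
    terms = set(query.lower().split())

    def matches(text):
        return bool(terms & set(text.lower().split()))

    results = []
    for sub in subclusters:
        if matches(sub.synopsis):
            results.append({
                "type": "cluster",
                "cluster_id": sub.cluster_id,
                "period": sub.period,
                "synopsis": sub.synopsis,
                "links": list(sub.links),
            })
    for title, url in single_documents:
        if matches(title):
            results.append({"type": "document", "title": title, "url": url})
    return results
```

With this layout, two events that share a content cluster but occurred years apart show up as two separate entries with the same `cluster_id` and different `period` values, each with its own synopsis and links.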
Specification