Clustering web pages on a search engine results page
Abstract
Methods, systems, and media are provided for delivering clustered search results for recent and non-recent events by maintaining the identification (ID) numbers of clustered documents beyond the documents' “fresh” life span. When clusters are formed according to similar content, an ID number and associated attributes are assigned to each cluster, which provides a mechanism to track and retrieve the clusters for subsequent delivery of search results. The cluster ID numbers are maintained even after the documents are no longer considered “fresh.” The similar-content clusters are further subdivided according to publication date, yielding separate subdivided clusters for similar-content events that occurred in different time spans. These subdivided clusters are delivered along with individual non-clustered search results in a search engine results page (SERP).
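The clustering and ID-assignment idea in the abstract can be illustrated with a minimal sketch. The similarity measure (Jaccard overlap of word sets), the 0.5 threshold, and the names `ClusterRegistry`, `Cluster`, and `jaccard` are all assumptions made for illustration; the patent text does not specify them. The point shown is only that a cluster's ID and attributes live in a registry that is never pruned by document age, so they persist after the documents stop being fresh.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Cluster:
    cluster_id: int                 # persistent ID, never reassigned
    attributes: dict                # hypothetical "related attributes"
    doc_ids: list = field(default_factory=list)
    terms: set = field(default_factory=set)  # word set of the seed document


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two word sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class ClusterRegistry:
    """Keeps cluster IDs and attributes regardless of document age."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold  # assumed cut-off for "similar content"
        self.clusters = {}
        self._next_id = count(1)

    def add(self, doc_id: str, text: str) -> int:
        terms = set(text.lower().split())
        # Join the first existing cluster whose seed document is similar enough.
        for cluster in self.clusters.values():
            if jaccard(terms, cluster.terms) >= self.threshold:
                cluster.doc_ids.append(doc_id)
                return cluster.cluster_id
        # Otherwise open a new cluster with a fresh ID.
        cid = next(self._next_id)
        self.clusters[cid] = Cluster(cid, {"label": text[:40]}, [doc_id], terms)
        return cid


registry = ClusterRegistry()
a = registry.add("d1", "storm hits coastal city overnight")
b = registry.add("d2", "storm hits coastal city causing floods")
c = registry.add("d3", "team wins championship final")
```

Here `d1` and `d2` share four of seven distinct words, so they land in the same cluster, while `d3` opens a second one. Nothing in the registry depends on a document's age, which is the persistence property the abstract describes.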
28 Citations
19 Claims
1. A computer-implemented method of delivering search results of one or more events using a computing device having processor, memory, and data storage subsystems, the computer-implemented method comprising:
providing a plurality of documents, wherein the plurality of documents includes fresh documents and non-fresh documents, wherein fresh documents have life spans falling within a predetermined period of time, and wherein non-fresh documents have life spans exceeding the predetermined period of time;
grouping the plurality of documents based on page content similarity to form one or more clusters;
assigning an identification (ID) number and one or more respective related attributes to each of the one or more clusters;
maintaining the assigned ID numbers and the respective related attributes for each of the one or more clusters after the plurality of documents are no longer considered to be fresh documents; and
subdividing each of the one or more clusters into one or more subdivided clusters according to publication date.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
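The fresh/non-fresh distinction and the final subdividing step of claim 1 could be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: "life span" is read here as the time elapsed since publication, the seven-day `FRESH_PERIOD` and 30-day `max_gap` are arbitrary, and `is_fresh` and `subdivide_by_date` are hypothetical names.

```python
from collections import defaultdict
from datetime import date, timedelta

FRESH_PERIOD = timedelta(days=7)  # assumed "predetermined period of time"


def is_fresh(published: date, today: date) -> bool:
    """A document is fresh while its age stays within the period."""
    return today - published <= FRESH_PERIOD


def subdivide_by_date(docs, max_gap=timedelta(days=30)):
    """Split each cluster wherever consecutive publication dates are
    more than max_gap apart.

    docs: iterable of (doc_id, cluster_id, published) tuples.
    Returns {cluster_id: [[doc_id, ...], ...]} in chronological order.
    """
    by_cluster = defaultdict(list)
    for doc_id, cluster_id, published in docs:
        by_cluster[cluster_id].append((published, doc_id))

    result = {}
    for cluster_id, items in by_cluster.items():
        items.sort()                      # oldest first
        groups = [[items[0][1]]]
        for (prev_date, _), (cur_date, doc_id) in zip(items, items[1:]):
            if cur_date - prev_date > max_gap:
                groups.append([])         # start a new subdivided cluster
            groups[-1].append(doc_id)
        result[cluster_id] = groups
    return result


docs = [
    ("a", 1, date(2020, 1, 1)),
    ("b", 1, date(2020, 1, 3)),
    ("c", 1, date(2021, 6, 1)),
    ("d", 2, date(2021, 6, 2)),
]
subdivided = subdivide_by_date(docs)
```

With this input, cluster 1 splits into an early-2020 group and a mid-2021 group, which mirrors the idea of similar-content events that occurred at different times. The cluster ID itself is unchanged by the split.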
9. One or more computer-readable storage media storing computer-readable instructions embodied thereon that, when executed by a computing device, perform a method of delivering persistent clusters in a search engine results page, the method comprising:
retrieving documents from a database according to a received search query;
clustering some of the retrieved documents into one or more clusters based on content similarity and publication date;
assigning an identification (ID) number to each of the clusters of the retrieved documents, wherein the ID number of each of the clusters remains persistent throughout a life span of each of the clustered retrieved documents; and
delivering each of the clusters with other individual non-clustered results in the search engine results page to a user interface in response to the received search query.
Dependent claims: 10, 11, 12, 13, 14
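The delivery step of claim 9, in which clusters appear alongside individual non-clustered results, might look like the sketch below. The function name `build_serp`, the entry dictionaries, and the choice to place each cluster at the rank of its highest-ranked member are assumptions for illustration only.

```python
def build_serp(ranked_results):
    """Collapse clustered documents into one entry per cluster while
    leaving non-clustered documents as individual entries.

    ranked_results: list of (doc_id, cluster_id or None) in rank order.
    """
    page = []
    slot_for_cluster = {}  # cluster_id -> index of its entry in page
    for doc_id, cluster_id in ranked_results:
        if cluster_id is None:
            # Non-clustered result: delivered on its own.
            page.append({"type": "document", "doc_id": doc_id})
        elif cluster_id in slot_for_cluster:
            # Later member of a cluster already on the page.
            page[slot_for_cluster[cluster_id]]["doc_ids"].append(doc_id)
        else:
            # First (highest-ranked) member fixes the cluster's position.
            slot_for_cluster[cluster_id] = len(page)
            page.append(
                {"type": "cluster", "cluster_id": cluster_id, "doc_ids": [doc_id]}
            )
    return page


ranked = [("d1", 7), ("d2", None), ("d3", 7), ("d4", None)]
serp = build_serp(ranked)
```

In this example the page has three entries: cluster 7 holding `d1` and `d3`, followed by the individual documents `d2` and `d4`.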
15. One or more computer-readable storage media storing computer-readable instructions embodied thereon that, when executed by a computing device, perform a method of providing clustered non-unique results in a search engine results page, the method comprising:
providing a plurality of documents comprising fresh documents that have life spans falling within a predetermined period of time;
grouping the fresh documents based on page content similarity to form one or more clusters;
assigning an identification (ID) number and one or more respective related attributes to each of the one or more clusters;
maintaining the assigned ID numbers and the respective related attributes for each of the one or more clusters after the clustered documents are no longer considered to be fresh documents, wherein the clustered documents are no longer considered to be fresh documents when their life spans exceed the predetermined period of time;
retrieving a set of documents from the plurality of documents in response to a received user search query, wherein the set of documents includes A) fresh documents having life spans falling within a predetermined period of time, and B) non-fresh documents having life spans exceeding the predetermined period of time, wherein each document in the retrieved set of documents is associated with one or more of the ID numbers assigned to the clusters, regardless of whether the document is a fresh document or a non-fresh document;
selecting a set number of top results from the retrieved set of documents;
grouping the top results according to publication date or content similarity using one or more of the ID numbers of one or more respective retrieved clusters; and
delivering search results to a user interface in response to the received user search query, the search engine results page comprising the grouped top results.
Dependent claims: 16, 17, 18, 19
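The selecting and grouping steps of claim 15 can be sketched in a few lines. The relevance scores, the `top_n` value, and the function name `group_top_results` are hypothetical, and the sketch assumes each document maps to exactly one cluster ID, whereas the claim allows one or more.

```python
def group_top_results(scored_docs, doc_to_cluster, top_n=3):
    """Pick the top_n documents by score, then group them by the
    persistent cluster ID each one carries.

    scored_docs: list of (doc_id, score) pairs.
    doc_to_cluster: {doc_id: cluster_id}, covering fresh and non-fresh
    documents alike.
    """
    top = sorted(scored_docs, key=lambda item: item[1], reverse=True)[:top_n]
    groups = {}
    for doc_id, _score in top:
        groups.setdefault(doc_to_cluster[doc_id], []).append(doc_id)
    return groups


scored = [("a", 0.9), ("b", 0.4), ("c", 0.8), ("d", 0.7)]
doc_to_cluster = {"a": 1, "b": 1, "c": 2, "d": 1}
grouped = group_top_results(scored, doc_to_cluster, top_n=3)
```

Because the lookup uses only the stored cluster ID, a document is grouped the same way whether it is currently fresh or not. Document `b` falls outside the top three and is therefore not grouped at all.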
Specification