Ranking of search results based on microblog data
Abstract
An information retrieval system is described herein that monitors a microblog data stream that includes microblog posts to discover and index fresh resources for searching by a search engine. The information retrieval system also uses data from the microblog data stream as well as data obtained from a microblog subscription system to compute novel and effective features for ranking fresh resources which would otherwise have impoverished representations. An embodiment of the present invention advantageously enables a search engine to produce a fresher set of resources and to rank such resources for both relevancy and freshness in a more accurate manner.
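As an illustration of the monitoring step described in the abstract, the following is a minimal sketch of how resource identifiers might be pulled from a stream of posts and indexed. The regular expression, the function name index_stream, and the (user, text) post shape are assumptions made for this example and do not come from the patent.

```python
import re
from collections import defaultdict

# Hypothetical pattern: an http(s) URL running up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")

def index_stream(posts):
    """Map each URL found in the stream to the (user, text) posts that mention it."""
    index = defaultdict(list)
    for user, text in posts:
        for url in URL_PATTERN.findall(text):
            # Trailing punctuation is treated as sentence punctuation, not as part of the URL.
            index[url.rstrip(".,;:!?)")].append((user, text))
    return index

# Made-up sample posts standing in for a microblog data stream.
posts = [
    ("alice", "Quake report: http://example.com/quake."),
    ("bob", "RT http://example.com/quake stay safe"),
    ("carol", "no links here"),
]
index = index_stream(posts)
```

Keeping the posts alongside each URL gives later stages some text and user data to work with even when the linked page itself has not been crawled yet.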
22 Claims
1. A computer-implemented method for generating a ranked list of resources in response to a query, comprising:
- pairing the query with a plurality of microblog resource identifiers, wherein each microblog resource identifier comprises a resource identifier obtained from monitoring a received data stream of microblog posts, thereby generating a plurality of query/microblog resource identifier pairs;
- generating a feature set for each query/microblog resource identifier pair, wherein generating the feature set includes generating at least one textual feature by analyzing text of one or more microblog posts that refer to the microblog resource identifier in conjunction with text of the query or generating at least one social networking feature by analyzing one or more characteristics associated with one or more microblog users that issued or received the microblog resource identifier via the microblog;
- processing the feature sets associated with each query/microblog resource identifier pair in a first machine learned ranker to produce a ranking for each microblog resource identifier; and
- generating one single combined ranked list of resources by combining the rankings for each microblog resource identifier produced by the first machine learned ranker with rankings generated for a plurality of network resource identifiers, wherein each network resource identifier comprises a resource identifier obtained from resources other than the received data stream of microblog posts.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 22
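The four steps of claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the term-overlap textual feature, the follower-count social feature, the fixed weights standing in for a trained ranker, and all function names are hypothetical. The sketch also assumes that the scores of the two rankers are directly comparable.

```python
def textual_feature(query, post_texts):
    """Fraction of query terms that occur in the posts mentioning the URL."""
    query_terms = set(query.lower().split())
    post_terms = set()
    for text in post_texts:
        post_terms.update(text.lower().split())
    return len(query_terms & post_terms) / len(query_terms) if query_terms else 0.0

def social_feature(follower_counts):
    """Hypothetical social feature: mean follower count of posters, squashed into [0, 1)."""
    if not follower_counts:
        return 0.0
    mean = sum(follower_counts) / len(follower_counts)
    return mean / (mean + 1000.0)

# Stand-in for a machine learned ranker: made-up linear weights, not a trained model.
WEIGHTS = (0.7, 0.3)

def score(features):
    return sum(w * f for w, f in zip(WEIGHTS, features))

def rank_microblog_urls(query, url_data):
    """url_data maps URL -> (post texts, follower counts); returns (URL, score) pairs, best first."""
    scored = []
    for url, (post_texts, follower_counts) in url_data.items():
        features = (textual_feature(query, post_texts), social_feature(follower_counts))
        scored.append((url, score(features)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def combine(microblog_ranked, network_ranked):
    """Merge two scored lists into one list ordered by score."""
    return sorted(microblog_ranked + network_ranked, key=lambda pair: pair[1], reverse=True)

# Made-up sample data.
url_data = {
    "http://example.com/quake": (["earthquake hits city", "city earthquake update"], [1000, 3000]),
    "http://example.com/cats": (["funny cats"], [10]),
}
microblog_ranked = rank_microblog_urls("earthquake city", url_data)
network_ranked = [("http://example.org/archive", 0.5)]  # scores from some other ranker
combined = combine(microblog_ranked, network_ranked)
```

In this sample the microblog URL whose posts match both query terms lands ahead of the network resource, and the unrelated microblog URL lands last.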
10. An information retrieval system, comprising:
- one or more processing units;
- a microblog URL filter configured for pairing a query with a plurality of microblog resource identifiers, wherein each microblog resource identifier comprises a resource identifier obtained from monitoring a received data stream of microblog posts, thereby generating a plurality of query/microblog resource identifier pairs;
- a first feature generator, at least partially executed by at least one of the one or more processing units, configured for generating a feature set for each query/microblog resource identifier pair, wherein generating the feature set includes generating at least one textual feature by analyzing text of one or more microblog posts that refer to the microblog resource identifier in conjunction with text of the query or generating at least one social networking feature by analyzing one or more characteristics associated with one or more microblog users that issued or received the microblog resource identifier via the microblog;
- a first machine learned ranker configured for processing the feature sets associated with each query/microblog resource identifier pair to produce a ranking for each microblog resource identifier; and
- a ranked resource identifier combiner configured for generating one single combined ranked list of resources by combining the rankings for each microblog resource identifier produced by the first machine learned ranker with rankings generated for a plurality of network resource identifiers, wherein each network resource identifier comprises a resource identifier obtained from resources other than the received data stream of microblog posts.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. A method for identifying resources in response to a query received from a user, comprising:
- storing resource identifiers extracted from a data stream of microblog posts in a microblog resource identifier index;
- determining whether the query is a recency-sensitive query;
- responsive to determining that the query is a recency-sensitive query, including resources identified by the resource identifiers in the microblog resource identifier index among resources to be searched based on the query;
- identifying resources among the resources to be searched based on the query;
- ranking the identified resources; and
- providing one single combined list of the identified resources to the user, wherein the one single combined list is ordered based on the ranking and at least one other ranking generated for a plurality of network resource identifiers, wherein each network resource identifier comprises a resource identifier obtained from resources other than the received data stream of microblog posts.
Dependent claims: 20
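The gating step of claim 19 can be pictured with the sketch below. The claim does not say how recency sensitivity is determined, so the keyword heuristic here is purely an assumption for illustration, as are the names RECENCY_TERMS, is_recency_sensitive, and candidate_resources.

```python
# Hypothetical cue words; a real classifier could use query logs or a trained model instead.
RECENCY_TERMS = {"breaking", "latest", "today", "live", "now"}

def is_recency_sensitive(query):
    """Toy heuristic: the query contains at least one recency cue word."""
    return any(term in RECENCY_TERMS for term in query.lower().split())

def candidate_resources(query, crawled_index, microblog_index):
    """Return the resources to search; microblog URLs are added only for recency-sensitive queries."""
    candidates = list(crawled_index)
    if is_recency_sensitive(query):
        candidates += [url for url in microblog_index if url not in crawled_index]
    return candidates

# Made-up sample indexes.
crawled = ["http://example.org/a"]
microblog = ["http://example.com/b", "http://example.org/a"]
fresh = candidate_resources("latest quake news", crawled, microblog)
stable = candidate_resources("history of rome", crawled, microblog)
```

With these inputs, only the query containing "latest" pulls the microblog-only URL into the candidate set.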
21. A method for identifying and ranking resources in response to a query received from a user comprising:
- selecting a first group of resources from among a plurality of resources represented in a first index based on the query, wherein the first index is created by crawling a network of nodes that store resources;
- ranking the first group of resources to generate a first ranked list of resources;
- selecting a second group of resources from among a plurality of resources represented in a second index based on the query, wherein the second index is created by monitoring a received data stream of microblog posts to identify resource identifiers included in the microblog posts, and the second group of resources are different from the first group of resources;
- ranking the second group of resources to generate a second ranked list of resources;
- combining the first and second ranked list of resources to generate one single combined ranked list of resources; and
- returning the one single combined ranked list of resources to the user.
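Claim 21 leaves the combining strategy open. One simple possibility, shown here only as an assumption for illustration, is a round-robin interleave that preserves the order within each ranked list and needs no comparable scores. The function name interleave is hypothetical.

```python
from itertools import zip_longest

def interleave(first_ranked, second_ranked):
    """Alternate items from two ranked lists, keeping each list's internal order."""
    combined = []
    for a, b in zip_longest(first_ranked, second_ranked):
        if a is not None:
            combined.append(a)
        if b is not None:
            combined.append(b)
    return combined

# Made-up ranked lists: one from a crawled index, one from a microblog index.
crawl_ranked = ["w1", "w2", "w3"]
microblog_ranked = ["m1"]
merged = interleave(crawl_ranked, microblog_ranked)
```

Because the claim requires the two groups to be different, this sketch does no de-duplication. It also assumes that None never appears as a list item, since None marks the end of the shorter list.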
Specification