OPTIMIZING AN INDEX OF WEB DOCUMENTS
First Claim
1. One or more computer storage media (the “media”) storing computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform a method for predicting the likelihood of retrieval of web documents during a web search, the method comprising:
receiving historical usage data related to user queries and training properties of a plurality of web pages in an index;
training a mathematical model to predict a likelihood of retrieval for the plurality of web pages based on the historical usage data and the training properties;
extracting properties from the plurality of web pages in the index;
applying the mathematical model to the properties;
calculating a sortrank value for each web page based on the mathematical model and the properties;
reordering the index based on the sortrank value for each web page;
Abstract
Historical usage data related to user queries and training properties for a plurality of web pages are received and used to train a mathematical model to predict the likelihood of retrieval of a web page during a web search. Properties are extracted from the plurality of web pages in the index, and the mathematical model is applied to the properties for each web page to calculate a sortrank value. The index is reordered based on the sortrank value such that the web pages most likely to be retrieved by a user submitting a search query appear first in the index. After a search query is received from a user, the index is traversed in an order determined by the sortrank value. Responsive web pages are presented to the user in an order determined by a search engine ranking algorithm.
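The flow described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the model is a small logistic regression fitted by gradient descent, each page carries two hypothetical numeric properties, the traversal stops after a hypothetical `limit` of matches, and the final ranking is a stand-in that counts query-term occurrences.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_model(samples, epochs=2000, lr=0.5):
    """Fit logistic-regression weights by batch gradient descent.

    samples: list of (training_properties, label), where label is 1 if the
    page was retrieved in the historical usage data and 0 otherwise.
    """
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for x, y in samples:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g / len(samples) for wi, g in zip(w, gw)]
        b -= lr * gb / len(samples)
    return w, b

def sortrank(model, props):
    """Predicted likelihood of retrieval for one page's properties."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, props)) + b)

def reorder_index(index, model):
    """Place the pages most likely to be retrieved first."""
    return sorted(index, key=lambda page: sortrank(model, page["props"]), reverse=True)

def search(sorted_index, query, limit):
    """Traverse in sortrank order, then rank the responsive pages."""
    terms = query.split()
    hits = []
    for page in sorted_index:
        words = page["text"].split()
        if all(term in words for term in terms):
            hits.append(page)
            if len(hits) == limit:  # hypothetical early stop
                break
    # Stand-in for a search engine ranking algorithm: total term occurrences.
    return sorted(
        hits,
        key=lambda p: sum(p["text"].split().count(t) for t in terms),
        reverse=True,
    )

# Hypothetical training data: (properties, retrieved-or-not).
training = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.1, 0.2), 0), ((0.2, 0.1), 0)]
model = train_model(training)

index = [
    {"url": "a", "props": (0.1, 0.1), "text": "cheap flights cheap hotels"},
    {"url": "b", "props": (0.9, 0.9), "text": "flights to paris"},
    {"url": "c", "props": (0.5, 0.6), "text": "cheap flights"},
]
sorted_index = reorder_index(index, model)  # order: b, c, a
results = search(sorted_index, "cheap flights", limit=2)
```

In this sketch the traversal order (sortrank) and the presentation order (ranking) are deliberately separate: pages `c` and `a` are found in that order during traversal, but `a` is presented first because the stand-in ranker scores it higher.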
20 Claims
7. A computer system for predicting the likelihood of retrieval of web documents during a web search, the computer system comprising a processor coupled to a computer-storage medium, the computer-storage medium having stored thereon a plurality of computer software components executable by the processor, the computer software components comprising:
an extraction component for extracting properties from a plurality of web pages in an index;
a ranking component for determining a sortrank value for each web page based on the properties; and
an indexing component for reordering the index based on the sortrank value;
15. A computerized method for predicting the likelihood of retrieval of web documents, the method comprising:
receiving historical usage data based on a frequency of web page retrieval for a sample query set;
training a mathematical model with the historical usage data and training properties of web pages to predict a likelihood of retrieval;
extracting one or more query independent properties from a plurality of web pages in an index;
determining, by the mathematical model, a sortrank value for each web page;
assigning the sortrank value to each web page based on the one or more query independent properties; and
sorting the plurality of web pages in the index based on the sortrank value.
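As an illustration of the first and third steps of claim 15, the following Python sketch derives a retrieval frequency per page from a sample query set and extracts two query-independent properties. The log format, the function names, and the chosen properties (URL path depth and word count) are assumptions made for this example; the claim does not specify them.

```python
from collections import Counter
from urllib.parse import urlparse

def retrieval_frequency(retrieval_log, sample_queries):
    """Fraction of sample queries for which each URL was retrieved.

    retrieval_log: hypothetical mapping of query -> list of retrieved URLs.
    """
    counts = Counter()
    for query in sample_queries:
        # set() so a URL counts at most once per query
        for url in set(retrieval_log.get(query, [])):
            counts[url] += 1
    return {url: n / len(sample_queries) for url, n in counts.items()}

def query_independent_properties(url, text):
    """Properties that depend only on the page, not on any query."""
    path = urlparse(url).path
    depth = len([segment for segment in path.split("/") if segment])
    return {"url_depth": depth, "word_count": len(text.split())}

# Hypothetical historical usage data for a four-query sample set.
log = {
    "cheap flights": ["http://example.com/a", "http://example.com/b"],
    "paris hotels": ["http://example.com/b"],
    "weather": [],
    "news": ["http://example.com/c"],
}
frequencies = retrieval_frequency(log, list(log))
props = query_independent_properties(
    "http://example.com/travel/paris/index.html", "flights to paris"
)
```

The frequencies could serve as training targets for the mathematical model, with the extracted properties as its inputs; how the two are actually combined is left open here.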
Specification