Ranking search results using language types
Abstract
Search results of a search query on a network are ranked according to an additional ranking function for the prior probability of relevance of a document based on a property of the document. The ranking function can be adjusted based on a comparison of the language that a document is written in and the language that is associated with a search query. Both query-independent values and query-dependent values can be used to rank the document.
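The language comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the ranking function is adjusted, so the multiplicative `mismatch_penalty`, the `ScoredDoc` type, and the function names below are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredDoc:
    doc_id: str
    base_score: float  # relevance estimate from the other ranking features
    language: str      # language type associated with the document, e.g. "en"


def language_adjusted_score(doc, query_language, mismatch_penalty=0.5):
    """Keep the base score on a language match, scale it down otherwise.

    The multiplicative penalty is an assumption; the source only states that
    the ranking can be adjusted based on the language comparison.
    """
    if doc.language == query_language:
        return doc.base_score
    return doc.base_score * mismatch_penalty


def rank(docs, query_language):
    """Order documents by their language-adjusted score, best first."""
    return sorted(
        docs,
        key=lambda d: language_adjusted_score(d, query_language),
        reverse=True,
    )


docs = [
    ScoredDoc("a", 2.0, "de"),
    ScoredDoc("b", 1.5, "en"),
    ScoredDoc("c", 0.5, "en"),
]
ranked_ids = [d.doc_id for d in rank(docs, "en")]
```

With an English query, document "a" has the highest base score but is written in German, so under this assumed penalty it falls below document "b".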
214 Citations
20 Claims
1. A computer-implemented method for ranking search results, comprising:
determining a first property associated with each document in a collection of documents;
wherein the first property is a language type associated with the document that identifies a language of the document;
wherein the language of the document is determined by performing a statistical analysis of a character distribution in the document and comparing it to a trained language character distribution;
storing an identified language for each of the documents when it is determined that the identified language is not a default language in a language storage that is a query independent rank (QIR) storage that is separate from a QIR storage that stores other values used at query time;
determining a query language of a search query;
estimating a ranking value corresponding to properties for each document, wherein the ranking value corresponds to a measure of the relevance of each document based on the search query;
ranking each document that is responsive to the search query to obtain the search results, wherein each document is ranked based on the estimated ranking value and a comparison of the query language with the first property value;
ranking the documents according to a scoring function (score) that is determined according to at least:
a computed click distance (CD), a weight of a query-independent component (wcd), a weight of the click distance (bcd), a weight of a URL depth (bud), the URL depth (UD) and a click distance saturation constant (Kcd); and
using the ranking of the documents to display the search results.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
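The character-distribution step recited in claim 1 can be sketched in Python as follows. The claim does not name a comparison measure or a training corpus, so the cosine similarity, the two tiny training sentences, and all function names here are assumptions for illustration only.

```python
import math
from collections import Counter


def char_distribution(text):
    """Relative frequency of each alphabetic character in the text."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    total = len(letters)
    if total == 0:
        return {}
    return {ch: count / total for ch, count in Counter(letters).items()}


def cosine_similarity(p, q):
    """Cosine of the angle between two sparse frequency vectors."""
    dot = sum(value * q.get(ch, 0.0) for ch, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical training text; a real profile would be trained on far more data.
TRAINING_TEXT = {
    "en": "the quick brown fox jumps over the lazy dog and runs through the forest",
    "ru": "съешь же ещё этих мягких французских булок да выпей чаю",
}
TRAINED = {lang: char_distribution(text) for lang, text in TRAINING_TEXT.items()}


def detect_language(text):
    """Return the trained language whose distribution is closest to the text."""
    doc = char_distribution(text)
    return max(TRAINED, key=lambda lang: cosine_similarity(doc, TRAINED[lang]))
```

For example, `detect_language("search results are ranked by relevance")` selects the English profile, because the text shares no characters with the Russian training sample.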
14. A system for ranking search results, comprising:
a processor;
a search engine included on a computing device, the search engine configured to execute computer-executable instructions, the computer-executable instructions comprising:
determining a first property associated with each document in a collection of documents;
wherein the first property is a language type that identifies a language of the document, wherein the language type for each of the documents is only stored in a separate QIR (Query Independent Rank) storage when the language type is not a default language from another QIR storage used for storing values that can be used at query time for searching each document;
determining a query language of a search query;
estimating a ranking value corresponding to properties for each document, wherein the ranking value corresponds to a measure of the relevance of each document based on the search query; and
ranking each document that is responsive to the search query to obtain the search results, wherein each document is ranked based on the estimated ranking value and a comparison of the query language with the first property value;
ranking the documents according to a scoring function (score) that is determined according to at least:
a computed click distance (CD), a weight of a query-independent component (wcd), a weight of the click distance (bcd), a weight of a URL depth (bud), the URL depth (UD), a click distance saturation constant (Kcd), a weighted term frequency (wtf), a weighted document length (wdl), an average weighted document length (avwdl), a number of documents on the network (N); a number of documents containing a query term (n); and
using the ranking of the documents to display the search results.
Dependent claims: 15, 16
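Claim 14 lists the inputs of the scoring function but does not give its formula. The Python sketch below shows one possible way those inputs could combine, assuming a BM25-style query-dependent sum plus a saturating query-independent term. The formula itself, and the extra parameters `k1` and `b`, are assumptions and are not recited in the claim.

```python
import math


def query_dependent_score(terms, wdl, avwdl, N, k1=1.2, b=0.75):
    """BM25-style sum over query terms (assumed form).

    terms is an iterable of (wtf, n) pairs: the weighted term frequency in the
    document and the number of documents containing that term.
    """
    total = 0.0
    for wtf, n in terms:
        # Normalize term frequency by document length relative to the average.
        norm_tf = wtf / ((1.0 - b) + b * wdl / avwdl)
        total += norm_tf * (k1 + 1.0) / (k1 + norm_tf) * math.log(N / n)
    return total


def query_independent_score(CD, UD, wcd, bcd, bud, Kcd):
    """Saturating term that decays as click distance and URL depth grow (assumed form)."""
    combined = (bcd * CD + bud * UD) / (bcd + bud)
    return wcd * Kcd / (Kcd + combined)


def score(terms, wdl, avwdl, N, CD, UD, wcd, bcd, bud, Kcd, k1=1.2, b=0.75):
    """Total score: query-dependent part plus query-independent part."""
    return (query_dependent_score(terms, wdl, avwdl, N, k1, b)
            + query_independent_score(CD, UD, wcd, bcd, bud, Kcd))
```

Under these assumptions, a document one click from an authoritative page scores higher than an otherwise identical document five clicks away, and Kcd controls how quickly that advantage saturates.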
17. A computer-readable storage medium that includes computer-executable instructions for ranking search results, the computer-executable instructions comprising:
determining a first property associated with each document in a collection of documents;
wherein the first property is a language type associated with the document;
wherein the collection of documents comprises documents of a default language and documents not of a default language;
wherein the language type identifies a language of the document and is only stored in a language QIR (Query Independent Rank) storage when the language type is not the default language;
wherein a separate QIR storage from the language QIR storage is used for storing values that can be used at query time for searching each document;
determining a query language of a search query;
estimating a ranking value corresponding to properties for each document, wherein the ranking value corresponds to a measure of the relevance of each document based on the search query;
ranking each document that is responsive to the search query to obtain the search results, wherein each document is ranked based on the estimated ranking value and a comparison of the query language with the first property value; and
using the ranking of the documents to display the search results.
Dependent claims: 18, 19, 20
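The storage arrangement in claim 17, where a language is recorded only when it differs from the default, can be sketched in Python as a sparse map with a fallback. The class name `LanguageStore`, the choice of "en" as the default, and the dictionary standing in for the other QIR storage are hypothetical.

```python
DEFAULT_LANGUAGE = "en"  # assumed default; the claim does not name one


class LanguageStore:
    """Sparse language storage: holds entries only for non-default languages."""

    def __init__(self, default_language=DEFAULT_LANGUAGE):
        self.default_language = default_language
        self._non_default = {}

    def record(self, doc_id, language):
        """Store the language only if it is not the default language."""
        if language == self.default_language:
            # Nothing is stored for default-language documents.
            self._non_default.pop(doc_id, None)
        else:
            self._non_default[doc_id] = language

    def lookup(self, doc_id):
        """A missing entry means the document is in the default language."""
        return self._non_default.get(doc_id, self.default_language)

    def __len__(self):
        return len(self._non_default)


# Separate store for the other query-independent values used at query time.
qir_values = {
    "doc1": {"click_distance": 2, "url_depth": 1},
    "doc2": {"click_distance": 4, "url_depth": 3},
}

languages = LanguageStore()
languages.record("doc1", "en")  # default language: no entry is created
languages.record("doc2", "fr")  # non-default language: entry is created
```

When most documents in a collection are in the default language, this layout keeps the language storage small while lookups still return a language for every document.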
Specification