Systems and methods of retrieving topic specific information
Abstract
The present invention provides systems and methods for searching for web pages relevant to a specific topic based on the quality of individual pages. The rank of a page for a keyword may be a combination of an analytic rank and an editorial rank. The analytic rank of a page may be calculated by combining intrinsic and extrinsic ranks. Intrinsic rank is a measure of the relevancy of a page to a given keyword as claimed by the author of the page, while extrinsic rank is a measure of the relevancy of the page to that keyword as indicated by other pages. The former may be obtained from an analysis of keyword matching in various parts of the page, while the latter is obtained from context-sensitive connectivity analysis of the link structure of the entire Internet. Methods are described for solving the self-consistent equation satisfied by the page-weights and site-weights efficiently by iteration. The ranking mechanism for multi-word queries is also described.
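The abstract refers to a self-consistent equation for page-weights that is solved iteratively, but it does not state the equation itself. The following is a minimal sketch of a generic fixed-point iteration over a link graph, in the style of a damped random-walk weight. The function name `iterate_page_weights`, the damping factor, and the update rule are all assumptions made for illustration; the sketch ignores keyword context and site-weights, so it should not be read as the patented method.

```python
def iterate_page_weights(links, damping=0.85, tol=1e-10, max_iter=200):
    """Iterate a link-based weight equation until it is self-consistent.

    links maps each page to the list of pages it links to.
    The update rule here is a hypothetical stand-in, not the patent's equation.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    weights = {p: 1.0 / n for p in pages}  # uniform starting guess

    for _ in range(max_iter):
        # Every page starts each round with a small baseline weight.
        new = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = links.get(src, [])
            if targets:
                # A page passes its weight on equally to the pages it links to.
                share = damping * weights[src] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outbound links spreads its weight over all pages.
                share = damping * weights[src] / n
                for p in pages:
                    new[p] += share
        change = sum(abs(new[p] - weights[p]) for p in pages)
        weights = new
        if change < tol:  # weights reproduce themselves: self-consistent
            break
    return weights


example_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
example_weights = iterate_page_weights(example_links)
```

In this toy graph, page `c` is linked from two pages and ends up with the largest weight, while `b`, which nothing links to, keeps only the baseline.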
141 Citations
36 Claims
1. A computer-implemented method of ranking the relevancy of a collection of hypertext pages to a topic specific keyword-based query, comprising:
calculating an analytic rank of a page;
calculating an editorial rank of the page; and
calculating a rank of the page by combining the analytic rank and the editorial rank.
Dependent claims: 2-25.
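The three steps of claim 1 can be sketched as follows. The claim does not say how the ranks are combined, so the weighted average used here, the mixing parameters `alpha` and `beta`, and the function names are all hypothetical choices made only to show the data flow from intrinsic and extrinsic ranks to a final rank.

```python
def combine(first, second, weight):
    """Weighted average of two scores; an assumed combining rule."""
    return weight * first + (1.0 - weight) * second


def analytic_rank(intrinsic, extrinsic, alpha=0.5):
    """Combine the author-claimed (intrinsic) and link-indicated (extrinsic) ranks."""
    return combine(intrinsic, extrinsic, alpha)


def overall_rank(intrinsic, extrinsic, editorial, alpha=0.5, beta=0.5):
    """Combine the analytic rank with the editorial rank."""
    return combine(analytic_rank(intrinsic, extrinsic, alpha), editorial, beta)


# Hypothetical scores in [0, 1] for two pages on the same keyword:
# (intrinsic, extrinsic, editorial)
pages = {
    "page-1": (0.4, 0.8, 1.0),
    "page-2": (0.9, 0.3, 0.2),
}
ranked = sorted(pages, key=lambda name: overall_rank(*pages[name]), reverse=True)
```

With these made-up scores, the strong editorial rank of `page-1` places it ahead of `page-2`, even though `page-2` has the higher intrinsic rank.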
26. A web search engine, comprising:
a web page database;
a crawler configured to fetch pages from the Internet and store the pages in the web page database;
a URL extractor configured to extract outbound link information from the pages;
a URL management system configured to assign an identification number to a URL of each page, and store the identification number and URL pairs in the web page database and send new URLs to the crawler to be retrieved from the Internet;
a link database;
a link extractor configured to extract anchor text and link information from the pages and store them in the link database;
an index database;
an indexer configured to parse keywords from the pages and store the keyword and URL identification pairs in the index database; and
a ranker configured to rank a page based on analytic rank and editorial rank of the page.
Dependent claims: 27-35.
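The components recited in claim 26 can be pictured with a small in-memory sketch. It covers only the URL management system, the link extractor, and the indexer; the crawler and ranker are left out. The class and function names (`UrlManager`, `PageParser`, `process_page`), the use of plain Python containers as the databases, and the keyword tokenization rule are assumptions for illustration, not details taken from the claim.

```python
import re
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects (href, anchor text) pairs and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []
        self._href = None
        self._anchor = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href, self._anchor = href, []

    def handle_data(self, data):
        self.text.append(data)
        if self._href is not None:
            self._anchor.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._anchor).strip()))
            self._href = None


class UrlManager:
    """Assigns an identification number to each URL and tracks unfetched URLs."""

    def __init__(self):
        self.ids = {}
        self.fetched = set()

    def id_for(self, url):
        if url not in self.ids:
            self.ids[url] = len(self.ids)
        return self.ids[url]

    def pending(self):
        # URLs that have an ID but have not been fetched; these go to the crawler.
        return [url for url in self.ids if url not in self.fetched]


def process_page(url, html, urls, link_db, index_db):
    """Store link and keyword records for one fetched page."""
    page_id = urls.id_for(url)
    urls.fetched.add(url)
    parser = PageParser()
    parser.feed(html)
    # Link database rows: (source ID, target ID, anchor text).
    # Relative hrefs are not resolved in this sketch.
    for href, anchor in parser.links:
        link_db.append((page_id, urls.id_for(href), anchor))
    # Index database: keyword -> set of page IDs containing it.
    for word in re.findall(r"[a-z0-9]+", " ".join(parser.text).lower()):
        index_db.setdefault(word, set()).add(page_id)


urls, link_db, index_db = UrlManager(), [], {}
sample_html = '<p>Python ranking</p><a href="http://b.example/">ranking guide</a>'
process_page("http://a.example/", sample_html, urls, link_db, index_db)
```

After processing the sample page, `link_db` holds one row linking ID 0 to ID 1 with its anchor text, and `urls.pending()` lists the newly discovered URL that a crawler would fetch next.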
36. A computer readable medium having embodied thereon a program, the program being executable by a machine to perform a method for ranking the relevancy of a collection of hypertext pages to a topic specific keyword-based query, the method comprising:
calculating an analytic rank of a page;
calculating an editorial rank of the page; and
calculating a rank of the page by combining the analytic rank and the editorial rank.
Specification