Systems and methods of retrieving relevant information
First Claim
22. A computer-implemented method of ranking a collection of hypertext pages, comprising:
calculating the intrinsic rank of a page for a multi-keyword query;
calculating the extrinsic rank of the page for the multi-keyword query; and
calculating the rank of the page in the collection of hypertext pages by combining the intrinsic rank and the extrinsic rank.
Abstract
The present invention provides systems and methods for retrieving pages according to the quality of the individual pages. The rank of a page for a keyword is a combination of its intrinsic and extrinsic ranks. Intrinsic rank measures the relevancy of a page to a given keyword as claimed by the author of the page, while extrinsic rank measures the relevancy of a page to a given keyword as indicated by other pages. The former is obtained from an analysis of keyword matching in various parts of the page, while the latter is obtained from a context-sensitive connectivity analysis of the links connecting the entire Web. The present invention also provides methods to solve, iteratively and efficiently, the self-consistent equation satisfied by the page weights. A ranking mechanism for multi-word queries is also described. Finally, the present invention provides a method to obtain more relevant page weights by dividing the entire set of hypertext pages into a distinct number of groups.
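The iterative solution of a self-consistent equation for page weights, as mentioned in the abstract, can be illustrated with a fixed-point iteration. The sketch below is only an illustration under stated assumptions: the abstract does not give the equation, so a PageRank-style update is swapped in as a well-known stand-in, and the names `solve_page_weights`, `damping`, `tol`, and `max_iter` are hypothetical.

```python
from typing import List, Tuple


def solve_page_weights(
    links: List[Tuple[int, int]],
    n_pages: int,
    damping: float = 0.85,   # assumed value, not taken from the patent
    tol: float = 1e-12,
    max_iter: int = 1000,
) -> List[float]:
    """Iterate a PageRank-style self-consistent equation to a fixed point.

    `links` holds (source_page, target_page) index pairs. This is a stand-in
    equation chosen for illustration, not the patent's own formula.
    """
    out_degree = [0] * n_pages
    for src, _dst in links:
        out_degree[src] += 1

    weights = [1.0 / n_pages] * n_pages
    for _ in range(max_iter):
        # Weight held by pages with no outgoing links is spread uniformly.
        dangling = sum(w for w, d in zip(weights, out_degree) if d == 0)
        base = (1.0 - damping) / n_pages + damping * dangling / n_pages
        updated = [base] * n_pages
        for src, dst in links:
            updated[dst] += damping * weights[src] / out_degree[src]

        # Stop once successive iterates agree to within `tol`.
        delta = sum(abs(a - b) for a, b in zip(updated, weights))
        weights = updated
        if delta < tol:
            break
    return weights
```

Calling `solve_page_weights([(0, 2), (1, 2)], 3)` gives page 2, which is linked from both other pages, the largest weight; the weights sum to 1 at every iteration.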
52 Claims
22. A computer-implemented method of ranking a collection of hypertext pages, comprising:
calculating the intrinsic rank of a page for a multi-keyword query;
calculating the extrinsic rank of the page for the multi-keyword query; and
calculating the rank of the page in the collection of hypertext pages by combining the intrinsic rank and the extrinsic rank.
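The three steps of claim 22 can be sketched as follows. This is a minimal illustration, not the claimed method itself: the claim does not say how the two ranks are combined or how keywords are aggregated, so the linear blend (`mix`), the summation over query keywords, and the names `combine_rank` and `rank_pages` are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# keyword -> {page id -> precomputed score}; hypothetical data layout
ScoreTable = Dict[str, Dict[int, float]]


def combine_rank(intrinsic: float, extrinsic: float, mix: float = 0.5) -> float:
    """Blend the two ranks linearly; `mix` is an assumed weighting parameter."""
    return mix * intrinsic + (1.0 - mix) * extrinsic


def rank_pages(
    query: List[str],
    intrinsic: ScoreTable,
    extrinsic: ScoreTable,
    mix: float = 0.5,
) -> List[Tuple[int, float]]:
    """Return (page id, combined rank) pairs, best first.

    For a multi-keyword query, per-keyword combined ranks are summed
    (an assumption); a page missing from a table scores 0 for that keyword.
    """
    totals: Dict[int, float] = {}
    for keyword in query:
        intr = intrinsic.get(keyword, {})
        extr = extrinsic.get(keyword, {})
        for page in set(intr) | set(extr):
            score = combine_rank(intr.get(page, 0.0), extr.get(page, 0.0), mix)
            totals[page] = totals.get(page, 0.0) + score
    # Sort by descending rank, breaking ties by page id for determinism.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))
```

With an equal blend, a page that scores moderately on both ranks for every keyword can outrank a page that scores highly on only one rank for one keyword.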
28. A Web search engine, comprising:
a Web page database;
a crawler to fetch pages from the Web and store the pages in the Web page database;
a link extractor to extract link information from the pages;
a URL management system to assign an identification number to the URL of each page, and store the identification number and URL pairs in the Web page database and send new URLs to the crawler to be retrieved from the Web;
an anchor text and link database;
an anchor text and link extractor to extract the anchor text and the link information from the pages and store in the anchor text and link database;
an indexed database;
an indexer to parse keywords from the pages and store the keyword and URL identification pairs in the indexed database; and
a ranker to rank a page based on intrinsic rank and extrinsic rank of the page.
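Two of the components in claim 28, the URL management system and the anchor text and link extractor, can be sketched with the Python standard library. This is a simplified illustration under assumptions: the class and function names (`AnchorExtractor`, `UrlRegistry`, `extract_link_records`) are hypothetical, the databases are replaced by in-memory structures, and relative URLs are not resolved.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple


class AnchorExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from <a> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only text inside an <a href=...> element counts as anchor text.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


class UrlRegistry:
    """Assign sequential identification numbers to URLs (in-memory stand-in)."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self.pending: List[str] = []  # newly seen URLs to hand to a crawler

    def id_for(self, url: str) -> int:
        if url not in self._ids:
            self._ids[url] = len(self._ids)
            self.pending.append(url)
        return self._ids[url]


def extract_link_records(
    source_url: str, html: str, registry: UrlRegistry
) -> List[Tuple[int, int, str]]:
    """Return (source id, target id, anchor text) records for one page."""
    parser = AnchorExtractor()
    parser.feed(html)
    parser.close()
    source_id = registry.id_for(source_url)
    return [(source_id, registry.id_for(href), text) for href, text in parser.links]
```

Each record pairs a link with its anchor text, which is the kind of input a ranker would need for the link-based (extrinsic) part of a page's rank.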
35. A computer system for ranking search results from a query on a collection of hypertext pages, comprising:
a crawler to fetch pages from the collection of hypertext pages;
a link extractor to extract page locator information from the fetched pages;
a page locator management system for storing and retrieving the page locator information;
a page database to store the pages;
an indexer to parse keywords from the pages and store the keyword page locator pairs in the indexed database;
an anchor text and link extractor to extract the anchor text and link structures from the pages;
an anchor text and link database, wherein the anchor text and link extractor writes the anchor text and link structures into the anchor text and link database; and
a ranker to assign a rank value to a page based on intrinsic and extrinsic rank.
47-1. The system of claim 35, wherein the ranker obtains the rank values from the link connectivity graph.
Specification