Method and apparatus for intranet searching
First Claim
1. A method for creating data for use in ranking results of a search of a plurality of pages in an intranet, each page comprising a plurality of terms arranged in a hierarchical structure, the method comprising acts of:
(A) parsing the plurality of pages to identify the plurality of terms, wherein at least one page of the plurality of pages comprises a plurality of instances of a term of the plurality of terms;
(B) automatically computing, for each instance of each identified term and for each page in the plurality of pages in which the identified term appears, a specificity value of the instance, the specificity value being computed, for each instance of the identified term on the page, based upon a combination of a tag parameter associated with the instance and a term level of the instance within the hierarchical structure of the page, the term level being measured from at least one root level of the page;
(C) the act of combining, for each identified term and for each page in the plurality of pages in which the identified term appears, the computed specificity values of the instances of the identified term within the page to produce a combined computed specificity value of the identified term;
(D) storing, for each identified term and for each page in the plurality of pages in which the identified term appears, the combined computed specificity value of the identified term and an identifier for the page; and
(E) ranking a plurality of matching pages in the plurality of pages, the matching pages returned in response to a search query performed on the plurality of pages, the ranking based on a comparison, for each matching page, between the combined computed specificity value stored for an identified term matching a term of the query and a degree of specificity associated with the query, wherein the matching pages are identified as including an identified term matching a term of the query based on the identifier for the page stored for the identified term.
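Acts (A) through (C) can be illustrated with a minimal sketch, assuming HTML pages. The claim does not give a formula, so everything numeric here is an assumption: the `TAG_WEIGHTS` table stands in for the tag parameter, the term level is taken as the number of enclosing tags counted from the root tag, each instance's specificity is the product of the two, and the instances on a page are combined by averaging. The names `SpecificityParser` and `combined_specificity` are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# Hypothetical tag parameters; the claim only says that one exists per instance.
TAG_WEIGHTS = {"title": 1, "h1": 2, "h2": 3}
DEFAULT_WEIGHT = 5
# Tags that never receive a closing tag and so must not be pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class SpecificityParser(HTMLParser):
    """Collects one (term, specificity value) pair per term instance."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, root tag first
        self.instances = []  # (term, specificity value) for each instance

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if not self.stack:
            return
        level = len(self.stack)  # assumed term level, root tag counts as 1
        weight = TAG_WEIGHTS.get(self.stack[-1], DEFAULT_WEIGHT)
        for term in re.findall(r"[a-z0-9]+", data.lower()):
            # Assumed combination of tag parameter and term level: a product.
            self.instances.append((term, weight * level))


def combined_specificity(html):
    """Returns {term: combined specificity value} for one page (mean of instances)."""
    parser = SpecificityParser()
    parser.feed(html)
    parser.close()
    per_term = defaultdict(list)
    for term, value in parser.instances:
        per_term[term].append(value)
    return {term: sum(values) / len(values) for term, values in per_term.items()}
```

Under these assumptions a term that appears only in a title scores low (general), while a term that appears only in body paragraphs scores high (specific). A term that appears in both lands in between.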
Abstract
A search system for finding relevant information on an intranet in response to a user query. Separate from the query, the system crawls the intranet to select at least some of the pages in the intranet. The system determines, for items in a selected page, an indication of the specificity or generality of that information based on structural information, such as display formatting or a number of links in a shortest path from a root page to the identified page. The system can also determine the level of specificity of a search query. When a user provides a query to request information over the intranet, pages matching the query can be ranked by comparing the level of specificity or generality of the query to the level of specificity or generality of the terms that match the query.
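The "number of links in a shortest path from a root page" mentioned in the abstract can be computed with a breadth-first search over the crawled link graph. The sketch below is illustrative only: the function name `link_depths`, the dictionary representation of the link graph, and the page names in the usage note are assumptions, not details from the patent.

```python
from collections import deque


def link_depths(links, root):
    """Maps each reachable page to the fewest links needed to reach it from root.

    links: dict mapping a page identifier to an iterable of pages it links to.
    """
    depths = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths
```

For a hypothetical intranet where `home` links to `hr` and `it`, and both of those link to `payroll`, the `payroll` page would receive a depth of 2. Pages that cannot be reached from the root are simply absent from the result.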
23 Citations
19 Claims
1. A method for creating data for use in ranking results of a search of a plurality of pages in an intranet, each page comprising a plurality of terms arranged in a hierarchical structure, the method comprising acts of:
(A) parsing the plurality of pages to identify the plurality of terms, wherein at least one page of the plurality of pages comprises a plurality of instances of a term of the plurality of terms;
(B) automatically computing, for each instance of each identified term and for each page in the plurality of pages in which the identified term appears, a specificity value of the instance, the specificity value being computed, for each instance of the identified term on the page, based upon a combination of a tag parameter associated with the instance and a term level of the instance within the hierarchical structure of the page, the term level being measured from at least one root level of the page;
(C) the act of combining, for each identified term and for each page in the plurality of pages in which the identified term appears, the computed specificity values of the instances of the identified term within the page to produce a combined computed specificity value of the identified term;
(D) storing, for each identified term and for each page in the plurality of pages in which the identified term appears, the combined computed specificity value of the identified term and an identifier for the page; and
(E) ranking a plurality of matching pages in the plurality of pages, the matching pages returned in response to a search query performed on the plurality of pages, the ranking based on a comparison, for each matching page, between the combined computed specificity value stored for an identified term matching a term of the query and a degree of specificity associated with the query, wherein the matching pages are identified as including an identified term matching a term of the query based on the identifier for the page stored for the identified term.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 10, 19
9. The method of claim 1, wherein the act (B) further comprises computing the specificity value of the instance of the identified term based, at least in part, on a specificity value determined for each matching page.
11. A method for creating a term index for use in ranking results of a search of a plurality of pages in an intranet, each page comprising a plurality of terms arranged in a hierarchical structure, the method comprising acts of:
(A) parsing the plurality of pages to identify the plurality of terms, wherein at least one page of the plurality of pages comprises a plurality of instances of a term of the plurality of terms;
(B) for each of the plurality of pages:
(i) automatically computing, for each instance of each identified term, a specificity value of the instance based upon a combination of a tag parameter associated with the instance and a term level of the instance within the hierarchical structure of the page, the term level being measured from a root tag of the page;
(ii) combining, for each identified term, the computed specificity value of the instances of the identified term within the page to produce a combined specificity value for the identified term; and
(iii) storing in the term index for each identified term, the page in which the identified term appears, and information identifying the identified term in conjunction with information indicating the combined specificity value for the identified term.
Dependent claims: 12, 13, 16, 17, 18
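A term index of the kind recited in act (B)(iii) might be laid out as a mapping from each term to the pages containing it, with the combined specificity value stored beside each page identifier. The sketch below assumes the per-page combined values have already been computed. The names `build_term_index` and `rank_pages` are hypothetical, and ordering pages by the absolute difference between the stored value and a numeric query specificity is only one assumed reading of the "comparison" in claim 1, act (E).

```python
from collections import defaultdict


def build_term_index(page_values):
    """Builds {term: {page_id: combined specificity value}}.

    page_values: dict mapping a page identifier to {term: combined value}.
    """
    index = defaultdict(dict)
    for page_id, values in page_values.items():
        for term, value in values.items():
            index[term][page_id] = value
    return dict(index)


def rank_pages(index, term, query_specificity):
    """Orders the pages containing term by closeness to the query's specificity.

    Assumption: closeness is the absolute difference of the two numbers;
    ties are broken by page identifier so the order is deterministic.
    """
    postings = index.get(term, {})
    return sorted(
        postings,
        key=lambda page_id: (abs(postings[page_id] - query_specificity), page_id),
    )
```

Keying the index by term means the matching pages for a query term are found with a single lookup, and the stored page identifiers are what tie each specificity value back to a page.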
14. A computer storage medium encoded with a program for execution on at least one processor, the program, when executed on the at least one processor, performing a method for creating data for use in ranking results of a search of a plurality of pages in an intranet, each page comprising a plurality of terms arranged in a hierarchical structure, the method comprising:
(A) the act of parsing the plurality of pages to identify the plurality of terms, wherein at least one page of the plurality of pages comprises a plurality of instances of a term of the plurality of terms;
(B) the act of computing, for each instance of each identified term within the plurality of pages, a specificity value of the instance based on a combination of a tag parameter associated with the instance and a term level of the instance within the hierarchical structure of the page of the instance, the term level being measured from a root tag of the page of the instance;
(C) the act of combining, for each identified term and for each page in the plurality of pages in which the identified term appears, the computed specificity value of each instance of the identified term within the page to produce a combined computed specificity value of the identified term; and
(D) the act of storing information, for each identified term and for each page in the plurality of pages in which the identified term appears, the combined computed specificity value of the identified term and an identifier for the page.
Dependent claims: 15
Specification