System and methods for automatic clustering of ranked and categorized search objects
Abstract
A search results page includes multiple search lists generated by multiple clustering operations applied to an initial match set of documents selected based on a user query. A first result list is constructed by clustering a top-n set of documents by primary domain address and sorting based on extrinsic ranking factors such that the first list includes a ranked and ordered list of primary domain linked anchor text. A second result list is constructed by clustering the top-n set of documents based on a unified ranked occurrence of keywords within the top-n set of documents. The generated second list contains a plurality of cluster class references with each of the cluster class references including a ranked ordered sub-list of the keywords occurring within the top-n set of documents and respectively associated with the cluster class reference, each of the keywords of the ranked ordered sub-lists including linking references to a corresponding one of the top-n set of documents. A third result list is constructed by clustering the top-n set of documents based on a ranked frequency of occurrence of internally linked anchor texts. The generated third result list includes the top-n set of the internally linked anchor texts and respective ranked and ordered sub-lists of linking references to primary domain Web-pages containing the corresponding one of the internally linked anchor texts.
30 Claims
1. A computer implemented method of presenting a search report identifying documents relevant to an input query text, said method comprising the steps of:
a) first determining a primary top-n set of documents corresponding to a query text, wherein said query text is provided through a user interface, wherein said first determining step is operative to match said query text against a plurality of terms stored in a database, wherein said plurality of terms correspond to anchor texts occurring within documents of an analyzed document collection, wherein said plurality of terms are associated with sets of document addresses identifying the documents of anchor text occurrence, and wherein said primary top-n set of documents correspond to those top ranked based on frequency of occurrence of the matched subset of said plurality of terms;
b) second determining a set of keywords occurring within said primary top-n set of documents, wherein said database stores a pre-established keyword ontology with keyword associated ranking values determined with respect to said analyzed document collection, and wherein said pre-established keyword ontology includes said set of keywords;
c) clustering said set of keywords into an ordered plurality of keyword lists dependent on a ranked relatedness determined by reference to said pre-established keyword ontology, said step of clustering including the iterative steps of
  i) computing a unified keyword ranking for each of said set of keywords with respect to said primary top-n set of documents and said pre-established keyword ontology keyword associated ranking values;
  ii) selecting a top-n subset of said set of keywords based on said unified keyword ranking as a keyword cluster; and
  iii) removing said top-n subset from said set of keywords and repeating said step of clustering until a predetermined number of clusters are found or exhausting said set of keywords;
d) presenting, through said user interface, said ordered plurality of keyword lists as categorized keyword lists.
Dependent claims: 2, 3, 4, 5
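The iterative clustering recited in step c) of claim 1 can be illustrated with a minimal Python sketch. The claim does not state how the unified keyword ranking is computed, so the product used in `unified_rank` is an assumption made only for illustration, and the names `doc_counts`, `ontology_rank`, and `cluster_keywords` are hypothetical.

```python
from typing import Dict, Iterable, List


def unified_rank(keyword: str,
                 doc_counts: Dict[str, int],
                 ontology_rank: Dict[str, float]) -> float:
    # ASSUMPTION: the unified ranking is the keyword's frequency in the
    # top-n documents times its ontology ranking value. The claim only
    # says that both inputs are used, not how they are combined.
    return doc_counts.get(keyword, 0) * ontology_rank.get(keyword, 0.0)


def cluster_keywords(keywords: Iterable[str],
                     doc_counts: Dict[str, int],
                     ontology_rank: Dict[str, float],
                     n: int,
                     max_clusters: int) -> List[List[str]]:
    remaining = set(keywords)
    clusters: List[List[str]] = []
    # Repeat until the cluster limit is reached or the keywords run out.
    while remaining and len(clusters) < max_clusters:
        # i) rank every remaining keyword (ties broken alphabetically)
        ranked = sorted(
            remaining,
            key=lambda k: (-unified_rank(k, doc_counts, ontology_rank), k),
        )
        # ii) the top-n keywords form the next cluster
        cluster = ranked[:n]
        clusters.append(cluster)
        # iii) remove the selected keywords before the next pass
        remaining.difference_update(cluster)
    return clusters
```

In this simplified form the ranking of a keyword does not change between passes; an implementation in which the ranking depends on the keywords still remaining would recompute it differently on each iteration.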

6. A computer implemented method of presenting a search results Web-page identifying documents of a Web-based document collection responsive to an input query text presented through a Web-based user interface, said method comprising the steps of:
a) generating a plurality of results lists responsive to an input query text presented through a Web-based user interface, wherein said plurality of results lists are derived from a top-n set of documents found by
  i) matching said input query text to a plurality of terms representing anchor text instances occurring within a Web-based document collection to obtain a list of documents containing matched instances of said plurality of terms;
  ii) ordering said list of documents based on a keyword rank value determined for each document proportional to the frequency of occurrence of predetermined keywords in an analyzed set of said Web-based document collection and the frequency of occurrence of said predetermined keywords in said document; and
  iii) selecting, based on keyword rank value, said top-n set of documents having at least a predetermined threshold keyword rank value,
wherein said plurality of lists include
  i) a top-n domains list determined by aggregation of the domains of occurrence of said top-n set of documents;
  ii) a related keywords list determined from an iterative reduction clustering of keyword occurrences within said top-n set of documents; and
  iii) a categories list determined from the set of internal link anchor texts occurring within respective domain hierarchies; and
b) compositing said plurality of results lists together in a search results Web-page for presentation through said Web-based user interface.
Dependent claims: 7, 8, 9, 10, 11
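The ordering and selection steps ii) and iii) of claim 6 can be sketched as follows. The claim says only that the keyword rank value is proportional to two frequencies, so the sum of products below is an assumed scoring formula, and `keyword_rank`, `select_top_n`, `collection_freq`, and the example addresses are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def keyword_rank(doc_tokens: List[str],
                 collection_freq: Dict[str, float]) -> float:
    # ASSUMPTION: rank = sum over the predetermined keywords of
    # (frequency in the analyzed collection) * (frequency in this document).
    counts = Counter(doc_tokens)
    return sum(collection_freq[k] * counts[k] for k in collection_freq)


def select_top_n(docs: Dict[str, List[str]],
                 collection_freq: Dict[str, float],
                 n: int,
                 threshold: float) -> List[str]:
    # Score every matched document, drop those below the threshold,
    # then keep the n highest-ranked addresses.
    scored = [(keyword_rank(tokens, collection_freq), address)
              for address, tokens in docs.items()]
    kept = [(score, address) for score, address in scored if score >= threshold]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [address for _, address in kept[:n]]
```

For example, with `collection_freq = {"solar": 0.5, "panel": 0.25}`, a document tokenized as `["solar", "solar", "panel"]` scores 1.25 under this assumed formula.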

12. A computer implemented method of producing a search results Web-page in response to the presentation of a user query, said method comprising the steps of:
a) evaluating a user query text provided through a Web-based user interface to select a top-n set of Web-page documents, wherein said Web-page documents are selected based on ranked frequency of occurrence of said user query text in said Web-page documents;
b) generating a plurality of result lists, including:
  i) a first result list constructed by a first clustering said top-n set of Web-page documents by primary domain address and sorting based on predetermined extrinsic ranking factors, said first list containing primary domain address identifying anchor text with respective linking references to said primary domain addresses;
  ii) a second result list constructed by a second clustering said top-n set of Web-page documents based on a unified ranked occurrence of predetermined keywords within said top-n set of Web-page documents, said second list containing a plurality of cluster class references with each said cluster class reference including a ranked ordered sub-list of said predetermined keywords occurring within said top-n set of Web-page documents and respectively associated with said cluster class reference, each said predetermined keywords of said ranked ordered sub-lists including linking references to a corresponding one of said top-n set of Web-page documents;
  iii) a third result list constructed by a third clustering said top-n set of Web-page documents based on a ranked frequency of occurrence of internally linked anchor texts, said third result list including a top-n set of said internally linked anchor texts and respective ranked and ordered sub-lists of linking references to primary domain Web-pages containing the corresponding one of said internally linked anchor texts; and
c) displaying said plurality of result lists together in a search results Web-page through said Web-based user interface.

13. A computer implemented method of producing a search results Web-page in response to the presentation of a user query, said method comprising the steps of:
a) deriving a plurality of keywords from an analyzed set of Web-pages dependent on a user query text presented through a user interface;
b) associating keyword values with said plurality of keywords, said keyword values being determined in relation to said analyzed set of Web-pages;
c) performing an iterative reduction clustering of said plurality of keywords based on said associated keyword values to obtain a plurality of keyword lists; and
d) displaying said plurality of keyword lists as a list set component of a search results Web-page through said user interface.
Dependent claims: 14, 15, 27, 28, 29, 30

16. A computer implemented method of producing a search results Web-page in response to the presentation of a user query, said method comprising the steps of:
a) identifying a plurality of Web-pages from an analyzed set of Web-pages as corresponding to a user query text presented through a user interface;
b) resolving a domain list corresponding to said plurality of Web-pages;
c) sorting said domain list based on predetermined criteria including the number of said plurality of Web-pages corresponding to each domain within said domain list; and
d) displaying said domain list in sorted order as a list set component of a search results Web-page through said user interface.
Dependent claims: 17, 18
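Steps b) and c) of claim 16 amount to grouping the matched pages by domain and sorting by page count. A minimal Python sketch, assuming the pages are given as absolute URLs and that page count is the only sorting criterion (the claim allows others); `sorted_domain_list` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit


def sorted_domain_list(urls: Iterable[str]) -> List[Tuple[str, int]]:
    # b) resolve each page address to its host name (the "domain list")
    hosts = (urlsplit(url).hostname for url in urls)
    counts = Counter(host for host in hosts if host is not None)
    # c) sort by number of matching pages, most first; ties alphabetical
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Addresses without a scheme have no resolvable host under `urlsplit` and are skipped here; a real system would need its own normalization of such addresses.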

19. A computer implemented method of producing a search results Web-page in response to the presentation of a user query, said method comprising the steps of:
a) identifying a plurality of Web-pages from an analyzed set of Web-pages as corresponding to a user query text presented through a user interface;
b) resolving an anchor text list from said plurality of Web-pages, wherein said anchor text list includes the anchor text of internal links occurring within said plurality of Web-pages;
c) ranking each anchor text of said anchor text list based on predetermined criteria including the frequency and relative location of occurrence in said plurality of Web-pages;
d) displaying said anchor text list in sorted order, based on relative ranking, as a list set component of a search results Web-page through said user interface.
Dependent claims: 20, 21, 22
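The ranking in step c) of claim 19 combines frequency with relative location. The sketch below is one possible reading, not the claimed formula: each internal-link occurrence is assumed to carry a position between 0.0 (top of page) and 1.0 (bottom), and contributes a hypothetical weight of `2.0 - position`, so earlier and more frequent anchor texts rank higher. The function name and input shape are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One page = list of (anchor text, relative position) for its internal links.
Page = List[Tuple[str, float]]


def rank_anchor_texts(pages: List[Page]) -> List[Tuple[str, float]]:
    scores: Dict[str, float] = defaultdict(float)
    for page in pages:
        for anchor_text, position in page:
            # ASSUMPTION: every occurrence counts once (frequency) plus a
            # bonus that shrinks as the link appears lower on the page.
            scores[anchor_text] += 2.0 - position
    # Highest score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

The input is assumed to be already restricted to internal links, which is the filtering that step b) describes.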

23. A computer implemented method of producing a search results Web-page in response to the presentation of a user query, said method comprising the steps of:
a) identifying a plurality of Web-pages from an analyzed set of Web-pages as corresponding to a user query text presented through a user interface, wherein said step of identifying selects said plurality of Web-pages dependent on matching anchor texts, occurring within Web-pages of said analyzed set of Web-pages, with predetermined portions of said user query text;
b) first resolving an anchor text list including said matched anchor texts;
c) sorting said anchor text list based on predetermined criteria including the number of said plurality of Web-pages corresponding to each anchor text within said anchor text list; and
d) displaying said anchor text list in sorted order as a list set component of a search results Web-page through said user interface.
Dependent claims: 24, 25, 26
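Steps a) through c) of claim 23 can be sketched with an index that maps each anchor text to the set of page addresses where it occurs. The whole-word, case-insensitive matching rule and the names `match_anchor_texts` and `anchor_index` are assumptions for illustration; the claim refers only to "predetermined portions" of the query text.

```python
from typing import Dict, List, Set, Tuple


def match_anchor_texts(query: str,
                       anchor_index: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    # ASSUMPTION: the "portions" of the query are its whitespace-separated
    # words, and an anchor text matches if it shares at least one word.
    terms = set(query.lower().split())
    matched = {
        anchor: pages
        for anchor, pages in anchor_index.items()
        if terms & set(anchor.lower().split())
    }
    # c) sort by the number of pages carrying each anchor text, most first
    return sorted(
        ((anchor, len(pages)) for anchor, pages in matched.items()),
        key=lambda item: (-item[1], item[0]),
    )
```

The pages identified in step a) are then the union of the page sets of the matched anchor texts.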
Specification