Method for providing a substitute for a requested inaccessible object by identifying substantially similar objects using weights corresponding to object features
Abstract
Weights are assigned to features of data objects and the weights are utilized to determine whether data objects are substantially identical or not. One application of such weights is to assign weights to terms in web page documents. The weights assigned to the terms may then be utilized to determine whether web page documents are substantially identical. A set of identicals may be generated for each web page that is indexed by the system and utilized to repair broken hyperlinks. Specifically, when a uniform resource locator (URL) associated with the hyperlink cannot be resolved or cannot be resolved in a timely fashion, one of the identicals of the desired web page documents may be returned to provide a requesting party with the desired content.
47 Claims
1. In a computer system, a method comprising computer-implemented steps of:
obtaining a first plurality of weights for a first plurality of features of a first data object and a second plurality of weights for a second plurality of features of a second data object, each of said weights being associated with a corresponding one of the features in the first or second pluralities of features, wherein said each weight quantitatively specifies an amount by which the corresponding one feature distinguishes an associated data object from other data objects contained within a collection of data objects;
comparing a first weight for a selected one of the features of the first data object with a second weight for the selected one feature of the second data object so as to determine whether the first and second data objects are substantially identical; and
if the first and second data objects are substantially identical, accessing the second data object, in response to a request emanating from a requesting party for the first data object, when the first data object cannot be accessed in a predefined manner by the requesting party.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
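The comparing step of claim 1 might be sketched as below. The claim does not say how the two weights are compared, so the absolute-difference test and the `tolerance` value are assumptions made for illustration; the function name and sample weights are likewise hypothetical.

```python
from typing import Dict, Iterable


def substantially_identical(
    weights_a: Dict[str, float],
    weights_b: Dict[str, float],
    selected_features: Iterable[str],
    tolerance: float = 0.05,  # assumed threshold, not taken from the claims
) -> bool:
    """Compare per-feature weights of two data objects.

    The objects are treated as substantially identical when, for every
    selected feature, the two weights differ by at most `tolerance`.
    A feature missing from an object is treated as having weight 0.0.
    """
    return all(
        abs(weights_a.get(f, 0.0) - weights_b.get(f, 0.0)) <= tolerance
        for f in selected_features
    )


# Hypothetical weights for three data objects.
page_a = {"crawler": 0.62, "hyperlink": 0.41}
page_b = {"crawler": 0.60, "hyperlink": 0.43}
page_c = {"crawler": 0.10, "recipe": 0.80}

same_ab = substantially_identical(page_a, page_b, ["crawler", "hyperlink"])
same_ac = substantially_identical(page_a, page_c, ["crawler", "hyperlink"])
print(same_ab, same_ac)
```

Comparing only a few selected features keeps the check cheap, at the cost of occasionally matching objects that differ elsewhere; how many features to select is a design choice this sketch does not settle.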
10. In a computer system having access to web pages, a method comprising computer-implemented steps of:
obtaining a first plurality of weights for a first plurality of terms of a first web page and a second plurality of weights for a second plurality of terms of a second web page, each of said weights being associated with a corresponding one of the terms in the first or second pluralities of terms, wherein said each weight quantitatively specifies an amount by which the corresponding one term distinguishes an associated web page from other web pages contained within a collection of web pages;
comparing a first weight for a selected one of the terms of the first web page with a second weight for the selected one term of the second web page so as to determine whether the first and second web pages are substantially identical; and
if the first and second web pages are substantially identical, accessing the second web page, in response to a request emanating from a requesting party for the first web page, when the first web page cannot be accessed in a predefined manner by the requesting party.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18, 19
20. In a computer system, a method comprising the steps of:
providing a hypertext document having a hyperlink to a first web page holding media content;
in response to a party selecting the hyperlink, attempting to access the first web page so as to define an access attempt; and
if the access attempt is unsuccessful, identifying a second web page, holding substantially identical media content to the media content in said first web page, through the steps of:
  obtaining a first plurality of weights for a first plurality of features of the media content in the first web page and a second plurality of weights for a second plurality of features of the media content in the second web page, each of said weights being associated with a corresponding feature, in the first or second pluralities of features, of the media content in either the first or second web pages, wherein said each weight quantitatively specifies an amount by which the corresponding one feature distinguishes the associated media content in either the first or second web pages from media content contained within a collection of other web pages; and
  comparing a first weight for a selected one of the features of the media content in the first web page with a second weight for the selected one feature of the media content in the second web page so as to determine whether the media content in the first and second web pages is substantially identical; and
where the access attempt is unsuccessful and the media content in the first and second web pages is substantially identical:
  accessing the media content in the second web page; and
  returning the media content in the second web page to the party which selected the hyperlink to the first web page.
Dependent claims: 21, 22, 23, 24
25. In a computer system, a method comprising the computer-implemented steps of:
calculating first and second word weights for a selected term within a first document and for the selected term within a second document, respectively, by:
  (i) calculating a collection frequency component that identifies how often the selected term appears within documents in a collection of documents;
  (ii) calculating first and second term frequency components for the first and second documents equal to a number of times that the selected term appears within the first and second documents, respectively; and
  (iii) calculating first and second products of the collection frequency component and the first and second term frequency components, respectively, and normalizing the first and second products to respectively produce the first and second word weights;
comparing the first word weight with the second word weight so as to determine whether the first and second documents are substantially identical; and
if the first and second documents are substantially identical, accessing the second document, in response to a request emanating from a requesting party to access the first document, when the first document can not be accessed in a predefined manner by the requesting party.
Dependent claims: 26, 27
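Steps (i) to (iii) of claim 25 can be sketched as a short word-weight calculation. The claim does not give formulas, so two choices here are assumptions: the collection frequency component is taken to be log(N / n), where N is the number of documents in the collection and n the number containing the term, and normalization is taken to be division by the Euclidean length of the document's weight vector. The function name and sample documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def word_weights(doc: List[str], collection: List[List[str]]) -> Dict[str, float]:
    """Normalized (term frequency * collection frequency) weights for `doc`."""
    n_docs = len(collection)
    tf = Counter(doc)  # (ii) term frequency: occurrences of each term in doc
    raw: Dict[str, float] = {}
    for term, count in tf.items():
        # (i) collection frequency component, assumed here to be log(N / n)
        n_containing = sum(1 for d in collection if term in d)
        cf = math.log(n_docs / n_containing) if n_containing else 0.0
        raw[term] = count * cf  # (iii) product of the two components
    # Normalization, assumed here to be division by the Euclidean length
    norm = math.sqrt(sum(w * w for w in raw.values()))
    return {t: (w / norm if norm else 0.0) for t, w in raw.items()}


doc1 = "broken link repair link".split()
doc2 = "broken link repair link".split()  # a mirror of doc1
doc3 = "cooking recipe link".split()
collection = [doc1, doc2, doc3]

w1 = word_weights(doc1, collection)
w2 = word_weights(doc2, collection)
print(w1["broken"], w2["broken"], w1["link"])
```

In this toy collection the term "link" appears in every document, so its weight is zero and it does nothing to distinguish one document from another, while "broken" receives the same weight in both mirrored documents.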
28. For use with a computer system, a computer-readable medium storing computer-executable instructions for performing a method comprising the computer-implemented steps of:
obtaining a first plurality of weights for a first plurality of features of a first data object and a second plurality of weights for a second plurality of features of a second data object, each of said weights being associated with a corresponding one of the features in the first and second pluralities of features, wherein said each weight quantitatively specifies an amount by which the corresponding one feature distinguishes an associated data object from other data objects contained within a collection of data objects;
comparing a first weight for a selected one of the features of the first data object with a second weight for the selected one feature of the second data object so as to determine whether the first and second data objects are substantially identical; and
if the first and second data objects are substantially identical, accessing the second data object, in response to a request emanating from a requesting party for the first data object, when the first data object cannot be accessed in a predefined manner by the requesting party.
Dependent claims: 29, 30, 31, 32, 33
34. For use with a computer system having access to web pages, a computer-readable medium storing computer-executable instructions for performing a method comprising the computer-implemented steps of:
obtaining a first plurality of weights for a first plurality of terms of a first web page and a second plurality of weights for a second plurality of terms of a second web page, each of said weights being associated with a corresponding one of the terms in the first or second pluralities of terms, wherein said each weight quantitatively specifies an amount by which the corresponding one term distinguishes an associated web page from other web pages contained within a collection of web pages;
comparing a first weight for a selected one of the terms of the first web page with a second weight for the selected one term of the second web page so as to determine whether the first and second web pages are substantially identical; and
if the first and second web pages are substantially identical, accessing the second web page, in response to a request emanating from a requesting party for the first web page, when the first web page cannot be accessed in a predefined manner by the requesting party.
Dependent claims: 35, 36, 37, 38, 39, 40, 41
42. For use with a computer system, a computer-readable medium storing computer-executable instructions for performing a method comprising the computer-implemented steps of:
providing a hypertext document having a hyperlink to a first web page holding media content;
in response to a party selecting the hyperlink, attempting to access the first web page so as to define an access attempt; and
if the access attempt is unsuccessful, identifying a second web page, holding substantially identical media content to the media content in said first web page, through the steps of:
  obtaining a first plurality of weights for a first plurality of features of the media content in the first web page and a second plurality of weights for a second plurality of features of the media content in the second web page, each of said weights being associated with a corresponding feature, in the first or second pluralities of features, of the media content in either the first or second web pages, wherein said each weight quantitatively specifies an amount by which the corresponding one feature distinguishes the associated media content in either the first or second web page from media content contained within a collection of other web pages; and
  comparing a first weight for a selected one of the features of the media content in the first web page with a second weight for the selected one feature of the media content in the second web page so as to determine whether the media content in the first and second web pages is substantially identical; and
where the access attempt is unsuccessful and the media content in the first and second web pages is substantially identical:
  accessing the media content in the second web page; and
  returning the media content in the second web page to the party which selected the hyperlink to the first web page.
Dependent claims: 43, 44
45. For use with a computer system, a computer-readable medium storing computer-executable instructions for performing a method comprising the computer-implemented steps of:
calculating first and second word weights for a selected term within a first document and for the selected term within a second document, respectively, by:
  (i) calculating a collection frequency component that identifies how often the selected term appears within documents in a collection of documents;
  (ii) calculating first and second term frequency components for the first and second documents equal to a number of times that the selected term appears within the first and second documents, respectively; and
  (iii) calculating first and second products of the collection frequency component and the first and second term frequency components, respectively, and normalizing the first and second products to respectively produce the first and second word weights;
comparing the first word weight with the second word weight so as to determine whether the first and second documents are substantially identical; and
if the first and second documents are substantially identical, accessing the second document, in response to a request emanating from a requesting party to access the first document, when the first document can not be accessed in a predefined manner by the requesting party.
Dependent claims: 46, 47
Specification