Page aggregation for web sites
Abstract
Disclosed are a method, a device, and a computer storage medium for determining whether two pages linked on the World Wide Web are part of the same World Wide Web site. The method involves examining and comparing the IP addresses of the servers hosting the Web pages. It can also be extended to finding other pages to which a given Web page is linked on the Web, and to determining whether a Web page of interest is part of a Web site with a desired characteristic, such as being part of an electronic commerce site.
27 Claims
1. A method for determining whether first and second linked World Wide Web pages are part of the same Web site, the method comprising:
(a) determining a multi-byte IP address of a server on which resides the first page and a multi-byte IP address of a server on which resides the second linked page, the multi-byte IP address having at least four bytes;
(b) if a leading subset of one or more of the bytes of said multi-byte IP address of said server upon which resides said first page is different than a leading subset of one or more of the bytes of said multi-byte IP address of said server upon which resides said second page, determining whether said first and second linked pages reside on at least one other combination of servers;
(c) if said first page and said second linked page reside on at least one other combination of servers, repeating (a) and (b) for said at least one other combination of servers, such that if the leading subset of the bytes of a multi-byte IP address, the multi-byte IP address having at least four bytes, of a server from said combination upon which resides said first page is identical to the leading subset of the bytes of a multi-byte IP address, the multi-byte IP address having at least four bytes, of a server from said combination upon which resides said second page, concluding that said first and second linked pages are part of said same Web site, and concluding said process; and
(d) if (b) does not yield at least one other combination of servers, concluding that said first and second linked pages are not part of said same Web site.
Dependent claims: 2, 3, 22.
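For illustration only, the comparison recited in claim 1 might be sketched as below. This is a minimal sketch, not the patented implementation: the function names `leading_bytes` and `same_site`, the default prefix length of three bytes, and the idea of passing each page's hosting servers as a list of address strings are all assumptions introduced here. The claim itself only requires "a leading subset of one or more of the bytes".

```python
from ipaddress import ip_address
from itertools import product


def leading_bytes(addr: str, n: int) -> tuple:
    """Return (IP version, first n bytes) of an address (hypothetical helper)."""
    ip = ip_address(addr)
    return ip.version, ip.packed[:n]


def same_site(first_servers, second_servers, n: int = 3) -> bool:
    """Walk every combination of servers hosting the two pages.

    first_servers / second_servers: IP addresses (strings) of the servers
    on which the first and second page reside. n is an assumed prefix
    length; the claim only asks for one or more leading bytes.
    """
    for a, b in product(first_servers, second_servers):
        # Steps (a)-(c): compare the leading bytes for this combination;
        # an identical prefix ends the process with "same site".
        if leading_bytes(a, n) == leading_bytes(b, n):
            return True
    # Step (d): no remaining combination matched.
    return False
```

Under these assumptions, `same_site(["192.0.2.10", "198.51.100.5"], ["198.51.100.9"])` fails on the first combination and succeeds on the second, which corresponds to the repeat in step (c).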
4. A device for determining whether first and second linked World Wide Web pages are part of the same Web site, the device comprising:
(a) means for determining a multi-byte IP address of a server on which resides the first page and a multi-byte IP address of a server on which resides the second page, the multi-byte IP address having at least four bytes;
(b) means for comparing said IP addresses
(i) such that an identity between a leading subset of one or more of the bytes of said multi-byte IP address of said server upon which resides said first page and a leading subset of one or more of the bytes of said multi-byte IP address of said server upon which resides said second page indicates that said first and second linked pages are part of said same Web site, and
(ii) such that a difference between said leading subset of the bytes of said multi-byte IP address of said server upon which resides said first page and said leading subset of the bytes of said multi-byte IP address of said server upon which resides said second page requires a determination of whether said first page and said second page reside on at least one other combination of servers; and
(c) means for analyzing said at least one other combination of servers on which said first page and said second page reside, such that an absence of at least one other combination of servers on which said first page and said second page reside indicates that said first page and said second page are not part of said same Web site.
Dependent claims: 5, 6, 23.
7. A non-transitory machine-accessible and readable storage medium containing a computer program having means for determining whether first and second linked World Wide Web pages are part of said same Web site, comprising:
(a) means for determining a multi-byte IP address of a server on which resides the first page and a multi-byte IP address of a server on which resides the second linked page, the multi-byte IP address having at least four bytes;
(b) means for comparing said IP addresses
(i) such that an identity between a leading subset of one or more of the bytes of said multi-byte IP address of said server upon which resides said first page and a leading subset of one or more of the bytes of said multi-byte IP address of said server upon which resides said second page indicates that said first and second linked pages are part of said same Web site, and
(ii) such that a difference between said leading subset of the bytes of said multi-byte IP address of said server upon which resides said first page and said leading subset of the bytes of said multi-byte IP address of said server upon which resides said second page requires a determination of whether said first page and second page reside on at least one other combination of servers; and
(c) means for analyzing said at least one other combination of servers on which said first page and said second page reside, such that an absence of at least one other combination of servers on which said first page and said second page reside indicates that said first page and said second page are not part of said same Web site.
Dependent claims: 8, 9, 24.
10. A method of determining whether first and second pages are associated with the same site, the method comprising:
determining by a computing device Internet Protocol (IP) addresses of a pair of servers on which reside the first and second pages,
based on a leading subset of one or more of the bytes of said IP addresses of said pair of servers being different, determining whether said first and second pages reside on one or more different pairs of servers, and
based on said first and second pages residing on one or more different pairs of servers:
determining IP addresses of each of said one or more different pairs of servers,
comparing a leading subset of one or more of the bytes of said IP addresses of each of said one or more different pairs of servers to provide a comparison result for each of said one or more different pairs of servers, and
determining that said first and second pages are associated with said same site based on at least one of said comparison results indicating that said leading subset of the bytes of said IP addresses of said one or more different pairs of servers are identical.
Dependent claims: 11, 12, 25.
13. A method of identifying pages that are associated with the same site as a starting page, said method comprising:
based on a pre-determined link processing order, processing by a computing device all of said links on said starting page and all of said links on pages associated with said same site as said starting page, wherein processing includes:
identifying on a page one or more links to one or more different pages,
determining IP addresses of a pair of servers on which reside said starting page and one of said one or more different pages,
comparing a leading subset of one or more of the bytes of said IP addresses to provide a comparison result for said pair of servers, and
determining that said starting page and said one of said one or more different pages are associated with said same site based on said comparison result for said pair of servers indicating that said leading subset of the bytes of said IP addresses of said pair of servers are identical.
Dependent claims: 14, 15, 16, 26.
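The page-aggregation loop of claim 13 could be sketched roughly as follows. This is an assumption-laden illustration rather than the disclosed method: the link graph `links` and the page-to-server lookup `server_ip` are hypothetical in-memory dictionaries standing in for real fetching and DNS resolution, breadth-first order is chosen here as one possible "pre-determined link processing order", and the three-byte prefix is again an assumed setting.

```python
from collections import deque
from ipaddress import ip_address


def shares_prefix(a: str, b: str, n: int = 3) -> bool:
    """True if both addresses have the same IP version and first n bytes."""
    x, y = ip_address(a), ip_address(b)
    return x.version == y.version and x.packed[:n] == y.packed[:n]


def pages_on_same_site(start, links, server_ip, n: int = 3) -> set:
    """Collect pages reachable from `start` that appear to share its site.

    links:     hypothetical dict, page -> list of linked pages
    server_ip: hypothetical dict, page -> IP address of its server
    """
    site = {start}
    queue = deque([start])  # FIFO queue gives a breadth-first order
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target in site or target not in server_ip:
                continue
            # Compare the starting page's server with the linked page's server.
            if shares_prefix(server_ip[start], server_ip[target], n):
                site.add(target)
                queue.append(target)  # only same-site pages are expanded
    return site
```

Links found on pages judged to be off-site are never followed in this sketch, so a page is only counted if it is reachable through a chain of same-site pages.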
17. A method of determining whether a site includes data related to a search query, said method comprising:
determining by a computing device whether a starting page that is associated with said site includes data related to said search query,
based on said starting page not including data related to said search query, processing one or more of: one or more links on said starting page and one or more links on pages determined to be associated with said same site as said starting page, wherein processing includes:
identifying on a page one or more links to one or more different pages,
determining IP addresses of a pair of servers on which reside said starting page and one of said one or more different pages,
comparing a leading subset of one or more of the bytes of said IP addresses to provide a comparison result for said pair of servers,
determining that said starting page and said one of said one or more different pages are associated with said same site based on said comparison result indicating that said leading subset of the bytes of said IP addresses of said pair of servers are identical, and
based on said starting page and said one of said one or more different pages being associated with said same site, determining whether said one of said one or more different pages includes data related to said search query, and
based on one or more pages determined to be associated with said same site as said starting page including data related to said search query, determining that said site includes data related to said search query.
Dependent claims: 18, 19, 20, 21, 27.
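A rough sketch of the query check in claim 17 is given below, under several stated assumptions. The dictionaries `text`, `links`, and `server_ip` are hypothetical stand-ins for page content, extracted links, and resolved server addresses; a case-insensitive substring test stands in for "includes data related to said search query", which the claim does not define; and the function name and three-byte prefix are illustrative only.

```python
from collections import deque
from ipaddress import ip_address


def shares_prefix(a: str, b: str, n: int = 3) -> bool:
    """True if both addresses have the same IP version and first n bytes."""
    x, y = ip_address(a), ip_address(b)
    return x.version == y.version and x.packed[:n] == y.packed[:n]


def site_matches_query(start, query, text, links, server_ip, n: int = 3) -> bool:
    """Decide whether the site containing `start` has a page matching `query`."""
    def matches(page):
        # Assumed relevance test: case-insensitive substring match.
        return query.lower() in text.get(page, "").lower()

    if matches(start):
        return True  # the starting page already includes related data
    site = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target in site or target not in server_ip:
                continue
            if not shares_prefix(server_ip[start], server_ip[target], n):
                continue  # different leading bytes: treated as another site
            if matches(target):
                return True  # a same-site page includes related data
            site.add(target)
            queue.append(target)
    return False
```

The search stops at the first same-site page that matches, so pages hosted under a different address prefix never contribute to the result even if they contain the query text.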
Specification