Page aggregation for Web sites
Abstract
Disclosed are a method, a device, and a computer storage medium for determining whether two linked pages on the World Wide Web are part of the same Web site. The method examines and compares the IP addresses of the servers on which the Web pages reside. It can also be extended to finding other pages to which a given Web page is linked, and to determining whether a Web page of interest is part of a Web site with a desired characteristic, such as an electronic commerce site.
136 Citations
54 Claims
1-33. (canceled)
34. A method for determining whether first and second linked World Wide Web pages are part of the same Web site, the method comprising:
(a) determining a four-byte IP address of a server on which resides the first page and a four-byte IP address of a server on which resides the second linked page;
(b) if the first three bytes of said four-byte IP address of said server upon which resides said first page is different than the first three bytes of said four-byte IP address of said server upon which resides said second page, determining whether said first and second linked pages reside on at least one other combination of servers;
(c) if said first page and said second linked page reside on at least one other combination of servers, repeating (a) and (b) for said at least one other combination of servers, such that if the first three bytes of a four-byte IP address of a server from said combination upon which resides said first page is identical to the first three bytes of a four-byte IP address of a server from said combination upon which resides said second page, concluding that said first and second linked pages are part of said same Web site, and concluding said process; and
(d) if (b) does not yield at least one other combination of servers, concluding that said first and second linked pages are not part of said same Web site.
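Steps (a) through (d) of claim 34 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `first_three_bytes` and `same_site` are hypothetical, and the sketch assumes the server addresses for each page have already been resolved (for example with `socket.gethostbyname_ex`) and are passed in as lists of dotted-quad strings.

```python
from itertools import product


def first_three_bytes(ip):
    """Return the first three bytes of a four-byte (dotted-quad) IP address."""
    parts = [int(p) for p in ip.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a four-byte IP address: {ip!r}")
    return tuple(parts[:3])


def same_site(first_page_ips, second_page_ips):
    """Illustrative reading of steps (a)-(d).

    Each argument lists the addresses of the servers on which a page
    resides (an assumption of this sketch). Every combination of servers
    is tried in turn until one shares its first three bytes.
    """
    for ip_a, ip_b in product(first_page_ips, second_page_ips):
        # Steps (a)-(c): compare the first three bytes for this combination.
        if first_three_bytes(ip_a) == first_three_bytes(ip_b):
            return True
    # Step (d): no remaining combination of servers matched.
    return False
```

Under this reading, a page mirrored on several servers is treated as part of the same site as soon as any one pairing of servers shares its first three bytes.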
37. A device for determining whether first and second linked World Wide Web pages are part of the same Web site, the device comprising:
(a) means for determining a four-byte IP address of a server on which resides the first page and a four-byte IP address of a server on which resides the second page;
(b) means for comparing said IP addresses (i) such that an identity between the first three bytes of said four-byte IP address of said server upon which resides said first page and the first three bytes of said four-byte IP address of said server upon which resides said second page indicates that said first and second linked pages are part of said same Web site, and (ii) such that a difference between said first three bytes of said four-byte IP address of said server upon which resides said first page and said first three bytes of said four-byte IP address of said server upon which resides said second page requires a determination of whether said first page and said second page reside on at least one other combination of servers; and
(c) means for analyzing said at least one other combination of servers on which said first page and said second page reside, such that an absence of at least one other combination of servers on which said first page and said second page reside indicates that said first page and said second page are not part of said same Web site.
40. A computer storage medium containing a computer program having means for determining whether first and second linked World Wide Web pages are part of said same Web site, comprising:
(a) means for determining a four-byte IP address of a server on which resides the first page and a four-byte IP address of a server on which resides the second linked page;
(b) means for comparing said IP addresses (i) such that an identity between the first three bytes of said four-byte IP address of said server upon which resides said first page and the first three bytes of said four-byte IP address of said server upon which resides said second page indicates that said first and second linked pages are part of said same Web site, and (ii) such that a difference between said first three bytes of said four-byte IP address of said server upon which resides said first page and said first three bytes of said four-byte IP address of said server upon which resides said second page requires a determination of whether said first page and said second page reside on at least one other combination of servers; and
(c) means for analyzing said at least one other combination of servers on which said first page and said second page reside, such that an absence of at least one other combination of servers on which said first page and said second page reside indicates that said first page and said second page are not part of said same Web site.
43. A method of determining whether first and second pages are associated with the same site, the method comprising:
determining Internet Protocol (IP) addresses of a pair of servers on which reside the first and second pages,
based on the first three bytes of said IP addresses of said pair of servers being different, determining whether said first and second pages reside on one or more different pairs of servers, and
based on said first and second pages residing on one or more different pairs of servers:
determining IP addresses of each of said one or more different pairs of servers,
comparing the first three bytes of said IP addresses of each of said one or more different pairs of servers to provide a comparison result for each of said one or more different pairs of servers, and
determining that said first and second pages are associated with said same site based on at least one of said comparison results indicating that said first three bytes of said IP addresses of said one or more different pairs of servers are identical.
46. A method of identifying pages that are associated with the same site as a starting page, said method comprising:
based on a pre-determined link processing order, processing all of said links on said starting page and all of said links on pages associated with said same site as said starting page, wherein processing includes:
identifying on a page one or more links to one or more different pages,
determining IP addresses of a pair of servers on which reside said starting page and one of said one or more different pages,
comparing the first three bytes of said IP addresses to provide a comparison result for said pair of servers, and
determining that said starting page and said one of said one or more different pages are associated with said same site based on said comparison result for said pair of servers indicating that said first three bytes of said IP addresses of said pair of servers are identical.
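The link processing of claim 46 can be sketched as a traversal over an in-memory link graph. The sketch below is illustrative only. It assumes breadth-first order as the pre-determined link processing order, and it assumes two hypothetical lookup tables: `links` (page to the pages it links to) and `server_ip` (page to the IP address of a server hosting it). A real crawler would fetch pages and resolve hosts instead.

```python
from collections import deque


def prefix(ip):
    """First three bytes of a dotted-quad IP address, as a tuple of strings."""
    return tuple(ip.split(".")[:3])


def pages_on_same_site(start, links, server_ip):
    """Collect pages whose server shares its first three bytes with the start page.

    Breadth-first order is an assumption of this sketch; the claim only
    requires some pre-determined link processing order.
    """
    start_prefix = prefix(server_ip[start])
    same_site = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target in seen:
                continue
            seen.add(target)
            ip = server_ip.get(target)
            # Compare the linked page's server against the starting page's server.
            if ip is not None and prefix(ip) == start_prefix:
                same_site.append(target)
                queue.append(target)  # only same-site pages have their links processed
    return same_site
```

Links found on pages judged to be off-site are never followed, which keeps the traversal confined to the pages associated with the starting page's site.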
50. A method of determining whether a site includes data related to a search query, said method comprising:
determining whether a starting page that is associated with said site includes data related to said search query, based on said starting page not including data related to said search query, processing one or more of:
one or more links on said starting page and one or more links on pages determined to be associated with said same site as said starting page, wherein processing includes:
identifying on a page one or more links to one or more different pages,
determining IP addresses of a pair of servers on which reside said starting page and one of said one or more different pages,
comparing the first three bytes of said IP addresses to provide a comparison result for said pair of servers,
determining that said starting page and said one of said one or more different pages are associated with said same site based on said comparison result indicating that said first three bytes of said IP addresses of said pair of servers are identical, and
based on said starting page and said one of said one or more different pages being associated with said same site, determining whether said one of said one or more different pages includes data related to said search query, and
based on one or more pages determined to be associated with said same site as said starting page including data related to said search query, determining that said site includes data related to said search query.
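One way to picture claim 50 is as the same-site traversal with a relevance check added at each page. The following is a hedged sketch and not the patented method. "Data related to the search query" is modeled as a case-insensitive substring match, which is purely an assumption, as are the hypothetical tables `page_text`, `links`, and `server_ip` and the choice to return early when the starting page itself matches.

```python
from collections import deque


def prefix(ip):
    """First three bytes of a dotted-quad IP address, as a tuple of strings."""
    return tuple(ip.split(".")[:3])


def site_matches_query(start, query, page_text, links, server_ip):
    """Return True if the start page or any same-site page relates to the query."""

    def related(page):
        # Assumption: relevance is a case-insensitive substring match.
        return query.lower() in page_text.get(page, "").lower()

    if related(start):
        return True  # assumption: a matching start page settles the question
    start_prefix = prefix(server_ip[start])
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target in seen:
                continue
            seen.add(target)
            ip = server_ip.get(target)
            if ip is None or prefix(ip) != start_prefix:
                continue  # first three bytes differ: treated as off-site
            if related(target):
                return True
            queue.append(target)
    return False
```

For instance, a home page that never mentions "checkout" can still lead to a positive result if a page two links away, hosted on a server with the same first three bytes, does.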
Specification