Method and system for web resource location classification and detection
First Claim
1. A method in a computer system for identifying a content location associated with a web page, the content location identifying a geographic location that is a subject of the web page, the method comprising:
providing a spread threshold and a power threshold;
providing a geographic hierarchy of geographic locations;
for each of a plurality of geographic locations of the geographic hierarchy, calculating a weight for the geographic location that provides an indication that the web page is related to the geographic location based on geographic keywords contained on the web page;
calculating a power for the geographic location that factors in the weight of ancestor and descendant geographic locations, the power being a measure of whether the geographic location is a subject of the web page based on weight of ancestor and descendant geographic locations of the geographic location; and
calculating a spread for the geographic location based on the calculated power, the spread being a measure of the uniformity of the power among direct descendent geographic locations of the geographic location in the geographic hierarchy of geographic locations; and
after calculating the weight, power, and spread for the plurality of geographic locations, determining whether a geographic location has a power that meets the provided power threshold and a spread that meets the provided spread threshold; and
determining that the geographic location has a power that meets the provided power threshold and a spread that meets the provided spread threshold, identifying the geographic location as a content location of the web page.
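The steps of claim 1 can be illustrated with a minimal sketch. The claim does not give formulas, so everything numeric here is an assumption: power is taken as the weight inside a location's subtree plus a damped share of its ancestors' weight, spread is taken as the normalized entropy of power across direct descendants, and "meets" is read as greater than or equal. The `Location` class, the `damping` parameter, and the sample weights are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Location:
    name: str
    weight: float = 0.0  # keyword-based weight, assumed already computed for the page
    children: list = field(default_factory=list)


def subtree_weight(loc):
    """Weight of the location plus the weight of all its descendants."""
    return loc.weight + sum(subtree_weight(c) for c in loc.children)


def power(loc, ancestor_weight, damping=0.5):
    """Assumed formula: descendant weight plus a damped share of ancestor weight."""
    return subtree_weight(loc) + damping * ancestor_weight


def spread(loc, ancestor_weight):
    """Assumed formula: normalized entropy of power over direct descendants (0 to 1)."""
    if len(loc.children) < 2:
        return 1.0  # assumption: nothing to be uneven across
    inherited = ancestor_weight + loc.weight
    powers = [power(c, inherited) for c in loc.children]
    total = sum(powers)
    if total == 0:
        return 0.0
    entropy = -sum((p / total) * math.log(p / total) for p in powers if p > 0)
    return entropy / math.log(len(powers))


def content_locations(root, power_threshold, spread_threshold):
    # First pass: calculate power and spread for every location in the hierarchy.
    scores = []

    def visit(loc, ancestor_weight):
        scores.append((loc.name, power(loc, ancestor_weight), spread(loc, ancestor_weight)))
        for child in loc.children:
            visit(child, ancestor_weight + loc.weight)

    visit(root, 0.0)
    # Second pass: keep the locations that meet both thresholds.
    return [name for name, p, s in scores
            if p >= power_threshold and s >= spread_threshold]


# Hypothetical page that mentions Seattle and Redmond often and Washington once.
usa = Location("USA", 0.0, [
    Location("Washington", 1.0, [Location("Seattle", 3.0), Location("Redmond", 2.0)]),
    Location("New York", 0.0, [Location("New York City", 0.0)]),
])
print(content_locations(usa, power_threshold=4.0, spread_threshold=0.5))  # prints ['Washington']
```

In this example USA has enough power but a spread of zero, because all of the weight sits under one state, so only Washington, whose two cities share the weight fairly evenly, is identified.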
Abstract
A method and system for identifying locations associated with a web resource is provided. The location system identifies three different types of geographic locations: a provider location, a content location, and a serving location. A provider location identifies the geographic location of the entity that provides the web resource. A content location identifies the geographic location that is the subject of the web resource. A serving location identifies the geographic scope that the web page reaches. An application can select to use the type of location that is of particular interest.
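As an illustration of how an application might select one of the three location types, the following is a minimal sketch. The names `LocationType`, `ResourceLocations`, and `select`, along with the sample URL and place names, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from enum import Enum


class LocationType(Enum):
    PROVIDER = "provider"  # where the entity providing the resource is located
    CONTENT = "content"    # the place the resource is about
    SERVING = "serving"    # the geographic scope the resource reaches


@dataclass
class ResourceLocations:
    url: str
    locations: dict = field(default_factory=dict)  # LocationType -> list of place names

    def select(self, kind):
        """Return the locations of the requested type, or an empty list."""
        return self.locations.get(kind, [])


# Hypothetical resource: provided from Redmond, about Seattle, read across Washington.
page = ResourceLocations(
    url="http://example.com/seattle-guide",
    locations={
        LocationType.PROVIDER: ["Redmond"],
        LocationType.CONTENT: ["Seattle"],
        LocationType.SERVING: ["Washington"],
    },
)
print(page.select(LocationType.CONTENT))  # prints ['Seattle']
```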
15 Claims
1. Independent claim, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 8, 9, 11, 13, 14, 15
5. A method in a computer system for identifying a serving location associated with a target web page, the method comprising:
providing a power threshold and a spread threshold;
providing a geographic hierarchy of geographic locations;
identifying one or more content locations for the target web page, a content location of a web page identifying a geographic location that is a subject of the web page;
providing content locations associated with other web pages that include links to the target web page;
determining whether a geographic location associated with the target web page is an identified serving location based on the provided content locations associated with the other web pages by iteratively calculating a power for each geographic location that factors in weight of ancestor and descendant geographic locations, the power being a measure of whether the geographic location is a subject of the web page;
calculating a spread for each geographic location based on the calculated power, the spread being a measure of the uniformity of the power among direct descendent geographic locations of the geographic location in the geographic hierarchy of geographic locations;
marking each geographic location that has a power that meets the provided power threshold and a spread that meets the provided spread threshold as a serving location of the target web page until the serving locations converge on a solution wherein the weight for each geographic location is computed based on a number of other web pages with links to the target web page and whether a serving location of the other web page is contained within a geographic location marked at a serving location.
Dependent claims: 6, 7
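The iteration in claim 5 can be sketched as a fixed-point loop. This is one possible reading only, under stated assumptions: each linking page contributes one vote at its own location, a vote is discounted when that location lies outside every currently marked serving location, power and spread use the same assumed formulas as a subtree-weight sum and a normalized entropy, and the loop stops when the marked set no longer changes. The `PARENT` table, the `discount` and `damping` parameters, and the sample data are hypothetical.

```python
import math

# Hypothetical hierarchy stored as child -> parent (None for the root).
PARENT = {"USA": None, "Washington": "USA", "New York": "USA",
          "Seattle": "Washington", "Redmond": "Washington", "NYC": "New York"}


def children(loc):
    return [c for c, p in PARENT.items() if p == loc]


def within(loc, region):
    """True if loc is region itself or one of its descendants."""
    while loc is not None:
        if loc == region:
            return True
        loc = PARENT[loc]
    return False


def power(weights, loc, damping=0.5):
    """Assumed formula: subtree weight plus a damped share of ancestor weight."""
    below = sum(w for l, w in weights.items() if within(l, loc))
    above, p = 0.0, PARENT[loc]
    while p is not None:
        above += weights[p]
        p = PARENT[p]
    return below + damping * above


def spread(weights, loc):
    """Assumed formula: normalized entropy of power over direct descendants."""
    kids = children(loc)
    if len(kids) < 2:
        return 1.0  # assumption: nothing to be uneven across
    powers = [power(weights, k) for k in kids]
    total = sum(powers)
    if total == 0:
        return 0.0
    entropy = -sum((p / total) * math.log(p / total) for p in powers if p > 0)
    return entropy / math.log(len(kids))


def serving_locations(link_sources, power_threshold, spread_threshold,
                      discount=0.5, max_rounds=20):
    """link_sources holds one location name per page that links to the target."""
    marked = set()
    for _ in range(max_rounds):
        weights = {loc: 0.0 for loc in PARENT}
        for src in link_sources:
            inside = not marked or any(within(src, m) for m in marked)
            weights[src] += 1.0 if inside else discount
        new_marked = {loc for loc in PARENT
                      if power(weights, loc) >= power_threshold
                      and spread(weights, loc) >= spread_threshold}
        if new_marked == marked:  # converged: marking no longer changes
            break
        marked = new_marked
    return marked


links = ["Seattle", "Seattle", "Seattle", "Redmond", "Redmond", "NYC"]
print(serving_locations(links, power_threshold=4.0, spread_threshold=0.8))  # prints {'Washington'}
```

With these sample links the first round marks Washington, the second round discounts the single link from NYC, and the marked set stays the same, so the loop stops.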
10. A computer-readable storage medium containing instructions for controlling a computer system to identify a content location associated with a web page, the content location identifying a geographic location that is a subject of the web page, by a method comprising:
providing a spread threshold and a power threshold;
accessing a geographic hierarchy of geographic locations;
for each of a plurality of geographic locations of the geographic hierarchy, calculating a weight for the geographic location that provides an indication that the web page is related to the geographic location based on geographic keywords contained on the web page;
calculating a power for the geographic location that factors in the weight of ancestor and descendant geographic locations as indicated by the geographic hierarchy, the power being a measure of whether the geographic location is a subject of the web page based on weight of ancestor and descendent geographic locations of the geographic location; and
calculating a spread for the geographic location based on the calculated power, the spread being a measure of the uniformity of the power among direct descendent geographic locations of the geographic location in the geographic hierarchy of geographic locations; and
after calculating the weight, power, and spread for each of the plurality of geographic locations, determining whether the geographic location has a power that meets the provided power threshold and a spread that meets the provided spread threshold; and
after determining that a geographic location has a power that meets a power threshold and a spread that meets a spread threshold, indicating that the geographic location is the identified location of the web page.
Dependent claims: 12
Specification