Indexing pages based on associations with geographic regions
Abstract
A method, apparatus, system, and storage medium that, in an embodiment, create an index for pages based on association scores for the pages with respect to geographic regions, where the association scores indicate relative degrees to which the pages are associated with the geographic regions. In an embodiment, the association scores are determined by adding a term score to the association score if a term that is associated with the geographic region is present in the page. The term score indicates a relative degree to which presence of the term in the page indicates that the page is associated with the geographic region. In an embodiment, the association scores are further increased based on association scores of neighbor geographic regions and based on the association scores of incoming linked pages.
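The scoring described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `Region` class, the function names, the substring matching, the numeric term scores, and the `neighbor_factor` used to scale the neighbor boost. The source only states that term scores are added when a term is present, that a telephone number scores higher than a region name, and that neighbor association scores increase the score. The boost from incoming linked pages is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Region:
    """A geographic region with its scored terms (hypothetical structure)."""
    name: str
    term_scores: Dict[str, float]              # term -> term score
    neighbors: List[str] = field(default_factory=list)


def base_scores(page_text: str, regions: List[Region]) -> Dict[str, float]:
    """Add each region's term score once for every term found in the page."""
    text = page_text.lower()
    return {
        region.name: sum(
            score
            for term, score in region.term_scores.items()
            if term.lower() in text            # assumed: plain substring match
        )
        for region in regions
    }


def with_neighbor_boost(
    base: Dict[str, float],
    regions: List[Region],
    neighbor_factor: float = 0.5,              # assumed scaling, not from the source
) -> Dict[str, float]:
    """Increase each region's score by a fraction of its neighbors' scores."""
    return {
        region.name: base[region.name]
        + neighbor_factor * sum(base.get(n, 0.0) for n in region.neighbors)
        for region in regions
    }


# Illustrative data: the telephone prefix scores higher than the region name.
regions = [
    Region("Rochester", {"rochester": 1.0, "507-555": 3.0}, ["Byron"]),
    Region("Byron", {"byron": 1.0}, ["Rochester"]),
]
page = "Visit our Rochester store or call 507-555-0100. Also serving Byron."
base = base_scores(page, regions)
boosted = with_neighbor_boost(base, regions)
```

The boost is computed from the base scores rather than from already boosted scores, so the result does not depend on the order in which regions are processed.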
7 Citations
13 Claims
1. A computer-implemented method comprising:
- determining a plurality of association scores for a plurality of pages with respect to a plurality of geographic regions, wherein the association score comprises a relative degree to which the page is associated with the geographic region, wherein the determining the plurality of association scores further comprises a term that is associated with the geographic region is present in content of the page and adding a term score to the association score, wherein the term score comprises a relative degree to which presence of the term in the content of the page indicates that the page is associated with the geographic region, wherein the term score of a telephone number is higher than the term score of a name of the geographic region, wherein the determining further comprises increasing the association score based on a neighbor association score of a neighbor geographic region, wherein the neighbor association score comprises a relative degree to which the page is associated with the neighbor geographic region; and
- creating an index for the plurality of pages based on the plurality of association scores, wherein the creating the index for the plurality of pages based on the plurality of association scores further comprises selecting a plurality of keywords present in the page, selecting a plurality of weights for the plurality of keywords, finding the keywords that match the terms, and increasing the plurality of weights for the keywords that match the terms based on the association score and the term scores, wherein the increasing the plurality of weights further comprises adding to the plurality of weights a result of multiplying the term scores of the terms that match the keywords by the association score of the geographic region that has the terms that match the keywords by the weight of the keywords that match the term, dividing by a maximum term score in the plurality of geographic regions that are associated with the page that contains the keywords that match the terms, and dividing by the sum of the term scores in the plurality of geographic regions for the terms that match the keywords.
Dependent claims: 2, 3, 4
5. A storage medium encoded with instructions, wherein the instructions when executed comprise:
- determining a plurality of association scores for a plurality of pages with respect to a plurality of geographic regions, wherein the association score comprises a relative degree to which the page is associated with the geographic region, wherein the determining the plurality of association scores further comprises a term that is associated with the geographic region is present in content of the page and adding a term score to the association score, wherein the term score comprises a relative degree to which presence of the term in the content of the page indicates that the page is associated with the geographic region, wherein the term score of a telephone number is higher than the term score of a name of the geographic region, wherein the determining further comprises increasing the association score based on a neighbor association score of a neighbor geographic region, wherein the neighbor association score comprises a relative degree to which the page is associated with the neighbor geographic region; and
- creating an index for the plurality of pages based on the plurality of association scores, wherein the creating the index for the plurality of pages based on the plurality of association scores further comprises selecting a plurality of keywords present in the page, selecting a plurality of weights for the plurality of keywords, finding the keywords that match the terms, and increasing the plurality of weights for the keywords that match the terms based on the association score and the term scores, wherein the increasing the plurality of weights further comprises adding to the plurality of weights a result of multiplying the term scores of the terms that match the keywords by the association score of the geographic region that has the terms that match the keywords by the weight of the keywords that match the term, dividing by a maximum term score in the plurality of geographic regions that are associated with the page that contains the keywords that match the terms, and dividing by the sum of the term scores in the plurality of geographic regions for the terms that match the keywords.
Dependent claims: 6, 7, 8, 9
10. A method for configuring a computer, comprising:
- configuring the computer to determine a plurality of association scores for a plurality of pages with respect to a plurality of geographic regions, wherein the association score comprises a relative degree to which the page is associated with the geographic region, wherein the configuring the computer to determine the plurality of association scores further comprises a term that is associated with the geographic region is present in content of the page, and adding a term score to the association score, wherein the term score comprises a relative degree to which presence of the term in the content of the page indicates that the page is associated with the geographic region, wherein the term score of a telephone number is higher than the term score of a name of the geographic region, wherein the configuring the computer to determine further comprises configuring the computer to increase the association score based on a neighbor association score of a neighbor geographic region, wherein the neighbor association score comprises a relative degree to which the page is associated with the neighbor geographic region, wherein the geographic region and the neighbor geographic region are located within a threshold distance; and
- configuring the computer to create an index for the plurality of pages based on the plurality of association scores, wherein the configuring the computer to create the index for the plurality of pages based on the plurality of association scores further comprises configuring the computer to select a plurality of keywords present in the page, configuring the computer to select a plurality of weights for the plurality of keywords, configuring the computer to find the keywords that match the terms, and configuring the computer to increase the plurality of weights for the keywords that match the terms based on the association score and the term scores, wherein the configuring the computer to increase the plurality of weights further comprises configuring the computer to add to the plurality of weights a result of multiplying the term scores of the terms that match the keywords by the association score of the geographic region that has the terms that match the keywords by the weight of the keywords that match the term, dividing by a maximum term score in the plurality of geographic regions that are associated with the page that contains the keywords that match the terms, and dividing by the sum of the term scores in the plurality of geographic regions for the terms that match the keywords.
Dependent claims: 11, 12, 13
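The weight adjustment recited in the independent claims can be read as: for each keyword that matches a region's term, add (term score × association score × keyword weight) ÷ (maximum term score) ÷ (sum of matched term scores). The Python sketch below follows one plausible reading of that language and is not a definitive implementation. The function name, the dictionary layout, the exact-match rule between keywords and terms, and all numeric values are assumptions for illustration.

```python
from typing import Dict


def boost_keyword_weights(
    weights: Dict[str, float],                 # keyword -> initial weight
    association: Dict[str, float],             # region -> association score
    term_scores: Dict[str, Dict[str, float]],  # region -> {term -> term score}
) -> Dict[str, float]:
    """Return a copy of `weights` with matching keywords boosted."""
    # Maximum term score over all regions associated with the page.
    all_scores = [s for terms in term_scores.values() for s in terms.values()]
    # Term scores of only those terms that match a keyword on the page.
    matched = [
        s
        for terms in term_scores.values()
        for term, s in terms.items()
        if term in weights                     # assumed: exact string match
    ]
    boosted = dict(weights)
    if not matched or max(all_scores) <= 0 or sum(matched) <= 0:
        return boosted                         # nothing to boost, avoid dividing by zero
    max_term = max(all_scores)
    matched_sum = sum(matched)
    for region, terms in term_scores.items():
        for term, score in terms.items():
            if term in weights:
                # The increase uses the keyword's initial weight, not the running total.
                boosted[term] += (
                    score * association[region] * weights[term]
                    / max_term
                    / matched_sum
                )
    return boosted


# Illustrative values only.
weights = {"rochester": 2.0, "byron": 1.0, "pizza": 1.0}
association = {"Rochester": 4.0, "Byron": 2.0}
term_scores = {
    "Rochester": {"rochester": 1.0, "507-555": 2.0},
    "Byron": {"byron": 1.0},
}
result = boost_keyword_weights(weights, association, term_scores)
```

With these values the maximum term score is 2.0 and the matched term scores sum to 2.0, so "rochester" gains 1.0 × 4.0 × 2.0 ÷ 2.0 ÷ 2.0 = 2.0 and "byron" gains 1.0 × 2.0 × 1.0 ÷ 2.0 ÷ 2.0 = 0.5, while "pizza" matches no term and keeps its weight.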
Specification