Identification and automatic propagation of geo-location associations to un-located documents
First Claim
1. A computer-implemented method of tagging documents comprising:
maintaining, on a storage medium, data that establishes associations between documents that belong to a corpus of documents, and geographic location information;
identifying documents, within the corpus, that contain authoritative geographic location information, such documents being addressed documents;
updating the data to cause the data to reflect an association between each addressed document and the authoritative geographic location information contained in the addressed document;
identifying groups of documents, wherein at least one identified group of documents includes at least one addressed document and at least one unaddressed document, the at least one unaddressed document being a document that contains either no authoritative geographic location information or unauthoritative geographic location information;
for at least one unaddressed document in the at least one identified group, establishing the at least one unaddressed document as a propagated-addressed document by updating said data;
(a) to cause the data to reflect an association between the at least one unaddressed document and authoritative geographic location information obtained from one or more addressed documents in the at least one identified group, and
(b) to include a confidence level that represents the confidence that each addressed document and each propagated-addressed document is associated with authoritative geographic location information,
performing location-specific processing on at least one such propagated-addressed document; and
using the authoritative geographic location information to determine local customs or time zones, wherein the location-specific processing takes into account said local customs or time zones,
wherein the confidence level is based at least in part on the link structure among the at least one unaddressed document and the group of documents,
wherein geographic location information is obtained by processing the displayed and untagged text,
wherein the authoritative geographic location information is associated with a proprietor of both the addressed documents and the propagated-addressed documents, and
wherein the method is performed by one or more computing devices.
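The propagation step recited in the claim can be illustrated with a small sketch. This is not the patented implementation, only one possible reading under stated assumptions: the `Document` and `GeoTag` types, the "Address:" marker used to detect authoritative location text, and the confidence values 1.0, 0.8 and 0.4 are all hypothetical and chosen for illustration. The only idea taken from the claim is that an unaddressed document in a group inherits a location from an addressed document, with a confidence that depends in part on link structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Document:
    doc_id: str
    text: str                                        # displayed, untagged text
    links: Set[str] = field(default_factory=set)     # ids of documents this one links to


@dataclass
class GeoTag:
    location: str
    confidence: float   # hypothetical scale: 1.0 = found in the document itself
    propagated: bool    # True if inherited from another document in the group


def find_authoritative_location(doc: Document) -> Optional[str]:
    """Hypothetical detector: treat a line starting with 'Address:' as authoritative."""
    for line in doc.text.splitlines():
        if line.startswith("Address:"):
            return line[len("Address:"):].strip()
    return None


def tag_group(group: List[Document]) -> Dict[str, GeoTag]:
    """Tag addressed documents, then propagate locations to unaddressed ones."""
    tags: Dict[str, GeoTag] = {}

    # Step 1: record associations for addressed documents.
    for doc in group:
        location = find_authoritative_location(doc)
        if location is not None:
            tags[doc.doc_id] = GeoTag(location, 1.0, propagated=False)

    addressed = [d for d in group if d.doc_id in tags]

    # Step 2: propagate to unaddressed documents in the same group.
    for doc in group:
        if doc.doc_id in tags:
            continue
        best: Optional[GeoTag] = None
        for src in addressed:
            # Assumed rule: a direct link in either direction raises confidence.
            linked = src.doc_id in doc.links or doc.doc_id in src.links
            confidence = 0.8 if linked else 0.4
            if best is None or confidence > best.confidence:
                best = GeoTag(tags[src.doc_id].location, confidence, propagated=True)
        if best is not None:
            tags[doc.doc_id] = best

    return tags
```

For a group made of a home page that carries an address and links to a menu page, plus an unlinked news page, this sketch tags the home page at confidence 1.0, the menu page at 0.8 and the news page at 0.4, all with the home page's location.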
Abstract
Systems and methods are provided for identifying pages that can be authoritatively, to some confidence level or another, associated with a geographic location, and are provided for grouping documents such that authoritative location associations can be propagated from pages with higher location confidence to pages with lower location confidence. Pages might be identified with authoritative indicators, groups of pages identified including at least one addressed page and at least one unaddressed page, wherein an addressed page is a page having a higher confidence level than an unaddressed page, and at least one processing step performed that is location specific. The confidence level assigned to a page as part of the process represents the confidence that the page is associated with an identifiable geographic location.
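As an illustration of a location-specific processing step, the sketch below converts a UTC timestamp to local time for a tagged page, using time zones as the first claim suggests. Everything specific here is an assumption: the `UTC_OFFSET_HOURS` table, the `local_time_for` helper, the place names and the 0.5 confidence threshold are hypothetical, and fixed offsets ignore daylight saving time.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical lookup table from a location string to a fixed UTC offset in hours.
UTC_OFFSET_HOURS = {
    "Springfield, IL": -6,
    "Paris": 1,
}


def local_time_for(location: str, confidence: float, utc_now: datetime,
                   min_confidence: float = 0.5) -> Optional[datetime]:
    """Return utc_now in the location's zone, or None if the tag is not trusted."""
    if confidence < min_confidence:
        return None  # assumed policy: skip pages whose location is too uncertain
    offset = UTC_OFFSET_HOURS.get(location)
    if offset is None:
        return None  # location not present in the illustrative table
    return utc_now.astimezone(timezone(timedelta(hours=offset)))
```

Gating on the confidence level is one way to keep low-confidence propagated locations from driving location-specific behavior; the threshold itself would need tuning in any real system.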
28 Citations
19 Claims
Specification