Identification and automatic propagation of geo-location associations to un-located documents
Abstract
Systems and methods are provided for identifying pages that can be authoritatively associated, at some confidence level, with a geographic location, and for grouping documents such that authoritative location associations can be propagated from pages with higher location confidence to pages with lower location confidence. Pages might be identified with authoritative indicators, groups of pages identified that include at least one addressed page and at least one unaddressed page, wherein an addressed page is a page having a higher confidence level than an unaddressed page, and at least one location-specific processing step performed. The confidence level assigned to a page as part of the process represents the confidence that the page is associated with an identifiable geographic location: documents with a high confidence level are determined to be strongly associated with a particular geographic location, while documents with a low confidence level are determined to be weakly associated with a particular geographic location or not associated with one at all.
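The confidence-based propagation described in the abstract might be sketched as follows. This is an illustrative sketch only, not the patented implementation: the `Page` record, the `group` key, the 0.8 threshold that separates addressed from unaddressed pages, and the `decay` factor applied to a propagated confidence are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    url: str
    group: str                      # hypothetical group key, e.g. a site name
    location: Optional[str] = None  # associated geographic location, if any
    confidence: float = 0.0         # confidence in that association, 0.0 to 1.0


def propagate_by_confidence(pages, threshold=0.8, decay=0.9):
    """Copy the location of each group's most confident page to weaker pages.

    Assumption: a page at or above `threshold` counts as "addressed";
    every other page counts as "unaddressed".
    """
    # Find the highest-confidence addressed page in each group.
    best = {}
    for page in pages:
        if page.location is not None and page.confidence >= threshold:
            current = best.get(page.group)
            if current is None or page.confidence > current.confidence:
                best[page.group] = page

    # Propagate that page's location to the unaddressed pages of its group.
    for page in pages:
        source = best.get(page.group)
        if source is not None and page.confidence < threshold:
            page.location = source.location
            # Assumption: a propagated association is trusted slightly less.
            page.confidence = source.confidence * decay
    return pages
```

A group with no page above the threshold is left untouched, so a weak association is never spread to other pages.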
18 Claims
1. In a computerized system for searching a corpus of documents, a method of geographically-based processing comprising:
identifying documents within the corpus that contain authoritative geographic location information, such documents being addressed documents;
identifying groups of documents, wherein at least one identified group of documents includes at least one addressed document and at least one unaddressed document, the at least one unaddressed document being a document that contains either no authoritative geographic location information or unauthoritative geographic location information;
for at least one unaddressed document in the at least one identified group, propagating an authoritative geographic location to that document from an addressed document that is a member of the at least one identified group, such documents being propagated-addressed documents; and
performing location-specific processing on at least one such propagated-addressed document using its propagated authoritative geographic location.
10. The method of claim 9, wherein commercially defined regions are neighborhoods.
11. The method of claim 10, wherein neighborhoods are geographic regions defined for purposes of evaluating and/or searching real estate and/or neighborhood-based real estate information.
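The four steps of claim 1 (identify addressed documents, group documents, propagate a location within a group, then perform location-specific processing) could be sketched as below. This is a minimal illustration under stated assumptions, not the claimed implementation: grouping by URL host, representing a location as a plain string, and using a location filter as the "location-specific processing" are all choices made here for the example, and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_key(url):
    # Assumption: documents served from the same host form one group.
    return urlparse(url).netloc


def propagate_locations(docs):
    """Propagate locations from addressed to unaddressed documents.

    `docs` maps URL -> authoritative location string, or None when the
    document is unaddressed. Returns URL -> (location, was_propagated).
    """
    # Step 2: identify groups of documents.
    groups = defaultdict(list)
    for url in docs:
        groups[group_key(url)].append(url)

    result = {}
    for members in groups.values():
        # Step 1: addressed documents are those with a known location.
        addressed = [u for u in members if docs[u] is not None]
        for u in members:
            if docs[u] is not None:
                result[u] = (docs[u], False)
            elif addressed:
                # Step 3: propagate from an addressed member of the group.
                result[u] = (docs[addressed[0]], True)
            else:
                result[u] = (None, False)
    return result


def search_by_location(result, location):
    # Step 4: location-specific processing, here a simple location filter.
    return sorted(u for u, (loc, _) in result.items() if loc == location)
```

For example, if a hypothetical site's contact page carries an address and its menu page does not, a location search after propagation returns both pages, while pages in a group with no addressed member stay un-located.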
Specification