Classification of ambiguous geographic references
Abstract
A location classifier generates location information based on textual strings in input text. The location information defines potential geographical relevance of the input text. In determining the location information, the location classifier may receive at least one geo-relevance profile associated with at least one string in the input text, obtain a combined geo-relevance profile for the document from the at least one geo-relevance profile, and determine geographical relevance of the input text based on the combined geo-relevance profile.
28 Claims
1. A method of determining a geographical relevance of a document, performed by one or more processors associated with one or more server devices, the method comprising:
generating, using a processor associated with the one or more server devices, a plurality of histograms associated with a plurality of respective strings in the document by comparing strings in the document to a pre-existing plurality of strings for which histograms were previously generated, where the plurality of histograms relates occurrences of strings to geographical regions;
generating, using a processor associated with the one or more server devices, a combined histogram for the document from the plurality of histograms by multiplying together values for each histogram of the plurality of histograms; and
identifying, using a processor associated with the one or more server devices, a geographical relevance of the document based on the combined histogram.
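The combining step recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the region labels, the sample strings, the counts in `STRING_HISTOGRAMS`, and the function names are all hypothetical, and picking the largest bucket is only one possible way to read "based on the combined histogram".

```python
from math import prod
from typing import List, Optional

# Hypothetical region buckets and pre-generated per-string histograms.
# All names and numbers here are invented for illustration.
REGIONS = ["94040", "94041", "10001"]
STRING_HISTOGRAMS = {
    "castro street": [8.0, 6.0, 1.0],
    "mountain view": [9.0, 7.0, 2.0],
    "broadway": [1.0, 1.0, 9.0],
}


def histograms_for(document: str) -> List[List[float]]:
    """Return the stored histogram of every known string found in the document."""
    text = document.lower()
    return [hist for s, hist in STRING_HISTOGRAMS.items() if s in text]


def combine(histograms: List[List[float]]) -> List[float]:
    """Multiply the histograms together, bucket by bucket."""
    if not histograms:
        return []
    return [prod(bucket) for bucket in zip(*histograms)]


def most_relevant_region(document: str) -> Optional[str]:
    """Assumed decision rule: report the region with the largest combined value."""
    combined = combine(histograms_for(document))
    if not combined:
        return None
    best = max(range(len(combined)), key=combined.__getitem__)
    return REGIONS[best]
```

Multiplication rewards regions on which all the located strings agree: a bucket that is small in any one histogram stays small in the product, so an ambiguous string such as a common street name is disambiguated by the strings around it.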
7. A location classifier implemented within a computing device, comprising:
means for receiving input text;
a memory to store the input text;
means for locating strings within the input text that were previously determined to be geographically relevant;
means for comparing located strings in the input text to a pre-existing plurality of strings for which histograms were previously generated, where the histograms relate occurrences of strings to geographical regions;
means for combining the retrieved histograms, to form a combined histogram, by multiplying together values for each histogram of the retrieved histograms; and
means for determining whether the input text is geographically relevant based on the peaks of the combined histogram.
9. A memory device containing programming instructions for execution by a processor, the memory device comprising:
programming instructions for generating a plurality of histograms associated with a respective plurality of strings in a document by examining the document to locate strings for the respective plurality of strings by comparing strings in the document to a pre-existing plurality of strings for which histograms were previously generated, the histograms defining a geographical relevance of the plurality of strings with respect to geographical regions;
programming instructions for combining the plurality of histograms to obtain a combined histogram for the document by multiplying together values for each histogram of the plurality of histograms; and
programming instructions for determining a geographical relevance of the document based on the combined histogram.
15. A method for generating a histogram for a string, performed by a server device, the method comprising:
determining, using a processor of the server device, a plurality of sections of training text in which each section of training text is associated with a geographical region;
accumulating, using the processor, occurrences of the string in the plurality of sections of training text when terms to a left or right of the string are different from previous instances of terms to the left or right of the string; and
generating, using the processor, the histogram based on the accumulated occurrences of the string, where the histogram relates the occurrences of the string to geographical regions.
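One way to read the accumulation step of claim 15 is sketched below in Python. The reading assumed here is that an occurrence is counted only when its (left term, right term) pair has not been seen with an earlier counted occurrence, so that repeated boilerplate text does not inflate the counts. The function name `build_histogram`, the whitespace tokenisation, and the choice to track seen contexts across all sections are assumptions, not details taken from the claim.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def build_histogram(target: str, sections: List[Tuple[str, str]]) -> Dict[str, int]:
    """Count occurrences of `target` per region, skipping repeated contexts.

    `sections` is a list of (region, training_text) pairs. Tokens are split
    on whitespace only, which is a simplification.
    """
    needle = target.lower().split()
    n = len(needle)
    seen_contexts: Set[Tuple[str, str]] = set()
    counts: Counter = Counter()
    for region, text in sections:
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] != needle:
                continue
            left = tokens[i - 1] if i > 0 else ""
            right = tokens[i + n] if i + n < len(tokens) else ""
            if (left, right) in seen_contexts:
                continue  # same neighbours as an earlier instance: not counted
            seen_contexts.add((left, right))
            counts[region] += 1
    return dict(counts)
```

For example, two sections that both contain "visit castro street today" contribute a single count, while "near castro street shops" contributes a second one because both neighbouring terms differ.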
21. A device comprising:
a processor; and
a computer-readable memory coupled to the processor and containing instructions that when executed by the processor cause the processor to:
determine a plurality of sections of training text in which each section of training text is associated with a geographical region;
accumulate occurrences of a string in the plurality of sections of training text when terms to a left or right of the string are different from previous instances of the terms to the left or right of the string; and
generate a histogram that defines the geographical relevance of the string with respect to geographical regions based on the accumulated occurrences of the string.
25. A method performed by one or more processors associated with one or more server devices, the method comprising:
examining, using a processor associated with the one or more server devices, a document to locate strings that were previously determined to have geographical relevance;
generating, using a processor associated with the one or more server devices, a plurality of histograms associated with the located strings, in the document, by comparing the located strings in the document to a pre-existing plurality of strings for which histograms were previously generated;
receiving, using a communication interface or an input device of the one or more server devices, the histograms associated with the strings, where the histograms relate occurrences of the strings to geographic regions;
obtaining, using a processor associated with the one or more server devices, a combined histogram for the document from the received histograms by multiplying values for each of the received histograms together;
analyzing, using a processor associated with the one or more server devices, the combined histogram for peaks; and
determining, using a processor associated with the one or more server devices, geographical relevance of the document based on whether peaks are present in the combined histogram.
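The peak analysis recited in claim 25 is not defined in the claim text itself, so the following Python sketch uses an assumed criterion: a bucket counts as a peak when it holds at least a given share of the combined histogram's total mass. The names `find_peaks` and `is_geographically_relevant` and the default threshold `min_share=0.5` are hypothetical choices made only for illustration.

```python
from typing import List


def find_peaks(combined: List[float], min_share: float = 0.5) -> List[int]:
    """Return indices of buckets holding at least `min_share` of the total.

    This share-of-total rule is an assumed definition of a peak.
    """
    total = sum(combined)
    if total <= 0:
        return []
    return [i for i, value in enumerate(combined) if value / total >= min_share]


def is_geographically_relevant(combined: List[float], min_share: float = 0.5) -> bool:
    """Treat the document as geographically relevant when any peak is present."""
    return bool(find_peaks(combined, min_share))
```

Under this rule a flat combined histogram yields no peaks and the document is treated as not tied to any region, whereas a histogram dominated by one bucket yields a single peak.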
Specification