Short text language detection using geographic information
First Claim
1. A computer-implemented method comprising:
determining a particular language based at least in part on both content of text submitted by a user and a source IP address that is associated with the user; and
based on having determined the particular language for the user, presenting, to the user, one or more content items that are associated with the particular language;
wherein the content of the text does not expressly state the particular language.
3 Assignments
0 Petitions
Abstract
A content-providing entity receives a relatively short text from a user and attempts to determine, automatically, based on that short text (and on other available clues), a language that the user can read and understand. The content-providing entity may then provide, to the user, documents that are written in the determined language. The content-providing entity may determine a language of the input text based on several factors in combination: (a) the service provider's “market,” which is determined based on at least a portion of the URL of the Internet site to which the user directed his browser; (b) the user's “region,” which is determined based on the source Internet Protocol (IP) address of the IP packets that the user sends to the Internet site; (c) the “script” in which the short user-entered text is written; and (d) a statistical analysis of the frequency of the characters present in the short user-entered text.
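As an illustration only, the four factors named in the abstract could be combined roughly as follows. This is a minimal sketch, not the method of the specification: the function names, the lookup tables, the probabilities, the character profiles, and the log-score combination are all assumptions made for the example, and the IP ranges are reserved documentation addresses rather than real geographic data.

```python
import ipaddress
import math
import unicodedata
from collections import Counter
from urllib.parse import urlparse

# Every table below holds made-up illustrative values (assumptions).
FLOOR = 0.01     # smoothing for characters missing from a profile
UNLISTED = 0.05  # prior for a language not listed for a market or region

MARKET_PRIOR = {"fr": {"fr": 0.9, "en": 0.1}, "de": {"de": 0.9, "en": 0.1}}
REGION_PRIOR = {
    ipaddress.ip_network("192.0.2.0/24"): {"fr": 0.8, "en": 0.2},
    ipaddress.ip_network("198.51.100.0/24"): {"de": 0.8, "en": 0.2},
}
SCRIPT_LANGS = {"LATIN": ["en", "fr", "de"], "CYRILLIC": ["ru"]}
CHAR_FREQ = {
    "en": {"e": 0.13, "t": 0.09, "a": 0.08, "o": 0.08, "h": 0.06},
    "fr": {"e": 0.15, "a": 0.08, "s": 0.08, "i": 0.07, "u": 0.06},
    "de": {"e": 0.16, "n": 0.10, "i": 0.08, "s": 0.07, "r": 0.07},
    "ru": {"о": 0.11, "е": 0.08, "а": 0.08, "и": 0.07, "н": 0.07},
}


def market_prior(url):
    """(a) Market: language prior keyed on the last label of the host name."""
    host = urlparse(url).hostname or ""
    return MARKET_PRIOR.get(host.rsplit(".", 1)[-1], {})


def region_prior(ip):
    """(b) Region: language prior keyed on the network containing the source IP."""
    addr = ipaddress.ip_address(ip)
    for network, prior in REGION_PRIOR.items():
        if addr in network:
            return prior
    return {}


def script_of(text):
    """(c) Script: most common first word of the Unicode names of the letters."""
    names = Counter(
        unicodedata.name(ch, "UNKNOWN").split()[0] for ch in text if ch.isalpha()
    )
    return names.most_common(1)[0][0] if names else "UNKNOWN"


def detect_language(text, url, ip):
    """Pick the candidate language with the highest combined log score."""
    candidates = SCRIPT_LANGS.get(script_of(text), list(CHAR_FREQ))
    market, region = market_prior(url), region_prior(ip)

    def score(lang):
        total = math.log(market.get(lang, UNLISTED))
        total += math.log(region.get(lang, UNLISTED))
        # (d) Character statistics: sum of log frequencies of each letter.
        for ch in text.lower():
            if ch.isalpha():
                total += math.log(CHAR_FREQ[lang].get(ch, FLOOR))
        return total

    return max(candidates, key=score)
```

With these invented tables, the short text "rose" is scored as French when it arrives at a `.fr` site from the first documentation range and as German when it arrives at a `.de` site from the second, which shows how the market and region clues can settle a text too short for character statistics alone.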
32 Citations
32 Claims
1. A computer-implemented method comprising:
determining a particular language based at least in part on both content of text submitted by a user and a source IP address that is associated with the user; and
based on having determined the particular language for the user, presenting, to the user, one or more content items that are associated with the particular language;
wherein the content of the text does not expressly state the particular language.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A volatile or non-volatile computer-readable storage medium storing one or more instructions which, when executed by one or more processors, cause the one or more processors to perform steps comprising:
determining a particular language based at least in part on both content of text submitted by a user and a source IP address that is associated with the user; and
based on having determined the particular language for the user, presenting, to the user, one or more content items that are associated with the particular language;
wherein the content of the text does not expressly state the particular language.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21, 22
23. A computer-implemented method comprising:
receiving one or more query terms of a query to a search engine that is capable of performing searches based on any one of a plurality of languages;
selecting a particular language, from among the plurality of languages, based at least in part on both the one or more query terms of the query and at least one of:
(a) whether the one or more query terms are encoded in ASCII,
(b) a top-level domain of a particular URL,
(c) a source IP address that is associated with a user entering the query, and
(d) whether characters of the one or more query terms are in a specified subset of Unicode;
based on having selected the particular language, performing a search for content items associated with the particular language based on the one or more query terms; and
presenting, to the user, one or more content items that are associated with the particular language;
wherein the content of the one or more query terms does not expressly state the particular language.
Dependent claims: 24, 25, 26, 27
28. A volatile or non-volatile computer-readable storage medium storing one or more instructions which, when executed by one or more processors, cause the one or more processors to perform steps comprising:
receiving one or more query terms of a query to a search engine that is capable of performing searches based on any one of a plurality of languages;
selecting a particular language, from among the plurality of languages, based at least in part on both the one or more query terms of the query and at least one of:
(a) whether the one or more query terms are encoded in ASCII,
(b) a top-level domain of a particular URL,
(c) a source IP address that is associated with a user entering the query, and
(d) whether characters of the one or more query terms are in a specified subset of Unicode;
based on having selected the particular language, performing a search for content items associated with the particular language based on the one or more query terms; and
presenting, to the user, one or more content items that are associated with the particular language;
wherein the content of the one or more query terms does not expressly state the particular language.
Dependent claims: 29, 30, 31, 32
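Signals (a), (b), and (d) recited in claims 23 and 28 are simple to compute from the query terms and the URL. The sketch below is illustrative only: the function names, the `TLD_LANGUAGE` table, the mapping of a Unicode block to a single language, and the order in which the signals are consulted are assumptions for the example, not something the claims prescribe. Signal (c), the source IP address, is omitted for brevity.

```python
from urllib.parse import urlparse

# Unicode block ranges (inclusive code points).
CYRILLIC = (0x0400, 0x04FF)
HIRAGANA = (0x3040, 0x309F)

# Hypothetical mapping from a top-level domain to a default language.
TLD_LANGUAGE = {"fr": "fr", "de": "de", "jp": "ja", "ru": "ru"}


def is_ascii(terms):
    """Signal (a): True when every query term contains only ASCII characters."""
    return all(term.isascii() for term in terms)


def top_level_domain(url):
    """Signal (b): last label of the URL's host name, or '' if there is none."""
    host = urlparse(url).hostname or ""
    return host.rsplit(".", 1)[-1] if "." in host else ""


def in_unicode_subset(terms, subset):
    """Signal (d): True when every character lies inside the given range."""
    low, high = subset
    return all(low <= ord(ch) <= high for term in terms for ch in term)


def select_language(terms, url, default="en"):
    """Choose a language from the signals; the precedence is an assumption."""
    if not is_ascii(terms):
        # Simplification: one language per block, although several
        # languages are written in Cyrillic.
        if in_unicode_subset(terms, CYRILLIC):
            return "ru"
        if in_unicode_subset(terms, HIRAGANA):
            return "ja"
    return TLD_LANGUAGE.get(top_level_domain(url), default)
```

In this sketch a non-ASCII query that falls entirely inside one of the listed blocks is decided by its script, and every other query falls back to the top-level domain of the site that received it.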
Specification