Methods and systems for selecting a language for text segmentation
2 Assignments
0 Petitions
Abstract
Methods and systems for selecting a language for text segmentation are disclosed. In one embodiment, at least a first candidate language and a second candidate language associated with a string of characters are identified, at least a first segmented result associated with the first candidate language and a second segmented result associated with the second candidate language are determined, a first frequency of occurrence for the first segmented result and a second frequency of occurrence for the second segmented result are determined, and an operable language is identified from the first candidate language and the second candidate language based at least in part on the first frequency of occurrence and the second frequency of occurrence.
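The comparison described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the names `article_frequency` and `pick_operable_language` are hypothetical, each candidate language is assumed to come with its own segmenter function, each article is modeled as a set of words, and "frequency of occurrence" is assumed to mean the number of articles that contain every token of the segmented result.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical type: a segmenter turns a raw string into a list of tokens.
Segmenter = Callable[[str], List[str]]


def article_frequency(tokens: List[str], articles: List[Set[str]]) -> int:
    """Count the articles that contain every token of a segmented result."""
    if not tokens:
        return 0
    return sum(1 for words in articles if all(t in words for t in tokens))


def pick_operable_language(
    text: str,
    segmenters: Dict[str, Segmenter],
    articles_by_language: Dict[str, List[Set[str]]],
) -> Tuple[Optional[str], int]:
    """Segment `text` once per candidate language and keep the language whose
    segmented result occurs most often in that language's articles.

    Assumption: on a tie, the candidate listed first wins. With no
    candidates at all, the result is (None, -1).
    """
    best_language: Optional[str] = None
    best_frequency = -1
    for language, segment in segmenters.items():
        tokens = segment(text)
        frequency = article_frequency(tokens, articles_by_language.get(language, []))
        if frequency > best_frequency:
            best_language, best_frequency = language, frequency
    return best_language, best_frequency


# Toy data (invented for illustration only).
segmenters = {
    "en": lambda s: [s],  # no English split found, so keep the whole string
    "fr": lambda s: ["pain", "chaud"] if s == "painchaud" else [s],
}
articles_by_language = {
    "en": [{"the", "pain", "is", "gone"}],
    "fr": [{"le", "pain", "chaud"}, {"pain", "chaud", "et", "beurre"}],
}
print(pick_operable_language("painchaud", segmenters, articles_by_language))
```

With the toy data, the French segmentation appears in two French articles and the English one in no English article, so French would be identified as the operable language.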
92 Citations
36 Claims
1. A computer-implemented method, comprising:
receiving from a user of a computing device, at a computer server system, a request for information about one or more internet-accessible documents, the request having a string of characters;
identifying, using the computer server system, at least a first candidate language and a second candidate language associated with the request;
determining at least a first segmented result associated with the first candidate language from the string of characters and a second segmented result associated with the second candidate language from the string of characters;
determining a first frequency of occurrence for the first segmented result in a group of articles that are associated by the system with the first language and a second frequency of occurrence for the second segmented result in a group of articles that are associated by the system with the second language;
identifying, with the computer server system, an operable language from the first candidate language and the second candidate language based at least in part on the first frequency of occurrence and the second frequency of occurrence;
selecting, for use by the user of the computing device, electronic content in the identified operable language from among available content in multiple languages; and
providing the selected content to the computing device so that the selected content is arranged to be displayed to the user in the identified operable language and accompanying the requested one or more internet-accessible documents.
Dependent claims: 2-18
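Claim 1 leaves open how a "segmented result" is obtained from the string of characters. One common approach, shown here purely as an assumption and not as the method the claim requires, is dictionary-based word breaking: given a per-language lexicon, find a split of the string into known words, preferring the split with the fewest words. The function name `segment` and the sample lexicons are hypothetical.

```python
from typing import List, Optional, Set


def segment(text: str, lexicon: Set[str]) -> Optional[List[str]]:
    """Split `text` into lexicon words using dynamic programming.

    best[i] holds the shortest known segmentation of text[:i], or None
    if text[:i] cannot be covered by lexicon words at all.
    """
    n = len(text)
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []  # the empty prefix has the empty segmentation
    for end in range(1, n + 1):
        for start in range(end):
            prefix = best[start]
            word = text[start:end]
            if prefix is None or word not in lexicon:
                continue
            candidate = prefix + [word]
            current = best[end]
            if current is None or len(candidate) < len(current):
                best[end] = candidate
    return best[n]


# Invented lexicons, one per candidate language.
english = {"best", "shoes", "be", "st"}
french = {"pain", "chaud"}

print(segment("bestshoes", english))  # a two-word English split
print(segment("bestshoes", french))   # no French split exists
```

Running the same string through each candidate language's lexicon yields one segmented result per language (or none), which is the input the frequency comparison in the claim needs.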
19. A tangible and non-transitory computer-readable medium containing program code executable on a computer, comprising:
program code for receiving from a user of a computing device, at a computer server system, a request to receive one or more internet-accessible documents, the request having a string of characters;
program code for identifying at least a first candidate language and a second candidate language associated with a string of characters received in the request;
program code for determining at least a first segmented result associated with the first candidate language from the string of characters and a second segmented result associated with the second candidate language from the string of characters;
program code for determining a first frequency of occurrence for the first segmented result in a group of articles that are associated by the computer server system with the first language and a second frequency of occurrence for the second segmented result in a group of articles that are associated by the computer server system with the second language;
program code for identifying an operable language from the first candidate language and the second candidate language based at least in part on the first frequency of occurrence and the second frequency of occurrence;
program code for selecting, for use by the user of the computing device, electronic content in the identified operable language from among available content in multiple languages; and
program code for providing the selected content to the computing device so that the selected content is arranged to be displayed to the user in the identified operable language with the requested one or more internet-accessible documents.
Dependent claims: 20-35
36. A computer-implemented method, comprising:
receiving from a user of a computing device, at a computer server system, a request to receive one or more internet-accessible documents, the request having a string of characters that include a domain name;
determining at least a first segmented result in a first candidate language and at least a second segmented result in a second candidate language from the domain name;
determining at least a first frequency of occurrence for the first segmented result in a group of articles that are associated with the first language and based at least in part on at least one of an article index, a text index, and a search result set;
determining a second frequency of occurrence for the second segmented result in a group of articles that are associated with the second language;
if the first frequency of occurrence is greater than the second frequency of occurrence, then selecting the first candidate language as an operable language;
if the second frequency of occurrence is greater than the first frequency of occurrence, then selecting the second candidate language as the operable language;
selecting an advertisement from among available advertisements in multiple languages based at least in part on the operable language, wherein the advertisement includes text in the operable language; and
providing the selected advertisement to the computing device arranged to be displayed to the user with the requested one or more internet-accessible documents associated with the domain name.
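The decision steps specific to claim 36 (working from a domain name, the two explicit frequency comparisons, and the advertisement selection) can be sketched as follows. Everything here is a hypothetical illustration: `domain_label`, `choose_language`, and `select_ad` are invented names, the frequencies are taken as already computed, advertisements are modeled as simple dictionaries, and the domain handling is deliberately naive (it drops a leading "www" and takes the next label, without consulting any public-suffix list). The claim does not say what happens when the two frequencies are equal, so the sketch returns no language in that case.

```python
from typing import Dict, List, Optional


def domain_label(host: str) -> str:
    """Return the label of a host name that would be segmented.

    Naive assumption: drop a leading "www" and use the first remaining label.
    """
    labels = [part for part in host.lower().split(".") if part]
    if labels and labels[0] == "www":
        labels = labels[1:]
    return labels[0] if labels else ""


def choose_language(
    first_language: str,
    first_frequency: int,
    second_language: str,
    second_frequency: int,
) -> Optional[str]:
    """Apply the two comparisons recited in the claim."""
    if first_frequency > second_frequency:
        return first_language
    if second_frequency > first_frequency:
        return second_language
    return None  # tie: not addressed by the claim, left undecided here


def select_ad(ads: List[Dict[str, str]], language: Optional[str]) -> Optional[Dict[str, str]]:
    """Pick the first available advertisement whose text is in `language`."""
    for ad in ads:
        if ad["lang"] == language:
            return ad
    return None


# Invented sample data.
ads = [
    {"lang": "fr", "text": "Chaussures en solde"},
    {"lang": "en", "text": "Shoes on sale"},
]
label = domain_label("www.bestshoes.com")
language = choose_language("en", 120, "fr", 3)
print(label, language, select_ad(ads, language))
```

In a fuller pipeline, the label returned by `domain_label` would be segmented once per candidate language and the resulting frequencies passed to `choose_language`.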
Specification