LEARNING LANGUAGE MODELS FROM SCRATCH BASED ON CROWD-SOURCED USER TEXT INPUT
Abstract
Technology is described for developing a language model for a language recognition system from scratch based on aggregating and analyzing text input from multiple users of the language. The technology allows a user to select a language, and if no existing language model is available for the selected language, provides a new language model for the selected language, monitors and collects information about the use of words in the selected language, combines information collected from multiple users of the selected language, and updates the user's language model based on the combined information from multiple users of the selected language.
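The flow in the abstract (start an empty model when none exists, collect word usage from several users, combine it, and produce an update) can be sketched roughly as below. This is only an illustrative sketch, not the patented implementation: the class CrowdLanguageModel, the helper get_or_create_model, the unigram-count representation, and the language code "xx" are all assumptions introduced here.

```python
from collections import Counter


class CrowdLanguageModel:
    """Hypothetical unigram model built only from crowd-sourced word counts."""

    def __init__(self, language):
        self.language = language
        self.counts = Counter()   # combined counts from all users so far
        self.pending = Counter()  # counts collected since the last update

    def collect(self, user_words):
        """Record the words one user typed in this language."""
        self.pending.update(w.lower() for w in user_words)

    def generate_update(self):
        """Fold pending counts into the model and return them as an update."""
        update = dict(self.pending)
        self.counts.update(self.pending)
        self.pending.clear()
        return update


def get_or_create_model(models, language):
    """Return the existing model, or start an empty one from scratch."""
    if language not in models:
        models[language] = CrowdLanguageModel(language)
    return models[language]


models = {}
model = get_or_create_model(models, "xx")  # "xx" is a placeholder language code
model.collect(["salve", "amice"])          # words typed by user A
model.collect(["salve", "magister"])       # words typed by user B
update = model.generate_update()           # would be sent to each device
```

Here the update is simply the batch of new counts; a real system could equally send n-gram statistics or a retrained model, which the abstract does not specify.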
23 Claims
1. A tangible computer-readable memory having contents configured to cause at least one computer having a processor to perform a method for assisting in building a new language model used by language recognition systems, the method comprising:
initializing a language model for a selected language, wherein a language recognition system that uses a language model to predict words in a language is ineffective to predict intended words in the selected language;
monitoring use of words in the selected language on various computing devices by multiple users of the selected language;
collecting, in substantially real-time, information about the monitored use of the words in the selected language by the multiple users of the selected language;
generating updates to the language model based on the collected information about the monitored use of the words in the selected language; and
providing to the various computing devices the generated updates to the language model, such that a language recognition system using the language model including the generated updates is more effective to predict intended words in the selected language.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method in a computing system of assisting in building a new language model used by a language recognition system to predict words in a language, the method comprising:
distinguishing a language;
determining whether a substantially complete language model is available for the distinguished language;
when a substantially complete language model is not available for the distinguished language, monitoring, on the computing system, use of words in the distinguished language by a user of the computing system substantially in real time;
collecting, in a language model on the computing system, information about the monitored use of the words in the distinguished language;
receiving updates to the language model on the computing system based on additional information about use of words in the distinguished language by other users of the distinguished language monitored substantially in real time; and
predicting in response to user input, by the language recognition system, a word in the distinguished language intended by the user, wherein the predicting is based on the information in the language model, including the information about the monitored use of words in the distinguished language and the additional information collected from other users of the distinguished language.
Dependent claims: 14, 15, 16, 17, 18, 19, 20, 21
22. A system for assisting in building a language model used by a language recognition system to predict words in a language, the system comprising:
at least one memory storing computer-executable instructions of:
a component configured to associate a crowd-sourced language model with the language;
for one of multiple computing devices:
a component configured to identify user input of words on the computing device as use of words in the language;
a component configured to monitor use of words in the language on the computing device substantially in real time;
a component configured to collect, in the crowd-sourced language model, information about the monitored use of the words in the distinguished language on the multiple computing devices;
a component configured to generate updates to the crowd-sourced language model based on the collected information about the monitored use of the words in the language; and
a component configured to provide to each of the multiple devices the generated updates to the language model; and
at least one processor for executing the computer-executable instructions stored in the at least one memory.
Dependent claims: 23
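The device-side steps recited in claims 13 and 22 (collect the local user's words, receive updates derived from other users, then predict an intended word) might look something like the following. This is a rough sketch under stated assumptions: the DeviceModel class, its method names, and the prefix-plus-frequency prediction rule are hypothetical and are not taken from the claims.

```python
from collections import Counter


class DeviceModel:
    """Hypothetical on-device model: word counts from local use plus updates."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, word):
        """Collect one word typed by the local user."""
        self.counts[word.lower()] += 1

    def apply_update(self, update):
        """Merge an update (word -> count) based on other users' input."""
        self.counts.update(update)

    def predict(self, prefix):
        """Return the most frequent known word starting with the prefix."""
        prefix = prefix.lower()
        candidates = [(c, w) for w, c in self.counts.items()
                      if w.startswith(prefix)]
        if not candidates:
            return None  # the model has nothing to offer yet
        # Highest count wins; ties are broken alphabetically.
        return min(candidates, key=lambda cw: (-cw[0], cw[1]))[1]


device = DeviceModel()
device.observe("salve")                           # local user's input
device.apply_update({"salve": 3, "salvete": 5})   # update from other users
prediction = device.predict("sal")
```

With only the local observation, the device could predict nothing but "salve"; after the update, the combined counts favor "salvete", which illustrates how input from other users changes the prediction.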
Specification