Language input architecture for converting one text form to another text form with tolerance to spelling, typographical, and conversion errors
Abstract
A language input architecture converts input strings of phonetic text to an output string of language text. The language input architecture has a search engine, one or more typing models, a language model, and one or more lexicons for different languages. The typing model is configured to generate a list of probable typing candidates that may be substituted for the input string based on probabilities of how likely each of the candidate strings was incorrectly entered as the input string. The language model provides probable conversion strings for each of the typing candidates based on probabilities of how likely a probable conversion output string represents the candidate string. The search engine combines the probabilities of the typing and language models to find the most probable conversion string that represents a converted form of the input string.
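As a minimal sketch of how a search engine might combine the two models described in the abstract, the Python below multiplies a typing probability by a language probability (in log space) and keeps the best result. The table names `TYPING_MODEL` and `LANGUAGE_MODEL`, the function `best_conversion`, the Pinyin-to-Hanzi setting, and every probability value are assumptions made for illustration; none of them comes from the patent.

```python
import math

# Hypothetical typing model: P(typed input | intended candidate string).
TYPING_MODEL = {
    "mafan": {"mafan": 0.90, "mafna": 0.05},
    "mafang": {"mafang": 0.90, "mafan": 0.04, "mafna": 0.01},
}

# Hypothetical language model: P(output text | candidate string).
LANGUAGE_MODEL = {
    "mafan": {"麻烦": 0.9},
    "mafang": {"马房": 0.6},
}


def best_conversion(typed):
    """Return (output, candidate, log_score) with the highest combined
    probability, or None when no candidate explains the typed input."""
    best = None
    for candidate, typed_probs in TYPING_MODEL.items():
        p_typing = typed_probs.get(typed, 0.0)
        if p_typing == 0.0:
            continue  # this candidate was never observed as the typed input
        for output, p_language in LANGUAGE_MODEL[candidate].items():
            # Sum of logs equals the log of the product of both probabilities.
            score = math.log(p_typing) + math.log(p_language)
            if best is None or score > best[2]:
                best = (output, candidate, score)
    return best
```

With these made-up tables, the mistyped input `"mafna"` resolves to the candidate `"mafan"`, because 0.05 × 0.9 exceeds 0.01 × 0.6 for the competing candidate `"mafang"`.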
40 Claims
1. A language input architecture, tangibly embodied on one or more computer-readable media, comprising:
a user interface to receive an input string, the input string containing a spelling error;
a typing model to automatically correct spelling errors and to determine typing candidates, each typing candidate being associated with a first probability representing how likely the typing candidate would be correctly entered as the input string, wherein the first probability is derived from data collected from multiple users entering a training text; and
a language model to determine conversion candidates associated with each of the typing candidates, each conversion candidate being associated with a second probability representing a likelihood of the conversion candidate being in context with preceding strings.
Dependent claims: 2, 3, 15, 16

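Claim 1 derives the first probability from data collected from multiple users entering a training text. One plausible reading, sketched below in Python, is a relative-frequency estimate of how often each intended string was typed as each observed string. The function names `train_typing_model` and `typing_candidates`, and the assumed input format of per-user `(intended, typed)` pairs, are hypothetical and not taken from the patent.

```python
from collections import Counter, defaultdict


def train_typing_model(sessions):
    """Estimate P(typed | intended) from (intended, typed) pairs.

    sessions is a list with one entry per user; each entry is a list of
    (intended, typed) pairs recorded while that user entered the training text.
    """
    counts = defaultdict(Counter)
    for user_pairs in sessions:
        for intended, typed in user_pairs:
            counts[intended][typed] += 1
    model = {}
    for intended, typed_counts in counts.items():
        total = sum(typed_counts.values())
        model[intended] = {typed: n / total for typed, n in typed_counts.items()}
    return model


def typing_candidates(model, input_string):
    """Return (candidate, probability) pairs for every intended string that
    was observed being typed as input_string, most likely first."""
    found = [
        (intended, probs[input_string])
        for intended, probs in model.items()
        if input_string in probs
    ]
    return sorted(found, key=lambda pair: pair[1], reverse=True)
```

Pooling the pairs across users before normalizing means that frequent typists contribute more observations; a per-user average would be an equally defensible choice under different assumptions.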
4. A language input architecture, tangibly embodied on one or more computer-readable media, comprising:
a typing model to receive an input string and determine a typing error probability of how likely a candidate string would be incorrectly entered as the input string, wherein the typing model is trained using data collected from multiple users entering a training text;
a language model to determine a language text probability of how likely an output string represents the candidate string; and
a search engine to selectively convert the input string to the output string based on the typing error probability and the language text probability.
Dependent claims: 5, 6, 7, 8, 9, 10, 11

12. One or more computer-readable media having computer-executable instructions that, when executed on a processor, direct a computer to:
receive an input string;
determine typing candidates based, at least in part, on a typing model configured to automatically correct a spelling error, each typing candidate being associated with a first probability representing how likely the typing candidate would be correctly entered as the input string, wherein the first probability is associated with observed errors made during entry of training text by trainers;
determine a conversion string associated with the typing candidate based, at least in part, on a statistical language model, the conversion string being associated with a second probability of how likely the conversion string accurately represents the typing candidate based on a language context of the input string; and
convert the input string to the conversion string.

13. One or more computer-readable media having computer-executable instructions that, when executed on a processor, direct a computer to:
receive an input string;
determine typing candidate strings that may be used to replace the input string based on a first probability of how likely each typing candidate string would be correctly entered as the input string, wherein the typing candidate strings are based on a typing model that is trained on actual data collected from multiple users entering at least one training text;
use at least one of the typing candidate strings to derive an output string based on the first probability and a second probability of how likely the output string is within a context associated with previously entered strings; and
convert the input string to the output string.

14. A method for handling text input comprising:
receiving an input string having a spelling error;
determining typing candidates based, at least in part, on a typing model configured to automatically correct the spelling error, each typing candidate being associated with a first probability representing how likely the typing candidate would be correctly entered as the input string, wherein the first probability is associated with observed errors made during entry of training text by trainers;
determining conversion candidates associated with each of the typing candidates based, at least in part, on a statistical language model, each conversion candidate being associated with a second probability representing a likelihood of the conversion candidate being within a context associated with the input string;
selecting a conversion candidate from the determined conversion candidates based, at least in part, on the first probability and the second probability; and
converting the input string using the selected conversion candidate.

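The selecting step of claim 14 weighs a first (typing) probability against a second (context) probability. The Python sketch below illustrates one way this could work, assuming a bigram model conditioned on the single preceding word stands in for the statistical language model. `BIGRAM`, `FLOOR`, `select_conversion`, the Pinyin examples, and all numeric values are hypothetical.

```python
# Hypothetical bigram probabilities P(conversion | preceding word).
BIGRAM = {
    ("我", "是"): 0.2,
    ("我", "事"): 0.001,
    ("第", "四"): 0.3,
}
FLOOR = 1e-6  # assumed fallback probability for bigrams not in the table


def select_conversion(preceding, typing_candidates, conversions):
    """Pick the conversion candidate with the highest first * second probability.

    typing_candidates maps each typing candidate to its first probability.
    conversions maps each typing candidate to its list of conversion candidates.
    Returns None when there is nothing to score.
    """
    scored = []
    for candidate, p_first in typing_candidates.items():
        for conversion in conversions.get(candidate, []):
            p_second = BIGRAM.get((preceding, conversion), FLOOR)
            scored.append((p_first * p_second, conversion))
    return max(scored)[1] if scored else None
```

Given typing candidates `{"shi": 0.8, "si": 0.1}` with conversions `{"shi": ["是", "事"], "si": ["四", "死"]}`, the preceding word `"我"` selects `"是"`, while the preceding word `"第"` selects `"四"` even though `"si"` has the lower typing probability. This is the sense in which context can override the more likely keystroke reading.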
17. A method for handling text input comprising:
receiving an input string;
determining a typing candidate based, at least in part, on a typing model, the typing candidate being associated with a first probability of how likely the typing candidate would be correctly entered as the input string, wherein the first probability is derived from data collected from multiple users entering a training text;
determining a conversion string associated with the typing candidate based, at least in part, on a statistical language model, the conversion string being associated with a second probability of how likely the conversion string accurately represents the typing candidate based on textual elements entered previous to the input string; and
converting the input string to the conversion string.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25

26. A method for handling text input comprising:
segmenting an input string in multiple different ways to produce multiple typing candidates based, at least in part, on a typing model, each of the multiple typing candidates being associated with a first probability of how likely the typing candidate would be correctly entered as the input string, wherein the first probability is derived from data collected from multiple users entering a training text;
determining multiple candidate strings that may be used to replace the input string based, at least in part, on a statistical language model, each of the candidate strings being associated with a second probability of how likely the candidate string accurately represents the typing candidate; and
associating an output string with at least one of the candidate strings based, at least in part, on the first and second probabilities associated with the at least one candidate strings.
Dependent claims: 27, 28, 29, 30, 31

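The segmenting step of claim 26 can be pictured as enumerating every split of the input into known units and attaching a probability to each split. The Python sketch below assumes the units are syllables drawn from a small inventory and that a per-syllable probability stands in for the typing model's first probability. Both assumptions, along with the names `segmentations` and `rank_segmentations`, are illustrative only.

```python
def segmentations(text, syllables):
    """Return every way to split text into consecutive items from syllables."""
    if not text:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in syllables:
            for rest in segmentations(text[end:], syllables):
                results.append([head] + rest)
    return results


def rank_segmentations(text, syllable_prob):
    """Score each segmentation by the product of its (hypothetical)
    per-syllable probabilities and return (probability, segmentation)
    pairs, most likely first."""
    ranked = []
    for segmentation in segmentations(text, set(syllable_prob)):
        probability = 1.0
        for syllable in segmentation:
            probability *= syllable_prob[syllable]
        ranked.append((probability, segmentation))
    return sorted(ranked, key=lambda pair: pair[0], reverse=True)
```

For the input `"xian"` and the inventory `{"xi", "an", "xian"}`, the enumeration yields two typing candidates, `["xi", "an"]` and `["xian"]`. Each would then be handed to the language model for the second probability.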
32. A method comprising:
receiving an input string;
determining typing candidates based, at least in part, on a typing model configured to automatically correct a spelling error, each typing candidate being associated with a first probability representing how likely the typing candidate would be correctly entered as the input string, wherein the first probability is associated with observed errors made during entry of training text by trainers;
determining a conversion string associated with the typing candidate based, at least in part, on a statistical language model, the conversion string being associated with a second probability of how likely the conversion string accurately represents the typing candidate based on a language context of the input string; and
converting the input string to the conversion string.

33. A language input architecture, tangibly embodied on one or more computer-readable media, comprising:
a user interface to receive an input string, the input string containing a spelling error;
a typing model to generate a list of probable candidate strings that may be substituted for the input string based on first typing error probabilities, each first typing error probability indicating a likelihood that the corresponding candidate string would be incorrectly entered as the input string, the typing model being trained on actual data collected from multiple users entering at least one training text;
determining a conversion string associated with each candidate string based, at least in part, on a statistical language model, the conversion string being associated with a second probability of how likely the conversion string accurately represents the typing candidate based on textual elements entered previous to the input string; and
converting the input string to the conversion string.

34. A language input architecture, tangibly embodied on one or more computer-readable media, comprising:
a typing model to generate a list of probable candidate strings that may be substituted for an input string written in phonetic text based on typing error probabilities of how likely each of the candidate strings would be incorrectly entered as the input string, wherein the typing model is trained using data collected from multiple users entering a training text; and
a language model to provide output strings written in language text for each of the candidate strings based on language text probabilities of how likely each of the output strings accurately represents the corresponding candidate string.
Dependent claims: 35, 36, 37, 38

39. One or more computer-readable media having computer-executable instructions that, when executed on a processor, direct a computer to:
analyze an input string having a spelling error;
determine typing candidates based, at least in part, on a typing model configured to automatically correct the spelling error, each typing candidate being associated with a first probability representing how likely the typing candidate would be correctly entered as the input string, wherein the first probability is derived from data collected from multiple users entering a training text;
determine conversion candidates associated with each of the typing candidates based, at least in part, on a statistical language model, each conversion candidate being associated with a second probability representing a likelihood of the conversion candidate being within a context associated with the input string;
select a conversion candidate from the determined conversion candidates based, at least in part, on the first probability and the second probability; and
convert the input string using the selected conversion candidate.

40. A language input architecture, tangibly embodied on one or more computer-readable media, implemented at least in part on one or more computing devices, comprising:
a user interface to receive an input string, the input string containing a spelling error;
a first typing model to generate a list of probable candidate strings that may be substituted for the input string based on typing error probabilities of how likely each of the candidate strings was incorrectly entered as the input string, the first typing model being trained in a first language and on actual data collected from multiple users entering at least one training text;
a second typing model to generate a list of probable candidate strings that may be substituted for the input string based on typing error probabilities of how likely each of the candidate strings was incorrectly entered as the input string, the second typing model being trained in a second language, wherein each candidate string is associated with a first probability;
a language model to determine conversion candidates associated with each of the candidate strings, each conversion candidate being associated with a second probability representing a likelihood of the conversion candidate being in context with preceding strings; and
a search engine to select a particular candidate string with a combination of first and second probabilities that represents the least likelihood of typing errors and the greatest likelihood of an accurate conversion and converting the input string to an output string that is associated with the particular candidate string.

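Claim 40 runs the same input string through two typing models trained in different languages. A rough Python sketch of that arrangement follows: each model proposes candidates with a first probability, and the proposals are pooled so a later stage can compare them. The function name `pooled_candidates`, the language labels, and the probability values in the usage note are assumptions for illustration, not details from the patent.

```python
def pooled_candidates(input_string, typing_models):
    """Collect (probability, language, candidate) from every typing model.

    typing_models maps a language label to a table of
    P(typed input | intended candidate) for that language.
    The result is sorted with the most probable candidate first.
    """
    pooled = []
    for language, model in typing_models.items():
        for candidate, typed_probs in model.items():
            probability = typed_probs.get(input_string, 0.0)
            if probability > 0.0:
                pooled.append((probability, language, candidate))
    return sorted(pooled, reverse=True)
```

For example, with a hypothetical English table `{"woman": {"woman": 0.95}}` and a hypothetical Pinyin table `{"women": {"women": 0.9, "woman": 0.03}}`, the input `"woman"` yields the English candidate `"woman"` first and the Pinyin candidate `"women"` second. In the claimed architecture the pooled candidates would still be combined with the language model's second probability before the search engine makes its selection.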
Specification