Spelling correction of email queries
First Claim
1. A computer-implemented method for automatically correcting words in an email query, the method comprising:
- receiving an email query for mining at least one email repository, wherein the email query comprises at least one misspelled word;
- determining two or more features for personalized candidate spelling correction, wherein at least one feature comprises a contextual feature of emails in at least one email repository, and wherein the contextual feature comprises one or more of: subject lines, frequent contacts, recent contacts, and input device types;
- generating at least one personalized candidate spelling correction for the email query based on the two or more determined features;
- ranking the at least one personalized candidate spelling corrections based at least in part on a weighted score of two or more determined features;
- determining at least one personalized spelling correction for the at least one email query from the ranked personalized candidate spelling corrections; and
- applying the at least one personalized spelling correction to the email query to produce a second email query with automatically corrected words.
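The steps of the first claim (determine features, generate candidates, rank by a weighted score, apply the top correction) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the `SUBJECT_WORDS` and `FREQUENT_CONTACTS` tables, the `WEIGHTS` values, the `min_similarity` threshold, and the use of `difflib.SequenceMatcher` as a stand-in string-similarity feature.

```python
from difflib import SequenceMatcher

# Hypothetical mailbox context (assumed data, not from the patent).
SUBJECT_WORDS = {"budget": 12, "meeting": 30, "invoice": 7}  # word -> count in subject lines
FREQUENT_CONTACTS = {"margaret": 25, "meghan": 3}            # contact -> emails received
# Assumed, untuned feature weights for the weighted score.
WEIGHTS = {"similarity": 0.6, "subject": 0.2, "contact": 0.2}

VOCABULARY = set(SUBJECT_WORDS) | set(FREQUENT_CONTACTS)


def features(word, candidate):
    """Return the feature values used to score one candidate correction."""
    return {
        "similarity": SequenceMatcher(None, word, candidate).ratio(),
        "subject": SUBJECT_WORDS.get(candidate, 0) / max(SUBJECT_WORDS.values()),
        "contact": FREQUENT_CONTACTS.get(candidate, 0) / max(FREQUENT_CONTACTS.values()),
    }


def rank_candidates(word, min_similarity=0.6):
    """Generate candidates from the mailbox vocabulary and rank them by weighted score."""
    scored = []
    for candidate in VOCABULARY:
        values = features(word, candidate)
        if values["similarity"] < min_similarity:
            continue  # too dissimilar to be a plausible correction
        score = sum(WEIGHTS[name] * value for name, value in values.items())
        scored.append((score, candidate))
    return [candidate for _, candidate in sorted(scored, reverse=True)]


def correct_query(query):
    """Replace each unknown word with its top-ranked candidate, if any."""
    corrected = []
    for word in query.lower().split():
        if word in VOCABULARY:
            corrected.append(word)
            continue
        ranked = rank_candidates(word)
        corrected.append(ranked[0] if ranked else word)
    return " ".join(corrected)
```

Calling `correct_query("budgte meetng")` in this sketch produces the second, corrected query string. A real system would presumably learn the weights rather than fix them by hand.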
Abstract
Techniques and constructs to facilitate spelling correction of email queries can leverage features of email data to obtain candidate corrections particular to the email data being queried. The constructs may enable accurate spelling correction of email queries across languages and domains based on, for example, one or more of a language model such as a bigram language model and/or a normalized token IDF based language model, a translation model such as an edit distance translation model and/or a fuzzy match translation model, content-based features, and/or contextual features. Content-based features can include features associated with the subject line of emails, content including identified phrases, contacts, and/or the number of candidate emails returned. Contextual features can include a time window of subject match and/or contact match, a frequency of emails received from a contact, and/or device characteristics.
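As a rough illustration of how an edit distance translation model and a bigram language model might be combined, the following Python sketch scores candidates with an add-one smoothed bigram probability minus a per-edit penalty. The `SUBJECTS` corpus, the `penalty` constant, and the way the two models are combined are assumptions for illustration only; the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical subject lines standing in for the email data being queried.
SUBJECTS = ["quarterly budget review", "budget meeting notes", "team meeting agenda"]

# "<s>" is an assumed start-of-line marker for the bigram model.
token_lists = [["<s>"] + subject.split() for subject in SUBJECTS]
unigrams = Counter(token for tokens in token_lists for token in tokens)
bigrams = Counter(pair for tokens in token_lists for pair in zip(tokens, tokens[1:]))
vocab_size = len(unigrams)


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def bigram_logprob(prev_word, word):
    """Add-one smoothed log P(word | prev_word)."""
    return math.log((bigrams[(prev_word, word)] + 1) / (unigrams[prev_word] + vocab_size))


def score(prev_word, typed, candidate, penalty=2.0):
    """Language-model score minus an assumed, untuned penalty per edit."""
    return bigram_logprob(prev_word, candidate) - penalty * edit_distance(typed, candidate)


def best_correction(prev_word, typed):
    """Pick the vocabulary word with the highest combined score."""
    candidates = [word for word in unigrams if word != "<s>"]
    return max(candidates, key=lambda c: score(prev_word, typed, c))
```

For example, `best_correction("budget", "meetng")` favors a word that is both one edit away from the typed token and seen after "budget" in the subject lines.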
20 Claims
13. A computing device, comprising:
- at least one processing unit; and
- at least one memory storing computer executable instructions for correcting a plurality of misspelled words in emails, the instructions when executed by the at least one processing unit causing the computing device to:
  - identify, by a feature function module, features from email data in the email repository, wherein the feature function module comprises a context-based features module configured to score personalized candidate spelling corrections based at least on where personalized candidate spelling corrections occur in context of the email data, and wherein the context-based features module is configured to treat output of a translation model module as a feature;
  - generate, by a candidate generation module, the personalized candidate spelling corrections corresponding to a first email query, wherein the personalized candidate spelling corrections are based at least in part on the email data;
  - rank, by a ranking module, the personalized candidate spelling corrections based at least in part on a weighted score of the features from the email data; and
  - apply the ranked personalized candidate spelling corrections to the email query to produce a second email query with automatically corrected words.
19. A computer storage medium having thereon computer executable instructions, the computer-executable instructions upon execution configuring a computer to perform operations comprising:
- extracting a plurality of tokens in an email query;
- determining at least one feature associated with respective ones of the tokens, wherein the feature comprises at least one context of email;
- generating a vector of the at least one feature associated with the tokens;
- generating one or more personalized candidate spelling corrections associated with at least one of the tokens;
- ranking the one or more personalized candidate spelling corrections based at least in part on the vector of the feature associated with the tokens;
- selecting at least one word from the personalized candidate spelling corrections based on a number of emails associated with the personalized candidate spelling correction;
- replacing at least one word in the email query with the selected at least one word from the personalized candidate spelling corrections;
- executing the email query; and
- retrieving at least one email from an email repository based on a result of the executed email query.
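The operations of claim 19 (tokenize, build a feature vector, generate and rank candidates, select by email count, replace, execute, retrieve) can be sketched end to end in Python. This is only an illustration under stated assumptions: the in-memory `EMAILS` repository and its field names, the three-component feature vector, the `cutoff` value, and the use of `difflib.get_close_matches` for candidate generation are all hypothetical choices, not details taken from the claim.

```python
from difflib import SequenceMatcher, get_close_matches

# Hypothetical in-memory email repository; the field names are assumptions.
EMAILS = [
    {"id": 1, "subject": "budget meeting", "sender": "margaret"},
    {"id": 2, "subject": "budget review", "sender": "meghan"},
    {"id": 3, "subject": "lunch plans", "sender": "margaret"},
]


def email_words(email):
    """Searchable words of one email: subject words plus the sender name."""
    return set(email["subject"].split()) | {email["sender"]}


VOCABULARY = sorted(set().union(*(email_words(e) for e in EMAILS)))


def email_count(word):
    """Number of emails associated with a word."""
    return sum(word in email_words(e) for e in EMAILS)


def feature_vector(token, candidate):
    """Assumed features: [string similarity, occurs in a subject, occurs as a sender]."""
    in_subject = any(candidate in e["subject"].split() for e in EMAILS)
    is_sender = any(candidate == e["sender"] for e in EMAILS)
    return [SequenceMatcher(None, token, candidate).ratio(), int(in_subject), int(is_sender)]


def correct_token(token):
    """Generate candidates, rank them by feature vector, then select by email count."""
    if token in VOCABULARY:
        return token
    candidates = get_close_matches(token, VOCABULARY, n=5, cutoff=0.7)
    if not candidates:
        return token
    ranked = sorted(candidates, key=lambda c: feature_vector(token, c), reverse=True)
    # max() keeps the earliest item on ties, so rank order breaks ties in email count.
    return max(ranked, key=email_count)


def run_query(query):
    """Correct the query, execute it, and return (corrected query, matching email ids)."""
    corrected = [correct_token(token) for token in query.lower().split()]
    hits = [e["id"] for e in EMAILS if all(t in email_words(e) for t in corrected)]
    return " ".join(corrected), hits
```

Here `run_query("budgt meetng")` rewrites the query before matching it against the repository, so the retrieval step operates on the corrected tokens rather than the misspelled ones.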
Specification