Spelling Correction of Email Queries
First Claim
1. A method comprising:
mining at least one email repository;
determining at least two features for candidate correction of an email query as at least two determined features based on at least one of:
content from the email repository;
a language model;
a translation model;
or context of the email repository;
generating candidate corrections for the email query from the at least two determined features;
ranking the candidate corrections based at least on the determined features; and
identifying at least one correction from the email repository according to ranks associated with the respective candidate corrections.
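As a rough illustration of the claimed steps (mining, feature determination, candidate generation, ranking, identification), the Python sketch below corrects a single query token against a small in-memory repository. The function names `mine_vocabulary` and `correct`, the distance threshold of 2, and the choice of edit distance plus term frequency as the two features are assumptions made for illustration only; the claim does not prescribe them.

```python
import re
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mine_vocabulary(emails: list[str]) -> Counter:
    """Mine the repository: count how often each token occurs."""
    return Counter(
        tok for text in emails for tok in re.findall(r"[a-z0-9']+", text.lower())
    )


def correct(query_token: str, vocab: Counter, max_distance: int = 2) -> list[str]:
    """Generate candidates from the mined vocabulary and rank them.

    Feature 1 (assumed): edit distance to the query token.
    Feature 2 (assumed): frequency of the candidate in the repository.
    """
    token = query_token.lower()
    candidates = []
    for word, freq in vocab.items():
        distance = edit_distance(token, word)
        if distance <= max_distance:
            candidates.append((word, distance, freq))
    # Closer candidates first; ties broken by higher frequency, then alphabetically.
    candidates.sort(key=lambda c: (c[1], -c[2], c[0]))
    return [word for word, _, _ in candidates]


emails = [
    "Quarterly budget review meeting",
    "Budget approval for the marketing team",
    "Lunch meeting moved to Friday",
]
vocab = mine_vocabulary(emails)
print(correct("budgte", vocab))
```

Because candidates are drawn only from tokens that occur in the repository, any correction returned is by construction one "from the email repository".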
Abstract
Techniques and constructs to facilitate spelling correction of email queries can leverage features of email data to obtain candidate corrections particular to the email data being queried. The constructs may enable accurate spelling correction of email queries across languages and domains based on, for example, one or more of a language model such as a bigram language model and/or a normalized token IDF based language model, a translation model such as an edit distance translation model and/or a fuzzy match translation model, content-based features, and/or contextual features. Content-based features can include features associated with the subject line of emails, content including identified phrases, contacts, and/or the number of candidate emails returned. Contextual features can include a time window of subject match and/or contact match, a frequency of emails received from a contact, and/or device characteristics.
20 Claims
1. A method comprising:
mining at least one email repository;
determining at least two features for candidate correction of an email query as at least two determined features based on at least one of:
content from the email repository;
a language model;
a translation model;
or context of the email repository;
generating candidate corrections for the email query from the at least two determined features;
ranking the candidate corrections based at least on the determined features; and
identifying at least one correction from the email repository according to ranks associated with the respective candidate corrections.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A device comprising:
one or more computer-readable media having thereon a plurality of modules and an email repository;
a processing unit operably coupled to the computer-readable media, the processing unit adapted to execute modules of the plurality of modules comprising:
a feature function module configured to identify features from email data in the email repository, the feature function module including a content-based features module configured to score candidate corrections according to where respective of the candidate corrections occur in content of the email data;
a candidate generation module configured to generate the candidate corrections corresponding to an email query, the candidate corrections being based at least in part on the email data; and
a ranking module configured to rank the candidate corrections based at least in part on the features from the email data upon which the candidate corrections are based.
Dependent claims: 15, 16, 17, 18, 19
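One way to picture the content-based features module is a scorer that weights a candidate by the email field in which it occurs. The sketch below is only an assumption-laden illustration: the `Email` record, the `FIELD_WEIGHTS` values, and the functions `content_score` and `rank_candidates` are hypothetical and do not come from the claim.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    subject: str
    body: str
    contacts: list[str] = field(default_factory=list)


# Hypothetical weights: a subject-line hit counts more than a body hit.
FIELD_WEIGHTS = {"subject": 3.0, "contacts": 2.0, "body": 1.0}


def content_score(candidate: str, emails: list[Email]) -> float:
    """Score a candidate according to where it occurs in the email data."""
    cand = candidate.lower()
    score = 0.0
    for email in emails:
        if cand in email.subject.lower().split():
            score += FIELD_WEIGHTS["subject"]
        if any(cand in contact.lower() for contact in email.contacts):
            score += FIELD_WEIGHTS["contacts"]
        if cand in email.body.lower().split():
            score += FIELD_WEIGHTS["body"]
    return score


def rank_candidates(candidates: list[str], emails: list[Email]) -> list[str]:
    """Order candidates from highest to lowest content-based score."""
    return sorted(candidates, key=lambda c: content_score(c, emails), reverse=True)


repository = [
    Email("Budget review", "Please see the attached budget", ["alice@example.com"]),
    Email("Lunch", "budget talk over lunch", ["bob@example.com"]),
]
print(rank_candidates(["budge", "bob", "budget"], repository))
```

In this sketch the ranking module is reduced to a single sort; a fuller system would presumably combine this score with the other features the claim mentions.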
20. A computer-readable medium having thereon computer-executable instructions, the computer-executable instructions upon execution configuring a computer to perform operations comprising:
identifying a plurality of tokens in an email query;
identifying features associated with respective of the tokens, the features associated with at least one of email contacts or email content;
creating respective vectors of the features associated with respective of the tokens;
identifying candidate corrections for the respective tokens; and
combining the respective vectors to rank the candidate corrections based at least on contextual aspects of the features.
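The token-level operations above might be sketched as follows. Everything specific here is an assumption for illustration: the three features (fuzzy similarity, contact membership, content frequency), the `WEIGHTS` values, the element-wise sum used to combine vectors, and the function names. Candidate lookup uses `difflib.get_close_matches` from the Python standard library as a stand-in for a fuzzy match step.

```python
import difflib
import itertools

# Hypothetical weights for [similarity, is_contact, content_frequency].
WEIGHTS = [1.0, 2.0, 0.5]


def feature_vector(token, candidate, contacts, content_counts):
    """Build one feature vector for a (token, candidate) pair."""
    return [
        difflib.SequenceMatcher(None, token, candidate).ratio(),
        1.0 if candidate in contacts else 0.0,
        float(content_counts.get(candidate, 0)),
    ]


def rank_query_corrections(query, contacts, content_counts):
    """Rank whole-query corrections by combining per-token feature vectors."""
    vocabulary = sorted(set(contacts) | set(content_counts))
    per_token = []
    for token in query.lower().split():
        # Fall back to the token itself when nothing in the mailbox is close.
        options = difflib.get_close_matches(token, vocabulary, n=3, cutoff=0.6) or [token]
        per_token.append(
            [(c, feature_vector(token, c, contacts, content_counts)) for c in options]
        )
    ranked = []
    for combo in itertools.product(*per_token):
        # Combine the token vectors element-wise, then apply the weights.
        combined = [sum(values) for values in zip(*(vec for _, vec in combo))]
        score = sum(w * x for w, x in zip(WEIGHTS, combined))
        ranked.append((" ".join(c for c, _ in combo), score))
    ranked.sort(key=lambda item: -item[1])
    return ranked


contacts = {"alice", "alicia"}
content_counts = {"budget": 4, "budge": 1, "report": 2}
for correction, score in rank_query_corrections("alcie budgte", contacts, content_counts):
    print(f"{score:.3f}  {correction}")
```

Note that "budge" is closer to the misspelled token by string similarity alone, but the content-frequency feature lifts "budget" above it, which is the kind of mailbox-specific signal the claim relies on.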
Specification