Method of spell-checking search queries
First Claim
1. A method comprising:
receiving a search query that includes a query term;
identifying, from a corpus of documents, text patterns that each include the query term occurring adjacent to one or more other query terms;
determining a first quantity of occurrences of the text patterns that each include the query term occurring adjacent to the one or more other terms, in the corpus of documents;
determining a second quantity of occurrences of text patterns that each include a heterographic homophone of the query term occurring adjacent to the one or more other terms, in the corpus of documents; and
determining, by one or more computers, whether to revise the received search query to include the heterographic homophone of the query term, based on comparing the first quantity and the second quantity.
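The steps of claim 1 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation. The homophone table `HOMOPHONES`, the one-word context window on each side, and the `ratio` threshold are all hypothetical, because the claim does not say how homophones are found or how the two quantities are compared.

```python
# Hypothetical homophone table; the claim does not specify its source.
HOMOPHONES = {
    "break": ["brake"],
    "brake": ["break"],
    "their": ["there"],
    "there": ["their"],
}


def count_pattern(corpus, words):
    """Count occurrences of the contiguous word sequence `words` in the corpus."""
    n = len(words)
    total = 0
    for doc in corpus:
        tokens = doc.lower().split()
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words:
                total += 1
    return total


def maybe_revise(query, corpus, ratio=2.0):
    """Return the query, with a term swapped for a homophone when the
    homophone's text pattern is clearly more frequent in the corpus."""
    tokens = query.lower().split()
    for i, term in enumerate(tokens):
        # Text pattern: the term plus its immediate neighbours (assumed window).
        lo, hi = max(0, i - 1), min(len(tokens), i + 2)
        if hi - lo < 2:
            continue  # no adjacent term, so there is no pattern to compare
        for alt in HOMOPHONES.get(term, []):
            original = tokens[lo:hi]
            candidate = list(original)
            candidate[i - lo] = alt
            first = count_pattern(corpus, original)    # first quantity
            second = count_pattern(corpus, candidate)  # second quantity
            # Assumed comparison rule: revise only on a clear majority.
            if second > ratio * first:
                tokens[i] = alt
                break
    return " ".join(tokens)


corpus = [
    "please fix the brake pads on my car",
    "brake pads wear out",
    "take a break today",
]
print(maybe_revise("break pads", corpus))
print(maybe_revise("take a break", corpus))
```

With this toy corpus, "break pads" never occurs while "brake pads" occurs twice, so the first query is revised. The second query is left alone because "a break" occurs and "a brake" does not.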
Abstract
A computer-implemented method for determining whether a target text-string is correctly spelled is provided. The target text-string is compared to a corpus to determine a set of contexts which each include an occurrence of the target text-string. Using heuristics, each context of the set is characterized based on occurrences in the corpus of the target text-string and a reference text-string. Contexts are characterized as including a correct spelling of the target text-string, an incorrect spelling of the reference text-string, or including an indeterminate usage of the target text-string. A likelihood that the target text-string is a misspelling of the reference text-string is computed as a function of the quantity of contexts including a correct spelling of the target text-string and the quantity of contexts including an incorrect spelling of a reference text-string. In one application, the target text-string is received in a search query, the search executed following a spell-check.
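One way to read the abstract's context-based likelihood is sketched below. The abstract does not state the heuristics or the function used, so everything specific here is an assumption made for illustration: a context is taken to be the pair of neighbouring words, the `factor` threshold decides each context's label, and the likelihood is the share of decided contexts labelled as misspellings.

```python
from collections import Counter


def context_counts(term, corpus):
    """Count each (left word, right word) context in which `term` occurs."""
    counts = Counter()
    for doc in corpus:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                left = tokens[i - 1] if i > 0 else None
                right = tokens[i + 1] if i + 1 < len(tokens) else None
                counts[(left, right)] += 1
    return counts


def classify(target_n, reference_n, factor=3):
    """Assumed heuristic: label one context by comparing the two counts."""
    if reference_n >= factor * target_n:
        return "misspelling"    # the reference dominates this context
    if reference_n == 0 or target_n >= factor * reference_n:
        return "correct"        # the target dominates this context
    return "indeterminate"


def misspelling_likelihood(target, reference, corpus, factor=3):
    """Share of decided contexts in which the target looks like a
    misspelling of the reference; None when no context is decided."""
    target_ctx = context_counts(target, corpus)
    reference_ctx = context_counts(reference, corpus)
    labels = Counter(
        classify(n, reference_ctx[ctx], factor) for ctx, n in target_ctx.items()
    )
    decided = labels["correct"] + labels["misspelling"]
    if decided == 0:
        return None
    return labels["misspelling"] / decided


corpus = ["i will recieve the package"] + ["you will receive the package"] * 3
print(misspelling_likelihood("recieve", "receive", corpus))
print(misspelling_likelihood("receive", "recieve", corpus))
```

Indeterminate contexts are left out of the ratio, which follows the abstract's statement that the likelihood depends on the quantities of the other two context types.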
28 Claims
1. A method comprising:
receiving a search query that includes a query term;
identifying, from a corpus of documents, text patterns that each include the query term occurring adjacent to one or more other query terms;
determining a first quantity of occurrences of the text patterns that each include the query term occurring adjacent to the one or more other terms, in the corpus of documents;
determining a second quantity of occurrences of text patterns that each include a heterographic homophone of the query term occurring adjacent to the one or more other terms, in the corpus of documents; and
determining, by one or more computers, whether to revise the received search query to include the heterographic homophone of the query term, based on comparing the first quantity and the second quantity.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving a search query that includes a query term;
identifying, from a corpus of documents, text patterns that each include the query term occurring adjacent to one or more other query terms;
determining a first quantity of occurrences of the text patterns that each include the query term occurring adjacent to the one or more other terms, in the corpus of documents;
determining a second quantity of occurrences of text patterns that each include a heterographic homophone of the query term occurring adjacent to the one or more other terms, in the corpus of documents; and
determining, by one or more computers, whether to revise the received search query to include the heterographic homophone of the query term, based on comparing the first quantity and the second quantity.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
20. A non-transitory computer-readable storage device storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
receiving a search query that includes a query term;
identifying, from a corpus of documents, text patterns that each include the query term occurring adjacent to one or more other query terms;
determining a first quantity of occurrences of the text patterns that each include the query term occurring adjacent to the one or more other terms, in the corpus of documents;
determining a second quantity of occurrences of text patterns that each include a heterographic homophone of the query term occurring adjacent to the one or more other terms, in the corpus of documents; and
determining, by one or more computers, whether to revise the received search query to include the heterographic homophone of the query term, based on comparing the first quantity and the second quantity.
Dependent claims: 21, 22, 23, 24, 25, 26, 27, 28
Specification