CORRECTION OF MISSPELLINGS IN QA SYSTEM
First Claim
1. A computer implemented method for identifying and correcting a misspelling in a question answering (QA) system, wherein the QA system is coupled to a document corpus, and the document corpus includes a plurality of documents related to a particular domain, the method comprising:
receiving, by a processor coupled to one or more user devices, an input question and a plurality of passages, wherein the plurality of passages are extracted from the document corpus by the QA system;
providing, by the processor, at least one alternate form for each token extracted from the input question and the plurality of passages;
identifying, by the processor, at least one misspelled token from the input question and the plurality of passages; and
scoring, by the processor, at least one alternate form of each identified misspelled token.
Abstract
Embodiments provide a computer implemented method for identifying and correcting a misspelling in a question answering (QA) system, wherein the QA system is coupled to a document corpus, and the document corpus includes a plurality of documents related to a particular domain. The method includes the following steps: receiving an input question and a plurality of passages, wherein the plurality of passages are extracted from the document corpus by the QA system; providing at least one alternate form for each token extracted from the input question and the plurality of passages; identifying at least one misspelled token; and scoring at least one alternate form of each identified misspelled token.
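The four steps summarized above (receive, provide alternate forms, identify misspelled tokens, score alternates) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the `LEXICON` set, the regular-expression tokenizer, the rule "not in the lexicon means misspelled", and the use of the standard library's `difflib` similarity ratio as the score.

```python
import re
from difflib import SequenceMatcher, get_close_matches

# Hypothetical domain lexicon; the source does not specify one.
LEXICON = {"what", "is", "the", "dose", "of", "aspirin",
           "for", "adults", "daily", "recommended"}

def tokenize(text):
    """Lowercase alphabetic tokens; a deliberate simplification."""
    return re.findall(r"[a-z]+", text.lower())

def alternate_forms(token, lexicon=LEXICON, n=3):
    """Provide alternate forms: lexicon entries similar to the token."""
    return get_close_matches(token, sorted(lexicon), n=n, cutoff=0.6)

def correct(question, passages, lexicon=LEXICON):
    """Return {misspelled_token: [(alternate, score), ...]}, best score first."""
    # Receive the input question and the passages, then tokenize both.
    tokens = tokenize(question)
    for passage in passages:
        tokens.extend(tokenize(passage))

    result = {}
    for tok in dict.fromkeys(tokens):  # de-duplicate while keeping order
        alternates = alternate_forms(tok, lexicon)
        # Identify misspellings (assumed rule: token is absent from the lexicon).
        if tok in lexicon:
            continue
        # Score each alternate form of the misspelled token (assumed metric).
        scored = [(alt, SequenceMatcher(None, tok, alt).ratio())
                  for alt in alternates]
        result[tok] = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return result

suggestions = correct(
    "What is the dose of asprin?",
    ["The recommended daily dose of aspirin for adults"],
)
```

Here `suggestions` maps the out-of-lexicon token `asprin` to its scored alternates. A real QA system would presumably draw alternates from the domain corpus and use a richer scoring model; this sketch only shows how the steps fit together.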
20 Claims
1. A computer implemented method for identifying and correcting a misspelling in a question answering (QA) system, wherein the QA system is coupled to a document corpus, and the document corpus includes a plurality of documents related to a particular domain, the method comprising:
receiving, by a processor coupled to one or more user devices, an input question and a plurality of passages, wherein the plurality of passages are extracted from the document corpus by the QA system;
providing, by the processor, at least one alternate form for each token extracted from the input question and the plurality of passages;
identifying, by the processor, at least one misspelled token from the input question and the plurality of passages; and
scoring, by the processor, at least one alternate form of each identified misspelled token.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer program product for identifying and correcting a misspelling in a question answering (QA) system, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to:
receive, by a processor coupled to one or more user devices, an input question and a plurality of passages, wherein the plurality of passages are extracted from the document corpus by the QA system;
provide, by the processor, at least one alternate form for each token extracted from the input question and the plurality of passages;
provide, by the processor, a modified Levenshtein distance value for each alternate form, wherein the modified Levenshtein distance value is between 0 and 1;
identify, by the processor, at least one misspelled token and at least one token having the modified Levenshtein distance value more than a predetermined threshold value from the input question and the plurality of passages; and
score, by the processor, at least one alternate form of each identified misspelled token.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A system for identifying and correcting a misspelling in a question answering (QA) system, comprising:
a processor configured to:
receive, by a processor coupled to one or more user devices, an input question and a plurality of passages, wherein the plurality of passages are extracted from the document corpus by the QA system;
provide, by the processor, at least one alternate form for each token extracted from the input question and the plurality of passages;
provide, by the processor, a modified Levenshtein distance value for each alternate form, wherein the modified Levenshtein distance value is between 0 and 1;
identify, by the processor, at least one misspelled token and at least one token having the modified Levenshtein distance value more than a predetermined threshold value from the input question and the plurality of passages; and
score, by the processor, at least one alternate form of each identified misspelled token.
Dependent claims: 17, 18, 19, 20
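Claims 9 and 16 recite a modified Levenshtein distance value between 0 and 1 that is compared against a predetermined threshold. The claims as listed here do not define the modification, so the Python sketch below makes an explicit assumption: the classic edit distance is divided by the length of the longer string, which bounds the value to the range 0 to 1. The function names and the threshold of 0.1 are hypothetical.

```python
def levenshtein(a, b):
    """Classic edit distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def normalized_distance(a, b):
    """Assumed normalization: edit distance over the longer length, in [0, 1]."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0

def flag_tokens(alternates, threshold=0.1):
    """Return (token, alternate, value) triples whose value exceeds the threshold.

    `alternates` maps each token to its list of alternate forms.
    The threshold is a hypothetical value chosen for illustration.
    """
    flagged = []
    for token, forms in alternates.items():
        for form in forms:
            value = normalized_distance(token, form)
            if value > threshold:
                flagged.append((token, form, value))
    return flagged

flagged = flag_tokens({"asprin": ["aspirin"], "dose": ["dose"]})
```

With these inputs, `asprin` is one edit away from the seven-letter `aspirin`, so its value of 1/7 exceeds the threshold and the pair is flagged, while `dose` matches its alternate exactly and is not. Whether the patented value behaves as a distance or as a similarity is not stated in the claims; this sketch treats it as a distance.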
Specification