Automatic context sensitive language correction and enhancement using an internet corpus

  • US 8,914,278 B2
  • Filed: 07/31/2008
  • Issued: 12/16/2014
  • Est. Priority Date: 08/01/2007
  • Status: Active Grant
First Claim

1. A computer-assisted language correction system comprising:

    a computer storage device, storing computer modules;

    a computer processor operative to execute said modules;

    said computer modules including:

    contextual feature-sequence (CFS) functionality operative to generate a plurality of contextual feature-sequences based on an input sentence, said contextual feature sequence comprising at least one of N-grams, skip-grams, switch-grams, co-occurrences, and combinations thereof;

    an alternatives generator, generating on the basis of said input sentence a text-based representation providing multiple alternatives for each of a plurality of words in the sentence, said multiple alternatives including non-contextual corrections for each of said plurality of words;

    a selector for selecting among at least said multiple alternatives for each of said plurality of words in the sentence, said selector including context based scoring functionality operative to rank said multiple alternatives, based at least partly on contextual feature-sequence frequencies of occurrences in an internet corpus for each of the plurality of contextual feature-sequences, said context based scoring functionality including ranking said multiple alternatives based at least partially on a CFS importance score, wherein the CFS importance score is a function of a combination of:

    a) a number of parsing tree nodes that correspond to a same part of the CFS, and b) a frequency of occurrence of each of the words in the CFS; and

    a correction generator operative to provide a correction output based on selections made by said selector.
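
The pipeline recited in the claim (feature-sequence generation, alternatives generation, frequency-based selection, correction output) can be illustrated with a minimal sketch. This is not the patented implementation. Every name below (`ngrams_around`, `skip_grams_around`, `feature_sequences`, `score`, `correct`) is hypothetical, the small `corpus_counts` dictionary only stands in for frequencies that the claim draws from an internet corpus, and the CFS importance score is reduced to a pluggable function that defaults to a constant, so the parse-tree and word-frequency components named in the claim are not modelled. Switch-grams and co-occurrences are likewise omitted.

```python
from typing import Callable, Dict, List, Tuple

CFS = Tuple[str, ...]  # one contextual feature-sequence


def ngrams_around(words: List[str], i: int, n: int) -> List[CFS]:
    """Return every contiguous n-word window that contains position i."""
    out = []
    for start in range(i - n + 1, i + 1):
        if start >= 0 and start + n <= len(words):
            out.append(tuple(words[start:start + n]))
    return out


def skip_grams_around(words: List[str], i: int) -> List[CFS]:
    """Return word pairs that include position i and skip one word ("_")."""
    out = []
    if i - 2 >= 0:
        out.append((words[i - 2], "_", words[i]))
    if i + 2 < len(words):
        out.append((words[i], "_", words[i + 2]))
    return out


def feature_sequences(words: List[str], i: int) -> List[CFS]:
    """Assumed feature set: 2-grams, 3-grams, and one-word skip-grams."""
    return ngrams_around(words, i, 2) + ngrams_around(words, i, 3) + skip_grams_around(words, i)


def score(words: List[str], i: int, candidate: str,
          corpus_counts: Dict[CFS, int],
          importance: Callable[[CFS], float] = lambda cfs: 1.0) -> float:
    """Sum importance-weighted corpus frequencies with `candidate` at position i."""
    trial = words[:i] + [candidate] + words[i + 1:]
    return sum(importance(c) * corpus_counts.get(c, 0)
               for c in feature_sequences(trial, i))


def correct(sentence: str, alternatives: Dict[str, List[str]],
            corpus_counts: Dict[CFS, int],
            importance: Callable[[CFS], float] = lambda cfs: 1.0) -> str:
    """Pick, for each word, the highest-scoring alternative (ties keep the original)."""
    words = sentence.split()
    chosen = []
    for i, word in enumerate(words):
        candidates = [word] + alternatives.get(word, [])
        chosen.append(max(candidates,
                          key=lambda c: score(words, i, c, corpus_counts, importance)))
    return " ".join(chosen)


# Toy data, invented purely for illustration.
alternatives = {"by": ["buy", "bye"]}
corpus_counts = {
    ("to", "buy"): 900, ("buy", "a"): 800, ("to", "buy", "a"): 500,
    ("want", "to", "buy"): 400, ("buy", "_", "car"): 100,
    ("to", "by"): 50, ("by", "a"): 300,
}
result = correct("I want to by a car", alternatives, corpus_counts)
```

In this sketch each word is scored against the original, uncorrected context, and a candidate wins only if its weighted frequency total is strictly higher than that of the input word. A fuller treatment would supply an `importance` function reflecting the two factors the claim names.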
