Detecting writing systems and languages

  • US 8,326,602 B2
  • Filed: 06/05/2009
  • Issued: 12/04/2012
  • Est. Priority Date: 06/05/2009
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method comprising:

  • receiving text at a computer system having one or more processors;

    detecting, at the computer system, a first segment of the text, where a substantial amount of the first segment represents a first language;

    detecting, at the computer system, a second segment of the text, where a substantial amount of the second segment represents a second language;

    obtaining, at the computer system, a first language likelihood for each n-gram of size x included in the text;

    obtaining, at the computer system, a second language likelihood for each n-gram of size x included in the text;

    identifying, at the computer system, a score for each n-gram of size x included in the text, where each score represents a difference between the first language likelihood and the second language likelihood; and

    detecting, at the computer system, an edge including:

        calculating a first average of the scores for a first group of consecutive n-grams, where consecutive n-grams are defined as including a third n-gram including a first left context and a first right context and a fourth n-gram including a second left context and a second right context, where the second left context is the first right context, where the first group of consecutive n-grams is defined as including a specified number of consecutive n-grams that includes an ending n-gram,

        calculating a second average of the scores for a second group of consecutive n-grams, and the second group of consecutive n-grams is defined as including a same number of consecutive n-grams that includes a starting n-gram, where the ending n-gram is adjacent to the starting n-gram, and

        identifying the edge based on a difference between the first average and the second average.
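
The edge-detection steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`ngrams`, `score_ngrams`, `detect_edge`), the use of overlapping character n-grams, the `window` and `threshold` parameters, and the rule of picking the boundary with the largest gap between the two averages are all assumptions made for illustration; the claim only requires that the edge be identified based on a difference between the first and second averages.

```python
from typing import Callable, List, Optional, Sequence


def ngrams(text: str, x: int) -> List[str]:
    """Return the overlapping character n-grams of size x in text (assumed tokenization)."""
    return [text[i:i + x] for i in range(len(text) - x + 1)]


def score_ngrams(
    grams: Sequence[str],
    first_likelihood: Callable[[str], float],
    second_likelihood: Callable[[str], float],
) -> List[float]:
    """Score each n-gram as first-language likelihood minus second-language likelihood."""
    return [first_likelihood(g) - second_likelihood(g) for g in grams]


def detect_edge(scores: Sequence[float], window: int, threshold: float) -> Optional[int]:
    """Return the index of the first n-gram after the edge, or None.

    For every candidate boundary i, the first group is the `window` n-grams
    ending at i - 1 and the second group is the `window` n-grams starting at i,
    so the ending n-gram is adjacent to the starting n-gram. The boundary with
    the largest gap between the two averages wins, provided the gap exceeds
    `threshold` (a hypothetical decision rule).
    """
    best_index: Optional[int] = None
    best_gap = threshold
    for i in range(window, len(scores) - window + 1):
        first_avg = sum(scores[i - window:i]) / window
        second_avg = sum(scores[i:i + window]) / window
        gap = abs(first_avg - second_avg)
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index


# Toy likelihood functions standing in for real language models (hypothetical).
def toy_first(gram: str) -> float:
    return 1.0 if gram == "a" else 0.0


def toy_second(gram: str) -> float:
    return 1.0 if gram == "b" else 0.0


toy_scores = score_ngrams(ngrams("aaaabbbb", 1), toy_first, toy_second)
edge = detect_edge(toy_scores, window=2, threshold=0.5)
```

In this toy input the scores are positive for the first four n-grams and negative for the last four, so the two window averages differ most at the boundary between them. A real system would replace the toy functions with likelihoods drawn from trained language models.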
