Systems and Methods for Language Detection

  • US 20170024372A1
  • Filed: 10/03/2016
  • Published: 01/26/2017
  • Est. Priority Date: 10/17/2014
  • Status: Active Grant
First Claim

1. A computer-implemented method of identifying a language in a message, the method comprising:

    obtaining a text message;

    removing non-language characters from the text message to generate a sanitized text message;

    detecting at least one of an alphabet and a script present in the sanitized text message, wherein detecting comprises at least one of:

    (i) performing an alphabet-based language detection test to determine a first set of scores, wherein each score in the first set of scores represents a likelihood that the sanitized text message comprises the alphabet for one of a plurality of different languages; and

    (ii) performing a script-based language detection test to determine a second set of scores, wherein each score in the second set of scores represents a likelihood that the sanitized text message comprises the script for one of the plurality of different languages; and

    identifying the language in the sanitized text message based on at least one of the first set of scores, the second set of scores, and a combination of the first and second sets of scores.
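The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The alphabet and script tables, the function names, the "fraction of matching letters" scoring, and the simple averaging of the two score sets are all assumptions made for illustration. "Non-language characters" is assumed to mean anything that is not a letter.

```python
import unicodedata

# Hypothetical, deliberately tiny tables; a real system would cover far more languages.
ALPHABETS = {
    "en": set("abcdefghijklmnopqrstuvwxyz"),
    "es": set("abcdefghijklmnopqrstuvwxyzáéíóúüñ"),
    "ru": set("абвгдеёжзийклмнопрстуфхцчшщъыьэюя"),
    "el": set("αβγδεζηθικλμνξοπρσςτυφχψω"),
}
SCRIPTS = {"en": "LATIN", "es": "LATIN", "ru": "CYRILLIC", "el": "GREEK"}


def sanitize(message):
    """Keep only letters, lowercased, separated by single spaces (assumed rule)."""
    kept = "".join(ch if ch.isalpha() else " " for ch in message.lower())
    return " ".join(kept.split())


def _letters(text):
    return [ch for ch in text if ch.isalpha()]


def alphabet_scores(text):
    """First set of scores: share of letters found in each language's alphabet."""
    chars = _letters(text)
    if not chars:
        return {lang: 0.0 for lang in ALPHABETS}
    return {
        lang: sum(ch in alphabet for ch in chars) / len(chars)
        for lang, alphabet in ALPHABETS.items()
    }


def script_of(ch):
    """Approximate the script by the first word of the Unicode character name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def script_scores(text):
    """Second set of scores: share of letters written in each language's script."""
    chars = _letters(text)
    if not chars:
        return {lang: 0.0 for lang in SCRIPTS}
    return {
        lang: sum(script_of(ch) == script for ch in chars) / len(chars)
        for lang, script in SCRIPTS.items()
    }


def identify_language(message):
    """Sanitize, run both tests, and pick the best combined score (assumed: mean)."""
    text = sanitize(message)
    first = alphabet_scores(text)
    second = script_scores(text)
    combined = {lang: (first[lang] + second[lang]) / 2 for lang in ALPHABETS}
    return max(combined, key=combined.get), combined


if __name__ == "__main__":
    print(identify_language("Привет, мир!!! 123 :)")[0])
    print(identify_language("¿Dónde está el baño? 😀")[0])
```

In this sketch the script test alone cannot separate languages that share a script (English and Spanish both score 1.0 on Latin text), so the alphabet test is what breaks the tie. Text that uses only the letters common to both alphabets would still tie, which is one reason a practical system would likely combine these scores with further tests.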

  • 6 Assignments