Method and system for language identification

  • US 7,818,165 B2
  • Filed: 07/26/2005
  • Issued: 10/19/2010
  • Est. Priority Date: 04/07/2005
  • Status: Expired due to Fees
First Claim

1. A system for language identification, comprising:

    at least one processor;

    at least one computer readable storage medium;

    a feature set of a plurality of character strings of varying length with associated information;

    the associated information including one or more significance scores for one of the character strings for one or more of a plurality of languages, wherein the significance scores include a basic significance score and an additional significance score, wherein the additional significance score is for application in response to detection of a characteristic in a syllable other than the character string within a word containing the character string, and wherein the characteristic comprises the syllable containing a letter matching a letter contained in a predetermined set of one or more letters; and

    program code executable on the at least one processor and stored on the at least one computer readable storage medium, for detecting the character string from the feature set within a token from an input text and for detecting the characteristic in a syllable other than the character string within a word containing the character string within the input text responsive to detecting the character string within the input text.
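
The scoring arrangement recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`Significance`, `FEATURES`, `syllable_spans`, `score_token`), the score values, the trigger letter, and the naive vowel-based syllable splitter are assumptions introduced for illustration only; the claim does not specify how syllables are found or how scores are combined.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Significance:
    """Hypothetical record for one character string in one language."""
    basic: float                              # basic significance score
    additional: float = 0.0                   # applied only if the characteristic is detected
    trigger_letters: frozenset = frozenset()  # the "predetermined set of one or more letters"


# Hypothetical feature set: character strings of varying length mapped to
# per-language scores. The values are invented for this example.
FEATURES = {
    "sch": {
        "de": Significance(basic=2.0, additional=1.0, trigger_letters=frozenset("z")),
        "nl": Significance(basic=1.0),
    },
}

# Assumption: a crude splitter (optional consonants, then vowels, plus any
# word-final consonants). Real syllabification is language dependent.
_SYLLABLE = re.compile(r"[^aeiouy]*[aeiouy]+(?:[^aeiouy]+$)?")


def syllable_spans(word):
    """Return (start, end) index pairs for each naive syllable in word."""
    return [m.span() for m in _SYLLABLE.finditer(word)]


def score_token(token, features=FEATURES):
    """Sum per-language scores for feature strings found in one token."""
    word = token.lower()
    spans = syllable_spans(word)
    totals = {}
    for string, per_language in features.items():
        start = word.find(string)  # only the first occurrence is considered
        if start < 0:
            continue
        end = start + len(string)
        # Syllables "other than the character string": no overlap with the match.
        others = [word[s:e] for s, e in spans if e <= start or s >= end]
        for language, sig in per_language.items():
            score = sig.basic
            if any(ch in sig.trigger_letters for syl in others for ch in syl):
                score += sig.additional
            totals[language] = totals.get(language, 0.0) + score
    return totals


print(score_token("schmerzen"))  # "z" occurs in a syllable outside "sch"
print(score_token("Schule"))     # no trigger letter in the other syllable
```

In this sketch the additional score is checked only after the character string has been found in the token, mirroring the "responsive to detecting the character string" wording of the claim. Summing the basic and additional scores is one possible way to combine them, chosen here for simplicity.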
