Natural language determination using partial words

  • US 6,216,102 B1
  • Filed: 09/30/1996
  • Issued: 04/10/2001
  • Est. Priority Date: 08/19/1996
  • Status: Expired due to Fees
First Claim

1. A method for identifying the language in which a document is written, comprising the steps of:

    reading a plurality of words from a document into a computer memory;

    truncating words within the plurality of words which exceed a predetermined length to produce a set of short and truncated words;

    comparing the set of short and truncated words to words in a plurality of word tables, each word table associated with and containing a selection of most frequently used words in a respective candidate language, wherein the most frequently used words which exceed the predetermined length are truncated in the word tables;

    accumulating a respective count for each candidate language each time one of the set of short and truncated words from the document matches a word in a word table associated with the candidate language; and

    identifying the language of the document as the language associated with the count having the highest value.
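
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the predetermined length (`MAX_LEN = 4`), the tiny word tables, the regular-expression tokenizer, and the names `truncate`, `build_table`, and `identify_language` are all assumptions chosen for illustration. A real system would presumably use much larger frequency-ranked tables for each candidate language.

```python
import re

MAX_LEN = 4  # assumed "predetermined length"; the claim does not fix a value


def truncate(word, max_len=MAX_LEN):
    """Cut a word down to max_len characters; shorter words pass through."""
    return word[:max_len]


def build_table(frequent_words, max_len=MAX_LEN):
    """Build a word table whose long entries are truncated like document words."""
    return {truncate(w.lower(), max_len) for w in frequent_words}


# Toy tables (illustrative only, not taken from the patent).
WORD_TABLES = {
    "english": build_table(["the", "and", "of", "to", "in", "that", "which", "with"]),
    "german": build_table(["der", "die", "und", "das", "nicht", "mit", "sich", "werden"]),
    "french": build_table(["le", "la", "les", "et", "des", "que", "dans", "pour"]),
}


def identify_language(text, tables=WORD_TABLES, max_len=MAX_LEN):
    """Return (best_language, counts) for the given text."""
    # Step 1: read the words of the document into memory.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = {lang: 0 for lang in tables}
    for word in words:
        # Step 2: truncate words that exceed the predetermined length.
        key = truncate(word, max_len)
        # Steps 3 and 4: compare against every table and accumulate a count
        # for each candidate language whose table contains the word.
        for lang, table in tables.items():
            if key in table:
                counts[lang] += 1
    # Step 5: pick the language with the highest count
    # (ties resolve to the first language in table order).
    best = max(counts, key=counts.get)
    return best, counts


if __name__ == "__main__":
    print(identify_language("Der Hund und die Katze werden nicht mit uns gehen"))
```

In this sketch the German words "werden" and "nicht" match their truncated table entries "werd" and "nich", so the sample sentence scores 6 for German and 0 for the other two toy tables. Truncating both the document words and the table entries keeps the comparison consistent while shortening the stored entries.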
