Data detection

  • US 8,489,388 B2
  • Filed: 11/10/2008
  • Issued: 07/16/2013
  • Est. Priority Date: 11/10/2008
  • Status: Active Grant
First Claim

1. A machine-implemented method of detecting data of a plurality of types in a sequence of characters, the method comprising:

  • combining the use of a pattern detection method and a statistical learning method to detect the data, the statistical learning method converting the sequence of characters into a first sequence of tokens, each token comprising a lexeme and a token type relating to the function of the lexeme within the sequence of characters and having at least a predetermined probability that the corresponding data is of at least one of said types, the pattern detection method converting the sequence of characters into a second sequence of tokens, each token corresponding to data that matches a predetermined pattern indicative of the at least one of said types, the pattern detection method further parsing a combination of the first and second sequence of tokens;

    comparing the first and second sequence of tokens, wherein when corresponding tokens from the first and second sequence of tokens for a portion of the sequence of characters are the same, parsing only one of the corresponding tokens, when the tokens are not name tokens and the corresponding tokens are different, parsing both corresponding tokens, and when the tokens are name tokens and the corresponding tokens are different, parsing the corresponding token only from the statistical learning method; and

    outputting the data corresponding to the combination of tokens as the data that matches the predetermined pattern.
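
The comparison step in the claim can be illustrated with a minimal Python sketch. This is not the patented implementation: the `Token` class, the `NAME_TYPES` set, the `combine` function, and the use of a character offset to decide which tokens "correspond" are all hypothetical choices made for illustration. The sketch also assumes that a pair counts as "name tokens" when either token is typed as a name, and it keeps a token that only one detector produced, a case the claim does not address.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    lexeme: str      # the matched text
    token_type: str  # e.g. "NAME", "PHONE", "DATE" (illustrative labels)
    start: int       # offset of the lexeme in the character sequence


# Assumption: which token types are treated as "name tokens".
NAME_TYPES = {"NAME"}


def combine(statistical: List[Token], pattern: List[Token]) -> List[Token]:
    """Merge two token sequences following the three rules in the claim.

    Tokens are treated as "corresponding" when they start at the same
    offset (an assumption; the claim only says they cover the same portion).
    """
    stat_by_start = {t.start: t for t in statistical}
    pat_by_start = {t.start: t for t in pattern}
    merged: List[Token] = []

    for start in sorted(stat_by_start.keys() | pat_by_start.keys()):
        s = stat_by_start.get(start)
        p = pat_by_start.get(start)
        if s is None or p is None:
            # Only one detector produced a token here; keep it (assumption).
            merged.append(s if s is not None else p)
        elif s == p:
            # Same token from both methods: parse only one of them.
            merged.append(s)
        elif s.token_type in NAME_TYPES or p.token_type in NAME_TYPES:
            # Differing name tokens: keep only the statistical one.
            merged.append(s)
        else:
            # Differing non-name tokens: parse both.
            merged.extend([s, p])
    return merged


statistical = [
    Token("John Smith", "NAME", 0),
    Token("555-1234", "PHONE", 16),
    Token("Monday", "DATE", 30),
]
pattern = [
    Token("John", "NAME", 0),
    Token("555-1234", "PHONE", 16),
    Token("Monday", "WEEKDAY", 30),
]

for token in combine(statistical, pattern):
    print(token.token_type, token.lexeme)
```

In this example the two detectors disagree on the name, agree on the phone number, and disagree on the type of "Monday", so the merged sequence holds the statistical name token, a single phone token, and both tokens for "Monday". A parser applied to the merged sequence would then select the final detected data, as in the last step of the claim.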
