High-accuracy confidential data detection

  • US 7,885,944 B1
  • Filed: 03/28/2008
  • Issued: 02/08/2011
  • Est. Priority Date: 03/28/2008
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

    storing, in memory, a plurality of classified data patterns for personal identifiers, the plurality of classified data patterns corresponding to variations of personal identifier formats, the personal identifiers including confidential information of a plurality of entities;

    identifying a text document to be searched;

    searching the text document for data expressed in a format that matches any of the plurality of classified data patterns corresponding to the variations of personal identifier formats;

    finding, in the text document, one or more sets of data having the format that matches any of the plurality of classified data patterns, each found set of data representing a possible candidate of a personal identifier; and

    validating each of the found sets of data from the text document using one or more personal identifier validators to determine whether the found set of data is a personal identifier or a false positive, wherein validating each of the found sets of data from the text document comprises:

    eliminating false positives based on data immediately preceding or following each of the personal identifier candidates, wherein the data immediately preceding or following a personal identifier candidate indicates a false positive if the personal identifier candidate is immediately preceded, or immediately followed by, a number or a letter.
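
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the pattern set (three Social Security number style formats), the names `CLASSIFIED_PATTERNS`, `Candidate`, `find_candidates`, `is_false_positive`, and `find_identifiers`, and the use of regular expressions are all assumptions made for illustration. The sketch stores format variations, searches a text document for matches, and rejects any candidate that is immediately preceded or followed by a letter or a digit.

```python
import re
from dataclasses import dataclass

# Hypothetical "classified data patterns": format variations of one
# kind of personal identifier (an SSN-like number). Illustrative only.
CLASSIFIED_PATTERNS = {
    "ssn_dashed": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "ssn_spaced": re.compile(r"\d{3} \d{2} \d{4}"),
    "ssn_plain": re.compile(r"\d{9}"),
}


@dataclass(frozen=True)
class Candidate:
    pattern_name: str
    text: str
    start: int  # index of the first character of the match
    end: int    # index one past the last character of the match


def find_candidates(document: str) -> list[Candidate]:
    """Find every span whose format matches any stored pattern."""
    candidates = []
    for name, pattern in CLASSIFIED_PATTERNS.items():
        for m in pattern.finditer(document):
            candidates.append(Candidate(name, m.group(), m.start(), m.end()))
    return candidates


def is_false_positive(document: str, c: Candidate) -> bool:
    """Adjacency check: a letter or digit directly before or after the
    candidate marks it as a false positive."""
    before = document[c.start - 1] if c.start > 0 else ""
    after = document[c.end] if c.end < len(document) else ""
    # str.isalnum() is False for the empty string, so candidates at the
    # very start or end of the document pass that side of the check.
    return before.isalnum() or after.isalnum()


def find_identifiers(document: str) -> list[str]:
    """Search the document, then keep only the validated candidates."""
    return [
        c.text
        for c in find_candidates(document)
        if not is_false_positive(document, c)
    ]


if __name__ == "__main__":
    doc = "SSN: 123-45-6789. Order 9123-45-67890 and part A123456789."
    print(find_identifiers(doc))
```

In the sample document, only the first number survives: the second is embedded in a longer digit run and the third is directly preceded by a letter, so both are discarded by the adjacency check. A fuller validator chain (for example, checksum or range checks) could be added alongside `is_false_positive`, but the claim text shown here only specifies the adjacency rule.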
