Protecting confidential information

  • US 10,032,046 B1
  • Filed: 11/30/2017
  • Issued: 07/24/2018
  • Est. Priority Date: 06/28/2017
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving, by one or more computer processors, from a first computer, text generated by a user wherein the text generated by the user is one of:

    text input on an external web service generated from a plug-in to a client program that is a browser on the first computer, an email from a plug-in to the client program that is one of an email program or an email application on the first computer, and a message from a plug-in to the client program that is one of a messaging program or a messaging application on the first computer;

    identifying, by one or more computer processors, in the text generated by the user, one or more confidential information registered in a dictionary, wherein the dictionary contains a plurality of registered confidential information and a plurality of substitute words corresponding to the plurality of registered confidential information;

    retrieving, by one or more computer processors, from the dictionary, one or more substitute words corresponding to each identified registered confidential information of the one or more confidential information registered in the dictionary;

    identifying, by one or more computer processors, in the text generated by the user, whether one or more words are potentially confidential based, at least in part, on a text analysis of the text generated by the user;

    generating, by one or more computer processors, one or more words for each of the one or more potentially confidential words, wherein the one or more generated words are determined based, at least in part, on determining an edit distance is less than a threshold edit distance;

    determining, by one or more computer processors, for each of the one or more potentially confidential words with the edit distance less than the threshold edit distance, the registered confidential information associated with a shortest edit distance;

    retrieving, by one or more computer processors, from the dictionary, the one or more substitute words corresponding to the registered confidential information with the shortest edit distance;

    determining, by one or more computer processors, a category of the one or more substitute words corresponding to the registered confidential information associated with the shortest edit distance;

    retrieving, by one or more computer processors, a list of unused words in the category of the one or more words corresponding to the registered confidential information associated with the shortest edit distance;

    selecting, by one or more computer processors, one or more words from the list of unused words in the category of the one or more retrieved substitute words corresponding to the registered confidential information with the shortest edit distance based, at least in part, on the text analysis identifying a highest topic index of the one or more words from the list of unused words in the category of the one or more retrieved substitute words corresponding to the registered confidential information associated with the shortest edit distance;

    sending, by one or more computer processors, to the first computer, a proposed protected text, wherein the proposed protected text includes the text generated by the user with each of the identified registered confidential information included with each of the one or more retrieved substitute words to replace the identified confidential information and each of the one or more potentially confidential words included with each of the one or more generated words to replace the one or more potentially confidential words;

    receiving, by one or more computer processors, from the first computer, at least one of:

    one or more edits to the proposed protected text input by the user and an indication of an approval by the user of the proposed protected text;

    responsive to receiving, from the first computer, the one or more edits to the proposed protected text input by the user, performing, by one or more computer processors, the one or more edits to the proposed protected text input by the user;

    generating, by the computer, one or more substitute words for each of the one or more edits to the proposed protected text input by the user, wherein the one or more generated substitute words are determined based, at least in part, on determining an edit distance is less than a threshold edit distance;

    responsive to receiving, from the first computer, the indication of the approval by the user of the proposed protected text, creating, by one or more computer processors, a user approved protected text, wherein the user approved protected text includes replacing each of the identified registered confidential information in the proposed protected text with the one or more retrieved substitute words corresponding to the identified registered confidential information of the one or more confidential information registered in the dictionary, replacing each of the one or more potentially confidential words in the proposed protected text with the one or more generated words to replace each of the one or more potentially confidential words, and replacing the additional registered confidential information indicated by the one or more edits to the proposed protected text input by the user with the one or more generated substitute words for each of the additional registered confidential information indicated by the one or more edits to the proposed protected text input by the user;

    sending, by one or more computer processors, the user approved protected text to the first computer;

    identifying, by one or more computer processors, each of the one or more potentially confidential words replaced in the user approved protected text as registered confidential information with the one or more generated words replacing each of the one or more potentially confidential words in the user approved protected text and each of the additional registered confidential information indicated by the one or more edits to the proposed protected text input by the user with the one or more generated substitute words for each of the additional registered confidential information indicated by the one or more edits to the proposed protected text input by the user; and

    updating, by one or more computer processors, the dictionary to include each of the one or more potentially confidential words replaced in the user approved protected text as registered confidential information with the one or more generated words replacing each of the one or more potentially confidential words in the user approved protected text and each of the additional registered confidential information in the proposed protected text identified by the one or more edits as registered confidential information with the one or more generated substitute words replacing each of the additional registered confidential information in the user approved proposed protected text.
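
Illustrative sketch (not part of the claim): the Python below is one possible reading of the dictionary lookup, edit-distance matching, and dictionary update steps recited above, under simplifying assumptions. The names `edit_distance`, `nearest_registered`, and `propose_protected_text`, the sample dictionary, and the threshold value are hypothetical. The sketch uses Levenshtein distance as the edit distance, treats any unregistered token whose distance to a registered term is below the threshold as "potentially confidential", and reuses the nearest term's substitute instead of selecting an unused word from the same category by topic index.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def nearest_registered(word, dictionary, threshold):
    """Return the registered term with the shortest edit distance to word,
    provided that distance is less than threshold; otherwise None."""
    best, best_d = None, threshold
    for term in dictionary:
        d = edit_distance(word.lower(), term.lower())
        if d < best_d:
            best, best_d = term, d
    return best


def propose_protected_text(text, dictionary, threshold=2):
    """Return (proposed_text, new_entries).

    Registered terms are replaced by their substitutes. Unregistered tokens
    close to a registered term are also replaced and recorded in new_entries
    so the dictionary can be updated after user approval.
    """
    new_entries = {}

    def replace(match):
        word = match.group(0)
        if word in dictionary:               # registered confidential information
            return dictionary[word]
        term = nearest_registered(word, dictionary, threshold)
        if term is None:                     # not potentially confidential
            return word
        new_entries[word] = dictionary[term]  # assumption: reuse the substitute
        return new_entries[word]

    return re.sub(r"[A-Za-z0-9]+", replace, text), new_entries


# Hypothetical dictionary: registered confidential term -> substitute word.
dictionary = {"Orion": "Alpha", "Acme": "Vendor"}
proposed, new_entries = propose_protected_text(
    "Ship Orion to Acme; the Orlon docs too.", dictionary)
print(proposed)      # Ship Alpha to Vendor; the Alpha docs too.
print(new_entries)   # {'Orlon': 'Alpha'}

# After the user approves the proposed text, register the new words.
dictionary.update(new_entries)
```

In this sketch the misspelling "Orlon" is one edit away from the registered term "Orion", so it is replaced and then registered once the text is approved, which mirrors the final updating step only loosely.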
