Computer method for automatic extraction of commonly specified information from business correspondence

  • US 4,965,763 A
  • Filed: 02/06/1989
  • Issued: 10/23/1990
  • Est. Priority Date: 03/03/1987
  • Status: Expired due to Fees
First Claim

1. A computer method for the automatic extraction of commonly specified information from a business correspondence document, such as date of letter, name of recipient, name of sender, address of sender, title of sender, carbon copy list, subject statement, and the like, comprising the steps of:

  • a first scanning step of scanning the input data stream to locate postscripts, attachments or appendices at a first location by matching each word from the input data stream against a list of expressions used to indicate postscripts, attachments or appendices, said first location being set equal to the final occurring line in said data stream if said first scanning step does not locate any postscripts, attachments or appendices therein, said first location alternately being set equal to a location of postscripts, attachments or appendices found in said first scanning step;

    a second scanning step of scanning the input data stream to locate the final sentence in said document, starting from said first location and scanning toward the beginning of said data stream, searching for words which are verbs in the final sentence in said document, by identifying the last occurrence of a verb in the input data stream, which will occur in the final sentence of said document;

    a first identifying step of identifying an ending portion of the document expected to contain a sender's name, return address, title or carbon copy list information, at a location in the input data stream occurring after the end of said final sentence located in said second scanning step, and occurring before said first location located in said first scanning step;

    a third scanning step of scanning said input data stream to locate any salutation by matching each word from the input data stream against a list of natural language expressions that can be used as a salutation;

    a second identifying step of identifying a beginning portion of the document at a location which includes a portion from the start of the input data stream to the end of a salutation, if a salutation was located in said third scanning step;

    a fourth scanning step of scanning the input data stream if no salutation was found in said third scanning step, said fourth scanning step to locate date, addressee, sender, return address, personal title or subject information in the input data stream by matching each word of the input data stream against a list of expressions that are used to indicate the date, addressee, the sender, the return address, personal title and the subject of the correspondence document;

    a third identifying step of identifying, if no salutation was found in said third scanning step, a beginning portion of the document at a location which includes the date, addressee, sender, return address, personal title or subject information of the correspondence document located in said fourth scanning step;

    isolating and storing from said beginning portion of said document, any addressee, sender, return address, personal title or subject information therein;

    isolating and storing from said ending portion of said document, any sender, return address, title or carbon copy list information therein.
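
The scanning and identifying steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The word lists (`POSTSCRIPT_MARKERS`, `SALUTATIONS`, `HEADER_MARKERS`, `VERBS`) and all function names are hypothetical, and several simplifications are assumed: the document is processed line by line, a verb is recognized by lookup in a tiny fixed list, the line holding the last verb is treated as the end of the final sentence, a header expression only counts when followed by a colon, and the first location is one past the final line when no postscript marker is found. The final isolating and storing steps are not shown; the sketch only returns the beginning and ending portions.

```python
# Illustrative vocabularies only; the claim does not enumerate these lists.
POSTSCRIPT_MARKERS = {"ps", "postscript", "attachment", "attachments",
                      "enclosure", "enclosures", "appendix", "appendices"}
SALUTATIONS = {"dear", "gentlemen", "greetings"}
HEADER_MARKERS = {"date", "to", "from", "subject", "re"}
VERBS = {"is", "are", "was", "were", "will", "have", "has",
         "thank", "ship", "send", "call"}


def normalize(word):
    """Lowercase a word and drop surrounding punctuation and inner periods."""
    return word.strip(".,:;!?()\"'").replace(".", "").lower()


def find_first_location(lines):
    """First scanning step: index of the first postscript/attachment line,
    or one past the final line if none is found (assumed convention)."""
    for i, line in enumerate(lines):
        if any(normalize(w) in POSTSCRIPT_MARKERS for w in line.split()):
            return i
    return len(lines)


def find_final_sentence_line(lines, first_location):
    """Second scanning step: scan backward from the first location for the
    last line that contains a known verb. Returns None if no verb is found."""
    for i in range(first_location - 1, -1, -1):
        if any(normalize(w) in VERBS for w in lines[i].split()):
            return i
    return None


def find_salutation_line(lines):
    """Third scanning step: index of the first line containing a salutation."""
    for i, line in enumerate(lines):
        if any(normalize(w) in SALUTATIONS for w in line.split()):
            return i
    return None


def find_last_header_line(lines):
    """Fourth scanning step: index of the last line with a header expression
    such as 'Date:' or 'Subject:' (the colon requirement is an assumption)."""
    last = None
    for i, line in enumerate(lines):
        if any(w.endswith(":") and normalize(w) in HEADER_MARKERS
               for w in line.split()):
            last = i
    return last


def segment(text):
    """Return the beginning and ending portions of a correspondence document."""
    lines = [ln for ln in text.splitlines() if ln.strip()]

    first_location = find_first_location(lines)
    final_sentence = find_final_sentence_line(lines, first_location)
    # Ending portion: after the final sentence, before the first location.
    if final_sentence is None:
        ending = []
    else:
        ending = lines[final_sentence + 1:first_location]

    salutation = find_salutation_line(lines)
    if salutation is not None:
        # Beginning portion: start of the stream through the salutation.
        beginning = lines[:salutation + 1]
    else:
        last_header = find_last_header_line(lines)
        beginning = [] if last_header is None else lines[:last_header + 1]

    return {"beginning": beginning, "ending": ending}
```

For a letter that contains a salutation, `segment` returns every line up to and including the salutation as the beginning portion. For a memo without one, it falls back to the lines carrying header expressions, mirroring the conditional fourth scanning step.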
