System and method for data extraction from digital images

  • US 6,400,845 B1
  • Filed: 04/23/1999
  • Issued: 06/04/2002
  • Est. Priority Date: 04/23/1999
  • Status: Expired due to Fees
First Claim

1. A system for the extraction of textual data from a digital image using predefined patterns based on visible and invisible characters contained in the textual data, comprising:

  • database means for storing data base records comprising;

    a master document image database comprised of at least one table containing at least one master document image;

    a template database comprised of at least one table comprising at least one template associated with the master document image, the template having at least one zone, the zone associated with a unique pattern comprised of one or more nonoverlapping segments, each segment containing one or more characters, with selected ones of the segments being associated with a data field in an extracted data base record;

    an extracted data database comprised of at least one table of extracted data base records, each record comprised of at least one data field for storing textual information extracted from the digital image;

    an image comparator in communication with the database means and receiving therefrom the master document image, the image comparator having an input for receiving the digital image, the image comparator comparing the master document image to the digital image and providing an output indicative of the success of the comparison;

    a template mapper in communication with the database means and the output of the image comparator and having an input for receiving the digital image and, on receiving the image comparator output indicating a successful comparison, retrieving the template from the template database associated with the successfully compared master document image and applying the template to the digital image, the template mapper providing as an output an image of each zone associated with the applied template;

    a zone optical character reader (OCR) in communication with the template mapper and receiving the output thereof, the zone OCR creating a zone data file of the characters in each zone image and providing the zone data file as an output;

    a zone pattern comparator in communication with the database means and the output of the zone OCR, the zone pattern comparator retrieving from the template database the pattern associated with the zone and comparing the pattern to the zone data file, and, in the event that the pattern is found, extracting the data matching the pattern into an extracted data file, the zone pattern comparator providing the extracted data file as an output; and

    an extracted data parser in communication with the database means and the output of the zone pattern comparator, the parser parsing the data in the extracted data file for populating the data field of the database record associated with the pattern, the parser providing as an output the populated database record to the extracted data database for storage therein.
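
The last two elements of the claim, the zone pattern comparator and the extracted data parser, can be illustrated with a minimal sketch. It assumes the zone OCR has already produced the zone's text as a string, and it uses Python regular expressions as a stand-in for the claimed pattern matching; the claim does not name any particular matching technique. The names `Segment`, `compile_pattern`, `extract_record`, and `INVOICE_PATTERN`, as well as the sample invoice text, are hypothetical and chosen only for illustration. Each segment matches one or more visible characters (such as a label or digits) or invisible characters (such as a tab or line break), and only selected segments are tied to a data field.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    """One nonoverlapping piece of a zone pattern (hypothetical type)."""
    regex: str                   # visible or invisible characters to match
    field: Optional[str] = None  # data field to populate, if any


def compile_pattern(segments):
    """Join the segments, in order, into a single regular expression."""
    parts = []
    for seg in segments:
        if seg.field:
            # Segment tied to a data field: capture it under that name.
            parts.append(f"(?P<{seg.field}>{seg.regex})")
        else:
            # Segment used only to locate the data: match without capturing.
            parts.append(f"(?:{seg.regex})")
    return re.compile("".join(parts))


def extract_record(zone_text, segments):
    """Return a dict of field values if the pattern is found, else None."""
    match = compile_pattern(segments).search(zone_text)
    if match is None:
        return None
    return match.groupdict()


# Assumed example pattern: label, tab, digits (the field), line break.
INVOICE_PATTERN = [
    Segment(r"Invoice No:"),
    Segment(r"\t"),
    Segment(r"\d+", field="invoice_number"),
    Segment(r"\n"),
]

# Assumed OCR output for one zone of a scanned invoice.
record = extract_record("Acme Corp\nInvoice No:\t12345\nTotal", INVOICE_PATTERN)
print(record)  # {'invoice_number': '12345'}
```

In this sketch the returned dictionary plays the role of the populated database record; storing it in an extracted data database, and the earlier image comparison and template mapping steps, are left out.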
