AUTOMATED DOCUMENT RECOGNITION, IDENTIFICATION, AND DATA EXTRACTION

  • US 20150078671A1
  • Filed: 08/25/2014
  • Published: 03/19/2015
  • Est. Priority Date: 09/19/2013
  • Status: Active Grant
First Claim

1. A processor-implemented method for automated document recognition, identification and data extraction, the method comprising:

    receiving a video stream associated with the document, the document being associated with a user;

    detecting an image of the document in the video stream, the detecting including recognizing a shape corresponding to the document overall;

    improving the detected image of the document in the video stream by adjusting colors, adjusting brightness, and removing blurring;

    extracting the detected image of the document from the video stream, the image being a still image;

    analyzing the extracted image using optical character recognition to produce image data, the image data including text zones, each of the text zones being associated with one or more distances to other text zones and one or more borders of the document, the one or more distances being determined using coordinates;

    comparing the extracted image to one or more document templates using the image data;

    determining a document template having a highest degree of coincidence with the extracted image using the comparison;

    matching the text zones of the extracted image with text zones of the document template to determine a type of data in each text zone; and

    structuring the data into a standard format to obtain structured data.
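
The template-comparison, zone-matching, and structuring steps recited above can be illustrated with a minimal sketch. It is not the implementation disclosed in the patent. It assumes OCR has already produced text zones with centre coordinates normalised to the document's width and height. The names `Zone`, `features`, `match_zones`, `coincidence`, and `recognise`, the `tolerance` value, the greedy pairing strategy, and the use of a plain dictionary as the "standard format" are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Zone:
    """A text zone; x and y are centre coordinates normalised to [0, 1] (assumption)."""
    x: float
    y: float
    text: str = ""    # OCR output (left empty for template zones)
    label: str = ""   # type of data (left empty for extracted zones)


def features(zone, zones):
    """Distances to the four document borders and to the nearest other zone."""
    borders = (zone.x, 1.0 - zone.x, zone.y, 1.0 - zone.y)
    others = [hypot(zone.x - o.x, zone.y - o.y) for o in zones if o is not zone]
    return borders + (min(others, default=0.0),)


def match_zones(extracted, template, tolerance=0.15):
    """Greedily pair each template zone with the closest unused extracted zone."""
    unused = list(extracted)
    pairs = []
    for t in template:
        if not unused:
            break
        ft = features(t, template)

        def cost(e):
            # Sum of absolute differences between the two distance profiles.
            return sum(abs(a - b) for a, b in zip(features(e, extracted), ft))

        best = min(unused, key=cost)
        if cost(best) <= tolerance:
            pairs.append((t, best))
            unused.remove(best)
    return pairs


def coincidence(extracted, template):
    """Fraction of zones that could be paired; 1.0 means every zone matched."""
    matched = len(match_zones(extracted, template))
    return matched / max(len(extracted), len(template), 1)


def recognise(extracted, templates):
    """Choose the template with the highest coincidence, then structure the data."""
    name = max(templates, key=lambda n: coincidence(extracted, templates[n]))
    fields = {t.label: e.text for t, e in match_zones(extracted, templates[name])}
    return {"document_type": name, "fields": fields}


# Hypothetical templates and a hypothetical OCR result for one extracted image.
templates = {
    "id_card": [
        Zone(0.3, 0.2, label="surname"),
        Zone(0.3, 0.4, label="given_name"),
        Zone(0.7, 0.8, label="number"),
    ],
    "receipt": [
        Zone(0.5, 0.1, label="merchant"),
        Zone(0.8, 0.9, label="total"),
    ],
}
scan = [
    Zone(0.31, 0.21, text="DOE"),
    Zone(0.29, 0.41, text="JANE"),
    Zone(0.70, 0.79, text="X1234567"),
]
result = recognise(scan, templates)
print(result)
```

In this sketch each zone is described only by its distances to the borders and to its nearest neighbour, so zones are compared by layout alone, never by their text content. A production system would likely need a more robust assignment than greedy pairing and a way to handle skewed or cropped images; those concerns are outside what this example shows.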
