Template-free extraction of data from documents

  • US 10,019,535 B1
  • Filed: 08/06/2013
  • Issued: 07/10/2018
  • Est. Priority Date: 08/06/2013
  • Status: Active Grant
First Claim

1. A computer-implemented method for processing data, comprising:

    obtaining text from a document associated with a user, wherein the document was generated based on a template and includes template text;

    without removing any of the obtained text, applying a set of rules to each term in the obtained text to determine a context associated with the term, wherein the determined context includes a category and at least one of the rules specifies a regular expression for a character sequence matching the determined context;

    applying an additional set of rules to refine a broad category of a plurality of terms in the obtained text to a refined category of fewer terms based on a location in the document of at least one term in the broad category of the plurality of terms;

    extracting one or more terms from the obtained text without removing any of the template text from the obtained text and without extracting the one or more terms using code developed to process only documents generated based on the template;

    storing each extracted term in one of a plurality of data elements according to the determined context; and

    enabling use of the plurality of data elements with one or more applications without requiring manual input of the extracted terms into the one or more applications.
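
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The rule set, the category names (`date`, `amount`, `total_amount`, `other`), the sample invoice, and the helper names `categorize`, `refine`, and `extract` are all assumptions made for illustration. The sketch assigns a category to every whitespace-separated term using regular-expression rules, refines the broad `amount` category based on where a term sits in the document, and stores the extracted terms by category, without deleting any text and without template-specific code.

```python
import re
from dataclasses import dataclass

# Hypothetical first rule set: a category paired with a regular expression.
RULES = [
    ("date", re.compile(r"\d{2}/\d{2}/\d{4}")),
    ("amount", re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")),
]


@dataclass
class Term:
    text: str
    category: str
    line: int  # location of the term in the document (line index)


def categorize(text):
    """Apply the rules to every term; unmatched terms get the category 'other'."""
    terms = []
    for line_no, line in enumerate(text.splitlines()):
        for token in line.split():
            category = "other"
            for name, pattern in RULES:
                if pattern.fullmatch(token):
                    category = name
                    break
            terms.append(Term(token, category, line_no))
    return terms


def refine(terms, text):
    """Hypothetical second rule set: an 'amount' located on a line that
    mentions 'total' is narrowed to the category 'total_amount'."""
    lines = text.splitlines()
    refined = []
    for term in terms:
        if term.category == "amount" and "total" in lines[term.line].lower():
            refined.append(Term(term.text, "total_amount", term.line))
        else:
            refined.append(term)
    return refined


def extract(text):
    """Store each categorized term in a data element keyed by its category.
    The input text, including any template text, is never modified."""
    elements = {}
    for term in refine(categorize(text), text):
        if term.category != "other":
            elements.setdefault(term.category, []).append(term.text)
    return elements


# Made-up sample document; template text such as "Invoice date:" stays in place.
SAMPLE = """ACME Utilities Invoice
Invoice date: 08/06/2013
Service charge: $40.00
Total due: $42.50"""

if __name__ == "__main__":
    print(extract(SAMPLE))
```

The dictionary returned by `extract` stands in for the claim's plurality of data elements: another application could read the values directly rather than requiring a user to type them in.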
