
Intelligent extraction and organization of data from unstructured documents

  • US 10,678,998 B1
  • Filed: 04/01/2019
  • Issued: 06/09/2020
  • Est. Priority Date: 02/20/2018
  • Status: Active Grant
First Claim

1. A system comprising a computer configured with instructions encoded on a non-transitory computer readable medium, the instructions operable to, when executed, cause the computer to perform a set of acts comprising:

  • a) obtaining a plurality of data elements from an input document corresponding to information for rows and columns in a structured output document;

  • b) generating a set of data element groups based on applying a set of matching criteria to the data elements after the data elements have been assigned to a set of existing groups, each of which comprises one or more data elements, and testing for horizontal overlaps between neighbors of data elements in a first existing group and data elements in a second existing group and combining the first and second existing groups based on determining the existence of a match and absence of neighbor overlap between the first and second existing groups; and

  • c) assigning the data elements to columns based on the set of data element groups to generate the structured output document with columns having data elements separated from data elements in neighboring columns by whitespace and extending vertically across any internal headings in the one or more input documents regardless of whether the internal headings cause vertical whitespace between neighboring columns to be discontinuous.
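
Acts (b) and (c) can be illustrated with a small sketch. This is one possible reading of the claim language, not the patented implementation. The claim does not define the matching criteria, so the edge-alignment test below is an assumption. The `Element` type, the function names, the `tol` tolerance, and the treatment of "neighbors" as other elements on the same row are likewise hypothetical choices made for illustration. The sketch does not attempt any special handling of internal headings.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Element:
    """Hypothetical data element: text plus horizontal extent and row index."""
    text: str
    x0: float
    x1: float
    row: int


def overlaps(a, b):
    """True if the horizontal extents of a and b intersect."""
    return a.x0 < b.x1 and b.x0 < a.x1


def groups_match(g1, g2, tol=2.0):
    """Assumed matching criterion: some left or right edges align within tol."""
    return any(abs(a.x0 - b.x0) <= tol or abs(a.x1 - b.x1) <= tol
               for a in g1 for b in g2)


def neighbor_overlap(g1, g2, all_elements):
    """True if a same-row neighbor of one group horizontally overlaps the other group."""
    def blocked(src, dst):
        for e in src:
            for n in all_elements:
                if n.row == e.row and n not in src:
                    if any(overlaps(n, d) for d in dst):
                        return True
        return False
    return blocked(g1, g2) or blocked(g2, g1)


def merge_groups(elements, tol=2.0):
    """Start from singleton groups; combine pairs that match and have no neighbor overlap."""
    groups = [[e] for e in elements]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(groups)), 2):
            if (groups_match(groups[i], groups[j], tol)
                    and not neighbor_overlap(groups[i], groups[j], elements)):
                groups[i] = groups[i] + groups[j]
                del groups[j]
                merged = True
                break  # group indices changed, so restart the scan
    return groups


def assign_columns(groups):
    """Order groups left to right and lay their elements out as table rows."""
    ordered = sorted(groups, key=lambda g: min(e.x0 for e in g))
    rows = sorted({e.row for g in ordered for e in g})
    table = {r: [""] * len(ordered) for r in rows}
    for col, group in enumerate(ordered):
        for e in group:
            table[e.row][col] = e.text
    return [table[r] for r in rows]


# Made-up input: a left-aligned first column and a right-aligned second column.
elements = [
    Element("Name", 0, 30, 0), Element("Qty", 50, 65, 0),
    Element("Apple", 0, 35, 1), Element("3", 58, 65, 1),
    Element("Pear", 0, 28, 2), Element("12", 52, 65, 2),
]
table = assign_columns(merge_groups(elements))
print(table)
```

With this made-up input, the six elements collapse into two groups, which become the two columns of a three-row table. The neighbor-overlap test is what keeps an element on the same row as a group member, or one whose extent reaches into a neighboring column, from being merged into that group even when an edge happens to align.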
