Machine learning of document templates for data extraction

  • US 7,561,734 B1
  • Filed: 10/23/2006
  • Issued: 07/14/2009
  • Est. Priority Date: 03/02/2002
  • Status: Expired due to Fees
First Claim

1. A method in a computer system for learning at least one attribute of a data element within a document, comprising:

  • receiving from a user by the computer system a boundary of a data element within a document; and

    inferring by the computer system at least one attribute of the data element bounded by the boundary, wherein the at least one attribute of the data element is inferred from the boundary of the data element;

    wherein the at least one attribute includes at least one of one or more lexical attributes, one or more contextual attributes, and one or more control attributes; and

    wherein each of the one or more contextual attributes comprises:

    a total number of words in a context; and

    one or more context words, each context word having one or more associated measurements.
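The steps recited in claim 1 can be illustrated with a small sketch. It is not the patented implementation: the claim does not say how the inference is performed, so every name here (`Word`, `Box`, `ContextualAttribute`, `infer_attributes`, the `margin` parameter) is hypothetical, and the choice of "offset from the boundary's top-left corner" as the per-word measurement is an assumption made only for illustration. Control attributes are left out.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A word on the page with an (x, y) position. Hypothetical type."""
    text: str
    x: float
    y: float


@dataclass
class Box:
    """A user-drawn boundary around a data element. Hypothetical type."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, w: Word) -> bool:
        return self.left <= w.x <= self.right and self.top <= w.y <= self.bottom


@dataclass
class ContextualAttribute:
    """Mirrors the claim: a total word count plus context words,
    each with associated measurements (assumed here to be x/y offsets)."""
    total_words: int
    context_words: dict


def infer_attributes(words, box, margin=50.0):
    """Infer lexical and contextual attributes from a boundary.

    Assumption: the 'context' is every word within `margin` units of the
    boundary but outside it. The claim does not define the context region.
    """
    inside = [w for w in words if box.contains(w)]
    region = Box(box.left - margin, box.top - margin,
                 box.right + margin, box.bottom + margin)
    context = [w for w in words if region.contains(w) and not box.contains(w)]

    # Lexical attributes: simple properties of the bounded text itself.
    text = " ".join(w.text for w in inside)
    lexical = {"text": text, "is_numeric": text.isdigit(), "length": len(text)}

    # Contextual attribute: word count plus per-word measurements.
    contextual = ContextualAttribute(
        total_words=len(context),
        context_words={w.text: (w.x - box.left, w.y - box.top) for w in context},
    )
    return {"lexical": lexical, "contextual": contextual}


if __name__ == "__main__":
    page = [Word("Invoice", 10, 10), Word("No:", 60, 10),
            Word("12345", 100, 10), Word("Total", 10, 500)]
    attrs = infer_attributes(page, Box(95, 5, 150, 15), margin=100)
    print(attrs["lexical"])
    print(attrs["contextual"])
```

In this sketch the user supplies only the boundary; the bounded text and its neighbouring words are both derived from it, which is the sense in which the attributes are "inferred from the boundary".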

  • 6 Assignments