Auto-classification of PDF forms by dynamically defining a taxonomy and vocabulary from PDF form fields

  • US 8,392,472 B1
  • Filed: 11/05/2009
  • Issued: 03/05/2013
  • Est. Priority Date: 11/05/2009
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

    associating form fields of a Portable Document Format (PDF) file with a markup language schema, the markup language schema specifying semantic constraints on attributes of the form fields within the PDF file, the form fields for receiving data;

    creating a content folder representing a specific classification, the specific classification based on attributes of form fields from the PDF file;

    receiving a selection of a subset of the form fields from the PDF file;

    associating the selection of the subset of the form fields with the content folder including creating metadata describing the selected form fields, the content folder configured for storing corresponding individual data entries received within the selected form fields of PDF files;

    extracting data from form fields of submitted PDF files, the submitted PDF files having data input into form fields associated with the content folder, the extracted data and metadata describing the form fields stored separately from the submitted PDF files; and

    automatically classifying the submitted PDF files based on attributes of the selected form fields and the extracted data.
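
To make the sequence of steps in the claim easier to follow, here is a minimal Python sketch of one possible reading. It is an illustration only, not the patented implementation. All names (`ContentFolder`, `create_folder`, `extract`, `classify`, the sample field names, and the `"intake/contact"` label) are hypothetical. PDF parsing is left out: a form is modeled as a plain dictionary of field names to values, and the markup language schema is reduced to a dictionary of per-field attributes. The rule that a form matches a folder when every selected field is filled in is also an assumption; the claim does not specify the matching rule.

```python
from dataclasses import dataclass, field


@dataclass
class ContentFolder:
    """Hypothetical stand-in for the claim's 'content folder'."""
    classification: str
    selected_fields: set
    # Metadata describing the selected form fields (name -> attributes).
    field_metadata: dict = field(default_factory=dict)
    # Extracted entries, kept separately from the submitted files themselves.
    entries: list = field(default_factory=list)


def create_folder(classification, form_fields, selection):
    """Create a folder and associate a selected subset of fields with it."""
    folder = ContentFolder(classification, set(selection))
    for name in selection:
        # Copy the field's attributes so the folder carries its own metadata.
        folder.field_metadata[name] = dict(form_fields[name])
    return folder


def extract(submitted_form, folder):
    """Pull only the selected fields' values out of a submitted form."""
    return {k: v for k, v in submitted_form.items() if k in folder.selected_fields}


def classify(submitted_form, folders):
    """Assumed rule: match the first folder whose selected fields are all filled in."""
    for folder in folders:
        data = extract(submitted_form, folder)
        if data.keys() == folder.selected_fields and all(data.values()):
            folder.entries.append(data)
            return folder.classification
    return None


# Hypothetical template: field name -> attributes (stand-in for the schema).
template_fields = {
    "name": {"type": "text", "required": True},
    "email": {"type": "text", "required": True},
    "notes": {"type": "text", "required": False},
}
folder = create_folder("intake/contact", template_fields, ["name", "email"])

submitted = {"name": "Ada", "email": "ada@example.com", "notes": ""}
label = classify(submitted, [folder])
unmatched = classify({"name": "Ada"}, [folder])
```

In this sketch the folder stores only the extracted values and the field metadata, never the submitted form itself, which mirrors the claim's requirement that extracted data be stored separately from the submitted PDF files.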
