Generation of classification data used for classifying documents

  • US 10,318,568 B2
  • Filed: 06/07/2016
  • Issued: 06/11/2019
  • Est. Priority Date: 06/07/2016
  • Status: Active Grant
First Claim

1. A computer-implemented method for generating classification data which is used for classifying documents, the method comprising:

    reading, in a memory, documents in a form of a spreadsheet;

    collecting cell values in each of the documents;

    finding, using a processor, in each of common or near cell locations among all or a part of the documents, one or more common cell values among the collected values;

    counting, using the processor, for each of the common cell values, a number of the documents having the common cell value;

    storing, if the number of the documents is equal to or larger than a predetermined number, the common cell value as a candidate header label in a memory;

    calculating, using the processor, a distance between cell locations of the candidate header labels in each of the documents;

    choosing, according to the calculated distance, two or more candidate header labels among the candidate header labels for each of the documents; and

    storing, in a storage, one or more combinations of the chosen two or more candidate header labels (hereinafter referred to as “header”) as the classification data.
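
For illustration only, the steps recited in claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: each document is modeled as a dict mapping a (row, column) location to a string cell value, only exact location matches are counted (the "near cell locations" alternative is omitted), the distance is assumed to be Manhattan distance, and "choosing according to the calculated distance" is assumed to mean grouping labels that lie within `max_distance` of the preceding label. The names `candidate_labels`, `manhattan`, `headers_for`, `build_classification_data`, `min_docs`, and `max_distance` are hypothetical and do not appear in the claim.

```python
from collections import Counter


def candidate_labels(docs, min_docs):
    """Count, per (location, value) pair, how many documents contain it,
    and keep the pairs found in at least `min_docs` documents."""
    counts = Counter()
    for doc in docs:
        for pair in doc.items():  # pair is ((row, col), value)
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n >= min_docs}


def manhattan(a, b):
    """Assumed distance metric between two (row, col) locations."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def headers_for(doc, candidates, max_distance):
    """Group one document's candidate labels into runs of nearby cells.
    A run of two or more labels is treated as a header."""
    found = sorted(pair for pair in doc.items() if pair in candidates)
    headers, run = [], []
    for loc, value in found:
        # Close the current run when the next label is too far away.
        if run and manhattan(run[-1][0], loc) > max_distance:
            if len(run) >= 2:
                headers.append(tuple(v for _, v in run))
            run = []
        run.append((loc, value))
    if len(run) >= 2:
        headers.append(tuple(v for _, v in run))
    return headers


def build_classification_data(docs, min_docs=2, max_distance=1):
    """Return the set of headers (label combinations) found across docs."""
    candidates = candidate_labels(docs, min_docs)
    data = set()
    for doc in docs:
        data.update(headers_for(doc, candidates, max_distance))
    return data


# Hypothetical input: three tiny spreadsheets, two sharing a header row.
example_docs = [
    {(0, 0): "Name", (0, 1): "Price", (1, 0): "apple", (1, 1): "3"},
    {(0, 0): "Name", (0, 1): "Price", (1, 0): "pear", (1, 1): "5"},
    {(0, 0): "Date", (0, 1): "Total", (1, 0): "x", (1, 1): "9"},
]
print(build_classification_data(example_docs))  # {('Name', 'Price')}
```

In this sketch the threshold `min_docs` plays the role of the claim's "predetermined number", and the returned set of label tuples stands in for the stored classification data. A different distance metric or grouping rule would fit the claim language equally well.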
