CONCEPTUAL DOCUMENT ANALYSIS AND CHARACTERIZATION

  • US 20160328454A1
  • Filed: 07/20/2016
  • Published: 11/10/2016
  • Est. Priority Date: 04/27/2015
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving a plurality of data files from a plurality of data sources that comprise textual content;

  • categorizing the plurality of data files using a taxonomy of categories in which each category has associated sample textual content defining a concept for the category, the categorizing comprising, for each category:

      • comparing, for each of the plurality of data files, the textual content of the data file with the sample textual content for the category;

      • calculating, based on the comparing and for each of the plurality of data files, a file score corresponding to the degree of similarity between the defined concept of the category and a determined concept for the data file; and

      • associating, for each of the plurality of data files, the data file with the category if the file score is equal to or greater than a pre-determined minimum score for the category; and

  • providing at least a portion of the data file and/or the associated file score.
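
The steps of claim 1 can be illustrated with a minimal sketch. The claim does not say how the "degree of similarity" is computed, so the bag-of-words cosine similarity used here is an assumption chosen for brevity, not the patented technique. The names `term_vector`, `cosine_similarity`, and `categorize`, the file names, and the 0.3 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased word counts; an assumed stand-in for a text's 'concept'."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(files, taxonomy):
    """Score every file against every category and keep the matches.

    files:    {file_name: textual_content}
    taxonomy: {category: (sample_textual_content, minimum_score)}
    Returns a list of (file_name, category, file_score) tuples.
    """
    results = []
    for category, (sample_text, min_score) in taxonomy.items():
        category_concept = term_vector(sample_text)
        for name, text in files.items():
            # Compare the file's text with the category's sample text.
            score = cosine_similarity(term_vector(text), category_concept)
            # Associate the file with the category only at or above the minimum.
            if score >= min_score:
                results.append((name, category, round(score, 3)))
    return results


# Hypothetical inputs, for illustration only.
files = {
    "a.txt": "quarterly revenue and profit report",
    "b.txt": "the cat sat on the mat",
}
taxonomy = {"finance": ("revenue profit earnings report", 0.3)}

print(categorize(files, taxonomy))  # [('a.txt', 'finance', 0.671)]
```

Each category carries its own minimum score, mirroring the claim's "pre-determined minimum score for the category". A file may therefore be associated with several categories, or with none.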
