Computer-implemented systems and methods for taxonomy development

  • US 9,116,985 B2
  • Filed: 12/16/2011
  • Issued: 08/25/2015
  • Est. Priority Date: 12/16/2011
  • Status: Active Grant
First Claim

1. A computer-implemented method for generating a set of classifiers, comprising:

    determining, using one or more data processors, one or more locations of instances of a topic term in a collection of documents;

    identifying, using the one or more data processors, one or more topic term phrases by parsing words in the collection of documents, wherein a topic term phrase includes one or more words that appear within a topic threshold distance of a topic term;

    identifying, using the one or more data processors, one or more sentiment terms within a topic term phrase;

    identifying, using the one or more data processors, one or more candidate classifiers by parsing words in the one or more topic term phrases, wherein a candidate classifier is a word that appears within a sentiment threshold distance of a sentiment term;

    generating, using the one or more data processors, a colocation matrix including a plurality of rows, wherein a candidate classifier is associated with a row, and wherein the colocation matrix is generated using the locations of the candidate classifiers as they appear within the collection of documents;

    standardizing, using the one or more data processors, each of the plurality of rows by dividing values in each row by a sum of the values in the row;

    identifying, using the one or more data processors, a seed row, wherein the seed row is selected from among the plurality of rows, and wherein the seed row is associated with a particular attribute;

    determining, using the one or more data processors, distance metrics by comparing rows of the colocation matrix to the seed row; and

    generating, using the one or more data processors, a set of classifiers for the particular attribute, wherein classifiers in the set of classifiers are selected using the distance metrics.
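
For illustration only, the steps recited in claim 1 might be sketched in Python as follows. This is not the patentee's implementation, and several details are assumptions the claim does not specify: documents are pre-tokenized lists of words, both threshold distances are counted in tokens, topic and sentiment terms are excluded from the candidates, the columns of the colocation matrix are the candidate classifiers themselves (counting pairs that occur within a hypothetical `coloc_dist` tokens of each other in the same document), the distance metric is Euclidean, and classifiers are selected by a hypothetical `max_distance` cutoff. All function and parameter names are invented for the example.

```python
import math
from collections import defaultdict


def candidate_positions(docs, topic, sentiments, topic_dist, sent_dist):
    """Map each candidate classifier to the set of (doc, index) places it occurs."""
    found = defaultdict(set)
    for d, tokens in enumerate(docs):
        for tpos, tok in enumerate(tokens):
            if tok != topic:
                continue
            # Topic term phrase: tokens within topic_dist of the topic term.
            lo = max(0, tpos - topic_dist)
            hi = min(len(tokens), tpos + topic_dist + 1)
            for spos in range(lo, hi):
                if tokens[spos] not in sentiments:
                    continue
                # Candidates: phrase words within sent_dist of the sentiment term.
                # Assumption: topic and sentiment terms are not candidates.
                start = max(lo, spos - sent_dist)
                stop = min(hi, spos + sent_dist + 1)
                for cpos in range(start, stop):
                    word = tokens[cpos]
                    if word != topic and word not in sentiments:
                        found[word].add((d, cpos))
    return dict(found)


def colocation_matrix(positions, coloc_dist):
    """One row per candidate; entry [w][v] counts nearby occurrences of w and v.

    Assumption: "nearby" means same document, at most coloc_dist tokens apart.
    """
    words = sorted(positions)
    matrix = {w: dict.fromkeys(words, 0) for w in words}
    for w in words:
        for v in words:
            if w != v:
                matrix[w][v] = sum(
                    1
                    for dw, pw in positions[w]
                    for dv, pv in positions[v]
                    if dw == dv and abs(pw - pv) <= coloc_dist
                )
    return matrix


def standardize_rows(matrix):
    """Divide each value in a row by the row sum (all-zero rows stay zero)."""
    out = {}
    for w, row in matrix.items():
        total = sum(row.values())
        out[w] = {v: (c / total if total else 0.0) for v, c in row.items()}
    return out


def distances_to_seed(matrix, seed):
    """Euclidean distance from every row to the seed row (metric is an assumption)."""
    seed_row = matrix[seed]
    return {
        w: math.sqrt(sum((row[v] - seed_row[v]) ** 2 for v in seed_row))
        for w, row in matrix.items()
    }


def select_classifiers(dists, max_distance):
    """Keep candidates whose row lies within max_distance of the seed row."""
    return sorted(w for w, d in dists.items() if d <= max_distance)
```

In this sketch the seed row is chosen by the caller, who passes the word associated with the attribute of interest as `seed`. A ranked top-k selection over the same distance values would be an equally plausible reading of the final step.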
