Method and system for implementing author profiling

  • US 9,607,340 B2
  • Filed: 03/12/2013
  • Issued: 03/28/2017
  • Est. Priority Date: 03/12/2013
  • Status: Active Grant
First Claim

1. A computer implemented method for analyzing author data, comprising:

  • receiving writings created by a plurality of authors;

    performing a semantic analysis upon the writings;

    generating a plurality of author profiles for the writings using results from the semantic analysis, the plurality of author profiles respectively identifying topics of interest to the plurality of authors, and groups of authors being identified from one or more of the topics of interest;

    identifying a first group of multiple authors that corresponds to a first topical subject and multiple author profiles for the multiple authors, the first group of multiple authors identified from the groups and corresponding to the multiple author profiles identified from the plurality of author profiles;

    identifying a second topical subject shared among at least some authors of the multiple authors in the first group at least by performing a correlation analysis that analyzes at least some author profiles in the multiple author profiles of the at least some authors;

    identifying a second group of authors from the plurality of authors that exhibit affinity for the second topical subject at least by identifying author vectors corresponding to the second group of authors with respect to the second topical subject; and

    correlating the first group of multiple authors with the second group of authors in response to the identification of the second topical subject, wherein the writings are received from the plurality of authors without targeting specific groups of authors;

    classifying the writings into a plurality of classes based in part or in whole upon topics of interests determined by the semantic analysis, classifying the writing including;

    creating a set of themes from results of the semantic analysis;

    analyzing the set of themes created from the results of the semantic analysis;

    determining subjects of the topics of interest based in part or in whole upon the set of themes;

    determining similarity among the subjects of the topics of interest at least by analyzing the plurality of author profiles;

    clustering the topics of interests into the plurality of classes based in part or in whole upon the similarity among the subjects;

    determining respective strength numbers for the plurality of authors, a strength number for a user indicating relative affinity of the user to a category relative to one or more remaining categories;

    associating the respective strength numbers that correspond to the plurality of authors with a plurality of categories;

    creating a vector for each author of the plurality of authors, wherein vectors for the plurality of authors indicate respective affinities among the plurality of authors to one or more common topics of interests or one or more subjects;

    establishing an author profile for the each author by using the vector for the each author;

    storing the author profile for the author in the plurality of author profiles;

    reducing noise in the writings at least by performing a semantic filtering process;

    improving accuracy of the plurality of classes from classifying the writings at least by reducing false positives, false negatives, and inappropriate contents with the semantic filtering process;

    identifying an actionable data based in part or in whole upon results of the semantic analysis, wherein the writings created by the plurality of authors include contents transcribed from non-social data;

    determining, at a rule and workflow module stored at least partially in memory, the plurality of computing systems to receive the actionable data based in part or in whole upon a set of rules that identifies how the actionable data is to be handled and directed;

    performing the semantic analysis upon the writings at least by performing a statistical language modeling;

    performing the semantic analysis upon the writings at least by performing a latent semantic analysis;

    preconfiguring a plurality of types of topics of interest;

    determining a first set of authors that corresponds to the one or more first types of topics of interest at least by analyzing the plurality of author profiles to identify a first set of author profiles corresponding to the first set of authors;

    determining commonality of one or more second types of topics of interest without pre-defining the one or more second types of topics of interest;

    identifying commonality among the plurality of writings in response to the one or more second types of topics of interest based in part or in whole upon results of the semantic analysis;

    identifying a group of authors that corresponds to a first affinity for a first subject;

    determining a second affinity and a third affinity shared by at least a threshold percentage of authors of the group of authors at least by analyzing a set of author profiles corresponding to the group of authors and by performing one or more first correlation analyses, wherein the second affinity and the third affinity are not known or expected in advance;

    generating correlation data based in part or in whole upon results of determining the second affinity and the third affinity;

    generating an action for the group of authors based on the second affinity and the third affinity;

    receiving the writings created by the plurality of authors without targeting one or more specific groups of authors;

    generating the plurality of author profiles for the writings based in part or in whole upon respective strength numbers for the plurality of authors;

    identifying a plurality of themes from the writings based in part or in whole upon results of the semantic analysis and results of classifying the writings;

    performing a themes analysis;

    generating the plurality of author profiles for the writing based in part or in whole upon the plurality of themes;

    determining a first set of actionable data for the plurality of authors based in part or in whole upon results of correlating an at least one group with the authors;

    identifying a set of rules from a rulebase;

    dispatching, at a rules and workflow engine, actionable data for the plurality of authors to a plurality of computing systems based in part or in whole upon the set of rules, wherein a rule provides how the actionable data is to be dispatched;

    determining, at a computer system, contextual and semantic significance in the writings at least by performing classification and filtering on the writings of the plurality of authors;

    identifying specific themes within the writings based in part or in whole upon topics and subjects revealed from the semantic analysis and the classification;

    performing categorization on the topics and the subjects of the writings to create a number of categories;

    associating a set of strength numbers with the number of categories, a strength number indicating relative affinity of each author of the plurality of authors to a particular topic, a particular subject, or a particular theme; and

    defining a vector for the each author using at least the set of strength numbers and the number of categories, a vector establishing an author profile for a specific author and being used to describe and analyze the specific author with respect to one or more affinities of the specific author.
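
Illustrative sketch (not part of the claim): the strength-number, author-vector, and group-correlation steps recited above can be pictured with a small Python example. Everything in it is an assumption made for illustration: the sample data, the function names, the use of a category's share of an author's classified writings as the "strength number", and the threshold values. The claim does not prescribe any of these choices, and the semantic analysis and classification steps are taken as already done.

```python
from collections import Counter

# Hypothetical input: for each author, how many writings were classified
# into each category by an upstream semantic analysis (not shown here).
CATEGORY_COUNTS = {
    "ann": {"cycling": 6, "coffee": 3, "travel": 1},
    "bob": {"cycling": 5, "coffee": 5},
    "cat": {"cycling": 8, "travel": 2},
    "dan": {"coffee": 4, "gardening": 6},
}


def author_vector(counts, categories):
    """Assumed strength number: a category's share of the author's writings,
    i.e. affinity to that category relative to the remaining categories."""
    total = sum(counts.values())
    return [counts.get(c, 0) / total if total else 0.0 for c in categories]


def build_profiles(category_counts):
    """Create one vector per author; the vector serves as the author profile."""
    categories = sorted({c for counts in category_counts.values() for c in counts})
    profiles = {
        author: author_vector(counts, categories)
        for author, counts in category_counts.items()
    }
    return categories, profiles


def authors_with_affinity(profiles, categories, topic, min_strength):
    """Group the authors whose strength number for `topic` meets a threshold."""
    idx = categories.index(topic)
    return sorted(a for a, vec in profiles.items() if vec[idx] >= min_strength)


def shared_affinities(profiles, categories, group, known_topic,
                      min_strength, min_share):
    """Find other topics shared by at least `min_share` of the group.
    A simple co-occurrence tally stands in for the correlation analysis."""
    if not group:
        return []
    tally = Counter()
    for author in group:
        for category, strength in zip(categories, profiles[author]):
            if category != known_topic and strength >= min_strength:
                tally[category] += 1
    return sorted(c for c, n in tally.items() if n / len(group) >= min_share)


if __name__ == "__main__":
    categories, profiles = build_profiles(CATEGORY_COUNTS)
    first_group = authors_with_affinity(profiles, categories, "cycling", 0.5)
    second_topics = shared_affinities(
        profiles, categories, first_group, "cycling", 0.25, 0.6
    )
    print("first group:", first_group)
    print("shared second topics:", second_topics)
    for topic in second_topics:
        # Second group: all authors with affinity for the discovered topic.
        print(topic, "->", authors_with_affinity(profiles, categories, topic, 0.25))
```

In this sketch the second topic is not supplied in advance; it emerges from the profiles of the first group, and the second group is then drawn from all authors rather than from the first group only. A production system would likely replace the tally with a statistical correlation measure.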
