Contextual analysis engine

  • US 9,990,422 B2
  • Filed: 10/15/2013
  • Issued: 06/05/2018
  • Est. Priority Date: 10/15/2013
  • Status: Active Grant
First Claim

1. A method of analyzing digital content, the method comprising:

  • receiving a corpus of text;

    extracting a plurality of n-grams from the corpus of text;

    constructing a multi-dimensional document feature vector, wherein the multi-dimensional document feature vector includes at least a portion of the n-grams extracted from the corpus of text and a relevance factor corresponding to each of the n-grams included in the multi-dimensional document feature vector;

    extracting a portion of topics included in a topic ontology, wherein each of the extracted topics is related to at least one of the n-grams included in the multi-dimensional document feature vector;

    generating a hierarchical listing that includes the extracted topics, wherein the hierarchical listing comprises a first plurality of nodes in a first branch of the hierarchical listing, and a second plurality of nodes in a second branch of the hierarchical listing, and wherein a particular node in a particular branch of the hierarchical listing includes a particular extracted topic; and

    assigning a relevancy score to the particular extracted topic, wherein the assigned relevancy score is based on (a) the relevance factor corresponding to an n-gram that is related to the particular extracted topic, and (b) relevancy scores assigned to other extracted topics included in the particular branch of the hierarchical listing, wherein the hierarchical listing has a hierarchical structure corresponding to a hierarchical structure of the topic ontology, such that topics extracted from a relatively higher ontology level are in a corresponding higher hierarchical level of the listing, and topics extracted from a relatively lower ontology level are in a corresponding lower hierarchical level of the hierarchical listing, and wherein the hierarchical listing includes an extracted topic that is not included in the plurality of n-grams extracted from the corpus of text.
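
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the claim does not specify how the relevance factor or the relevancy score is computed, so the sketch assumes relative frequency for the former and a damped sum over child topics for the latter. The toy ontology, the `damping` parameter, and all function names are hypothetical. Ancestor topics are pulled in so that the listing mirrors the ontology levels, which is also an assumption of this sketch.

```python
from collections import Counter

def extract_ngrams(text, max_n=2):
    """Return lowercase word n-grams for n = 1..max_n."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]

def feature_vector(ngrams):
    """Map each n-gram to a relevance factor (assumed: relative frequency)."""
    counts = Counter(ngrams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}

# Hypothetical toy ontology: topic -> (parent topic or None, related n-grams).
ONTOLOGY = {
    "sports": (None, ["match"]),
    "soccer": ("sports", ["goal", "penalty kick"]),
    "tennis": ("sports", ["serve"]),
    "finance": (None, ["market"]),
    "stocks": ("finance", ["share price"]),
}

def extract_topics(vector, ontology):
    """Select topics related to an n-gram in the vector, plus their ancestors."""
    # Direct score: sum of relevance factors of the topic's matching n-grams.
    direct = {topic: sum(vector[g] for g in grams if g in vector)
              for topic, (_, grams) in ontology.items()
              if any(g in vector for g in grams)}
    selected = set(direct)
    for topic in direct:
        parent = ontology[topic][0]
        while parent is not None:  # assumption: keep ancestors for structure
            selected.add(parent)
            parent = ontology[parent][0]
    return selected, direct

def build_listing(selected, direct, ontology, damping=0.5):
    """Build nested nodes that mirror the ontology levels and score each one."""
    def node(topic):
        children = [node(c) for c in sorted(selected) if ontology[c][0] == topic]
        # (a) the topic's own n-gram relevance plus (b) damped scores of the
        # other topics in the same branch (here: its descendants).
        score = direct.get(topic, 0.0) + damping * sum(c["score"] for c in children)
        return {"topic": topic, "score": score, "children": children}
    return [node(t) for t in sorted(selected) if ontology[t][0] is None]

def analyze(text, ontology=ONTOLOGY):
    """Run the full pipeline: n-grams -> feature vector -> scored listing."""
    vector = feature_vector(extract_ngrams(text))
    selected, direct = extract_topics(vector, ontology)
    return build_listing(selected, direct, ontology)
```

For the input "the match ended with a goal while the share price fell", this sketch yields two branches, "finance" and "sports", with "stocks" and "soccer" as their respective child nodes. None of those four topic names occurs as an n-gram in the input, which corresponds to the final "wherein" clause of the claim.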
