Apparatus and method for extracting semantic topic

  • US 10,423,723 B2
  • Filed: 06/03/2015
  • Issued: 09/24/2019
  • Est. Priority Date: 12/06/2012
  • Status: Active Grant
First Claim

1. A method for automatically extracting one or more semantic topics from one or more electronic document sets in which user's opinions of an object are described using an apparatus capable of calculating a probability distribution, the method comprising:

  • (a) extracting, by a processor, a global word distribution from a plurality of global topic-sentiment pairs and a local word distribution from a plurality of local topic-sentiment pairs;

    (b) extracting, by the processor, a global topic distribution corresponding to words constituting a global topic, a global sentiment distribution corresponding to words constituting a first sentiment about the global topic, a local topic distribution corresponding to words constituting a local topic, and a local sentiment distribution corresponding to words constituting a second sentiment about the local topic with respect to each document of the electronic document sets;

    (c) performing, by the processor, statistical inference about each of the global word distribution, the local word distribution, the global topic distribution, the global sentiment distribution, the local topic distribution and the local sentiment distribution extracted in the step (a) and step (b);

    (d) with respect to each document of the electronic document sets, extracting, by the processor, a first global topic from the global topic distribution and the global sentiment distribution, and a first local topic from the local topic distribution and the local sentiment distribution, and extracting, by the processor, a third sentiment relevant to the first global topic and a fourth sentiment relevant to the first local topic; and

    (e) extracting, by the processor, one or more words from the global word distribution and the local word distribution on the basis of the first global topic, the first local topic, the third sentiment relevant to the first global topic and the fourth sentiment relevant to the first local topic,

    wherein each of the plurality of the global topic-sentiment pairs includes a first word expressing the global topic and a second word expressing the sentiment about the global topic, and each of the plurality of the local topic-sentiment pairs includes a third word expressing the local topic and a fourth word expressing the sentiment about the local topic,

    wherein the step (b) includes;

    extracting, by the processor, the global topic distribution and the global sentiment distribution about the global topic from the each document;

    shifting, by the processor, one or more sliding windows overlapped with each other in each document;

    extracting, by the processor, a categorical distribution of the sliding window; and

    extracting, by the processor, the local topic distribution, the local sentiment distribution about the local topic and a topic context distribution based on words extracted from a sentence in the one or more sliding window;

    wherein a size of the one or more sliding windows is set for the third word expressing the local topic and the fourth word expressing the sentiment about the local topic to be extracted together,

    wherein the step (d) includes;

    selecting a first sliding window and a first topic context with respect to each document;

    if the first topic context is global, selecting the first global topic and the third sentiment relevant to the first global topic from the global topic distribution and the global sentiment distribution; and

    if the first topic context is local, selecting the first local topic and the fourth sentiment relevant to the first local topic from the local topic distribution and the local sentiment distribution;

    wherein the global topic includes a first group of aspects that are used to only distinguish the object from another object among aspects of the object, and the local topic includes a second group of aspects that are sentiment-oriented and are used to calculate a rating of the object,

    wherein the one or more words extracted in the step (e) includes the first global topic of the object, the first local topic of the object, the third sentiment relevant to the first global topic and the fourth sentiment relevant to the first local topic, and

    wherein the first global topic, the first local topic, the third sentiment relevant to the first global topic and the fourth sentiment relevant to the first local topic are used to calculate a rating of the object and a rating of the aspect of the object.
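
For illustration only, the overlapping sliding windows of step (b) and the context-dependent selection of step (d) might be sketched as below. This is not the patented implementation. The function names, the window size of three sentences, the stride of one sentence, and the choice to condition the sentiment on the selected topic are all assumptions made for the example; the claim fixes none of them.

```python
import random

def sliding_windows(sentences, size=3):
    """Return overlapping windows of `size` consecutive sentences.

    Assumption: windows advance one sentence at a time, so adjacent
    windows share size - 1 sentences. The claim only requires overlap.
    """
    if not sentences:
        return []
    if len(sentences) <= size:
        return [list(sentences)]
    return [sentences[i:i + size] for i in range(len(sentences) - size + 1)]

def sample(dist, rng):
    """Draw one label from a {label: probability} categorical distribution."""
    labels = list(dist)
    weights = [dist[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]

def select_topic_sentiment(context, global_dists, local_dists, rng):
    """Pick a (topic, sentiment) pair according to the topic context.

    Each *_dists argument is a hypothetical pair:
      (topic distribution, {topic: sentiment distribution}).
    A "global" context reads the global pair; anything else reads the
    local pair.
    """
    topic_dist, sentiment_dists = (
        global_dists if context == "global" else local_dists
    )
    topic = sample(topic_dist, rng)
    sentiment = sample(sentiment_dists[topic], rng)
    return topic, sentiment
```

In this sketch a caller would iterate over `sliding_windows(document_sentences)`, draw a topic context for each window, and pass it to `select_topic_sentiment`. The window size would in practice be tuned so that a local topic word and its sentiment word tend to fall inside the same window, as the claim requires.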
