Knowledge extraction from online discussion forums

  • US 7,814,048 B2
  • Filed: 08/14/2006
  • Issued: 10/12/2010
  • Est. Priority Date: 08/14/2006
  • Status: Active Grant
First Claim

1. A method, comprising:

    accessing a thread from a discussion forum having a plurality of threads, the accessed thread having a root message with a thread title and a plurality of replies associated with the root message;

    selecting replies from the plurality of replies in the accessed thread by analyzing structural features and content features of each reply, wherein the structural features provide context of a given reply as related to other of the plurality of replies to the root message and the content features include words related to the root message;

    applying a filter to remove one or more replies from the previously selected replies by comparing a keyword list having a plurality of words, wherein the keyword list includes words indicative of personal identifying information, to content features in each of the selected replies and removing those replies that have at least one of the words indicative of personal identifying information in its content features;

    ranking the replies previously selected from the plurality of replies in the accessed thread that remain after applying the filter using a ranking model based on ranking features of the replies;

    generating a list of replies from the selected replies based on results of the ranking; and

    storing the list of replies in a data store to create a knowledge base for an automated conversational agent.
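
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: every name here (Reply, PII_KEYWORDS, select_replies, filter_pii, rank_replies, build_knowledge_base) is hypothetical, the structural and content features are reduced to reply position, a quotes-the-root flag, and word overlap with the thread title, and a hand-weighted score stands in for the claimed ranking model. The data store is assumed to be a plain in-memory dict.

```python
import re
from dataclasses import dataclass


@dataclass
class Reply:
    position: int      # 1-based order in the thread (assumed structural feature)
    quotes_root: bool  # whether the reply quotes the root message (assumed structural feature)
    text: str


# Hypothetical keyword list of words indicative of personal identifying information.
PII_KEYWORDS = {"email", "phone", "address"}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def title_overlap(title, reply):
    """Content feature: number of distinct title words that appear in the reply."""
    return len(tokenize(title) & tokenize(reply.text))


def select_replies(title, replies, min_overlap=1):
    """Keep replies that share words with the title or that quote the root message."""
    return [r for r in replies
            if title_overlap(title, r) >= min_overlap or r.quotes_root]


def filter_pii(replies, keywords=PII_KEYWORDS):
    """Remove any reply containing at least one keyword from the PII list."""
    return [r for r in replies if not (tokenize(r.text) & keywords)]


def rank_replies(title, replies):
    """Stand-in ranking: title overlap plus a small bonus for earlier replies."""
    return sorted(replies,
                  key=lambda r: title_overlap(title, r) + 1.0 / r.position,
                  reverse=True)


def build_knowledge_base(title, replies, store, top_k=2):
    """Run select -> filter -> rank -> list -> store for one thread."""
    ranked = rank_replies(title, filter_pii(select_replies(title, replies)))
    store[title] = [r.text for r in ranked[:top_k]]
    return store


# Usage with made-up forum data.
title = "How to reset router password"
replies = [
    Reply(1, True, "Hold the reset button for ten seconds to reset the router password."),
    Reply(2, False, "Thanks, that worked!"),
    Reply(3, False, "Send me your email and I will explain the router reset."),
    Reply(4, False, "The router password is printed on the label."),
]
store = build_knowledge_base(title, replies, {})
```

In this made-up thread, the second reply is not selected because it shares no words with the title, the third is selected but then removed by the keyword filter because it contains "email", and the first and fourth are ranked and stored under the thread title, where a conversational agent could look them up.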
