INTEGRATING AND EXTRACTING TOPICS FROM CONTENT OF HETEROGENEOUS SOURCES

  • US 20150066904A1
  • Filed: 08/29/2013
  • Published: 03/05/2015
  • Est. Priority Date: 08/29/2013
  • Status: Active Grant
First Claim

1. A system for integrating and extracting topics from content of heterogeneous sources, the system comprising:

  • a processor to:

    identify a plurality of observed words in documents that are received from the heterogeneous sources;

    obtain document metadata and source metadata from the heterogeneous sources;

    use the document metadata to calculate a plurality of word topic probabilities for the plurality of observed words;

    use the source metadata to calculate a plurality of source topic probabilities for the plurality of observed words; and

    determine a latent topic for one of the documents based on the plurality of observed words, the plurality of word topic probabilities, and the plurality of source topic probabilities.
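
The final step of the claim, determining a latent topic from the observed words and the two sets of probabilities, can be illustrated with a minimal sketch. The claim does not state how the probabilities are combined or how the metadata produces them, so everything below is an assumption made for illustration: the function name `determine_latent_topic`, the nested-dictionary layout of the inputs, the log-sum scoring rule, and the toy numbers are all hypothetical and are not taken from the patent.

```python
import math


def determine_latent_topic(observed_words, word_topic_probs, source_topic_probs, topics):
    """Return (best_topic, scores) under a hypothetical scoring rule.

    Assumed layout (not from the patent):
      word_topic_probs[word][topic]   -> probability derived from document metadata
      source_topic_probs[word][topic] -> probability derived from source metadata
    The score of a topic is the sum, over observed words, of the logs of both
    probabilities, i.e. the log of their product.
    """
    eps = 1e-12  # floor so that missing or zero probabilities do not hit log(0)
    scores = {}
    for topic in topics:
        score = 0.0
        for word in observed_words:
            p_word = word_topic_probs.get(word, {}).get(topic, eps)
            p_source = source_topic_probs.get(word, {}).get(topic, eps)
            score += math.log(max(p_word, eps)) + math.log(max(p_source, eps))
        scores[topic] = score
    best_topic = max(scores, key=scores.get)
    return best_topic, scores


# Toy inputs, invented purely for illustration.
topics = ["sports", "finance"]
observed = ["goal", "market", "score"]
word_topic_probs = {
    "goal": {"sports": 0.7, "finance": 0.3},
    "market": {"sports": 0.2, "finance": 0.8},
    "score": {"sports": 0.8, "finance": 0.2},
}
source_topic_probs = {
    "goal": {"sports": 0.6, "finance": 0.4},
    "market": {"sports": 0.5, "finance": 0.5},
    "score": {"sports": 0.6, "finance": 0.4},
}

best, scores = determine_latent_topic(observed, word_topic_probs, source_topic_probs, topics)
print(best)
```

Working in log space keeps the product of many small probabilities from underflowing; any other combination rule could be substituted without changing the overall shape of the step.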

  • 2 Assignments