MINING MULTILINGUAL TOPICS

  • US 20150046459A1
  • Filed: 08/28/2014
  • Published: 02/12/2015
  • Est. Priority Date: 04/15/2010
  • Status: Active Grant
First Claim

1. A method comprising:

  • identifying multiple concept-units from a multi-language document corpus, a particular concept-unit including a set of documents in different languages describing a particular concept;

    modeling the concept-units of the multi-language document corpus to create a generative model, wherein the generative model represents at least:

    (a) a plurality of universal topics, each of the universal topics being defined by a plurality of topic word distributions corresponding respectively to the different languages;

    (b) a topic distribution for each concept-unit, wherein the documents of any single concept-unit are constrained within the generative model to share a common topic distribution;

    inferring the plurality of universal topics from the documents of the concept-units based on the generative model.
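
The generative structure recited in the claim can be illustrated with a minimal sketch. It covers only the forward (sampling) direction, not the inference step. The claim does not specify priors or sampling details, so the Dirichlet prior, the `alpha` value, the toy vocabulary, and all function names below are assumptions introduced for illustration only.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_concept_unit(topic_word, languages, doc_len, alpha, rng):
    """Generate one concept-unit: one document per language.

    topic_word[k][lang] maps word -> probability for universal topic k,
    i.e. each universal topic has one word distribution per language.
    All documents in the unit share a single topic distribution theta.
    """
    num_topics = len(topic_word)
    # Assumed symmetric Dirichlet prior; the claim does not name a prior.
    theta = sample_dirichlet([alpha] * num_topics, rng)
    docs = {}
    for lang in languages:
        words = []
        for _ in range(doc_len):
            # Pick a universal topic from the shared distribution ...
            k = rng.choices(range(num_topics), weights=theta)[0]
            # ... then pick a word from that topic's distribution for this language.
            dist = topic_word[k][lang]
            words.append(rng.choices(list(dist), weights=list(dist.values()))[0])
        docs[lang] = words
    return theta, docs


# Hypothetical toy data: two universal topics, each with an English and a German
# word distribution.
TOPIC_WORD = [
    {"en": {"goal": 0.5, "match": 0.5}, "de": {"Tor": 0.5, "Spiel": 0.5}},
    {"en": {"vote": 0.5, "party": 0.5}, "de": {"Wahl": 0.5, "Partei": 0.5}},
]

rng = random.Random(0)
theta, docs = generate_concept_unit(TOPIC_WORD, ["en", "de"], 8, 0.5, rng)
```

Because `theta` is drawn once per concept-unit and reused for every language, the English and German documents in `docs` are tied to the same topic mixture, which mirrors the constraint in element (b).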

  • 3 Assignments