Text segmentation and identification of topic using language models

  • US 6,052,657 A
  • Filed: 11/25/1997
  • Issued: 04/18/2000
  • Est. Priority Date: 09/09/1997
  • Status: Expired due to Term
First Claim

1. A method for segmenting a stream of text into segments using a plurality of language models, the stream of text including a sequence of blocks of text, the method comprising:

  • scoring the blocks of text against the language models to generate language model scores for the blocks of text, the language model score for a block of text against a language model indicating a correlation between the block of text and the language model;

  • generating language model sequence scores for different sequences of language models to which a sequence of blocks of text may correspond, a language model sequence score being a function of the scores of a sequence of blocks of text against a sequence of language models;

  • selecting a sequence of language models that satisfies a predetermined condition; and

  • identifying segment boundaries in the stream of text that correspond to language model transitions in the selected sequence of language models.
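
The four steps of the claim can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the language models are add-one smoothed unigram models, the score is a negative log-likelihood (lower means a closer match), the "predetermined condition" is taken to be the lowest total sequence score, and a hypothetical `switch_penalty` is charged whenever consecutive blocks are assigned to different models. The function names (`train_unigram`, `score_block`, `segment`) and the toy data are likewise hypothetical.

```python
import math
from collections import Counter


def train_unigram(text, vocab):
    """Build an add-one smoothed unigram model: word -> probability (assumed model type)."""
    counts = Counter(text.split())
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def score_block(block, model):
    """Negative log-likelihood of a block under a model; words outside the vocabulary are skipped."""
    return -sum(math.log(model[w]) for w in block.split() if w in model)


def segment(blocks, models, switch_penalty=1.0):
    """Return (model name per block, indices of blocks that start a new segment)."""
    if not blocks:
        return [], []
    names = list(models)
    n = len(names)
    # Step 1: score every block against every language model.
    scores = [[score_block(b, models[name]) for name in names] for b in blocks]
    # Step 2: accumulate sequence scores with dynamic programming, so that
    # every sequence of models is considered without enumerating them all.
    best = list(scores[0])
    back = []
    for t in range(1, len(blocks)):
        new_best, pointers = [], []
        for j in range(n):
            costs = [best[i] + (0.0 if i == j else switch_penalty) for i in range(n)]
            i_min = min(range(n), key=costs.__getitem__)
            new_best.append(costs[i_min] + scores[t][j])
            pointers.append(i_min)
        best = new_best
        back.append(pointers)
    # Step 3: select the sequence with the lowest total score (assumed condition).
    j = min(range(n), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    # Step 4: a segment boundary falls wherever the selected model changes.
    boundaries = [t for t in range(1, len(path)) if path[t] != path[t - 1]]
    return [names[k] for k in path], boundaries


# Toy usage with two hypothetical topic models.
topics = {
    "sports": "goal team match score team player goal",
    "finance": "stock market bank price stock share market",
}
vocab = {w for text in topics.values() for w in text.split()}
models = {name: train_unigram(text, vocab) for name, text in topics.items()}
blocks = ["team goal match", "player score goal", "stock market price", "bank share stock"]
labels, boundaries = segment(blocks, models, switch_penalty=1.0)
print(labels, boundaries)
```

On this toy input the first two blocks are assigned to the sports model and the last two to the finance model, so a single boundary is reported before the third block (index 2). The dynamic-programming search is one common way to pick a best-scoring sequence; the claim itself does not prescribe a particular search procedure.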
