Automatic generation of superwords
Abstract
This invention is directed to the automatic selection and generation of superwords based on a criterion relevant to speech recognition. The term superword refers to a word combination that is spoken so often that a recognition device can recognize it as a single word.
21 Claims
1. A method for determining superwords, comprising the steps of:
- generating a set of candidate phrases from a database of observed sequences of words, symbols and/or sounds, the database having a perplexity value based on a language model;
- incorporating one of the candidate phrases from the database into the language model;
- analyzing how the perplexity value of the database, which is based on the language model, is affected by the incorporated candidate phrase; and
- determining if the candidate phrase is a superword based on the analyzing step.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
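The four steps of claim 1 can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the add-one-smoothed bigram model, the restriction to two-word candidates, the `min_count` threshold, normalizing perplexity by the original word count, and the "perplexity must decrease" acceptance rule are all assumptions, and every function name is hypothetical.

```python
import math
from collections import Counter


def candidate_phrases(sentences, min_count=2):
    """Step 1 (assumed form): adjacent word pairs seen at least min_count times."""
    counts = Counter()
    for words in sentences:
        counts.update(zip(words, words[1:]))
    return [pair for pair, count in counts.most_common() if count >= min_count]


def merge_phrase(sentences, phrase):
    """Step 2 (assumed form): rewrite the corpus so `phrase` becomes one token."""
    n = len(phrase)
    joined = "_".join(phrase)
    merged = []
    for words in sentences:
        out, i = [], 0
        while i < len(words):
            if tuple(words[i:i + n]) == phrase:
                out.append(joined)
                i += n
            else:
                out.append(words[i])
                i += 1
        merged.append(out)
    return merged


def perplexity(sentences, num_words):
    """Perplexity of the corpus under an add-one-smoothed bigram model
    estimated from that same corpus. Dividing by num_words (the original
    word count) keeps values comparable before and after merging."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for words in sentences:
        tokens = ["<s>"] + words
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(words)
    log_prob = 0.0
    for words in sentences:
        tokens = ["<s>"] + words
        for prev, cur in zip(tokens, tokens[1:]):
            prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
            log_prob += math.log(prob)
    return math.exp(-log_prob / num_words)


def find_superwords(sentences, min_count=2):
    """Steps 3 and 4 (assumed rule): keep a candidate if merging it
    lowers the perplexity relative to the unmerged baseline."""
    num_words = sum(len(words) for words in sentences)
    baseline = perplexity(sentences, num_words)
    superwords = []
    for phrase in candidate_phrases(sentences, min_count):
        trial = merge_phrase(sentences, phrase)
        if perplexity(trial, num_words) < baseline:
            superwords.append(phrase)
    return superwords


if __name__ == "__main__":
    corpus = [s.split() for s in ["i want to go", "i want to call", "you want to go"]]
    print(find_superwords(corpus))
```

In this sketch each candidate is tested independently against the same baseline. An iterative variant that keeps accepted superwords in the corpus before testing the next candidate would be an equally plausible reading of the claim.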
9. An apparatus that determines superwords, comprising:
- a database of observed sequences of words, symbols and/or sounds, the database having a perplexity value based on a language model;
- generating means for generating candidate phrases from the database;
- input means for incorporating one of the candidate phrases from the database into the language model;
- analyzing means for analyzing how the perplexity value of the database, which is based on the language model, is affected by the incorporated candidate phrase, the analyzing means producing an output; and
- determining means for determining if the candidate phrase is a superword from the output of the analyzing means.
Dependent claims: 10, 11, 12, 13, 14, 15, 16.
17. An apparatus that generates and selects superwords, comprising:
- a database of observed sequences of words, symbols and/or sounds, the database having a perplexity value based on a language model; and
- a superword selector that generates candidate phrases from the database, the selector incorporating one of the candidate phrases from the database into the language model, the selector further analyzing how the perplexity value of the database, which is based on the language model, is affected by the incorporated candidate phrase and determining whether the candidate phrase is a superword based on the analysis.
Dependent claims: 18, 19, 20, 21.
Specification