Apparatus and methods for pronunciation lexicon compression
Abstract
A compressed pronunciation lexicon file is generated from a source pronunciation lexicon using a pronunciation prediction algorithm in a multi-output mode. The pronunciation prediction algorithm may generate a deterministic ordered list of phoneme strings from the textual representation of a particular word. The compressed pronunciation lexicon file may include a sorted list of records of compressed textual representations of words and compressed phonetic representations of the words.
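One way to picture the multi-output mode is sketched below in Python. The abstract does not say how the compressed phonetic representation is stored, so this sketch assumes it is the position of the correct pronunciation in the predictor's ordered candidate list, with a literal fallback when the predictor misses. `TOY_RULES`, `predict_candidates`, `compress_pronunciation`, and `decompress_pronunciation` are hypothetical names invented for illustration, and the toy letter-to-phoneme table stands in for a real prediction algorithm.

```python
from itertools import product

# Hypothetical letter-to-phoneme alternatives (an assumption for illustration;
# a real pronunciation prediction algorithm would be far richer).
TOY_RULES = {
    "c": ["k", "s"],
    "a": ["ae", "ey"],
    "t": ["t"],
}

def predict_candidates(word):
    """Return a deterministic, ordered list of candidate phoneme strings."""
    options = [TOY_RULES.get(ch, [ch]) for ch in word]
    return [" ".join(combo) for combo in product(*options)]

def compress_pronunciation(word, phonemes):
    """Store the candidate index if the predictor produces the pronunciation."""
    candidates = predict_candidates(word)
    if phonemes in candidates:
        return ("index", candidates.index(phonemes))
    # Assumed fallback: keep the phoneme string as-is when prediction fails.
    return ("literal", phonemes)

def decompress_pronunciation(word, record):
    """Re-run the same predictor and pick the stored candidate."""
    kind, value = record
    return predict_candidates(word)[value] if kind == "index" else value

print(predict_candidates("cat"))
# → ['k ae t', 'k ey t', 's ae t', 's ey t']
print(compress_pronunciation("cat", "k ae t"))
# → ('index', 0)
```

Because the candidate order is deterministic, a decoder that runs the same predictor can recover the phoneme string from the small index alone.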
16 Claims
1. A method comprising:
- generating a compressed pronunciation lexicon file from a source pronunciation lexicon using a pronunciation prediction algorithm in a multi-output mode;
wherein the source pronunciation lexicon includes a list of textual representations of words and corresponding phonetic representations of the words; and
wherein generating the compressed lexicon file further includes:
sorting the list in alphabetical order of the textual representations to generate a sorted list;
substituting a textual representation of a particular word with i) the length of a common string of initial letters of the textual representation and the textual representation of a preceding word in the list, and ii) an encoded version of the remaining letters of the textual representation of the particular word.
Dependent claims: 2, 3, 4, 5, 6, 7
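The sorting and substitution steps recited in claim 1 resemble the technique commonly called front coding. The Python sketch below illustrates them under stated assumptions. The claim does not specify how the remaining letters are encoded, so the suffix is kept here as plain text. `common_prefix_len` and `front_encode` are hypothetical helper names, not taken from the patent.

```python
def common_prefix_len(a, b):
    """Length of the common string of initial letters of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def front_encode(words):
    """Sort the words, then replace each with (prefix length, remaining letters)."""
    records = []
    previous = ""  # the first word has no preceding word, so its prefix length is 0
    for word in sorted(words):
        k = common_prefix_len(previous, word)
        # Assumption: the "encoded version" of the suffix is left as plain text.
        records.append((k, word[k:]))
        previous = word
    return records

print(front_encode(["cat", "car", "card", "care"]))
# → [(0, 'car'), (3, 'd'), (3, 'e'), (2, 't')]
```

Alphabetical sorting matters here because neighbouring words tend to share long prefixes, which keeps the stored suffixes short.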
8. An article comprising a computer-readable storage medium having stored thereon a compressed pronunciation lexicon file comprising:
- a sorted list of records of textual representations of words and compressed phonetic representations of the words, the compressed phonetic representations generated using a pronunciation prediction algorithm in a multi-output mode,
wherein the textual representations of words are compressed textual representations including prefix lengths and encoded suffixes,
wherein a prefix length of a particular word is the length of a common string of initial letters of a textual representation of the particular word and the textual representation of a preceding word in the sorted list, and
wherein an encoded suffix is an encoded version of the remaining letters of the textual representation of the particular word.
Dependent claims: 9
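Records of the kind recited in claim 8 can be expanded back into full words by walking the sorted list in order. The Python sketch below shows this under the assumption that each record is a (prefix length, suffix) pair whose suffix is stored as plain text. The claim leaves the suffix encoding open, and `front_decode` is a hypothetical name.

```python
def front_decode(records):
    """Rebuild the sorted word list from (prefix length, suffix) records."""
    words = []
    previous = ""  # no preceding word before the first record
    for prefix_len, suffix in records:
        # Reuse the shared initial letters of the preceding word, then append the suffix.
        word = previous[:prefix_len] + suffix
        words.append(word)
        previous = word
    return words

print(front_decode([(0, "car"), (3, "d"), (3, "e"), (2, "t")]))
# → ['car', 'card', 'care', 'cat']
```

Each word depends on the one before it, so a reader of such a file decodes sequentially from the start of the list (or from some known full word).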
10. An article comprising a computer-readable storage medium having stored thereon instructions that, when executed by a processor, result in:
- generating a compressed pronunciation lexicon file from a source pronunciation lexicon using a pronunciation prediction algorithm in a multi-output mode;
wherein the source pronunciation lexicon includes a list of textual representations of words and corresponding phonetic representations of the words; and
wherein generating the compressed lexicon file further includes:
sorting the list in alphabetical order of the textual representations to generate a sorted list;
substituting a textual representation of a particular word with i) the length of a common string of initial letters of the textual representation and the textual representation of a preceding word in the list, and ii) an encoded version of the remaining letters of the textual representation of the particular word.
Dependent claims: 11, 12, 13, 14, 15, 16
Specification