Method for compressing dictionary data
Abstract
The invention relates to pre-processing of a pronunciation dictionary for compression in a data processing device, the pronunciation dictionary comprising at least one entry, the entry comprising a sequence of character units and a sequence of phoneme units. According to one aspect of the invention the sequence of character units and the sequence of phoneme units are aligned using a statistical algorithm. The aligned sequence of character units and aligned sequence of phoneme units are interleaved by inserting each phoneme unit at a predetermined location relative to the corresponding character unit.
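The interleaving step described in the abstract can be illustrated with a minimal sketch. It assumes the alignment has already been computed (the statistical alignment algorithm is not shown), that empty units are marked with a hypothetical placeholder `_`, and that the "predetermined location" is the slot immediately after each character unit. The function name, the placeholder, and the phoneme symbols are illustrative assumptions, not taken from the patent.

```python
EPS = "_"  # hypothetical placeholder for an empty unit inserted by the alignment


def interleave(chars, phones):
    """Interleave two aligned, equal-length sequences.

    Assumption: each phoneme unit is placed directly after its
    corresponding character unit (one possible "predetermined location").
    """
    if len(chars) != len(phones):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for c, p in zip(chars, phones):
        out.extend((c, p))
    return out


# Hypothetical aligned entry for the word "father"; symbols are illustrative only.
chars = ["f", "a", "t", "h", "e", "r"]
phones = ["f", "A:", "D", EPS, "@", EPS]
entry = interleave(chars, phones)
```

Under this layout, `entry[0::2]` recovers the character units and `entry[1::2]` recovers the phoneme units, so a single interleaved sequence carries both.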
24 Claims
1. An electronic device comprising a processing unit and a memory for storing a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the units of the first set and the units of the second set being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit, wherein
the electronic device is configured to find a matching entry for a text string input from the pre-processed pronunciation dictionary using said first set of units of the entry from the predetermined locations;
the electronic device is configured to select from said matching entry phoneme units of said second set of units from predetermined locations; and
the electronic device is configured to concatenate the selected phoneme units into a sequence of phoneme units.
6. An electronic device comprising a processing unit and memory for storing a pre-processed pronunciation dictionary including entries, the entries having a first set of units having character units and a second set of units having phoneme units;
the units of the first set and the units of the second set being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
the electronic device is configured to store or create pronunciation models of each entry's phonemic representation;
the electronic device is configured to find a matching entry for a speech information input by comparing said speech information input to said pronunciation models and selecting the most corresponding entry;
the electronic device is configured to select from said matching entry character units of said first set of units from predetermined locations; and
the electronic device is configured to concatenate the selected character units into a sequence of character units.
12. A system comprising a first electronic device and a second electronic device arranged in a communication connection with each other, the system being configured to convert a text string input into a sequence of phoneme units, wherein:
the first electronic device comprises means for storing a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, wherein entries are aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
the first electronic device comprises means for finding a matching entry for said text string input from the pre-processed pronunciation dictionary using said first set of units of the entry;
the first electronic device comprises means for transmitting said matching entry to the second electronic device;
the second electronic device comprises means for receiving said matching entry from the first electronic device;
the second electronic device comprises means for selecting from said matching entry units of said second set of units and concatenating them into a sequence of phoneme units; and
the second electronic device comprises means for removing empty spaces from said sequence of phoneme units.
14. A method for converting a text string input into a sequence of phoneme units, the method comprising:
accessing a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the units of the first set and the units of the second set being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
finding a matching entry from the pre-processed pronunciation dictionary for a text string input using said first set of units of the entry from the predetermined locations and ignoring empty spaces;
selecting from said matching entry phoneme units of said second set of units from the predetermined locations; and
concatenating the selected phoneme units into a sequence of phoneme units.
18. A method for converting a speech information input into a sequence of character units, the method comprising:
accessing a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the units of the first set and the units of the second set being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
obtaining pronunciation models of each entry's phonemic representation;
finding a matching entry for a speech information input by comparing said speech information input to said pronunciation models and selecting the most corresponding entry;
selecting from said matching entry character units of said first set of units from the predetermined locations; and
concatenating the selected character units into a sequence of character units.
21. A computer readable medium storing a computer program product comprising code which is executable in an electronic device for causing the electronic device to:
access a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the first set of units and the second set of units being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
find a matching entry from the pre-processed pronunciation dictionary for a text string input using said first set of units of the entry from the predetermined locations and ignoring empty spaces; and
select from said matching entry phoneme units of said second set of units from the predetermined locations and concatenate them into a sequence of phoneme units.
23. A computer readable medium storing a computer program product comprising code which is executable in an electronic device for causing the electronic device to:
access a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the first set of units and the second set of units being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
store or create pronunciation models of each entry's phonemic representation;
find a matching entry for a speech information input by comparing said speech information input to said pronunciation models and selecting the most corresponding entry;
select from said matching entry character units of said first set of units from the predetermined locations and concatenate them into a sequence of character units.
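The text-to-phoneme lookup recited in the method claims (find a matching entry via the character units, ignoring empty spaces, then select and concatenate the phoneme units) can be sketched as follows. This is a simplified illustration under stated assumptions: entries are flat lists with character units at even positions and phoneme units at odd positions, empty units are marked with a hypothetical `_`, each character unit is a single character, and lookup is a linear scan. The function names, dictionary contents, and phoneme symbols are illustrative and not taken from the patent.

```python
EPS = "_"  # hypothetical marker for an empty unit

# Hypothetical pre-processed entries: character unit, phoneme unit, character unit, ...
DICTIONARY = [
    ["f", "f", "a", "A:", "t", "D", "h", EPS, "e", "@", "r", EPS],  # "father"
    ["b", "b", "o", "Q", "x", "k", EPS, "s"],                       # "box"
]


def char_units(entry):
    """Character units sit at even positions; empty units are ignored."""
    return [u for u in entry[0::2] if u != EPS]


def phoneme_units(entry):
    """Phoneme units sit at odd positions; empty units are removed."""
    return [u for u in entry[1::2] if u != EPS]


def text_to_phonemes(text, dictionary):
    """Return the phoneme sequence of the first matching entry, or None."""
    target = list(text)  # assumption: one character per character unit
    for entry in dictionary:
        if char_units(entry) == target:
            return phoneme_units(entry)
    return None


print(text_to_phonemes("box", DICTIONARY))
```

The speech-to-text direction in the other claims would reuse `char_units` on whichever entry the pronunciation models select; the acoustic matching itself is outside the scope of this sketch.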
Specification