Method for compressing dictionary data

US 20070073541A1
Filed: 11/29/2006
Published: 03/29/2007
Est. Priority Date: 11/12/2001
Status: Abandoned Application

Abstract

The invention relates to pre-processing of a pronunciation dictionary for compression in a data processing device, the pronunciation dictionary comprising at least one entry, the entry comprising a sequence of character units and a sequence of phoneme units. According to one aspect of the invention the sequence of character units and the sequence of phoneme units are aligned using a statistical algorithm. The aligned sequence of character units and aligned sequence of phoneme units are interleaved by inserting each phoneme unit at a predetermined location relative to the corresponding character unit.
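
As a rough illustration of the interleaving step described in the abstract, the sketch below takes two sequences that are assumed to be already aligned (the statistical alignment itself is not shown) and places each phoneme unit immediately after its character unit. The placeholder `_` for an empty unit, the function name `interleave`, the sample word, and its phoneme symbols are assumptions made for this example, not details taken from the application.

```python
EMPTY = "_"  # assumed placeholder for an empty unit


def interleave(chars, phones):
    """Place each phoneme unit right after its aligned character unit."""
    if len(chars) != len(phones):
        raise ValueError("aligned sequences must have equal length")
    entry = []
    for char_unit, phone_unit in zip(chars, phones):
        entry.append(char_unit)   # even offset: character unit
        entry.append(phone_unit)  # odd offset: phoneme unit
    return entry


# Hypothetical aligned pair: the letter "x" maps to two phonemes,
# so the character side is padded with one empty unit.
aligned_chars = ["b", "o", "x", EMPTY]
aligned_phones = ["b", "Q", "k", "s"]

entry = interleave(aligned_chars, aligned_phones)
print("".join(entry))  # bboQxk_s
```

With this layout the "predetermined location" is simply the next offset: characters sit at even positions and phonemes at odd positions. Other fixed layouts would serve equally well.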

176 Citations

24 Claims

1. An electronic device comprising a processing unit and a memory for storing a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the units of the first set and the units of the second set being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit, wherein the electronic device is configured to find a matching entry for a text string input from the pre-processed pronunciation dictionary using said first set of units of the entry from the predetermined locations;
- the electronic device is configured to select from said matching entry phoneme units of said second set of units from predetermined locations; and
  
  the electronic device is configured to concatenate the selected phoneme units into a sequence of phoneme units.
- - 2. The electronic device of claim 1, wherein the electronic device is configured to remove empty spaces from said sequence of phoneme units.
  - 3. The electronic device of claim 1, wherein the electronic device is configured to retrieve from the memory a pronunciation model for each phoneme unit in said sequence of phoneme units, and the electronic device is configured to concatenate the pronunciation models.
  - 4. The electronic device of claim 1, wherein the electronic device is configured to map each phoneme unit from a first phonemic representation method to a second phonemic representation method.
  - 5. The electronic device of claim 1, wherein the electronic device is a mobile communications device.
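
The text-to-phoneme lookup of claims 1 and 2 can be sketched as follows. This is a minimal illustration that assumes entries are stored as strings with character units at even offsets and single-symbol phoneme units at odd offsets, and that `_` marks an empty unit. The names `find_entry` and `entry_to_phonemes` and the two sample entries are hypothetical. Skipping empty units on the character side during matching is also an assumption of the sketch.

```python
EMPTY = "_"  # assumed marker for an empty unit


def find_entry(text, dictionary):
    """Return the first entry whose character units spell `text`."""
    for entry in dictionary:
        chars = entry[0::2]  # character units sit at even offsets
        if chars.replace(EMPTY, "") == text:
            return entry
    return None


def entry_to_phonemes(entry):
    """Select the phoneme units, concatenate them, drop empty units."""
    selected = entry[1::2]        # phoneme units sit at odd offsets
    sequence = "".join(selected)  # concatenate into one sequence
    return sequence.replace(EMPTY, "")


# Hypothetical pre-processed dictionary with two interleaved entries.
dictionary = ["tTh_iInn", "bboQxk_s"]

entry = find_entry("box", dictionary)
print(entry_to_phonemes(entry))  # bQks
```

Because both kinds of unit live in one string, a single scan finds the entry and a single slice yields its pronunciation.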

6. An electronic device comprising a processing unit and memory for storing a pre-processed pronunciation dictionary including entries, the entries having a first set of units having character units and a second set of units having phoneme units;
- the units of the first set and the units of the second set being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
  
  the electronic device is configured to store or create pronunciation models of each entry's phonemic representation;
  
  the electronic device is configured to find a matching entry for a speech information input by comparing said speech information input to said pronunciation models and selecting the most corresponding entry;
  
  the electronic device is configured to select from said matching entry character units of said first set of units from predetermined locations; and
  
  the electronic device is configured to concatenate the selected character units into a sequence of character units.
- - 7. The electronic device of claim 6, wherein the electronic device is configured to remove empty spaces from said sequence of character units.
  - 8. The electronic device of claim 6, wherein, for creating the pronunciation models, the electronic device is configured to:
    - find a matching entry for a text string input from the pre-processed pronunciation dictionary using said first set of units of the entry from the predetermined locations;
      
      select from said matching entry phoneme units of said second set of units from predetermined locations and concatenate them into a sequence of phoneme units;
      
      remove empty spaces from said sequence of phoneme units;
      
      retrieve from the memory a pronunciation model for each phoneme unit in said sequence of phoneme units, and concatenate the pronunciation models.
  - 9. The electronic device of claim 8, wherein the electronic device is configured to map each phoneme unit from a first phonemic representation method to a second phonemic representation method.
  - 10. The electronic device of claim 6, wherein the electronic device is configured to receive the pre-processed pronunciation dictionary and the pronunciation models, and the electronic device is configured to store the received pre-processed pronunciation dictionary and the pronunciation models into the memory.
  - 11. The electronic device of claim 6, wherein the electronic device is a mobile communications device.
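
The speech-to-text direction of claims 6 to 8 can be illustrated with a toy example. The sketch below builds a model for each entry by concatenating per-phoneme models, compares the input against every entry model, and reads the character units of the closest entry. Everything concrete here is an assumption: the one-dimensional "feature frames" in `PHONE_MODELS`, the use of a plain dynamic time warping distance as the comparison, the `_` empty marker, and the sample entries. A real recognizer would use proper acoustic models.

```python
EMPTY = "_"  # assumed marker for an empty unit

# Hypothetical per-phoneme models: short lists of 1-D feature frames.
PHONE_MODELS = {
    "T": [1.0, 1.2], "I": [3.0], "n": [5.0, 5.1],
    "b": [0.2], "Q": [4.0, 4.2], "k": [2.0], "s": [6.0, 6.1],
}


def entry_model(entry):
    """Concatenate the models of the entry's non-empty phoneme units."""
    frames = []
    for phone in entry[1::2]:  # phoneme units sit at odd offsets
        if phone != EMPTY:
            frames.extend(PHONE_MODELS[phone])
    return frames


def dtw(a, b):
    """Dynamic time warping distance between two frame sequences."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[len(a)][len(b)]


def speech_to_text(frames, dictionary):
    """Pick the closest entry and return its character units."""
    best = min(dictionary, key=lambda e: dtw(frames, entry_model(e)))
    return best[0::2].replace(EMPTY, "")  # characters at even offsets


dictionary = ["tTh_iInn", "bboQxk_s"]
utterance = [1.1, 1.1, 3.2, 4.9, 5.0]  # made-up input frames
print(speech_to_text(utterance, dictionary))  # thin
```

The same interleaved entry serves both directions: its odd offsets drive model construction and its even offsets give the recognized text.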

12. A system comprising a first electronic device and a second electronic device arranged in a communication connection with each other, the system being configured to convert a text string input into a sequence of phoneme units, wherein:
- the first electronic device comprises means for storing a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, wherein entries are aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
  
  the first electronic device comprises means for finding a matching entry for said text string input from the pre-processed pronunciation dictionary using said first set of units of the entry;
  
  the first electronic device comprises means for transmitting said matching entry to the second electronic device;
  
  the second electronic device comprises means for receiving said matching entry from the first electronic device;
  
  the second electronic device comprises means for selecting from said matching entry units of said second set of units and concatenating them into a sequence of phoneme units; and
  
  the second electronic device comprises means for removing empty spaces from said sequence of phoneme units.
- - 13. The system of claim 12, wherein the second electronic device is configured to map each phoneme unit from a first phonemic representation method to a second phonemic representation method.
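
The division of work between the two devices in claim 12 might look like the sketch below, where the first device only searches and transmits the matching interleaved entry, and the second device selects the phoneme units and removes empty units. The function names, the ASCII byte payload standing in for the communication connection, the `_` empty marker, and the sample entries are all assumptions for illustration.

```python
EMPTY = "_"  # assumed marker for an empty unit


def first_device_lookup(text, dictionary):
    """First device: find the matching entry and return it as a payload."""
    for entry in dictionary:
        # Character units sit at even offsets of the interleaved entry.
        if entry[0::2].replace(EMPTY, "") == text:
            return entry.encode("ascii")  # stand-in for transmission
    return None


def second_device_extract(payload):
    """Second device: select phoneme units and remove empty units."""
    entry = payload.decode("ascii")
    selected = entry[1::2]  # phoneme units sit at odd offsets
    return selected.replace(EMPTY, "")


dictionary = ["tTh_iInn", "bboQxk_s"]  # held by the first device only

payload = first_device_lookup("thin", dictionary)
print(second_device_extract(payload))  # TIn
```

In this arrangement the second device never needs the dictionary itself, only the layout convention of an entry.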

14. A method for converting a text string input into a sequence of phoneme units, the method comprising:
- accessing a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the units of the first set and the units of the second set being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
  
  finding a matching entry from the pre-processed pronunciation dictionary for a text string input using said first set of units of the entry from the predetermined locations and ignoring empty spaces;
  
  selecting from said matching entry phoneme units of said second set of units from the predetermined locations; and
  
  concatenating the selected phoneme units into a sequence of phoneme units.
- - 15. The method of claim 14, further comprising:
    - removing empty spaces from said sequence of phoneme units.
  - 16. The method of claim 14, further comprising:
    - retrieving a pronunciation model for each phoneme unit in said sequence of phoneme units; and
      
      concatenating the pronunciation models.
  - 17. The method of claim 14, further comprising:
    - mapping each phoneme unit from a first phonemic representation method to a second phonemic representation method.
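
The mapping step of claim 17 amounts to a table lookup per phoneme unit. The sketch below assumes the dictionary stores compact single-character symbols and that a later stage expects longer labels. Both notations and the table `TO_SECOND_NOTATION` are hypothetical and do not come from the application.

```python
# Hypothetical table from a compact single-character notation
# to a second, multi-character notation.
TO_SECOND_NOTATION = {
    "T": "TH",
    "I": "IH",
    "n": "N",
    "b": "B",
    "Q": "AO",
    "k": "K",
    "s": "S",
}


def map_phonemes(phones, table):
    """Map each phoneme unit to the second notation.

    Raises KeyError if a unit has no counterpart in the table.
    """
    return [table[phone] for phone in phones]


# "TIn" is the phoneme sequence for a made-up entry spelling "thin".
print(map_phonemes("TIn", TO_SECOND_NOTATION))  # ['TH', 'IH', 'N']
```

Storing the compact notation in the dictionary and expanding it only on lookup keeps each stored phoneme unit to one symbol.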

18. A method for converting a speech information input into a sequence of character units, the method comprising:
- accessing a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the units of the first set and the units of the second set being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit, obtaining pronunciation models of each entry's phonemic representation;
  
  finding a matching entry for a speech information input by comparing said speech information input to said pronunciation models and selecting the most corresponding entry;
  
  selecting from said matching entry character units of said first set of units from the predetermined locations; and
  
  concatenating the selected character units into a sequence of character units.
- - 19. The method of claim 18, further comprising:
    - removing empty spaces from said sequence of character units.
  - 20. The method of claim 18, wherein the step of obtaining the pronunciation models comprises:
    - finding a matching entry for a text string input from the pre-processed pronunciation dictionary using said first set of units of the entry from the predetermined locations;
      
      selecting from said matching entry phoneme units of said second set of units from predetermined locations and concatenating them into a sequence of phoneme units;
      
      removing empty spaces from said sequence of phoneme units;
      
      retrieving from the memory a pronunciation model for each phoneme unit in said sequence of phoneme units, and concatenating the pronunciation models.

21. A computer readable medium storing a computer program product comprising code which is executable in an electronic device for causing the electronic device to:
- access a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the first set of units and the second set of units being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
  
  find a matching entry from the pre-processed pronunciation dictionary for a text string input using said first set of units of the entry from the predetermined locations and ignoring empty spaces; and
  
  select from said matching entry phoneme units of said second set of units from the predetermined locations and concatenate them into a sequence of phoneme units.
- - 22. The computer readable medium of claim 21, further comprising code for mapping each phoneme unit from a first phonemic representation method to a second phonemic representation method.

23. A computer readable medium storing a computer program product comprising code which is executable in an electronic device for causing the electronic device to:
- access a pre-processed pronunciation dictionary including a first set of units having character units and a second set of units having phoneme units, the first set of units and the second set of units being aligned and interleaved by having each phoneme unit at a predetermined location relative to the corresponding character unit;
  
  store or create pronunciation models of each entry's phonemic representation;
  
  find a matching entry for a speech information input by comparing said speech information input to said pronunciation models and selecting the most corresponding entry;
  
  select from said matching entry character units of said first set of units from the predetermined locations and concatenate them into a sequence of character units.
- - 24. The computer readable medium of claim 23, further comprising code for removing empty spaces from said sequence of character units.

Current Assignee: Nokia Corporation
Original Assignee: Nokia Corporation
Inventors: Tian, Jilei
Application Number: US11/605,655
Publication Number: US 20070073541A1
US Class Current: 704/253
CPC Class Codes:
G10L 15/12   using dynamic programming t...
G10L 2015/025   Phonemes, fenemes or fenone...
H03M 7/30   Compression speech analysis...
