Method and system for configurable allocation of sound segments for use in concatenative text-to-speech voice synthesis

US 20070073542A1
Filed: 09/23/2005
Published: 03/29/2007
Est. Priority Date: 09/23/2005
Status: Abandoned Application

Abstract

Embodiments of the present invention provide a method, system and computer program product for synthesizing concatenative speech by allocating speech segments according to their frequency of access during speech synthesis, so that frequently used speech segments are stored in memory, where they can be accessed quickly. Speech data is recorded in separate files, from which individual speech units are identified. The method and system analyze how often each speech unit is accessed during synthesis and use this data to sort the speech units by frequency of access. Speech units that are accessed more frequently than others are loaded into memory for fast access during subsequent speech synthesis; speech units that are used less frequently can be stored on a data storage disk. The invention can also adapt dynamically, moving units from memory to disk or vice versa as their frequency of access changes or to account for a change in the user's system requirements.
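
As an illustration of the analysis step described in the abstract, the following minimal sketch counts how often each speech unit is accessed and ranks the units from most to least used. The unit identifiers, the `access_log` list and the function name `sort_units_by_access` are hypothetical and do not come from the application, which does not specify a data format.

```python
from collections import Counter


def sort_units_by_access(access_log):
    """Return unit ids ordered from most to least frequently accessed.

    access_log is an assumed format: one unit id per access made
    during synthesis.
    """
    counts = Counter(access_log)
    # Descending by count; ties are broken by unit id so the order is repeatable.
    return sorted(counts, key=lambda unit: (-counts[unit], unit))


# Hypothetical log of unit accesses collected while synthesizing speech.
access_log = ["a1", "t3", "a1", "k2", "a1", "t3"]
ranked = sort_units_by_access(access_log)
print(ranked)
```

The ranked list is the input a partitioning step would need: units near the front are candidates for memory, units near the end for disk.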

17 Citations

20 Claims

1. A method of dynamically allocating speech segments used in a concatenative text-to-speech engine, the method comprising:
- determining memory capacity of a user computer adapted for playing a CTTS voice, wherein the user computer includes a data storage unit;
  
  sorting the speech segments according to their frequency of access during speech synthesis; and
  
  partitioning the speech segments between the computer memory and the data storage unit depending upon their frequency of access during speech synthesis.
- - 2. The method of claim 1, wherein partitioning the speech segments between the computer memory and the data storage unit includes:
    - establishing a frequency usage cutoff value; and
      
      loading into computer memory the speech segments having a frequency of use greater than the frequency usage cutoff value.
  - 3. The method of claim 1, wherein if speech segments stored in the data storage unit are accessed frequently during speech synthesis, re-allocating to computer memory the frequently accessed speech segments.
  - 4. The method of claim 3, wherein re-allocating to computer memory the frequently accessed speech segments is performed automatically.
  - 5. The method of claim 3, wherein re-allocating to computer memory the frequently accessed speech segments is performed manually.
  - 6. The method of claim 1, wherein partitioning the speech segments between the computer memory and the data storage unit depending upon their frequency of use comprises:
    - assigning a time offset value for each speech segment, the time offset value corresponding to the average time between speech segment access occurrences;
      
      determining a partition cutoff value; and
      
      comparing the time offset associated with the speech segment with the partition cutoff value, such that if the time offset value of the speech segment is greater than the partition cutoff value, partitioning the desired speech segment in the data storage unit, otherwise partitioning the desired speech segment in the memory unit.
  - 7. The method of claim 2, wherein the frequency usage cutoff value is related to the capacity of the computer memory.
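
Claims 2 and 7 can be read together as: pick a frequency usage cutoff that depends on the memory capacity, then load every segment whose frequency of use exceeds it. The sketch below is one possible reading, not the applicants' implementation. The per-segment `(access_count, size_bytes)` tuples, the byte budget and the rule "lowest cutoff whose selected segments still fit" are all assumptions made for illustration.

```python
def choose_cutoff(segments, memory_budget_bytes):
    """Pick the lowest cutoff such that all segments used more often fit in memory.

    segments maps a segment id to (access_count, size_bytes); both the
    mapping and the byte budget are assumed for this sketch.
    """
    # -1 means "load everything"; each distinct count is another candidate.
    candidates = [-1] + sorted({count for count, _ in segments.values()})
    for cutoff in candidates:
        needed = sum(size for count, size in segments.values() if count > cutoff)
        if needed <= memory_budget_bytes:
            return cutoff
    return candidates[-1]


def partition(segments, cutoff):
    """Split segment ids into (in_memory, on_disk) around the cutoff."""
    in_memory = sorted(s for s, (count, _) in segments.items() if count > cutoff)
    on_disk = sorted(s for s, (count, _) in segments.items() if count <= cutoff)
    return in_memory, on_disk


# Hypothetical usage statistics: id -> (access_count, size_bytes).
segments = {
    "a1": (120, 4000),
    "t3": (45, 3000),
    "k2": (45, 2500),
    "z9": (2, 5000),
}
cutoff = choose_cutoff(segments, memory_budget_bytes=8000)
in_memory, on_disk = partition(segments, cutoff)
print(cutoff, in_memory, on_disk)
```

Because the comparison is strict ("greater than the cutoff"), segments tied at the cutoff all stay on disk, which is why both `t3` and `k2` are left out here even though one of them would fit.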

8. A computer program product comprising a computer usable medium having computer usable program code for dynamically allocating speech segments used in a concatenative text-to-speech engine, said computer program product including:
- computer usable program code for determining memory capacity of a user computer adapted for playing of a CTTS voice, wherein the user computer includes a data storage unit;
  
  computer usable program code for sorting the speech segments according to their frequency of access during speech synthesis; and
  
  computer usable program code for partitioning the speech segments between the computer memory and the data storage unit depending upon their frequency of access during the speech synthesis.
- - 9. The computer program product of claim 8, wherein said computer usable program code for partitioning the speech segments between the computer memory and the data storage unit includes:
    - computer usable program code for establishing a frequency usage cutoff value; and
      
      computer usable program code for loading into computer memory the speech segments having a frequency of use greater than the frequency usage cutoff value.
  - 10. The computer program product of claim 8, further comprising computer usable program code for re-allocating to computer memory the frequently accessed speech segments if speech segments stored in the data storage unit are accessed frequently during speech synthesis.
  - 11. The computer program product of claim 10, wherein said computer usable program code for re-allocating to computer memory the frequently accessed speech segments comprises computer usable program code for automatically re-allocating to computer memory the frequently accessed speech segments.
  - 12. The computer program product of claim 10, wherein said computer usable program code for re-allocating to computer memory the frequently accessed speech segments comprises computer usable program code for manually re-allocating to computer memory the frequently accessed speech segments.
  - 13. The computer program product of claim 9, wherein said computer usable program code for partitioning the speech segments between the computer memory and the data storage unit depending upon their frequency of use comprises:
    - computer usable program code for assigning a time offset value for each speech segment, the time offset value corresponding to the average time between speech segment access occurrences;
      
      computer usable program code for determining a partition cutoff; and
      
      computer usable program code for comparing the time offset associated with the speech segment with the partition cutoff value, such that if the time offset value of the speech segment is greater than the partition cutoff value, partitioning the desired speech segment in the data storage unit, otherwise partitioning the desired speech segment in the memory unit.
  - 14. The computer program product of claim 10, wherein the frequency usage cutoff value is related to the capacity of the computer memory.
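
Claims 6 and 13 describe an alternative test based on a time offset, the average time between accesses of a segment. A minimal sketch follows, assuming access timestamps in seconds are available per segment. The names `average_time_offset` and `partition_by_time_offset`, and the choice to treat a segment with fewer than two accesses as having an infinite offset, are assumptions of this sketch and are not stated in the claims.

```python
import math


def average_time_offset(timestamps):
    """Average time between consecutive accesses of one segment."""
    if len(timestamps) < 2:
        # Assumption: with no gap to measure, treat the segment as rarely used.
        return math.inf
    ordered = sorted(timestamps)
    # The mean of consecutive gaps equals the total span over the number of gaps.
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def partition_by_time_offset(access_times, partition_cutoff):
    """Place each segment on disk if its offset exceeds the cutoff, else in memory."""
    placement = {}
    for seg_id, timestamps in access_times.items():
        offset = average_time_offset(timestamps)
        placement[seg_id] = "disk" if offset > partition_cutoff else "memory"
    return placement


# Hypothetical access timestamps in seconds.
access_times = {
    "a1": [0.0, 1.0, 2.0, 3.0],
    "t3": [0.0, 10.0, 20.0],
    "z9": [5.0],
}
placement = partition_by_time_offset(access_times, partition_cutoff=5.0)
print(placement)
```

A large offset means long gaps between uses, so the comparison runs in the opposite direction from the frequency cutoff: values above the partition cutoff go to the data storage unit.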

15. A system for dynamically allocating speech segments used in a concatenative text-to-speech engine, the system comprising:
- a computer, the computer including;
  
  a memory unit;
  
  a data storage unit adapted to store at least one file containing a plurality of speech segments; and
  
  a processor for sorting the speech segments based upon their frequency of access during speech synthesis, the processor adapted to allocate the frequently used speech segments to the memory unit.
- - 16. The system of claim 15, further including a frequency usage cutoff value and a usage frequency value associated with each speech segment, whereby during speech synthesis, the processor determines whether a desired speech segment resides in the memory unit or the data storage unit by comparing the desired speech segment's usage frequency value with the frequency usage cutoff value.
  - 17. The system of claim 15, wherein the processor re-allocates a speech segment stored in the data storage unit to the memory unit if the speech segment is accessed frequently during speech synthesis.
  - 18. The system of claim 17, wherein the re-allocation of the speech segment stored in the data storage unit to the memory unit is performed automatically.
  - 19. The system of claim 17, wherein the re-allocation of the speech segment stored in the data storage unit to the memory unit is performed manually.
  - 20. The system of claim 16, wherein the frequency usage cutoff value is related to the capacity of the computer memory.
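
The lookup in claim 16 and the automatic re-allocation in claims 17 and 18 could be combined as in the sketch below. It is an illustration under stated assumptions only: the class `SegmentStore`, the use of plain dictionaries to stand in for the memory unit and the data storage unit, and the decision to keep the disk copy after promotion are all hypothetical.

```python
class SegmentStore:
    """Toy model of a memory unit and a data storage unit for speech segments."""

    def __init__(self, disk, cutoff):
        self.disk = dict(disk)  # stands in for the data storage unit
        self.memory = {}        # stands in for the memory unit
        self.usage = {seg_id: 0 for seg_id in self.disk}
        self.cutoff = cutoff    # frequency usage cutoff value

    def location(self, seg_id):
        # Residence is decided by comparing the usage value with the cutoff.
        return "memory" if self.usage[seg_id] > self.cutoff else "disk"

    def fetch(self, seg_id):
        where = self.location(seg_id)
        data = self.memory[seg_id] if where == "memory" else self.disk[seg_id]
        self.usage[seg_id] += 1
        # Automatic re-allocation: promote once usage exceeds the cutoff.
        if where == "disk" and self.usage[seg_id] > self.cutoff:
            self.memory[seg_id] = self.disk[seg_id]  # disk copy is kept
        return data


# Hypothetical segment payloads.
store = SegmentStore({"a1": b"\x00\x01", "z9": b"\x02"}, cutoff=2)
for _ in range(3):
    store.fetch("a1")
store.fetch("z9")
print(store.location("a1"), store.location("z9"))
```

Promoting at the moment the usage count crosses the cutoff keeps the comparison in `location` consistent with where the data actually is, so no separate index of resident segments is needed in this toy model.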

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Hamza, Wael; Smith, Maria; Chittaluru, Hari; Monteiro, Brennan
Application Number: US11/234,690
Publication Number: US 20070073542A1
US Class Current: 704/260
CPC Class Codes: G10L 13/047 Architecture of speech synt...; G10L 13/07 Concatenation rules