SYSTEM AND METHOD FOR PERFORMING SPEECH SYNTHESIS WITH A CACHE OF PHONEME SEQUENCES

US 20090043585A1
Filed: 08/09/2007
Published: 02/12/2009
Est. Priority Date: 08/09/2007
Status: Active Grant

First Claim

1. A method of performing speech synthesis, the method comprising:

applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences;

for each of the obtained plurality of phoneme sequences, identifying joins that would be calculated to synthesize each of the plurality of respective phoneme sequences; and

adding the identified joins to a cache for use in speech synthesis.

Abstract

Disclosed are systems, methods, and computer-readable media for performing speech synthesis. The method embodiment comprises: applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences; for each of the obtained phoneme sequences, identifying the joins that would be calculated to synthesize that phoneme sequence; and adding the identified joins to a cache for use in speech synthesis.
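The cache-building steps described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the toy lexicon `TOY_LEXICON`, the unit inventory `UNIT_DB`, the helper names, and the placeholder `join_cost` function. The sketch also assumes that a "join" is a pair of adjacent candidate acoustic units and that the cache stores a precomputed cost per join.

```python
from itertools import product

# Hypothetical front end data: word -> phoneme sequence (no audio is produced).
TOY_LEXICON = {"cat": ["k", "ae", "t"], "cab": ["k", "ae", "b"]}

# Hypothetical unit inventory: phoneme -> candidate acoustic unit ids.
UNIT_DB = {"k": ["k1", "k2"], "ae": ["ae1"], "t": ["t1", "t2"], "b": ["b1"]}


def text_to_phonemes(sentence):
    """Stand-in for the 'first part' of the synthesizer: text to phonemes only."""
    phonemes = []
    for word in sentence.lower().split():
        phonemes.extend(TOY_LEXICON.get(word, []))
    return phonemes


def join_cost(left_unit, right_unit):
    """Placeholder cost; a real system would compare acoustic features."""
    return (sum(map(ord, left_unit + right_unit)) % 10) / 10.0


def identify_joins(phonemes):
    """Collect every unit pair that could be joined for adjacent phonemes."""
    joins = set()
    for left, right in zip(phonemes, phonemes[1:]):
        joins.update(product(UNIT_DB.get(left, []), UNIT_DB.get(right, [])))
    return joins


def build_join_cache(corpus):
    """Run the front end over the corpus and cache a cost for each identified join."""
    cache = {}
    for sentence in corpus:
        for join in identify_joins(text_to_phonemes(sentence)):
            if join not in cache:
                cache[join] = join_cost(*join)
    return cache


cache = build_join_cache(["cat", "cab"])
```

In this sketch the two words share the "k" to "ae" transition, so those joins are stored once and reused, which is the kind of saving a corpus-driven cache is meant to provide.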

18 Claims

1. A method of performing speech synthesis, the method comprising:
- applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences;
  
  for each of the obtained plurality of phoneme sequences, identifying joins that would be calculated to synthesize each of the plurality of respective phoneme sequences; and
  
  adding the identified joins to a cache for use in speech synthesis.

2. The method of claim 1, the method further comprising:
- recording a frequency of occurrence for each of the plurality of phoneme sequences; and
  
  pruning the cache.

3. The method of claim 1, the method further comprising:
- building a plurality of caches of different sizes based on values or parameters.

4. The method of claim 3, wherein the values or parameters comprise computational costs or frequency of occurrence.
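
As a rough illustration of claims 2 through 4, the sketch below records how often each join is seen, prunes the cache to its most frequent entries, and builds several caches of different sizes. It rests on assumptions not stated in the claims: frequency is counted per join rather than per phoneme sequence, pruning keeps the top entries by count, and all function names and sample data are hypothetical.

```python
from collections import Counter


def count_join_frequencies(join_sequences):
    """Count how often each join appears across the observed sequences."""
    counts = Counter()
    for joins in join_sequences:
        counts.update(joins)
    return counts


def prune_cache(cache, counts, max_entries):
    """Keep the most frequent joins; ties are broken by key for determinism."""
    ranked = sorted(cache, key=lambda join: (-counts[join], join))
    return {join: cache[join] for join in ranked[:max_entries]}


def build_caches(cache, counts, sizes):
    """Build one pruned cache per requested size."""
    return {size: prune_cache(cache, counts, size) for size in sizes}


# Hypothetical data: joins observed for three phoneme sequences.
observed = [
    [("k1", "ae1"), ("ae1", "t1")],
    [("k1", "ae1"), ("ae1", "b1")],
    [("k1", "ae1"), ("ae1", "t1")],
]
full_cache = {("k1", "ae1"): 0.2, ("ae1", "t1"): 0.5, ("ae1", "b1"): 0.9}

counts = count_join_frequencies(observed)
caches = build_caches(full_cache, counts, sizes=[1, 2])
```

Ranking by computational cost instead of frequency, the other parameter named in claim 4, would only require a different sort key in `prune_cache`.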

5. A method of synthesizing a speech signal, the method comprising:
- (1) selecting one or more acoustic units from an acoustic unit database;
  
  (2) determining whether a join cost of an acoustic unit sequential pair resides in a cache created by steps comprising:
  
  (a) applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences;
  
  (b) for each of the obtained plurality of phoneme sequences, identifying joins that would be calculated to synthesize each of the plurality of respective phoneme sequences; and
  
  (c) adding the identified joins to a cache for use in speech synthesis;
  
  (3) if the cache contains the join, extracting the join from the cache for use in speech synthesis; and
  
  (4) if the cache does not contain the join, calculating a value of the join for use in speech synthesis.

6. The method of claim 5, wherein calculating the value of the join cost is performed to enhance accuracy over speed.
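
The run-time behavior recited in claims 5 and 6 amounts to a cache lookup with a computed fallback. The following is a minimal sketch under stated assumptions: the cache is a dictionary keyed by unit pairs, `compute_join_cost` is a hypothetical placeholder for the full (slower, more accurate) calculation, and the `stats` dictionary is added only to make hits and misses visible.

```python
def compute_join_cost(left_unit, right_unit):
    """Placeholder for the full join-cost calculation used on a cache miss."""
    return abs(len(left_unit) - len(right_unit)) + 0.5


def get_join_cost(cache, left_unit, right_unit, stats):
    """Return the cached cost if present, otherwise calculate it."""
    key = (left_unit, right_unit)
    if key in cache:
        stats["hits"] += 1
        return cache[key]
    stats["misses"] += 1
    return compute_join_cost(left_unit, right_unit)


def sequence_join_cost(cache, units, stats):
    """Sum the join costs over each sequential pair of selected units."""
    return sum(
        get_join_cost(cache, left, right, stats)
        for left, right in zip(units, units[1:])
    )


# Hypothetical cache containing a single precomputed join.
cache = {("k1", "ae1"): 0.2}
stats = {"hits": 0, "misses": 0}
total = sequence_join_cost(cache, ["k1", "ae1", "t1"], stats)
```

Here the first pair is served from the cache and the second is calculated on demand, so a miss costs time but never leaves a join without a value.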

7. A system for performing speech synthesis, the system comprising:
- a module configured to apply a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences;
  
  for each of the obtained plurality of phoneme sequences, a module configured to identify joins that would be calculated to synthesize each of the plurality of respective phoneme sequences; and
  
  a module configured to add the identified joins to a cache for use in speech synthesis.

8. The system of claim 7, the system further comprising:
- a module configured to record a frequency of occurrence for each of the plurality of phoneme sequences; and
  
  a module configured to prune the cache.

9. The system of claim 7, the system further comprising:
- a module configured to build a plurality of caches of different sizes based on values or parameters.

10. The system of claim 9, wherein the values or parameters comprise computational costs or frequency of occurrence.

11. A system for synthesizing a speech signal, the system comprising:
- (1) a module configured to select one or more acoustic units from an acoustic unit database;
  
  (2) a module configured to determine whether a join cost of an acoustic unit sequential pair resides in a cache created by steps comprising:
  
  (a) applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences;
  
  (b) for each of the obtained plurality of phoneme sequences, identifying joins that would be calculated to synthesize each of the plurality of respective phoneme sequences; and
  
  (c) adding the identified joins to a cache for use in speech synthesis;
  
  (3) if the cache contains the join, a module configured to extract the join from the cache for use in speech synthesis; and
  
  (4) if the cache does not contain the join, a module configured to calculate a value of the join for use in speech synthesis.

12. The system of claim 11, wherein calculating the value of the join cost is performed to enhance accuracy over speed.

13. A computer readable medium storing a computer program having instructions for performing speech synthesis, the instructions comprising:
- applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences;
  
  for each of the obtained plurality of phoneme sequences, identifying joins that would be calculated to synthesize each of the plurality of respective phoneme sequences; and
  
  adding the identified joins to a cache for use in speech synthesis.

14. The computer readable medium of claim 13, the instructions further comprising:
- recording a frequency of occurrence for each of the plurality of phoneme sequences; and
  
  pruning the cache.

15. The computer readable medium of claim 13, the instructions further comprising:
- building a plurality of caches of different sizes based on values or parameters.

16. The computer readable medium of claim 15, wherein the values or parameters comprise computational costs or frequency of occurrence.

17. A computer readable medium storing a computer program having instructions for synthesizing a speech signal, the instructions comprising:
- (1) selecting one or more acoustic units from an acoustic unit database;
  
  (2) determining whether a join cost of an acoustic unit sequential pair resides in a cache created by steps comprising:
  
  (a) applying a first part of a speech synthesizer to a text corpus to obtain a plurality of phoneme sequences, the first part of the speech synthesizer only identifying possible phoneme sequences;
  
  (b) for each of the obtained plurality of phoneme sequences, identifying joins that would be calculated to synthesize each of the plurality of respective phoneme sequences; and
  
  (c) adding the identified joins to a cache for use in speech synthesis;
  
  (3) if the cache contains the join, extracting the join from the cache for use in speech synthesis; and
  
  (4) if the cache does not contain the join, calculating a value of the join for use in speech synthesis.

18. The computer readable medium of claim 17, wherein calculating the value of the join cost is performed to enhance accuracy over speed.

Current Assignee
Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee
AT&T Corporation (AT&T, Inc.)
Inventors
CONKIE, Alistair D.

Granted Patent

US 7,983,919 B2
US Class Current

704/267
CPC Class Codes

G10L 13/04 Details of speech synthesis...

G10L 13/08 Text analysis or generation...
