Distributed speech synthesis system, terminal device, and computer program thereof

US 20060004577A1
Filed: 01/07/2005
Published: 01/05/2006
Est. Priority Date: 07/05/2004
Status: Abandoned Application

Abstract

In the text-to-speech synthesis technique for synthesizing speech from text, this invention enables a terminal device with relatively small computing power to perform speech synthesis based on optimal unit selection. The text-to-speech synthesis procedure of the present invention involves content generation and output; that is, a secondary content including the results of the optimal unit selection process is output. By virtue of the secondary content, a high load process of selecting optimal units and a light load process of synthesizing speech waveforms can be performed separately. The optimal unit selection process is performed at a server and information for the units to be retrieved from a corpus is sent to the terminal as data for speech synthesis.

20 Claims

1. A terminal device which can connect to a processing server via a network, said terminal device comprising:
- a unit of receiving from said processing server a secondary content furnished with information for access to a speech database and retrieval of optimal units selected by analyzing text data included in a primary content distributed via said network; and
  
  a unit of synthesizing speech corresponding to said text data, based on said secondary content and the speech database.
- Dependent claims (2, 3, 4, 5):
  - 2. The terminal device according to claim 1, wherein a speech database exists on said processing server and this speech database and the speech database existing on said terminal device apply a common identification scheme in which a particular waveform can be identified uniquely.
  - 3. The terminal device according to claim 1, wherein said secondary content comprises a text part where text from said primary content and a string of phonetic symbols are stored and a waveform information part where reference information for the waveforms of said optimal units selected by analyzing data in the text part is described, and wherein speech database ID information for identifying one of said speech databases and waveform index information for synthesizing speech corresponding to the data in said text part are stored in said waveform information part.
  - 4. The terminal device according to claim 3, further comprising:
    - a unit of generating prosodic parameters with regard to the string of phonetic symbols included in said secondary content and outputting prosodic information for the data in said text part.
  - 5. The terminal device according to claim 3, further comprising:
    - a unit of executing morphological analysis of the text included in said secondary content; and
      
      a unit of generating prosodic parameters with regard to the string of phonetic symbols included in said secondary content and outputting prosodic information for the data in said text part.
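
As an illustration of claims 1 and 3, the secondary content can be pictured as a record with a text part and a waveform information part, which the terminal resolves against its local speech database. The Python sketch below is only one possible reading, not the format disclosed in the application: the class names `SecondaryContent` and `WaveformInfo`, the field names, and the representation of waveforms as lists of float samples are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WaveformInfo:
    # Identifies which speech database the indices refer to (claim 3).
    database_id: str
    # Indices of the selected optimal units, in playback order (claim 3).
    waveform_indices: List[int]


@dataclass
class SecondaryContent:
    text: str              # text taken from the primary content
    phonetic_symbols: str  # string of phonetic symbols for that text
    waveform_info: WaveformInfo


def synthesize(content: SecondaryContent,
               local_databases: Dict[str, Dict[int, List[float]]]) -> List[float]:
    """Concatenate the units named in the secondary content.

    The terminal only performs lookups and concatenation; it does not
    repeat the unit selection. A real synthesizer would also smooth the
    joins between units, which this sketch omits.
    """
    database = local_databases[content.waveform_info.database_id]
    samples: List[float] = []
    for index in content.waveform_info.waveform_indices:
        samples.extend(database[index])  # raises KeyError if a unit is missing
    return samples


# Hypothetical toy database: database ID -> {waveform index -> samples}.
databases = {"db-A": {0: [0.1, 0.2], 1: [0.3], 2: [0.4, 0.5]}}
content = SecondaryContent(
    text="hello",
    phonetic_symbols="h @ l oU",
    waveform_info=WaveformInfo(database_id="db-A", waveform_indices=[2, 0]),
)
speech = synthesize(content, databases)
```

Because the indices are resolved purely by lookup, the sketch only works if both sides number waveforms identically, which is the role of the common identification scheme in claim 2.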

6. A distributed speech synthesis system which includes a processing server and a terminal device connected to said processing server via a network, wherein said system implements speech synthesis and outputs speech from text data included in a primary content received over said network, wherein said processing server comprises:
- a unit of generating a secondary content, which comprises analyzing the text data included in the primary content received over said network, selecting optimal units, and furnishing information for access to a speech database and retrieval of the optimal units; and
  
  a unit of sending the secondary content to said terminal device.
- Dependent claims (7, 8):
  - 7. The distributed speech synthesis system according to claim 6, wherein respective speech databases exist on said processing server and said terminal device, applying a common identification scheme in which a particular waveform can be identified uniquely.
  - 8. The distributed speech synthesis system according to claim 7, wherein said secondary content comprises a text part where text from said primary content and a string of phonetic symbols are stored and a waveform information part where reference information for the waveforms of said optimal units selected by analyzing data in the text part is described, and wherein speech database ID information for identifying one of said speech databases and waveform index information for synthesizing speech corresponding to the text in said text part are stored in said waveform information part.
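
The server-side step in claim 6 (analyze the text, select optimal units, and furnish retrieval information) might be sketched as follows. The claims do not state the selection criterion, so this example substitutes a generic minimum-cost dynamic-programming search in which the assumed cost is the pitch jump between neighbouring units. The function names, the candidate table, and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical server-side database: phonetic symbol -> list of
# (waveform_index, pitch_in_hz) candidates. The indices are assumed to be
# shared with the terminal's copy of the same database.
Candidates = Dict[str, List[Tuple[int, float]]]


def select_units(symbols: List[str], db: Candidates) -> List[int]:
    """Pick one candidate per symbol, minimizing the total pitch jump.

    Assumed cost: absolute pitch difference between neighbouring units.
    Solved with a Viterbi-style dynamic program. Expects at least one symbol.
    """
    # best[j] = (cost so far, chosen indices) for paths ending in candidate j
    best = [(0.0, [idx]) for idx, _ in db[symbols[0]]]
    prev = db[symbols[0]]
    for symbol in symbols[1:]:
        current = db[symbol]
        new_best = []
        for idx, pitch in current:
            cost, path = min(
                ((best[k][0] + abs(pitch - prev[k][1]), best[k][1])
                 for k in range(len(prev))),
                key=lambda item: item[0],
            )
            new_best.append((cost, path + [idx]))
        best, prev = new_best, current
    return min(best, key=lambda item: item[0])[1]


def generate_secondary_content(text: str, symbols: List[str],
                               database_id: str, db: Candidates) -> dict:
    """Bundle the text part and the waveform information part."""
    return {
        "text": text,
        "phonetic_symbols": symbols,
        "database_id": database_id,
        "waveform_indices": select_units(symbols, db),
    }


toy_db = {
    "k": [(10, 120.0), (11, 200.0)],
    "a": [(20, 210.0), (21, 125.0)],
    "t": [(30, 130.0)],
}
secondary = generate_secondary_content("cat", ["k", "a", "t"], "db-A", toy_db)
```

The search over candidates is the expensive part and stays on the server; the resulting list of indices is small enough to send to a terminal with limited computing power.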

9. A computer program for speech synthesis and output from requested content data at a terminal device connected to a processing server via a network, said computer program causing a computer to implement:
- a function of requesting said processing server for a primary content to be vocalized;
  
  a function of receiving a secondary content including information of a string of optimal units selected by analyzing text data from said primary content from said processing server; and
  
  a function of synthesizing speech from the secondary content data by accessing a speech database.
- Dependent claims (10, 11, 12, 13, 14, 15, 16):
  - 10. The computer program according to claim 9, wherein the speech database existing on said terminal device and a speech database existing on said processing server apply a common identification scheme in which a particular waveform can be identified uniquely.
  - 11. The computer program according to claim 9, wherein said secondary content comprises a text part where text from said primary content and a string of phonetic symbols are stored and a waveform information part where reference information for the waveforms of said optimal units selected by analyzing data in the text part is described, and wherein said waveform information part comprises speech database ID information for identifying a speech database to access and waveform index information for identifying waveforms to be retrieved from the speech database identified by the database ID.
  - 12. The computer program according to claim 9, further including:
    - a function of generating prosodic parameters with regard to the string of phonetic symbols included in said secondary content and outputting prosodic information for the data in said text part.
  - 13. The computer program according to claim 9, further including:
    - a function of executing morphological analysis of the text included in said secondary content; and
      
      a function of generating prosodic parameters with regard to the string of phonetic symbols included in said secondary content and outputting prosodic information for the data in said text part.
  - 14. The computer program according to claim 9, wherein said terminal device is provided with a management table and the management table comprises a speech database and a terminal ID part as identifier information to identify said speech database existing on the terminal device.
  - 15. The computer program according to claim 14, wherein said identifier information is managed by said processing server.
  - 16. The computer program according to claim 14, which further causes the computer to implement a function of transmitting the identifier information to identify said speech database existing on said terminal device from the terminal device to said processing server over the network.
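
Claims 14 to 16 describe identifier information that names the terminal's speech database, is managed by the server, and can be transmitted from the terminal. A minimal sketch of one reading follows. The JSON message format, the class names `ManagementEntry` and `ServerRegistry`, and the ID strings are assumptions made for illustration and are not taken from the application.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict


@dataclass(frozen=True)
class ManagementEntry:
    terminal_id: str   # identifies the terminal device
    database_id: str   # identifies the speech database on that terminal


def build_report(entry: ManagementEntry) -> str:
    """Terminal side: serialize the identifier information (claim 16)."""
    return json.dumps(asdict(entry))


class ServerRegistry:
    """Server side: keeps the identifier information (claim 15)."""

    def __init__(self) -> None:
        self._by_terminal: Dict[str, str] = {}

    def register(self, report: str) -> None:
        data = json.loads(report)
        self._by_terminal[data["terminal_id"]] = data["database_id"]

    def database_for(self, terminal_id: str) -> str:
        # The server can then select units from the matching database.
        return self._by_terminal[terminal_id]


registry = ServerRegistry()
registry.register(build_report(ManagementEntry("terminal-001", "db-A")))
```

With the mapping in place, the server can choose the database whose waveform indices the requesting terminal is able to resolve.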

17. A computer program for distributed speech synthesis, which synthesizes and outputs speech from text data included in a primary content received over said network, in a distributed speech synthesis system including a processing server and a terminal device connected to said processing server via a network, wherein respective speech databases exist on said processing server and said terminal device, applying a common identification scheme in which a particular waveform can be identified uniquely, said computer program causing a computer to implement:
- a function of generating a secondary content, which comprises analyzing the text data included in the primary content received over said network, selecting optimal units, and furnishing information for access to a speech database and retrieval of the optimal units; and
  
  a function of synthesizing speech corresponding to said text data, based on said secondary content and the appropriate speech database.
- Dependent claims (18, 19, 20):
  - 18. The computer program according to claim 17, which further causes the computer to implement:
    - a function of requesting said processing server for selecting optimal units by analyzing the primary content to be vocalized from said terminal device;
      
      a function of generating the secondary content by the request at said processing server; and
      
      a function of sending said secondary content to said processing server together with a request for content from said terminal device.
  - 19. The computer program according to claim 17, which further causes the computer to implement:
    - a function of generating a secondary content including optimal units selected by analyzing the primary content to be vocalized, which is performed in advance at the processing server; and
      
      a function of sending said secondary content to said processing server together with a request for content from said terminal device.
  - 20. The computer program according to claim 17, which further causes the computer to implement:
    - a function of updating the speech databases to access for selecting optimal units with a management table comprising waveform IDs and update status data.
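
Claim 20 mentions a management table of waveform IDs and update status data but does not spell out the update procedure. The sketch below assumes one simple scheme: each waveform ID carries a status flag, and flagged waveforms are copied from the server's database to the local one. The status values `"updated"` and `"current"` and the function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical status values; the claim only says "update status data".
UPDATED = "updated"
CURRENT = "current"


def pending_waveform_ids(table: Dict[int, str]) -> List[int]:
    """Return the waveform IDs whose status marks them as changed."""
    return sorted(wid for wid, status in table.items() if status == UPDATED)


def apply_updates(local_db: Dict[int, List[float]],
                  server_db: Dict[int, List[float]],
                  table: Dict[int, str]) -> None:
    """Copy changed waveforms into the local database and clear the flags."""
    for wid in pending_waveform_ids(table):
        local_db[wid] = list(server_db[wid])
        table[wid] = CURRENT


server_db = {1: [0.1], 2: [0.9, 0.8], 3: [0.5]}
local_db = {1: [0.1], 2: [0.2, 0.2]}
table = {1: CURRENT, 2: UPDATED, 3: UPDATED}
apply_updates(local_db, server_db, table)
```

Tracking status per waveform ID means only changed units need to be transferred, rather than the whole database.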

Current Assignee: Hitachi, Ltd.
Original Assignee: Hitachi, Ltd.
Inventors: Kujirai, Toshihiro; Nukaga, Nobuo
Application Number: US11/030,109
Publication Number: US 20060004577A1
US Class Current: 704/267
CPC Class Codes: G10L 13/047 Architecture of speech synt...
