Speech recognition-synthesis based encoding/decoding method, and speech encoding/decoding system
Abstract
A speech recognition synthesis based encoding/decoding method recognizes phonetic segments, syllables, words or the like as character information from an input speech signal and detects pitch periods, phoneme or syllable durations or the like, as information for prosody generation, from the input speech signal, transfers or stores the character information and information for prosody generation as code data, decodes the transferred or stored code data to acquire the character information and information for prosody generation, and synthesizes the acquired character information and information for prosody generation to obtain a speech signal.
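The transfer/storage step in the abstract can be illustrated with a minimal round-trip sketch. The abstract does not specify a code format, so the JSON byte layout, the `Segment` type, and the `encode`/`decode` names below are all assumptions made for illustration only.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Segment:
    """One recognized unit plus its prosody (hypothetical layout)."""
    symbol: str              # character information, e.g. a syllable
    pitch_period_ms: float   # information for prosody generation
    duration_ms: float       # information for prosody generation


def encode(segments):
    """Pack character and prosody information into code data (bytes)."""
    payload = [asdict(s) for s in segments]
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


def decode(code_data):
    """Recover character and prosody information from code data."""
    return [Segment(**item) for item in json.loads(code_data.decode("utf-8"))]


segments = [Segment("ko", 8.0, 120.0), Segment("n", 7.5, 90.0)]
code_data = encode(segments)   # what would be transferred or stored
restored = decode(code_data)   # input to the synthesis step
```

Only symbols and a few prosody values per segment cross the channel here, not waveform samples, which is the point of a recognition-synthesis codec.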
26 Claims
1. A speech recognition synthesis based encoding/decoding method comprising the steps of:
recognizing character information from an input speech signal;
detecting first prosody information from said input speech signal;
encoding said character information and said first prosody information to acquire code data;
transferring or storing the code data;
decoding said transferred or stored code data to said character information and said first prosody information;
selecting a synthesis unit codebook from a plurality of synthesis unit codebooks in accordance with one of said first prosody information and a specified type of a synthesized speech, the plurality of synthesis unit codebooks storing second prosody information prepared from speech data of different speakers, the selecting step including computing error between the first prosody information and the second prosody information and selecting from said synthesis unit codebooks a synthesis unit codebook which minimizes the error; and
synthesizing a speech signal using said character information and the selected said synthesis unit codebook.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
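The selecting step of claim 1 can be sketched as a minimum-error search over the codebooks. The claim does not define the error measure, so the squared difference of mean pitch periods used here is an assumption, as are the `SynthesisUnitCodebook` type, its fields, and the function names.

```python
from dataclasses import dataclass


@dataclass
class SynthesisUnitCodebook:
    speaker: str                  # hypothetical label for the source speaker
    mean_pitch_period_ms: float   # stands in for "second prosody information"


def prosody_error(first_pitch_periods_ms, codebook):
    """Squared difference of mean pitch periods (an assumed error measure)."""
    mean_first = sum(first_pitch_periods_ms) / len(first_pitch_periods_ms)
    return (mean_first - codebook.mean_pitch_period_ms) ** 2


def select_codebook(first_pitch_periods_ms, codebooks):
    """Return the codebook whose stored prosody minimizes the error."""
    return min(codebooks, key=lambda cb: prosody_error(first_pitch_periods_ms, cb))


codebooks = [
    SynthesisUnitCodebook("speaker_a", 8.0),
    SynthesisUnitCodebook("speaker_b", 4.5),
]
# Decoded first prosody information: pitch periods near 4.4 ms.
selected = select_codebook([4.0, 4.4, 4.8], codebooks)
```

With these illustrative values the search picks `speaker_b`, whose stored mean pitch period is closest to that of the input speech.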
9. A speech recognition synthesis based encoding/decoding method comprising the steps of:
recognizing phonetic segments, syllables or words as character information from an input speech signal;
detecting pitch periods and durations of said phonetic segments or syllables, as first prosody information, from said input speech signal;
encoding said character information and said first prosody information to obtain code data;
transferring or storing said code data;
decoding said transferred or stored code data to said character information and said first prosody information;
selecting a synthesis unit codebook from a plurality of synthesis unit codebooks in accordance with one of said first prosody information and a specified type of a synthesized speech, the plurality of synthesis unit codebooks storing second prosody information prepared from speech data of different speakers, the selecting step including computing error between the first prosody information and the second prosody information and selecting from said synthesis unit codebooks a synthesis unit codebook which minimizes the error; and
synthesizing a speech signal using said character information and the selected synthesis unit codebook.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
18. A speech encoding/decoding system comprising:
a recognition section configured to recognize character information from an input speech signal;
a detection section configured to detect first prosody information from said input speech signal;
an encoding section configured to encode said character information and said first prosody information to code data;
a transfer/storage section configured to transfer or store said code data acquired by said encoding section;
a decoding section configured to decode said transferred or stored code data to said character information and said first prosody information;
a plurality of synthesis unit codebooks storing second prosody information prepared from speech data of different speakers;
a controller configured to select one of said synthesis unit codebooks in accordance with one of said first prosody information and a specified type of a synthesized speech by computing error between the first prosody information and the second prosody information and selecting from said synthesis unit codebooks a synthesis unit codebook which minimizes the error; and
a synthesis section configured to synthesize a speech signal using said character information and the selected one of said synthesis unit codebooks.
Dependent claims: 19, 20, 21, 22, 23, 24, 25
26. A speech recognition synthesis based encoding method comprising the steps of:
recognizing character information from an input speech signal;
detecting prosody information from said input speech signal;
generating select information indicating a type of a synthesized speech to be produced by a decoder based upon an error between the prosody information and stored prosody generation information;
encoding said character information and said prosody information to acquire code data; and
transferring or storing the code data and the select information.
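In claim 26 the choice is made on the encoder side and sent along with the code data. The following is a minimal sketch of that idea, assuming the select information is a one-byte index into a table of stored mean pitch periods and that the error is an absolute difference; neither detail is stated in the claim, and all names are hypothetical.

```python
import struct


def generate_select_info(pitch_periods_ms, stored_mean_pitch_ms):
    """Index of the stored prosody entry with the smallest error (assumed measure)."""
    mean_pitch = sum(pitch_periods_ms) / len(pitch_periods_ms)
    errors = [abs(mean_pitch - stored) for stored in stored_mean_pitch_ms]
    return errors.index(min(errors))


def pack(select_info, code_data):
    """Prefix the code data with the select information as one unsigned byte."""
    return struct.pack("B", select_info) + code_data


def unpack(packet):
    """Split a packet back into (select_info, code_data)."""
    return packet[0], packet[1:]


stored = [8.0, 4.5, 6.0]   # hypothetical stored prosody generation information
select_info = generate_select_info([5.8, 6.1, 6.4], stored)
packet = pack(select_info, b"konnichiwa")   # transferred or stored together
```

A decoder receiving `packet` would read the first byte to learn which type of synthesized speech to produce, without recomputing the error itself.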
Specification