Prosody editing apparatus and method
First Claim
1. A prosody editing apparatus comprising:
- a storage configured to store attribute information items of phrases and one or more first prosodic patterns corresponding to each of the attribute information items of the phrases;
a search unit configured to search the storage for one or more second prosodic patterns corresponding to an attribute information item that matches an attribute information item of a predetermined phrase, the second prosodic patterns being included in the first prosodic patterns;
a mapping unit configured to map each of the second prosodic patterns on a low dimensional space to generate mapping coordinates, the mapping coordinates being used to suppress a first prosodic pattern which is not assumed normally, wherein a first distance between coordinates of the first prosodic pattern and coordinates of a target prosodic pattern is not within a first threshold;
a selection unit configured to obtain coordinates selected from the mapping coordinates as selected coordinates;
a restoring unit configured to restore a second prosodic pattern according to the selected coordinates to obtain a restored prosodic pattern; and
a replacing unit configured to replace prosody of synthetic speech generated based on the predetermined phrase by the restored prosodic pattern.
4 Assignments
0 Petitions
Accused Products
Abstract
According to one embodiment, a prosody editing apparatus includes a storage, a first selection unit, a search unit, a normalization unit, a mapping unit, a display, a second selection unit, a restoring unit and a replacing unit. The search unit searches the storage for one or more second prosodic patterns corresponding to attribute information that matches attribute information of the selected phrase. The mapping maps each of the normalized second prosodic patterns on a low-dimensional space. The restoring unit restores a restored prosodic pattern according to the selected coordinates. The replacing unit replaces prosody of synthetic speech generated based on the selected phrase by the restored prosodic pattern.
-
Citations
17 Claims
-
1. A prosody editing apparatus comprising:
-
a storage configured to store attribute information items of phrases and one or more first prosodic patterns corresponding to each of the attribute information items of the phrases; a search unit configured to search the storage for one or more second prosodic patterns corresponding to an attribute information item that matches an attribute information item of a predetermined phrase, the second prosodic patterns being included in the first prosodic patterns; a mapping unit configured to map each of the second prosodic patterns on a low dimensional space to generate mapping coordinates, the mapping coordinates being used to suppress a first prosodic pattern which is not assumed normally, wherein a first distance between coordinates of the first prosodic pattern and coordinates of a target prosodic pattern is not within a first threshold; a selection unit configured to obtain coordinates selected from the mapping coordinates as selected coordinates; a restoring unit configured to restore a second prosodic pattern according to the selected coordinates to obtain a restored prosodic pattern; and a replacing unit configured to replace prosody of synthetic speech generated based on the predetermined phrase by the restored prosodic pattern. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
-
-
16. A prosody editing method comprising:
-
storing, in a storage, attribute information items of phrases and one or more first prosodic patterns corresponding to each of the attribute information items of the phrases; searching the storage for one or more second prosodic patterns corresponding to an attribute information item that matches an attribute information item of a predetermined phrase, the second prosodic patterns being included in the first prosodic patterns; mapping each of the second prosodic patterns on a low-dimensional space to generate mapping coordinates, the mapping coordinates being used to suppress to suppress a first prosodic pattern which is not assumed normally, wherein a first distance between coordinates of the first prosodic pattern and coordinates of a target prosodic pattern is not within a first threshold; obtaining coordinates selected from the mapping coordinates as selected coordinates; restoring a second prosodic pattern according to the selected coordinates to obtain a restored prosodic pattern; and replacing prosody of synthetic speech generated based on the predetermined phrase by the restored prosodic pattern.
-
-
17. A non-transitory computer readable medium including computer executable instructions, wherein the instructions, when executed by a processor, cause the processor to perform a method comprising:
-
storing, in a storage, attribute information items of phrases and one or more first prosodic patterns corresponding to each of the attribute information items of the phrases; searching the storage for one or more second prosodic patterns corresponding to an attribute information item that matches an attribute information item of a predetermined phrase, the second prosodic patterns being included in the first prosodic patterns; mapping each of the second prosodic patterns on a low-dimensional space to generate mapping coordinates, the mapping coordinates being used to suppress a first prosodic pattern which is not assumed normally, wherein a first distance between coordinates of the first prosodic pattern being suppressed and coordinates of a target prosodic pattern is not within a first threshold; obtaining coordinates selected from the mapping coordinates as selected coordinates; restoring a prosodic pattern according to the selected coordinates to obtain a restored prosodic pattern; and replacing prosody of synthetic speech generated based on the predetermined phrase by the restored prosodic pattern.
-
Specification