Recognition dictionary creating device, voice recognition device, and voice synthesizer
Abstract
A recognition dictionary creating device includes a user dictionary in which a phoneme label string of an inputted voice is registered and an interlanguage acoustic data mapping table in which a correspondence between phoneme labels in different languages is defined. The device refers to the interlanguage acoustic data mapping table to convert the phoneme label string registered in the user dictionary, which is expressed in the language set at the time the user dictionary was created, into a phoneme label string expressed in another language to which the recognition dictionary creating device has been switched.
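For illustration only, the conversion described in the abstract can be sketched in Python as follows. The class name `UserDictionary`, the table layout, the language codes, and the example phoneme labels are all assumptions made for this sketch; the patent text does not specify them.

```python
# Hypothetical mapping table: (source language, target language) -> {source label: target label}.
# The labels below are invented for illustration, not taken from the patent.
MAPPING_TABLE = {
    ("en", "ja"): {"k": "k", "l": "r", "ae": "a", "s": "s"},
}


class UserDictionary:
    """Holds voice-registered phoneme label strings and the language they are expressed in."""

    def __init__(self, language):
        self.language = language  # plays the role of the stored language information
        self.entries = {}         # entry id -> list of phoneme labels

    def register(self, entry_id, labels):
        self.entries[entry_id] = list(labels)

    def switch_language(self, new_language, mapping_table):
        """Convert every registered label string to the newly selected language."""
        if new_language == self.language:
            return
        table = mapping_table[(self.language, new_language)]
        # Assumption: a label missing from the table is kept unchanged.
        self.entries = {
            entry_id: [table.get(label, label) for label in labels]
            for entry_id, labels in self.entries.items()
        }
        self.language = new_language


user_dict = UserDictionary("en")
user_dict.register("class", ["k", "l", "ae", "s"])
user_dict.switch_language("ja", MAPPING_TABLE)
print(user_dict.entries["class"])
```

The point of the sketch is that the registered entry is rewritten through the table when the language changes, so the user does not have to speak the word again.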
6 Claims
1. A recognition dictionary creating device comprising:
an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
a user dictionary storage to store a user dictionary;
a language storage to store language information;
a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
a processor to execute a program;
a memory to store the program which, when executed by the processor, results in performance of steps comprising;
performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
comparing the time series of acoustic features with the acoustic standard patterns stored in said acoustic standard pattern storage to create a phoneme label string of said inputted voice;
registering said phoneme label string in the user dictionary;
storing a first language of the phoneme label string which is registered in said user dictionary into the language storage;
switching from the first language to a second language; and
referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language shown by the information stored in said language storage into a phoneme label string expressed in the second language when the first language has been switched to the second language.
2. A voice recognition device comprising:
an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
a user dictionary storage to store a user dictionary;
a general dictionary storage to store a general dictionary having a vocabulary expressed by said acoustic standard patterns;
a language storage to store language information;
a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
a processor to execute a program;
a memory to store the program which, when executed by the processor, results in performance of steps comprising;
performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
comparing the time series of acoustic features with the acoustic standard patterns stored in said acoustic standard pattern storage to create a phoneme label string of said inputted voice;
registering said phoneme label string in the user dictionary;
storing a first language of the phoneme label string which is registered in said user dictionary into the language storage;
switching from the first language to a second language;
referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language shown by the information stored in said language storage into a phoneme label string expressed in the second language when the first language has been switched to the second language;
comparing the phoneme label string of said inputted voice with said general dictionary and said user dictionary to specify a word which is most similar to the phoneme label string of said inputted voice from said general dictionary and said user dictionary; and
outputting the specified word as a voice recognition result.
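The final comparison step of claim 2 (specifying the word most similar to the phoneme label string of the inputted voice) could be sketched as below. The claim does not name a similarity measure; edit distance over label sequences is used here purely as an assumption, and the function names and dictionary contents are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def recognize(labels, general_dict, user_dict):
    """Return the word whose label string is closest to `labels`.

    Both dictionaries map a word to a list of phoneme labels. On a name clash,
    the user dictionary entry overrides the general one (an assumption).
    """
    candidates = {**general_dict, **user_dict}
    return min(candidates, key=lambda word: edit_distance(labels, candidates[word]))


general = {"call": ["k", "o", "l"], "stop": ["s", "t", "o", "p"]}
user = {"michael": ["m", "a", "i", "k", "l"]}
print(recognize(["m", "a", "i", "k", "u"], general, user))
```

A practical recognizer would more likely score candidates against the acoustic features directly; the label-level distance is kept here only because it matches the wording of the claim closely.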
3. A voice synthesizer comprising:
an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
a user dictionary storage to store a user dictionary;
a general dictionary storage to store a general dictionary having a vocabulary expressed by said acoustic standard patterns;
a language storage to store language information;
a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
a processor to execute a program;
a memory to store the program which, when executed by the processor, results in performance of steps comprising;
performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
comparing the time series of acoustic features with the acoustic standard patterns stored in said acoustic standard pattern storage to create a phoneme label string of said inputted voice;
registering said phoneme label string in the user dictionary;
storing a first language of the phoneme label string which is registered in said user dictionary into the language storage;
switching from the first language to a second language;
referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language shown by the information stored in said language storage into a phoneme label string expressed in the second language when the first language has been switched to the second language;
inputting a text;
detecting a word part corresponding to the phoneme label string registered in said user dictionary from a character string of the inputted text;
replacing said word part with the phoneme label string acquired from said user dictionary and corresponding to said word part;
replacing a part of the character string of said text other than said word part with a phoneme label string of a corresponding word in said general dictionary; and
creating a synthetic voice of said text from the phoneme label strings of said text.
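The text-processing steps of claim 3 (detecting word parts registered in the user dictionary and replacing the remaining text through the general dictionary) might look like the following sketch. Whitespace tokenization, the way user dictionary entries are keyed to text, and all names and labels are assumptions; the claim does not state how a word part is detected.

```python
def text_to_phoneme_labels(text, user_dict, general_dict):
    """Turn inputted text into one flat phoneme label string.

    Words found in the user dictionary take their registered label string;
    every other word is looked up in the general dictionary.
    """
    labels = []
    for word in text.lower().split():
        if word in user_dict:
            labels.extend(user_dict[word])
        elif word in general_dict:
            labels.extend(general_dict[word])
        else:
            # Assumption: an unknown word is an error rather than being skipped.
            raise KeyError(f"no phoneme label string for {word!r}")
    return labels


general = {"call": ["k", "o", "l"]}
user = {"michael": ["m", "a", "i", "k", "l"]}
print(text_to_phoneme_labels("Call Michael", user, general))
```

The resulting label string would then be handed to a waveform generator, which is outside the scope of this sketch.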
4. A recognition dictionary creating device comprising:
an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
a user dictionary storage to store a user dictionary;
a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
a processor to execute a program;
a memory to store the program which, when executed by the processor, results in performance of steps comprising;
performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
selecting acoustic standard patterns for a preset language from among the acoustic standard patterns stored in said acoustic standard pattern storage;
comparing the time series of acoustic features with the acoustic standard patterns for the language which are selected by said step of selecting acoustic standard patterns to create a phoneme label string of said inputted voice;
registering said phoneme label string in the user dictionary;
switching from a first language to a second language; and
referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language selected by said step of selecting acoustic standard patterns into a phoneme label string expressed in the second language when the first language has been switched to the second language.
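The selecting and comparing steps of claim 4 can be illustrated with a deliberately simplified sketch. Here each acoustic standard pattern is assumed to be a single reference vector per phoneme, and each frame is labeled with its nearest pattern by Euclidean distance; the claim does not prescribe this method, and the pattern values, feature dimension, and names are invented for the example.

```python
import math
from itertools import groupby

# Hypothetical acoustic standard patterns: language -> {phoneme label: reference feature vector}.
ACOUSTIC_STANDARD_PATTERNS = {
    "en": {"a": (1.0, 0.0), "k": (0.0, 1.0)},
    "ja": {"a": (0.9, 0.1), "k": (0.1, 0.9)},
}


def create_phoneme_label_string(frames, language, patterns=ACOUSTIC_STANDARD_PATTERNS):
    """Label a time series of feature vectors using the patterns of one preset language."""
    selected = patterns[language]  # select the patterns for the preset language
    per_frame = [
        min(selected, key=lambda label: math.dist(frame, selected[label]))
        for frame in frames
    ]
    # Collapse runs of identical frame labels into a single phoneme label.
    return [label for label, _ in groupby(per_frame)]


frames = [(0.1, 0.9), (0.0, 1.0), (0.9, 0.2), (1.0, 0.1)]
print(create_phoneme_label_string(frames, "en"))
```

Only the patterns of the selected language take part in the comparison, which is why the resulting label string is tied to that language and needs the mapping table after a switch.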
5. A voice recognition device comprising:
an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
a user dictionary storage to store a user dictionary;
a general dictionary storage to store a general dictionary having a vocabulary expressed by said acoustic standard patterns;
a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
a processor to execute a program;
a memory to store the program which, when executed by the processor, results in performance of steps comprising;
performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
selecting acoustic standard patterns for a preset language from among the acoustic standard patterns stored in said acoustic standard pattern storage;
comparing the time series of acoustic features with the acoustic standard patterns for the language which are selected by said step of selecting acoustic standard patterns to create a phoneme label string of said inputted voice;
registering said phoneme label string in the user dictionary;
switching from a first language to a second language;
referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language selected by said step of selecting acoustic standard patterns into a phoneme label string expressed in the second language when the first language has been switched to the second language;
comparing the phoneme label string of said inputted voice with said general dictionary and said user dictionary to specify a word which is most similar to the phoneme label string of said inputted voice from said general dictionary and said user dictionary; and
outputting the specified word as a voice recognition result.
6. A voice synthesizer comprising:
an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
a user dictionary storage to store a user dictionary;
a general dictionary storage to store a general dictionary having a vocabulary expressed by said acoustic standard patterns;
a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
a processor to execute a program;
a memory to store the program which, when executed by the processor, results in performance of steps comprising;
performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
selecting acoustic standard patterns for a preset language from among the acoustic standard patterns stored in said acoustic standard pattern storage;
comparing the time series of acoustic features with the acoustic standard patterns for the language which are selected by said step of selecting acoustic standard patterns to create a phoneme label string of said inputted voice;
registering said phoneme label string in the user dictionary;
switching from a first language to a second language;
referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language selected by said step of selecting acoustic standard patterns into a phoneme label string expressed in the second language when the first language has been switched to the second language;
inputting a text;
detecting a word part corresponding to the phoneme label string registered in said user dictionary from a character string of the inputted text;
replacing said word part with the phoneme label string acquired from said user dictionary and corresponding to said word part;
replacing a part of the character string of said text other than said word part with a phoneme label string of a corresponding word in said general dictionary; and
creating a synthetic voice of said text from the phoneme label strings of said text.
Specification