Recognition dictionary creating device, voice recognition device, and voice synthesizer

US 9,177,545 B2
Filed: 01/22/2010
Issued: 11/03/2015
Est. Priority Date: 01/22/2010
Status: Active Grant


1 Assignment


0 Petitions


Abstract

A recognition dictionary creating device includes a user dictionary, in which a phoneme label string of an inputted voice is registered, and an interlanguage acoustic data mapping table, in which a correspondence between phoneme labels in different languages is defined. The device refers to the interlanguage acoustic data mapping table to convert the phoneme label string registered in the user dictionary, which is expressed in the language set at the time the user dictionary was created, into a phoneme label string expressed in another language to which the recognition dictionary creating device has switched.

33 Citations


6 Claims

1. A recognition dictionary creating device comprising:
  an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
  
  a user dictionary storage to store a user dictionary;
  
  a language storage to store language information;
  
  a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
  
  a processor to execute a program;
  
  a memory to store the program which, when executed by the processor, results in performance of steps comprising:
  
  performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
  
  comparing the time series of acoustic features with the acoustic standard patterns stored in said acoustic standard pattern storage to create a phoneme label string of said inputted voice;
  
  registering said phoneme label string in the user dictionary;
  
  storing a first language of the phoneme label string which is registered in said user dictionary into the language storage;
  
  switching from the first language to a second language; and
  
  referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language shown by the information stored in said language storage into a phoneme label string expressed in the second language when the first language has been switched to the second language.
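
The conversion step at the end of claim 1 can be illustrated with a minimal sketch. It assumes the mapping table is a lookup keyed by a (source language, target language) pair and that a phoneme label string is a list of labels; the names `MAPPING_TABLE`, `UserDictionary`, and `switch_language`, as well as the sample labels, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field

# Hypothetical mapping table: (source language, target language) -> {source label: target label}.
# The labels are made-up placeholders, not values from the patent.
MAPPING_TABLE = {
    ("en", "de"): {"#": "#", "m": "m", "ai": "aI", "k": "k"},
}


@dataclass
class UserDictionary:
    # Language in which the entries were registered (the role of the "language storage").
    language: str
    # Each entry is one phoneme label string, stored as a list of labels.
    entries: list = field(default_factory=list)

    def register(self, labels):
        """Register a phoneme label string created from an inputted voice."""
        self.entries.append(list(labels))


def switch_language(user_dict, new_language, mapping_table=MAPPING_TABLE):
    """Return a new user dictionary whose entries are expressed in new_language."""
    if new_language == user_dict.language:
        return user_dict
    # Look up the label correspondence for this language pair.
    table = mapping_table[(user_dict.language, new_language)]
    converted = [[table[label] for label in entry] for entry in user_dict.entries]
    return UserDictionary(language=new_language, entries=converted)
```

In this sketch the registered entries are converted label by label, so the user's recorded voice does not have to be analyzed again after the language switch. A label missing from the table raises `KeyError`; how a real device would handle unmapped labels is not stated in the claim.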

2. A voice recognition device comprising:
  an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
  
  a user dictionary storage to store a user dictionary;
  
  a general dictionary storage to store a general dictionary having a vocabulary expressed by said acoustic standard patterns;
  
  a language storage to store language information;
  
  a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
  
  a processor to execute a program;
  
  a memory to store the program which, when executed by the processor, results in performance of steps comprising:
  
  performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
  
  comparing the time series of acoustic features with the acoustic standard patterns stored in said acoustic standard pattern storage to create a phoneme label string of said inputted voice;
  
  registering said phoneme label string in the user dictionary;
  
  storing a first language of the phoneme label string which is registered in said user dictionary into the language storage;
  
  switching from the first language to a second language;
  
  referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language shown by the information stored in said language storage into a phoneme label string expressed in the second language when the first language has been switched to the second language;
  
  comparing the phoneme label string of said inputted voice with said general dictionary and said user dictionary to specify a word which is most similar to the phoneme label string of said inputted voice from said general dictionary and said user dictionary; and
  
  outputting the specified word as a voice recognition result.
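
The final two steps of claim 2 (finding the most similar word and outputting it) can be sketched as follows. The claim does not say how similarity is measured; this example assumes Levenshtein edit distance over phoneme labels and assumes both dictionaries map a word to its phoneme label string. The function names and sample entries are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            # Deletion, insertion, or substitution, whichever is cheapest.
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def recognize(labels, general_dict, user_dict):
    """Return the word whose phoneme label string is closest to `labels`.

    Both dictionaries are assumed to map word -> list of phoneme labels.
    User entries override general entries that share the same word.
    """
    candidates = {**general_dict, **user_dict}
    return min(candidates, key=lambda word: edit_distance(labels, candidates[word]))
```

For instance, with a general dictionary containing `"call"` and `"stop"` and a user dictionary containing `"mike"`, an input that differs from the `"mike"` entry by one label is still matched to `"mike"`.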

3. A voice synthesizer comprising:
  an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
  
  a user dictionary storage to store a user dictionary;
  
  a general dictionary storage to store a general dictionary having a vocabulary expressed by said acoustic standard patterns;
  
  a language storage to store language information;
  
  a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
  
  a processor to execute a program;
  
  a memory to store the program which, when executed by the processor, results in performance of steps comprising:
  
  performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
  
  comparing the time series of acoustic features with the acoustic standard patterns stored in said acoustic standard pattern storage to create a phoneme label string of said inputted voice;
  
  registering said phoneme label string in the user dictionary;
  
  storing a first language of the phoneme label string which is registered in said user dictionary into the language storage;
  
  switching from the first language to a second language;
  
  referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language shown by the information stored in said language storage into a phoneme label string expressed in the second language when the first language has been switched to the second language;
  
  inputting a text;
  
  detecting a word part corresponding to the phoneme label string registered in said user dictionary from a character string of the inputted text;
  
  replacing said word part with the phoneme label string acquired from said user dictionary and corresponding to said word part;
  
  replacing a part of the character string of said text other than said word part with a phoneme label string of a corresponding word in said general dictionary; and
  
  creating a synthetic voice of said text from the phoneme label strings of said text.
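
The text-handling steps of claim 3 can be sketched as a small front end that turns input text into phoneme labels. This is an illustration under simplifying assumptions: words are separated by whitespace, user dictionary entries are keyed by the text that stands for them, and the waveform generation step is left out. The function name and sample entries are hypothetical.

```python
def text_to_phoneme_labels(text, user_dict, general_dict):
    """Replace each word of `text` with its phoneme label string.

    Words found in the user dictionary take their labels from there;
    all other words take their labels from the general dictionary.
    Both dictionaries are assumed to map word -> list of phoneme labels.
    """
    labels = []
    for word in text.split():
        if word in user_dict:
            # Word part registered by the user.
            labels.extend(user_dict[word])
        elif word in general_dict:
            # Remaining text is covered by the general vocabulary.
            labels.extend(general_dict[word])
        else:
            raise KeyError(f"no pronunciation for {word!r}")
    return labels
```

The resulting label sequence would then be handed to whatever component creates the synthetic voice, which this sketch does not model.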

4. A recognition dictionary creating device comprising:
  an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
  
  a user dictionary storage to store a user dictionary;
  
  a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
  
  a processor to execute a program;
  
  a memory to store the program which, when executed by the processor, results in performance of steps comprising:
  
  performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
  
  selecting acoustic standard patterns for a preset language from among the acoustic standard patterns stored in said acoustic standard pattern storage;
  
  comparing the time series of acoustic features with the acoustic standard patterns for the language which are selected by said step of selecting acoustic standard patterns to create a phoneme label string of said inputted voice;
  
  registering said phoneme label string in the user dictionary;
  
  switching from a first language to a second language; and
  
  referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language selected by said step of selecting acoustic standard patterns into a phoneme label string expressed in the second language when the first language has been switched to the second language.
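
The selecting and comparing steps that distinguish claim 4 can be sketched with a toy model. It assumes each acoustic standard pattern is a single reference feature vector per phoneme label and that each frame is labeled by its nearest pattern, with consecutive repeats collapsed. A real recognizer would more likely use statistical models; `STANDARD_PATTERNS`, the function names, and all numeric values are hypothetical.

```python
import math

# Hypothetical acoustic standard patterns: language -> {phoneme label: reference feature vector}.
STANDARD_PATTERNS = {
    "en": {"a": (1.0, 0.0), "k": (0.0, 1.0)},
    "ja": {"a": (0.9, 0.1), "k": (0.1, 0.9), "N": (0.5, 0.5)},
}


def select_patterns(language, patterns=STANDARD_PATTERNS):
    """Select the acoustic standard patterns for the preset language."""
    return patterns[language]


def create_phoneme_label_string(features, patterns):
    """Label each frame with its nearest pattern and collapse consecutive repeats."""
    labels = []
    for frame in features:
        best = min(patterns, key=lambda label: math.dist(frame, patterns[label]))
        if not labels or labels[-1] != best:
            labels.append(best)
    return labels
```

Because the label set depends on the selected language, the same time series of acoustic features can yield different phoneme label strings under different presets.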

5. A voice recognition device comprising:
  an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
  
  a user dictionary storage to store a user dictionary;
  
  a general dictionary storage to store a general dictionary having a vocabulary expressed by said acoustic standard patterns;
  
  a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
  
  a processor to execute a program;
  
  a memory to store the program which, when executed by the processor, results in performance of steps comprising:
  
  performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
  
  selecting acoustic standard patterns for a preset language from among the acoustic standard patterns stored in said acoustic standard pattern storage;
  
  comparing the time series of acoustic features with the acoustic standard patterns for the language which are selected by said step of selecting acoustic standard patterns to create a phoneme label string of said inputted voice;
  
  registering said phoneme label string in the user dictionary;
  
  switching from a first language to a second language;
  
  referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language selected by said step of selecting acoustic standard patterns into a phoneme label string expressed in the second language when the first language has been switched to the second language;
  
  comparing the phoneme label string of said inputted voice with said general dictionary and said user dictionary to specify a word which is most similar to the phoneme label string of said inputted voice from said general dictionary and said user dictionary; and
  
  outputting the specified word as a voice recognition result.

6. A voice synthesizer comprising:
  an acoustic standard pattern storage to store acoustic standard patterns showing standard acoustic features for each language;
  
  a user dictionary storage to store a user dictionary;
  
  a general dictionary storage to store a general dictionary having a vocabulary expressed by said acoustic standard patterns;
  
  a mapping table storage to store a mapping table in which a correspondence between phoneme labels in different languages is defined;
  
  a processor to execute a program;
  
  a memory to store the program which, when executed by the processor, results in performance of steps comprising:
  
  performing an acoustic analysis on a voice signal of an inputted voice to output a time series of acoustic features;
  
  selecting acoustic standard patterns for a preset language from among the acoustic standard patterns stored in said acoustic standard pattern storage;
  
  comparing the time series of acoustic features with the acoustic standard patterns for the language which are selected by said step of selecting acoustic standard patterns to create a phoneme label string of said inputted voice;
  
  registering said phoneme label string in the user dictionary;
  
  switching from a first language to a second language;
  
  referring to the mapping table stored in said mapping table storage to convert the phoneme label string registered in said user dictionary and expressed in the language selected by said step of selecting acoustic standard patterns into a phoneme label string expressed in the second language when the first language has been switched to the second language;
  
  inputting a text;
  
  detecting a word part corresponding to the phoneme label string registered in said user dictionary from a character string of the inputted text;
  
  replacing said word part with the phoneme label string acquired from said user dictionary and corresponding to said word part;
  
  replacing a part of the character string of said text other than said word part with a phoneme label string of a corresponding word in said general dictionary; and
  
  creating a synthetic voice of said text from the phoneme label strings of said text.


Current Assignee: Mitsubishi Electric Corporation
Original Assignee: Mitsubishi Electric Corporation
Inventors: Maruta, Yuzo
Primary Examiner(s): He, Jialong
Application Number: US13/500,855
Publication Number: US 20120203553A1
Time in Patent Office: 2,111 Days
Field of Search: 704/257, 704/258, 704/260, 704/270
US Class Current: 1/1
CPC Class Codes:
C01G 41/00   Compounds of tungsten
C01P 2006/80   Compositional purity
G10L 13/08   Text analysis or generation...
G10L 15/06   Creation of reference templ...
G10L 15/187   Phonemic context, e.g. pron...
