Automated generation of phonemic lexicon for voice activated cockpit management systems
First Claim
1. A system for processing speech recognition through the use of allophones and allophone recognition techniques, comprising:
an allophone candidate selecting unit, wherein the allophone candidate selecting unit repeats processing of adding other allophone characters to a certain allophone character string contained in the input text character by character at the front-end or the tail-end of the certain character string, until an optimization score in the input text of an allophone character string obtained by such addition is reached, and selects the allophone character string before the addition as the allophone candidate character string, and acquiring from an input text and an input speech, a set of an allophone character string and a pronunciation thereof which should be recognized as a word, a word in a sentence, or a sentence in a procedure;
a candidate selecting unit comprising one or more processors executing stored program instructions for selecting, from input text, at least one allophone candidate character string which is a candidate to be recognized as a word;
a pronunciation generating unit comprising one or more processors executing stored program instructions for generating at least one allophone pronunciation candidate of each of the selected allophone candidate character strings by combining pronunciations of all allophone characters contained in the selected allophone candidate character string, while one or more pronunciations are predetermined for each of the allophone characters;
a confidence score generating unit comprising one or more processors executing stored program instructions for generating confidence score data indicating a confidence score for recognition of the respective sets each consisting of an allophone character string indicating a word and a pronunciation thereof, the confidence score generated by combining data in which the generated allophone pronunciation candidates are respectively associated with the allophone character strings, with language model data prepared by previously recording numerical values based on an accuracy score at which respective allophones and their words appear in the text;
a speech recognizing unit comprising one or more processors executing stored program instructions for performing, based on the generated confidence score data, speech recognition on the input speech to generate recognition data in which allophone character strings respectively indicating plural words contained in the input speech are associated with pronunciations.
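The character-by-character extension recited for the allophone candidate selecting unit can be illustrated with a minimal sketch. The claim does not define the score, so this example assumes it is the non-overlapping occurrence count of the string in the input text, and assumes extension stops once no one-character addition keeps that count at or above a threshold. The names select_candidate and min_count are hypothetical and do not come from the patent.

```python
def select_candidate(text: str, start: int, end: int, min_count: int = 2) -> str:
    """Grow text[start:end] one character at a time at the front or the tail.

    Assumption: the "score" is str.count (non-overlapping occurrences) of the
    string in the text. Growth stops when no one-character extension reaches
    min_count, and the string held before that failing addition is returned.
    """
    while True:
        extensions = []
        if start > 0:
            extensions.append((start - 1, end))   # add a character at the front-end
        if end < len(text):
            extensions.append((start, end + 1))   # add a character at the tail-end

        # Score each extension and keep those that still meet the threshold.
        scored = [(text.count(text[s:e]), s, e) for s, e in extensions]
        viable = [item for item in scored if item[0] >= min_count]
        if not viable:
            return text[start:end]

        # Continue with the best-scoring extension (ties favour the front-end).
        _, start, end = max(viable, key=lambda item: item[0])


candidate = select_candidate("xabcyabczab", 2, 3, min_count=2)
```

Under these assumptions, the seed "b" at index 2 of "xabcyabczab" grows to "abc": both "xabc" and "abcy" occur only once, so the string before that addition is selected.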
Abstract
A system, method and program are disclosed for acquiring a character string set from an input text and generating the pronunciation thereof which should be recognized as a word. The system selects, from an input text, plural candidate character strings which are phonemic character candidates or allophones to be recognized as a word; generates plural pronunciation candidates of the selected candidate character string and outputs the optimum pronunciation candidate to be recognized as a word; generates a phonemic dictionary by combining data in which the pronunciation candidate with optimal recognition is respectively associated with the character strings; generates recognition data in which character strings respectively indicating plural words contained in the input speech are associated with pronunciations; and outputs a combination contained in the recognition data, out of combinations each consisting of one of the candidate character strings and one of the pronunciation candidates with the optimum recognition.
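The step of generating plural pronunciation candidates and outputting the optimum one can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the per-character pronunciation table, the function names, and the caller-supplied scoring function are all hypothetical, and the combination is modelled as a Cartesian product of the predetermined pronunciations of each character.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Pronunciation = Tuple[str, ...]

# Hypothetical table: each character has one or more predetermined pronunciations.
CHAR_PRONUNCIATIONS: Dict[str, List[str]] = {
    "c": ["k", "s"],
    "a": ["ae"],
    "t": ["t"],
}


def pronunciation_candidates(
    chars: str, table: Dict[str, List[str]]
) -> List[Pronunciation]:
    """Return every combination of the per-character pronunciations."""
    return list(product(*(table[ch] for ch in chars)))


def best_pronunciation(
    chars: str,
    table: Dict[str, List[str]],
    score: Callable[[Pronunciation], float],
) -> Pronunciation:
    """Return the candidate with the highest score (score is supplied by the caller)."""
    return max(pronunciation_candidates(chars, table), key=score)


# Hypothetical scores standing in for the confidence score data.
example_scores = {("k", "ae", "t"): 0.9, ("s", "ae", "t"): 0.1}
candidates = pronunciation_candidates("cat", CHAR_PRONUNCIATIONS)
best = best_pronunciation(
    "cat", CHAR_PRONUNCIATIONS, lambda p: example_scores.get(p, 0.0)
)
```

Here the string "cat" yields two candidates, ("k", "ae", "t") and ("s", "ae", "t"), and the made-up scores select the first. In the claimed system the score would instead be derived from the language model data.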
8 Claims
Specification