Automated generation of phonemic lexicon for voice activated cockpit management systems

US 9,135,911 B2
Filed: 09/26/2014
Issued: 09/15/2015
Est. Priority Date: 02/07/2014
Status: Expired due to Fees

Abstract

A system, method and program are disclosed for acquiring from an input text a set of character strings, each of which should be recognized as a word, and generating the pronunciation of each. The system selects from the input text plural candidate character strings, which are phonemic character candidates or allophones to be recognized as a word; generates plural pronunciation candidates for each selected candidate character string and outputs the optimum pronunciation candidate to be recognized as a word; generates a phonemic dictionary by combining data in which the pronunciation candidates with optimal recognition are respectively associated with the character strings; generates recognition data in which character strings respectively indicating plural words contained in the input speech are associated with pronunciations; and outputs a combination contained in the recognition data, out of combinations each consisting of one of the candidate character strings and one of the pronunciation candidates with the optimum recognition.
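The pronunciation-candidate step can be pictured with a short sketch. The claims describe it as combining the predetermined pronunciations of every character in a candidate string, which amounts to a Cartesian product over per-character readings. The table `CHAR_PRONUNCIATIONS`, its entries, and the function name `pronunciation_candidates` are hypothetical and chosen only for illustration; the patent does not specify a data format.

```python
from itertools import product

# Hypothetical per-character pronunciation table (illustrative entries only).
CHAR_PRONUNCIATIONS = {
    "F": ["eh f", "f"],
    "M": ["eh m"],
    "S": ["eh s", "s"],
}

def pronunciation_candidates(candidate, table):
    """Return every pronunciation formed by choosing one reading per character."""
    # One list of readings per character, in string order.
    per_char = [table[ch] for ch in candidate]
    # The Cartesian product enumerates each combination of readings.
    return [" ".join(parts) for parts in product(*per_char)]

candidates = pronunciation_candidates("FMS", CHAR_PRONUNCIATIONS)
for c in candidates:
    print(c)
```

With two readings for "F", one for "M", and two for "S", the sketch yields four candidates. A character missing from the table raises `KeyError` here; how the patented system handles that case is not stated.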

8 Claims

1. A system for processing speech recognition through the use of allophones and allophone recognition techniques, comprising:
- an allophone candidate selecting unit, wherein the allophone candidate selecting unit repeats processing of adding other allophone characters to a certain allophone character string contained in the input text character by character at the front-end or the tail-end of the certain character string, until an optimization score in the input text of an allophone character string obtained by such addition is reached, and selects the allophone character string before the addition as the allophone candidate character string, and acquiring from an input text and an input speech, a set of an allophone character string and a pronunciation thereof which should be recognized as a word, a word in a sentence, or a sentence in a procedure;
  
  a candidate selecting unit comprising one or more processors executing stored program instructions for selecting, from input text, at least one allophone candidate character string which is a candidate to be recognized as a word;
  
  a pronunciation generating unit comprising one or more processors executing stored program instructions for generating at least one allophone pronunciation candidate of each of the selected allophone candidate character strings by combining pronunciations of all allophone characters contained in the selected allophone candidate character string, while one or more pronunciations are predetermined for each of the allophone characters;
  
  a confidence score generating unit comprising one or more processors executing stored program instructions for generating confidence score data indicating a confidence score for recognition of the respective sets each consisting of an allophone character string indicating a word and a pronunciation thereof, the confidence score generated by combining data in which the generated allophone pronunciation candidates are respectively associated with the allophone character strings, with language model data prepared by previously recording numerical values based on an accuracy score at which respective allophones and their words appear in the text;
  
  a speech recognizing unit comprising one or more processors executing stored program instructions for performing, based on the generated confidence score data, speech recognition on the input speech to generate recognition data in which allophone character strings respectively indicating plural words contained in the input speech are associated with pronunciations.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8)
  - 2. The system according to claim 1, wherein said score generating unit generates said language model data by calculating confidence scores at which said respective allophone candidate character strings appear in said input text and then by calculating, based on said confidence scores, accuracies at which said respective allophone candidate character strings appear;
    - and generates said confidence score data by combining the generated language model data with data in which each of said pronunciation candidates is associated with one of the allophone character strings.
  - 3. The system according to claim 1, wherein the confidence score generating unit calculates and thus generates, as the language model data, an accuracy score for each set of at least two consecutive allophone character strings, the accuracy score indicating the frequency with which each set of the consecutive allophone candidate character strings appears in an input text.
  - 4. The system according to claim 1, wherein the score generating unit generates score data by:
    - selecting sets each consisting of at least two consecutive words from a group of words containing known allophones, the known allophones indicating a certain allophone character string unrecognizable as a word;
      
      acquiring the language model data having numerical values recorded therein, each numerical value indicating the accuracy at which each of the selected sets of consecutive words appears in a text; and
      
      associating each of the candidate character strings with the known phonemic symbol.
  - 5. The system according to claim 1, wherein the pronunciation generating unit generates a plurality of pronunciation candidates for each of the allophone character strings by:
    - retrieving one or more pronunciations of each of allophone characters contained in the allophone candidate character string, from a pronunciation dictionary in which each allophone character is associated with one or more pronunciations; and
      
      combining together the retrieved pronunciations.
  - 6. The system according to claim 1, wherein an outputting unit outputs a combination of one of the allophone candidate character strings and one of the pronunciation candidates contained in the recognition data, on condition that the combination appears in the recognition data not less than a predetermined criterial number of times.
  - 7. The system according to claim 1, wherein the speech recognizing unit selects one of the combinations consisting of a set of pronunciations agreeing with the input speech and a set of allophone character strings corresponding to the set of the pronunciations, the selected combination consisting of pronunciations and allophone character strings whose optimization score and confidence score have the largest product among those of the other combinations;
    - and the outputting unit further selects and outputs some of the allophone candidate character strings and some of the pronunciation candidates, the selected allophone candidate character strings and pronunciation candidates included in a predetermined criterial number of combinations of allophone character strings and pronunciations whose optimization score and confidence score have the predetermined criterial number of the largest products, the confidence scores calculated by the speech recognizing unit.
  - 8. The system according to claim 1, wherein the input text and the input speech have the contents indicating a common event belonging to a predetermined field;
    - and the outputting unit outputs one or more combinations among the combinations each consisting of one of the allophone candidate character strings and one of the pronunciation candidates, the outputted combinations being those contained in the recognition data, and then registers the outputted combinations in a dictionary used in speech processing in the predetermined field.
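
The candidate-selection loop in claim 1 (grow a string one character at a time at its front or tail end, then keep the string as it stood before the last addition) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the "optimization score" is assumed here to be the number of occurrences of the string in the input text, and growth is assumed to stop once every one-character extension falls below a threshold `min_count`. The names `count_occurrences`, `select_candidate`, and `min_count` are hypothetical.

```python
def count_occurrences(text, s):
    """Count (possibly overlapping) occurrences of s in text."""
    return sum(1 for i in range(len(text) - len(s) + 1) if text.startswith(s, i))

def select_candidate(text, start, end, min_count):
    """Grow text[start:end] by one character at a time, front or tail.

    Assumption: the score of a string is its occurrence count in the text,
    and growth stops when the best extension scores below min_count.
    The string as it stood before that extension is returned.
    """
    while True:
        options = []
        if end < len(text):  # try adding the next character at the tail end
            options.append((count_occurrences(text, text[start:end + 1]), start, end + 1))
        if start > 0:  # try adding the previous character at the front end
            options.append((count_occurrences(text, text[start - 1:end]), start - 1, end))
        if not options:
            break
        best = max(options)  # highest-scoring extension
        if best[0] < min_count:
            break  # keep the string before the addition
        _, start, end = best
    return text[start:end]

# Unsegmented sample text; "flaps" occurs three times in different contexts.
text = "setflapsten.checkflapsnow.flapsup"
print(select_candidate(text, 4, 7, min_count=3))  # seed is "lap"
```

Starting from the seed "lap", the sketch extends to "laps" and then "flaps", and stops there because every longer string occurs only once in the sample text.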

Current Assignee: Nexgen Flight LLC
Original Assignee: Doinita Diane Serban, Nexgen Flight LLC
Inventors: Serban, Doinita; Raigaga, Bhupat
Primary Examiner(s): COLUCCI, MICHAEL C
Application Number: US14/498,897
Publication Number: US 20150228273A1
Time in Patent Office: 354 Days
Field of Search: 704/270, 704/277, 704/267, 704/260, 704/256.2, 704/256, 704/254, 704/251, 704/244, 704/242, 704/232, 704/200, 704/1, 382/159
US Class Current: 1/1
CPC Class Codes:
G10L 15/02   Feature extraction for spee...
G10L 15/14   using statistical models, e...
G10L 15/18   using natural language mode...
G10L 15/187   Phonemic context, e.g. pron...
G10L 17/22   Interactive procedures; Man...
