Speech recognition method

US 4,713,778 A
Filed: 03/27/1984
Issued: 12/15/1987
Est. Priority Date: 03/27/1984
Status: Expired due to Term

Abstract

A speech recognition method and apparatus employ speech processing circuitry for repetitively deriving from a speech input, at a frame repetition rate, a plurality of acoustic parameters. The acoustic parameters represent the speech input signal for a frame time. A plurality of template matching and cost processing circuits are connected to a system bus, along with the speech processing circuitry, for identifying the speech units in the input speech by comparing the acoustic parameters with stored template patterns. The apparatus can be expanded by adding more template matching and cost processing circuitry to the bus, thereby increasing the speech recognition capacity of the apparatus. The template matching and cost processing circuits provide distributed processing, on demand, of the acoustic parameters for generating the recognition decision through a dynamic programming technique. Grammar graphs, having a plurality of nodes, are employed for representing both sequences of speech keywords and the speech components which form a keyword. The grammar graphs are software interchangeable and can be advantageously employed together with dynamic programming methods.
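
The "software interchangeable" grammar graphs described in the abstract can be pictured as plain data held in memory and replaced at run time. The Python sketch below is illustrative only and is not taken from the patent: the one-arc-per-line text format, the `parse_grammar` helper, the `GrammarStore` class, and the two sample grammars are all assumptions made for the example.

```python
def parse_grammar(source):
    """Parse lines of the form 'src dst label' into a list of arcs.

    The text format is a hypothetical stand-in for a grammar
    "source representation"; blank lines and '#' comments are skipped.
    """
    arcs = []
    for line in source.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst, label = line.split()
        arcs.append((src, dst, label))
    return arcs


class GrammarStore:
    """Holds the currently active grammar graph in changeable memory."""

    def __init__(self, source):
        self.load(source)

    def load(self, source):
        # Replacing the stored source swaps the whole grammar graph.
        self.source = source
        self.arcs = parse_grammar(source)

    def successors(self, node):
        # Arcs leaving `node`, as (destination, label) pairs.
        return [(dst, label) for src, dst, label in self.arcs if src == node]


# Two hypothetical grammars sharing the same node names.
DIGITS = """
begin mid one
begin mid two
mid end stop
"""

COMMANDS = """
begin mid open
begin mid close
mid end door
"""

store = GrammarStore(DIGITS)
first = store.successors("begin")    # arcs allowed by the first grammar
store.load(COMMANDS)                 # swap in the second grammar
second = store.successors("begin")   # arcs allowed by the second grammar
```

Because the grammar is ordinary data in this sketch, editing it means editing text and calling `load` again; no recognizer code changes.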

3 Claims

1. In a speech recognition apparatus wherein speech units are each characterized by a sequence of template patterns, and having
    means for processing a speech input signal for repetitively deriving therefrom, at a frame repetition rate, a plurality of speech recognition acoustic parameters, and
    means responsive to said acoustic parameters
        for generating likelihood costs between said acoustic parameters and said speech template patterns, and
        for processing said likelihood costs for determining the speech units in said speech input signal,
a method of template matching and cost processing for recognizing the correspondence of said speech input signal and said template patterns, said method comprising the steps of
    characterizing the allowable possible sequences of speech units as a grammar graph structure having a beginning node, an ending node and a plurality of intermediate nodes, all said nodes being connected by grammar arcs to at least one other node,
    initializing each said node with a high cumulative likelihood cost designating a bad score,
    generating likelihood costs representing the similarity of said acoustic parameters and selected ones of said template patterns,
    associating with each said node, at each frame time, a cumulative score corresponding to an accumulated template likelihood score in reaching said node, and
    generating a recognition decision when said cumulative score associated with the ending node is better than the cumulative score associated with any other node,
    storing a source representation of said grammar graph in a changeable memory of said responsive means,
    replacing said memory data with a representation of a second grammar graph, and
    generating a speech recognition decision based upon the second grammar graph,
whereby said grammar source representing is software interchangeable and can be edited.

2. The method of claim 1 further comprising the steps of
    generating said template patterns by
        characterizing an utterance as a grammar graph structure having a beginning node, an ending node, and a plurality of intermediate nodes, each node being connected by arcs to at least one other node, said arcs representing successive portions of said utterance, and
        generating said template patterns using dynamic programming and said utterance characterizing grammar graph.

3. In a speech recognition apparatus wherein speech units are each characterized by a sequence of template patterns, and having
    means for processing a speech input signal for repetitively deriving therefrom, at a frame repetition rate, a plurality of speech recognition acoustic parameters, and
    means responsive to said acoustic parameters
        for generating likelihood costs between said acoustic parameters and said speech template patterns, and
        for processing said likelihood costs for determining the speech units in said speech input signal,
a method of template matching and cost processing for recognizing the correspondence of said speech input signal and said template patterns, said method comprising the steps of
    characterizing the allowable possible sequences of speech units as a grammar graph structure having a beginning node, an ending node and a plurality of intermediate nodes, all said nodes being connected by grammar arcs to at least one other node,
    initializing each said node with a high cumulative likelihood cost designating a bad score,
    generating likelihood costs representing the similarity of said acoustic parameters and selected ones of said template patterns,
    associating with each said node, at each frame time, a cumulative score corresponding to an accumulated template likelihood score in reaching said node, and
    generating a recognition decision when said cumulative score associated with the ending node is better than the cumulative score associated with any other node,
    storing a source representation of said grammar graph in a changeable memory of said responsive means,
    replacing said memory data with a representation of a second grammar graph, and
    generating a speech recognition decision based upon the second grammar graph,
    generating said template patterns by
        characterizing an utterance as a grammar graph structure having a beginning node, an ending node, and a plurality of intermediate nodes, each node being connected by arcs to at least one other node, said arcs representing successive portions of said utterance, and
        generating said template patterns using dynamic programming and said utterance characterizing grammar graph, and
wherein said grammar characterizations are software interchangeable and can be edited.
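
The scoring steps recited in the claims (bad-score initialization, per-frame cumulative scores at each node, and a decision once the ending node scores best) can be sketched as a small dynamic programming loop. This is a simplified illustration under stated assumptions, not the patented implementation: each arc is assumed to consume exactly one frame and carry one template, lower cost is assumed to mean a better score, and the `run_grammar` function, node names, templates, and cost values are all hypothetical.

```python
import math

BAD_SCORE = math.inf  # assumed convention: a high cost designates a bad score


def run_grammar(arcs, start, end, frame_costs):
    """Return (frame_index, score) for the first frame at which the ending
    node has a strictly better (lower) cumulative cost than every other
    node, or None if that never happens.

    arcs:        list of (src_node, dst_node, template_id)
    frame_costs: one dict per frame, mapping template_id -> likelihood cost
    """
    nodes = {n for src, dst, _ in arcs for n in (src, dst)} | {start, end}
    score = {n: BAD_SCORE for n in nodes}   # initialize every node as bad
    score[start] = 0.0                      # only the beginning node is live

    for t, costs in enumerate(frame_costs):
        new_score = {n: BAD_SCORE for n in nodes}
        for src, dst, template in arcs:
            # Cumulative cost of reaching dst through this arc at frame t.
            candidate = score[src] + costs[template]
            if candidate < new_score[dst]:
                new_score[dst] = candidate
        score = new_score
        if all(score[end] < s for n, s in score.items() if n != end):
            return t, score[end]
    return None


# Hypothetical first grammar: "yes" (templates y, es) or "no" (templates n, o).
YES_NO = [
    ("S", "Y1", "y"), ("Y1", "Y1", "y"), ("Y1", "E", "es"),
    ("S", "N1", "n"), ("N1", "N1", "n"), ("N1", "E", "o"),
]
# Hypothetical second grammar that only allows "no".
NO_ONLY = [("S", "N1", "n"), ("N1", "N1", "n"), ("N1", "E", "o")]

# Made-up likelihood costs for three frames of an utterance resembling "yes".
FRAMES = [
    {"y": 1, "es": 9, "n": 8, "o": 9},
    {"y": 1, "es": 9, "n": 8, "o": 9},
    {"y": 9, "es": 1, "n": 9, "o": 8},
]

decision_first = run_grammar(YES_NO, "S", "E", FRAMES)    # -> (2, 3.0)
decision_second = run_grammar(NO_ONLY, "S", "E", FRAMES)  # -> (2, 24.0)
```

Passing a different arc list stands in for replacing the stored grammar: the same loop yields a decision based on the second graph, here with a much worse score because the input frames match it poorly.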

Current Assignee: Voice Industries Corporation
Original Assignee: Exxon Research and Engineering Company (Exxon Mobil Corporation)
Inventors: Baker, James K.
Primary Examiner(s): Kemeny, E. S. Matt
Application Number: US06/593,895
Time in Patent Office: 1,358 Days
Field of Search: 381/41-43, 365/513.5
US Class Current: 704/254
CPC Class Codes:
G10L 15/12 using dynamic programming t...
G10L 15/193 Formal grammars, e.g. finit...
