Telephony channel simulator for speech recognition application

US 5,475,792 A
Filed: 02/24/1994
Issued: 12/12/1995
Est. Priority Date: 09/21/1992
Status: Expired due to Fees

Abstract

A telephony channel simulation process is disclosed for training a speech recognizer to respond to speech obtained from telephone systems. An input speech data set, whose bandwidth is higher than a telephone bandwidth, is provided to a speech recognition training processor. The process performs a series of alterations to the input speech data set (decimation, bandpass filtering, amplitude rescaling, and the addition of companding quantization noise) to obtain a modified speech data set. Training on the modified speech data set enables the speech recognition processor to perform speech recognition on voice signals from a telephone system.
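The decimation and bandpass filtering steps recited in claim 1 can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the 16 kHz to 8 kHz rate change, the 300 to 3400 Hz passband, the tap counts, and all function names. The filters are Hamming-windowed sinc FIR designs, used as a simple stand-in; they are not the maximally flat design mentioned in claim 3.

```python
import math

def windowed_sinc_lowpass(cutoff, num_taps):
    """Hamming-windowed sinc low-pass FIR taps.

    cutoff is a fraction of the sample rate (0 < cutoff < 0.5).
    Illustrative stand-in only, not the patent's filter design.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC

def fir_filter(signal, taps):
    """Convolve signal with taps, same-length output, zero-padded edges."""
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out

def decimate(signal, factor, num_taps=63):
    """Anti-alias low-pass filter, then keep every factor-th sample."""
    # Cutoff slightly below the new Nyquist frequency (assumed 10% margin).
    taps = windowed_sinc_lowpass(0.9 * 0.5 / factor, num_taps)
    return fir_filter(signal, taps)[::factor]

def bandpass_taps(low_hz, high_hz, rate_hz, num_taps=101):
    """Band-pass FIR built as the difference of two low-pass filters."""
    hi = windowed_sinc_lowpass(high_hz / rate_hz, num_taps)
    lo = windowed_sinc_lowpass(low_hz / rate_hz, num_taps)
    return [h - l for h, l in zip(hi, lo)]

# Usage: a 1 kHz tone sampled at an assumed 16 kHz is reduced to 8 kHz,
# then passed through an assumed 300-3400 Hz telephone-style band.
wideband = [math.sin(2 * math.pi * 1000 * i / 16000) for i in range(1600)]
narrowband = decimate(wideband, 2)
filtered = fir_filter(narrowband, bandpass_taps(300, 3400, 8000))
```

Building the bandpass as a difference of two unity-gain low-pass filters forces zero gain at DC, which keeps the sketch short; a real channel model would be fitted to measured telephone equipment characteristics.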

239 Citations

7 Claims

1. A method for training a speech recognition processor to respond to speech obtained from telephone systems, comprising the steps of:
- inputting a speech data set to a speech recognition training processor, said data set having a bandwidth higher than a telephone bandwidth;
- decimating said inputted speech data set in said training processor to obtain a decimated speech data set having said telephone bandwidth;
- applying a bandpass digital filter to said decimated speech data set in said training processor, said filter characterizing transmission characteristics of telephone equipment, for obtaining a filtered speech data set;
- rescaling the amplitude of said filtered speech data set in said training processor, so that the maximum dynamic range of said filtered speech data set matches the maximum dynamic range of uncompanded telephone speech, to obtain a rescaled speech data set;
- modifying said rescaled speech data set in said training processor, with quantization noise representing companding and uncompanding a speech signal in a telephone system, to obtain a modified speech data set;
- inputting said modified speech data set into a hidden Markov model speech recognition processor to train statistical pattern matching data units;
- performing speech recognition on voice signals from a telephone system with said speech recognition processor.
2. The method of claim 1 wherein: said telephone bandwidth is any bandwidth lower than said higher bandwidth.
3. The method of claim 1 which further comprises: said bandpass digital filter has a maximally flat design algorithm.
4. The method of claim 1 wherein said rescaling step results in a maximum dynamic range matching a maximum dynamic range of uncompanded mu-law telephone speech.
5. The method of claim 1 wherein said rescaling step results in a maximum dynamic range matching a maximum dynamic range of uncompanded A-law telephone speech.
6. The method of claim 1 wherein said modifying step has quantization noise as mu-law noise.
7. The method of claim 1 wherein said modifying step has quantization noise as A-law noise.
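
The rescaling and quantization-noise steps of claims 1 and 4 through 7 can be sketched in Python as follows. This is a hedged illustration, not the patent's implementation: it uses the continuous mu-law (mu = 255) and A-law (A = 87.6) companding curves with a uniform 8-bit quantizer rather than the segmented encoding used by real telephone codecs, and the function names, the `full_scale` parameter, and the example value 8031.0 are all assumptions chosen for the example.

```python
import math

MU = 255.0   # mu-law compression parameter
A = 87.6     # A-law compression parameter

def rescale(samples, full_scale):
    """Scale samples so the largest magnitude equals full_scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    gain = full_scale / peak
    return [s * gain for s in samples]

def mu_compress(x):
    """Continuous mu-law compression of x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def a_compress(x):
    """Continuous A-law compression of x in [-1, 1]."""
    ax, denom = abs(x), 1 + math.log(A)
    y = A * ax / denom if ax < 1 / A else (1 + math.log(A * ax)) / denom
    return math.copysign(y, x)

def a_expand(y):
    """Inverse of a_compress."""
    ay, denom = abs(y), 1 + math.log(A)
    x = ay * denom / A if ay < 1 / denom else math.exp(ay * denom - 1) / A
    return math.copysign(x, y)

def quantize(y, bits=8):
    """Uniform mid-rise quantizer for y in [-1, 1] with 2**bits levels."""
    levels = 2 ** (bits - 1)
    q = min(math.floor(y * levels), levels - 1)
    return (q + 0.5) / levels

def companding_noise(samples, full_scale, law="mu"):
    """Compress, quantize, and expand each sample.

    The difference between output and input is the quantization noise
    that a companded telephone link would introduce in this model.
    """
    compress, expand = (mu_compress, mu_expand) if law == "mu" else (a_compress, a_expand)
    return [full_scale * expand(quantize(compress(s / full_scale)))
            for s in samples]

# Usage: rescale a low-level tone to an assumed full-scale value, then
# pass it through the simulated compand/uncompand round trip.
tone = [0.3 * math.sin(2 * math.pi * i / 50) for i in range(200)]
scaled = rescale(tone, 8031.0)
mu_noisy = companding_noise(scaled, 8031.0, law="mu")
a_noisy = companding_noise(scaled, 8031.0, law="a")
```

Because the quantizer is uniform in the compressed domain, the resulting error grows with signal amplitude, which is the characteristic a recognizer trained on this data would be exposed to.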

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Stanford, Vince M.; Brickman, Norman F.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Doerrler, Michelle
Application Number: US08/201,157
Time in Patent Office: 656 Days
Field of Search: 381/29-45, 395/2.1, 395/2.2-2.25, 395/2.35-2.37, 395/2.4-2.65, 375/122, 375/25, 375/27
US Class Current: 704/233

CPC Class Codes
- G10L 15/063   Training
- G10L 15/142   Hidden Markov Models [HMMs]
- G10L 15/20   Speech recognition techniqu...
- G10L 2015/022   Demisyllables, biphones or ...
- G10L 2019/0005   Multi-stage vector quantisa...
- G10L 25/06   the extracted parameters be...
- G10L 25/24   the extracted parameters be...