Acoustic speech recognizer system and method

US 6,574,601 B1
Filed: 01/13/1999
Issued: 06/03/2003
Est. Priority Date: 01/13/1999
Status: Expired due to Term

Abstract

An adaptive endpointer system and method are used in speech recognition applications, such as telephone-based Internet browsers, to determine barge-in events during the processing of speech. The endpointer system includes a signal energy level estimator for estimating signal levels in speech data; a noise energy level estimator for estimating noise levels in the speech data; and a barge-in detector for increasing a threshold used in comparing the signal levels and the noise levels to detect the barge-in event in the speech data corresponding to a speech prompt during speech recognition.
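The comparison described in the abstract can be illustrated with a minimal sketch. It assumes a simple exponentially smoothed energy estimate for both signal and noise and a ratio threshold that is raised while a prompt is playing. The class name `BargeInDetector`, the smoothing constants, and the ratio values are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class BargeInDetector:
    """Hypothetical energy-based barge-in detector; every constant is an assumption."""
    base_ratio: float = 2.0     # signal/noise ratio required when no prompt is playing
    prompt_ratio: float = 4.0   # raised ratio used while a speech prompt is playing
    signal_alpha: float = 0.5   # fast smoothing for the signal energy estimate
    noise_alpha: float = 0.05   # slow smoothing for the noise energy estimate
    signal_level: float = 0.0
    noise_level: float = 0.0
    primed: bool = False

    def update(self, frame, prompt_playing):
        """Consume one frame of samples; return True if the frame looks like barge-in."""
        energy = sum(s * s for s in frame) / len(frame)
        if not self.primed:
            # Use the first frame to initialise both estimates.
            self.signal_level = self.noise_level = energy
            self.primed = True
            return False
        # Signal energy level estimator: tracks the input quickly.
        self.signal_level += self.signal_alpha * (energy - self.signal_level)
        # The threshold is increased while the prompt is playing.
        ratio = self.prompt_ratio if prompt_playing else self.base_ratio
        detected = self.signal_level > ratio * self.noise_level
        if not detected:
            # Noise energy level estimator: adapts slowly, and only on non-speech frames.
            self.noise_level += self.noise_alpha * (energy - self.noise_level)
        return detected
```

In this sketch a moderately loud frame can trigger detection when no prompt is playing yet be ignored during a prompt, because the required signal-to-noise ratio is higher in the second case.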

11 Claims

1. A system for use in speech recognition wherein a user receives a synthetic or recorded speech prompt from a text-to-speech (TTS) server via at least one network, comprising:
- a client application for communicating, via the at least one network, with a speech recognition (SR) server, the TTS server, and, at a location of the user, a microphone;
  
  wherein:
  
  said client application enables the SR server to receive speech data provided by the user via the microphone; and
  
  said client application determines whether the TTS server is operating, where the TTS server outputs a speech prompt when it is operating, and, if it is determined that the TTS server is operating, the client application operates in a state where it determines whether barge-in speech has been detected by processing an audio input received via the microphone, and, if it is determined that the TTS server is not operating, the client application operates in a state where it does not determine whether barge-in speech has been detected.
- Dependent claims: 4, 5

4. The system of claim 1, wherein:

5. The system of claim 1, wherein:
- the audio input is processed using a signal energy level estimator for estimating signal levels thereof, and a noise energy level estimator for estimating noise levels thereof.

2. The system of claim 1, wherein:
- if said client application determines that the TTS server is operating but no such barge-in speech has been detected, said client application waits and determines whether the TTS server is quiet, indicating that the TTS server is no longer operating.
- Dependent claims: 3

3. The system of claim 2, wherein:

6. A method for use in speech recognition wherein a user receives a synthetic or recorded speech prompt from a text-to-speech (TTS) server via at least one network, comprising:
- providing a client application for communicating, via the at least one network, with a speech recognition (SR) server, the TTS server, and, at a location of the user, a microphone;
  
  wherein the client application enables the SR server to receive speech data provided by the user via the microphone; and
  
  determining whether the TTS server is operating, where the TTS server outputs a speech prompt when it is operating, and, if it is determined that the TTS server is operating, operating the client application in a state where it determines whether barge-in speech has been detected by processing an audio input received via the microphone, and, if it is determined that the TTS server is not operating, operating the client application in a state where it does not determine whether barge-in speech has been detected.
- Dependent claims: 7, 8, 9, 10

7. The method of claim 6, wherein:

8. The method of claim 7, wherein:
- if the client application determines that the TTS server is quiet, the client application transitions from the state where it determines whether barge-in speech has been detected to the state where it does not determine whether barge-in speech has been detected.
9. The method of claim 6, wherein:
- the client application is implemented as a state machine.
10. The method of claim 6, wherein:
- the audio input is processed using a signal energy level estimator for estimating signal levels thereof, and a noise energy level estimator for estimating noise levels thereof.

11. A computer readable medium for use in speech recognition, wherein a user receives a synthetic or recorded speech prompt from a text-to-speech (TTS) server via at least one network, comprising:
- software which is executable to:
  
  (a) provide a client application for communicating, via the at least one network, with a speech recognition (SR) server, the TTS server, and, at a location of the user, a microphone;
  
  wherein the client application enables the SR server to receive speech data provided by the user via the microphone; and
  
  (b) determine whether the TTS server is operating, where the TTS server outputs a speech prompt when it is operating, and, if it is determined that the TTS server is operating, operating the client application in a state where it determines whether barge-in speech has been detected by processing an audio input received via the microphone, and, if it is determined that the TTS server is not operating, operating the client application in a state where it does not determine whether barge-in speech has been detected.
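The two operating states recited in the independent claims, together with the transition back when the TTS server goes quiet, can be sketched as a small state machine. This is only an illustration under stated assumptions: the names `State`, `ClientApp`, and `step`, and the callable `detect_barge_in`, are hypothetical, and how the client learns that the TTS server is operating is reduced to a boolean argument.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()       # TTS server not operating: barge-in is not evaluated
    PROMPTING = auto()  # TTS server operating: microphone audio is checked for barge-in


class ClientApp:
    """Hypothetical client application modelled as a two-state machine."""

    def __init__(self, detect_barge_in):
        self.state = State.IDLE
        self.detect_barge_in = detect_barge_in  # assumed callable: frame -> bool
        self.checks = 0  # number of frames actually examined for barge-in

    def step(self, tts_operating, audio_frame):
        """Process one microphone frame; return True if barge-in was detected."""
        if not tts_operating:
            # TTS server is quiet: leave (or stay out of) the barge-in state.
            self.state = State.IDLE
            return False
        # TTS server is outputting a prompt: evaluate the microphone input.
        self.state = State.PROMPTING
        self.checks += 1
        return self.detect_barge_in(audio_frame)
```

A usage example, with a deliberately crude peak-amplitude detector standing in for the energy-based one:

```python
app = ClientApp(lambda frame: max(abs(s) for s in frame) > 0.3)
app.step(False, [0.9, 0.9])  # ignored: TTS server is not operating
app.step(True, [0.9, 0.9])   # examined: prompt is playing
```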

Current Assignee: Nokia of America Corporation (Nokia Corporation)
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Brown, Michael Kenneth; Glinski, Stephen Charles
Primary Examiner(s): Knepper, David D.
Assistant Examiner(s): Azad, Abul K.
Application Number: US09/229,809
Time in Patent Office: 1,602 Days
Field of Search: 704/233, 704/231, 704/251, 704/252, 704/253, 704/214, 704/248, 704/270.1, 379/88.01, 379/88.28, 379/410, 379/406, 379/80
US Class Current: 704/270.1
CPC Class Codes: G10L 15/22 Procedures used during a sp...