Method and apparatus for accurate endpointing of speech in the presence of noise

US 6,324,509 B1
Filed: 02/08/1999
Issued: 11/27/2001
Est. Priority Date: 02/08/1999
Status: Expired due to Term

Abstract

An apparatus for accurate endpointing of speech in the presence of noise includes a processor and a software module. The processor executes the instructions of the software module to compare an utterance with a first signal-to-noise-ratio (SNR) threshold value to determine a first starting point and a first ending point of the utterance. The processor then compares with a second SNR threshold value a part of the utterance that predates the first starting point to determine a second starting point of the utterance. The processor also then compares with the second SNR threshold value a part of the utterance that postdates the first ending point to determine a second ending point of the utterance. The first and second SNR threshold values are recalculated periodically to reflect changing SNR conditions. The first SNR threshold value advantageously exceeds the second SNR threshold value.
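
The two-pass comparison described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the utterance is already reduced to one SNR estimate per frame and that the per-frame threshold values are precomputed and passed in. The function name `find_endpoints` and its parameters are hypothetical, and taking the first and last frames above the first threshold as the first starting and ending points is a simplifying assumption.

```python
from typing import Optional, Sequence, Tuple


def find_endpoints(
    frame_snr: Sequence[float],
    first_thr: Sequence[float],
    second_thr: Sequence[float],
) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) frame indices, or None if no frame
    clears the first threshold.

    frame_snr[i] is a per-frame SNR estimate (assumed input); first_thr[i]
    and second_thr[i] are the threshold values in force for frame i.
    """
    n = len(frame_snr)

    # Pass 1: frames that clear the higher, first threshold give the
    # first starting point and first ending point.
    above = [i for i in range(n) if frame_snr[i] > first_thr[i]]
    if not above:
        return None
    start, end = above[0], above[-1]

    # Pass 2a: examine frames that predate the first starting point and
    # move the start earlier while they clear the lower, second threshold.
    while start > 0 and frame_snr[start - 1] > second_thr[start - 1]:
        start -= 1

    # Pass 2b: examine frames that postdate the first ending point and
    # move the end later in the same way.
    while end < n - 1 and frame_snr[end + 1] > second_thr[end + 1]:
        end += 1

    return start, end


# Illustrative values only: constant thresholds of 8 and 2 for every frame.
snr = [0, 1, 4, 9, 12, 10, 5, 3, 0, 0]
endpoints = find_endpoints(snr, [8] * len(snr), [2] * len(snr))
```

With these illustrative values the first pass selects frames 3 through 5, and the second pass widens the span to the weaker frames on either side that still clear the lower threshold.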

112 Citations

13 Claims

1. A device for detecting endpoints of an utterance in frames of a received signal, comprising:
- a processor; and
- a software module executable by the processor to compare an utterance with a first threshold value to determine a first starting point and a first ending point of the utterance, compare with a second threshold value a part of the utterance that predates the first starting point to determine a second starting point of the utterance, and compare with the second threshold value a part of the utterance that postdates the first ending point to determine a second ending point of the utterance, wherein the first and second threshold values are calculated once per frame from a signal-to-noise ratio for the utterance.

2. The device of claim 1, wherein the first threshold value exceeds the second threshold value.

3. The device of claim 1, wherein a difference between the second ending point and the second starting point is constrained by predefined maximum and minimum length bounds.

4. A method of detecting endpoints of an utterance in frames of a received signal, comprising the steps of:
- comparing an utterance with a first threshold value to determine a first starting point and a first ending point of the utterance;
- comparing with a second threshold value a part of the utterance that predates the first starting point to determine a second starting point of the utterance; and
- comparing with the second threshold value a part of the utterance that postdates the first ending point to determine a second ending point of the utterance, wherein the first and second threshold values are calculated once per frame from a signal-to-noise ratio for the utterance.

5. The method of claim 4, wherein the first threshold value exceeds the second threshold value.

6. The method of claim 4, further comprising the step of constraining a difference between the second ending point and the second starting point by predefined maximum and minimum length bounds.

7. A device for detecting endpoints of an utterance in frames of a received signal, comprising:
- means for comparing an utterance with a first threshold value to determine a first starting point and a first ending point of the utterance;
- means for comparing with a second threshold value a part of the utterance that predates the first starting point to determine a second starting point of the utterance; and
- means for comparing with the second threshold value a part of the utterance that postdates the first ending point to determine a second ending point of the utterance, wherein the first and second threshold values are calculated once per frame from a signal-to-noise ratio for the utterance.

8. The device of claim 7, wherein the first threshold value exceeds the second threshold value.

9. The device of claim 7, further comprising means for constraining a difference between the second ending point and the second starting point by predefined maximum and minimum length bounds.

10. A voice recognition system, comprising:
- an acoustic processor configured to determine parameters of an utterance contained in received frames of a speech signal, the acoustic processor including an endpoint detector configured to compare the utterance with a first threshold value to determine a first starting point and a first ending point of the utterance, compare with a second threshold value a part of the utterance that predates the first starting point to determine a second starting point of the utterance, and compare with the second threshold value a part of the utterance that postdates the first ending point to determine a second ending point of the utterance, wherein the first and second threshold values are calculated once per frame from a signal-to-noise ratio for the utterance;
- pattern comparison logic coupled to the acoustic processor and configured to compare stored word templates with parameters associated with the utterance; and
- a database coupled to the pattern comparison logic and configured to store the word templates.

11. The voice recognition system of claim 10, further comprising decision logic coupled to the pattern comparison logic and configured to decide which word template most closely matches the parameters.

12. The voice recognition system of claim 10, wherein the first threshold value exceeds the second threshold value.

13. The voice recognition system of claim 12, wherein a difference between the second ending point and the second starting point is constrained by predefined maximum and minimum length bounds.
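
The length constraint recited in claims 3, 6, 9 and 13 can be pictured as a simple check on the distance between the second starting point and the second ending point. The sketch below is only an illustration under stated assumptions: endpoints are frame indices, the bounds are expressed in frames, and the name `within_length_bounds` and the bound values used are hypothetical. The claims do not say what happens when the check fails, so the sketch only reports the result.

```python
def within_length_bounds(
    second_start: int,
    second_end: int,
    min_frames: int,
    max_frames: int,
) -> bool:
    """Return True when the difference between the second ending point and
    the second starting point lies within the predefined bounds (inclusive).
    """
    length = second_end - second_start
    return min_frames <= length <= max_frames


# Illustrative bounds only: a span of 5 frames checked against [3, 200].
ok = within_length_bounds(2, 7, min_frames=3, max_frames=200)
```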

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: Chang, Chienchung; Dejaco, Andrew P.; Bi, Ning
Primary Examiner(s): Korzuch, William
Assistant Examiner(s): Chawan, Vijay B
Application Number: US09/246,414
Time in Patent Office: 1,023 Days
Field of Search: 704/248, 704/233, 704/251, 704/253, 704/254-256, 704/257, 704/231, 704/244, 704/200, 379/58
US Class Current: 704/248
CPC Class Codes:
- G10L 2025/786 Adaptive threshold
- G10L 25/87 Detection of discrete point...
