METHOD AND APPARATUS FOR RECOGNIZING SPEECH

US 20120166194A1
Filed: 12/22/2011
Published: 06/28/2012
Est. Priority Date: 12/23/2010
Status: Abandoned Application

Abstract

Disclosed herein are an apparatus and method for recognizing speech. The apparatus includes a frame-based speech recognition unit, a segment division unit, a segment feature extraction unit, a segment speech recognition performance unit, and a combination and synchronization unit. The frame-based speech recognition unit extracts frame speech feature vectors from a speech signal, and performs speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model. The segment division unit divides the speech signal into segments. The segment feature extraction unit extracts segment speech feature vectors around a boundary between the segments. The segment speech recognition performance unit performs speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model. The combination and synchronization unit combines results of the speech recognition for the frames with results of the speech recognition for the segments.
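The abstract does not say how the segment feature extraction unit takes features "around a boundary". The following is a minimal sketch under stated assumptions: a boundary is given as a frame index, a fixed number of frames on each side forms the window, and the trajectory information is approximated by the per-dimension mean and least-squares slope over that window. The names `boundary_window`, `trajectory_feature`, and `half_width` are hypothetical and do not appear in the application.

```python
from typing import List, Sequence

Vector = List[float]


def boundary_window(frames: Sequence[Vector], boundary: int, half_width: int) -> List[Vector]:
    """Return the frames within half_width frames on either side of a boundary.

    Assumption: `boundary` is the index of the first frame after the split,
    so the split lies between frames boundary - 1 and boundary.
    """
    start = max(0, boundary - half_width)
    end = min(len(frames), boundary + half_width)
    return [list(frame) for frame in frames[start:end]]


def trajectory_feature(window: Sequence[Vector]) -> Vector:
    """Concatenate the per-dimension mean and least-squares slope over the window.

    This is an illustrative stand-in for "trajectory information", not the
    feature defined in the application.
    """
    n = len(window)
    if n == 0:
        raise ValueError("window must contain at least one frame")
    dims = len(window[0])
    t_mean = (n - 1) / 2.0
    t_var = sum((t - t_mean) ** 2 for t in range(n))
    means: Vector = []
    slopes: Vector = []
    for d in range(dims):
        values = [frame[d] for frame in window]
        mean = sum(values) / n
        cov = sum((t - t_mean) * (v - mean) for t, v in enumerate(values))
        means.append(mean)
        # A single-frame window has no time spread, so its slope is set to zero.
        slopes.append(cov / t_var if t_var > 0 else 0.0)
    return means + slopes
```

A segment feature vector built this way has twice the dimension of a frame feature vector; claim 6 additionally names Principal Component Analysis (PCA), which this sketch omits.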

16 Claims

1. A method of recognizing speech, comprising:
- extracting frame speech feature vectors from a speech signal;
- performing speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model;
- dividing the speech signal into segments each of which is longer than each of the frames in terms of time;
- extracting segment speech feature vectors around a boundary between the segments;
- performing speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model; and
- combining results of the speech recognition for the frames with results of the speech recognition for the segments.

Dependent claims (2 to 8):
- 2. The method as set forth in claim 1, wherein the dividing comprises calculating a distance measure between adjacent first and second frame speech feature vectors, and, if the calculated distance measure is greater than a predetermined value, dividing the speech signal into the segments using a point between the first and second frame speech feature vectors as a point for the division between the segments.
- 3. The method as set forth in claim 2, wherein the distance measure is a variation in the speech signal.
- 4. The method as set forth in claim 1, further comprising synchronizing the results of the speech recognition for the frames with the results of the speech recognition for the segments.
- 5. The method as set forth in claim 4, wherein the synchronizing comprises applying a Dynamic Bayesian Network (DBN)-based Switching Linear Dynamic Model (SLDM) to a portion where the frame-based probability model is combined with the segment-based probability model in order to synchronize the results of the speech recognition for the frames with the results of the speech recognition for the segments.
- 6. The method as set forth in claim 1, wherein the extracting segment speech feature vectors comprises extracting the segment speech feature vectors by performing Principal Component Analysis (PCA) and trajectory information feature extraction on the segments of the speech signal.
- 7. The method as set forth in claim 1, wherein the segment-based probability model is a Gaussian model based on the segment speech feature vectors.
- 8. The method as set forth in claim 1, wherein the frame-based probability model is a Hidden Markov Model (HMM).
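The dividing step of claim 2 can be sketched as follows. This is an illustration under assumptions, not the applicant's implementation: the Euclidean distance stands in for the unspecified distance measure, and the names `find_boundaries`, `split_segments`, and `threshold` (the "predetermined value") are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance, assumed here as the distance measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_boundaries(frames: Sequence[Vector], threshold: float) -> List[int]:
    """Return every index i where the distance between frame i - 1 and frame i
    exceeds the threshold, so a division point lies between those two frames."""
    return [
        i
        for i in range(1, len(frames))
        if euclidean(frames[i - 1], frames[i]) > threshold
    ]


def split_segments(frames: Sequence[Vector], threshold: float) -> List[List[Vector]]:
    """Split the frame sequence into contiguous segments at the boundaries."""
    edges = [0] + find_boundaries(frames, threshold) + [len(frames)]
    return [list(frames[s:e]) for s, e in zip(edges, edges[1:]) if e > s]


if __name__ == "__main__":
    frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    # The only large jump is between frames 1 and 2.
    print(find_boundaries(frames, threshold=1.0))
    print(len(split_segments(frames, threshold=1.0)))
```

Note that this sketch does not enforce the requirement of claim 1 that each segment be longer than a frame: two consecutive large jumps would produce a one-frame segment, so a practical version would need a minimum segment length or boundary merging.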

9. An apparatus for recognizing speech, comprising:
- a frame-based speech recognition unit for extracting frame speech feature vectors from a speech signal, and performing speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model;
- a segment division unit for dividing the speech signal into segments each of which is longer than each of the frames in terms of time;
- a segment feature extraction unit for extracting segment speech feature vectors around a boundary between the segments;
- a segment speech recognition performance unit for performing speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model; and
- a combination and synchronization unit for combining results of the speech recognition obtained by the frame-based speech recognition unit with results of the speech recognition obtained by the segment speech recognition performance unit.

Dependent claims (10 to 16):
- 10. The apparatus as set forth in claim 9, wherein the segment division unit calculates a distance measure between adjacent first and second frame speech feature vectors, and, if the calculated distance measure is greater than a predetermined value, divides the speech signal into the segments using a point between the first and second frame speech feature vectors as a point for the division between the segments.
- 11. The apparatus as set forth in claim 10, wherein the distance measure is a variation in the speech signal.
- 12. The apparatus as set forth in claim 9, wherein the combination and synchronization unit synchronizes the results of the speech recognition obtained by the frame-based speech recognition unit with the results of the speech recognition obtained by the segment speech recognition performance unit.
- 13. The apparatus as set forth in claim 12, wherein the combination and synchronization unit applies a DBN-based SLDM to a portion where the frame-based probability model is combined with the segment-based probability model in order to synchronize the results of the speech recognition for the frames with the results of the speech recognition for the segments.
- 14. The apparatus as set forth in claim 9, wherein the segment extraction unit extracts the segment speech feature vectors by performing PCA and trajectory information feature extraction on the segments of the speech signal.
- 15. The apparatus as set forth in claim 9, wherein the segment-based probability model is a Gaussian model based on the segment speech feature vectors.
- 16. The apparatus as set forth in claim 9, wherein the frame-based probability model is an HMM.
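To make the roles of the Gaussian segment model (claim 15) and the combination unit (claim 9) concrete, the sketch below scores a segment feature vector with a diagonal-covariance Gaussian and adds the result to a frame-based log-score. It deliberately swaps in a simple weighted sum of log-scores in place of the DBN-based SLDM of claim 13, which the claims do not describe in enough detail to reproduce. The diagonal covariance, the `segment_weight` parameter, and all function names are assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def diag_gaussian_log_likelihood(
    x: Sequence[float], mean: Sequence[float], var: Sequence[float]
) -> float:
    """Log-density of x under a Gaussian with diagonal covariance (assumed form)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def combine_scores(
    frame_log_scores: Dict[str, float],
    segment_log_scores: Dict[str, float],
    segment_weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted sum of log-scores per hypothesis label.

    Stand-in for the combination step; only labels scored by both models are kept.
    """
    return {
        label: frame_log_scores[label] + segment_weight * segment_log_scores[label]
        for label in frame_log_scores
        if label in segment_log_scores
    }


def best_hypothesis(scores: Dict[str, float]) -> str:
    """Return the label with the highest combined log-score."""
    return max(scores, key=lambda label: scores[label])


if __name__ == "__main__":
    # Hypothetical frame-based (for example HMM) log-scores for two candidates.
    frame_scores = {"a": -10.0, "b": -9.0}
    # Segment log-scores from per-label Gaussian models with made-up parameters.
    segment_vector = [0.2, 1.1]
    segment_scores = {
        "a": diag_gaussian_log_likelihood(segment_vector, [0.0, 1.0], [1.0, 1.0]),
        "b": diag_gaussian_log_likelihood(segment_vector, [3.0, -2.0], [1.0, 1.0]),
    }
    print(best_hypothesis(combine_scores(frame_scores, segment_scores, 1.0)))
```

In this arrangement the segment score can overturn a frame-based ranking when the segment evidence is strong enough, which is the general motivation for combining the two result sets; the weight would have to be tuned on held-out data.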

Current Assignee: Electronics and Telecommunications Research Institute
Original Assignee: Electronics and Telecommunications Research Institute
Inventors: JUNG, Ho-Young; PARK, Jeon-Gue; CHUNG, Hoon

Application Number: US13/335,854
Publication Number: US 20120166194A1
US Class Current: 704/238
CPC Class Codes:
- G10L 15/02   Feature extraction for spee...
- G10L 15/04   Segmentation; Word boundary...
- G10L 15/142   Hidden Markov Models [HMMs]
