Method and system for objectively evaluating speech

US 6,446,038 B1
Filed: 04/01/1996
Issued: 09/03/2002
Est. Priority Date: 04/01/1996
Status: Expired due to Term

7 Assignments

0 Petitions

Abstract

A method and system for objectively evaluating the quality of speech in a voice communication system. A plurality of speech reference vectors is first obtained based on a plurality of clean speech samples. A corrupted speech signal is received and processed to determine a plurality of distortions derived from a plurality of distortion measures based on the plurality of speech reference vectors. The plurality of distortions are processed by a non-linear neural network model to generate a subjective score representing user acceptance of the corrupted speech signal. The non-linear neural network model is first trained on clean speech samples as well as corrupted speech samples through the use of backpropagation to obtain the weights and bias terms necessary to predict subjective scores from several objective measures.
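The mapping from distortions to a subjective score described in the abstract can be illustrated with a minimal sketch of a three-layer network trained by backpropagation. Everything specific here is an assumption made for illustration and is not taken from the patent: the class name `ThreeLayerNet`, the sigmoid hidden layer, the linear output, the squared-error loss, the learning rate, and the toy distortion vectors and scores in `TOY_DATA`.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerNet:
    """Input layer -> one sigmoid hidden layer -> single linear output (hypothetical layout)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Hidden activations, then a weighted sum plus bias as the score.
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr):
        """One stochastic gradient step on the loss 0.5 * (y - target)**2."""
        h, y = self.forward(x)
        err = y - target  # derivative of the loss with respect to the output y
        for j, hj in enumerate(h):
            # Backpropagate through the output weight and the sigmoid,
            # using the output weight as it was before this step's update.
            grad_h = err * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err


def mean_loss(net, data):
    return sum(0.5 * (net.predict(x) - t) ** 2 for x, t in data) / len(data)


def train(net, data, epochs=500, lr=0.05):
    for _ in range(epochs):
        for x, target in data:
            net.train_step(x, target, lr)


# Made-up (distortion vector, subjective score) pairs, for illustration only.
TOY_DATA = [
    ([0.1, 0.2, 0.1], 4.5),
    ([0.4, 0.5, 0.3], 3.5),
    ([0.8, 0.7, 0.9], 2.0),
    ([1.0, 1.0, 1.0], 1.2),
]
```

Calling `train(ThreeLayerNet(3, 4), TOY_DATA)` adjusts the weights and bias terms so that `mean_loss` falls on the toy data. A real system would need distortions computed from actual speech and listener-derived scores as training targets.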

75 Citations

20 Claims

1. An output-based objective method for evaluating the quality of speech in a voice communication system comprising:
- providing a plurality of speech reference vectors, the speech reference vectors corresponding to a plurality of known clean speech samples obtained in a quiet environment;
- receiving an unknown corrupted speech signal from an unavailable clean speech signal that is corrupted with distortions;
- determining a plurality of distortions by comparing the unknown corrupted speech signal to at least one of the plurality of speech reference vectors; and
- generating a score representing a subjective quality of the unknown corrupted speech signal based on the plurality of distortions.

2. The method as recited in claim 1 wherein generating the score includes processing the plurality of distortions in a neural network having a plurality of inputs and an output.
3. The method as recited in claim 2 wherein the neural network is a three-layer network.
4. The method as recited in claim 3 wherein generating the score includes training the neural network utilizing backpropagation.
5. The method as recited in claim 1 wherein providing the plurality of speech reference vectors includes:
6. The method as recited in claim 5 wherein the clustering technique is a vector quantization.
7. The method as recited in claim 5 wherein the clustering technique is a k-means clustering technique.
8. The method as recited in claim 5 wherein performing the spectral analysis includes performing a linear predictive analysis.
9. The method as recited in claim 5 wherein performing the spectral analysis includes performing a perceptual linear predictive analysis.
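As a rough illustration of how reference vectors might be obtained by k-means clustering (claim 7) and then used to measure distortion (claim 1), the sketch below builds a small codebook from clean feature vectors and scores each incoming frame by its squared Euclidean distance to the nearest codeword. The function names, the evenly spaced centroid initialization, the fixed iteration count, and the choice of squared Euclidean distance are all assumptions for this example; the claims do not specify them.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_codebook(vectors, k, iters=20):
    """Cluster clean feature vectors into k centroids (the reference vectors).

    Initialization picks evenly spaced input vectors, a simplification that
    keeps the example deterministic.
    """
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: sq_dist(v, centroids[j]))
            clusters[nearest].append(v)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[j] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return centroids


def frame_distortion(frame, codebook):
    """Distortion of one frame: distance to the closest reference vector."""
    return min(sq_dist(frame, c) for c in codebook)


def mean_distortion(frames, codebook):
    """Average nearest-codeword distortion over all frames of a signal."""
    return sum(frame_distortion(f, codebook) for f in frames) / len(frames)
```

Frames that resemble the clean training material fall close to some codeword and yield a low distortion, while frames altered by noise or coding artifacts tend to fall farther away. No clean version of the incoming signal is needed, which matches the output-based framing of claim 1.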

10. An output-based objective system for evaluating the quality of speech in a voice communication system comprising:
- a plurality of speech reference vectors, the speech reference vectors corresponding to a plurality of known clean speech samples obtained in a quiet environment;
- means for receiving an unknown corrupted speech signal from an unavailable clean speech signal that is corrupted with distortions;
- means for determining a plurality of distortions by comparing the unknown corrupted speech signal to at least one of the plurality of speech reference vectors; and
- a non-linear model responsive to the plurality of distortions to generate a score representing a subjective quality of the unknown corrupted speech signal.

11. The system as recited in claim 10 wherein the non-linear model is a neural network having a plurality of inputs and an output.
12. The system as recited in claim 11 wherein the neural network is a three-layer network.
13. The system as recited in claim 12 wherein the neural network is trained utilizing backpropagation.
14. The system as recited in claim 10 further comprising:
15. The system as recited in claim 15 wherein the means for performing the clustering technique includes means for performing a vector quantization.
16. The system as recited in claim 14 wherein the means for performing the clustering technique includes means for performing a k-means clustering technique.
17. The system as recited in claim 14 wherein the means for performing the spectral analysis includes means for performing a linear predictive analysis.
18. The system as recited in claim 14 wherein the means for performing the spectral analysis includes means for performing a perceptual linear predictive analysis.
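The linear predictive analysis named in claims 8 and 17 is commonly computed with the autocorrelation method and the Levinson-Durbin recursion. The sketch below shows that standard technique as one possible way to turn a speech frame into a feature vector. The function names and the absence of windowing or pre-emphasis are simplifications assumed here, and the patent listing does not state which variant the inventors used.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values.

    Returns (coeffs, residual_energy), where coeffs[k-1] is the weight a_k in
    the prediction x[t] ~= a_1 * x[t-1] + ... + a_order * x[t-order].
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate (e.g. all-zero) frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc(frame, order):
    """LPC coefficients and residual energy for one frame."""
    return levinson_durbin(autocorrelation(frame, order), order)
```

For a frame that decays geometrically, such as `[0.9 ** t for t in range(200)]`, `lpc(frame, 1)` gives a single coefficient very close to 0.9, since each sample is 0.9 times the previous one.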

19. A computer readable storage medium having information stored thereon representing instructions executable by a computer to evaluate the quality of speech in a voice communication system, the computer readable storage medium further comprising:
- instructions for providing a plurality of speech reference vectors, the speech reference vectors corresponding to a plurality of known clean speech samples obtained in a quiet environment;
- instructions for receiving an unknown corrupted speech signal from an unavailable clean speech signal that is corrupted with distortions;
- instructions for determining a plurality of distortions by comparing the unknown corrupted speech signal to at least one of the plurality of speech reference vectors; and
- instructions for generating a score representing a subjective quality of the unknown corrupted speech signal based on the plurality of distortions.

20. The computer readable storage medium of claim 19 wherein the instructions for generating the score further comprise:

Current Assignee: Qwest Communications International Incorporated (Lumen Technologies, Inc.)
Original Assignee: Qwest Communications International Incorporated (Lumen Technologies, Inc.)
Inventors: Vis, Marvin; Bayya, Aruna
Primary Examiner(s): Banks-Harold, Marsha D.
Assistant Examiner(s): Opsasnick, Michael N.
Application Number: US08/627,249
Time in Patent Office: 2,346 Days
Field of Search: 395/2.4, 395/2.41, 395/2.37, 395/2.35
US Class Current: 704/232
CPC Class Codes:
G10L 25/30 using neural networks
G10L 25/69 for evaluating synthetic or...
