VARIABLE-COMPONENT DEEP NEURAL NETWORK FOR ROBUST SPEECH RECOGNITION

US 20160275947A1
Filed: 09/09/2014
Published: 09/22/2016
Est. Priority Date: 09/09/2014
Status: Active Grant


Abstract

Systems and methods for speech recognition incorporating environmental variables are provided. The systems and methods capture speech to be recognized. The speech is then recognized utilizing a variable component deep neural network (DNN). The variable component DNN processes the captured speech by incorporating an environment variable. The environment variable may be any variable that is dependent on environmental conditions or the relation of the user, the client device, and the environment. For example, the environment variable may be based on noise of the environment and represented as a signal-to-noise ratio. The variable component DNN may incorporate the environment variable in different ways. For instance, the environment variable may be incorporated into weighting matrices and biases of the DNN, the outputs of the hidden layers of the DNN, or the activation functions of the nodes of the DNN.
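The first option named in the abstract, folding the environment variable into the weighting matrices and biases, can be sketched as follows. This is only an illustration under stated assumptions: the abstract says the components are functions of the environment variable but does not give their form here, so the polynomial form, the sigmoid activation, and the names `poly_combine` and `vc_layer` are all hypothetical. The value `v` is assumed to be a normalized environment variable such as a scaled signal-to-noise ratio.

```python
import math


def poly_combine(coeffs, v):
    """Evaluate sum_j coeffs[j] * v**j elementwise.

    coeffs is a list of equally shaped matrices (lists of rows), one per
    polynomial order. The polynomial form is an assumption made for this
    sketch, not something the abstract specifies.
    """
    rows, cols = len(coeffs[0]), len(coeffs[0][0])
    return [
        [sum(c[r][k] * v ** j for j, c in enumerate(coeffs)) for k in range(cols)]
        for r in range(rows)
    ]


def vc_layer(x, weight_coeffs, bias_coeffs, v):
    """One layer whose weights and biases depend on the environment variable v."""
    # Build the weight matrix and bias vector for this particular value of v.
    weights = poly_combine(weight_coeffs, v)
    bias = poly_combine([[b] for b in bias_coeffs], v)[0]
    # Ordinary affine transform followed by a sigmoid (assumed activation).
    pre_activation = [
        sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(weights, bias)
    ]
    return [1.0 / (1.0 + math.exp(-z)) for z in pre_activation]
```

With a single coefficient matrix (order zero only) the layer reduces to a standard DNN layer whose parameters ignore the environment, which is one way to see what the variable component adds.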

193 Citations

12 Claims

1. A method for recognizing speech, the method comprising:
- capturing speech input;
- determining a value for an environment variable;
- utilizing a deep neural network (DNN) to recognize the captured speech input, wherein one or more components of the DNN are modeled as a set of functions of the environment variable; and
- producing an output of recognized speech.

2. The method of claim 1, wherein the one or more components of the DNN are at least one of the group consisting of one or more weighting matrices and one or more biases of the DNN.
3. The method of claim 1, wherein the one or more components of the DNN are one or more outputs of a hidden layer of the DNN.
4. The method of claim 1, wherein the one or more components of the DNN are one or more activation functions of one or more nodes in the DNN.
5. The method of claim 1, wherein the environment variable is based on a noise of an environment.
6. The method of claim 5, wherein the environment variable is a signal-to-noise ratio.
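Claims 5 and 6 tie the environment variable to noise, expressed as a signal-to-noise ratio. The claims do not say how that ratio is obtained, so the following is only one possible sketch of the "determining a value for an environment variable" step. It assumes a noise-only segment is available (for example, audio captured before the user speaks) and that noise and speech are uncorrelated; the names `mean_power` and `snr_db` are hypothetical.

```python
import math


def mean_power(samples):
    """Average squared amplitude of a sequence of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(noisy_speech, noise_only):
    """Estimate the signal-to-noise ratio in decibels.

    Assumes noise_only contains background noise without speech and that
    the noise is uncorrelated with the speech, so the speech power is
    approximately the noisy-speech power minus the noise power.
    """
    noise_power = mean_power(noise_only)
    if noise_power <= 0.0:
        raise ValueError("noise segment has zero power; SNR is undefined")
    # Floor the estimate so the logarithm stays defined in very noisy input.
    signal_power = max(mean_power(noisy_speech) - noise_power, 1e-12)
    return 10.0 * math.log10(signal_power / noise_power)
```

The resulting value would typically be scaled to a convenient range before being passed to the network as the environment variable.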

7. A system for recognizing speech, the system comprising:
- a speech capture device;
- a feature extraction module;
- an environment variable module, wherein the environment variable module determines a value for an environment variable; and
- a speech recognition decoder, wherein the speech recognition decoder utilizes a deep neural network (DNN) to recognize speech captured by the speech capture device, wherein one or more components of the DNN are modeled as a set of functions of the environment variable.

8. The system of claim 7, wherein the one or more components of the DNN are at least one of the group consisting of one or more weighting matrices and one or more biases of the DNN.
9. The system of claim 7, wherein the one or more components of the DNN are one or more outputs of a hidden layer of the DNN.
10. The system of claim 7, wherein the one or more components of the DNN are one or more activation functions of one or more nodes in the DNN.
11. The system of claim 7, wherein the environment variable is based on a noise of an environment.
12. The system of claim 11, wherein the environment variable is a signal-to-noise ratio.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: LI, Jinyu; GONG, Yifan; ZHAO, Rui

Granted Patent: US 10,019,990 B2
US Class Current: 1/1
CPC Class Codes:
G10L 15/16   using artificial neural net...
G10L 15/20   Speech recognition techniqu...
G10L 19/24   Variable rate codecs, e.g. ...
G10L 25/84   for discriminating voice fr...
