Augmented generalized deep learning with special vocabulary

US 10,540,959 B1
Filed: 12/26/2018
Issued: 01/21/2020
Est. Priority Date: 07/27/2018
Status: Active Grant



Abstract

Systems and methods are disclosed for customizing a neural network for a custom dataset, when the neural network has been trained on data from a general dataset. The neural network may comprise an output layer including one or more nodes corresponding to candidate outputs. The values of the nodes in the output layer may correspond to a probability that the candidate output is the correct output for an input. The values of the nodes in the output layer may be adjusted for higher performance when the neural network is used to process data from a custom dataset.
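
The abstract does not state how the output values are adjusted. As a sketch only, one common way to adapt a classifier to a new word distribution is to rescale each output probability by the ratio of the word's frequencies and renormalize. The symbols below are assumptions introduced for illustration and do not come from the patent text: $P(w \mid x)$ is the value of the output node for word $w$ given input $x$, $f_g(w)$ and $f_c(w)$ are the frequencies of $w$ in the general and custom datasets, and $V$ is the vocabulary.

```latex
\tilde{P}(w \mid x)
  = \frac{P(w \mid x)\,\dfrac{f_c(w)}{f_g(w)}}
         {\sum_{v \in V} P(v \mid x)\,\dfrac{f_c(v)}{f_g(v)}}
```

Under this assumed rule, a word that is 40 times more frequent in the custom dataset than in the general dataset has its output value multiplied by 40 before renormalization, while a word with equal frequencies keeps its relative weight.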

20 Claims

1. A method for customizing a neural network trained on a general dataset to a custom dataset, the method comprising:
- providing a trained speech recognition neural network, the trained speech recognition neural network including a plurality of layers each having a plurality of nodes, the trained speech recognition neural network including an output layer with nodes corresponding to words of a vocabulary, the nodes of the output layer outputting values, wherein the values output by the nodes in the output layer correspond to a probability of the corresponding word in the vocabulary being a correct transcription of an input;
  
  for a plurality of words in the vocabulary, determining a frequency of occurrence of the word in a general training set and a frequency of occurrence of the word in a custom dataset;
  
  during inference using the trained speech recognition neural network, for each word in the plurality of words, adjusting the value output by the output node for the word based on the frequency of occurrence of the word in the custom dataset and the frequency of occurrence of the word in the general training set to obtain a custom model probability; and
  
  generating a transcription of a spoken input based on the custom model probability.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, wherein the plurality of words comprises all of the words in the vocabulary.
  - 3. The method of claim 1, wherein the frequency of occurrence of the word in the general training set is set to a threshold minimum value if the word does not appear in the general training set.
  - 4. The method of claim 1, wherein the trained speech recognition neural network includes one or more fully-connected neural network layers.
  - 5. The method of claim 1, wherein the trained speech recognition neural network includes one or more locally connected neural network layers.
  - 6. The method of claim 5, wherein the trained speech recognition neural network includes one or more recurrent neural network layers.
  - 7. The method of claim 6, wherein the trained speech recognition neural network has been trained in an end-to-end training process including backpropagation through each of its layers.
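
The frequency-determination step of claim 1, together with the minimum-value rule of claim 3, can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `relative_frequencies`, the `floor` value of `1e-6`, and the toy token lists are all hypothetical, and the floor is applied to both datasets here for simplicity even though claim 3 only mentions the general training set.

```python
from collections import Counter

def relative_frequencies(tokens, vocabulary, floor=1e-6):
    """Return the relative frequency of each vocabulary word in `tokens`.

    Words that never occur receive `floor`, a hypothetical stand-in for the
    "threshold minimum value" of claim 3, so later ratios never divide by zero.
    """
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # guard against an empty dataset
    return {w: max(counts[w] / total, floor) for w in vocabulary}

# Toy data, invented for illustration only.
vocabulary = ["the", "patient", "stent", "cat"]
general_tokens = "the cat saw the patient and the cat".split()
custom_tokens = "the patient received the stent".split()

general_freq = relative_frequencies(general_tokens, vocabulary)
custom_freq = relative_frequencies(custom_tokens, vocabulary)

print(general_freq["stent"])  # 1e-06: absent from the general set, so floored
print(custom_freq["stent"])   # 0.2: one of five custom tokens
```

Flooring the general frequency keeps the custom-to-general ratio finite for words that only appear in the custom dataset, which is the case where the adjustment matters most.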

8. A non-transitory computer-readable medium comprising instructions for:
- providing a trained speech recognition neural network, the trained speech recognition neural network including a plurality of layers each having a plurality of nodes, the trained speech recognition neural network including an output layer with nodes corresponding to words of a vocabulary, the nodes of the output layer outputting values, wherein the values output by the nodes in the output layer correspond to a probability of the corresponding word in the vocabulary being a correct transcription of an input;
  
  for a plurality of words in the vocabulary, determining a frequency of occurrence of the word in a general training set and a frequency of occurrence of the word in a custom dataset;
  
  during inference using the trained speech recognition neural network, for each word in the plurality of words, adjusting the value output by the output node for the word based on the frequency of occurrence of the word in the custom dataset and the frequency of occurrence of the word in the general training set to obtain a custom model probability; and
  
  generating a transcription of a spoken input based on the custom model probability.
- Dependent claims (9, 10, 11, 12, 13, 14):
  - 9. The non-transitory computer-readable medium of claim 8, wherein the plurality of words comprises all of the words in the vocabulary.
  - 10. The non-transitory computer-readable medium of claim 8, wherein the frequency of occurrence of the word in the general training set is set to a threshold minimum value if the word does not appear in the general training set.
  - 11. The non-transitory computer-readable medium of claim 8, wherein the trained speech recognition neural network includes one or more fully-connected neural network layers.
  - 12. The non-transitory computer-readable medium of claim 8, wherein the trained speech recognition neural network includes one or more locally connected neural network layers.
  - 13. The non-transitory computer-readable medium of claim 12, wherein the trained speech recognition neural network includes one or more recurrent neural network layers.
  - 14. The non-transitory computer-readable medium of claim 13, wherein the trained speech recognition neural network has been trained in an end-to-end training process including backpropagation through each of its layers.
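
The adjustment step of claim 8, and the difference between adjusting a chosen plurality of words and adjusting every word (claim 9), can be sketched as below. The claims do not state the adjustment formula, so the ratio-and-renormalize rule used here is an assumption, and `adjust_probabilities`, its `words` parameter, and all numeric values are hypothetical.

```python
def adjust_probabilities(probs, general_freq, custom_freq, words=None):
    """Rescale output-layer probabilities by the custom/general frequency ratio.

    `probs` maps each word to the probability emitted by its output node.
    `words` limits the adjustment to a subset; None adjusts every word.
    The rescaling rule is an illustrative assumption, not taken from the claims.
    """
    targets = set(probs) if words is None else set(words)
    scaled = {
        w: p * custom_freq[w] / general_freq[w] if w in targets else p
        for w, p in probs.items()
    }
    total = sum(scaled.values())
    # Renormalize so the adjusted values again sum to 1.
    return {w: s / total for w, s in scaled.items()}

# Invented output-layer values for three acoustically similar words.
probs = {"stent": 0.3, "scent": 0.5, "sent": 0.2}
general_freq = {"stent": 0.0001, "scent": 0.001, "sent": 0.002}
custom_freq = {"stent": 0.004, "scent": 0.001, "sent": 0.002}

custom_probs = adjust_probabilities(probs, general_freq, custom_freq)
best = max(custom_probs, key=custom_probs.get)
print(best)  # stent
```

Before adjustment the most probable word is "scent"; after adjustment "stent" wins because it is far more frequent in the custom dataset than in the general one.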

15. A non-transitory computer-readable medium comprising instructions for:
- providing a trained speech recognition neural network, the trained speech recognition neural network including a plurality of layers each having a plurality of nodes, the trained speech recognition neural network including an output layer with nodes corresponding to words of a vocabulary, the nodes of the output layer outputting values, wherein the values output by the nodes in the output layer correspond to a probability of the corresponding word in the vocabulary being a correct transcription of an input;
  
  during inference using the trained speech recognition neural network, adjusting the values output by a plurality of nodes in the output layer based on a frequency of occurrence of the corresponding word in a general training set and a frequency of occurrence of the corresponding word in a custom dataset to obtain a custom model probability; and
  
  generating a transcription of a spoken input based on the custom model probability.
- Dependent claims (16, 17, 18, 19, 20):
  - 16. The non-transitory computer-readable medium of claim 15, further comprising instructions for adjusting the values of all of the nodes in the output layer based on the frequency of occurrence of the corresponding word in the general training set and the custom dataset to obtain the custom model probability.
  - 17. The non-transitory computer-readable medium of claim 15, wherein the frequency of occurrence of the word in the general training set is set to a threshold minimum value if the word does not appear in the general training set.
  - 18. The non-transitory computer-readable medium of claim 15, wherein the trained speech recognition neural network includes one or more locally connected neural network layers.
  - 19. The non-transitory computer-readable medium of claim 18, wherein the trained speech recognition neural network includes one or more recurrent neural network layers.
  - 20. The non-transitory computer-readable medium of claim 19, wherein the trained speech recognition neural network has been trained in an end-to-end training process including backpropagation through each of its layers.
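
Claim 15 applies the adjustment during inference and then generates a transcription. A minimal sketch of that flow is shown below, assuming a word-level output layer that emits one probability row per decoding step and a simple greedy (per-step argmax) decoder. Neither the decoder, the log-domain bias, nor the names `transcribe`, `step_probs`, and `bias` come from the patent; they are assumptions chosen to keep the example short.

```python
import math

vocabulary = ["check", "the", "stent", "scent"]

# Hypothetical output-layer values: one probability row per decoding step.
step_probs = [
    [0.97, 0.01, 0.01, 0.01],
    [0.01, 0.97, 0.01, 0.01],
    [0.05, 0.05, 0.40, 0.50],
]

# Invented word frequencies, in the same order as `vocabulary`.
general_freq = [0.010, 0.050, 0.0001, 0.001]
custom_freq = [0.010, 0.050, 0.004, 0.001]

# The log of the frequency ratio acts as an additive bias on each output node.
bias = [math.log(c / g) for c, g in zip(custom_freq, general_freq)]

def transcribe(rows, bias, vocabulary):
    """Greedy decode: pick the highest adjusted score at every step."""
    words = []
    for row in rows:
        scores = [math.log(p) + b for p, b in zip(row, bias)]
        words.append(vocabulary[scores.index(max(scores))])
    return " ".join(words)

baseline = transcribe(step_probs, [0.0] * len(vocabulary), vocabulary)
customized = transcribe(step_probs, bias, vocabulary)
print(baseline)    # check the scent
print(customized)  # check the stent
```

Renormalization is skipped here because it does not change which word has the highest score at a given step. The network weights are never modified; only the values read from the output layer are shifted at inference time.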

Current Assignee: Deepgram, Inc.
Original Assignee: Deepgram, Inc.
Inventors: Ward, Jeff; Sypniewski, Adam; Stephenson, Scott
Primary Examiner(s): Wozniak, James S

Application Number: US16/232,652
Publication Number: US 20200035219A1
Time in Patent Office: 391 Days
Field of Search: 704232, 704251, 704254, 704257
CPC Class Codes:
- G06F 18/214: Generating training pattern...
- G06F 18/24133: Distances to prototypes
- G06N 3/044: Recurrent networks, e.g. Ho...
- G06N 3/045: Combinations of networks
- G06N 3/048: Activation functions
- G06N 3/08: Learning methods
- G06N 3/084: Backpropagation, e.g. using...
- G06V 10/454: Integrating the filters int...
- G10L 15/02: Feature extraction for spee...
- G10L 15/063: Training
- G10L 15/16: using artificial neural net...
- G10L 15/197: Probabilistic grammars, e.g...
- G10L 15/22: Procedures used during a sp...
- G10L 15/30: Distributed recognition, e....
- G10L 2015/0635: updating or merging of old ...
- G10L 2015/081: Search algorithms, e.g. Bau...
- G10L 25/18: the extracted parameters be...
- G10L 25/24: the extracted parameters be...
