Sound identification utilizing periodic indications
First Claim
1. A computer-implemented method performed by a speech recognition system having at least a processor, the method comprising:
obtaining, by the processor, a frequency spectrum of an audio signal data;
extracting, by the processor, periodic indications from the frequency spectrum;
inputting, by the processor, the periodic indications and components of the frequency spectrum into a neural network;
estimating, by the processor, sound identification information from the neural network; and
performing, by the processor, a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information,
wherein the neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes, and
wherein the method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
Abstract
A computer-implemented method and an apparatus are provided. The method includes obtaining, by a processor, a frequency spectrum of an audio signal data. The method further includes extracting, by the processor, periodic indications from the frequency spectrum. The method also includes inputting, by the processor, the periodic indications and components of the frequency spectrum into a neural network. The method additionally includes estimating, by the processor, sound identification information from the neural network.
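The data flow in the abstract (spectrum, then periodic indications, then a combined network input) can be illustrated with a minimal sketch. The text above does not say how the periodic indications are computed, so `periodic_indications` below is a hypothetical stand-in: it takes the normalized autocorrelation of the log-power spectrum across frequency bins, which peaks at the spacing between harmonics. The function names, the `max_lag` parameter, and the naive DFT are all assumptions made for illustration, not the patented method.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of a real-valued frame (bins 0 .. N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_components(spectrum, floor=1e-10):
    """Log-power components of the spectrum; `floor` avoids log(0)."""
    return [math.log(p + floor) for p in spectrum]


def periodic_indications(log_spec, max_lag):
    """ASSUMPTION: a stand-in for the 'periodic indications'.

    Returns the normalized autocorrelation of the mean-removed log
    spectrum across frequency bins, for lags 1 .. max_lag.
    """
    mean = sum(log_spec) / len(log_spec)
    centered = [v - mean for v in log_spec]
    energy = sum(v * v for v in centered) or 1.0
    return [
        sum(centered[i] * centered[i + lag]
            for i in range(len(centered) - lag)) / energy
        for lag in range(1, max_lag + 1)
    ]


def network_input(frame, max_lag=16):
    """Concatenate periodic indications with the spectral components."""
    log_spec = log_components(power_spectrum(frame))
    return periodic_indications(log_spec, max_lag) + log_spec


# Usage: a synthetic frame with harmonics at bins 8, 16, 24 and 32.
N = 128
frame = [sum(math.cos(2 * math.pi * 8 * h * t / N) for h in range(1, 5))
         for t in range(N)]
features = network_input(frame, max_lag=16)
```

The periodic part is placed first in the feature vector so that the corresponding input nodes of a network can be addressed as one contiguous block.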
25 Claims
1. A computer-implemented method performed by a speech recognition system having at least a processor, the method comprising:
obtaining, by the processor, a frequency spectrum of an audio signal data;
extracting, by the processor, periodic indications from the frequency spectrum;
inputting, by the processor, the periodic indications and components of the frequency spectrum into a neural network;
estimating, by the processor, sound identification information from the neural network; and
performing, by the processor, a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information,
wherein the neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes, and
wherein the method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
Dependent claims: 2-17
18. A non-transitory computer program product having instructions embodied therewith, the instructions executable by a speech recognition system that includes a processor or programmable circuitry to cause the processor or programmable circuitry to perform a method comprising:
obtaining a frequency spectrum of an audio signal data;
extracting periodic indications from the frequency spectrum;
inputting the periodic indications and components of the frequency spectrum into a neural network; and
estimating sound identification information from the neural network; and
performing a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information,
wherein the neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes, and
wherein the method further comprises training the neural network by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
Dependent claims: 19-21
22. A speech recognition system, comprising:
a processor; and
one or more computer readable mediums collectively including instructions that, when executed by the processor, cause the processor to:
obtain frequency spectrum of an audio signal data;
extract periodic indications from the frequency spectrum;
input the periodic indications and components of the frequency spectrum into a neural network, wherein the neural network includes a plurality of fully-connected network layers having a first layer that includes a plurality of first nodes and a plurality of second nodes; and
estimate sound identification information from the neural network; and
perform a speech recognition operation on the audio signal data to decode the audio signal data into a textual representation based on the estimated sound identification information,
wherein the neural network is trained by initially isolating the periodic indications from the components of the frequency spectrum in the first layer by setting weights between the first nodes and a plurality of input nodes corresponding to the periodic indications to 0.
Dependent claims: 23-25
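The training constraint shared by claims 1, 18 and 22 (zero initial weights between the first nodes and the input nodes for the periodic indications) can be sketched as a first-layer initialization. This is a minimal illustration under stated assumptions: the input ordering (periodic inputs first), the Gaussian initialization, the layer sizes, and the helper names `init_first_layer` and `forward` are all hypothetical, and the claims only describe the initial state, so nothing here says whether the zeroed weights stay fixed during later training.

```python
import random


def init_first_layer(n_periodic, n_spectral, n_first, n_second, seed=0):
    """Build first-layer weights as weights[node][input].

    Inputs are ordered [periodic..., spectral...] and nodes are ordered
    [first nodes..., second nodes...] (both orderings are assumptions).
    """
    rng = random.Random(seed)
    n_inputs = n_periodic + n_spectral
    # Assumed small random initialization for every connection.
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_inputs)]
               for _ in range(n_first + n_second)]
    # Isolation step from the claims: first nodes initially receive
    # nothing from the periodic-indication inputs.
    for node in range(n_first):
        for inp in range(n_periodic):
            weights[node][inp] = 0.0
    return weights


def forward(weights, inputs):
    """Pre-activations of the first layer (bias and nonlinearity omitted)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Usage: 3 periodic inputs, 5 spectral inputs, 4 first nodes, 2 second nodes.
layer = init_first_layer(n_periodic=3, n_spectral=5, n_first=4, n_second=2)
spectral = [0.5] * 5
out_a = forward(layer, [1.0, 2.0, 3.0] + spectral)
out_b = forward(layer, [9.0, -4.0, 7.0] + spectral)
```

With this initialization, changing only the periodic inputs leaves the first nodes' pre-activations unchanged, while the second nodes still respond to them.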
Specification