Method and apparatus for encoding speech using neural network technology for speech classification
Abstract
A low-rate voice coding method and apparatus uses vocoder-embedded neural network techniques. A neural-network-controlled speech analysis processor includes a neural network which manages speech characterization, encoding, decoding, and reconstruction methodologies. The voice coding method and apparatus uses multi-layer perceptron (MLP) based neural network structures in single or multi-stage arrangements.
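As a rough illustration of what an MLP-based speech classifier does, the following minimal sketch runs a forward pass through one hidden layer and picks the highest-scoring class. Everything specific here is an assumption made for illustration and does not come from the patent: the two input features (normalized energy and zero-crossing rate), the three class labels, the hand-set weights in `HAND_SET_LAYERS`, and the function names. A real system would use trained weights and a richer feature vector.

```python
import math

# Hypothetical class labels; the abstract does not enumerate the classes.
CLASSES = ("silence", "unvoiced", "voiced")

# Hand-set weights for illustration only (a trained network would supply these).
# Input features are assumed to be [energy, zero_crossing_rate], each in [0, 1].
HAND_SET_LAYERS = [
    # Hidden layer: unit 0 responds to energy, unit 1 to zero-crossing rate.
    ([[4.0, 0.0], [0.0, 4.0]], [-1.0, -2.0]),
    # Output layer: one linear score per entry in CLASSES.
    ([[-2.0, 0.0], [1.0, 1.0], [1.0, -1.0]], [0.0, 0.0, 0.0]),
]


def dense(weights, biases, x):
    """Compute one fully connected layer: weights @ x + biases."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def mlp_classify(features, layers):
    """Forward pass: tanh on hidden layers, argmax over the output scores."""
    h = list(features)
    for weights, biases in layers[:-1]:
        h = [math.tanh(v) for v in dense(weights, biases, h)]
    weights, biases = layers[-1]
    scores = dense(weights, biases, h)
    return max(range(len(scores)), key=scores.__getitem__)
```

With these illustrative weights, `CLASSES[mlp_classify([0.8, 0.1], HAND_SET_LAYERS)]` evaluates to `"voiced"` (high energy, few zero crossings). A multi-stage arrangement could be approximated by feeding the first classifier's decision into the choice of a second set of layers, though the abstract does not specify how the stages are connected.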
41 Claims
1. A speech coding apparatus for encoding speech data which is input to the speech coding apparatus, the speech coding apparatus comprising:
an input device for receiving the speech data; and
at least one processor coupled to the input device, the at least one processor for parameterizing the speech data to produce at least one feature vector which describes parameters of the speech data, applying a first neural network to the at least one feature vector to obtain at least one speech classification of the speech data, creating characterized speech data by characterizing the speech data using a characterization methodology which depends on the at least one speech classification, and creating an encoded bitstream by encoding the characterized speech data.
Dependent claims: 2-17
18. A speech decoding apparatus for decoding an encoded bitstream to produce synthesized speech data, the speech decoding apparatus comprising:
a transmission channel interface for receiving the encoded bitstream from a speech encoding apparatus; and
at least one processor coupled to the transmission channel interface, the at least one processor for decoding a speech classification from a first portion of the encoded bitstream, wherein the speech classification was derived by a neural network in the speech encoding apparatus, the at least one processor also for decoding a remainder of the encoded bitstream using a decoding methodology which depends on the speech classification, resulting in a decoded bitstream, the at least one processor also for creating reconstructed speech basis elements from the decoded bitstream and producing the synthesized speech data using the reconstructed speech basis elements.
Dependent claims: 19
20. A method for encoding speech data by a speech coding apparatus comprising the steps of:
a) acquiring a segment of the speech data;
b) parameterizing the segment of the speech data to produce at least one feature vector which describes parameters of the speech data;
c) applying a first neural network to the at least one feature vector to obtain at least one speech classification of the speech data;
d) creating characterized speech data by characterizing the speech data using a characterization methodology which depends on the at least one speech classification; and
e) creating an encoded bitstream by encoding the characterized speech data.
Dependent claims: 21-39
40. A method for decoding an encoded bitstream to produce synthesized speech data, the method comprising the steps of:
a) receiving the encoded bitstream from a speech encoding apparatus;
b) decoding a speech classification from a first portion of the encoded bitstream, wherein the speech classification was derived by a neural network in the speech encoding apparatus;
c) decoding a remainder of the encoded bitstream using a decoding methodology which depends on the speech classification, resulting in a decoded bitstream;
d) creating reconstructed speech basis elements from the decoded bitstream; and
e) producing the synthesized speech data using the reconstructed speech basis elements.
Dependent claims: 41
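The encoding and decoding method claims describe a pipeline in which a classification chosen at the encoder selects both the characterization methodology and, via the first portion of the bitstream, the decoding methodology. The sketch below is one toy way to arrange such a pipeline and is not the patented implementation. All names, the two class codes, the feature choice, the one-byte quantization, the bitstream layout, and the synthesis formulas are assumptions, and the classifier is a single hand-set neuron standing in for a trained neural network.

```python
import math
import random

VOICED, UNVOICED = 1, 0          # hypothetical class codes
WEIGHTS, BIAS = [0.0, -4.0], 1.0  # hand-set neuron, not trained values


def parameterize(segment):
    """Feature vector [mean energy, zero-crossing rate] for one segment."""
    energy = sum(s * s for s in segment) / len(segment)
    crossings = sum(1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0))
    return [energy, crossings / (len(segment) - 1)]


def classify(features):
    """Single-neuron stand-in for the neural network classifier."""
    activation = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return VOICED if activation > 0 else UNVOICED


def quantize(value):
    """Map a value in [0, 1] to one byte."""
    return max(0, min(255, round(255 * value)))


def encode(segment):
    """Classify the segment, then characterize and encode it per class."""
    energy, zcr = parameterize(segment)
    speech_class = classify([energy, zcr])
    gain = math.sqrt(energy)  # RMS level
    if speech_class == VOICED:
        payload = [quantize(gain), quantize(zcr)]  # level plus periodicity proxy
    else:
        payload = [quantize(gain)]                 # level only
    return bytes([speech_class] + payload)         # class occupies the first byte


def decode(bitstream, length):
    """Read the class from the first byte, then decode and synthesize per class."""
    speech_class = bitstream[0]
    basis = {"gain": bitstream[1] / 255}
    if speech_class == VOICED:
        basis["freq"] = (bitstream[2] / 255) / 2   # a sinusoid crosses zero twice per cycle
        peak = basis["gain"] * math.sqrt(2)        # sine peak for the given RMS
        samples = [peak * math.sin(2 * math.pi * basis["freq"] * n) for n in range(length)]
    else:
        rng = random.Random(0)                     # fixed seed keeps the sketch repeatable
        scale = basis["gain"] * math.sqrt(3)       # uniform noise scaled to the given RMS
        samples = [scale * rng.uniform(-1.0, 1.0) for _ in range(length)]
    return speech_class, samples
```

For example, encoding 160 samples of a slow sinusoid yields a three-byte bitstream whose first byte is `VOICED`, while a noise-like segment yields two bytes starting with `UNVOICED`; `decode` then branches on that first byte. The variable-length payload is what makes the class header necessary in this sketch: the decoder cannot parse the remainder without first knowing which methodology produced it.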
Specification