TRAINING METHOD AND APPARATUS FOR CONVOLUTIONAL NEURAL NETWORK MODEL
First Claim
1. A method for training a Convolutional Neural Network (CNN) model, comprising:
acquiring, by a server, initial model parameters of a CNN model to be trained, the initial model parameters comprising initial convolution kernels and initial bias matrixes of convolution layers of respective levels, and an initial weight matrix and an initial bias vector of a fully connected layer;
acquiring a plurality of training images;
on the convolution layer of each level, performing, by the server, convolution operation and maximal pooling operation on each of the training images to obtain a first feature image of each of the training images on the convolution layer of each level by using the initial convolution kernel and initial bias matrix of the convolution layer of each level;
performing, by the server, horizontal pooling operation on the first feature image of each of the training images on the convolution layer of at least one of the levels to obtain a second feature image of each of the training images on the convolution layer of each level;
determining, by the server, a feature vector of each of the training images according to the second feature image of each of the training images on the convolution layer of each level;
processing, by the server, each feature vector to obtain a classification probability vector of each of the training images according to the initial weight matrixes and the initial bias vectors;
calculating, by the server, a classification error according to the classification probability vector and initial classification of each of the training images;
regulating, by the server, the model parameters of the CNN model to be trained on the basis of the classification errors;
on the basis of the regulated model parameters and the plurality of training images, continuing, by the server, the process of regulating the model parameters, until the number of iterations reaches a preset number; and
determining, by the server, model parameters obtained when the number of iterations reaches the preset number as the model parameters of the trained CNN model.
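The steps recited in claim 1 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: horizontal pooling is assumed to mean taking the maximum of each row; each convolution level uses one single-channel kernel with a scalar bias in place of a bias matrix; the classification probability vector is assumed to come from a softmax; the classification error is assumed to be cross-entropy; and the parameters are regulated by plain gradient descent with finite-difference gradients, chosen only for brevity because the claim does not say how the regulation is done. All names (`conv2d`, `max_pool`, `horizontal_pool`, `train`, `bar_image`) and the toy 8x8 images are hypothetical.

```python
import math
import random

KERNEL_SHAPES = [(3, 3), (2, 2)]   # one single-channel kernel per convolution level (assumption)
FEATURE_LEN = 4                    # 3 pooled rows from level 1 + 1 from level 2, for 8x8 inputs
NUM_CLASSES = 2
NUM_PARAMS = sum(h * w + 1 for h, w in KERNEL_SHAPES) + NUM_CLASSES * (FEATURE_LEN + 1)

def conv2d(img, kernel, bias):
    """Valid 2-D convolution (cross-correlation form) plus a scalar bias."""
    kh, kw = len(kernel), len(kernel[0])
    return [[bias + sum(img[r + i][c + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw))
             for c in range(len(img[0]) - kw + 1)]
            for r in range(len(img) - kh + 1)]

def max_pool(img, size=2):
    """Non-overlapping size x size maximal pooling."""
    return [[max(img[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(img[0]) - size + 1, size)]
            for r in range(0, len(img) - size + 1, size)]

def horizontal_pool(img):
    """Assumed definition of horizontal pooling: keep the maximum of each row."""
    return [max(row) for row in img]

def unpack(theta):
    """Split the flat parameter list into per-level kernels/biases and the FC layer."""
    pos, levels = 0, []
    for kh, kw in KERNEL_SHAPES:
        kernel = [theta[pos + i * kw: pos + (i + 1) * kw] for i in range(kh)]
        pos += kh * kw
        levels.append((kernel, theta[pos]))
        pos += 1
    weights = [theta[pos + c * FEATURE_LEN: pos + (c + 1) * FEATURE_LEN]
               for c in range(NUM_CLASSES)]
    pos += NUM_CLASSES * FEATURE_LEN
    return levels, weights, theta[pos:pos + NUM_CLASSES]

def forward(theta, img):
    """Return the classification probability vector for one image."""
    levels, weights, bias_vec = unpack(theta)
    x, feature = img, []
    for kernel, bias in levels:
        x = max_pool(conv2d(x, kernel, bias))   # first feature image of this level
        feature += horizontal_pool(x)           # second feature image, appended to the feature vector
    logits = [sum(w * f for w, f in zip(row, feature)) + b
              for row, b in zip(weights, bias_vec)]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    return [e / sum(exps) for e in exps]

def classification_error(theta, images, labels):
    """Mean cross-entropy between predicted probabilities and the initial classifications."""
    return -sum(math.log(max(forward(theta, img)[y], 1e-12))
                for img, y in zip(images, labels)) / len(images)

def train(theta, images, labels, iterations=30, lr=0.2, eps=1e-4):
    """Regulate the parameters for a preset number of iterations."""
    for _ in range(iterations):
        grad = []
        for i in range(len(theta)):
            up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
            down = theta[:i] + [theta[i] - eps] + theta[i + 1:]
            grad.append((classification_error(up, images, labels)
                         - classification_error(down, images, labels)) / (2 * eps))
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def bar_image(row, size=8):
    """Toy training image: a single bright horizontal bar at the given row."""
    return [[1.0 if r == row else 0.0 for _ in range(size)] for r in range(size)]

rng = random.Random(0)
theta0 = [rng.gauss(0.0, 0.1) for _ in range(NUM_PARAMS)]   # initial model parameters
images = [bar_image(1), bar_image(2), bar_image(5), bar_image(6)]
labels = [0, 0, 1, 1]                                        # bar in upper half vs. lower half
error_before = classification_error(theta0, images, labels)
theta = train(theta0, images, labels)
error_after = classification_error(theta, images, labels)
print(error_before, error_after)
```

The parameters returned once the preset iteration count is reached play the role of the trained model parameters in the final step of the claim. A practical implementation would replace the finite-difference loop with backpropagation.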
Abstract
Disclosed are a training method and apparatus for a CNN model, which belong to the field of image recognition. The method comprises: performing a convolution operation, a maximal pooling operation and a horizontal pooling operation on training images to obtain second feature images; determining feature vectors according to the second feature images; processing the feature vectors to obtain category probability vectors; calculating a category error according to the category probability vectors and an initial category; adjusting model parameters based on the category error; continuing the model parameter adjustment process based on the adjusted model parameters; and using the model parameters obtained when the number of iterations reaches a preset number as the model parameters of the trained CNN model. After the convolution operation and maximal pooling operation are performed on the training images on the convolution layer of each level, a horizontal pooling operation is performed. Because the horizontal pooling operation can extract, from the feature images, feature images that identify features in the horizontal direction of an image, the trained CNN model can recognize an image of any size, which expands the applicable range of the trained CNN model in image recognition.
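To make the size argument concrete, the sketch below assumes that horizontal pooling keeps the maximum of each row of a feature image; that definition is not stated in the abstract and is an assumption here, as is the function name `horizontal_pool`. Under that assumption the pooled vector has one entry per row, so feature images of different widths yield vectors of the same length.

```python
def horizontal_pool(feature_image):
    """Assumed definition: reduce each row of the feature image to its maximum."""
    return [max(row) for row in feature_image]

# Two feature images with the same height (3 rows) but different widths.
narrow = [[1, 5],
          [3, 2],
          [0, 4]]
wide = [[1, 5, 2, 9],
        [3, 2, 8, 1],
        [0, 4, 4, 7]]

print(horizontal_pool(narrow))  # [5, 3, 4]
print(horizontal_pool(wide))    # [9, 8, 7]
```

In this sketch the output length follows the number of rows only, so a fully connected layer of fixed input size could consume either result.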
21 Claims
1. A method for training a Convolutional Neural Network (CNN) model, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A device for training a Convolutional Neural Network (CNN) model, comprising:
one or more processors, and a memory connected with the one or more processors, the memory being configured to store instructions executable for the one or more processors, wherein the one or more processors are configured to execute the instructions stored in the memory to:
acquire initial model parameters of a CNN model to be trained, the initial model parameters comprising initial convolution kernels and initial bias matrixes of convolution layers of respective levels, and an initial weight matrix and an initial bias vector of a fully connected layer;
acquire a plurality of training images;
on the convolution layer of each level, perform convolution operation and maximal pooling operation on each of the training images to obtain a first feature image of each of the training images on the convolution layer of each level by using the initial convolution kernel and initial bias matrix of the convolution layer of each level;
perform horizontal pooling operation on the first feature image of each of the training images on the convolution layer of at least one of the levels to obtain a second feature image of each of the training images on the convolution layer of each level;
determine a feature vector of each of the training images according to the second feature image of each of the training images on the convolution layer of each level;
process each feature vector to obtain a classification probability vector of each of the training images according to the initial weight matrixes and the initial bias vectors;
calculate a classification error according to the classification probability vector and initial classification of each of the training images;
regulate the model parameters of the CNN model to be trained on the basis of the classification errors;
continue, on the basis of the regulated model parameters and the plurality of training images, the process of regulating the model parameters until the number of iterations reaches a preset number; and
determine model parameters obtained when the number of iterations reaches the preset number as the model parameters of the trained CNN model.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20.
21. A server, comprising:
one or more processors, and a memory connected with the one or more processors, the memory being configured to store instructions executable for the one or more processors, wherein the one or more processors are configured to execute the instructions stored in the memory to execute a method for training the Convolutional Neural Network (CNN) model, the method comprising:
acquiring initial model parameters of a CNN model to be trained, the initial model parameters comprising initial convolution kernels and initial bias matrixes of convolution layers of respective levels, and an initial weight matrix and an initial bias vector of a fully connected layer;
acquiring a plurality of training images;
on the convolution layer of each level, performing convolution operation and maximal pooling operation on each of the training images to obtain a first feature image of each of the training images on the convolution layer of each level by using the initial convolution kernel and initial bias matrix of the convolution layer of each level;
performing horizontal pooling operation on the first feature image of each of the training images on the convolution layer of at least one of the levels to obtain a second feature image of each of the training images on the convolution layer of each level;
determining a feature vector of each of the training images according to the second feature image of each of the training images on the convolution layer of each level;
processing each feature vector to obtain a classification probability vector of each of the training images according to the initial weight matrixes and the initial bias vectors;
calculating a classification error according to the classification probability vector and initial classification of each of the training images;
regulating the model parameters of the CNN model to be trained on the basis of the classification errors;
on the basis of the regulated model parameters and the plurality of training images, continuing the process of regulating the model parameters, until the number of iterations reaches a preset number; and
determining model parameters obtained when the number of iterations reaches the preset number as the model parameters of the trained CNN model.
Specification