Object recognition with reduced neural network weight precision
First Claim
1. A client device configured with a trained neural network, the client device comprising:
- a processor, a memory, a user interface, a communications interface, a power supply and an input device;
the memory comprising the trained neural network received from a server system, wherein the server system has trained and configured a server-based neural network to be used as the trained neural network for the client device;
wherein:
the trained neural network is configured to generate a feature map, the feature map comprising a plurality of weight values derived from an input image; and
the trained neural network is configured to perform a unitary quantizing operation or a supervised iterative quantization operation on the feature map to reduce a number of bits of each weight of the plurality of weight values from a first predetermined number to a second predetermined number that is less than the first predetermined number without changing a dimension of the feature map.
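The unitary quantizing operation recited above can be illustrated with a short sketch: a simple uniform quantizer that reduces each feature-map value from 32 bits to 8 bits while leaving the feature map's dimensions untouched. This is an illustrative assumption, not the patented method itself; the function name and the specific bit widths are hypothetical.

```python
import numpy as np

def quantize_feature_map(fmap: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Uniformly quantize a feature map to `num_bits` per value
    without changing its dimensions (illustrative sketch only)."""
    levels = 2 ** num_bits - 1
    lo, hi = float(fmap.min()), float(fmap.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Each 32-bit value is mapped to one of 2**num_bits discrete levels;
    # the array shape (the feature map's dimensions) is preserved.
    return np.round((fmap - lo) / scale).astype(np.uint8)

fmap = np.random.rand(16, 14, 14).astype(np.float32)  # 32 bits per value
q = quantize_feature_map(fmap)                        # 8 bits per value
assert q.shape == fmap.shape                          # dimensions unchanged
```

Note that only the per-value bit width changes (first predetermined number 32, second predetermined number 8 in this sketch); the feature map stays the same shape.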
Abstract
A client device configured with a neural network includes a processor, a memory, a user interface, a communications interface, a power supply and an input device, wherein the memory includes a trained neural network received from a server system that has trained and configured the neural network for the client device. A server system and a method of training a neural network are disclosed.
85 Citations
17 Claims
1. A client device configured with a trained neural network, the client device comprising:

a processor, a memory, a user interface, a communications interface, a power supply and an input device;
the memory comprising the trained neural network received from a server system, wherein the server system has trained and configured a server-based neural network to be used as the trained neural network for the client device;
wherein:
the trained neural network is configured to generate a feature map, the feature map comprising a plurality of weight values derived from an input image; and
the trained neural network is configured to perform a unitary quantizing operation or a supervised iterative quantization operation on the feature map to reduce a number of bits of each weight of the plurality of weight values from a first predetermined number to a second predetermined number that is less than the first predetermined number without changing a dimension of the feature map.

(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)

10. A method that comprises performing the following using a client device:
receiving a trained neural network from a server system, wherein the server system has trained and configured a server-based neural network to be used as the trained neural network for the client device;
capturing an input image;
processing the input image using the trained neural network;
generating a feature map using the trained neural network, the feature map comprising a plurality of weight values derived from the input image;
performing a unitary quantizing operation or a supervised iterative quantization operation on the feature map using the trained neural network to reduce a number of bits of each weight of the plurality of weight values from a first predetermined number to a second predetermined number that is less than the first predetermined number without changing a dimension of the feature map; and
recognizing an object in the input image based on a result of the processing.

(Dependent claims: 11, 12, 13)

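Taken together, the steps of the method above form a small client-side pipeline: receive the trained network, capture an image, extract a feature map, quantize it, and classify. A minimal sketch, assuming a toy "network" of plain functions (all names and the toy feature extractor and classifier are hypothetical; the claim does not specify a particular architecture):

```python
import numpy as np

def quantize(fmap: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Reduce each feature-map value to `num_bits` without changing shape."""
    levels = 2 ** num_bits - 1
    lo, hi = float(fmap.min()), float(fmap.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    return np.round((fmap - lo) / scale).astype(np.uint8)

# Toy stand-ins for the trained neural network received from the server system.
def extract_features(image: np.ndarray) -> np.ndarray:
    # Hypothetical "feature map": scaled pixel intensities.
    return image.astype(np.float32) / 255.0

def classify(fmap: np.ndarray) -> int:
    # Placeholder class id; a real network would run learned layers here.
    return int(fmap.sum() % 10)

def recognize(image: np.ndarray) -> int:
    fmap = quantize(extract_features(image))   # generate + quantize feature map
    return classify(fmap)                      # recognize an object in the image

image = np.random.randint(0, 256, (28, 28), dtype=np.uint8)  # captured input
label = recognize(image)
```

The quantization step sits between feature extraction and recognition, which matches the ordering of the method steps: the feature map is reduced in bit width before the result of the processing is used to recognize the object.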
14. A non-transitory computer-readable medium storing program code which, when executed by a processor, causes the processor to perform the following:
receive a trained neural network from a server system, wherein the server system has trained and configured a server-based neural network to be used as the trained neural network;
capture an input image;
generate a feature map using the trained neural network, the feature map comprising a plurality of weight values derived from the input image;
perform a unitary quantizing operation or a supervised iterative quantization operation on the feature map using the trained neural network to reduce a number of bits of each weight of the plurality of weight values from a first predetermined number to a second predetermined number that is less than the first predetermined number without changing a dimension of the feature map; and
process the input image using the trained neural network to recognize an object in the input image.

(Dependent claims: 15, 16)

17. A client device configured with a trained neural network, the client device comprising:
a processor, a memory, a user interface, a communications interface, a power supply and an input device;
the memory comprising the trained neural network received from a server system, wherein the server system has trained and configured a server-based neural network to be used as the trained neural network for the client device;
wherein:
the trained neural network is configured to generate a feature map, the feature map comprising a plurality of first weight values derived from an input image;
the trained neural network is configured to convert the first weight values of the feature map into second weight values by a unitary or a supervised iterative quantizing operation; and
the second weight values are encoded using a number of bits lower than that used to encode the first weight values without changing a dimension of the feature map.

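The distinction in claim 17 between first and second weight values can be made concrete by comparing storage before and after quantization. A sketch assuming 32-bit floats for the first weight values and an 8-bit encoding for the second (the claim only requires the second bit count to be lower; these particular widths are an assumption):

```python
import numpy as np

# First weight values of a feature map: 32 bits per value.
first = np.random.randn(64, 7, 7).astype(np.float32)

# Second weight values: the same feature map re-encoded at 8 bits per value.
levels = 2 ** 8 - 1
lo, hi = float(first.min()), float(first.max())
second = np.round((first - lo) / ((hi - lo) / levels)).astype(np.uint8)

assert second.shape == first.shape        # feature-map dimensions unchanged
bits_first = first.itemsize * 8           # 32 bits per weight
bits_second = second.itemsize * 8         # 8 bits per weight
ratio = first.nbytes // second.nbytes     # 4x less memory in this sketch
```

Reducing the encoding from 32 to 8 bits per value cuts the feature map's memory footprint by a factor of four while its dimensions, and thus the downstream layer shapes, stay the same.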
Specification