Hierarchical machine learning system for lifelong learning
Abstract
Embodiments described herein cover a hierarchical machine learning system with a separated perception subsystem (which includes a hierarchy of nodes having at least a first layer and a second layer) and an application subsystem. In one example embodiment, a first node in the first layer receives a first input and processes at least a portion of the first input to generate a first feature vector. A second node in the second layer processes a second input comprising at least a portion of the first feature vector to generate a second feature vector. The first node generates a first sparse feature vector from the first feature vector and/or the second node generates a second sparse feature vector from the second feature vector. A third node of the perception subsystem then processes at least one of the first sparse feature vector or the second sparse feature vector to determine an output.
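The abstract does not say how a sparse feature vector is derived from a dense one. As a minimal sketch, assuming a top-k rule (keep the largest-magnitude elements and zero the rest), the step might look like the following. The function name `sparsify` and the parameter `keep` are hypothetical and do not come from the patent text.

```python
def sparsify(features, keep):
    """Return a copy of `features` with all but the `keep` strongest elements zeroed.

    Assumption: "strongest" means largest absolute value. The source only
    requires that a majority of the resulting elements are zero.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Rank element indices by magnitude, strongest first.
    ranked = sorted(range(len(features)), key=lambda i: abs(features[i]), reverse=True)
    winners = set(ranked[:keep])
    return [v if i in winners else 0.0 for i, v in enumerate(features)]


dense = [0.1, 0.9, -0.7, 0.2, 0.05, 0.3]
sparse = sparsify(dense, keep=2)
print(sparse)  # [0.0, 0.9, -0.7, 0.0, 0.0, 0.0]
```

Choosing `keep` below half the vector length is what makes the majority of elements zero, as the claims require.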
27 Claims
1. A computer-implemented method for a machine learning system that mitigates catastrophic forgetting, comprising:
  receiving a first input at a first node in a first layer of a hierarchy of nodes, wherein the first input comprises a) a first previous feature vector that was generated by the first node based on a first previous input and b) a data item;
  processing at least a portion of the first input by the first node to generate a first feature vector, the first feature vector comprising a first plurality of feature elements;
  processing a second input by a second node in a second layer of the hierarchy of nodes to generate a second feature vector, wherein the second input comprises a) at least a portion of the first feature vector and b) a second previous feature vector that was generated by the second node based on a second previous input, and wherein the second feature vector comprises a second plurality of feature elements;
  generating at least one of a first sparse feature vector from the first feature vector or a second sparse feature vector from the second feature vector, wherein a majority of feature elements in the first sparse feature vector and the second sparse feature vector have a value of zero; and
  processing at least one of the first sparse feature vector or the second sparse feature vector by a third node to determine a first output.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 16, 17, 18.
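The data flow recited in claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the class `RecurrentNode`, the random weights, the `tanh` activation, the top-k `sparsify` helper, and the placeholder `third_node` are all hypothetical. The only parts taken from the claim are that each node's input includes its own previous feature vector, that the second node consumes the first node's feature vector, and that a third node works from sparse vectors.

```python
import math
import random


class RecurrentNode:
    """Hypothetical node whose input is [external input + its own previous feature vector]."""

    def __init__(self, input_size, feature_size, seed):
        rng = random.Random(seed)
        # Feature vector generated from the previous input (all zeros before the first step).
        self.previous = [0.0] * feature_size
        total = input_size + feature_size
        # Assumption: a fixed random linear map stands in for the node's learned model.
        self.weights = [[rng.uniform(-1, 1) for _ in range(total)] for _ in range(feature_size)]

    def process(self, external):
        combined = list(external) + self.previous
        features = [math.tanh(sum(w * x for w, x in zip(row, combined))) for row in self.weights]
        self.previous = features  # becomes part of the next input
        return features


def sparsify(features, keep):
    """Assumed top-k rule: zero everything except the `keep` largest-magnitude elements."""
    ranked = sorted(range(len(features)), key=lambda i: abs(features[i]), reverse=True)
    winners = set(ranked[:keep])
    return [v if i in winners else 0.0 for i, v in enumerate(features)]


def third_node(sparse_vectors):
    """Placeholder output: index of the strongest element across all sparse vectors."""
    merged = [v for vec in sparse_vectors for v in vec]
    return max(range(len(merged)), key=lambda i: abs(merged[i]))


first = RecurrentNode(input_size=4, feature_size=8, seed=1)   # first layer
second = RecurrentNode(input_size=8, feature_size=8, seed=2)  # second layer

for data_item in ([1, 0, 0, 1], [0, 1, 1, 0]):
    f1 = first.process(data_item)  # first feature vector
    f2 = second.process(f1)        # second feature vector, built from the first
    s1, s2 = sparsify(f1, keep=2), sparsify(f2, keep=2)
    output = third_node([s1, s2])
```

On the second iteration each node's input already contains the feature vector it produced on the first, which is the recurrence the claim describes.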
12. A computer-implemented method for a machine learning system that mitigates catastrophic forgetting, comprising:
  receiving a first input at a first node in a first layer of a hierarchy of nodes;
  processing at least a portion of the first input by the first node to generate a first feature vector, the first feature vector comprising a first plurality of feature elements;
  processing a second input comprising at least a portion of the first feature vector by a second node in a second layer of the hierarchy of nodes to generate a second feature vector, the second feature vector comprising a second plurality of feature elements;
  generating at least one of a first sparse feature vector from the first feature vector or a second sparse feature vector from the second feature vector, wherein a majority of feature elements in the first sparse feature vector and the second sparse feature vector have a value of zero;
  processing at least one of the first sparse feature vector or the second sparse feature vector by a third node to determine a first output;
  receiving a new input at the first node;
  processing at least a portion of the new input by the first node to generate a first new feature vector; and
  determining, by the second node, whether a second new input comprising at least a portion of the first new feature vector satisfies a processing criterion.
Dependent claims: 13, 14, 15.
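Claim 12 ends with the second node deciding whether its new input satisfies a processing criterion, but this excerpt does not define the criterion or say what happens when it is not met. The sketch below assumes a simple novelty test (the input must differ from the last processed input by more than a threshold) and assumes the node reuses its previous feature vector otherwise. `GatedNode`, `threshold`, and the doubling used as stand-in processing are all hypothetical.

```python
class GatedNode:
    """Hypothetical second-layer node that only processes sufficiently new inputs."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.last_input = None
        self.last_features = None

    def satisfies_criterion(self, new_input):
        # Assumed criterion: total absolute change since the last processed input.
        if self.last_input is None:
            return True
        change = sum(abs(a - b) for a, b in zip(new_input, self.last_input))
        return change > self.threshold

    def step(self, new_input):
        """Return (feature_vector, processed_flag)."""
        if self.satisfies_criterion(new_input):
            self.last_input = list(new_input)
            # Stand-in for the node's real processing.
            self.last_features = [2.0 * x for x in new_input]
            return self.last_features, True
        # Assumption: when the criterion fails, reuse the earlier feature vector.
        return self.last_features, False


node = GatedNode(threshold=0.5)
r1 = node.step([1.0, 0.0])  # first input is always processed
r2 = node.step([1.0, 0.1])  # change of 0.1 is below the threshold, so it is skipped
r3 = node.step([0.0, 1.0])  # change of 2.0 exceeds the threshold, so it is processed
```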
19. A system comprising:
  a memory to store instructions for a machine learning system that mitigates catastrophic forgetting; and
  a processing device, operatively coupled to the memory, to execute the instructions, wherein the instructions cause the processing device to instantiate the machine learning system, the machine learning system comprising:
    a hierarchy of nodes, the hierarchy of nodes comprising:
      a first layer of nodes comprising a first plurality of nodes, wherein a first node of the first plurality of nodes is to:
        process a first input to produce a first feature vector; and
        generate a first sparse feature vector from the first feature vector; and
      a second layer of nodes comprising a second plurality of nodes, wherein a second node of the second plurality of nodes is to:
        process a second input comprising the first feature vector to produce a second feature vector; and
        generate a second sparse feature vector from the second feature vector; and
    an additional node to:
      receive a first plurality of sparse feature vectors from the first plurality of nodes, the first plurality of sparse feature vectors comprising the first sparse feature vector;
      receive a second plurality of sparse feature vectors from the second plurality of nodes, the second plurality of sparse feature vectors comprising the second sparse feature vector; and
      determine an output based on the first plurality of sparse feature vectors and the second plurality of sparse feature vectors.
Dependent claims: 20, 21, 22, 23, 24, 25.
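The additional node in claim 19 receives sparse feature vectors from every node in both layers and determines one output from them. The claim does not say how that output is computed, so the following is only a sketch that assumes a linear readout over the concatenated vectors followed by an argmax. The function `additional_node` and the `weights` layout (one weight row per candidate output) are hypothetical.

```python
def additional_node(layer1_sparse, layer2_sparse, weights):
    """Pick the output whose weight row best matches the active (nonzero) elements.

    Assumption: a linear readout followed by argmax. The claim only says the
    output is based on both pluralities of sparse feature vectors.
    """
    merged = [v for vec in layer1_sparse + layer2_sparse for v in vec]
    # Only nonzero elements contribute, so the work scales with the active count.
    scores = [sum(row[i] * v for i, v in enumerate(merged) if v != 0.0) for row in weights]
    return scores.index(max(scores))


# Two first-layer nodes and one second-layer node, each emitting a 4-element sparse vector.
layer1 = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.5, 0.0, 0.0]]
layer2 = [[0.0, 0.0, 0.0, 2.0]]

w0 = [0.0] * 12
w0[2] = 1.0   # output 0 responds to an element from the first first-layer node
w1 = [0.0] * 12
w1[11] = 1.0  # output 1 responds to an element from the second-layer node

result = additional_node(layer1, layer2, [w0, w1])
print(result)  # 1
```

Here output 1 wins because the active second-layer element (2.0) outweighs the first-layer element (1.0) that output 0 responds to.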
26. A non-transitory computer readable medium comprising instructions that, when executed by a processing device, cause the processing device to perform operations for a machine learning system that mitigates catastrophic forgetting, the operations comprising:
  receiving a first input at a first node in a first layer of a hierarchy of nodes;
  processing at least a portion of the first input by the first node to generate a first feature vector, the first feature vector comprising a first plurality of feature elements;
  processing a second input comprising at least a portion of the first feature vector by a second node in a second layer of the hierarchy of nodes to generate a second feature vector, the second feature vector comprising a second plurality of feature elements;
  generating at least one of a first sparse feature vector from the first feature vector or a second sparse feature vector from the second feature vector, wherein a majority of feature elements in the first sparse feature vector and the second sparse feature vector have a value of zero; and
  processing at least one of the first sparse feature vector or the second sparse feature vector by a third node to determine a first output;
  determining whether update criteria are satisfied for the first node;
  separately determining whether the update criteria are satisfied for the second node; and
  updating at least one of the first node or the second node.
Dependent claims: 27.
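Claim 26 adds per-node update decisions: the update criteria are evaluated for the first node and, separately, for the second node, and only then is at least one node updated. The excerpt does not state the criteria, so this sketch assumes a node is updated only when its own error magnitude exceeds a threshold. `UpdatableNode`, the scalar `weight`, the error values, and the learning rate are all hypothetical.

```python
class UpdatableNode:
    """Hypothetical node with a single scalar parameter and its own update check."""

    def __init__(self, name, weight, error_threshold):
        self.name = name
        self.weight = weight
        self.error_threshold = error_threshold

    def update_criteria_satisfied(self, error):
        # Assumed criterion: only a sufficiently large error triggers learning.
        return abs(error) > self.error_threshold

    def update(self, error, learning_rate=0.1):
        self.weight -= learning_rate * error


def maybe_update(nodes_and_errors):
    """Evaluate the criteria for each node separately; return the names of updated nodes."""
    updated = []
    for node, error in nodes_and_errors:
        if node.update_criteria_satisfied(error):
            node.update(error)
            updated.append(node.name)
    return updated


first = UpdatableNode("first", weight=1.0, error_threshold=0.5)
second = UpdatableNode("second", weight=1.0, error_threshold=0.5)

updated = maybe_update([(first, 0.8), (second, 0.1)])
print(updated)  # ['first']
```

In this sketch the second node's parameter is left untouched because its criterion was not met, so whatever it had already learned is preserved while the first node adapts.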
Specification