Machine learning architecture for lifelong learning
First Claim
1. A computer-implemented method for a machine learning system that mitigates catastrophic forgetting, comprising:
receiving a data item;
processing, by a first node that comprises a plurality of centroids, information from at least a portion of the data item to generate a first feature vector, wherein the first feature vector comprises a plurality of feature elements, each of the plurality of feature elements having a similarity value representing a similarity to one of the plurality of centroids;
selecting a subset of the plurality of feature elements from the first feature vector, the subset containing one or more feature elements of the plurality of feature elements that have highest similarity values;
generating a second feature vector from the first feature vector by replacing similarity values of feature elements in the first feature vector that are not in the subset with zeros;
processing the second feature vector by a second node to determine an output;
determining, by the first node, a novelty rating for the data item based on similarity values of the plurality of feature elements in at least one of the first feature vector or the second feature vector;
determining a relevancy rating for the data item; and
determining whether to update the first node based on the novelty rating and the relevancy rating.
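The last three steps of the claim (novelty rating, relevancy rating, update decision) can be illustrated with a minimal sketch. The claim does not say how either rating is computed or how the two are combined, so everything concrete here is an assumption made for illustration: similarity values are taken to lie in [0, 1], novelty is taken as one minus the best similarity, the relevancy rating is supplied by the caller, and "updating the first node" is modeled as appending a new centroid. The function names and thresholds are hypothetical.

```python
from typing import List, Sequence


def novelty_rating(similarities: Sequence[float]) -> float:
    """Return a novelty score from the similarity values of a feature vector.

    ASSUMPTION: similarities lie in [0, 1] and novelty is 1 minus the best
    match. The claim only requires that novelty be based on the similarities.
    """
    if not similarities:
        raise ValueError("feature vector must contain at least one element")
    return 1.0 - max(similarities)


def should_update(novelty: float, relevancy: float,
                  novelty_min: float = 0.5, relevancy_min: float = 0.5) -> bool:
    """Decide whether to update the first node.

    ASSUMPTION: update only when the item is both novel and relevant.
    The thresholds are illustrative, not taken from the claim.
    """
    return novelty >= novelty_min and relevancy >= relevancy_min


def maybe_add_centroid(centroids: List[List[float]], item: Sequence[float],
                       similarities: Sequence[float], relevancy: float) -> bool:
    """Append the item as a new centroid when the update test passes.

    ASSUMPTION: an "update" allocates a new centroid and leaves existing
    centroids untouched, which is one way to avoid overwriting old knowledge.
    """
    if should_update(novelty_rating(similarities), relevancy):
        centroids.append(list(item))
        return True
    return False
```

Under these assumptions, an item that matches an existing centroid well (low novelty), or one that the application does not care about (low relevancy), leaves the first node unchanged.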
Abstract
Some embodiments described herein cover a machine learning architecture with a separated perception subsystem and application subsystem. These subsystems can be co-trained. In one example embodiment, a data item is received and information from the data item is processed by a first node to generate a first feature vector comprising a plurality of features, each of the plurality of features having a similarity value representing a similarity to one of a plurality of centroids. The first node selects a subset of the features from the first feature vector, the subset containing one or more features that have highest similarity values. The first node generates a second feature vector from the first feature vector by replacing similarity values of features in the first feature vector that are not in the subset with zeros. A second node then processes the second feature vector to determine an output.
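The data flow described in the abstract (similarities to centroids, keeping only the highest values, zeroing the rest, then passing the result to a second node) can be sketched as follows. This is a rough illustration rather than the patented implementation: the Gaussian-style similarity function, the fixed count `k`, and the linear second node are all assumptions, and every function name is hypothetical.

```python
import math
from typing import List, Sequence


def similarity(x: Sequence[float], centroid: Sequence[float]) -> float:
    """ASSUMPTION: similarity is exp(-squared Euclidean distance), in (0, 1]."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, centroid))
    return math.exp(-d2)


def first_feature_vector(x: Sequence[float],
                         centroids: Sequence[Sequence[float]]) -> List[float]:
    """One feature element per centroid, holding the similarity to it."""
    return [similarity(x, c) for c in centroids]


def sparsify_top_k(features: Sequence[float], k: int) -> List[float]:
    """Keep the k highest similarity values and replace the others with zeros.

    Ties are broken in favor of the lower index (Python's sort is stable).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    order = sorted(range(len(features)), key=lambda i: features[i], reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(features)]


def second_node(features: Sequence[float], weights: Sequence[float]) -> float:
    """ASSUMPTION: the second node is a single linear readout."""
    return sum(f * w for f, w in zip(features, weights))


# Usage: two inputs, three centroids, keep the two best-matching features.
centroids = [[0.0, 0.0], [1.0, 0.0], [3.0, 3.0]]
dense = first_feature_vector([0.0, 0.0], centroids)
sparse = sparsify_top_k(dense, k=2)
output = second_node(sparse, weights=[0.5, 0.5, 0.5])
```

Because the second feature vector has the same length as the first, the second node sees a fixed-size input in which only the best-matching centroids contribute.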
24 Claims
1. A computer-implemented method for a machine learning system that mitigates catastrophic forgetting, comprising:
receiving a data item;
processing, by a first node that comprises a plurality of centroids, information from at least a portion of the data item to generate a first feature vector, wherein the first feature vector comprises a plurality of feature elements, each of the plurality of feature elements having a similarity value representing a similarity to one of the plurality of centroids;
selecting a subset of the plurality of feature elements from the first feature vector, the subset containing one or more feature elements of the plurality of feature elements that have highest similarity values;
generating a second feature vector from the first feature vector by replacing similarity values of feature elements in the first feature vector that are not in the subset with zeros;
processing the second feature vector by a second node to determine an output;
determining, by the first node, a novelty rating for the data item based on similarity values of the plurality of feature elements in at least one of the first feature vector or the second feature vector;
determining a relevancy rating for the data item; and
determining whether to update the first node based on the novelty rating and the relevancy rating.
Dependent claims: 2-18.
19. A system comprising:
at least one memory to store instructions for a machine learning system that mitigates catastrophic forgetting; and
at least one processing device, operatively coupled to the at least one memory, to execute the instructions, wherein the instructions cause the processing device to:
receive a data item;
process, by a first node that comprises a plurality of centroids, information from at least a portion of the data item to generate a first feature vector, wherein the first feature vector comprises a plurality of feature elements, each of the plurality of feature elements having a similarity value representing a similarity to one of the plurality of centroids;
select a subset of the plurality of feature elements from the first feature vector, the subset containing one or more feature elements of the plurality of feature elements that have highest similarity values;
generate a second feature vector from the first feature vector by replacing similarity values of feature elements in the first feature vector that are not in the subset with zeros;
process the second feature vector by a second node to determine an output;
determine, by the first node, a novelty rating for the data item based on similarity values of the plurality of feature elements in at least one of the first feature vector or the second feature vector;
determine a relevancy rating for the data item; and
determine whether to update the first node based on the novelty rating and the relevancy rating.
Dependent claims: 20-24.
Specification