Structure learning in convolutional neural networks

US 10,255,529 B2
Filed: 03/13/2017
Issued: 04/09/2019
Est. Priority Date: 03/11/2016
Status: Active Grant

Abstract

The present disclosure provides an improved approach to implement structure learning of neural networks by exploiting correlations in the data or problem that the networks aim to solve. A greedy approach is described that finds bottlenecks of information gain from the bottom convolutional layers all the way to the fully connected layers. Rather than simply making the architecture deeper, additional computation and capacity are added only where they are required.
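
The greedy bottom-to-top pass described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the function `grow_network`, the per-layer `layer_score` callback (standing in for whatever performance measure is read at each layer), the `min_gain` threshold, and the choice to place the specialist directly before the stalled layer. A real implementation would presumably retrain and re-evaluate the network after each insertion, which this sketch omits.

```python
from typing import Callable, List


def grow_network(
    layers: List[str],
    layer_score: Callable[[str], float],
    min_gain: float = 0.01,
) -> List[str]:
    """Walk the layers bottom to top and insert a specialist layer wherever
    the score gain over the layer below falls under min_gain."""
    if not layers:
        return []
    grown = [layers[0]]                      # the bottom layer is kept as is
    for below, layer in zip(layers, layers[1:]):
        gain = layer_score(layer) - layer_score(below)
        if gain < min_gain:                  # bottleneck: this layer adds little
            grown.append(f"specialist_for_{layer}")  # hypothetical naming
        grown.append(layer)
    return grown                             # the pass ends at the top layer


# Toy per-layer scores (made-up numbers, for illustration only).
toy_scores = {"conv1": 0.30, "conv2": 0.45, "conv3": 0.455, "fc1": 0.60, "fc2": 0.602}
grown = grow_network(list(toy_scores), toy_scores.__getitem__)
print(grown)
```

With these toy scores, `conv3` and `fc2` each add less than `min_gain` over the layer below, so a specialist is inserted at each of them and the other layers are left alone.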

33 Claims

1. A method implemented with a processor, comprising:
- creating a neural network;
  
  generating output from the neural network;
  
  identifying a low performing layer from the neural network, the low performing layer having a relatively lower performance than a performance of another layer in the neural network;
  
  inserting a new specialist layer at the low performing layer; and
  
  repeating the act of identifying and the act of inserting until a top of the neural network is reached.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The method of claim 1, further comprising updating a model for the neural network to obtain an updated model, wherein the updated model comprises the new specialist layer, and at least one generalist layer.
  - 3. The method of claim 1, wherein the new specialist layer is configured to handle a specific subdomain of data distinct from a subdomain handled by another specialist layer.
  - 4. The method of claim 1, wherein a plurality of loss layers are added to the neural network.
  - 5. The method of claim 4, further comprising generating predictions at one of the loss layers, and converting the predictions to one or more confusion matrices forming a tensor T.
  - 6. The method of claim 5, wherein a structure of T is analyzed to modify and augment an existing structure of the neural network both in terms of depth and breadth.
  - 7. The method of claim 1, wherein the neural network undergoes both vertical splitting and horizontal splitting.
  - 8. The method of claim 7, wherein K-way Bifurcation is performed to implement the horizontal splitting.
  - 9. The method of claim 1, wherein each layer of the neural network is addressed independently, and a given layer of the neural network undergoes splitting by performing a greedy choice to split the given layer which provides a best improvement on a training loss.
  - 10. The method of claim 1, wherein an all-or-nothing highway network is employed to identify layers in the neural network to be removed.
  - 11. The method of claim 1, wherein the neural network is employed to classify images captured for a virtual reality or augmented reality system.
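
Claims 4 and 5 recite predictions generated at loss layers and converted into confusion matrices forming a tensor T. The sketch below shows one plausible reading, in which one confusion matrix is built per loss layer and the matrices are stacked. The stacking order, the function names `confusion_matrix` and `confusion_tensor`, and the toy labels are assumptions for illustration, not details taken from the claims.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def confusion_tensor(
    y_true: Sequence[int],
    preds_per_loss_layer: Sequence[Sequence[int]],
    num_classes: int,
) -> List[List[List[int]]]:
    """Stack one confusion matrix per loss layer into a tensor T
    of shape (num_loss_layers, num_classes, num_classes)."""
    return [confusion_matrix(y_true, p, num_classes) for p in preds_per_loss_layer]


# Toy data: three classes, predictions read at a shallow and a deep loss layer.
y_true = [0, 0, 1, 1, 2, 2]
shallow = [0, 1, 1, 0, 2, 2]   # confuses classes 0 and 1
deep = [0, 0, 1, 1, 2, 1]      # confuses class 2 with class 1 once
T = confusion_tensor(y_true, [shallow, deep], num_classes=3)
for layer_matrix in T:
    print(layer_matrix)
```

Off-diagonal mass in a slice of T marks classes that the corresponding depth still confuses, which is the kind of structure claim 6 says is analyzed to modify the network in depth and breadth.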

12. A system, comprising:
- a processor;
  
  a memory for holding programmable code; and
  
  wherein the programmable code includes instructions for creating a neural network;
  
  generating output from the neural network;
  
  identifying a low performing layer from the neural network, the low performing layer having a relatively lower performance than a performance of another layer in the neural network;
  
  inserting a new specialist layer at the low performing layer; and
  
  repeating the act of identifying and the act of inserting until a top of the neural network is reached.
- Dependent claims (13, 14, 15, 16, 17, 18, 19, 20, 21, 22)
  - 13. The system of claim 12, wherein the processor is configured to update a model for the neural network to obtain an updated model, wherein the updated model comprises the new specialist layer and at least one generalist layer.
  - 14. The system of claim 12, wherein the new specialist layer is configured to handle a specific subdomain of data distinct from a subdomain handled by another specialist layer.
  - 15. The system of claim 12, wherein the neural network comprises a plurality of loss layers.
  - 16. The system of claim 15, wherein the processor is configured to generate predictions at one of the loss layers, and to convert the predictions to one or more confusion matrices forming a tensor T.
  - 17. The system of claim 16, wherein a structure of T is analyzed to modify and augment an existing structure of the neural network both in terms of depth and breadth.
  - 18. The system of claim 12, wherein the neural network is configured to undergo both vertical splitting and horizontal splitting.
  - 19. The system of claim 18, wherein the horizontal splitting is implemented using K-way Bifurcation.
  - 20. The system of claim 12, wherein the processor is configured to address each layer of the neural network independently, and to cause a given layer of the neural network to undergo splitting by performing a greedy choice to split the given layer which provides a best improvement on a training loss.
  - 21. The system of claim 12, wherein the processor is configured to employ an all-or-nothing highway network to identify layers in the neural network to be removed.
  - 22. The system of claim 12, wherein the neural network is configured to classify images captured for a virtual reality or augmented reality system.

23. A computer program product embodied on a non-transitory computer readable medium, the non-transitory computer readable medium having stored thereon a sequence of instructions which, when executed by a processor causes the processor to execute a method comprising:
- creating a neural network;
  
  generating output from the neural network;
  
  identifying a low performing layer from the neural network, the low performing layer having a relatively lower performance than a performance of another layer in the neural network;
  
  inserting a new specialist layer at the low performing layer; and
  
  repeating the act of identifying and the act of inserting until a top of the neural network is reached.
- Dependent claims (24, 25, 26, 27, 28, 29, 30, 31, 32, 33)
  - 24. The computer program product of claim 23, wherein the method further comprises updating a model for the neural network to obtain an updated model, wherein the updated model comprises the new specialist layer and at least one generalist layer.
  - 25. The computer program product of claim 23, wherein in the method, the new specialist layer is configured to handle a specific subdomain of data distinct from a subdomain handled by another specialist layer.
  - 26. The computer program product of claim 23, wherein in the method, a plurality of loss layers are added to the neural network.
  - 27. The computer program product of claim 26, wherein the method further comprises generating predictions at one of the loss layers, and converting the predictions to one or more confusion matrices forming a tensor T.
  - 28. The computer program product of claim 27, wherein a structure of T is analyzed in the method to modify and augment an existing structure of the neural network both in terms of depth and breadth.
  - 29. The computer program product of claim 23, wherein in the method, the neural network undergoes both vertical splitting and horizontal splitting.
  - 30. The computer program product of claim 29, wherein K-way Bifurcation is performed in the method to implement the horizontal splitting.
  - 31. The computer program product of claim 23, wherein each layer of the neural network is addressed independently in the method, and a given layer of the neural network undergoes splitting by performing a greedy choice to split the given layer which provides a best improvement on a training loss.
  - 32. The computer program product of claim 23, wherein an all-or-nothing highway network is employed in the method to identify layers in the neural network to be removed.
  - 33. The computer program product of claim 23, wherein the neural network is employed in the method to classify images captured for a virtual reality or augmented reality system.
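
Claims 9, 20, and 31 recite a greedy choice to split the layer that gives the best improvement on a training loss. A minimal sketch of that selection step follows. The function `pick_layer_to_split`, the `loss_if_split` callback (standing in for actually splitting a layer, training, and measuring the loss), and the toy loss values are all hypothetical and are not taken from the patent.

```python
from typing import Callable, Optional, Sequence


def pick_layer_to_split(
    layers: Sequence[str],
    baseline_loss: float,
    loss_if_split: Callable[[str], float],
) -> Optional[str]:
    """Return the layer whose split lowers the training loss the most,
    or None if no candidate split improves on the baseline."""
    best_layer: Optional[str] = None
    best_improvement = 0.0
    for layer in layers:                     # each layer is considered independently
        improvement = baseline_loss - loss_if_split(layer)
        if improvement > best_improvement:
            best_layer, best_improvement = layer, improvement
    return best_layer


# Toy training losses after splitting each candidate layer (made-up numbers).
toy_losses = {"conv1": 0.92, "conv2": 0.80, "fc1": 0.88}
choice = pick_layer_to_split(list(toy_losses), 0.90, toy_losses.__getitem__)
print(choice)
```

Returning `None` when nothing beats the baseline is a design choice of this sketch: it gives the caller a natural stopping signal, whereas the claims themselves do not state what happens when no split helps.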

Current Assignee: Magic Leap, Inc.
Original Assignee: Magic Leap, Inc.
Inventors: Rabinovich, Andrew; Badrinarayanan, Vijay; DeTone, Daniel; Rajendran, Srivignesh; Lee, Douglas Bertram; Malisiewicz, Tomasz
Primary Examiner(s): Akhavannik, Hadi
Application Number: US15/457,990
Publication Number: US 20170262737A1
Time in Patent Office: 757 Days
Field of Search: None
CPC Class Codes

G06F 18/24   Classification techniques

G06F 18/24137   Distances to cluster centroïds

G06N 3/045   Combinations of networks

G06N 3/082   modifying the architecture,...

G06V 10/454   Integrating the filters int...

G06V 30/19173   Classification techniques

G06V 30/194   References adjustable by an...
