Learning method and learning device for improving segmentation performance to be used for detecting road user events using double embedding configuration in multi-camera system and testing method and testing device using the same

US 10,551,846 B1
Filed: 01/25/2019
Issued: 02/04/2020
Est. Priority Date: 01/25/2019
Status: Active Grant

Abstract

A learning method is provided for improving segmentation performance to be used for detecting road user events, including pedestrian events and vehicle events, using a double embedding configuration in a multi-camera system. The learning method includes steps of: a learning device instructing a similarity convolutional layer to generate a similarity embedding feature by applying similarity convolution operations to a feature outputted from a neural network; instructing a similarity loss layer to output a similarity loss by referring to a similarity between two points sampled from the similarity embedding feature and its corresponding GT label image; instructing a distance convolutional layer to generate a distance embedding feature by applying distance convolution operations to the similarity embedding feature; instructing a distance loss layer to output a distance loss for increasing inter-class differences among mean values of instance classes and decreasing intra-class variance values of the instance classes; and backpropagating at least one of the similarity loss and the distance loss.
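
The data flow described in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each embedding layer is a 1x1 convolution (the same linear map applied to every pixel); the names `conv1x1` and `double_embedding`, the kernel sizes, and the toy weights are hypothetical and are not taken from the patent.

```python
# Sketch only: each "convolutional layer" is modelled as a 1x1 convolution,
# i.e. one linear map applied independently to every pixel's feature vector.

def conv1x1(feature_map, weights):
    """feature_map: H x W x C_in nested lists; weights: C_out x C_in nested lists."""
    return [
        [
            [sum(w * x for w, x in zip(kernel, pixel)) for kernel in weights]
            for pixel in row
        ]
        for row in feature_map
    ]


def double_embedding(network_output, similarity_weights, distance_weights):
    """Chain the two embeddings: network output -> similarity -> distance."""
    similarity_feature = conv1x1(network_output, similarity_weights)
    distance_feature = conv1x1(similarity_feature, distance_weights)
    return similarity_feature, distance_feature


# Toy input: a 1 x 2 feature map with 2 channels (values are made up).
network_output = [[[1.0, 2.0], [0.0, 1.0]]]
similarity_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 2 -> 3 channels
distance_weights = [[1.0, -1.0, 0.5]]                      # 3 -> 1 channel

sim_feature, dist_feature = double_embedding(
    network_output, similarity_weights, distance_weights
)
```

The point of the sketch is only the ordering: the distance embedding is computed from the similarity embedding, not directly from the network output.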

30 Claims

1. A learning method for instance segmentation, comprising steps of:
- (a) a learning device performing a process of acquiring at least one network output feature from a neural network capable of detecting one or more objects in at least one training image;
  
  (b) the learning device performing a process of instructing at least one similarity convolutional layer to apply one or more similarity convolution operations to the network output feature, to thereby generate at least one similarity embedding feature, wherein the similarity convolution operations are adopted to output one or more embedding vectors corresponding to at least part of pixels of the network output feature;
  
  (c) the learning device performing a similarity embedding process of instructing at least one similarity loss layer to output at least one similarity between two points sampled from the similarity embedding feature and to output at least one similarity loss by referring to the similarity and its corresponding at least one ground truth (GT) label image;
  
  (d) the learning device performing a process of instructing at least one distance convolutional layer to apply one or more distance convolution operations to the similarity embedding feature, to thereby generate at least one distance embedding feature, wherein the distance convolution operations are adopted to transform the similarity embedding feature into at least one feature space;
  
  (e) the learning device performing a distance embedding process of instructing at least one distance loss layer to calculate each of mean values and each of variance values of each of one or more instance classes by using the distance embedding feature, to thereby output at least one distance loss to be used for increasing each of inter-class differences among each of the mean values of the instance classes and decreasing each of intra-class variance values of each of the instance classes; and
  
  (f) the learning device performing a process of learning one or more parameters of at least one of the distance convolutional layer, the similarity convolutional layer, and the neural network by backpropagating at least one of the similarity loss and the distance loss.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
- - 2. The learning method of claim 1, further comprising a step of:
    - (g) the learning device performing a process of instructing at least one sampling layer and at least one detecting layer to recognize the objects individually by sampling the distance embedding feature and by finding locations of the objects through regression, to thereby generate at least one instance segmentation.
  - 3. The learning method of claim 2, further comprising a step of:
    - (h) the learning device performing a process of instructing at least one segmentation loss layer to output at least one segmentation loss by referring to the instance segmentation and its corresponding at least one GT label image, to thereby learn the parameters of at least one of the distance convolutional layer, the similarity convolutional layer, and the neural network by backpropagating the segmentation loss.
  - 4. The learning method of claim 1, wherein the GT label image is one corresponding to the instance segmentation.
  - 5. The learning method of claim 1, wherein a range of change in the parameters of the similarity convolutional layer is determined as higher than that in the parameters of the neural network, and wherein a range of change in the parameters of the distance convolutional layer is determined as higher than that in the parameters of the neural network or that in the parameters of the similarity convolutional layer.
  - 6. The learning method of claim 1, wherein the similarity is represented as a following equation
  - 7. The learning method of claim 1, wherein the objects represent one or more lanes.
  - 8. The learning method of claim 7, wherein the distance loss is a clustering loss represented as a following equation
  - 9. The learning method of claim 8, wherein the thresh is set to be 1.
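
Step (c) of claim 1 can be sketched as follows. The equation referenced in claim 6 is not reproduced in this listing, so the similarity form used here, 2 / (1 + exp(distance)), and the cross-entropy loss against a same-instance label are assumptions made for illustration only; the function names are hypothetical.

```python
import math


def similarity(f_p, f_q):
    """Assumed similarity between two embedding vectors.

    Equals 1.0 for identical vectors and decays towards 0 as they move apart.
    """
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(f_p, f_q)))
    return 2.0 / (1.0 + math.exp(distance))


def similarity_loss(pairs):
    """Mean cross-entropy over sampled pairs.

    pairs: list of (f_p, f_q, same_instance), where same_instance is 1 when the
    GT label image puts both points in the same instance and 0 otherwise.
    """
    eps = 1e-7  # keeps log() away from 0
    total = 0.0
    for f_p, f_q, same_instance in pairs:
        s = min(max(similarity(f_p, f_q), eps), 1.0 - eps)
        total -= same_instance * math.log(s) + (1 - same_instance) * math.log(1.0 - s)
    return total / len(pairs)


# Embeddings that agree with the labels versus the same embeddings with flipped labels.
consistent = [([0.0, 0.0], [0.0, 0.0], 1), ([0.0, 0.0], [5.0, 0.0], 0)]
inconsistent = [([0.0, 0.0], [0.0, 0.0], 0), ([0.0, 0.0], [5.0, 0.0], 1)]
```

Under these assumptions the loss is small when same-instance points are embedded close together and different-instance points far apart, and large in the opposite case.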

10. A testing method for instance segmentation, comprising steps of:
- (a) on condition that a learning device (i) has performed a process of instructing at least one similarity convolutional layer to apply one or more similarity convolution operations to at least one network output feature for training acquired from a neural network capable of detecting one or more objects for training in at least one training image, to thereby generate at least one similarity embedding feature for training, wherein the similarity convolution operations are adopted to output one or more embedding vectors for training corresponding to at least part of pixels of the network output feature for training, (ii) has performed a similarity embedding process of instructing at least one similarity loss layer to output at least one similarity between two points sampled from the similarity embedding feature for training and to output at least one similarity loss by referring to the similarity and its corresponding at least one GT label image, (iii) has performed a process of instructing at least one distance convolutional layer to apply one or more distance convolution operations to the similarity embedding feature for training, to thereby generate at least one distance embedding feature for training, wherein the distance convolution operations are adopted to transform the similarity embedding feature for training into at least one feature space for training, (iv) has performed a distance embedding process of instructing at least one distance loss layer to calculate each of mean values and each of variance values of each of one or more instance classes by using the distance embedding feature for training, to thereby output at least one distance loss to be used for increasing each of inter-class differences among each of the mean values of the instance classes and decreasing each of intra-class variance values of each of the instance classes; and
  
  (v) has performed a process of learning one or more parameters of at least one of the distance convolutional layer, the similarity convolutional layer, and the neural network by backpropagating at least one of the similarity loss and the distance loss, a testing device acquiring at least one network output feature for testing from the neural network capable of detecting one or more objects for testing in at least one test image;
  
  (b) the testing device performing a process of instructing the similarity convolutional layer to apply the similarity convolution operations to the network output feature for testing, to thereby generate at least one similarity embedding feature for testing, wherein the similarity convolution operations are adopted to output one or more embedding vectors for testing corresponding to at least part of pixels of the network output feature for testing;
  
  (c) the testing device performing a process of instructing the distance convolutional layer to apply the distance convolution operations to the similarity embedding feature for testing, to thereby generate at least one distance embedding feature for testing, wherein the distance convolution operations are adopted to transform the similarity embedding feature for testing into at least one feature space for testing; and
  
  (d) the testing device performing a process of instructing at least one sampling layer and at least one detecting layer to recognize one or more objects for testing individually by sampling the distance embedding feature for testing and by finding locations of the objects for testing through regression, to thereby generate at least one instance segmentation for testing.
- View Dependent Claims (11, 12, 13, 14, 15)
- - 11. The testing method of claim 10, wherein, at the step of (a), the learning device further has performed processes of (vi) instructing the sampling layer and the detecting layer to recognize the objects for training individually by sampling the distance embedding feature for training and by finding locations of the objects for training through the regression, to thereby generate at least one instance segmentation for training and (vii) instructing at least one segmentation loss layer to output at least one segmentation loss by referring to the instance segmentation for training and its corresponding at least one GT label image, to thereby learn the parameters of at least one of the distance convolutional layer, the similarity convolutional layer, and the neural network by backpropagating the segmentation loss.
  - 12. The testing method of claim 10, wherein the GT label image is one corresponding to the instance segmentation for training.
  - 13. The testing method of claim 10, wherein a range of change in the parameters of the similarity convolutional layer is determined as higher than that in the parameters of the neural network, and wherein a range of change in the parameters of the distance convolutional layer is determined as higher than that in the parameters of the neural network or that in the parameters of the similarity convolutional layer.
  - 14. The testing method of claim 10, wherein the similarity is represented as a following equation
  - 15. The testing method of claim 10, wherein the objects for training represent one or more lanes, and wherein the distance loss is a clustering loss represented as a following equation
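
Step (d) of claim 10 recognizes objects individually from the distance embedding feature. The claims do not spell out how the sampling layer and detecting layer do this, so the sketch below swaps in a simple greedy grouping by embedding distance as a stand-in; it is not the claimed sampling and regression procedure. The function name `group_pixels` and the `radius` parameter are hypothetical.

```python
import math


def group_pixels(embedded_pixels, radius=0.5):
    """Greedily group pixels whose embeddings lie within `radius` of a group center.

    embedded_pixels: list of ((row, col), embedding_vector).
    Returns a list of groups, each a list of (row, col) coordinates.
    """
    centers = []  # running mean embedding of each group
    members = []  # pixel coordinates of each group
    for coord, vec in embedded_pixels:
        best_index, best_distance = None, radius
        for i, center in enumerate(centers):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, center)))
            if d <= best_distance:
                best_index, best_distance = i, d
        if best_index is None:
            # No existing group is close enough: start a new instance.
            centers.append(list(vec))
            members.append([coord])
        else:
            # Join the nearest group and update its running mean.
            n = len(members[best_index])
            centers[best_index] = [
                (c * n + v) / (n + 1) for c, v in zip(centers[best_index], vec)
            ]
            members[best_index].append(coord)
    return members


# Toy distance embeddings (one channel) for four sampled pixels.
pixels = [((0, 0), [0.0]), ((0, 1), [0.1]), ((1, 0), [2.0]), ((1, 1), [2.1])]
instances = group_pixels(pixels)
```

A grouping like this only works well if training has already pulled same-instance embeddings together and pushed different instances apart, which is what the distance loss is described as doing.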

16. A learning device for instance segmentation, comprising:
- at least one memory that stores instructions; and
  
  at least one processor configured to execute the instructions to:
  
  (I) perform a process of instructing at least one similarity convolutional layer to apply one or more similarity convolution operations to at least one network output feature acquired from a neural network capable of detecting one or more objects in at least one training image, to thereby generate at least one similarity embedding feature, wherein the similarity convolution operations are adopted to output one or more embedding vectors corresponding to at least part of pixels of the network output feature;
  
  (II) perform a similarity embedding process of instructing at least one similarity loss layer to output at least one similarity between two points sampled from the similarity embedding feature and to output at least one similarity loss by referring to the similarity and its corresponding at least one GT label image;
  
  (III) perform a process of instructing at least one distance convolutional layer to apply one or more distance convolution operations to the similarity embedding feature, to thereby generate at least one distance embedding feature, wherein the distance convolution operations are adopted to transform the similarity embedding feature into at least one feature space;
  
  (IV) perform a distance embedding process of instructing at least one distance loss layer to calculate each of mean values and each of variance values of each of one or more instance classes by using the distance embedding feature, to thereby output at least one distance loss to be used for increasing each of inter-class differences among each of the mean values of the instance classes and decreasing each of intra-class variance values of each of the instance classes; and
  
  (V) perform a process of learning one or more parameters of at least one of the distance convolutional layer, the similarity convolutional layer, and the neural network by backpropagating at least one of the similarity loss and the distance loss.
- View Dependent Claims (17, 18, 19, 20, 21, 22, 23, 24)
- - 17. The learning device of claim 16, wherein the processor further performs a process of:
    - (VI) instructing at least one sampling layer and at least one detecting layer to recognize the objects individually by sampling the distance embedding feature and by finding locations of the objects through regression, to thereby generate at least one instance segmentation.
  - 18. The learning device of claim 17, wherein the processor further performs a process of:
    - (VII) instructing at least one segmentation loss layer to output at least one segmentation loss by referring to the instance segmentation and its corresponding at least one GT label image, to thereby learn the parameters of at least one of the distance convolutional layer, the similarity convolutional layer, and the neural network by backpropagating the segmentation loss.
  - 19. The learning device of claim 16, wherein the GT label image is one corresponding to the instance segmentation.
  - 20. The learning device of claim 16, wherein a range of change in the parameters of the similarity convolutional layer is determined as higher than that in the parameters of the neural network, and wherein a range of change in the parameters of the distance convolutional layer is determined as higher than that in the parameters of the neural network or that in the parameters of the similarity convolutional layer.
  - 21. The learning device of claim 16, wherein the similarity is represented as a following equation
  - 22. The learning device of claim 16, wherein the objects represent one or more lanes.
  - 23. The learning device of claim 22, wherein the distance loss is a clustering loss represented as a following equation
  - 24. The learning device of claim 23, wherein the thresh is set to be 1.
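
Claims 22 to 24 describe a clustering loss over lanes with a threshold set to 1, but the equation itself is not reproduced in this listing. The sketch below therefore assumes one plausible form: a hinge term that penalizes pairs of lanes whose mean embeddings are closer than the threshold, plus the average per-lane variance. The function names and the use of scalar embeddings are illustrative assumptions.

```python
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def clustering_loss(lanes, thresh=1.0):
    """Assumed clustering loss over scalar distance-embedding values.

    lanes: dict mapping a lane id to the embedding values of its pixels.
    The first term pushes lane means apart until they differ by at least
    `thresh`; the second term shrinks the variance within each lane.
    """
    means = {lane: mean(values) for lane, values in lanes.items()}
    pairs = list(combinations(means, 2))
    inter = sum(max(0.0, thresh ** 2 - (means[a] - means[b]) ** 2) for a, b in pairs)
    inter = inter / len(pairs) if pairs else 0.0
    intra = sum(variance(values) for values in lanes.values()) / len(lanes)
    return inter + intra


# Well-separated, tight lanes versus overlapping, spread-out lanes (toy values).
separated = {"lane_a": [0.0, 0.0], "lane_b": [2.0, 2.0]}
overlapping = {"lane_a": [0.0, 1.0], "lane_b": [0.5, 1.5]}
```

With these toy values the separated lanes incur no penalty, while the overlapping lanes are penalized by both terms.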

25. A testing device for instance segmentation, comprising:
- at least one memory that stores instructions; and
  
  at least one processor, on condition that a learning device (i) has performed a process of instructing at least one similarity convolutional layer to apply one or more similarity convolution operations to at least one network output feature for training acquired from the neural network capable of detecting one or more objects for training in at least one training image, to thereby generate at least one similarity embedding feature for training, wherein the similarity convolution operations are adopted to output one or more embedding vectors for training corresponding to at least part of pixels of the network output feature for training, (ii) has performed a similarity embedding process of instructing at least one similarity loss layer to output at least one similarity between two points sampled from the similarity embedding feature for training and to output at least one similarity loss by referring to the similarity and its corresponding at least one GT label image, (iii) has performed a process of instructing at least one distance convolutional layer to apply one or more distance convolution operations to the similarity embedding feature for training, to thereby generate at least one distance embedding feature for training, wherein the distance convolution operations are adopted to transform the similarity embedding feature for training into at least one feature space for training, (iv) has performed a distance embedding process of instructing at least one distance loss layer to calculate each of mean values and each of variance values of each of one or more instance classes by using the distance embedding feature for training, to thereby output at least one distance loss to be used for increasing each of inter-class differences among each of the mean values of the instance classes and decreasing each of intra-class variance values of each of the instance classes; and
  
  (v) has performed a process of learning one or more parameters of at least one of the distance convolutional layer, the similarity convolutional layer, and the neural network by backpropagating at least one of the similarity loss and the distance loss;
  
  configured to execute the instructions to:
  
  (I) perform a process of instructing the similarity convolutional layer to apply the similarity convolution operations to at least one network output feature for testing acquired from the neural network capable of detecting one or more objects for testing in at least one test image, to thereby generate at least one similarity embedding feature for testing, wherein the similarity convolution operations are adopted to output one or more embedding vectors for testing corresponding to at least part of pixels of the network output feature for testing;
  
  (II) perform a process of instructing the distance convolutional layer to apply the distance convolution operations to the similarity embedding feature for testing, to thereby generate at least one distance embedding feature for testing, wherein the distance convolution operations are adopted to transform the similarity embedding feature for testing into at least one feature space for testing; and
  
  (III) perform a process of instructing at least one sampling layer and at least one detecting layer to recognize the objects for testing individually by sampling the distance embedding feature for testing and by finding locations of the objects for testing through regression, to thereby generate at least one instance segmentation for testing.
- View Dependent Claims (26, 27, 28, 29, 30)
- - 26. The testing device of claim 25, wherein, the learning device further has performed processes of (vi) instructing the sampling layer and the detecting layer to recognize the objects for training individually by sampling the distance embedding feature for training and by finding locations of the objects for training through the regression, to thereby generate at least one instance segmentation for training and (vii) instructing at least one segmentation loss layer to output at least one segmentation loss by referring to the instance segmentation for training and its corresponding at least one GT label image, to thereby learn the parameters of at least one of the distance convolutional layer, the similarity convolutional layer, and the neural network by backpropagating the segmentation loss.
  - 27. The testing device of claim 25, wherein the GT label image is one corresponding to the instance segmentation for training.
  - 28. The testing device of claim 25, wherein a range of change in the parameters of the similarity convolutional layer is determined as higher than that in the parameters of the neural network, and wherein a range of change in the parameters of the distance convolutional layer is determined as higher than that in the parameters of the neural network or that in the parameters of the similarity convolutional layer.
  - 29. The testing device of claim 25, wherein the similarity is represented as a following equation
  - 30. The testing device of claim 25, wherein the objects for training represent one or more lanes, and wherein the distance loss is a clustering loss represented as a following equation
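
Claim 28 (like claims 5, 13 and 20) orders the "range of change" of the parameters: smallest for the neural network, larger for the similarity convolutional layer, largest for the distance convolutional layer. One possible reading, assumed here and not stated in the claims, is a separate learning rate per parameter group in a plain gradient-descent update. The group names and rate values below are hypothetical.

```python
def sgd_step(params, grads, learning_rates):
    """One gradient-descent update with a separate learning rate per group.

    params, grads: dict mapping a group name to a list of floats.
    learning_rates: dict mapping a group name to its step size.
    """
    return {
        group: [p - learning_rates[group] * g for p, g in zip(values, grads[group])]
        for group, values in params.items()
    }


# Assumed ordering: backbone changes least, distance layer changes most.
learning_rates = {
    "neural_network": 0.001,
    "similarity_conv": 0.01,
    "distance_conv": 0.1,
}
params = {"neural_network": [1.0], "similarity_conv": [1.0], "distance_conv": [1.0]}
grads = {group: [1.0] for group in params}

updated = sgd_step(params, grads, learning_rates)
changes = {group: abs(updated[group][0] - params[group][0]) for group in params}
```

Given identical gradients, the size of each update then follows the ordering described in the claim.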

Current Assignee
Stradvision, Inc.
Original Assignee
Stradvision, Inc.
Inventors
Kim, Kye-Hyeon; Kim, Yongjoong; Kim, Insu; Kim, Hak-Kyoung; Nam, Woonhyun; Boo, SukHoon; Sung, Myungchul; Yeo, Donghun; Ryu, Wooju; Jang, Taewoong; Jeong, Kyungjoong; Je, Hongmo; Cho, Hojin
Primary Examiner(s)
Akhavannik, Hadi

Application Number

US16/257,993
Time in Patent Office

375 Days
Field of Search

None
US Class Current
CPC Class Codes

G05D 1/0221   involving a learning process

G05D 1/0246   using a video camera in com...

G06F 18/214   Generating training pattern...

G06N 20/00   Machine learning

G06N 3/045   Combinations of networks

G06N 3/08   Learning methods

G06N 3/084   Backpropagation, e.g. using...

G06N 5/046   Forward inferencing; Produc...

G06T 2207/20084   Artificial neural networks ...

G06T 2207/30261   Obstacle

G06T 7/11   Region-based segmentation

G06T 7/194   involving foreground-backgr...

G06T 7/70   Determining position or ori...

G06V 10/764   using classification, e.g. ...

G06V 10/82   using neural networks

G06V 20/58   Recognition of moving objec...

G06V 20/582   of traffic signs
