Learning method, learning device using regression loss and testing method, testing device using the same

US 10,311,321 B1
Filed: 10/26/2018
Issued: 06/04/2019
Est. Priority Date: 10/26/2018
Status: Active Grant

Abstract

A method for learning parameters of a CNN based on regression losses is provided. The method includes steps of: a learning device instructing first to n-th convolutional layers to generate first to n-th encoded feature maps; instructing n-th to first deconvolutional layers to generate n-th to first decoded feature maps from the n-th encoded feature map; generating an obstacle segmentation result by referring to a feature of the decoded feature maps; generating the regression losses by referring to differences of distances between each location of specific rows, where bottom lines of nearest obstacles are estimated as being located per each of the columns of a specific decoded feature map, and each location of exact rows, where the bottom lines are truly located per each of the columns on a ground truth (GT); and backpropagating the regression losses, to thereby learn the parameters.
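The regression loss described above can be illustrated with a minimal Python sketch. It assumes the loss is the mean absolute difference between the estimated row index and the GT row index in each column; the text only speaks of "differences of distances", so that exact form, the function name `regression_loss`, and the sample values are all hypothetical.

```python
def regression_loss(pred_rows, gt_rows):
    """Mean absolute gap, in rows, between estimated and GT bottom lines.

    pred_rows[c] is the row where the bottom line of the nearest obstacle is
    estimated in column c; gt_rows[c] is the row where it truly lies.
    Averaging absolute differences is an assumption made for illustration.
    """
    if len(pred_rows) != len(gt_rows) or not gt_rows:
        raise ValueError("both lists need one row index per column")
    gaps = [abs(p - g) for p, g in zip(pred_rows, gt_rows)]
    return sum(gaps) / len(gaps)


# Hypothetical grid with 5 columns.
pred = [3, 3, 4, 6, 6]
gt = [3, 4, 4, 5, 6]
loss = regression_loss(pred, gt)  # two columns are off by one row each
```

Unlike a pure classification loss, this value grows with how far the estimated row lies from the true row, so a near miss costs less than a distant one.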

28 Claims

1. A method for learning one or more parameters of a CNN based on one or more regression losses, comprising steps of:
- (a) a learning device instructing a first convolutional layer to an n-th convolutional layer to respectively and sequentially generate a first encoded feature map to an n-th encoded feature map from at least one input image as a training image;
  
  (b) the learning device instructing an n-th deconvolutional layer to a first deconvolutional layer to sequentially generate an n-th decoded feature map to a first decoded feature map from the n-th encoded feature map;
  
  (c) the learning device, on condition that each cell of a grid with a plurality of rows and a plurality of columns is generated by dividing at least one specific decoded feature map, among the n-th decoded feature map to the first decoded feature map, with respect to a first direction and a second direction, wherein the first direction is in a direction of the rows of the specific decoded feature map and the second direction is in a direction of the columns thereof, generating at least one obstacle segmentation result representing each of specific rows, where each of bottom lines of each of nearest obstacles is estimated as being located per each of the columns, by referring to at least one feature of at least part of the n-th decoded feature map to the first decoded feature map;
  
  (d) the learning device generating the regression losses referring to each of respective differences of distances between (i) each location of exact rows where each of the bottom lines of each of the nearest obstacles is truly located per each of the columns on at least one GT, for each of the columns, and (ii) each location of the specific rows where each of the bottom lines of each of the nearest obstacles is estimated as being located per each of the columns on the obstacle segmentation result; and
  
  (e) the learning device backpropagating the regression losses, to thereby learn the parameters of the CNN.
  - 2. The method of claim 1, wherein, at the step of (c), the learning device calculates one or more softmax losses by referring to (i) each location and each probability of the specific rows having each of highest possibilities of each of the bottom lines of each of the nearest obstacles being present per each of the columns according to the obstacle segmentation result, and (ii) their corresponding GTs, and wherein, at the step of (e), one or more integrated losses are generated by applying each of respective weights to the softmax losses and the regression losses, and then the integrated losses are backpropagated.
  - 3. The method of claim 1, wherein, at the step of (c), the obstacle segmentation result is generated by a softmax operation which normalizes each value corresponding to each of the rows per each of the columns, and wherein, at the step of (d), the obstacle segmentation result is adjusted to reduce each of differences between (i) each of probability values of each of the specific rows per each of the columns and (ii) each of probability values of neighboring rows per each of the columns within a certain distance from each of the specific rows per each of the columns, by using one or more regression operations.
  - 4. The method of claim 1, wherein, at the step of (d), the regression losses are generated by referring to each of the respective differences of distances between (i) each location of the specific rows having each of highest scores per each of the columns on the obstacle segmentation result and (ii) each location of the exact rows having each of highest scores per each of the columns on the GT.
  - 5. The method of claim 1, wherein the step of (c) includes steps of:
    - (c1) the learning device, supposing that each cell of the grid has been generated by dividing the at least one decoded feature map with respect to the first direction by first intervals and with respect to the second direction by second intervals, concatenating each of features of each of the rows per each of the columns in a direction of a channel, to thereby generate at least one reshaped feature map; and
      
      (c2) the learning device generating the obstacle segmentation result which represents where each of the bottom lines of each of the nearest obstacles is estimated as being located among the rows for each of the columns by referring to the reshaped feature map via checking each of estimated positions of each of the bottom lines of each of the nearest obstacles among concatenated channels for each of the columns, wherein the obstacle segmentation result is generated by a softmax operation which normalizes each value corresponding to each channel per each of the columns.
  - 6. The method of claim 1, wherein each of the columns includes one or more pixels in the first direction, and each of the rows includes one or more pixels in the second direction.
  - 7. The method of claim 1, wherein the GT includes information representing on which row each of the bottom lines of each of the nearest obstacles is truly located among the rows, per each of the columns, resulting from dividing the input image into N_c rows, and wherein the obstacle segmentation result represents on which row each of the bottom lines of each of the nearest obstacles is estimated as being located among the rows, per each of the columns, resulting from dividing the input image into N_c rows.
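The reshaping and column-wise softmax recited in claim 5 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the feature map is a nested list indexed `fmap[channel][row][column]`, it has a single channel so that the reshaped map ends up with exactly one channel per row, and the channel ordering `c * rows + r` is chosen arbitrarily. All function names are hypothetical.

```python
import math


def reshape_rows_into_channels(fmap):
    """Stack the features of every row along the channel axis.

    fmap[c][r][col] becomes reshaped[c * rows + r][col]; the ordering is an
    assumption, the claim only says rows are concatenated per column.
    """
    channels, rows, cols = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][r][col] for col in range(cols)]
            for c in range(channels) for r in range(rows)]


def column_softmax(scores):
    """Normalise scores[k][col] over the channel index k, column by column."""
    n, cols = len(scores), len(scores[0])
    probs = [[0.0] * cols for _ in range(n)]
    for col in range(cols):
        column = [scores[k][col] for k in range(n)]
        peak = max(column)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in column]
        total = sum(exps)
        for k in range(n):
            probs[k][col] = exps[k] / total
    return probs


def estimated_rows(probs):
    """Pick, for each column, the channel (row) with the highest probability."""
    n, cols = len(probs), len(probs[0])
    return [max(range(n), key=lambda k: probs[k][col]) for col in range(cols)]


# Hypothetical 1-channel map with 3 rows and 2 columns.
fmap = [[[0.1, 2.0],
         [1.5, 0.3],
         [0.2, 0.1]]]
reshaped = reshape_rows_into_channels(fmap)
probs = column_softmax(reshaped)
rows = estimated_rows(probs)
```

With more than one input channel the reshaped map would have more channels than rows, so some further layer would be needed to bring it back to one score per row; that step is left out here.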

8. A testing method by using a CNN capable of detecting one or more nearest obstacles based on one or more regression losses, comprising steps of:
- (a) a testing device acquiring at least one test image as at least one input image, on condition that the learning device performed processes of (i) instructing a first convolutional layer to an n-th convolutional layer to respectively and sequentially generate a first encoded feature map for training to an n-th encoded feature map for training from at least one training image, (ii) instructing an n-th deconvolutional layer to a first deconvolutional layer to sequentially generate an n-th decoded feature map for training to a first decoded feature map for training from the n-th encoded feature map for training, (iii) assuming that each cell of a grid with a plurality of rows and a plurality of columns is generated by dividing at least one specific decoded feature map for training, among the n-th decoded feature map for training to the first decoded feature map for training, with respect to a first direction and a second direction, wherein the first direction is in a direction of the rows of the specific decoded feature map for training and the second direction is in a direction of the columns thereof, generating at least one obstacle segmentation result for training representing each of specific rows for training, where each of bottom lines of each of nearest obstacles for training is estimated as being located per each of the columns, by referring to at least one feature of at least part of the n-th decoded feature map for training to the first decoded feature map for training, (iv) generating the regression losses by referring to each of respective differences of distances between (iv-1) each location of exact rows where each of the bottom lines of each of the nearest obstacles for training is truly located per each of the columns on at least one GT, for each of the columns, and (iv-2) each location of the specific rows for training where each of the bottom lines of each of the nearest obstacles for training is estimated as being located per each of the columns on the obstacle segmentation result for training, and (v) backpropagating the regression losses, to thereby learn one or more parameters of the CNN;
  
  (b) the testing device instructing the first convolutional layer to the n-th convolutional layer to respectively and sequentially generate a first encoded feature map for testing to an n-th encoded feature map for testing from the test image;
  
  (c) the testing device instructing the n-th deconvolutional layer to the first deconvolutional layer to sequentially generate an n-th decoded feature map for testing to a first decoded feature map for testing from the n-th encoded feature map for testing; and
  
  (d) the testing device, assuming that each cell of a grid with a plurality of rows and a plurality of columns is generated by dividing at least one specific decoded feature map for testing, among the n-th decoded feature map for testing to the first decoded feature map for testing, with respect to the first direction and the second direction, wherein the first direction is in a direction of the rows of the specific decoded feature map for testing and the second direction is in a direction of the columns thereof, generating at least one obstacle segmentation result for testing representing each of specific rows for testing, where each of bottom lines of each of nearest obstacles for testing is estimated as being located per each of the columns, by referring to at least one feature of at least part of the n-th decoded feature map for testing to the first decoded feature map for testing.
  - 9. The method of claim 8, wherein, at the process of (iii), the learning device has calculated one or more softmax losses by referring to (i) each location and each probability of the specific rows for training having each of highest possibilities of each of the bottom lines of each of the nearest obstacles for training being present per each of the columns according to the obstacle segmentation result for training, and (ii) their corresponding GTs, and wherein, at the process of (v), one or more integrated losses have been generated by applying each of respective weights to the softmax losses and the regression losses, and then the integrated losses are backpropagated.
  - 10. The method of claim 8, wherein, at the process of (iii), the obstacle segmentation result for training has been generated by a softmax operation which normalizes each value corresponding to each of the rows per each of the columns, and wherein, at the process of (iv), the obstacle segmentation result for training has been adjusted to reduce each of differences between (i) each of probability values of each of the specific rows for training per each of the columns and (ii) each of probability values of neighboring rows per each of the columns within a certain distance from each of the specific rows for training per each of the columns, by using one or more regression operations.
  - 11. The method of claim 8, wherein, at the process of (iv), the regression losses have been generated by referring to each of the respective differences of distances between (i) each location of the specific rows for training having each of highest scores per each of the columns on the obstacle segmentation result for training and (ii) each location of the exact rows having each of highest scores per each of the columns on the GT.
  - 12. The method of claim 8, wherein the step of (d) includes steps of:
    - (d1) the testing device, supposing that each cell of the grid has been generated by dividing the at least one decoded feature map for testing with respect to the first direction by first intervals and with respect to the second direction by second intervals, concatenating each of features for testing of each of the rows per each of the columns in a direction of a channel, to thereby generate at least one reshaped feature map for testing; and
      
      (d2) the testing device generating the obstacle segmentation result for testing which represents where each of the bottom lines of each of the nearest obstacles for testing is estimated as being located among the rows for each of the columns by referring to the reshaped feature map for testing via checking each of estimated positions of each of the bottom lines of each of the nearest obstacles for testing among concatenated channels for each of the columns, wherein the obstacle segmentation result for testing is generated by a softmax operation which normalizes each value corresponding to each channel per each of the columns.
  - 13. The method of claim 8, wherein each of the columns includes one or more pixels in the first direction, and each of the rows includes one or more pixels in the second direction.
  - 14. The method of claim 8, wherein the GT includes information representing on which row each of the bottom lines of each of the nearest obstacles for training is truly located among the rows, per each of the columns, resulting from dividing the training image into N_c rows, and wherein the obstacle segmentation result for training represents on which row each of the bottom lines of each of the nearest obstacles for training is estimated as being located among the rows, per each of the columns, resulting from dividing the training image into N_c rows.
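At test time the segmentation result gives one row index per column, and claim 13 notes that each row may span one or more pixels. The sketch below shows how such an index could be mapped back to a pixel range. It assumes rows of equal height that divide the image height exactly; the function name and the 720-pixel, 72-row figures are hypothetical and do not come from the patent.

```python
def row_to_pixel_span(row_index, image_height, n_rows):
    """Return the (top, bottom) pixel rows covered by one grid row.

    Assumes the image is cut into n_rows bands of equal height, which is an
    illustrative simplification.
    """
    if image_height % n_rows != 0:
        raise ValueError("image_height must be a multiple of n_rows")
    if not 0 <= row_index < n_rows:
        raise ValueError("row_index is outside the grid")
    band = image_height // n_rows
    top = row_index * band
    return top, top + band - 1


# Hypothetical case: 720-pixel-high image divided into 72 rows.
span = row_to_pixel_span(10, 720, 72)
```

A coarser grid lowers the number of classes per column but widens each band, so the bottom line is located less precisely in pixel terms.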

15. A learning device for learning one or more parameters of a CNN based on one or more regression losses, comprising a processor configured to perform the processes of:
- acquiring at least one input image as a training image; and
  
  a processor for performing processes of (I) instructing a first convolutional layer to an n-th convolutional layer to respectively and sequentially generate a first encoded feature map to an n-th encoded feature map from the input, (II) instructing an n-th deconvolutional layer to a first deconvolutional layer to sequentially generate an n-th decoded feature map to a first decoded feature map from the n-th encoded feature map, (III) on condition that each cell of a grid with a plurality of rows and a plurality of columns is generated by dividing at least one specific decoded feature map, among the n-th decoded feature map to the first decoded feature map, with respect to a first direction and a second direction, wherein the first direction is in a direction of the rows of the specific decoded feature map and the second direction is in a direction of the columns thereof, generating at least one obstacle segmentation result representing each of specific rows, where each of bottom lines of each of nearest obstacles is estimated as being located per each of the columns, by referring to at least one feature of at least part of the n-th decoded feature map to the first decoded feature map, (IV) generating the regression losses by referring to each of respective differences of distances between (i) each location of exact rows where each of the bottom lines of each of the nearest obstacles is truly located per each of the columns on at least one ground truth (GT), for each of the columns, and (ii) each location of the specific rows where each of the bottom lines of each of the nearest obstacles is estimated as being located per each of the columns on the obstacle segmentation result, and (V) backpropagating the regression losses, to thereby learn the parameters of the CNN.
  - 16. The learning device of claim 15, wherein, at the process of (III), the processor calculates one or more softmax losses by referring to (i) each location and each probability of the specific rows having each of highest possibilities of each of the bottom lines of each of the nearest obstacles being present per each of the columns according to the obstacle segmentation result, and (ii) their corresponding GTs, and wherein, at the process of (V), one or more integrated losses are generated by applying each of respective weights to the softmax losses and the regression losses, and then the integrated losses are backpropagated.
  - 17. The learning device of claim 15, wherein, at the process of (III), the obstacle segmentation result is generated by a softmax operation which normalizes each value corresponding to each of the rows per each of the columns, and wherein, at the process of (IV), the obstacle segmentation result is adjusted to reduce each of differences between (i) each of probability values of each of the specific rows per each of the columns and (ii) each of probability values of neighboring rows per each of the columns within a certain distance from each of the specific rows per each of the columns, by using one or more regression operations.
  - 18. The learning device of claim 15, wherein, at the process of (IV), the regression losses are generated by referring to each of the respective differences of distances between (i) each location of the specific rows having each of highest scores per each of the columns on the obstacle segmentation result and (ii) each location of the exact rows having each of highest scores per each of the columns on the GT.
  - 19. The learning device of claim 15, wherein the process of (III) includes processes of:
    - (III-1) supposing that each cell of the grid has been generated by dividing the at least one decoded feature map with respect to the first direction by first intervals and with respect to the second direction by second intervals, concatenating each of features of each of the rows per each of the columns in a direction of a channel, to thereby generate at least one reshaped feature map; and
      
      (III-2) generating the obstacle segmentation result which represents where each of the bottom lines of each of the nearest obstacles is estimated as being located among the rows for each of the columns by referring to the reshaped feature map via checking each of estimated positions of each of the bottom lines of each of the nearest obstacles among concatenated channels for each of the columns, wherein the obstacle segmentation result is generated by a softmax operation which normalizes each value corresponding to each channel per each of the columns.
  - 20. The learning device of claim 15, wherein each of the columns includes one or more pixels in the first direction, and each of the rows includes one or more pixels in the second direction.
  - 21. The learning device of claim 15, wherein the GT includes information representing on which row each of the bottom lines of each of the nearest obstacles is truly located among the rows, per each of the columns, resulting from dividing the input image into N_c rows, and wherein the obstacle segmentation result represents on which row each of the bottom lines of each of the nearest obstacles is estimated as being located among the rows, per each of the columns, resulting from dividing the input image into N_c rows.
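Claim 16 combines the softmax losses and the regression losses into integrated losses by applying a weight to each. A minimal sketch of that combination follows. It assumes the softmax loss is a mean cross-entropy over columns and that the row-gap term is a mean absolute difference between the highest-probability row and the GT row; these forms, the weight values, and every name are hypothetical.

```python
import math


def softmax_loss(probs, gt_rows):
    """Mean cross-entropy over columns; probs[row][col] sums to 1 per column."""
    cols = len(gt_rows)
    return -sum(math.log(probs[gt_rows[c]][c]) for c in range(cols)) / cols


def row_gap_loss(probs, gt_rows):
    """Mean absolute gap between the highest-probability row and the GT row."""
    n, cols = len(probs), len(gt_rows)
    total = 0
    for c in range(cols):
        best = max(range(n), key=lambda r: probs[r][c])
        total += abs(best - gt_rows[c])
    return total / cols


def integrated_loss(probs, gt_rows, w_softmax=1.0, w_regression=1.0):
    """Weighted sum of the two terms; the weights are free hyperparameters."""
    return (w_softmax * softmax_loss(probs, gt_rows)
            + w_regression * row_gap_loss(probs, gt_rows))


# Hypothetical 3-row, 2-column result; the GT bottom line is row 1 in both columns.
probs = [[0.2, 0.6],
         [0.7, 0.3],
         [0.1, 0.1]]
gt = [1, 1]
loss = integrated_loss(probs, gt, w_softmax=1.0, w_regression=0.5)
```

Taking the highest-probability row is not differentiable, so a trainable version would need some differentiable stand-in for that step; the sketch only shows how the two terms are weighted and summed.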

22. A testing device by using a CNN capable of detecting one or more nearest obstacles based on one or more regression losses, comprising a learning device with a processor configured to perform the processes of:
- acquiring at least one test image as at least one input image, on condition that the learning device has performed processes of (i) instructing a first convolutional layer to an n-th convolutional layer to respectively and sequentially generate a first encoded feature map for training to an n-th encoded feature map for training from at least one training image, (ii) instructing an n-th deconvolutional layer to a first deconvolutional layer to sequentially generate an n-th decoded feature map for training to a first decoded feature map for training from the n-th encoded feature map for training, (iii) assuming that each cell of a grid with a plurality of rows and a plurality of columns is generated by dividing at least one specific decoded feature map for training, among the n-th decoded feature map for training to the first decoded feature map for training, with respect to a first direction and a second direction, wherein the first direction is in a direction of the rows of the specific decoded feature map for training and the second direction is in a direction of the columns thereof, generating at least one obstacle segmentation result for training representing each of specific rows for training, where each of bottom lines of each of nearest obstacles for training is estimated as being located per each of the columns, by referring to at least one feature of at least part of the n-th decoded feature map for training to the first decoded feature map for training, (iv) generating the regression losses by referring to each of respective differences of distances between (iv-1) each location of exact rows where each of the bottom lines of each of the nearest obstacles for training is truly located per each of the columns on at least one ground truth (GT), for each of the columns, and (iv-2) each location of the specific rows for training where each of the bottom lines of each of the nearest obstacles for training is estimated as being located per each of the columns on the obstacle segmentation result for training, and (v) backpropagating the regression losses, to thereby learn one or more parameters of the CNN; and
  
  (I) instructing the first convolutional layer to the n-th convolutional layer to respectively and sequentially generate a first encoded feature map for testing to an n-th encoded feature map for testing from the test image, (II) instructing the n-th deconvolutional layer to the first deconvolutional layer to sequentially generate an n-th decoded feature map for testing to a first decoded feature map for testing from the n-th encoded feature map for testing, and (III) assuming that each cell of a grid with a plurality of rows and a plurality of columns is generated by dividing at least one specific decoded feature map for testing, among the n-th decoded feature map for testing to the first decoded feature map for testing, with respect to the first direction and the second direction, wherein the first direction is in a direction of the rows of the specific decoded feature map for testing and the second direction is in a direction of the columns thereof, generating at least one obstacle segmentation result for testing representing each of specific rows for testing, where each of bottom lines of each of nearest obstacles for testing is estimated as being located per each of the columns, by referring to at least one feature of at least part of the n-th decoded feature map for testing to the first decoded feature map for testing.
  - 23. The testing device of claim 22, wherein, at the process of (iii), the learning device has calculated one or more softmax losses by referring to (i) each location and each probability of the specific rows for training having each of highest possibilities of each of the bottom lines of each of the nearest obstacles for training being present per each of the columns according to the obstacle segmentation result for training, and (ii) their corresponding GTs, and wherein, at the process of (v), one or more integrated losses have been generated by applying each of respective weights to the softmax losses and the regression losses, and then the integrated losses are backpropagated.
  - 24. The testing device of claim 22, wherein, at the process of (iii), the obstacle segmentation result for training has been generated by a softmax operation which normalizes each value corresponding to each of the rows per each of the columns, and wherein, at the process of (iv), the obstacle segmentation result for training has been adjusted to reduce each of differences between (i) each of probability values of each of the specific rows for training per each of the columns and (ii) each of probability values of neighboring rows per each of the columns within a certain distance from each of the specific rows for training per each of the columns, by using one or more regression operations.
  - 25. The testing device of claim 22, wherein, at the process of (iv), the regression losses have been generated by referring to each of the respective differences of distances between (i) each location of the specific rows for training having each of highest scores per each of the columns on the obstacle segmentation result for training and (ii) each location of the exact rows having each of highest scores per each of the columns on the GT.
  - 26. The testing device of claim 22, wherein the process of (III) includes steps of:
    - (III-1) supposing that each cell of the grid has been generated by dividing the at least one decoded feature map for testing with respect to the first direction by first intervals and with respect to the second direction by second intervals, concatenating each of features for testing of each of the rows per each of the columns in a direction of a channel, to thereby generate at least one reshaped feature map for testing; and
      
      (III-2) generating the obstacle segmentation result for testing which represents where each of the bottom lines of each of the nearest obstacles for testing is estimated as being located among the rows for each of the columns by referring to the reshaped feature map for testing via checking each of estimated positions of each of the bottom lines of each of the nearest obstacles for testing among concatenated channels for each of the columns, wherein the obstacle segmentation result for testing is generated by a softmax operation which normalizes each value corresponding to each channel per each of the columns.
  - 27. The testing device of claim 22, wherein each of the columns includes one or more pixels in the first direction, and each of the rows includes one or more pixels in the second direction.
  - 28. The testing device of claim 22, wherein the GT includes information representing on which row each of the bottom lines of each of the nearest obstacles for training is truly located among the rows, per each of the columns, resulting from dividing the training image into N_c rows, and wherein the obstacle segmentation result for training represents on which row each of the bottom lines of each of the nearest obstacles for training is estimated as being located among the rows, per each of the columns, resulting from dividing the training image into N_c rows.

Current Assignee
Stradvision, Inc.
Original Assignee
Stradvision, Inc.
Inventors
Kim, Kye-Hyeon; Kim, Yongjoong; Kim, Insu; Kim, Hak-Kyoung; Nam, Woonhyun; Boo, SukHoon; Sung, Myungchul; Yeo, Donghun; Ryu, Wooju; Jang, Taewoong; Jeong, Kyungjoong; Je, Hongmo; Cho, Hojin
Primary Examiner(s)
Smith, Paulinho E

Application Number

US16/171,601
Time in Patent Office

221 Days
Field of Search

None
US Class Current
CPC Class Codes

G06F 17/18   for evaluating statistical ...

G06F 18/2414   Smoothing the distance, e.g...

G06N 3/04   Architecture, e.g. intercon...

G06N 3/045   Combinations of networks

G06N 3/084   Backpropagation, e.g. using...

G06N 5/046   Forward inferencing; Produc...

G06V 10/454   Integrating the filters int...

G06V 10/764   using classification, e.g. ...

G06V 10/82   using neural networks

G06V 20/58   Recognition of moving objec...
