Joint depth estimation and semantic segmentation from a single image
Abstract
Joint depth estimation and semantic labeling techniques usable for processing of a single image are described. In one or more implementations, global semantic and depth layouts of a scene of the image are estimated through machine learning by one or more computing devices. Local semantic and depth layouts are also estimated, through machine learning by the one or more computing devices, for respective ones of a plurality of segments of the scene of the image. The estimated global semantic and depth layouts are merged with the local semantic and depth layouts by the one or more computing devices to semantically label and assign a depth value to individual pixels in the image.
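The merge step summarized in the abstract can be illustrated with a minimal sketch. This is not the method of the patent: the abstract does not state a merge rule, so the one below (anchor each segment at its mean global depth, add the segment's zero-centered relative depths, and pick the label that maximizes the product of global and local class probabilities) is an assumption made purely for illustration. The function name `merge_layouts` and all parameter names are hypothetical, and the global and local estimates are assumed to have been produced already by some learned model.

```python
import numpy as np


def merge_layouts(global_depth, global_probs, segments, local_rel_depth, local_probs):
    """Illustrative merge of global and local layouts (assumed rule, not the patent's).

    global_depth:    (H, W) coarse depth estimate for the whole scene
    global_probs:    (H, W, C) coarse per-pixel class probabilities
    segments:        (H, W) integer segment id per pixel, ids in 0..S-1
    local_rel_depth: (H, W) per-pixel depth relative to other pixels in its segment
    local_probs:     (S, C) per-segment class probabilities from the local estimate
    """
    depth = np.empty_like(global_depth, dtype=float)
    labels = np.empty(segments.shape, dtype=int)

    for s in np.unique(segments):
        mask = segments == s

        # Depth: the global layout supplies the segment's overall depth,
        # the local layout supplies relative variation inside the segment.
        base = global_depth[mask].mean()
        rel = local_rel_depth[mask]
        depth[mask] = base + (rel - rel.mean())

        # Semantics: combine the segment's mean global probabilities with
        # the local probabilities and take the most likely class.
        combined = global_probs[mask].mean(axis=0) * local_probs[s]
        labels[mask] = int(np.argmax(combined))

    return labels, depth
```

The function returns a per-pixel label map and a per-pixel depth map, which corresponds to the "semantically label and assign a depth value to individual pixels" outcome in the abstract. Any other combination rule (for example, an energy minimization over segments) could be substituted without changing the interface.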
20 Claims
1. A method of performing joint depth estimation and semantic labeling of an image by one or more computing devices, the method comprising:
estimating global semantic and depth layouts of a scene of the image through machine learning by the one or more computing devices;
estimating local semantic and depth layouts for respective ones of a plurality of segments of the scene of the image through machine learning by the one or more computing devices, the local depth layouts including relative depth values for an individual pixel in the image representing a depth of the individual pixel in relation to other pixels; and
merging the estimated global semantic and depth layouts with the estimated local semantic and depth layouts by the one or more computing devices to semantically label and assign a depth value to the individual pixels in the image.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 20
12. A system comprising:
one or more computing devices implemented at least partially in hardware, the one or more computing devices configured to perform operations comprising:
estimating global semantic and depth layouts of a scene of an image through machine learning;
decomposing the image into a plurality of segments;
guiding a prediction of local semantic and depth layout of individual ones of the plurality of segments using the estimated global and semantic depth layouts of the scene, the local depth layout including relative depth values for an individual pixel in the image representing a depth of the individual pixel in relation to other pixels; and
jointly forming a semantically-labeled version of the image in which the individual pixels are assigned a semantic label and a depth map of the image in which the individual pixels are assigned a depth value.
Dependent claims: 13, 14, 15
16. A system comprising:
a global determination module implemented at least partially in hardware, the global determination module for estimating global semantic and depth layouts of a scene of an image through machine learning;
a local determination module implemented at least partially in hardware, the local determination module for estimating local semantic and depth layouts for respective ones of a plurality of segments of the scene of the image through machine learning, the local depth layouts including relative depth values for an individual pixel in the image representing a depth of the individual pixel in relation to other pixels; and
a merge calculation module implemented at least partially in hardware, the merge calculation module for merging the estimated global semantic and depth layouts with the local semantic and depth layouts to semantically label and assign a depth value to the individual pixels in the image.
Dependent claims: 17, 18, 19
Specification