Semantic Parsing of Objects in Video
Abstract
Techniques, systems, and computer program products for parsing objects in a video are provided herein. A method includes producing and storing a plurality of versions of an image of an object derived from a video input, wherein each version of said image has a different resolution of said image; computing an appearance score at each of a plurality of regions on the lowest resolution version of said image for a plurality of semantic attributes with associated parts for said object, said appearance score denoting a probability of each semantic attribute appearing in the region; analyzing increasingly higher resolution versions than the lowest resolution version to compute a resolution context score for each region in the lowest resolution version; and ascertaining an optimized configuration of body parts and associated semantic attributes in the lowest resolution version, said ascertaining utilizing the appearance scores and the resolution context scores.
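The first and third steps of the abstract (producing several resolutions of the cropped image, then scoring how much finer structure the higher resolutions hold for each low-resolution region) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 2x2 averaging, the one-pixel regions, the function names, and the mean-absolute-difference proxy are hypothetical choices, not the formula disclosed in the patent.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def downsample(img: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block (assumes even dimensions)."""
    return [
        [
            (img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
            for c in range(0, len(img[0]), 2)
        ]
        for r in range(0, len(img), 2)
    ]


def build_versions(img: Image, levels: int) -> List[Image]:
    """Return `levels` versions of the image, highest resolution first."""
    versions = [img]
    for _ in range(levels - 1):
        versions.append(downsample(versions[-1]))
    return versions


def resolution_context_score(versions: List[Image], row: int, col: int) -> float:
    """Hypothetical proxy for finer structure behind one lowest-resolution pixel.

    For each finer version, take the block of pixels that maps onto the
    low-resolution pixel and average their absolute deviation from it.
    """
    lowest = versions[-1]
    base = lowest[row][col]
    finer_versions = versions[:-1]
    total = 0.0
    for level, finer in enumerate(finer_versions):
        scale = 2 ** (len(versions) - 1 - level)  # finer pixels per low-res pixel side
        block = [
            finer[r][c]
            for r in range(row * scale, (row + 1) * scale)
            for c in range(col * scale, (col + 1) * scale)
        ]
        total += sum(abs(p - base) for p in block) / len(block)
    return total / max(1, len(finer_versions))
```

In this sketch a region that looks uniform at every resolution scores 0, while a region whose detail is averaged away in the lowest-resolution version scores higher.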
20 Claims
1. A method comprising:
producing and storing a plurality of versions of an image of an object derived from a video input, said image cropped from said video input, and wherein each version of said image has a different resolution of said image of said object;
computing an appearance score at each of a plurality of regions on the lowest resolution version of said versions of said image for a plurality of semantic attributes with associated parts for said object, said appearance score for at least one semantic attribute of the plurality of semantic attributes for each region denoting a probability of each semantic attribute of the at least one semantic attribute appearing in the region;
analyzing increasingly higher resolution versions than the lowest resolution version to compute a resolution context score for each region in the lowest resolution version, said resolution context score being indicative of an extent to which finer spatial structure exists in the increasingly higher resolution versions than in the lowest resolution version for each region; and
ascertaining an optimized configuration of body parts and associated semantic attributes in the lowest resolution version, said ascertaining utilizing the appearance scores and the resolution context scores in the regions in the lowest resolution version.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
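The final step of claim 1, ascertaining a configuration from the appearance scores and resolution context scores, could be sketched as an exhaustive search over assignments of parts to regions. This is only an illustration under stated assumptions: the additive scoring, the `context_weight` parameter, the rule that each part takes a distinct region, and the attribute names in the usage example are all hypothetical and are not taken from the claim.

```python
from itertools import permutations
from typing import Dict, List, Tuple


def best_configuration(
    appearance: List[Dict[str, float]],   # appearance[region][attribute] -> probability
    context: List[float],                 # context[region] -> resolution context score
    part_attributes: Dict[str, List[str]],  # part name -> candidate semantic attributes
    context_weight: float = 1.0,          # hypothetical weighting between the two scores
) -> Tuple[Dict[str, Tuple[int, str]], float]:
    """Brute-force search for the highest-scoring part/region/attribute assignment."""
    parts = list(part_attributes)
    best_score = float("-inf")
    best_assignment: Dict[str, Tuple[int, str]] = {}
    # Try every way of placing the parts on distinct regions.
    for chosen in permutations(range(len(context)), len(parts)):
        score = 0.0
        assignment: Dict[str, Tuple[int, str]] = {}
        for part, region in zip(parts, chosen):
            # Pick the most probable candidate attribute for this part in this region.
            attr = max(
                part_attributes[part],
                key=lambda a: appearance[region].get(a, 0.0),
            )
            score += appearance[region].get(attr, 0.0) + context_weight * context[region]
            assignment[part] = (region, attr)
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment, best_score


# Usage with made-up scores for three regions and two parts.
appearance_scores = [
    {"beard": 0.9, "blue shirt": 0.1},
    {"beard": 0.2, "blue shirt": 0.8},
    {"beard": 0.1, "blue shirt": 0.3},
]
context_scores = [0.5, 0.4, 0.0]
config, total = best_configuration(
    appearance_scores, context_scores, {"face": ["beard"], "torso": ["blue shirt"]}
)
```

Exhaustive search is only practical for a handful of regions and parts; it is used here because it makes the objective easy to read, not because the claim prescribes it.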
9. A computer program product comprising:
a computer readable storage medium having computer readable program code embodied in the storage medium, said computer readable program code containing instructions that perform a method for estimating parts and attributes of an object in video, said method comprising:
producing and storing a plurality of versions of an image of an object derived from a video input, said image cropped from said video input, and wherein each version of said image has a different resolution of said image of said object;
computing an appearance score at each of a plurality of regions on the lowest resolution version of said versions of said image for a plurality of semantic attributes with associated parts for said object, said appearance score for at least one semantic attribute of the plurality of semantic attributes for each region denoting a probability of each semantic attribute of the at least one semantic attribute appearing in the region;
analyzing increasingly higher resolution versions than the lowest resolution version to compute a resolution context score for each region in the lowest resolution version, said resolution context score being indicative of an extent to which finer spatial structure exists in the increasingly higher resolution versions than in the lowest resolution version for each region; and
ascertaining an optimized configuration of body parts and associated semantic attributes in the lowest resolution version, said ascertaining utilizing the appearance scores and the resolution context scores in the regions in the lowest resolution version.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A computer system comprising a processor and a computer readable memory unit coupled to the processor, said computer readable memory unit containing instructions that when run by the processor implement a method for estimating parts and attributes of an object in video, said method comprising:
producing and storing a plurality of versions of an image of an object derived from a video input, said image cropped from said video input, and wherein each version of said image has a different resolution of said image of said object;
computing an appearance score at each of a plurality of regions on the lowest resolution version of said versions of said image for a plurality of semantic attributes with associated parts for said object, said appearance score for at least one semantic attribute of the plurality of semantic attributes for each region denoting a probability of each semantic attribute of the at least one semantic attribute appearing in the region;
analyzing increasingly higher resolution versions than the lowest resolution version to compute a resolution context score for each region in the lowest resolution version, said resolution context score being indicative of an extent to which finer spatial structure exists in the increasingly higher resolution versions than in the lowest resolution version for each region; and
ascertaining an optimized configuration of body parts and associated semantic attributes in the lowest resolution version, said ascertaining utilizing the appearance scores and the resolution context scores in the regions in the lowest resolution version.
Dependent claims: 18
19. A process for supporting computer infrastructure, said process comprising providing at least one support service for at least one of creating, integrating, hosting, maintaining, and deploying computer-readable code in a computer system, wherein the code in combination with the computing system is capable of performing a method for estimating parts and attributes of an object in video, said method comprising:
producing and storing a plurality of versions of an image of an object derived from a video input, said image cropped from said video input, and wherein each version of said image has a different resolution of said image of said object;
computing an appearance score at each of a plurality of regions on the lowest resolution version of said versions of said image for a plurality of semantic attributes with associated parts for said object, said appearance score for at least one semantic attribute of the plurality of semantic attributes for each region denoting a probability of each semantic attribute of the at least one semantic attribute appearing in the region;
analyzing increasingly higher resolution versions than the lowest resolution version to compute a resolution context score for each region in the lowest resolution version, said resolution context score being indicative of an extent to which finer spatial structure exists in the increasingly higher resolution versions than in the lowest resolution version for each region; and
ascertaining an optimized configuration of body parts and associated semantic attributes in the lowest resolution version, said ascertaining utilizing the appearance scores and the resolution context scores in the regions in the lowest resolution version.
Dependent claims: 20
Specification