Multi-modal method for locating objects in images
Abstract
A multi-modal method for locating objects in images wherein a tracking analysis is first performed using a plurality of channels which may comprise a shape channel, a color channel, and a motion channel. After a predetermined number of frames, intermediate feature representations are obtained from each channel and evaluated for reliability. Based on the evaluation of each channel, one or more channels are selected for additional tracking. The results of all representations are ultimately integrated into a final tracked output. Additionally, any of the channels may be calibrated using initial results obtained from one or more channels.
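The evaluate-and-select step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the scoring rule (steadier tracks score higher), the function names `general_score` and `select_channels`, and the sample coordinates are all assumptions made for illustration. The text says only that each channel is evaluated for reliability and given a score.

```python
from statistics import mean

def general_score(track):
    """Hypothetical reliability score: a track that jumps less between
    consecutive frames scores higher. The scoring rule is an assumption."""
    if len(track) < 2:
        return 0.0
    jumps = [abs(x1 - x0) + abs(y1 - y0)
             for (x0, y0), (x1, y1) in zip(track, track[1:])]
    return 1.0 / (1.0 + mean(jumps))

def select_channels(tracks, keep=1):
    """Score every channel and return the `keep` best channel names
    together with all scores."""
    scores = {name: general_score(track) for name, track in tracks.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores

# Perceived (x, y) locations of one feature over the first few frames,
# one independent list per channel (made-up numbers).
first_phase = {
    "shape":  [(10, 10), (11, 10), (12, 11)],
    "color":  [(10, 10), (30, 5), (2, 40)],
    "motion": [(10, 10), (12, 10), (14, 12)],
}

selected, scores = select_channels(first_phase, keep=2)
print(selected)  # channels chosen for the second number of frames
```

Only the channels named in `selected` would be run during the second number of frames. Any other reliability measure could replace `general_score` without changing the surrounding flow.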
27 Claims
1. A method for locating objects in images, comprising:
tracking designated objects in the images using a plurality of channels during a first number of frames, the objects comprised of one or more features, each of said channels producing an independent representation comprising perceived locations of said one or more features;
determining a general score for each channel;
selecting, based on said general scores, at least one channel for additional tracking;
tracking the objects using said at least one channel during a second number of frames, each said at least one channel producing an independent representation comprising perceived locations of said one or more features; and
combining said independent representations to produce a tracked output.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method for locating objects in images, comprising:
tracking the objects during a first number of frames only using a channel programmed to perform a shape analysis and to produce calibrating data based on said analysis, said shape channel producing independent representations comprising perceived locations of the objects;
producing calibrating data by said shape channel after the passage of said first number of frames;
tracking the objects during a second number of frames using a channel programmed to perform a color analysis, said color channel calibrated using said calibrating data obtained by said shape channel, said color channel producing independent representations comprising perceived locations of the objects.
Dependent claims: 15, 16
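One way to picture calibrating a color channel from shape-channel output is the following Python sketch. It assumes that the calibrating data is the mean color inside a bounding box reported by the shape channel, and that the color channel reports the centroid of pixels close to that color. Both assumptions, along with the names `calibrate_from_region` and `color_channel`, the tolerance value, and the tiny test frames, are illustrative and are not taken from the claims.

```python
def calibrate_from_region(image, box):
    """Mean RGB inside box = (top, left, bottom, right), with bottom and
    right exclusive. The box stands in for the shape channel's output."""
    top, left, bottom, right = box
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    if not pixels:
        raise ValueError("calibration box is empty")
    return tuple(sum(p[i] for p in pixels) / len(pixels) for i in range(3))

def color_channel(image, reference, tolerance=40.0):
    """Centroid (row, col) of all pixels whose RGB distance to the
    calibrated reference color is within tolerance, or None if no match."""
    hits = [(r, c)
            for r, row in enumerate(image)
            for c, p in enumerate(row)
            if sum((p[i] - reference[i]) ** 2 for i in range(3)) ** 0.5 <= tolerance]
    if not hits:
        return None
    return (sum(r for r, _ in hits) / len(hits),
            sum(c for _, c in hits) / len(hits))

SKIN, BACKGROUND = (200, 150, 120), (0, 0, 0)

def make_frame(top, left):
    """4x4 test frame with a 2x2 skin-colored patch at (top, left)."""
    return [[SKIN if top <= r < top + 2 and left <= c < left + 2 else BACKGROUND
             for c in range(4)] for r in range(4)]

frame_a = make_frame(1, 1)   # last frame of the first phase
frame_b = make_frame(2, 2)   # a frame from the second phase

reference = calibrate_from_region(frame_a, (1, 1, 3, 3))  # from the shape result
location = color_channel(frame_b, reference)
print(reference, location)
```

The same pattern would apply when the calibrating region comes from a motion channel, or from shape and motion channels together.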
14. A method for locating objects in images, comprising:
tracking the objects during a first number of frames only using a channel programmed to perform a motion analysis and to produce calibrating data based on said analysis, said motion channel producing independent representations comprising perceived locations of the objects;
producing calibrating data by said motion channel after the passage of said first number of frames; and
tracking the objects during a second number of frames using a second channel programmed to perform a color analysis, said second channel calibrated using said calibrating data obtained by said motion channel, said color channel producing independent representations comprising perceived locations of the objects.
Dependent claims: 17, 18
19. A method for locating heads and faces in images, comprising:
tracking the heads and faces during a first number of frames using a plurality of channels;
obtaining an independent intermediate feature representation from each of said plurality of channels after the passage of said first number of frames, said independent intermediate feature representations comprising data comprising perceived locations of head or facial features;
running a first n-gram search using said independent intermediate feature representations, wherein a measure of confidence is computed for each of said features and combinations of features within said independent intermediate feature representations, and wherein a general score is assigned to each channel based on said measures of confidence;
selecting one or more channels for additional tracking, said selection based on said general scores assigned to each channel;
tracking the heads and faces during a second number of frames using said one or more selected channels;
obtaining further independent feature representations from each of said one or more channels, each further independent feature representation comprising data comprising perceived locations of head or facial features; and
running a second n-gram search wherein said further independent feature representations are integrated into said independent intermediate feature representations to produce a tracked output.
Dependent claims: 20, 21
22. A method for locating heads and faces within images, comprising:
tracking the images for a first number of frames using a plurality of channels;
obtaining, after the passage of said first number of frames, independent intermediate feature representations from each of said plurality of channels;
evaluating said independent intermediate feature representations, said evaluation step used to determine a level of reliability for each of said plurality of channels;
selecting, based on said determination of said reliability for each of said plurality of channels, one or more channels for additional tracking;
tracking the images for a second number of frames using said selected one or more channels;
obtaining further independent feature representations from said selected one or more channels after the passage of said second number of frames; and
combining said independent intermediate feature representations and said further independent feature representations into a net representation of likely head and facial locations.
Dependent claims: 23, 24
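The final combining step can be sketched as a reliability-weighted average of likelihood maps. This is only one plausible reading. Representing each feature representation as a small grid of likelihoods, weighting by a per-channel reliability value, and the names `net_representation` and `most_likely_cell` are all assumptions made for this Python illustration.

```python
def net_representation(maps, weights):
    """Reliability-weighted average of equally sized likelihood grids.
    `maps` holds the intermediate and further representations;
    `weights` holds one assumed reliability value per map."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one map needs a positive weight")
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(w * m[r][c] for m, w in zip(maps, weights)) / total
             for c in range(cols)]
            for r in range(rows)]

def most_likely_cell(net):
    """(row, col) of the highest combined likelihood."""
    cells = ((r, c) for r in range(len(net)) for c in range(len(net[0])))
    return max(cells, key=lambda rc: net[rc[0]][rc[1]])

# Made-up 2x2 likelihood grids for a single facial feature.
shape_intermediate = [[0.0, 0.2], [0.1, 0.9]]   # after the first number of frames
color_intermediate = [[0.8, 0.1], [0.0, 0.3]]   # judged less reliable
shape_further      = [[0.0, 0.1], [0.0, 1.0]]   # after the second number of frames

net = net_representation(
    [shape_intermediate, color_intermediate, shape_further],
    [0.6, 0.2, 0.6],
)
print(most_likely_cell(net))
```

Because the less reliable color map receives a small weight, its stray peak at the top-left cell does not override the agreement between the two shape maps.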
25. A method for locating objects in images, comprising:
tracking the objects during a first number of frames using only a first channel programmed to perform a shape analysis and a second channel programmed to perform a motion analysis, said first and second channels producing calibrating data based on said analyses, said first and second channels each producing independent representations comprising perceived locations of the object;
producing calibrating data by said first and second channels after the passage of said first number of frames;
tracking the objects during a second number of frames using a channel programmed to perform a color analysis, said color channel calibrated using said calibrating data obtained by said first and second channels, said color channel producing independent representations comprising perceived locations of the objects.
Dependent claims: 26, 27
Specification