Digital life recorder implementing enhanced facial recognition subsystem for acquiring face glossary data
First Claim
1. A computer implemented method for identifying an individual in an image, the computer implemented method comprising:
capturing audio data and video data by a digital life recorder comprising a plurality of cameras positioned on a user, a plurality of microphones positioned on the user, a set of headphones positioned on the user, and a display device, wherein the display device is a mobile device and wherein the video data includes a continuous stream of images;
extracting, from the continuous stream of images, data that includes a set of facial frames;
responsive to capturing the data that includes the set of facial frames, automatically identifying, by a processing unit of the digital life recorder, an individual face in the set of facial frames based on metadata associated with the set of facial frames to form an identification, wherein the metadata associated with the set of facial frames includes a first time a facial frame in the set of facial frames including the individual face was captured by a camera in the plurality of cameras;
indexing the individual face identified in a glossary based on the metadata associated with the set of facial frames;
responsive to identifying the individual face, displaying the individual face and the identification of the individual face on the display device of the digital life recorder and requesting confirmation of the identification of the individual face from the user;
extracting a set of voice commands spoken by the user from the audio data, wherein extracting the set of voice commands spoken by the user from the audio data comprises:
recognizing a voice as that of the user and filtering the set of voice commands from the audio data captured;
identifying a second time a first voice command in the set of voice commands was spoken by the user;
executing the first voice command to identify an individual face in the set of facial frames by matching the second time the first voice command was spoken by the user with the first time the facial frame in the set of facial frames was captured by the camera, wherein the first voice command includes an identification from the user of the individual face in the set of facial frames;
executing a second voice command in the set of voice commands to control the capturing of the audio data and the video data by the plurality of cameras and the plurality of microphones of the digital life recorder positioned on the user;
obtaining feedback from the user about the capturing of the audio data and the video data by the digital life recorder using the set of headphones; and
using the feedback obtained to further control the capturing of the audio data and the video data by the digital life recorder.
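The core technique recited above matches the time a voice command was spoken against the time a facial frame was captured, so the user's spoken identification can be attached to the right face. A minimal sketch of that matching step, in hypothetical Python (the class, function names, and the two-second tolerance window are assumptions for illustration, not from the patent):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class FacialFrame:
    capture_time: float            # seconds since recording start (frame metadata)
    face_id: Optional[str] = None  # identity tag, filled in by a voice command

def apply_voice_tag(frames: List[FacialFrame], command_time: float,
                    spoken_name: str, tolerance: float = 2.0) -> Optional[FacialFrame]:
    """Tag the facial frame whose capture time is closest to the moment
    the user spoke the identification command, within a tolerance window."""
    if not frames:
        return None
    best = min(frames, key=lambda f: abs(f.capture_time - command_time))
    if abs(best.capture_time - command_time) <= tolerance:
        best.face_id = spoken_name
        return best
    return None  # no frame was captured near the command time
```

A nearest-in-time match with a tolerance cutoff is only one plausible reading of "matching the second time ... with the first time"; the claim does not specify the matching rule.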
1 Assignment
0 Petitions
Abstract
Identifying individual facial images that are broadcast to enable optimized indexing and storage of facial information. Frames of data including faces are continually captured from a stream of incoming data. The facial frame data is extracted and processed into individual facial images. The individual facial images may be compared to existing facial image data in a database or cache to determine the identity of a facial image. The individual facial images may also be compared to facial images and metadata describing the facial images that are broadcast from external recording subsystems. The individual facial images stored to the glossary may be indexed based on the metadata received in the broadcast from an external recording subsystem or by metadata received from the continuous face frame capture.
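The abstract describes comparing captured facial images against existing entries in a database or cache and indexing glossary entries by metadata. A toy in-memory sketch of that compare-then-index flow (the class name, the plain-vector embedding representation, and the similarity threshold are all assumptions; a real subsystem would use a trained face-embedding model):

```python
from collections import defaultdict
from typing import Dict, List, Optional

class FaceGlossary:
    """Toy face glossary: stores one embedding per identity and
    indexes entries by capture-time metadata (here, a capture day)."""
    def __init__(self, match_threshold: float = 0.9):
        self.entries: Dict[str, List[float]] = {}          # identity -> embedding
        self.by_time: Dict[str, List[str]] = defaultdict(list)  # day -> identities
        self.match_threshold = match_threshold

    @staticmethod
    def _similarity(a: List[float], b: List[float]) -> float:
        # cosine similarity of two equal-length embedding vectors
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0

    def identify(self, embedding: List[float]) -> Optional[str]:
        """Return the best-matching known identity, or None if no entry
        clears the similarity threshold."""
        best_id, best_sim = None, 0.0
        for identity, known in self.entries.items():
            sim = self._similarity(embedding, known)
            if sim > best_sim:
                best_id, best_sim = identity, sim
        return best_id if best_sim >= self.match_threshold else None

    def index(self, identity: str, embedding: List[float], capture_day: str) -> None:
        # store the face and record it under its capture-time metadata
        self.entries[identity] = embedding
        self.by_time[capture_day].append(identity)
```

The same `index` path could accept entries broadcast from an external recording subsystem, since the abstract treats locally captured and broadcast faces uniformly.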
81 Citations
17 Claims
1. A computer implemented method for identifying an individual in an image, as recited in full in the First Claim above. - View Dependent Claims (2, 3, 4, 5, 6, 17)
7. A computer program product comprising:
a computer readable storage medium tangibly embodying executable program instructions configured to identify an individual in an image;
first program instructions configured to capture audio data and video data by a digital life recorder comprising a plurality of cameras positioned on a user, a plurality of microphones positioned on the user, a set of headphones positioned on the user, and a display device, wherein the display device is a mobile device and wherein the video data includes a continuous stream of images;
second program instructions configured to extract, from the continuous stream of images, data that includes a set of facial frames;
third program instructions configured to automatically identify, responsive to capturing the data that includes the set of facial frames, an individual face in the set of facial frames based on metadata associated with the set of facial frames to form an identification, wherein the metadata associated with the set of facial frames includes a first time a facial frame in the set of facial frames including the individual face was captured by a camera in the plurality of cameras;
fourth program instructions configured to index the individual face in a glossary based on the metadata associated with the set of facial frames;
fifth program instructions configured to display, responsive to identifying the individual face, the individual face and the identification of the individual face on the display device of the digital life recorder and request confirmation of the identification of the individual face from the user;
sixth program instructions configured to extract a set of voice commands spoken by the user from the audio data, wherein the sixth program instructions comprise:
program instructions configured to recognize a voice as that of the user and filter the set of voice commands from the audio data captured;
seventh program instructions configured to identify a second time a first voice command in the set of voice commands was spoken by the user;
eighth program instructions configured to execute the first voice command to identify an individual face in the set of facial frames by matching the second time the first voice command was spoken by the user with the first time the facial frame in the set of facial frames was captured by the camera, wherein the first voice command includes an identification from the user of the individual face in the set of facial frames;
ninth program instructions configured to execute a second voice command in the set of voice commands to control the capture of the audio data and the video data by the plurality of cameras and the plurality of microphones of the digital life recorder positioned on the user;
tenth program instructions configured to obtain feedback from the user about the capturing of the audio data and the video data by the digital life recorder using the set of headphones; and
eleventh program instructions configured to use the feedback obtained to further control the capturing of the audio data and the video data by the digital life recorder, wherein the first through eleventh program instructions are stored on the computer readable storage medium.
- View Dependent Claims (8, 9, 10, 11)
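The display-and-confirm limitation (showing the identified face on the display device and requesting confirmation from the user) could be wired as a small callback-driven routine. The function names, callback signatures, and message strings below are illustrative assumptions, not from the patent:

```python
from typing import Callable

def request_confirmation(face_id: str, show: Callable[[str], None],
                         ask_user: Callable[[str], bool]) -> bool:
    """Display a proposed identification on the recorder's display
    device and ask the user to confirm or reject it."""
    show(f"Identified: {face_id}")
    return ask_user(f"Is this {face_id}?")

# Example wiring with stand-in I/O callbacks (a real device would
# render to the mobile display and capture a confirming voice command):
shown = []
confirmed = request_confirmation("Alice", shown.append, lambda _q: True)
```

On rejection, the glossary entry would presumably be left untagged or corrected; the claim does not specify the rejection path.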
12. An apparatus comprising:
a bus system;
a memory connected to the bus system, wherein the memory includes a computer usable program code; and
a processing unit connected to the bus system, wherein the processing unit is configured to execute the computer usable program code to:
capture audio data and video data by a digital life recorder comprising a plurality of cameras positioned on a user, a plurality of microphones positioned on the user, a set of headphones positioned on the user, and a display device, wherein the display device is a mobile device and wherein the video data includes a continuous stream of images;
extract, from the continuous stream of images, data that includes a set of facial frames;
responsive to capturing the data that includes the set of facial frames, automatically identify an individual face in the set of facial frames based on metadata associated with the set of facial frames to form an identification, wherein the metadata associated with the set of facial frames includes a first time a facial frame in the set of facial frames including the individual face was captured by a camera in the plurality of cameras;
index the individual face in a glossary based on the metadata associated with the set of facial frames;
responsive to identifying the individual face, display the individual face and the identification of the individual face on the display device of the digital life recorder and request confirmation of the identification of the individual face from the user;
extract a set of voice commands spoken by the user from the audio data, wherein in executing the computer usable program code to extract the set of voice commands spoken by the user from the audio data the processing unit is further configured to execute the computer usable program code to:
recognize a voice as that of the user and filter the set of voice commands from the audio data captured;
identify a second time a first voice command in the set of voice commands was spoken by the user;
execute the first voice command to identify an individual face in the set of facial frames by matching the second time the first voice command was spoken by the user with the first time the facial frame in the set of facial frames was captured by the camera, wherein the first voice command includes an identification from the user of the individual face in the set of facial frames;
execute a second voice command in the set of voice commands to control the capture of the audio data and the video data by the plurality of cameras and the plurality of microphones of the digital life recorder positioned on the user;
obtain feedback from the user about the capturing of the audio data and the video data by the digital life recorder using the set of headphones; and
use the feedback obtained to further control the capturing of the audio data and the video data by the digital life recorder.
- View Dependent Claims (13, 14, 15, 16)
Specification