Apparatus and methods for embedding metadata into video stream
First Claim
1. A computerized capture system for obtaining a multimedia streaming file, the system comprising:
an imaging sensor configured to generate output signals conveying a series of images;
a sensor interface configured to obtain information from one or more sensors other than the imaging sensor, the obtained information being relevant to one or more images within the series of images, the one or more sensors other than the imaging sensor including a first sensor;
information storage configured to store a collection of potential sensor tags; and
a processor configured to:
generate an encoded video track that includes images from the series of images;
generate a sensor track that includes a first sensor record containing the obtained information;
generate a combined multimedia stream comprised of the encoded video track and the sensor track; and
store the combined multimedia stream in the information storage;
wherein:
the first sensor record comprises:
a header portion comprising a 32 bit tag field comprising a sensor tag selected from the potential sensor tags, the sensor tag identifying type of the obtained information;
a 32 bit type size field comprising an 8 bit value type field identifying a value type of a given value of the obtained information that is within the first sensor record;
an 8 bit item size field indicating size of the given value of the obtained information that is within the first sensor record; and
a 16 bit repeat field indicating a number of values of the obtained information that is within the first sensor record; and
a data portion comprising the values of the obtained information; and
wherein individual ones of the values of the obtained information correspond temporally to specific ones of the one or more images in the series of images.
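The record layout recited in the claim (a 32 bit tag, an 8 bit value type, an 8 bit item size, a 16 bit repeat count, then the values) can be illustrated with a minimal Python sketch. The claim does not specify byte order, tag names, or value type codes, so the big-endian layout, the tag `TMPC`, and the type code `f` used here are assumptions made purely for illustration, and `pack_record` is a hypothetical helper rather than anything named in the patent.

```python
import struct

# Header per the claim: 32-bit tag, 8-bit value type, 8-bit item size,
# 16-bit repeat count. Big-endian byte order is an assumption.
HEADER = struct.Struct(">4sBBH")


def pack_record(tag: bytes, value_type: str, item_format: str, values) -> bytes:
    """Build one sensor record: 8-byte header followed by the data portion.

    tag         -- 4-byte sensor tag (hypothetical example: b"TMPC")
    value_type  -- one-character type code (hypothetical example: "f")
    item_format -- struct format of a single value, e.g. "f" for a 32-bit float
    values      -- the values to store; their count becomes the repeat field
    """
    if len(tag) != 4:
        raise ValueError("sensor tag must be exactly 4 bytes (32 bits)")
    item_size = struct.calcsize(">" + item_format)
    header = HEADER.pack(tag, ord(value_type), item_size, len(values))
    data = b"".join(struct.pack(">" + item_format, v) for v in values)
    return header + data


# Three hypothetical temperature samples, one per captured frame.
record = pack_record(b"TMPC", "f", "f", [21.5, 21.75, 22.0])
print(len(record))  # 8-byte header + 3 values * 4 bytes = 20
```

Because the item size and repeat count travel with every record, a reader can compute the length of the data portion without knowing anything else about the sensor.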
Abstract
Apparatus and methods for combining metadata with video into a video stream using a 32-bit aligned payload that is efficient in computer storage and human-discernible. The metadata is stored in a track in a self-describing structure. The metadata track may be decoded using an identifier reference table that is substantially smaller than typical fourCC identifier tables. The combined metadata/video stream is compatible with a standard video stream convention and may be played using conventional media player applications that read media files compliant with the MP4/MOV container format. The proposed format may enable decoding of metadata during streaming and partitioning of the combined video stream without loss of metadata. The proposed format and/or metadata protocol provides for temporal synchronization of metadata with video frames.
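As a rough illustration of what a 32-bit aligned payload implies, the sketch below rounds a payload length up to the next multiple of four bytes and pads the difference. The abstract does not say what the padding bytes contain, so the zero fill is an assumption, and `padded_length` and `pad_payload` are hypothetical helper names.

```python
def padded_length(n: int) -> int:
    """Round a byte count up to the next multiple of 4 (32-bit alignment)."""
    return (n + 3) & ~3


def pad_payload(payload: bytes) -> bytes:
    """Append zero bytes (an assumed fill value) until the payload is 32-bit aligned."""
    return payload + b"\x00" * (padded_length(len(payload)) - len(payload))


print(padded_length(5))            # 8
print(len(pad_payload(b"abcde")))  # 8
```

With every payload ending on a 32-bit boundary, the next record always starts at an offset that is a multiple of four.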
20 Claims
20. A non-transitory computer readable medium comprising a plurality of computer instructions configured to, when executed by a processor, decode sensor information from a multimedia stream by at least:
accessing one or more image frames from a video track of the multimedia stream, the one or more frames corresponding to a time interval;
accessing a text track of the multimedia stream corresponding to the time interval, the accessing the text track comprises steps of:
reading from the text track a 32-bit sensor tag field value;
accessing a data store configured to store multiple sensor tags;
identifying within the data store an entry corresponding to the sensor tag field value, the entry configured to identify one or more of type, origin, and/or meaning of the sensor information;
reading from the text track a 32 bit type size field comprising 8 bit type portion configured to identify type of a given value of the sensor information within a sensor record;
8 bit item size field indicating size of the given value of the sensor information; and
16 bit repeat field indicating a number of values of the sensor information within the sensor record; and
reading from a data portion comprising the number of values of the sensor information;
wherein:
individual values of the number of values of the sensor information correspond temporally to the one or more images; and
the sensor tag field, the type size field and the data portion are configured to form the sensor record, the sensor record being stored in the text track.
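The decoding steps of claim 20 (read the tag, look it up in a data store, read the type, item size, and repeat fields, then read that many values) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the big-endian byte order, the tag `TMPC`, the type code `f`, the contents of `KNOWN_TAGS`, and the helper name `read_record` are all hypothetical.

```python
import struct

# 32-bit tag, 8-bit type, 8-bit item size, 16-bit repeat (byte order assumed).
HEADER = struct.Struct(">4sBBH")

# Stand-in for the claimed data store of sensor tags; the entry is hypothetical.
KNOWN_TAGS = {b"TMPC": "temperature, degrees Celsius"}


def read_record(buf: bytes, offset: int = 0):
    """Decode one sensor record starting at offset.

    Returns (record, next_offset), where next_offset is rounded up to the
    next 32-bit boundary.
    """
    tag, value_type, item_size, repeat = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    end = start + item_size * repeat
    if end > len(buf):
        raise ValueError("record data runs past the end of the buffer")
    raw = [buf[i:i + item_size] for i in range(start, end, item_size)]
    # Only the assumed "f" (32-bit float) type is interpreted in this sketch;
    # other types are returned as raw bytes.
    if value_type == ord("f") and item_size == 4:
        values = [struct.unpack(">f", item)[0] for item in raw]
    else:
        values = raw
    record = {
        "tag": tag,
        "meaning": KNOWN_TAGS.get(tag),  # None when the tag is not in the store
        "values": values,
    }
    return record, (end + 3) & ~3


# One hypothetical record holding three float samples.
sample = struct.pack(">4sBBH3f", b"TMPC", ord("f"), 4, 3, 21.5, 21.75, 22.0)
record, next_offset = read_record(sample)
print(record["meaning"], record["values"], next_offset)
```

A tag missing from the store still yields a well-formed record, since the header alone determines how many bytes to read or skip.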
Specification