Apparatus and methods for embedding metadata into video stream
First Claim
1. A system for generating a multimedia streaming file, the system comprising:
information storage; and
one or more processors configured by computer instructions to:
obtain a series of images captured by an imaging sensor;
obtain information captured by one or more sensors other than the imaging sensor, the obtained information being relevant to one or more images within the series of images, the one or more sensors other than the imaging sensor including a first sensor;
generate an encoded video track that includes images from the series of images;
generate a sensor track that includes a first sensor record based on the obtained information;
generate a combined multimedia stream comprised of the encoded video track and the sensor track; and
store the combined multimedia stream in the information storage;
wherein:
the first sensor record comprises:
a header portion comprising a tag field comprising a sensor tag selected from potential sensor tags, the sensor tag identifying type of the obtained information;
a type size field comprising at least one of a value type field identifying a value type of a given value of the obtained information that is within the first sensor record;
an item size field indicating size of the given value of the obtained information that is within the first sensor record; and/or
a repeat field indicating a number of values of the obtained information that is within the first sensor record; and
a data portion comprising the values of the obtained information; and
wherein individual ones of the values of the obtained information correspond temporally to specific ones of the one or more images within the series of images.
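The record layout recited in claim 1 (tag field, type size field, data portion) can be illustrated with a minimal Python sketch. The field widths are assumptions made only for illustration and are not stated in the claim: a 4-byte ASCII tag, a 1-byte value type character, a 1-byte item size, a 2-byte big-endian repeat count, and data padded to a 32-bit boundary as the abstract suggests. The tag "ACCL" and the function name `pack_sensor_record` are hypothetical.

```python
import struct


def pack_sensor_record(tag, type_char, item_fmt, values):
    """Pack one sensor record (assumed layout, not taken from the patent):
    4-byte tag | 1-byte value type | 1-byte item size | 2-byte repeat | data.
    """
    # Size in bytes of a single value, e.g. 2 for a 16-bit integer ("h").
    item_size = struct.calcsize(">" + item_fmt)
    # Header: tag field followed by the type size field (type, size, repeat).
    header = struct.pack(
        ">4scBH",
        tag.encode("ascii"),
        type_char.encode("ascii"),
        item_size,
        len(values),
    )
    # Data portion: the values themselves, big-endian.
    data = b"".join(struct.pack(">" + item_fmt, v) for v in values)
    # Pad the data portion so the whole record stays 32-bit aligned.
    padding = (-len(data)) % 4
    return header + data + b"\x00" * padding


# Hypothetical example: three 16-bit accelerometer readings.
record = pack_sensor_record("ACCL", "s", "h", [1, -2, 3])
```

Because the header carries the value type, item size, and repeat count, a reader can skip or parse the data portion without a separate schema, which is one way to read the abstract's "self-describing structure".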
Abstract
Apparatus and methods for combining metadata with video into a video stream using a 32-bit aligned payload that is computer storage efficient and human discernible. The metadata is stored in a track in a self-describing structure. The metadata track may be decoded using an identifier reference table that is substantially smaller than typical fourCC identifier tables. The combined metadata/video stream is compatible with a standard video stream convention and may be played using conventional media player applications that read media files compliant with the MP4/MOV container format. The proposed format may enable decoding of metadata during streaming and partitioning of the combined video stream without loss of metadata. The proposed format and/or metadata protocol provides for temporal synchronization of metadata with video frames.
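The temporal synchronization described in the abstract can be pictured with a small Python sketch. It assumes, purely for illustration, that the sensor values in one record are spaced uniformly across the same time interval as a group of video frames; the abstract does not state how samples are spaced, and the function name `frame_index_for_sample` is hypothetical.

```python
def frame_index_for_sample(sample_idx, n_samples, n_frames):
    """Map a sensor sample to the video frame covering the same instant.

    Assumption (not from the source): samples and frames are both spread
    uniformly over one shared time interval.
    """
    if n_samples <= 0 or n_frames <= 0:
        raise ValueError("n_samples and n_frames must be positive")
    # Integer scaling of the sample position onto the frame range.
    return min(n_frames - 1, sample_idx * n_frames // n_samples)


# Ten sensor values recorded over an interval that contains five frames.
mapping = [frame_index_for_sample(i, 10, 5) for i in range(10)]
```

Under this assumption each frame receives two consecutive sensor values; a real stream could instead carry explicit timestamps per record.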
20 Claims
20. A non-transitory computer readable medium comprising a plurality of computer instructions configured to, when executed by a processor, decode sensor information from a multimedia stream by at least:
accessing one or more image frames from a video track of the multimedia stream, the one or more frames corresponding to a time interval;
accessing a text track of the multimedia stream corresponding to the time interval, the accessing the text track comprises steps of:
reading from the text track a sensor tag field value;
identifying one or more of type, origin, and/or meaning of the sensor information based on the sensor tag field value;
reading from the text track a type size field comprising at least one of a type portion configured to identify type of a given value of the sensor information within a sensor record;
an item size field indicating size of the given value of the sensor information; and/or
a repeat field indicating a number of values of the sensor information within the sensor record; and
reading from a data portion comprising the number of values of the sensor information;
wherein:
individual values of the number of values of the sensor information correspond temporally to the one or more images; and
the sensor tag field, the type size field and the data portion are configured to form the sensor record, the sensor record being stored in the text track.
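The decoding steps of claim 20 (read the tag, look up its meaning, read the type size field, then read the data portion) can be sketched in Python as follows. The byte layout is an assumption for illustration only: a 4-byte ASCII tag, a 1-byte type character, a 1-byte item size, a 2-byte big-endian repeat count, and data padded to 32 bits. The tags "TMPC" and "ACCL", the `TAG_MEANINGS` table, and the function name `read_sensor_records` are hypothetical.

```python
import struct

# Hypothetical identifier reference table: tag -> meaning of the values.
TAG_MEANINGS = {"TMPC": "temperature (deg C)", "ACCL": "acceleration"}


def read_sensor_records(buf):
    """Walk a byte buffer and return (tag, meaning, type, size, repeat, data)."""
    records = []
    offset = 0
    while offset + 8 <= len(buf):
        # Header: sensor tag field, then the type size field.
        tag, type_char, item_size, repeat = struct.unpack_from(">4scBH", buf, offset)
        offset += 8
        # Data portion: 'repeat' values of 'item_size' bytes each.
        data_len = item_size * repeat
        data = buf[offset:offset + data_len]
        # Skip the data plus any padding up to the next 32-bit boundary.
        offset += data_len + (-data_len) % 4
        name = tag.decode("ascii")
        meaning = TAG_MEANINGS.get(name, "unknown")
        records.append((name, meaning, type_char.decode("ascii"), item_size, repeat, data))
    return records


# Hand-built sample: one float temperature, then three 16-bit readings.
sample = (
    b"TMPC" + b"f" + bytes([4]) + struct.pack(">H", 1) + struct.pack(">f", 21.5)
    + b"ACCL" + b"s" + bytes([2]) + struct.pack(">H", 3)
    + struct.pack(">3h", 1, -2, 3) + b"\x00\x00"
)
records = read_sensor_records(sample)
```

Since every record states its own length through the item size and repeat fields, a reader that does not recognize a tag can still step over that record and continue with the next one.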
Specification