Methods and systems of spatiotemporal pattern recognition for video content development
First Claim
1. A method for providing enhanced video content, comprising:
processing at least one video feed through at least one spatiotemporal pattern recognition algorithm that uses machine learning to determine at least one event type for each of a plurality of events within the at least one video feed, wherein machine learning determines the at least one event type for at least one spatiotemporal pattern selected from the group consisting of relative motion of two visible features toward each other for at least a duration threshold, acceleration of motion of at least two visible features with respect to each other being greater than an acceleration threshold, rate of motion of two visible features toward each other, projected point of intersection of the two visible features, and separation distance between the two visible features being less than a separation threshold;
extracting a plurality of video cuts from the at least one video feed;
indexing the extracted plurality of video cuts based on the at least one event type determined by the machine learning that corresponds to an event in the plurality of events detectable in the plurality of video cuts; and
automatically, under computer control, generating an enhanced video content data structure using the extracted plurality of video cuts based on the indexing of the extracted plurality of video cuts, wherein the at least one spatiotemporal pattern recognition algorithm is based on at least one pattern recognized by adjusting an input feature and a weight within a machine learning system, wherein the input feature is selected from the group consisting of relative direction of motion of at least two visible features, duration of relative motion of visible features with respect to each other, rate of motion of at least two visible features with respect to each other, acceleration of motion of at least two visible features with respect to each other, projected point of intersection of at least two visible features with respect to each other, and separation distance between at least two visible features with respect to each other; and
wherein extracting the plurality of video cuts includes automatically extracting a cut from the video feed based on a result of processing another input feed with the machine learning, the another input feed including at least one of a portion of content of a broadcast commentary and a change in camera view in the another input feed.
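As an illustrative aside (not part of the claim language), the claimed input features for two tracked visible features can be computed from sampled positions roughly as sketched below. All function names, the 2D position representation, and the constant-velocity assumption behind the projected intersection are hypothetical, not taken from the patent.

```python
import math

def separation_distance(p1, p2):
    """Euclidean separation between two feature positions (claimed feature:
    separation distance between two visible features)."""
    return math.dist(p1, p2)

def closing_rate(p1_prev, p1, p2_prev, p2, dt):
    """Rate of motion of two features toward each other, expressed as the
    decrease in separation per unit time (positive when closing)."""
    d_prev = math.dist(p1_prev, p2_prev)
    d_now = math.dist(p1, p2)
    return (d_prev - d_now) / dt

def projected_intersection(p1, v1, p2, v2):
    """Projected point of intersection of two features, assuming constant
    velocities v1 and v2; returns None when the paths are parallel."""
    # Solve p1 + t*v1 = p2 + s*v2 by Cramer's rule on the 2x2 system.
    det = v1[0] * (-v2[1]) - (-v2[0]) * v1[1]
    if abs(det) < 1e-9:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * (-v2[1]) - (-v2[0]) * dy) / det
    return (p1[0] + t * v1[0], p1[1] + t * v1[1])
```

In a machine-learning system of the kind recited, scalar features like these would be supplied as inputs whose weights are adjusted during training, rather than compared against fixed thresholds alone.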
Abstract
Providing enhanced video content includes processing at least one video feed through at least one spatiotemporal pattern recognition algorithm that uses machine learning to develop an understanding of a plurality of events and to determine at least one event type for each of the plurality of events. The event type includes an entry in a relationship library detailing a relationship between two visible features. Extracting and indexing a plurality of video cuts from the video feed is performed based on the at least one event type determined by the understanding that corresponds to an event in the plurality of events detectable in the video cuts. Lastly, automatically and under computer control, an enhanced video content data structure is generated using the extracted plurality of video cuts based on the indexing of the extracted plurality of video cuts.
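The pipeline the abstract describes (event-typed cuts, indexed, then assembled into an enhanced content data structure) might be organized along these lines. This is a minimal sketch; the class names, fields, and the dictionary-based index are illustrative assumptions, not the patent's disclosed implementation.

```python
from dataclasses import dataclass, field

@dataclass
class VideoCut:
    """A cut extracted from the video feed, tagged with the event type
    that the pattern-recognition model determined for it."""
    start_s: float
    end_s: float
    event_type: str

@dataclass
class EnhancedContent:
    """Enhanced video content data structure: cuts indexed by event type."""
    index: dict = field(default_factory=dict)

    def add(self, cut: VideoCut):
        # Index the cut under its machine-learned event type.
        self.index.setdefault(cut.event_type, []).append(cut)

    def cuts_for(self, event_type: str):
        # Retrieve all cuts indexed under a given event type.
        return self.index.get(event_type, [])

content = EnhancedContent()
content.add(VideoCut(12.0, 18.5, "pick_and_roll"))
content.add(VideoCut(40.2, 44.0, "pick_and_roll"))
```

The event-type index is what lets enhanced content be generated "automatically, under computer control": once cuts are indexed, assembling a highlight reel for a given event type reduces to a lookup.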
53 Citations
22 Claims
1. A method for providing enhanced video content, comprising:
processing at least one video feed through at least one spatiotemporal pattern recognition algorithm that uses machine learning to determine at least one event type for each of a plurality of events within the at least one video feed, wherein machine learning determines the at least one event type for at least one spatiotemporal pattern selected from the group consisting of relative motion of two visible features toward each other for at least a duration threshold, acceleration of motion of at least two visible features with respect to each other being greater than an acceleration threshold, rate of motion of two visible features toward each other, projected point of intersection of the two visible features, and separation distance between the two visible features being less than a separation threshold;
extracting a plurality of video cuts from the at least one video feed;
indexing the extracted plurality of video cuts based on the at least one event type determined by the machine learning that corresponds to an event in the plurality of events detectable in the plurality of video cuts; and
automatically, under computer control, generating an enhanced video content data structure using the extracted plurality of video cuts based on the indexing of the extracted plurality of video cuts, wherein the at least one spatiotemporal pattern recognition algorithm is based on at least one pattern recognized by adjusting an input feature and a weight within a machine learning system, wherein the input feature is selected from the group consisting of relative direction of motion of at least two visible features, duration of relative motion of visible features with respect to each other, rate of motion of at least two visible features with respect to each other, acceleration of motion of at least two visible features with respect to each other, projected point of intersection of at least two visible features with respect to each other, and separation distance between at least two visible features with respect to each other; and
wherein extracting the plurality of video cuts includes automatically extracting a cut from the video feed based on a result of processing another input feed with the machine learning, the another input feed including at least one of a portion of content of a broadcast commentary and a change in camera view in the another input feed.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
14. A method, comprising:
processing at least one video feed through at least one spatiotemporal pattern recognition algorithm that uses machine learning to determine at least one event type for each of a plurality of events within the at least one video feed, wherein machine learning determines the at least one event type for at least one spatiotemporal pattern selected from the group consisting of relative motion of two visible features toward each other for at least a duration threshold, acceleration of motion of at least two visible features with respect to each other being greater than an acceleration threshold, rate of motion of two visible features toward each other, projected point of intersection of the two visible features, and separation distance between the two visible features being less than a separation threshold;
extracting a plurality of video cuts from the at least one video feed;
indexing the plurality of video cuts based on the at least one event type determined by machine learning; and
providing a mobile application having a user interface configured to permit a user to find the extracted plurality of video cuts based on the indexing of the extracted plurality of video cuts with the mobile application, wherein the at least one spatiotemporal pattern recognition algorithm is based on at least one pattern recognized by adjusting an input feature and a weight within a machine learning system, wherein the input feature is selected from the group consisting of relative direction of motion of at least two visible features, duration of relative motion of visible features with respect to each other, rate of motion of at least two visible features with respect to each other, acceleration of motion of at least two visible features with respect to each other, projected point of intersection of at least two visible features with respect to each other, and separation distance between at least two visible features with respect to each other; and
wherein extracting the plurality of video cuts includes automatically extracting a cut from the video feed based on a result of processing another input feed with the machine learning, the another input feed including at least one of a portion of content of a broadcast commentary and a change in camera view in the another input feed.
View Dependent Claims (15, 16, 17, 18, 19, 20, 21, 22)
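The "another input feed" limitation shared by the independent claims (gating cut extraction on broadcast commentary or a camera-view change) can be illustrated with a simple trigger. This sketch is an assumption for illustration only: the keyword list and boolean gate stand in for what the claims recite as processing the secondary feed with machine learning.

```python
def should_extract_cut(commentary: str, camera_changed: bool) -> bool:
    """Decide whether to extract a cut, using a secondary feed: broadcast
    commentary text and/or a detected change in camera view (e.g. a switch
    to a replay angle). Keyword matching here is a hypothetical stand-in
    for a learned model over the commentary feed."""
    keywords = ("goal", "dunk", "touchdown", "replay")
    return camera_changed or any(k in commentary.lower() for k in keywords)
```

In the claimed method, a positive trigger from the secondary feed would prompt automatic extraction of a cut from the primary video feed around the corresponding timestamp.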
Specification