Methods and systems of spatiotemporal pattern recognition for video content development
First Claim
1. A method for providing enhanced video content, comprising:
processing at least one video feed through at least one spatiotemporal pattern recognition algorithm that uses machine learning to determine at least one event type for each of a plurality of events within the at least one video feed, wherein machine learning determines the at least one event type for at least one spatiotemporal pattern selected from the group consisting of relative motion of two visible features toward each other for at least a duration threshold, acceleration of motion of at least two visible features with respect to each other being greater than an acceleration threshold, rate of motion of two visible features toward each other, projected point of intersection of the two visible features, and separation distance between the two visible features being less than a separation threshold;
extracting a plurality of video cuts from the at least one video feed;
indexing the extracted plurality of video cuts based on the at least one event type determined by the machine learning that corresponds to an event in the plurality of events detectable in the plurality of video cuts; and
automatically, under computer control, generating an enhanced video content data structure using the extracted plurality of video cuts based on the indexing of the extracted plurality of video cuts, wherein the at least one spatiotemporal pattern recognition algorithm is based on at least one pattern recognized by adjusting an input feature and a weight within a machine learning system, wherein the input feature is selected from the group consisting of relative direction of motion of at least two visible features, duration of relative motion of visible features with respect to each other, rate of motion of at least two visible features with respect to each other, acceleration of motion of at least two visible features with respect to each other, projected point of intersection of at least two visible features with respect to each other, and separation distance between at least two visible features with respect to each other; and
wherein extracting the plurality of video cuts includes automatically extracting a cut from the video feed based on a result of processing another input feed with the machine learning, the another input feed including at least one of a portion of content of a broadcast commentary and a change in camera view in the another input feed.
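As an illustrative aside (not part of the claim language), the claimed input features for two tracked visible features can be computed from sampled positions roughly as sketched below. All function names, the 2D position representation, and the constant-velocity assumption behind the projected intersection are hypothetical, not taken from the patent.

```python
import math

def separation_distance(p1, p2):
    """Euclidean separation between two feature positions (claimed feature:
    separation distance between two visible features)."""
    return math.dist(p1, p2)

def closing_rate(p1_prev, p1, p2_prev, p2, dt):
    """Rate of motion of two features toward each other, expressed as the
    decrease in separation per unit time (positive when closing)."""
    d_prev = math.dist(p1_prev, p2_prev)
    d_now = math.dist(p1, p2)
    return (d_prev - d_now) / dt

def projected_intersection(p1, v1, p2, v2):
    """Projected point of intersection of two features, assuming constant
    velocities v1 and v2; returns None when the paths are parallel."""
    # Solve p1 + t*v1 = p2 + s*v2 by Cramer's rule on the 2x2 system.
    det = v1[0] * (-v2[1]) - (-v2[0]) * v1[1]
    if abs(det) < 1e-9:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * (-v2[1]) - (-v2[0]) * dy) / det
    return (p1[0] + t * v1[0], p1[1] + t * v1[1])
```

In a machine-learning system of the kind recited, scalar features like these would be supplied as inputs whose weights are adjusted during training, rather than compared against fixed thresholds alone.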
Abstract
Providing enhanced video content includes processing at least one video feed through at least one spatiotemporal pattern recognition algorithm that uses machine learning to develop an understanding of a plurality of events and to determine at least one event type for each of the plurality of events. The event type includes an entry in a relationship library detailing a relationship between two visible features. Extracting and indexing a plurality of video cuts from the video feed is performed based on the at least one event type determined by the understanding that corresponds to an event in the plurality of events detectable in the video cuts. Lastly, automatically and under computer control, an enhanced video content data structure is generated using the extracted plurality of video cuts based on the indexing of the extracted plurality of video cuts.
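The pipeline the abstract describes (event-typed cuts, indexed, then assembled into an enhanced content data structure) might be organized along these lines. This is a minimal sketch; the class names, fields, and the dictionary-based index are illustrative assumptions, not the patent's disclosed implementation.

```python
from dataclasses import dataclass, field

@dataclass
class VideoCut:
    """A cut extracted from the video feed, tagged with the event type
    that the pattern-recognition model determined for it."""
    start_s: float
    end_s: float
    event_type: str

@dataclass
class EnhancedContent:
    """Enhanced video content data structure: cuts indexed by event type."""
    index: dict = field(default_factory=dict)

    def add(self, cut: VideoCut):
        # Index the cut under its machine-learned event type.
        self.index.setdefault(cut.event_type, []).append(cut)

    def cuts_for(self, event_type: str):
        # Retrieve all cuts indexed under a given event type.
        return self.index.get(event_type, [])

content = EnhancedContent()
content.add(VideoCut(12.0, 18.5, "pick_and_roll"))
content.add(VideoCut(40.2, 44.0, "pick_and_roll"))
```

The event-type index is what lets enhanced content be generated "automatically, under computer control": once cuts are indexed, assembling a highlight reel for a given event type reduces to a lookup.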
53 Citations
22 Claims
1. A method for providing enhanced video content, comprising:
processing at least one video feed through at least one spatiotemporal pattern recognition algorithm that uses machine learning to determine at least one event type for each of a plurality of events within the at least one video feed, wherein machine learning determines the at least one event type for at least one spatiotemporal pattern selected from the group consisting of relative motion of two visible features toward each other for at least a duration threshold, acceleration of motion of at least two visible features with respect to each other being greater than an acceleration threshold, rate of motion of two visible features toward each other, projected point of intersection of the two visible features, and separation distance between the two visible features being less than a separation threshold;
extracting a plurality of video cuts from the at least one video feed;
indexing the extracted plurality of video cuts based on the at least one event type determined by the machine learning that corresponds to an event in the plurality of events detectable in the plurality of video cuts; and
automatically, under computer control, generating an enhanced video content data structure using the extracted plurality of video cuts based on the indexing of the extracted plurality of video cuts, wherein the at least one spatiotemporal pattern recognition algorithm is based on at least one pattern recognized by adjusting an input feature and a weight within a machine learning system, wherein the input feature is selected from the group consisting of relative direction of motion of at least two visible features, duration of relative motion of visible features with respect to each other, rate of motion of at least two visible features with respect to each other, acceleration of motion of at least two visible features with respect to each other, projected point of intersection of at least two visible features with respect to each other, and separation distance between at least two visible features with respect to each other; and
wherein extracting the plurality of video cuts includes automatically extracting a cut from the video feed based on a result of processing another input feed with the machine learning, the another input feed including at least one of a portion of content of a broadcast commentary and a change in camera view in the another input feed.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
14. A method, comprising:
processing at least one video feed through at least one spatiotemporal pattern recognition algorithm that uses machine learning to determine at least one event type for each of a plurality of events within the at least one video feed, wherein machine learning determines the at least one event type for at least one spatiotemporal pattern selected from the group consisting of relative motion of two visible features toward each other for at least a duration threshold, acceleration of motion of at least two visible features with respect to each other being greater than an acceleration threshold, rate of motion of two visible features toward each other, projected point of intersection of the two visible features, and separation distance between the two visible features being less than a separation threshold;
extracting a plurality of video cuts from the at least one video feed;
indexing the plurality of video cuts based on the at least one event type determined by machine learning; and
providing a mobile application having a user interface configured to permit a user to find the extracted plurality of video cuts based on the indexing of the extracted plurality of video cuts with the mobile application, wherein the at least one spatiotemporal pattern recognition algorithm is based on at least one pattern recognized by adjusting an input feature and a weight within a machine learning system, wherein the input feature is selected from the group consisting of relative direction of motion of at least two visible features, duration of relative motion of visible features with respect to each other, rate of motion of at least two visible features with respect to each other, acceleration of motion of at least two visible features with respect to each other, projected point of intersection of at least two visible features with respect to each other, and separation distance between at least two visible features with respect to each other; and
wherein extracting the plurality of video cuts includes automatically extracting a cut from the video feed based on a result of processing another input feed with the machine learning, the another input feed including at least one of a portion of content of a broadcast commentary and a change in camera view in the another input feed.
View Dependent Claims (15, 16, 17, 18, 19, 20, 21, 22)
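The "another input feed" limitation shared by the independent claims (gating cut extraction on broadcast commentary or a camera-view change) can be illustrated with a simple trigger. This sketch is an assumption for illustration only: the keyword list and boolean gate stand in for what the claims recite as processing the secondary feed with machine learning.

```python
def should_extract_cut(commentary: str, camera_changed: bool) -> bool:
    """Decide whether to extract a cut, using a secondary feed: broadcast
    commentary text and/or a detected change in camera view (e.g. a switch
    to a replay angle). Keyword matching here is a hypothetical stand-in
    for a learned model over the commentary feed."""
    keywords = ("goal", "dunk", "touchdown", "replay")
    return camera_changed or any(k in commentary.lower() for k in keywords)
```

In the claimed method, a positive trigger from the secondary feed would prompt automatic extraction of a cut from the primary video feed around the corresponding timestamp.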
Specification