Coordinating and mixing audiovisual content captured from geographically distributed performers
Abstract
Audiovisual performances, including vocal music, are captured and coordinated with those of other users in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured (together with performance-synchronized video) on mobile devices, television-type displays and/or set-top box equipment in the context of karaoke-style presentations of lyrics in correspondence with audible renderings of a backing track. Contributions of multiple vocalists are coordinated and mixed in a manner that selects, for visually prominent presentation, performance-synchronized video of one or more of the contributors. Prominence of particular performance-synchronized video may be based, at least in part, on computationally-defined audio features extracted from (or computed over) captured vocal audio. Over the course of a coordinated audiovisual performance timeline, these computationally-defined audio features are selective for performance-synchronized video of one or more of the contributing vocalists.
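As an illustration of how time-aligned audio features might drive visual prominence over a performance timeline, the following is a minimal sketch, not the patented implementation. The function name `prominence_weights`, the `smoothing` parameter, and the proportional-share rule are all assumptions made for this example; the feature series are taken as given, non-negative, per-frame scores for two vocalists.

```python
from typing import List, Tuple


def prominence_weights(
    feature_a: List[float],
    feature_b: List[float],
    smoothing: float = 0.5,
) -> List[Tuple[float, float]]:
    """Return one (weight_a, weight_b) pair per frame; each pair sums to 1.

    Hypothetical rule: each performer's target share of the screen is
    proportional to that performer's feature value in the frame, and the
    share is exponentially smoothed so prominence does not jump abruptly.
    """
    if len(feature_a) != len(feature_b):
        raise ValueError("feature series must be time-aligned and equal length")
    if not 0.0 <= smoothing < 1.0:
        raise ValueError("smoothing must be in [0, 1)")

    weights: List[Tuple[float, float]] = []
    share_a = 0.5  # assumption: both videos start equally prominent
    for a, b in zip(feature_a, feature_b):
        if a < 0 or b < 0:
            raise ValueError("feature values must be non-negative")
        total = a + b
        # With no signal from either vocalist, fall back to an even split.
        target = 0.5 if total == 0 else a / total
        share_a = smoothing * share_a + (1.0 - smoothing) * target
        weights.append((share_a, 1.0 - share_a))
    return weights
```

A renderer could map each weight to, for example, the relative tile size or opacity of the corresponding performer video; that mapping is left open here.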
35 Claims
1. A method of preparing coordinated audiovisual performances from geographically distributed performer contributions, the method comprising:
receiving via a communication network, a first audiovisual encoding of a first performer, including first performer vocals captured at a first remote device and first performer video;
determining, from the first performer vocals, at least one time-varying, computationally-defined audio feature, wherein the computationally-defined audio feature determined from the first performer vocals includes one or more of a measure of tempo correspondence with a melody track, a measure of tempo correspondence with a harmony track, a measure of pitch correspondence with a melody track, a measure of pitch correspondence with a harmony track, a measure of tempo correspondence with a score, and a measure of pitch correspondence with a score;
determining, from second performer vocals of a second audiovisual encoding of a second performer including the second performer vocals captured at a second device and second performer video, at least one time-varying, computationally-defined audio feature for comparison to the computationally-defined audio feature determined from the first performer vocals; and
based on comparison of the computationally-defined audio feature determined from the first and second performer vocals, dynamically varying relative visual prominence of first and second performer video throughout a combined audiovisual performance mix of the captured first and second performer vocals with a backing track and the first and second performer video.
Dependent claims: 2-30, 35
31. An apparatus comprising:
a mobile computing device; and
machine readable code embodied in a non-transitory medium and executable on the mobile computing device to receive via a communication network, a first audiovisual encoding of a first performer, including first performer vocals captured at a first remote device and first performer video;
the machine readable code further executable to determine, from the first performer vocals, at least one time-varying, computationally-defined audio feature, wherein the computationally-defined audio feature determined from the first performer vocals includes one or more of a measure of tempo correspondence with a melody track, a measure of tempo correspondence with a harmony track, a measure of pitch correspondence with a melody track, a measure of pitch correspondence with a harmony track, a measure of tempo correspondence with a score, and a measure of pitch correspondence with a score;
the machine readable code further executable to determine, from second performer vocals of a second audiovisual encoding of a second performer including the second performer vocals captured at a second device and second performer video, at least one time-varying, computationally-defined audio feature for comparison to the computationally-defined audio feature determined from the first performer vocals; and
the machine readable code further executable to, based on comparison of the computationally-defined audio feature determined from the first and second performer vocals, dynamically vary relative visual prominence of first and second performer video throughout a combined audiovisual performance mix of the captured first and second performer vocals with a backing track and the first and second performer video.
Dependent claims: 32, 33
34. A service platform, comprising:
one or more computing devices; and
machine readable code embodied in a non-transitory medium and executable on at least one of the one or more computing devices to receive via a communication network, a first audiovisual encoding of a first performer, including first performer vocals captured at a first remote device and first performer video;
the machine readable code further executable to determine, from the first performer vocals, at least one time-varying, computationally-defined audio feature, wherein the computationally-defined audio feature determined from the first performer vocals includes one or more of a measure of tempo correspondence with a melody track, a measure of tempo correspondence with a harmony track, a measure of pitch correspondence with a melody track, a measure of pitch correspondence with a harmony track, a measure of tempo correspondence with a score, and a measure of pitch correspondence with a score;
the machine readable code further executable to determine, from second performer vocals of a second audiovisual encoding of a second performer including the second performer vocals captured at a second device and second performer video, at least one time-varying, computationally-defined audio feature for comparison to the computationally-defined audio feature determined from the first performer vocals; and
the machine readable code further executable to, based on comparison of the computationally-defined audio feature determined from the first and second performer vocals, dynamically vary relative visual prominence of first and second performer video throughout a combined audiovisual performance mix of the captured first and second performer vocals with a backing track and the first and second performer video.
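The independent claims recite "a measure of pitch correspondence with a melody track" as one possible time-varying audio feature. One way such a measure could be computed is sketched below, purely as an illustration and under stated assumptions: per-frame pitch estimates in Hz are already available (with `None` for unvoiced frames or rests), and the names `cents_off` and `pitch_correspondence`, the window length, and the 50-cent tolerance are hypothetical choices for this example rather than anything taken from the claims.

```python
import math
from typing import List, Optional


def cents_off(f_sung: float, f_target: float) -> float:
    """Absolute pitch deviation in cents (100 cents = one semitone)."""
    return abs(1200.0 * math.log2(f_sung / f_target))


def pitch_correspondence(
    sung_hz: List[Optional[float]],
    target_hz: List[Optional[float]],
    window: int = 4,
    tolerance_cents: float = 50.0,
) -> List[float]:
    """Return one score in [0, 1] per window of frames.

    Each score is the fraction of frames in the window where the sung pitch
    lies within `tolerance_cents` of the melody pitch. Frames where either
    value is None (unvoiced vocal or no melody note) count as misses, which
    is a simplifying assumption of this sketch.
    """
    if len(sung_hz) != len(target_hz):
        raise ValueError("pitch series must be time-aligned and equal length")
    if window < 1:
        raise ValueError("window must be at least 1 frame")

    hits = [
        1.0
        if s is not None and t is not None and cents_off(s, t) <= tolerance_cents
        else 0.0
        for s, t in zip(sung_hz, target_hz)
    ]
    # Non-overlapping windows make the feature time-varying at a coarser rate
    # than the raw pitch track; the last window may be shorter.
    return [
        sum(hits[i:i + window]) / len(hits[i:i + window])
        for i in range(0, len(hits), window)
    ]
```

Computing this series once per performer yields two time-aligned features that can then be compared frame by frame, as the final element of each independent claim describes.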
Specification