Coordinating and mixing audiovisual content captured from geographically distributed performers

US 9,866,731 B2
Filed: 10/30/2015
Issued: 01/09/2018
Est. Priority Date: 04/12/2011
Status: Active Grant

Abstract

Audiovisual performances, including vocal music, are captured and coordinated with those of other users in ways that create compelling user experiences. In some cases, the vocal performances of individual users are captured (together with performance synchronized video) on mobile devices, television-type display and/or set-top box equipment in the context of karaoke-style presentations of lyrics in correspondence with audible renderings of a backing track. Contributions of multiple vocalists are coordinated and mixed in a manner that selects for visually prominent presentation performance synchronized video of one or more of the contributors. Prominence of particular performance synchronized video may be based, at least in part, on computationally-defined audio features extracted from (or computed over) captured vocal audio. Over the course of a coordinated audiovisual performance timeline, these computationally-defined audio features are selective for performance synchronized video of one or more of the contributing vocalists.
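The selection idea in the abstract can be illustrated with a minimal sketch. It assumes short-time RMS energy as a stand-in for the "computationally-defined audio feature" (the record does not say which feature an implementation would use), and the names `frame_rms` and `prominent_performer` are hypothetical. For each frame of the timeline, the performer whose feature value is largest is marked for visually prominent presentation.

```python
import math

def frame_rms(samples, frame_len):
    """Return one RMS value per non-overlapping frame of `samples`."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        feats.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return feats

def prominent_performer(features_by_performer):
    """For each frame, return the index of the performer with the largest feature.

    Ties go to the lower index. This is an illustrative rule only, not the
    patented method.
    """
    n_frames = min(len(f) for f in features_by_performer)
    n_perf = len(features_by_performer)
    return [
        max(range(n_perf), key=lambda p: features_by_performer[p][t])
        for t in range(n_frames)
    ]

# Synthetic vocals (assumed 8 kHz sample rate): performer 0 sings first,
# performer 1 sings second.
loud = [0.5 * math.sin(2 * math.pi * 220 * n / 8000) for n in range(800)]
quiet = [0.0] * 800
vocals_a = loud + quiet
vocals_b = quiet + loud

feats = [frame_rms(vocals_a, 400), frame_rms(vocals_b, 400)]
selection = prominent_performer(feats)
print(selection)
```

Here the selection follows whichever synthetic vocalist is active in each frame. A real system would more plausibly compute features such as those listed in claim 2 and smooth the result before driving any video transition.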

26 Claims

1. A method of preparing coordinated audiovisual performances from geographically distributed performer contributions, the method comprising:
- receiving via a communication network, a first audiovisual encoding of a first performer, including first performer vocals captured at a first remote device and first performer video;
  
  receiving via the communication network, a second audiovisual encoding of a second performer, including second performer vocals captured at a second remote device and second performer video;
  
  determining, from the first performer vocals, at least one time-varying, computationally-defined audio feature;
  
  determining, from the second performer vocals, at least one time-varying, computationally-defined audio feature; and
  
  based on comparison of the computationally-defined audio feature determined from the first and second performer vocals, dynamically varying relative visual prominence of first and second performer video throughout a combined audiovisual performance mix of the captured first and second performer vocals with a backing track and the first and second performer video; and
  
  supplying the first and second remote devices with corresponding, but differing, versions of the combined audiovisual performance mix, wherein the combined audiovisual performance mix supplied to the first remote device features the first performer video and first performer vocals more prominently than the second performer video and second performer vocals, and wherein the combined performance mix supplied to the second remote device features the second performer video and second performer vocals more prominently than the first performer video and first performer vocals.
  - 2. The method of claim 1, wherein the computationally-defined audio feature determined from the first performer vocals includes one or more of a spectral centroid, a measure of tempo correspondence with a melody track, a measure of tempo correspondence with a harmony track, a measure of pitch correspondence with a melody track, a measure of pitch correspondence with a harmony track, a measure of tempo correspondence with a score, and a measure of pitch correspondence with a score.
  - 3. The method of claim 1, wherein the first and second first audiovisual encodings include, in addition to captured vocals, performance synchronized video captured at the respective remote device.
  - 4. The method of claim 1, further comprising:
    - mixing the first performer vocals with the backing track, wherein the mixing results in a first mixed audiovisual performance; and
      
      supplying a second remote device with the first mixed audiovisual performance, wherein the second performer vocals are captured against a local audio rendering, at the second remote device, of the first mixed audiovisual performance.
  - 5. The method of claim 1, wherein the dynamic varying of relative visual prominence includes transitioning between prominent visual presentation of first performer video captured at the first remote device and prominent visual presentation of second performer video captured at the second remote device.
  - 6. The method of claim 5, wherein the transitioning includes switching, wiping or crossfading of respective performer video.
  - 7. The method of claim 5, wherein the transitioning is performed, at least in some cases, prior to a triggering change in relative values of the computationally-defined audio feature.
  - 8. The method of claim 7, wherein the transitioning prominently presents performer video beginning just prior to onset of corresponding prominent vocals.
  - 9. The method of claim 5, wherein transitioning is subject to duration filtering or a hysteresis function.
  - 10. The method of claim 9, wherein duration filtering or hysteresis function parameters are selected to limit excessive visual transitions between performers.
  - 11. The method of claim 5, wherein the transitioning is amongst video corresponding to three or more performers and their respective vocal performances.
  - 12. The method of claim 1, wherein the dynamically varied relative visual prominence includes, for at least some values of the computationally-defined audio feature, visual presentation of both first and second performer video, though with differing visual prominence.
  - 13. The method of claim 1, wherein the dynamically varied relative visual prominence includes, for at least some values of the computationally-defined audio feature, visual presentation of both first and second performer video, with equal levels of visual prominence.
  - 14. The method of claim 1, wherein the dynamically varied relative visual prominence includes, for at least some values of the computationally-defined audio feature, visual presentation of first or second performer video, but not both.
  - 15. The method of claim 1, wherein the computationally-defined audio feature is computed over pre-processed audio signals.
  - 16. The method of claim 15, wherein the pre-processing of the audio signals includes one or more of:
    - application of a bark-band auditory model;
      
      vocal detection; and
      
      noise cancellation.
  - 17. The method of claim 15, wherein the preprocessing is performed, at least in part, at the respective first or second remote device.
  - 18. The method of claim 1, further comprising:
    - inviting via electronic message or social network posting at least the second performer to join the combined audiovisual performance.
  - 19. The method of claim 18, wherein the inviting includes the supplying of the second remote device with the resulting first mixed audiovisual performance.
  - 20. The method of claim 1, wherein the more prominently featured of the first and second performer vocals is pitch-shifted to a vocal melody position in at least one of the corresponding, but differing, combined audiovisual performance mix versions supplied, and wherein a less prominently featured of the first and second performer vocals is pitch-shifted to a harmony position.
  - 21. The method of claim 1, wherein amplitudes of respective spatially differentiated audio channels of the first and second performer vocals are adjusted to provide apparent spatial separation therebetween in the supplied audiovisual performance mix versions.
  - 22. The method of claim 1, further comprising:
    - supplying the first and second remote devices with a vocal score that encodes (i) a sequence of notes for a vocal melody and (ii) at least a first set of harmony notes for at least some portions of the vocal melody, wherein at least one of the received first and second performer vocals is pitch corrected at the respective first or second remote device in accord with the supplied vocal score.
  - 23. The method of claim 1, further comprising:
    - pitch correcting at least one of the received first and second performer vocals in accord with a vocal score that encodes (i) a sequence of notes for a vocal melody and (ii) at least a first set of harmony notes for at least some portions of the vocal melody.
  - 24. The method of claim 1, further comprising:
    - mixing either or both of the first and second performer vocals with the backing track wherein the mixing results in a second mixed audiovisual performance, and supplying a third remote device with the second mixed audiovisual performance; and
      
      receiving via the communication network, a third audiovisual encoding of a third performer, including third performer vocals captured at the third remote device against a local audio rendering of the second mixed performance.
  - 25. The method of claim 24, further comprising:
    - including the captured third performer vocals in the combined audiovisual performance mix.
  - 26. The method of claim 1, wherein the first and second portable computing devices are selected from the group of:
    - a mobile phone;
      
      a personal digital assistant;
      
      a laptop computer, notebook computer, a pad-type computer or netbook.
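
Claims 9 and 10 recite duration filtering or a hysteresis function that limits excessive visual transitions. The following is a minimal sketch of one way such filtering could work, not the patented implementation. The function name `smooth_selection` and the parameters `margin` and `min_hold` are assumptions introduced for illustration: a switch happens only when the other performer's feature exceeds the current one by more than `margin` (hysteresis) and the current performer has been shown for at least `min_hold` frames (duration filtering).

```python
def smooth_selection(feat_a, feat_b, margin=0.1, min_hold=3):
    """Pick performer 0 or 1 per frame, suppressing rapid back-and-forth cuts.

    feat_a, feat_b: per-frame feature values for the two performers.
    margin:   how far the other performer must lead before a switch (hysteresis).
    min_hold: minimum number of frames to stay on a performer (duration filter).
    All parameter names and defaults are hypothetical.
    """
    current = 0 if feat_a[0] >= feat_b[0] else 1
    held = 0  # frames spent on the current performer so far
    out = []
    for a, b in zip(feat_a, feat_b):
        challenger_lead = (b - a) if current == 0 else (a - b)
        if held >= min_hold and challenger_lead > margin:
            current = 1 - current
            held = 0
        out.append(current)
        held += 1
    return out

# Performer 1 briefly edges ahead at frames 1-2, then clearly leads from frame 4.
feat_a = [0.5, 0.5, 0.45, 0.5, 0.1, 0.1, 0.1, 0.1]
feat_b = [0.4, 0.55, 0.5, 0.45, 0.5, 0.5, 0.5, 0.5]

naive = [0 if a >= b else 1 for a, b in zip(feat_a, feat_b)]
smoothed = smooth_selection(feat_a, feat_b)
print(naive)
print(smoothed)
```

The naive per-frame comparison cuts back and forth during the brief early lead, while the filtered sequence makes a single transition once performer 1 leads decisively. Larger `margin` or `min_hold` values trade responsiveness for fewer transitions.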

Current Assignee: Smule, Inc.
Original Assignee: Smule, Inc.
Inventors: Godfrey, Mark T.; Cook, Perry R.
Primary Examiner(s): He, Jialong
Application Number: US14/928,727
Publication Number: US 20160057316A1
Time in Patent Office: 802 Days
CPC Class Codes

G10H 1/366   with means for modifying or...

G10H 2210/066   for pitch analysis as part ...

G10H 2210/331   Note pitch correction, i.e....

G10H 2240/251   Mobile telephone transmissi...

G10L 13/0335   Pitch control

G10L 21/013   Adapting to target pitch

H04N 5/04   Synchronising for televisio...

Y10S 84/04   Chorus; ensemble; celeste
