Ambisonic depth extraction

US 10,609,503 B2
Filed: 12/06/2018
Issued: 03/31/2020
Est. Priority Date: 04/08/2018
Status: Active Grant

Abstract

The systems and methods described herein can be configured to identify, manipulate, and render different audio source components from encoded 3D audio mixes, such as can include content mixed for azimuth, elevation, and/or depth relative to a listener. The systems and methods can be configured to decouple depth encoding and decoding to permit spatial performance to be tailored to a particular playback environment or platform. In an example, the systems and methods improve rendering in applications that involve listener tracking, including tracking over six degrees of freedom (e.g., yaw, pitch, roll orientation, and x, y, z position).

20 Claims

1. A method for positioning a virtual source to be rendered at an intended depth relative to a listener position, the virtual source including information from two or more spatial audio submix signals configured to be spatially rendered together relative to a first listener position, and each of the spatial audio submix signals corresponds to a respective different reference depth relative to a reference position, the method comprising:
- identifying, in each of the spatial audio submix signals, respective candidate components of the virtual source;
  
  determining a first relatedness metric for the identified candidate components of the virtual source from the spatial audio submix signals; and
  
  using the first relatedness metric, determining depths other than the respective reference depths of the spatial audio submix signals at which to render the candidate components from the spatial audio submix signals for a listener at the first listener position such that the listener at the first listener position perceives the virtual source substantially at the intended depth.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)
  - 2. The method of claim 1, further comprising determining a confidence for the first relatedness metric, the confidence indicating a belongingness of the one or more candidate components to the virtual source; and
    - wherein determining the depths at which to render the candidate components includes proportionally adjusting the depths based on the determined confidence, wherein the proportionally adjusting includes positioning the spatial audio signal components along a depth spectrum from their respective reference positions to the intended depth.
  - 3. The method of claim 2, wherein determining the confidence for the first relatedness metric includes using information about a trend, moving average, or smoothed feature of the candidate components.
  - 4. The method of claim 2, wherein determining the confidence for the first relatedness metric includes determining whether respective spatial distributions or directions of two or more of the candidate components correspond.
  - 5. The method of claim 2, wherein determining the confidence for the first relatedness metric includes determining a correlation between at least two of the candidate components of the virtual source.
  - 6. The method of claim 1, wherein determining the first relatedness metric includes using a ratio of respective signal levels of two of the candidate components.
  - 7. The method of claim 1, wherein determining the depths at which to render the candidate components includes:
    - comparing a value of the first relatedness metric with values in a look-up table that includes potential values for the first relatedness metric and respective corresponding depths, and
      
      selecting the depths at which to render the candidate components based on a result of the comparison.
  - 8. The method of claim 1, further comprising rendering an audio output signal for the listener at the first listener position using the candidate components, wherein rendering the audio output signal includes using an HRTF renderer circuit or wavefield synthesis circuit to process the spatial audio submix signals according to the determined depths.
  - 9. The method of claim 1, wherein the spatial audio submix signals comprise multiple time-frequency signals and wherein the identifying the respective candidate components of the virtual source includes identifying candidate components corresponding to discrete frequency bands in the time-frequency signals, and wherein the determining the first relatedness metric includes determining the first relatedness metric for the candidate components corresponding to the discrete frequency bands.
  - 10. The method of claim 1, further comprising receiving information about an updated position of the listener, and determining different updated depths at which to render the candidate components from the spatial audio submix signals for the listener at the updated position such that the listener at the updated position perceives the virtual source substantially at a position corresponding to the intended depth relative to the first listener position.
  - 11. The method of claim 1, further comprising:
    - receiving a first spatial audio submix signal with audio information corresponding to a first depth; and
      
      receiving a second spatial audio submix signal with audio information corresponding to a second depth;
      
      wherein the determining the depths at which to render the candidate components includes determining an intermediate depth between the first and second depths; and
      
      wherein the first and second spatial audio submix signals comprise (1) near-field and far-field submixes, respectively, or (2) first and second ambisonic signals, respectively.
  - 12. The method of claim 1, further comprising determining the intended depth using one or more of depth-indicating metadata associated with the two or more spatial audio submix signals and depth-indicating information implied by a context or content of the two or more spatial audio submix signals.
  - 13. The method of claim 1, further comprising generating a consolidated source signal for the virtual source using the determined depths and the candidate components.
  - 14. The method of claim 1, further comprising:
    - determining whether each of the candidate components of the virtual source includes a directional characteristic; and
      
      if a particular one of the candidate components lacks a directional characteristic, then assigning a directional characteristic for the particular one of the candidate components based on a directional characteristic from a different one of the candidate components of the same virtual source.
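
Claims 6, 7, and 2 can be read together as a small pipeline: a level ratio between matching components serves as the relatedness metric, a look-up table maps that ratio to a render depth, and a confidence value scales how far a component is moved from its reference depth. The following is a minimal sketch under stated assumptions, not the patented implementation. The function names, the table values, the dB form of the ratio, the nearest-entry table match, and the linear interpolation are all hypothetical and do not come from the claims.

```python
import math

# Hypothetical look-up table: near/far level ratio in dB -> render depth in metres.
# The values are invented for illustration only.
RATIO_DB_TO_DEPTH_M = [(-12.0, 4.0), (-6.0, 3.0), (0.0, 2.0), (6.0, 1.0), (12.0, 0.5)]


def level_ratio_db(near_rms, far_rms, eps=1e-12):
    """Relatedness metric (assumed form): ratio of two component levels, in dB."""
    return 20.0 * math.log10((near_rms + eps) / (far_rms + eps))


def depth_from_table(ratio_db, table=RATIO_DB_TO_DEPTH_M):
    """Compare the metric with the table and select the depth of the closest entry."""
    return min(table, key=lambda row: abs(row[0] - ratio_db))[1]


def adjust_depth(reference_depth, intended_depth, confidence):
    """Move a component from its reference depth toward the intended depth.

    The shift is proportional to confidence, clamped to the range 0..1:
    0 keeps the reference depth, 1 lands on the intended depth.
    """
    c = min(max(confidence, 0.0), 1.0)
    return reference_depth + c * (intended_depth - reference_depth)


if __name__ == "__main__":
    intended = depth_from_table(level_ratio_db(0.5, 0.5))  # equal levels -> 0 dB entry
    print(intended)
    print(adjust_depth(4.0, intended, 0.5))  # far-field component, half confidence
```

With equal levels in both submixes, the ratio is 0 dB and the sketch selects the 2.0 m table entry; a far-field component referenced at 4.0 m with confidence 0.5 is then placed halfway, at 3.0 m.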

15. A system for processing audio information to position a virtual audio source to be rendered at an intended depth relative to a listener position, the virtual source including information from two or more spatial audio submix signals configured to be spatially rendered together relative to a first listener position, and each of the spatial audio submix signals corresponds to a respective different reference depth relative to a reference position, the system comprising:
- an audio signal depth processor circuit configured to:
  
  identify, in each of the spatial audio submix signals, respective candidate components of the virtual source;
  
  determine a first relatedness metric for the identified candidate components of the virtual source from the spatial audio submix signals; and
  
  using the first relatedness metric, determine depths other than the respective reference depths of the spatial audio submix signals at which to render the candidate components from the spatial audio submix signals for a listener at the first listener position such that the listener at the first listener position perceives the virtual source substantially at the intended depth.
- Dependent claims (16, 17)
  - 16. The system of claim 15, further comprising a rendering circuit configured to provide an audio output signal for the listener at the first listener position using the candidate components, wherein the audio output signal is provided using HRTF binaural/transaural or wavefield synthesis processing of the spatial audio submix signals according to the determined depths and characteristics of a playback system.
  - 17. The system of claim 15, further comprising a listener head tracker configured to sense information about an updated position of the listener;
    - wherein the processor circuit is configured to determine different updated depths at which to render the candidate components from the spatial audio submix signals for the listener at the updated position such that the listener at the updated position perceives the virtual source substantially at the intended depth relative to the first listener position.
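
Claims 10 and 17 describe recomputing render depths when a head tracker reports an updated listener position. The geometry can be sketched as follows, assuming the virtual source stays fixed in the room at the intended depth along a known direction from the first listener position. The coordinate convention, the function name `updated_depth`, and its parameters are hypothetical and do not come from the claims.

```python
import math


def updated_depth(intended_depth, source_direction, listener_offset):
    """Distance from the moved listener to a source that stays fixed in the room.

    intended_depth:   depth relative to the first listener position, in metres
    source_direction: (x, y, z) vector from the first listener position toward the source
    listener_offset:  (x, y, z) translation of the listener from the first position, in metres
    """
    norm = math.sqrt(sum(c * c for c in source_direction))
    if norm == 0.0:
        raise ValueError("source_direction must be non-zero")
    # Source position in the frame whose origin is the first listener position.
    source_pos = [intended_depth * c / norm for c in source_direction]
    return math.dist(source_pos, listener_offset)


if __name__ == "__main__":
    # Listener steps 0.5 m toward a source intended at 2 m straight ahead (+x).
    print(updated_depth(2.0, (1.0, 0.0, 0.0), (0.5, 0.0, 0.0)))
```

In this sketch only translation changes the depth; yaw, pitch, and roll would change the rendered direction but not the distance.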

18. A method for positioning a virtual source to be rendered at an intended depth relative to a listener position, the virtual source based on information from one or more spatial audio signals and each of the spatial audio signals corresponds to a respective different reference depth relative to a reference position, the method comprising:
- identifying, in each of multiple spatial audio signals, respective candidate components of the virtual source;
  
  determining a first relatedness metric for the identified candidate components of the virtual source from the spatial audio signals; and
  
  determining a confidence for the first relatedness metric, the confidence indicating a belongingness of the one or more candidate components to the virtual source; and
  
  when the confidence for the first metric indicates a correspondence in content and/or location between the identified candidate components, determining first depths at which to render the candidate components for a listener at the first listener position such that the listener perceives the virtual source substantially at the intended depth, wherein at least one of the determined first depths is other than its corresponding reference depth; and
  
  when the confidence for the first relatedness metric indicates a non-correspondence in content or location between the identified candidate components, determining second depths at which to render the candidate components for the listener at the first listener position such that the listener perceives the virtual source substantially at the intended depth, wherein the determined second depths correspond to the reference depths.
- Dependent claims (19, 20)
  - 19. The method of claim 18, wherein determining the confidence for the first relatedness metric includes using information about a trend, moving average, or smoothed feature of the candidate components.
  - 20. The method of claim 18, wherein determining the depths at which to render the candidate components includes proportionally adjusting the reference depths based on the determined confidence, wherein the proportionally adjusting includes positioning the spatial audio signal components along a depth spectrum from their respective reference positions to the intended depth.
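
The branching in claim 18, together with the smoothing of claim 19 and the proportional adjustment of claim 20, can be illustrated with a short sketch. It assumes that confidence is an exponential moving average of the frame-wise correlation between two candidate components and that a fixed threshold separates correspondence from non-correspondence. The class and function names, the smoothing constant, and the threshold are hypothetical; the claims do not specify them.

```python
import math


def correlation(a, b):
    """Normalized zero-lag correlation of two equal-length frames, in -1..1."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


class SmoothedConfidence:
    """Exponential moving average of frame-wise correlation (assumed smoothing)."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.value = 0.0

    def update(self, frame_a, frame_b):
        c = max(correlation(frame_a, frame_b), 0.0)  # treat anti-correlation as no match
        self.value += self.alpha * (c - self.value)
        return self.value


def choose_depths(reference_depths, intended_depth, confidence, threshold=0.5):
    """Keep the reference depths on non-correspondence, otherwise shift them
    toward the intended depth in proportion to the confidence."""
    if confidence < threshold:
        return list(reference_depths)
    return [r + confidence * (intended_depth - r) for r in reference_depths]


if __name__ == "__main__":
    conf = SmoothedConfidence(alpha=0.5)
    for _ in range(3):
        value = conf.update([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # perfectly correlated frames
    print(value, choose_depths([0.5, 4.0], 2.0, value))
```

Because the confidence is smoothed, a single matching frame does not immediately pull the components together; the depths migrate toward the intended depth only as agreement persists across frames.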

Current Assignee: DTS, Inc. (Adeia Inc.)
Original Assignee: DTS, Inc. (Adeia Inc.)
Inventors: Stein, Edward
Primary Examiner(s): Holder, Regina N
Application Number: US16/212,387
Publication Number: US 20190313200A1
Time in Patent Office: 481 Days
CPC Class Codes

G06F 3/011   Arrangements for interactio...

G06F 3/012   Head tracking input arrange...

G06F 3/04815   Interaction with a metaphor...

G06F 3/04845   for image manipulation, e.g...

G06F 3/04847   Interaction techniques to c...

H04S 1/002   Non-adaptive circuits, e.g....

H04S 2400/01   Multi-channel, i.e. more th...

H04S 2400/11   Positioning of individual s...

H04S 2420/01   Enhancing the perception of...

H04S 2420/11   Application of ambisonics i...

H04S 3/008   in which the audio signals ...

H04S 7/303   Tracking of listener positi...
