SKELETAL MODELING FOR WORLD SPACE OBJECT SOUNDS

US 20130208897A1
Filed: 12/21/2012
Published: 08/15/2013
Est. Priority Date: 10/13/2010
Status: Abandoned Application

Abstract

A method for providing three-dimensional audio includes determining a world space object position and a world space ear position of a human subject based on a modeled virtual skeleton. The method further includes providing three-dimensional audio output to the human subject via an acoustic transducer array including one or more acoustic transducers. The three-dimensional audio output is configured such that sounds appear to originate from the object.

20 Claims

1. A method for providing three-dimensional audio, comprising:
- receiving a depth map imaging a scene from a depth camera;
  
  recognizing a human subject present in the scene;
  
  modeling the human subject with a virtual skeleton comprising a plurality of joints defined with a three-dimensional position;
  
  determining, based on the virtual skeleton, a world space ear position of the human subject;
  
  recognizing an object present in the scene;
  
  determining a world space object position of the object;
  
  recognizing audio input information encoding a sound;
  
  determining one or more audio-output transformations based on the world space ear position, the one or more audio-output transformations configured to produce a three-dimensional audio output from the audio input information, the three-dimensional audio output configured such that at the world space ear position the sound appears to originate from the world space object position; and
  
  providing the three-dimensional audio output to the human subject via an acoustic transducer array comprising one or more acoustic transducers.
  - 2. The method of claim 1, wherein the object present in the scene is a moving object, the method further comprising:
    - determining a second world space object position of the object; and
      
      updating the one or more audio-output transformations to produce updated three-dimensional audio output from the audio input information, the updated three-dimensional audio output configured such that at the world space ear position the sound appears to originate from the second world space object position.
  - 3. The method of claim 1, wherein the object present in the scene is an anatomical structure of the human subject, wherein the world space object position is determined based on the virtual skeleton, and wherein the three-dimensional audio output is provided such that the audio output appears to originate from the anatomical structure.
  - 4. The method of claim 1, wherein determining the world space ear position includes:
    - recognizing one or more joints of the virtual skeleton;
      
      recognizing depth information in the depth map that corresponds to the one or more joints; and
      
      estimating the world space ear position based on the depth information.
  - 5. The method of claim 4, wherein the one or more joints include one or more neck joints.
  - 6. The method of claim 1, wherein determining the world space ear position comprises:
    - recognizing one or more joints of the virtual skeleton;
      
      receiving color information imaging the scene from one or more color image sensors;
      
      recognizing a portion of the color information that corresponds to the one or more joints; and
      
      estimating the world space ear position based on the portion of the color information.
  - 7. The method of claim 6, wherein recognizing the portion of the color information includes recognizing one or more anatomical structures of the human subject imaged by the color information.
  - 8. The method of claim 7, wherein the one or more anatomical structures include one or both ears of the human subject.
  - 9. The method of claim 7, wherein the one or more anatomical structures include a mouth of the human subject.
  - 10. The method of claim 1, wherein the one or more audio-output transformations include a head-related transfer function (HRTF).
  - 11. The method of claim 10, wherein determining the HRTF comprises:
    - recognizing depth information in the depth map that corresponds to a head of the human subject; and
      
      calculating the HRTF based on the depth information.
  - 12. The method of claim 1, wherein the one or more audio-output parameters include a crosstalk cancellation transformation, wherein determining the crosstalk cancellation transformation includes:
    - determining a world space transducer position of the acoustic transducer array; and
      
      determining the crosstalk cancellation transformation based on a spatial relationship between the world space transducer position and the world space ear position.
  - 13. The method of claim 12, wherein determining the world space transducer position comprises:
    - providing calibration audio output to the acoustic transducer array;
      
      receiving acoustic sensor information from one or more acoustic sensors during output of the calibration audio by the acoustic transducer array; and
      
      identifying the world space transducer position based on the calibration audio output and the acoustic sensor information.
  - 14. The method of claim 1, further comprising:
    - recognizing a second human subject present in the scene;
      
      modeling the second human subject with a second virtual skeleton comprising a plurality of joints defined with a three-dimensional position;
      
      determining, based on the second virtual skeleton, a world space ear position of the second human subject; and
      
      wherein determining the one or more audio-output transformations is further based on the world space ear position of the second human subject, the three-dimensional audio output configured such that at the world space ear position of the human subject and at the world space ear position of the second human subject the sound appears to originate from the world space object position.
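
Claims 1, 4, and 5 derive a world space ear position from joints of the virtual skeleton, but they do not say how. The following is a minimal sketch under stated assumptions, not the method of the application: it assumes the skeleton exposes a head joint and two shoulder joints as `(x, y, z)` tuples in meters, and that the ears sit a fixed half head width either side of the head joint along the shoulder axis. The function names and the `HALF_HEAD_WIDTH_M` constant are hypothetical.

```python
import math

# Assumed half-width of an adult head in meters (hypothetical constant,
# not taken from the claims).
HALF_HEAD_WIDTH_M = 0.075


def estimate_ear_positions(head, left_shoulder, right_shoulder,
                           half_width=HALF_HEAD_WIDTH_M):
    """Return (left_ear, right_ear) world space positions as (x, y, z) tuples.

    The lateral axis of the head is approximated by the shoulder-to-shoulder
    direction, which is an assumption of this sketch.
    """
    axis = [r - l for l, r in zip(left_shoulder, right_shoulder)]
    length = math.sqrt(sum(c * c for c in axis))
    if length == 0.0:
        raise ValueError("shoulder joints coincide; cannot derive a lateral axis")
    unit = [c / length for c in axis]
    left_ear = tuple(h - half_width * u for h, u in zip(head, unit))
    right_ear = tuple(h + half_width * u for h, u in zip(head, unit))
    return left_ear, right_ear


def ear_to_object_distances(left_ear, right_ear, object_position):
    """Return the distance in meters from each ear to the world space object position."""
    return (math.dist(left_ear, object_position),
            math.dist(right_ear, object_position))
```

For a moving object (claim 2), the same functions could be called again with the second world space object position to refresh the per-ear distances.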

15. A three-dimensional audio system, comprising:
- a depth camera input to receive a depth map imaging a scene from one or more depth cameras;
  
  an audio input;
  
  an audio output to provide three-dimensional audio output information to an acoustic transducer array comprising one or more acoustic transducers;
  
  a logic subsystem; and
  
  a storage subsystem storing instructions that are executable by the logic subsystem to:
  
  receive the depth map;
  
  recognize a human subject present in the scene;
  
  model the human subject with a virtual skeleton comprising a plurality of joints defined with a three-dimensional position;
  
  determine, based on the virtual skeleton, a world space ear position of the human subject;
  
  recognize an object present in the scene;
  
  determine a world space object position of the object;
  
  receive the audio input information via the audio input;
  
  determine one or more audio-output transformations based on the world space ear position of the human subject, the one or more audio-output transformations configured to produce three-dimensional audio output information from the audio input information, the three-dimensional audio output information configured to effect the acoustic transducer array to provide a three-dimensional audio output such that at the world space ear position the sound appears to originate from the world space object position; and
  
  provide the three-dimensional audio output information to the acoustic transducer array such that the acoustic transducer array provides the three-dimensional audio output to the human subject.
  - 16. The three-dimensional audio system of claim 15, wherein the one or more audio-output transformations include a head-related transfer function (HRTF), and wherein determining the HRTF comprises:
    - recognizing depth information in the depth map that corresponds to a head of the human subject; and
      
      calculating the HRTF based on the depth information.
  - 17. The three-dimensional audio system of claim 15, wherein the one or more audio-output parameters include a crosstalk cancellation transformation, wherein determining the crosstalk cancellation transformation includes:
    - determining a world space transducer position of the acoustic transducer array; and
      
      determining the crosstalk cancellation transformation based on a spatial relationship between the world space transducer position and the world space ear position.
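
Claims 12 and 17 base the crosstalk cancellation transformation on the spatial relationship between the transducer position and the ear position. One way to illustrate that dependency, as a sketch only, is to model each speaker-to-ear path as a free-field point source at a single frequency and invert the resulting 2x2 acoustic matrix. The free-field model, the single-frequency treatment, and all function names here are assumptions; a practical system would likely need head-shadowed responses and regularized inversion across the full band.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at room temperature


def transfer(source, ear, freq_hz):
    """Free-field point-source transfer from a speaker to an ear (assumed model)."""
    r = math.dist(source, ear)
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND_M_S  # wavenumber
    return cmath.exp(-1j * k * r) / r


def acoustic_matrix(speakers, ears, freq_hz):
    """Return H where H[i][j] is the path from speaker j to ear i."""
    return [[transfer(s, e, freq_hz) for s in speakers] for e in ears]


def crosstalk_canceller(speakers, ears, freq_hz):
    """Return the 2x2 inverse of H, so that H @ C is the identity at freq_hz."""
    (a, b), (c, d) = acoustic_matrix(speakers, ears, freq_hz)
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("acoustic matrix is singular for this geometry")
    return [[d / det, -b / det], [-c / det, a / det]]
```

Applying `crosstalk_canceller` to the binaural signal pair before playback would, under this model, deliver the left channel only to the left ear and the right channel only to the right ear at the chosen frequency.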

18. A method of providing three-dimensional audio, comprising:
- receiving a depth map imaging a scene from a depth camera;
  
  recognizing a human subject present in the scene;
  
  modeling the human subject with a virtual skeleton comprising a plurality of joints defined with a three-dimensional position;
  
  determining, based on the virtual skeleton, a world space ear position of the human subject;
  
  determining a world space object position of an object present in the scene;
  
  recognizing audio input information encoding a sound;
  
  determining a head related transfer function (HRTF) for the human subject;
  
  determining a crosstalk cancellation transformation based on a spatial relationship between the world space ear position and a world space transducer position of the one or more acoustic transducers;
  
  producing a three-dimensional audio output from the audio input information, the HRTF, and the crosstalk cancellation transformation, the three-dimensional audio output configured such that at the world space ear position the sound appears to originate from the world space object position; and
  
  providing the three-dimensional audio output to the human subject via the one or more acoustic transducers.
  - 19. The method of claim 18, wherein determining the HRTF includes:
    - recognizing one or more joints of the virtual skeleton;
      
      recognizing depth information in the depth map that corresponds to the one or more joints; and
      
      calculating the HRTF based on the depth information.
  - 20. The method of claim 18, wherein the object present in the scene is a moving object, the method further comprising:
    - determining a second world space object position of the object; and
      
      updating the one or more audio-output transformations to produce updated three-dimensional audio output from the audio input information, the updated three-dimensional audio output configured such that at the world space ear position the sound appears to originate from the second world space object position.
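
Claims 11 and 19 calculate the HRTF from depth information, and claims 2 and 20 update the output when the object moves; neither states a formula. As a hedged illustration, the sketch below computes only the interaural time difference (ITD), one of the cues an HRTF encodes, using the Woodworth spherical-head approximation. It assumes a head radius already measured from the depth map, a coordinate frame with y up, and a known facing direction; the function names and sign convention are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air


def azimuth_rad(head, forward, obj):
    """Horizontal angle from the facing direction to the object.

    Assumes y is up; positive angles point toward +x when facing +z
    (a convention chosen for this sketch).
    """
    dx, dz = obj[0] - head[0], obj[2] - head[2]
    fx, fz = forward[0], forward[2]
    return math.atan2(fz * dx - fx * dz, fx * dx + fz * dz)


def woodworth_itd_s(azimuth, head_radius_m):
    """Signed ITD in seconds from the Woodworth model: (a / c) * (theta + sin(theta))."""
    theta = abs(azimuth)
    if theta > math.pi / 2:
        theta = math.pi - theta  # fold sources behind the head onto the front
    itd = head_radius_m / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))
    return math.copysign(itd, azimuth)
```

When the object moves to a second world space position, calling `azimuth_rad` and `woodworth_itd_s` again with the new position yields the updated delay.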

Current Assignee
Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee
Microsoft Corporation
Inventors
Flaks, Jason; Tardif, John; Vincent, Jon; Pile, Shawn; Amdur, Daniel M.; Bar-Zeev, Avi

Application Number

US13/725,571
Publication Number

US 20130208897A1
US Class Current

381/17
CPC Class Codes

A63F 13/10   Control of the course of th...

A63F 13/213   comprising photodetecting m...

A63F 13/428   involving motion or positio...

A63F 13/45   Controlling the progress of...

A63F 13/54   involving acoustic signals,...

A63F 2300/1012   involving biosensors worn b...

A63F 2300/6081   generating an output signal...

H04R 5/04   Circuit arrangements, e.g. ...

H04S 2400/11   Positioning of individual s...

H04S 2420/01   Enhancing the perception of...

H04S 7/303   Tracking of listener positi...
