Sound alignment using timing information

US 9,355,649 B2
Filed: 11/13/2012
Issued: 05/31/2016
Est. Priority Date: 11/13/2012
Status: Active Grant

Abstract

Sound alignment techniques that employ timing information are described. In one or more implementations, features and timing information of sound data generated from a first sound signal are identified and used to identify features of sound data generated from a second sound signal. The identified features may then be utilized to align portions of the sound data from the first and second sound signals to each other.

214 Citations

20 Claims

1. A method implemented by one or more computing devices, the method comprising:
- identifying features of first sound data generated from a first sound signal using a feature module, the features including bases that describe spectral characteristics of the first sound data and weights that describe temporal features of the first sound data;
  
  identifying timing information of the first sound data using a timing module, the timing information being a cross-correlation of the weights for different frames of the first sound data;
  
  estimating parameters of the features and the timing information of the first sound data;
  
  processing second sound data generated from a second sound signal to identify second features and second timing information of the second sound data that are within the estimated parameters of the first sound data;
  
  extracting the identified features of the first sound data;
  
  inserting the extracted identified features of the first sound data into the second sound data based on the second features and second timing information of the second sound data, the inserting effective to provide altered second sound data; and
  
  producing the altered second sound data with the extracted identified features of the first sound data.
- Dependent claims (2, 3, 10, 11, 12, 13, 15, 16):
  - 2. A method as described in claim 1, wherein the timing information is expressed using a transition matrix that is computed as the cross-correlation of the weights for different frames in the first sound data generated from the first sound signal.
  - 3. A method as described in claim 1, wherein the processing of the second sound data generated from the second sound signal is performed iteratively by estimating a new set of weights for the first features of the first sound data generated from the first sound signal.
  - 10. The method of claim 1, wherein the bases that describe spectral characteristics further comprise spectral basis vectors that are building blocks of the first and second sound data.
  - 11. The method of claim 10, wherein the weights that describe temporal features of the first and second sound data define a temporal evolution of a signal such that at each instance of the signal, the signal may be defined by a linear combination of the spectral basis vectors.
  - 12. The method of claim 1, wherein the features and the second features further comprise speech bases and speech weights that describe vocal characteristics of spoken sound in the first and second sound data, respectively.
  - 13. The method of claim 1, wherein the features and the second features further comprise noise bases and noise weights that describe background noise in the first and second sound data, respectively.
  - 15. The method of claim 1, the method further comprising modifying the extracted identified features of the first sound data to match the timing information of the identified second features of the second sound data prior to inserting into the second sound data.
  - 16. The method of claim 15, the modifying comprising stretching, compressing, warping, or shifting.
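
Claims 2 and 11 can be illustrated with a small sketch. It is not taken from the specification: the matrix layouts, the one-frame lag, the row normalization, and the function names `reconstruct_frame` and `transition_matrix` are all assumptions made for illustration. The sketch shows a frame rebuilt as a linear combination of spectral basis vectors, and a transition matrix obtained by cross-correlating the weights of neighboring frames.

```python
def reconstruct_frame(bases, weights, t):
    """Frame t as a linear combination of spectral basis vectors.

    bases[f][k]   : magnitude of basis k at frequency bin f (assumed layout)
    weights[k][t] : activation of basis k at frame t (assumed layout)
    """
    n_bins, n_bases = len(bases), len(weights)
    return [sum(bases[f][k] * weights[k][t] for k in range(n_bases))
            for f in range(n_bins)]


def transition_matrix(weights, lag=1):
    """Cross-correlate basis weights between frames t and t + lag.

    Entry [i][j] accumulates weights[i][t] * weights[j][t + lag]. Each row is
    then scaled to sum to 1; that normalization is an assumption of this
    sketch, not something the claims state.
    """
    n_bases, n_frames = len(weights), len(weights[0])
    raw = [[sum(weights[i][t] * weights[j][t + lag]
                for t in range(n_frames - lag))
            for j in range(n_bases)]
           for i in range(n_bases)]
    normalized = []
    for row in raw:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


# Toy data: two bases over three frequency bins, four frames.
bases = [[1.0, 0.0],
         [0.5, 0.5],
         [0.0, 1.0]]
weights = [[1.0, 1.0, 0.0, 0.0],   # basis 0 is active in frames 0-1
           [0.0, 0.0, 1.0, 1.0]]   # basis 1 is active in frames 2-3

frame0 = reconstruct_frame(bases, weights, 0)   # [1.0, 0.5, 0.0]
T = transition_matrix(weights)                  # [[0.5, 0.5], [0.0, 1.0]]
```

In this toy case, row 0 of `T` says that basis 0 is followed equally often by itself and by basis 1, while basis 1 is only ever followed by itself.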

4. A system comprising:
- at least one extraction module implemented at least partially in hardware and configured to process sound data generated from a first and a second sound signal and identify features and timing information common to the first and second sound signals, the identification of features and timing information based on an estimated set of parameters for the features and timing information of the sound data generated from the first sound signal;
  
  the features including bases that describe spectral characteristics of the sound data and weights that describe temporal features of the sound data;
  
  the timing information being a cross-correlation of the weights for different frames of the sound data; and
  
  one or more modules implemented at least partially in hardware and configured to extract the identified features of the sound data from the first sound signal and insert the identified features of the sound data from the first sound signal into the second sound signal to produce altered sound data from the second sound signal with the identified features of the first sound signal.
- Dependent claims (5, 6, 14, 17, 18):
  - 5. A system as described in claim 4, wherein the timing information is expressed using a transition matrix that is computed as a cross-correlation of the weights for different frames in the sound data generated from the first sound signal.
  - 6. A system as described in claim 4, wherein the at least one extraction module is further configured to estimate the set of parameters for the features and timing information of the sound data generated from the first sound signal and iteratively narrow the set of parameters to identify corresponding features of the sound data generated from the second sound signal.
  - 14. The system of claim 4, further comprising a parameter module implemented at least partially in hardware, the parameter module configured to estimate the set of parameters for the features and timing information identified in the first sound signal and to pass the set of parameters to the extraction module.
  - 17. The method of claim 4, the at least one extraction module further configured to modify the extracted identified features of the sound data from the first sound signal to match the timing information of the identified features of the second sound signal prior to inserting into the second sound signal.
  - 18. The method of claim 17, the at least one extraction module configured to modify the extracted features of the first sound signal by stretching, compressing, warping, or shifting.
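
One way to picture the iterative estimation in claim 6 is to hold the bases learned from the first signal fixed and repeatedly refine a set of weights for the second signal. The sketch below uses the multiplicative update rule commonly used for non-negative matrix factorization; the claims do not name an update rule, so that choice, the function name `estimate_weights`, the flat initial guess, and the toy matrices are all assumptions.

```python
def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*a)]


def estimate_weights(spectrogram, bases, n_iter=200, eps=1e-12):
    """Estimate non-negative weights H with V ~= W H, holding W fixed.

    spectrogram[f][t] : magnitude at bin f, frame t (V, second signal)
    bases[f][k]       : basis k at bin f (W, learned from the first signal)
    """
    n_bases, n_frames = len(bases[0]), len(spectrogram[0])
    weights = [[1.0] * n_frames for _ in range(n_bases)]  # flat initial guess
    bases_t = transpose(bases)
    numer = matmul(bases_t, spectrogram)   # W^T V, constant across iterations
    gram = matmul(bases_t, bases)          # W^T W
    for _ in range(n_iter):
        denom = matmul(gram, weights)      # W^T W H
        # Multiplicative update: H <- H * (W^T V) / (W^T W H)
        weights = [[weights[k][t] * numer[k][t] / (denom[k][t] + eps)
                    for t in range(n_frames)]
                   for k in range(n_bases)]
    return weights


# Toy data: the second signal is an exact mixture of the first signal's bases.
bases = [[1.0, 0.0],
         [0.5, 0.5],
         [0.0, 1.0]]
true_weights = [[2.0, 1.0],
                [1.0, 3.0]]
second_spectrogram = matmul(bases, true_weights)

estimated = estimate_weights(second_spectrogram, bases)
```

Because the bases stay fixed, only the weights change between iterations, and on this toy input they settle close to `true_weights`.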

7. One or more computer-readable and non-transitory storage media having instructions stored thereon that, responsive to execution on a computing device, causes the computing device to perform operations comprising:
- identifying features and timing information of sound data of a first sound signal, the identified features including bases that describe spectral characteristics of the sound data and weights that describe temporal features of the sound data and the timing information is computed as a cross-correlation of the weights for different frames in the sound data generated from the first sound signal;
  
  estimating parameters for the features and timing information of the sound data of the first sound signal;
  
  processing sound data generated from a second sound signal to identify second features and second timing information that are within the estimated parameters from the sound data generated from the first sound signal;
  
  extracting the identified features of the sound data from the first sound signal;
  
  inserting the extracted identified features of the first sound signal into the second sound signal based on the second features and second timing information of the second sound signal, the inserting effective to provide altered second sound data; and
  
  producing the altered second sound signal with the extracted identified features of the first sound signal.
- Dependent claims (8, 9, 19, 20):
  - 8. One or more computer-readable and non-transitory storage media as described in claim 7, wherein the identifying of the second features from the second sound signal is performed iteratively by estimating a new set of weights for the second features of the sound data generated from the second sound signal.
  - 9. One or more computer-readable and non-transitory storage media as described in claim 7, the instructions further comprising modifying portions of the sound data for the first or second sound signals by stretching or compressing the first or second sound signal.
  - 19. One or more computer-readable storage media as described in claim 7, the instructions further comprising modifying the extracted identified features of the first sound signal to match the timing information of the identified second features of the second sound signal prior to inserting into the second sound data.
  - 20. The method of claim 19, the modifying comprising stretching, compressing, warping, or shifting.
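
The stretching and compressing named in claims 9, 16, 18, and 20 can be sketched as resampling the weight rows along the time axis so that features extracted from the first signal span the same number of frames as the matching region of the second signal. Linear interpolation and the function name `resample_weights` are assumptions; the claims do not say how the modification is carried out.

```python
def resample_weights(weights, n_target):
    """Stretch or compress each weight row to n_target frames.

    Uses linear interpolation along the frame axis (an assumed method).
    weights[k][t] is the activation of basis k at frame t.
    """
    resampled = []
    for row in weights:
        n_src = len(row)
        new_row = []
        for j in range(n_target):
            # Map target frame j onto the source time axis.
            pos = j * (n_src - 1) / (n_target - 1) if n_target > 1 else 0.0
            lo = int(pos)
            hi = min(lo + 1, n_src - 1)
            frac = pos - lo
            new_row.append((1.0 - frac) * row[lo] + frac * row[hi])
        resampled.append(new_row)
    return resampled


# Stretch three frames to five, then compress five frames to three.
stretched = resample_weights([[0.0, 2.0, 4.0]], 5)
compressed = resample_weights([[0.0, 1.0, 2.0, 3.0, 4.0]], 3)
```

Resampling the weights rather than the waveform leaves the spectral bases untouched, so only the temporal evolution changes.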

Current Assignee: Adobe Inc.
Original Assignee: Adobe Systems Incorporated (Adobe Inc.)
Inventors: King, Brian John; Mysore, Gautham J.; Smaragdis, Paris
Primary Examiner(s): MCCORD, PAUL C
Application Number: US13/675,711
Publication Number: US 20140135962A1
Time in Patent Office: 1,295 Days
Field of Search: 700/94
US Class Current: 1/1
CPC Class Codes:
G10L 25/48   specially adapted for parti...
G11B 27/10   Indexing; Addressing; Timin...
G11B 27/28   by using information signal...
H04H 60/04   Studio equipment; Interconn...
H04N 21/42203   sound input device, e.g. mi...
H04N 21/4394   involving operations for an...
