Method of segmenting an audio stream

US 7,346,516 B2
Filed: 02/21/2003
Issued: 03/18/2008
Est. Priority Date: 02/21/2002
Status: Expired due to Fees

First Claim

1. A method of segmentation of an audio stream, comprising:

receiving the audio stream;

calculating a first-grade characteristic of the audio stream;

calculating a second-grade characteristic of the audio stream; and

performing a decision-making analysis, wherein the segmentation includes a division of the audio stream into segments containing different homogeneous signals based on the first-grade characteristic and the second-grade characteristic of the audio stream;

wherein calculating of the first-grade characteristic is performed by a division of the audio stream into frames for which of them an audio feature vector is calculated;

wherein the audio feature vector includes five formant frequencies, first and second reflection coefficients, an energy of a prediction error coefficient, and a pre-emphasized energy ratio coefficient;

wherein calculating the second-grade characteristic is performed in a sequence of a predefined and not overlapped windows, each of the windows includes a definite number of said frames with said audio feature vectors calculated during the calculating of the first-grade characteristic;

wherein calculating the second-grade characteristic includes calculating a statistical feature vector for each said window;

wherein the statistical feature vector includes two sub-vectors, a first one of said two sub-vectors includes mean values of the formant frequencies and dispersions of the formant frequencies, and a second one of said two sub-vectors includes

a difference between maximal and minimal values of the second reflection coefficient multiplied by the mean value of the second reflection coefficient,

a product of the mean value and the dispersion of the energy of the prediction error coefficient,

a sum of modules of differences between said energies of the prediction error coefficients for said neighboring frames divided by the sum of the modules of differences between said energies of the prediction error coefficients,

a difference between maximal and minimal values of said pre-emphasized energy ratio coefficients, and

a number of said frames in the window in which the first reflection coefficients outnumber a predefined threshold value.
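The window-level statistics listed in the claim can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: the function and argument names are hypothetical, "dispersion" is assumed to mean population variance, "modules" is assumed to mean absolute values, and "outnumber a threshold" is assumed to mean "exceed a threshold". The third item of the second sub-vector is returned as its numerator only, because the claim wording does not make the normalizing denominator clear.

```python
from statistics import fmean, pvariance


def first_subvector(formants):
    """Means and dispersions of the formant frequencies over one window.

    `formants` is a list of frames, each frame a list of five formant
    frequencies. Dispersion is assumed to be population variance.
    """
    columns = list(zip(*formants))  # one column per formant
    means = [fmean(col) for col in columns]
    dispersions = [pvariance(col) for col in columns]
    return means + dispersions


def second_subvector(k1, k2, err_energy, pre_ratio, k1_threshold):
    """Five window-level values; each list holds one value per frame."""
    # (max - min) of the second reflection coefficient, times its mean
    k2_spread = (max(k2) - min(k2)) * fmean(k2)
    # mean times dispersion of the prediction error energy
    energy_stat = fmean(err_energy) * pvariance(err_energy)
    # sum of absolute differences between neighboring frames
    # (numerator only; the denominator is left out as ambiguous)
    neighbor_diff = sum(abs(b - a) for a, b in zip(err_energy, err_energy[1:]))
    # range of the pre-emphasized energy ratio
    ratio_range = max(pre_ratio) - min(pre_ratio)
    # number of frames whose first reflection coefficient exceeds the threshold
    k1_count = sum(1 for v in k1 if v > k1_threshold)
    return (k2_spread, energy_stat, neighbor_diff, ratio_range, k1_count)
```

Concatenating the two results for each window gives one statistical feature vector per window, which is the input the decision-making step would compare across windows.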

Abstract

Disclosed herein is a segmentation method, which divides an input audio stream into segments containing different homogeneous signals. The main objective of this method is localization of segments with stationary properties. This method seeks all non-stationary points or intervals in the audio stream and creates a list of segments. The obtained list of segments can be used as input data for subsequent procedures, such as classification, speech/music/noise attribution and so on. The proposed segmentation method is based on the analysis of variation in the statistical features of the audio signal and comprises three main stages: calculation of first-grade characteristics, calculation of second-grade characteristics, and decision-making.
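The three stages named in the abstract can be illustrated with a deliberately simplified skeleton. Everything in it is an assumption made for illustration: the per-frame feature is plain mean squared amplitude standing in for the feature vector, frames are taken as non-overlapping, and the decision rule (a jump in the window mean above a threshold) is hypothetical, since the abstract does not state the actual rule.

```python
from statistics import fmean, pvariance


def split_frames(samples, frame_len):
    """Cut the stream into consecutive, non-overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def frame_energy(frame):
    # Stage 1 stand-in feature: mean squared amplitude of the frame.
    return fmean([x * x for x in frame])


def window_stats(features, frames_per_window):
    """Stage 2: mean and variance over non-overlapping windows of frames."""
    stats = []
    last_start = len(features) - frames_per_window
    for i in range(0, last_start + 1, frames_per_window):
        window = features[i:i + frames_per_window]
        stats.append((fmean(window), pvariance(window)))
    return stats


def find_boundaries(stats, threshold):
    """Stage 3 (hypothetical rule): boundary where the window mean jumps."""
    return [i for i in range(1, len(stats))
            if abs(stats[i][0] - stats[i - 1][0]) > threshold]


def segment(samples, frame_len, frames_per_window, threshold):
    """Return a list of (start_sample, end_sample) segments."""
    features = [frame_energy(f) for f in split_frames(samples, frame_len)]
    stats = window_stats(features, frames_per_window)
    cuts = find_boundaries(stats, threshold)
    window_len = frame_len * frames_per_window
    edges = [0] + [c * window_len for c in cuts] + [len(stats) * window_len]
    return list(zip(edges[:-1], edges[1:]))


# A quiet stretch followed by a loud stretch yields two segments.
demo = segment([0.1] * 80 + [1.0] * 80, frame_len=10,
               frames_per_window=4, threshold=0.5)
```

In this sketch boundaries can only fall on window edges, so the window length sets the time resolution of the resulting segment list.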

2 Claims

2. The method according to claim 1, wherein the audio stream is a sequence of digital samples which are broadcasted or recorded using some media.

Current Assignee: LG Electronics, Inc. (LG Corporation)
Original Assignee: LG Electronics, Inc. (LG Corporation)
Inventors: Sall, Mikhael A.; Viktorov, Andrei B.; Maiboroda, Alexandr L.; Redkov, Victor V.; Gramnitskiy, Sergei N.; Tikhotsky, Anatoli I.
Primary Examiner(s): Vo; Huyen X.
Application Number: US10/370,065
Publication Number: US 20030171936A1
Time in Patent Office: 1,852 Days
Field of Search: 704/231, 704/273, 704/200, 704/246, 704/251, 704/272, 704/235, 704/260, 704/230, 704/222, 704/219, 704/206, 704/205, 704/207, 704/209, 704/216, 704/217, 704/218, 704/500, 704/501
US Class Current: 704/500
CPC Class Codes:
G10L 15/04 Segmentation; Word boundary...
G10L 25/48 specially adapted for parti...
