Voiced/unvoiced decision based on frequency band ratio

US 5,960,388 A
Filed: 06/09/1997
Issued: 09/28/1999
Est. Priority Date: 03/18/1992
Status: Expired due to Term

Abstract

An input audio signal is divided on a block-by-block basis, and frequency domain conversion is performed on each of the blocks. If it is decided that there are one or more shift points in the voiced (V)/unvoiced (UV) decision data of all the bands, the voiced bands of the frequency domain data for one of the blocks are searched for the voiced band B_VH with the highest center frequency. The number N_V of voiced bands having a center frequency less than that of the band B_VH is found, so as to decide whether the proportion of voiced bands is equal to or higher than a predetermined threshold N_th, thereby deciding one V/UV boundary point. It is thus possible to replace the V/UV decision data for each band with information on one demarcation across all bands, thereby reducing data volume and bit rate.
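The data reduction described in the abstract can be illustrated with a short sketch. It is not taken from the patent: the helper names `expand_boundary` and `bits_per_block` are hypothetical, the boundary is assumed to be stored as a count of voiced bands from the lowest band upward, and a plain fixed-length binary code is assumed for the boundary value.

```python
from typing import List, Tuple


def expand_boundary(boundary: int, num_bands: int) -> List[bool]:
    """Rebuild per-band V/UV flags from a single boundary value.

    Assumption: `boundary` is the number of bands, counted from the lowest
    center frequency, that are treated as voiced; all bands above are unvoiced.
    """
    return [i < boundary for i in range(num_bands)]


def bits_per_block(num_bands: int) -> Tuple[int, int]:
    """Compare the bits needed per block for the two representations.

    Per-band flags cost one bit per band. A single boundary takes one of
    num_bands + 1 values (0..num_bands), which a fixed-length code stores in
    num_bands.bit_length() bits when num_bands >= 1.
    """
    per_band_bits = num_bands
    boundary_bits = num_bands.bit_length()
    return per_band_bits, boundary_bits


print(expand_boundary(3, 5))  # [True, True, True, False, False]
print(bits_per_block(12))     # (12, 4)
```

Under these assumptions a block with 12 bands needs 4 bits for the boundary instead of 12 bits for individual flags. The actual saving depends on the band count and the coding used, neither of which is given in the abstract.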

4 Claims

1. A method for processing an audio signal, comprising the steps of:
   - generating frequency domain data by dividing an input audio signal on a block-by-block basis thereby determining blocks of data, and performing time domain to frequency domain conversion on each of the blocks thereby generating the frequency domain data;
   - dividing the frequency domain data for at least one of the blocks into plural bands;
   - deciding, for each of the bands for one of the blocks, whether said each of the bands is voiced or unvoiced;
   - if at least one of the bands for said one of the blocks is voiced, identifying as a highest frequency voiced band a voiced band whose center frequency is F, where F is the highest center frequency among said at least one of the bands for said one of the blocks which are voiced; and
   - generating boundary point data indicative of a boundary point between a voiced sound region and an unvoiced sound region of said one of the blocks in accordance with the number B_VH of bands of the frequency domain data for said one of the blocks which have center frequency less than the center frequency F.
2. The method of claim 1, wherein the step of generating the boundary point data includes the steps of:
   - determining a ratio R = N_V / (B_VH + N), where N_V is the number of voiced bands for said one of the blocks, and N is an integer; and
   - generating the boundary point data to be indicative of the voiced band whose center frequency is F, if the ratio R is not less than a predetermined threshold.
3. The method of claim 2, wherein N = 1.
4. The method of claim 2, wherein the step of generating the boundary point data includes the step of:
   - generating the boundary point data to be indicative of the voiced band including frequency F₂, if the ratio R is less than the predetermined threshold, where F₂ = kF and k is a constant satisfying 0 < k < 1.
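The decision rule in claims 2 to 4 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `boundary_band`, the equal-width band layout used to locate F₂, and the default values of `threshold` and `k` are assumptions introduced here. The claims only require a predetermined threshold and a constant k with 0 < k < 1.

```python
from typing import List, Optional


def boundary_band(
    vuv: List[bool],
    threshold: float = 0.8,  # illustrative value, not from the patent
    n: int = 1,              # N = 1 as in claim 3
    k: float = 0.5,          # illustrative value; claim 4 only requires 0 < k < 1
) -> Optional[int]:
    """Return the index of the band that marks the V/UV boundary.

    vuv[i] is True when band i is voiced; bands are ordered by ascending
    center frequency. Returns None when no band is voiced.
    """
    voiced = [i for i, is_voiced in enumerate(vuv) if is_voiced]
    if not voiced:
        return None

    top = voiced[-1]   # highest-frequency voiced band (center frequency F)
    b_vh = top         # number of bands with center frequency below F
    n_v = len(voiced)  # number of voiced bands in the block

    ratio = n_v / (b_vh + n)  # R = N_V / (B_VH + N)
    if ratio >= threshold:
        return top     # claim 2: boundary at the band centered on F

    # Claim 4: fall back to the band containing F2 = k * F.
    # Assumption: equal-width bands, so band i spans [i, i + 1) in units of
    # one band width and has its center at i + 0.5.
    f2 = k * (top + 0.5)
    return int(f2)


print(boundary_band([True, True, True, True, False, True, False, False]))     # 5
print(boundary_band([True, False, False, False, False, True, False, False]))  # 2
print(boundary_band([False] * 8))                                              # None
```

With N = 1, the denominator B_VH + 1 equals the number of bands at or below F, so R is the fraction of those bands that are voiced. In the first call R = 5/6, which meets the threshold, and the boundary stays at the top voiced band. In the second call R = 2/6, so the boundary drops to the band containing kF.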

Current Assignee: Sony Corporation (Sony Group Corp.)
Original Assignee: Sony Corporation (Sony Group Corp.)
Inventors: Nishiguchi, Masayuki; Matsumoto, Jun; Ono, Shinobu
Primary Examiner(s): Knepper, David D.
Application Number: US08/871,335
Time in Patent Office: 841 Days
Field of Search: 704/200, 704/201, 704/205-209, 704/214, 704/229, 704/230, 704/219, 370/62
US Class Current: 704/208

CPC Class Codes
- G10L 19/0212   using orthogonal transforma...
- G10L 19/038   Vector quantisation, e.g. T...
- G10L 19/04   using predictive techniques
- G10L 19/10   the excitation function bei...
- G10L 19/12   the excitation function bei...
- G10L 19/18   Vocoders using multiple modes
- G10L 2019/0005   Multi-stage vector quantisa...
- G10L 2025/937   Signal energy in various fr...
- G10L 25/27   characterised by the analys...
- G10L 25/90   Pitch determination of spee...
- G10L 25/93   Discriminating between voic...
