Method and apparatus for dynamically adjusting the playout delay of audio signals

US 7,881,284 B2
Filed: 05/04/2006
Issued: 02/01/2011
Est. Priority Date: 03/10/2006
Status: Active Grant

Abstract

Disclosed are a method and an apparatus for dynamically adjusting the playout delay of audio signals. The adjustment has three parts: playout delay, silence length, and jitter buffer size. The playout delay is adjusted in real time according to the probability distribution of the number of packets buffered in a jitter buffer. A voice activity detection mechanism detects silence within a voice packet. By dynamically adjusting the silence length in the voice packets, the invention reduces the impact of network variation on voice quality. It also overcomes a drawback of conventional techniques for estimating playout delay and reduces the overall computational complexity of determining the playout delay for the voice packets.

6 Claims

1. A method for dynamically adjusting playout delay of audio signals encoded into a sequence of voice packets and transmitted from a transmitting end through a packet-switched network to a receiving end, said method comprising the steps of:
- storing a plurality of said voice packets in a jitter buffer at said receiving end, and dynamically determining whether to adjust silence length in said voice packets based on the number of said voice packets in said jitter buffer in order to adjust said playout delay;
  
  dividing said jitter buffer into three zones for temporarily storing said voice packets, and providing dynamic adjustment of silence length to extend or shrink said playout delay; and
  
  dynamically adjusting the sizes of said three zones of said jitter buffer according to the number of said voice packets in said jitter buffer;
  
  wherein said step of dynamically adjusting the sizes of said three zones further comprises the steps of:
  
  mapping said jitter buffer into five zones according to the number of said voice packets in said jitter buffer, said five zones including a no data to play zone A0, an extending silence zone A1, a normal delay zone A2, a shrinking silence zone A3, and a discarding voice packet zone A4, thereby said jitter buffer being divided into said zone A1, said zone A2, and said zone A3 with said zone A2 having a lower bound of normal delay L and an upper bound of normal delay U;
  
  using a probability model to obtain P_Tn(Ai) of said zone Ai over a next time interval [Tn,Tn+1], said P_Tn(Ai) being the probability that the number of said voice packets in said jitter buffer falls into said zone Ai in the time interval [Tn,Tn+1], i being an integer number from 0 to 4 and n being a natural number; and
  
  comparing pre-defined values T_A0, T_A1 and T_A3, with said probability P_Tn(A0), P_Tn(A1), and P_Tn(A3) to determine whether to adjust said upper bound of normal delay U and said lower bound of normal delay L.
- - 2. The method as claimed in claim 1, wherein said upper bound of normal delay U and said lower bound of normal delay L are adjusted according to the steps of:
    - increasing both said upper bound of normal delay U and said lower bound of normal delay L when P_Tn(A0) is greater than T_A0;
      
      decreasing both said upper bound of normal delay U and said lower bound of normal delay L when P_Tn(A0) is less than T_A0;
      
      increasing said upper bound of normal delay U and decreasing said lower bound of normal delay L when P_Tn(A1) is greater than T_A1 and P_Tn(A3) is greater than T_A3; and
      
      decreasing said upper bound of normal delay U and increasing said lower bound of normal delay L when P_Tn(A1) is less than T_A1 and P_Tn(A3) is less than T_A3.
  - 3. The method as claimed in claim 2, wherein said P_Tn(Ai) is computed using the steps of:
    - initializing P_T0(Ai) for said zone Ai;
      
      predicting P_Tn(Ai) using previous P_Tn−1(Ai) and P_Tn−1,Tn(Ai), said P_Tn−1,Tn(Ai) being the probability that the number of said voice packets in said jitter buffer falls into said zone Ai in a time interval [Tn−1,Tn]; and
      
      computing P_Tn(Ai) as P_Tn(Ai) = P_Tn−1,Tn(Ai) × α + P_Tn−1(Ai) × (1 − α), wherein α is a parameter used to determine sensitivity of P_Tn(Ai) to network jitter, and P_Tn(A0) + P_Tn(A1) + P_Tn(A2) + P_Tn(A3) + P_Tn(A4) = 1.
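
The probability update of claim 3 and the threshold rules of claim 2 can be sketched in a few lines of Python. This is only an illustrative reading of the claim language, not the patentee's implementation. The function names, the `step` size, the order in which the rules are applied, and the final clamping of the bounds are all assumptions that the claims do not specify.

```python
ZONES = ("A0", "A1", "A2", "A3", "A4")


def update_probabilities(prev, observed, alpha):
    """Smoothing update recited in claim 3, applied to every zone Ai.

    prev[z]     : P_Tn-1(Ai), the previous prediction
    observed[z] : P_Tn-1,Tn(Ai), the probability seen over [Tn-1, Tn]
    alpha       : sensitivity to network jitter, assumed to lie in [0, 1]
    """
    return {z: observed[z] * alpha + prev[z] * (1.0 - alpha) for z in ZONES}


def adjust_bounds(lower, upper, p, thresholds, step=1):
    """Apply the four rules of claim 2 to the bounds L and U.

    Assumptions (not stated in the claims): the bounds move by a fixed
    `step`, the shift rules run before the widen/narrow rules, and the
    result is clamped so that 1 <= L < U.
    """
    # Rules 1 and 2: shift the whole normal-delay zone up or down.
    if p["A0"] > thresholds["A0"]:
        lower, upper = lower + step, upper + step
    elif p["A0"] < thresholds["A0"]:
        lower, upper = lower - step, upper - step

    # Rules 3 and 4: widen or narrow the normal-delay zone.
    if p["A1"] > thresholds["A1"] and p["A3"] > thresholds["A3"]:
        lower, upper = lower - step, upper + step
    elif p["A1"] < thresholds["A1"] and p["A3"] < thresholds["A3"]:
        lower, upper = lower + step, upper - step

    # Clamping is an assumption added to keep the zones well formed.
    lower = max(1, lower)
    upper = max(lower + 1, upper)
    return lower, upper
```

If both inputs to `update_probabilities` sum to 1 over the five zones, so does the output, which matches the constraint stated at the end of claim 3. A larger `alpha` makes the prediction follow the most recent interval more closely, while a smaller one favors the history.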

4. An apparatus used in a packet-switched network for dynamically adjusting playout delay of audio signals, comprising:
- a jitter buffer for temporarily storing a plurality of received voice packets, and delaying and re-ordering playout time of said voice packets;
  
  a dynamic playout delay adjustment module for dividing said jitter buffer into three zones, and dynamically extending or shrinking silence length of said voice packets to adjust said playout delay of said voice packets according to the number of said voice packets in said jitter buffer;
  
  a dynamic silence length adjustment module for dynamically adjusting a shrinking size or an extending size of said silence length according to the number of said voice packets in said jitter buffer; and
  
  a dynamic jitter buffer zone adjustment module for dynamically adjusting the sizes of said three zones of said jitter buffer according to the number of said voice packets in said jitter buffer;
  
  wherein at least one of said jitter buffer, said dynamic playout delay adjustment module, said dynamic silence length adjustment module and said dynamic jitter buffer zone adjustment module in said apparatus is a hardware module, and said jitter buffer is mapped into an extending silence zone A1 in which the number of said voice packets in said jitter buffer is below a lower bound of normal delay L, a normal delay zone A2 in which the number of said voice packets in said jitter buffer is in a normal range between said lower bound of normal delay L and an upper bound of normal delay U, and a shrinking silence zone A3 in which the number of said voice packets in said jitter buffer is above said upper bound of normal delay U;
  
  when said jitter buffer contains no voice packets for playout, said jitter buffer falls into a no data to play zone A0; and
  
  when said jitter buffer contains more voice packets for playout than a maximum acceptable delay Max, said jitter buffer falls into a discarding voice packet zone A4.
- - 5. The apparatus as claimed in claim 4, wherein said extending silence zone A1 has a maximum extending size in extending said silence length, and said shrinking silence zone A3 has a maximum shrinking size in shrinking said silence length.
  - 6. The apparatus as claimed in claim 4, wherein said dynamic jitter buffer zone adjustment module further comprises:
    - a probability model estimation unit for predicting the probability that the number of said voice packets in said jitter buffer falls into zone Ai in a next time interval [Tn,Tn+1], with i being an integer from 0 to 4 and n being a natural number; and
      
      a zone size adjustment unit for determining whether to increase or decrease said lower bound of normal delay L or said upper bound of normal delay U of said normal delay zone A2.
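
The zone mapping recited in claim 4 and the capped silence adjustment of claim 5 can be illustrated with a short Python sketch. It is a hedged reading of the claims rather than the patented apparatus. All function names are hypothetical. Treating the bounds L and U as inclusive edges of zone A2, estimating per-interval probabilities as sample frequencies, and sizing the silence change by the distance from the nearest bound are assumptions that the claims leave open.

```python
from collections import Counter

ZONES = ("A0", "A1", "A2", "A3", "A4")


def classify_zone(count, lower, upper, max_delay):
    """Map a jitter-buffer occupancy (in packets) to a zone of claim 4."""
    if count <= 0:
        return "A0"  # no data to play
    if count > max_delay:
        return "A4"  # above the maximum acceptable delay: discard packets
    if count < lower:
        return "A1"  # below L: extend silence
    if count > upper:
        return "A3"  # above U: shrink silence
    return "A2"      # between L and U (inclusive by assumption): normal delay


def zone_frequencies(counts, lower, upper, max_delay):
    """Estimate a per-interval probability for each zone as the fraction of
    occupancy samples that fell into it (an assumed estimator)."""
    tally = Counter(classify_zone(c, lower, upper, max_delay) for c in counts)
    total = len(counts)
    return {z: (tally[z] / total if total else 0.0) for z in ZONES}


def silence_delta(count, lower, upper, max_delay, max_extend, max_shrink):
    """Return a silence change: positive extends, negative shrinks.

    The caps correspond to the maximum extending and shrinking sizes of
    claim 5. The proportional rule itself is an assumption.
    """
    zone = classify_zone(count, lower, upper, max_delay)
    if zone == "A1":
        return min(lower - count, max_extend)
    if zone == "A3":
        return -min(count - upper, max_shrink)
    return 0  # A2 needs no change; A0 and A4 are handled by other means
```

In this sketch, the frequencies returned by `zone_frequencies` for one interval could serve as the observed probabilities that the probability model estimation unit of claim 6 smooths over time.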

Current Assignee: Industrial Technology Research Institute
Original Assignee: Industrial Technology Research Institute
Inventors: Lin, Zhe-Hong; Wu, Yi-Wei; Shiue, De-Hui
Primary Examiner(s): Flynn; Nathan
Assistant Examiner(s): KASSIM, KHALED M
Application Number: US11/381,534
Publication Number: US 20070211704A1
Time in Patent Office: 1,734 Days
Field of Search: 370/516, 370/351-356, 379/516
US Class Current: 370/352
CPC Class Codes:
G10L 19/167 Audio streaming, i.e. forma...
G10L 25/78 Detection of presence or ab...
