Audio encoding and decoding with intra frames and adaptive forward error correction

US 7,668,712 B2
Filed: 03/31/2004
Issued: 02/23/2010
Est. Priority Date: 03/31/2004
Status: Active Grant



Abstract

Various strategies for rate/quality control and loss resiliency in an audio codec are described. The various strategies can be used in combination or independently. For example, a real-time speech codec uses intra frame coding/decoding, adaptive multi-mode forward error correction [“FEC”], and rate/quality control techniques. Intra frames help a decoder recover quickly from packet losses, while compression efficiency is still emphasized with predicted frames. Various strategies for inserting intra frames and signaling intra/predicted frames are described. With the adaptive multi-mode FEC, an encoder adaptively selects between multiple modes to efficiently and quickly provide a level of FEC that takes into account the bandwidth currently available for FEC. The FEC information itself may be predictively encoded and decoded relative to primary encoded information. Various rate/quality and FEC control strategies allow additional adaptation to available bandwidth and network conditions.
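
The abstract does not say how intra frames are inserted, so the following is only a minimal sketch under an assumed policy: intra frames are inserted periodically, and the interval shrinks as the observed loss rate rises. The function names, the 20% saturation point, and the interval bounds are all hypothetical and are not taken from the patent.

```python
def intra_interval(loss_rate, min_interval=2, max_interval=50):
    """Hypothetical policy: fewer predicted frames between intra frames as loss rises."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    # Assumed mapping: 0% loss gives max_interval, 20% loss or more gives min_interval.
    scale = min(loss_rate / 0.20, 1.0)
    return round(max_interval - scale * (max_interval - min_interval))


def frame_types(num_frames, loss_rate):
    """Label each frame 'I' (intra) or 'P' (predicted) for the chosen interval."""
    interval = intra_interval(loss_rate)
    return ["I" if i % interval == 0 else "P" for i in range(num_frames)]
```

With no loss, this sketch emits an intra frame every 50 frames and favors compression efficiency; at high loss it alternates intra and predicted frames so a decoder can resynchronize quickly.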


18 Claims

1. In a speech processing tool operated on a computing device, a method comprising:
- receiving a frame for a speech signal at the computing device, the frame representing audio samples taken from the speech signal;
  
  processing the frame for the speech signal with the computing device, the processing including processing primary encoded information for the frame and one or more versions of forward error correction information for the frame, wherein the primary encoded information comprises plural parameter values signaled in a bitstream, and wherein each of the one or more versions of forward error correction information comprises a subset of the plural parameter values selected based at least in part on an estimate of extra available bits and signaled in the bitstream in addition to the plural parameter values of the primary encoded information; and
  
  outputting a result usable for playback of the speech signal from the computing device.
  - 2. The method of claim 1 wherein the subset is also selected based at least in part on network loss rate or decoder loss rate.
  - 3. The method of claim 1 wherein the subset is also selected based at least in part on frame class.
  - 4. The method of claim 1 wherein the primary encoded information is packed into a single packet with forward error correction information for a preceding frame.
  - 5. The method of claim 1 wherein the speech processing tool is a real-time speech encoder that uses linear prediction, wherein the result is encoded speech which is decodable into reconstructed speech for the speech signal, and wherein the plural parameter values are plural linear prediction parameter values.
  - 6. The method of claim 1 wherein the speech processing tool is a real-time speech decoder that uses linear prediction, wherein the result is reconstructed speech, and wherein the plural parameter values are plural linear prediction parameter values.
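
As an illustration of claims 1 to 3, the following sketch picks a subset of parameter values for forward error correction from an estimate of extra available bits, a loss rate, and a frame class. It is a sketch under stated assumptions, not the patented method: the parameter names, their bit costs, the priority order, the greedy selection, and the 1% loss threshold are all hypothetical.

```python
# Hypothetical parameter names and bit costs, listed most important first.
PARAM_COSTS = [("lsp", 20), ("pitch", 8), ("gain", 7), ("excitation", 40)]


def select_fec_subset(extra_bits, loss_rate, frame_class="voiced", min_loss=0.01):
    """Greedily choose which primary parameters to repeat as FEC information."""
    # Assumption: below a small loss rate, no FEC is sent at all.
    if loss_rate < min_loss:
        return []
    subset, remaining = [], extra_bits
    for name, cost in PARAM_COSTS:
        # Assumption: pitch is not worth protecting for unvoiced frames.
        if frame_class == "unvoiced" and name == "pitch":
            continue
        if cost <= remaining:
            subset.append(name)
            remaining -= cost
    return subset
```

For example, with 30 extra bits and 5% loss, a voiced frame would be protected by `lsp` and `pitch`, while an unvoiced frame would be protected by `lsp` and `gain`.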

7. In a speech processing tool operated on a computing device, a method comprising:
- receiving a frame for a speech signal at the computing device, the frame representing audio samples taken from the speech signal;
  
  processing the frame for the speech signal with the computing device, the processing the frame including processing primary encoded information for the frame and plural versions of forward error correction information for the frame, wherein each of the plural versions of forward error correction information for the frame is separately signaled in a bitstream in addition to the primary encoded information for the frame, wherein the primary encoded information comprises plural parameter values, and wherein each of the plural versions of forward error correction information comprises a different subset of the plural parameter values for the frame; and
  
  outputting a result usable for playback of the speech signal from the computing device.
  - 8. The method of claim 7 wherein each of the plural versions of forward error correction information is packed into a different packet for network transmission.
  - 9. The method of claim 7 wherein the speech processing tool is a real-time speech encoder that uses linear prediction, wherein the result is encoded speech which is decodable into reconstructed speech for the speech signal, and wherein the plural parameter values are plural linear prediction parameter values.
  - 10. The method of claim 7 wherein the speech processing tool is a real-time speech decoder that uses linear prediction, wherein the result is reconstructed speech, and wherein the plural parameter values are plural linear prediction parameter values.
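
Claims 7 and 8 can be pictured with the following sketch, in which each frame has two FEC versions holding different subsets of its parameters, and each version travels in a different later packet. The packet layout, the offsets, the subsets, and the function names are hypothetical choices made for illustration only.

```python
# Hypothetical layout: packet n carries FEC version 1 for frame n-1
# (larger subset) and FEC version 2 for frame n-2 (smaller subset).
FEC_VERSIONS = {1: ("lsp", "pitch", "gain"), 2: ("lsp",)}


def build_packets(frames):
    """frames: list of dicts mapping parameter name to value, one per frame."""
    packets = []
    for n, params in enumerate(frames):
        packet = {"seq": n, "primary": dict(params), "fec": {}}
        for offset, subset in FEC_VERSIONS.items():
            src = n - offset
            if src >= 0:
                packet["fec"][src] = {k: frames[src][k] for k in subset}
        packets.append(packet)
    return packets


def recover(received, n):
    """Return the best available parameters for frame n, or None if none arrived.

    received: dict mapping sequence number to packet for packets that arrived.
    """
    if n in received:
        return received[n]["primary"]
    for offset in sorted(FEC_VERSIONS):  # prefer the richer, nearer version
        packet = received.get(n + offset)
        if packet is not None and n in packet["fec"]:
            return packet["fec"][n]
    return None
```

If the packet for frame n is lost, a decoder in this sketch falls back to the larger subset in packet n+1; if that packet is lost as well, it falls back to the smaller subset in packet n+2.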

11. In an audio processing tool operated on a computing device, a method comprising:
- receiving encoded information for an audio signal at the computing device;
  
  processing encoded information for the audio signal with the computing device, the encoded information representing audio samples taken from the speech signal, wherein the encoded information includes forward error correction information for a first frame and primary encoded information for a second frame, wherein the forward error correction information for the first frame and the primary encoded information for the second frame are signaled in a bitstream in addition to forward error correction information for the second frame and primary encoded information for the first frame, and wherein at least some of the forward error correction information for the first frame is predictively encoded relative to the primary encoded information for the second frame; and
  
  outputting a result usable for playback of the speech signal from the computing device.
  - 12. The method of claim 11 wherein a single packet includes the forward error correction information for the first frame and the primary encoded information for the second frame.
  - 13. The method of claim 12 wherein the single packet further includes forward error correction information for one or more other frames.
  - 14. The method of claim 11 wherein the second frame is the current frame and the first frame is a preceding frame.
  - 15. The method of claim 11 wherein the forward error correction information for the first frame comprises linear prediction coefficient information predicted from corresponding coefficient information for the second frame.
  - 16. The method of claim 15 wherein the forward error correction information for the first frame comprises one or more excitation parameters predicted from corresponding excitation parameters for the second frame.
  - 17. The method of claim 11 wherein the audio processing tool is a real-time speech encoder and the result is encoded speech which is decodable into reconstructed speech for the speech signal.
  - 18. The method of claim 11 wherein the audio processing tool is a real-time speech decoder and the result is reconstructed speech.
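
The predictive encoding in claims 11 to 16 can be sketched as simple differential coding: FEC values for the preceding (first) frame are stored as differences from the corresponding primary values of the current (second) frame carried in the same packet. This is a minimal sketch assuming integer quantization indices and plain subtraction as the predictor; the claims do not fix a particular prediction scheme, and all names below are hypothetical.

```python
def encode_fec_predictive(prev_params, cur_params):
    """Encode FEC for the preceding frame as residuals against the current frame."""
    return {k: prev_params[k] - cur_params[k] for k in prev_params}


def decode_fec_predictive(residuals, cur_params):
    """Rebuild the preceding frame's parameters from the current frame plus residuals."""
    return {k: cur_params[k] + r for k, r in residuals.items()}


def pack(prev_params, cur_params):
    """Hypothetical single packet: current primary info plus predicted FEC for the previous frame."""
    return {
        "primary": dict(cur_params),
        "fec_prev": encode_fec_predictive(prev_params, cur_params),
    }
```

Because speech parameters tend to change slowly between adjacent frames, the residuals are usually small, which is why predicting FEC information from primary information can cost fewer bits than sending it independently.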


Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Khalil, Hosam A.; Chen, Wei-Ge; Koishida, Kazuhito; Wang, Tian; Han, Mu
Primary Examiner(s): Opsasnick, Michael N
Application Number: US10/816,466
Publication Number: US 20050228651A1
Time in Patent Office: 2,155 Days
Field of Search: 704/200, 704/201, 704219-213
US Class Current: 704/219
CPC Class Codes:
G10L 19/005   Correction of errors induce...
G10L 19/08   Determination or coding of ...
G10L 19/22   Mode decision, i.e. based o...
