Audio conferencing utilizing packets with unencrypted power level information

US 20080165707A1
Filed: 01/04/2007
Published: 07/10/2008
Est. Priority Date: 01/04/2007
Status: Active Grant

Abstract

In one embodiment, a method includes receiving a plurality of packet streams input from different endpoints, packets of each stream including encrypted and unencrypted portions, the unencrypted portion containing audio power level information. The audio power level information contained in the packets of each of the packet streams is then compared to select the N packet streams with the loudest audio. The N packet streams are then decrypted to obtain audio content, and the audio content of the N packet streams is mixed to produce one or more output packet streams. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure.
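The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Packet` class, the single scalar `power_level` field, the `decrypt` callback, and the 16-bit clamping are all assumptions made for illustration. The point it tries to show is that ranking uses only the unencrypted field, so decryption runs for just the N selected streams.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Packet:
    stream_id: str
    power_level: float      # hypothetical field read from the unencrypted portion
    encrypted_audio: bytes  # opaque until decrypted


def select_and_mix(
    packets: Sequence[Packet],
    n: int,
    decrypt: Callable[[bytes], List[int]],
) -> List[int]:
    """Rank streams by unencrypted power, decrypt only the top n, and sum them."""
    if n < 1:
        raise ValueError("N must be an integer greater than or equal to one")
    # Ranking touches only the unencrypted power level, never the audio.
    loudest = sorted(packets, key=lambda p: p.power_level, reverse=True)[:n]
    # Decryption cost is paid for N streams, not for every participant.
    decoded = [decrypt(p.encrypted_audio) for p in loudest]
    # Sample-wise sum, clamped to a 16-bit range (an assumption of this sketch).
    return [max(-32768, min(32767, sum(samples))) for samples in zip(*decoded)]
```

Here `decrypt` stands in for whatever decryption and decoding the deployment uses; the remaining streams are simply never passed to it.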

29 Claims

1. A method comprising:
- receiving a plurality of packet streams input from different endpoints, packets of each stream including header and payload portions, the header portion containing audio power level information that includes power levels for each of a respective plurality of frequencies;
  
  comparing the audio power level information contained in the packets of each of the packet streams at a particular point in time to select N, where N is an integer greater than or equal to one, packet streams with loudest audio;
  
  decoding the N packet streams to obtain audio content contained in the payload portion of each of the N packet streams; and
  
  mixing the audio content of the N packet streams to produce one or more output packet streams.
  - 2. The method of claim 1, wherein a first portion of the header is encrypted, a second portion of the header containing the audio power level information is unencrypted, and all or a portion of the payload is encrypted.
  - 3. The method of claim 1 further comprising sending the one or more output packet streams to the different endpoints.
  - 4. The method of claim 1 further comprising discarding a remainder of the packet streams.
  - 5. The method of claim 2 wherein each power level comprises a moving average of a normalized power level at a particular frequency.
  - 6. The method of claim 1 wherein the frequencies include at least one frequency associated with human speech.
  - 7. The method of claim 5 wherein the second portion further includes timebase information associated with the moving average.
  - 8. The method of claim 2 wherein each of the packets has a packet format compatible with Secure Real-Time Protocol (SRTP).
  - 9. The method of claim 1 wherein each of the packets has a packet format compatible with Real-Time Protocol (RTP).
  - 10. The method of claim 7 further comprising:
    - generating a report that contains statistics regarding the audio content and an adjusted timebase; and
      
      sending the report to at least the endpoints associated with the N packet streams.
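Claims 5 through 7 describe power levels that are moving averages of normalized power at particular frequencies, accompanied by timebase information. One way an endpoint might maintain such values is sketched below in Python. The class name, the fixed-length window, the 20 ms frame duration, the `full_scale` normalization, and the reading of "timebase" as the time span covered by the average are all assumptions of this sketch, not details taken from the claims.

```python
from collections import deque
from typing import Deque, Dict, Iterable


class PowerLevelTracker:
    """Keeps a windowed moving average of normalized power per frequency."""

    def __init__(
        self,
        frequencies: Iterable[int],
        window_frames: int = 5,
        frame_ms: int = 20,
        full_scale: float = 1.0,
    ) -> None:
        self.frame_ms = frame_ms
        self.full_scale = full_scale
        # One bounded history per tracked frequency; old frames fall off the left.
        self._history: Dict[int, Deque[float]] = {
            f: deque(maxlen=window_frames) for f in frequencies
        }

    def update(self, measured: Dict[int, float]) -> None:
        """Record one frame of measured power, keyed by frequency in Hz."""
        for freq, history in self._history.items():
            # Normalize against full scale and cap at 1.0.
            history.append(min(measured.get(freq, 0.0) / self.full_scale, 1.0))

    def header_fields(self) -> dict:
        """Values an endpoint could place in the unencrypted header portion."""
        levels = {
            f: (sum(h) / len(h) if h else 0.0) for f, h in self._history.items()
        }
        frames = max((len(h) for h in self._history.values()), default=0)
        # Timebase here means the span of audio the average covers (assumption).
        return {"timebase_ms": frames * self.frame_ms, "levels": levels}
```

A mixer comparing streams "at a particular point in time" could use the timebase to judge whether two averages cover comparable intervals before ranking them.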

11. A method comprising:
- receiving a plurality of packet streams input from a corresponding plurality of endpoints, packets of each stream including header and payload portions, the header portion containing audio power level information that includes timebase information and a moving average of normalized power levels for each of a respective plurality of frequencies;
  
  comparing the audio power level information contained in the packets of each of the packet streams at a particular point in time to select N, where N is an integer greater than or equal to one, packet streams with loudest audio;
  
  decoding the N packet streams to obtain audio content contained in the payload portion of each of the N packet streams; and
  
  mixing the audio content of the N packet streams to produce N+1 output packet streams.
  - 12. The method of claim 11 wherein N of the output packet streams are each customized for a corresponding one of the endpoints.
  - 13. The method of claim 11 further comprising:
    - encoding and encrypting each of the output packet streams; and
      
      sending the encoded and encrypted output packet streams to the endpoints.
  - 14. The method of claim 11 wherein portions of the header and payload are encrypted, the timebase information and the moving average of the normalized power levels being included in an unencrypted portion of the header.
  - 15. The method of claim 11 wherein the frequencies include at least one frequency associated with human speech.
  - 16. The method of claim 11 wherein the loudest audio of the N packet streams is determined with respect to the at least one frequency.
  - 17. The method of claim 11 further comprising discarding a remainder of the packet streams.
  - 18. The method of claim 17 wherein the remainder of the packet streams includes all of the packet streams, except for the N packet streams.
  - 19. The method of claim 14 wherein the unencrypted portion comprises a Real-Time Protocol (RTP) extension section.
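Claims 11 and 12 call for N+1 output streams, N of which are customized for individual endpoints. The claims do not say how they are customized; the Python sketch below assumes a common conferencing arrangement in which each of the N selected speakers receives a mix without their own audio, and one shared mix goes to everyone else. The function names and the `EVERYONE_ELSE` key are hypothetical.

```python
from typing import Dict, List

EVERYONE_ELSE = "*"  # hypothetical key for the one shared, non-customized mix


def mix(frames: List[List[int]], length: int) -> List[int]:
    """Sample-wise sum clamped to 16 bits; silence if there is nothing to mix."""
    if not frames:
        return [0] * length
    return [max(-32768, min(32767, sum(samples))) for samples in zip(*frames)]


def n_plus_one_mixes(selected: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Build N customized mixes plus one shared mix from N decoded streams."""
    if not selected:
        raise ValueError("at least one selected stream is required")
    length = len(next(iter(selected.values())))
    outputs = {
        # Assumption: a selected speaker's mix omits that speaker's own audio.
        sid: mix([f for other, f in selected.items() if other != sid], length)
        for sid in selected
    }
    # Endpoints that were not selected all share the full mix of the N streams.
    outputs[EVERYONE_ELSE] = mix(list(selected.values()), length)
    return outputs
```

With two selected streams the result holds three entries, matching the N+1 count in claim 11.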

20. Logic encoded in one or more tangible media for execution and when executed operable to:
- receive a plurality of packet streams input from different endpoints, packets of each stream including encrypted and unencrypted portions, the unencrypted portion containing audio power level information;
  
  compare the audio power level information contained in the packets of each of the packet streams at a particular point in time to select N, where N is an integer greater than or equal to one, packet streams with loudest audio;
  
  decrypt the N packet streams to obtain audio content contained in the encrypted portion of each of the N packet streams; and
  
  mix the audio content of the N packet streams to produce one or more output packet streams.
  - 21. The logic of claim 16 wherein execution of the one or more media is further operable to send the one or more output packet streams to the different endpoints.
  - 22. The logic of claim 16 wherein execution of the one or more media is further operable to discard a remainder of the packet streams.
  - 23. The logic of claim 16 wherein the audio power level information includes a moving average of normalized power levels for each of a respective plurality of frequencies.
  - 24. The logic of claim 19 wherein the frequencies include at least one frequency associated with human speech.
  - 25. The logic of claim 19 wherein the unencrypted portion further includes timebase information associated with the moving average.

26. A system comprising:
- a conferencing server; and
  
  a mixer coupled to receive control information from the conferencing server, the mixer being operable to:
  
  examine an unencrypted portion of each of a plurality of packets associated with streams input from corresponding endpoints, the unencrypted portion containing audio power level information at one or more frequencies associated with human speech;
  
  select, based on the audio power level information, N packet streams having highest power levels, where N is an integer greater than or equal to one;
  
  mix audio content from the N packet streams to produce a plurality of output packet streams; and
  
  send the output packet streams to the endpoints.
  - 27. The system of claim 22 wherein the unencrypted portion comprises a Real-Time Protocol extension section.
  - 28. The system of claim 22 wherein the audio power level information includes a moving average of normalized power levels at the one or more frequencies, and further where the unencrypted portion includes a timebase associated with the moving average.
  - 29. The system of claim 24 wherein the mixer is further operable to:
    - generate a report that contains statistics regarding the audio content and an adjusted timebase; and
      
      send the report to at least the endpoints associated with the N packet streams.
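The claims place the power level information in an unencrypted portion of the packet, such as an RTP extension section, so a mixer can examine it without decrypting the payload. The Python sketch below walks the standard RTP fixed header (12 bytes, optional CSRC list, optional header extension with a 16-bit length counted in 32-bit words). The contents of the extension are an assumption: this sketch treats the extension body as a run of hypothetical (frequency in Hz, level) pairs of unsigned 16-bit integers, which is not a layout taken from the patent.

```python
import struct
from typing import Dict


def read_power_levels(packet: bytes) -> Dict[int, int]:
    """Return {frequency_hz: level} from an RTP header extension, if present."""
    if len(packet) < 12:
        raise ValueError("shorter than the fixed RTP header")
    first = packet[0]
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    has_extension = bool(first & 0x10)  # X bit
    csrc_count = first & 0x0F           # CC field
    offset = 12 + 4 * csrc_count        # skip the fixed header and CSRC list
    if not has_extension:
        return {}
    if len(packet) < offset + 4:
        raise ValueError("truncated extension header")
    # Extension header: 16-bit profile-defined id, 16-bit length in 32-bit words.
    _profile, words = struct.unpack_from("!HH", packet, offset)
    start = offset + 4
    end = start + 4 * words
    if end > len(packet):
        raise ValueError("truncated extension body")
    levels: Dict[int, int] = {}
    for pos in range(start, end, 4):
        # Hypothetical layout: one (frequency, level) pair per 32-bit word.
        freq, level = struct.unpack_from("!HH", packet, pos)
        levels[freq] = level
    return levels  # the payload after `end` is never read
```

Because the function stops at the end of the extension, the payload bytes can remain encrypted while streams are being ranked.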

Current Assignee: Cisco Technology, Inc. (Cisco Systems, Inc.)
Original Assignee: Cisco Technology, Inc. (Cisco Systems, Inc.)
Inventors: Baird, Randall B.; Surazski, Luke K.
Granted Patent: US 8,116,236 B2
US Class Current: 370/260
CPC Class Codes:
H04L 12/1827   Network arrangements for co...
H04L 65/4038   with floor control
H04L 65/70   Media network packetisation
H04L 65/765   intermediate
H04M 3/56   Arrangements for connecting...
H04M 3/568   audio processing specific t...
