User attribute derivation and update for network/peer assisted speech coding

US 9,058,818 B2
Filed: 09/21/2010
Issued: 06/16/2015
Est. Priority Date: 10/22/2009
Status: Active Grant

Abstract

Systems, methods and apparatuses are described for deriving and updating user attribute information about users of a communications system. A communications network is then used to transfer the user attribute information to communication terminals, which use the user attribute information to configure a speech codec to operate in a speaker-dependent manner during a communication session, thereby improving speech coding efficiency. In a network-assisted model, the user attribute information is stored on the communications network and selectively transmitted to the communication terminals while in a peer-assisted model, the user attribute information is derived by and transferred between communication terminals.
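
The two distribution models named in the abstract can be sketched as follows. This is only an illustration under stated assumptions: `AttributeServer`, `Terminal`, `configure_codec`, and the dictionary representation of user attribute information are hypothetical names chosen for the example, not structures the patent defines.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Assumption: user attribute information is modeled as a flat name -> value map.
Attributes = Dict[str, float]


@dataclass
class AttributeServer:
    """Network-assisted model: attributes are stored on the network per user."""
    store: Dict[str, Attributes] = field(default_factory=dict)

    def put(self, user_id: str, attrs: Attributes) -> None:
        self.store[user_id] = dict(attrs)

    def get(self, user_id: str) -> Optional[Attributes]:
        attrs = self.store.get(user_id)
        return dict(attrs) if attrs is not None else None


@dataclass
class Terminal:
    user_id: str
    own_attrs: Attributes
    # Far-end attributes the local codec has been configured with.
    codec_config: Dict[str, Attributes] = field(default_factory=dict)

    def configure_codec(self, far_user: str, attrs: Optional[Attributes]) -> bool:
        """Return True if the codec can run speaker-dependent for far_user."""
        if attrs is None:
            return False  # nothing available: stay speaker-independent
        self.codec_config[far_user] = dict(attrs)
        return True


def network_assisted_setup(server: AttributeServer, a: Terminal, b: Terminal) -> Tuple[bool, bool]:
    """Each terminal receives the other user's attributes from the network."""
    return (a.configure_codec(b.user_id, server.get(b.user_id)),
            b.configure_codec(a.user_id, server.get(a.user_id)))


def peer_assisted_setup(a: Terminal, b: Terminal) -> Tuple[bool, bool]:
    """Each terminal hands its locally derived attributes directly to its peer."""
    return (a.configure_codec(b.user_id, b.own_attrs),
            b.configure_codec(a.user_id, a.own_attrs))


if __name__ == "__main__":
    alice = Terminal("alice", {"pitch_hz": 210.0})
    bob = Terminal("bob", {"pitch_hz": 120.0})
    server = AttributeServer()
    server.put("alice", alice.own_attrs)  # bob has not uploaded anything yet
    print(network_assisted_setup(server, alice, bob))
    print(peer_assisted_setup(alice, bob))
```

In this sketch a terminal that cannot obtain the far-end user's attributes simply reports that speaker-dependent operation is unavailable; how a real terminal would fall back is not something the text above specifies.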

44 Claims

1. A communication terminal, comprising:
- a speech capture module that obtains a speech signal associated with a user;
  
  a speech analysis module that processes the speech signal associated with the user to generate user attribute information;
  
  a network interface module that transmits the user attribute information to a network for the purpose of making the user attribute information available to at least one other communication terminal for use in configuring a configurable speech codec of the at least one other communication terminal to operate in a speaker-dependent manner;
  
  decomposition logic configured to decompose the speech signal into a speaker-independent signal and a speaker-dependent signal;
  
  a first encoder configured to encode the speaker-independent signal without using the user attribute information and provide the encoded speaker-independent signal for transmission to the at least one other communication terminal; and
  
  a second encoder configured to encode the speaker dependent signal using the user attribute information and provide the encoded speaker-dependent signal for transmission to the at least one other communication terminal.
  - 2. The communication terminal of claim 1, wherein the speech capture module obtains the speech signal associated with the user when the user is using the communication terminal to conduct a communication session.
  - 3. The communication terminal of claim 1, wherein the speech capture module obtains the speech signal associated with the user when the communication terminal is operating in a training mode.
  - 4. The communication terminal of claim 1, wherein the speech analysis module processes the speech signal associated with the user to generate user attribute information that includes information associated with at least one of:
    - a vocal tract of the user;
      
      a pitch or pitch range of the user; and
      
      an excitation signal associated with the user.
  - 5. The communication terminal of claim 1, wherein the speech analysis module processes the speech signal associated with the user to generate user attribute information that includes information associated with at least one of:
    - a pitch of the user;
      
      a timing of the user;
      
      a voice quality of the user; and
      
      an articulation of the user.
  - 6. The communication terminal of claim 1, wherein the network interface module transmits the user attribute information via the network to a server that stores the user attribute information for subsequent transmission to the at least one other communication terminal.
  - 7. The communication terminal of claim 1, wherein the network interface module transmits the user attribute information via the network to the at least one other communication terminal.
  - 8. The communication terminal of claim 1, wherein the speech analysis module is further configured to process additional speech signals associated with the user to update the user attribute information and wherein the network interface module transmits the updated user attribute information to the network.
  - 9. The communication terminal of claim 8, wherein the speech analysis module is configured to process the additional speech signals associated with the user to update the user attribute information each time the user uses the communication terminal to conduct a communication session.
  - 10. The communication terminal of claim 8, wherein the speech analysis module is configured to process the additional speech signals associated with the user to update the user attribute information periodically based on a predetermined time interval or number of communication sessions.
  - 11. The communication terminal of claim 8, wherein the network interface module transmits the updated user attribute information to the network by transmitting only differences between the updated user attribute information and previously-transmitted user attribute information.
  - 12. The communication terminal of claim 8, wherein the network interface module transmits the updated user attribute information to the network only if a measure of difference between the updated user attribute information and previously-transmitted user attribute information exceeds a threshold.
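
To illustrate why encoding with user attribute information (claim 1) such as a pitch range (claim 4) can improve coding efficiency, here is a toy sketch. It uses a plain uniform scalar quantizer, which is an assumption made for the example and not the encoder the patent describes; the function names, the ranges, and the bit budget are all hypothetical.

```python
def quantize(value: float, lo: float, hi: float, bits: int) -> int:
    """Uniformly quantize value over [lo, hi] into a bits-wide index."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def dequantize(index: int, lo: float, hi: float, bits: int) -> float:
    """Map a quantizer index back to a value in [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + index / levels * (hi - lo)


def coding_error(value: float, lo: float, hi: float, bits: int) -> float:
    """Absolute reconstruction error after a quantize/dequantize round trip."""
    return abs(dequantize(quantize(value, lo, hi, bits), lo, hi, bits) - value)


if __name__ == "__main__":
    pitch_hz = 123.0
    bits = 4
    # Hypothetical speaker-independent range covering many talkers.
    generic_range = (50.0, 400.0)
    # Hypothetical pitch range taken from this user's attribute information.
    user_range = (100.0, 140.0)
    print("generic error:", coding_error(pitch_hz, *generic_range, bits))
    print("user error:   ", coding_error(pitch_hz, *user_range, bits))
```

With the same number of bits, the quantizer restricted to the user's own range has finer steps, so the reconstruction error is smaller; equivalently, fewer bits would be needed for the same accuracy. The decoder must hold the same range, which is why the attributes have to reach the other terminal.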

13. A method performed by a communication terminal, comprising:
- obtaining a speech signal associated with a user;
  
  processing the speech signal associated with the user to generate user attribute information;
  
  transmitting the user attribute information to a network for the purpose of making the user attribute information available to at least one other communication terminal for use in configuring a configurable speech codec of the at least one other communication terminal to operate in a speaker-dependent manner;
  
  decomposing the speech signal into a speaker-independent signal and a speaker-dependent signal;
  
  encoding, by a first encoder, the speaker-independent signal without using the user attribute information and providing the encoded speaker-independent signal to the at least one other communication terminal; and
  
  encoding, by a second encoder, the speaker dependent signal using the user attribute information and providing the encoded speaker-dependent signal to the at least one other communication terminal.
  - 14. The method of claim 13, wherein obtaining the speech signal associated with the user comprises obtaining the speech signal associated with the user when the user is using the communication terminal to conduct a communication session.
  - 15. The method of claim 13, wherein obtaining the speech signal associated with the user comprises obtaining the speech signal associated with the user when the communication terminal is operating in a training mode.
  - 16. The method of claim 13, wherein processing the speech signal associated with the user to generate the user attribute information comprises processing the speech signal associated with the user to generate information associated with at least one of:
    - a vocal tract of the user;
      
      a pitch or pitch range of the user; and
      
      an excitation signal associated with the user.
  - 17. The method of claim 13, wherein processing the speech signal associated with the user to generate the user attribute information comprises processing the speech signal associated with the user to generate information associated with at least one of:
    - a pitch of the user;
      
      a timing of the user;
      
      a voice quality of the user; and
      
      an articulation of the user.
  - 18. The method of claim 13, wherein transmitting the user attribute information via the network comprises transmitting the user attribute information via the network to a server that stores the user attribute information for subsequent transmission to the at least one other communication terminal.
  - 19. The method of claim 13, wherein transmitting the user attribute information via the network comprises transmitting the user attribute information via the network to the at least one other communication terminal.
  - 20. The method of claim 13, further comprising:
    - processing additional speech signals associated with the user to update the user attribute information; and
      
      transmitting the updated user attribute information to the network.
  - 21. The method of claim 20, wherein processing the additional speech signals associated with the user to update the user attribute information comprises:
    - processing the additional speech signals associated with the user to update the user attribute information each time the user uses the communication terminal to conduct a communication session.
  - 22. The method of claim 20, wherein processing the additional speech signals associated with the user to update the user attribute information comprises:
    - processing the additional speech signals associated with the user to update the user attribute information periodically based on a predetermined time interval or number of communication sessions.
  - 23. The method of claim 20, wherein transmitting the updated user attribute information to the network comprises transmitting only differences between the updated user attribute information and previously-transmitted user attribute information.
  - 24. The method of claim 20, wherein transmitting the updated user attribute information to the network comprises transmitting the updated user attribute information to the network only if a measure of difference between the updated user attribute information and previously-transmitted user attribute information exceeds a threshold.
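
The update policies in claims 20 to 24 (mirrored in claims 8 to 12) can be sketched as below. The attribute representation, the Euclidean distance used as the "measure of difference", and the names `attribute_delta`, `should_transmit`, and `UpdateSchedule` are assumptions made for illustration; the claims do not fix any of them.

```python
import math
from dataclasses import dataclass
from typing import Dict

# Assumption: user attribute information is a flat name -> value map.
Attributes = Dict[str, float]


def attribute_delta(previous: Attributes, updated: Attributes) -> Attributes:
    """Keep only entries that are new or changed (claim 23 style update).

    Removed keys are not represented in this simple sketch.
    """
    return {k: v for k, v in updated.items() if previous.get(k) != v}


def difference_measure(previous: Attributes, updated: Attributes) -> float:
    """Hypothetical measure of difference: Euclidean distance over all keys."""
    keys = set(previous) | set(updated)
    return math.sqrt(sum((updated.get(k, 0.0) - previous.get(k, 0.0)) ** 2
                         for k in keys))


def should_transmit(previous: Attributes, updated: Attributes,
                    threshold: float) -> bool:
    """Transmit only if the difference exceeds a threshold (claim 24 style)."""
    return difference_measure(previous, updated) > threshold


@dataclass
class UpdateSchedule:
    """Periodic update by time interval or session count (claim 22 style)."""
    min_interval_s: float
    min_sessions: int
    last_update_s: float = 0.0
    sessions_since_update: int = 0

    def on_session_end(self, now_s: float) -> bool:
        """Record one finished session and report whether an update is due."""
        self.sessions_since_update += 1
        due = (now_s - self.last_update_s >= self.min_interval_s
               or self.sessions_since_update >= self.min_sessions)
        if due:
            self.last_update_s = now_s
            self.sessions_since_update = 0
        return due


if __name__ == "__main__":
    prev = {"pitch_hz": 120.0, "pitch_range_hz": 40.0}
    new = {"pitch_hz": 123.0, "pitch_range_hz": 40.0}
    print(attribute_delta(prev, new))
    print(should_transmit(prev, new, threshold=2.0))
```

Setting `min_sessions=1` in `UpdateSchedule` reproduces the "each time the user conducts a communication session" behaviour of claim 21.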

25. A server, comprising:
- a speech capture module that obtains a speech signal associated with a user that is transmitted by a communication terminal over a network;
  
  a speech analysis module that processes the speech signal associated with the user to generate user attribute information; and
  
  a user attribute storage module that makes the user attribute information available to at least one other communication terminal that connects to the network for use in configuring a configurable speech codec of the at least one other communication terminal to operate in a speaker-dependent manner and makes the user attribute information available to the communication terminal for use in encoding the speech signal using the user attribute information for transmission to the at least one other communication terminal, wherein configuring the configurable speech codec of the at least one other communication terminal to operate in the speaker-dependent manner comprises decoding, by a first decoder of the at least one other communication terminal, a speaker-independent signal without using the user attribute information and decoding, by a second decoder of the at least one other communication terminal, speaker-dependent signal using the user attribute information.
  - 26. The server of claim 25, wherein the speech analysis module processes the speech signal associated with the user to generate user attribute information that includes information associated with at least one of:
    - a vocal tract of the user;
      
      a pitch or pitch range of the user; and
      
      an excitation signal associated with the user.
  - 27. The server of claim 25, wherein the speech analysis module processes the speech signal associated with the user to generate user attribute information that includes information associated with at least one of:
    - a pitch of the user;
      
      a timing of the user;
      
      a voice quality of the user; and
      
      an articulation of the user.
  - 28. The server of claim 25, wherein the user attribute storage module makes the user attribute information available to the at least one other communication terminal by storing the user attribute information for subsequent transmission to the at least one other communication terminal.
  - 29. The server of claim 25, wherein the user attribute storage module makes the user attribute information available to the at least one other communication terminal by transmitting the user attribute information to another server that stores the user attribute information for subsequent transmission to the at least one other communication terminal.
  - 30. The server of claim 25, wherein the speech analysis module is further configured to process additional speech signals associated with the user to update the user attribute information and wherein the user attribute distribution modules makes the updated user attribute information available to the at least one other communication terminal.
  - 31. The server of claim 30, wherein the speech analysis module is configured to process the additional speech signals associated with the user to update the user attribute information each time the user uses a communication terminal to conduct a communication session.
  - 32. The server of claim 30, wherein the speech analysis module is configured to process the additional speech signals associated with the user to update the user attribute information periodically based on a predetermined time interval or number of communication sessions.
  - 33. The server of claim 30, wherein the user attribute distribution module makes the updated user attribute information available to the at least one other communication terminal by making available only differences between the updated user attribute information and user attribute information that was previously made available.
  - 34. The server of claim 30, wherein the user attribute distribution module makes the updated user attribute information available only if a measure of difference between the updated user attribute information and user attribute information that was previously made available exceeds a threshold.

35. A method implemented by a server, comprising:
- obtaining a speech signal associated with a user that is transmitted by a communication terminal over a network;
  
  processing the speech signal associated with the user to generate user attribute information;
  
  making the user attribute information available to at least one other communication terminal that connects to the network for use in configuring a configurable speech codec of the at least one other communication terminal to operate in a speaker-dependent manner, wherein configuring the configurable speech codec of the at least one other communication terminal to operate in the speaker-dependent manner comprises decoding, by a first decoder of the at least one other communication terminal, a speaker-independent signal without using the user attribute information and decoding, by a second decoder of the at least one other communication terminal, speaker-dependent signal using the user attribute information; and
  
  providing the user attribute information to the communication terminal for use in encoding the speech signal using the user attribute information for transmission to the at least one other communication terminal.
  - 36. The method of claim 35, wherein processing the speech signal associated with the user to generate the user attribute information comprises processing the speech signal associated with the user to generate information associated with at least one of:
    - a vocal tract of the user;
      
      a pitch or pitch range of the user; and
      
      an excitation signal associated with the user.
  - 37. The method of claim 35, wherein processing the speech signal associated with the user to generate the user attribute information comprises processing the speech signal associated with the user to generate information associated with at least one of:
    - a pitch of the user;
      
      a timing of the user;
      
      a voice quality of the user; and
      
      an articulation of the user.
  - 38. The method of claim 35, wherein making the user attribute information available to the at least one other communication terminal comprises storing the user attribute information for subsequent transmission to the at least one other communication terminal.
  - 39. The method of claim 35, wherein making the user attribute information available to the at least one other communication terminal comprises transmitting the user attribute information to another server that stores the user attribute information for subsequent transmission to the at least one other communication terminal.
  - 40. The method of claim 35, further comprising:
    - processing additional speech signals associated with the user to update the user attribute information; and
      
      making the updated user attribute information available to the at least one other communication terminal.
  - 41. The method of claim 40, wherein processing the additional speech signals associated with the user to update the user attribute information comprises:
    - processing the additional speech signals associated with the user to update the user attribute information each time the user uses a communication terminal to conduct a communication session.
  - 42. The method of claim 40, wherein the speech analysis module is configured to process the additional speech signals associated with the user to update the user attribute information periodically based on a predetermined time interval or number of communication sessions.
  - 43. The method of claim 40, wherein making the updated user attribute information available to the at least one other communication terminal comprises making available only differences between the updated user attribute information and user attribute information that was previously made available.
  - 44. The method of claim 40, wherein making the updated user attribute information available to the at least one other communication terminal comprises making the updated user attribute information available only if a measure of difference between the updated user attribute information and user attribute information that was previously made available exceeds a threshold.

Current Assignee: Avago Technologies International Sales Pte Limited (Broadcom, Inc.)
Original Assignee: Broadcom Corporation (Broadcom, Inc.)
Inventors: Zopf, Robert W.
Primary Examiner(s): ABEBE, DANIEL DEMELASH
Application Number: US12/887,329
Publication Number: US 20110099015A1
Time in Patent Office: 1,729 Days
Field of Search: 704/231, 704/250, 704/270, 704/200, 704/243, 704/273, 704/500, 704/504, 455/419
US Class Current: 1/1
CPC Class Codes:
G10L 17/00   Speaker identification or v...
G10L 19/0018   Speech coding using phoneti...
G10L 19/16   Vocoder architecture
G10L 21/00   Speech or voice signal proc...
