Speech transfer over packet networks using very low digital data bandwidths

US 7,177,801 B2
Filed: 12/21/2001
Issued: 02/13/2007
Est. Priority Date: 12/21/2001
Status: Active Grant

Abstract

A method of communicating speech across a communication link using very low digital data bandwidth is disclosed, having the steps of: translating speech into text at a source terminal; communicating the text across the communication link to a destination terminal; and translating the text into reproduced speech at the destination terminal. In a preferred embodiment, a speech profile corresponding to the speaker is used to reproduce the speech at the destination terminal so that the reproduced speech more closely approximates the original speech of the speaker. A default voice profile is used to recreate speech when a user profile is unavailable. User specific profiles can be created during training prior to communication or can be created during communication from actual speech. The user profiles can be updated to improve accuracy of recognition and to enhance reproduction of speech. The updated user profiles are transmitted to the destination terminals as needed.
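To give a rough sense of why sending text instead of coded audio needs so little bandwidth, the sketch below compares an assumed text stream with 64 kbit/s G.711 PCM telephony. The speaking rate, average word length, and 8-bit character coding are illustrative assumptions, not figures from the patent. Packet headers, comfort-noise data, and voice profile transfers are ignored.

```python
# Back-of-the-envelope comparison; all speech-rate numbers are assumptions.
WORDS_PER_MINUTE = 150          # assumed conversational speaking rate
CHARS_PER_WORD = 6              # assumed average, including the trailing space
BITS_PER_CHAR = 8               # assumed 8-bit character coding
G711_BITS_PER_SECOND = 64_000   # G.711 PCM: 8000 samples/s * 8 bits per sample


def text_bitrate(words_per_minute: float, chars_per_word: float, bits_per_char: int) -> float:
    """Return the payload bit rate (bits/s) of speech sent as plain text."""
    chars_per_second = words_per_minute * chars_per_word / 60.0
    return chars_per_second * bits_per_char


rate = text_bitrate(WORDS_PER_MINUTE, CHARS_PER_WORD, BITS_PER_CHAR)
ratio = G711_BITS_PER_SECOND / rate
print(f"text payload: {rate:.0f} bit/s, about {ratio:.0f}x below G.711")
# prints "text payload: 120 bit/s, about 533x below G.711"
```

Under these assumptions the text payload is two to three orders of magnitude smaller than uncompressed telephony audio, which is the sense in which the claims speak of "very low digital data bandwidth".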

20 Claims

1. A method of speech communication using very low digital data bandwidth, comprising:
- providing a bi-directional digital telephony link over a digital packet network between a source terminal and a destination terminal, wherein the source terminal for a forward link serves as the destination terminal for a reverse link on the digital telephony link;
  
  distinguishing between speech and a pause in the speech communication using an ITU voice activity detection module;
  
  providing a comfort noise simulation of background noise for each distinguished pause in the speech communication in time order with the speech;
  
  translating said speech into text at the source terminal;
  
  communicating said text and said time-ordered simulated comfort noise of each pause in speech across the bi-directional digital telephony link to the destination terminal;
  
  determining a status of the telephony link;
  
  ending the communicating if the telephony link is terminated;
  
  generating a speaker voice profile by training the source terminal to recognize words spoken by a speaker for reproduction of audible speech corresponding to words spoken by the speaker having audible qualities approximating that of the speaker;
  
  communicating the voice profile across the telephony link from the source terminal to the destination terminal, wherein the speaker's voice profile contains the information needed to generate the reproduced speech that substantially resembles the sound of the speaker's voice; and
  
  translating said text into reproduced speech received at said destination terminal using the speaker's voice profile and reproducing the simulated background noise in the reproduced speech in time order with each pause in the speech at the source terminal at the destination terminal.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
- - 2. The method of claim 1, further comprising the step of:
    - generating said reproduced speech using a default voice profile.
  - 3. The method of claim 1, further comprising the step of:
    - generating said reproduced speech using a speaker's voice profile at said destination terminal, wherein said speaker's voice profile contains the information needed to generate said reproduced speech that substantially resembles the sound of said speaker's voice.
  - 4. The method of claim 3, further comprising the steps of:
    - generating said reproduced speech using a default voice profile of said destination terminal, until said speaker's voice profile has been communicated across said communication link; and
      
      generating said reproduced speech using said speaker's voice profile at said destination terminal, after said speaker's voice profile has been communicated across said communication link.
  - 5. The method of claim 1, wherein:
    - a portion of said training is performed during a portion of the time said speaker is communicating speech across said communication link.
  - 6. The method of claim 5, wherein:
    - said speaker's voice profile is periodically updated as said speaker uses said method, and said updated profile is periodically communicated to said destination terminal.
  - 7. The method of claim 1, wherein the translating said speech into text comprises translating the speech into digitally coded symbols.
  - 8. The method of claim 1, wherein the translating said speech into text comprises translating the speech into digitally coded characters.
  - 9. The method of claim 1, further comprising:
    - after the ending of the communicating, determining if the telephony link is re-connected; and
      
      continuing the communicating when the determining indicates that the telephony link is re-connected.
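The source-side steps of claim 1 (distinguish speech from pauses, replace each pause with a comfort-noise description, translate speech to text, and keep everything in time order) can be sketched as below. This is a minimal illustration, not the patented implementation: the energy threshold is a stand-in for the ITU voice activity detection module named in the claim, `recognize` is a hypothetical speech-to-text callable, and the `TextSegment` and `ComfortNoise` types are names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union

Frame = Sequence[float]  # one short, non-empty block of audio samples


@dataclass
class TextSegment:
    text: str  # recognized words for one run of speech frames


@dataclass
class ComfortNoise:
    frames: int   # pause length, in frames
    level: float  # mean energy of the pause, used to simulate background noise


Event = Union[TextSegment, ComfortNoise]


def frame_energy(frame: Frame) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def is_speech(frame: Frame, threshold: float = 0.01) -> bool:
    """Stand-in energy detector; a real terminal would call an ITU VAD here."""
    return frame_energy(frame) > threshold


def to_event(run: List[Frame], speech: bool,
             recognize: Callable[[List[Frame]], str]) -> Event:
    """Convert one run of like frames into a text or comfort-noise event."""
    if speech:
        return TextSegment(recognize(run))
    level = sum(frame_energy(f) for f in run) / len(run)
    return ComfortNoise(frames=len(run), level=level)


def encode(frames: List[Frame],
           recognize: Callable[[List[Frame]], str]) -> List[Event]:
    """Turn audio frames into a time-ordered list of text and pause events."""
    events: List[Event] = []
    run: List[Frame] = []
    run_is_speech = False
    for frame in frames:
        speech = is_speech(frame)
        if run and speech != run_is_speech:
            # The speech/pause state changed: close the current run.
            events.append(to_event(run, run_is_speech, recognize))
            run = []
        run_is_speech = speech
        run.append(frame)
    if run:
        events.append(to_event(run, run_is_speech, recognize))
    return events


# Usage with a fake recognizer that only reports how many frames it saw.
loud, quiet = [0.5, -0.5], [0.0, 0.01]
events = encode([loud, loud, quiet, quiet, quiet, loud],
                lambda fs: f"<{len(fs)} frames>")
print([type(e).__name__ for e in events])
```

Because the events are emitted in the order the runs occur, a destination terminal can replay synthesized speech and simulated background noise in the same sequence the speaker produced them.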

10. A method of communicating speech for a telephone conversation across a bi-directional communication link using very low digital data bandwidth, comprising:
- providing a bi-directional digital telephony link over a digital network between a source terminal and a destination terminal, wherein the source terminal for a forward link serves as the destination terminal for a reverse link on the digital telephony link;
  
  distinguishing between a first speaker's speech and a pause in the speech communication of the first speaker using an ITU voice activity detection module;
  
  providing a comfort noise simulation of background noise for each distinguished pause in speech in the speech communication of the first speaker in time order with the first speaker's speech;
  
  translating a first speaker's speech into first text characters at the source terminal;
  
  communicating, from the source terminal, said first text characters and said time-ordered simulated comfort noise of each pause in the first speaker's speech across the digital telephony link to the destination terminal;
  
  determining, from the source terminal, whether the bi-directional digital telephony link has terminated, and if the digital telephony link has terminated then ending the translating and the communicating from the source terminal;
  
  translating said first text characters into first reproduced speech and reproducing the simulated background noise in the reproduced speech of the first speaker in time order with each pause in the speech at the source terminal at the destination terminal;
  
  distinguishing between a second speaker's speech and a pause in the speech communication of the second speaker using an ITU voice activity detection module;
  
  providing a comfort noise simulation of background noise for each distinguished pause in speech in the speech communication of the second speaker in time order with the second speaker's speech;
  
  translating a second speaker's speech into second text characters at the destination terminal;
  
  communicating said second text characters and said time-ordered simulated comfort noise of each pause in the second speaker's speech across the digital telephony link to the source terminal;
  
  determining, from the destination terminal, whether the bi-directional digital telephony link has terminated, and if the digital telephony link has terminated then ending the translating and the communicating from the destination terminal; and
  
  translating said second text characters into second reproduced speech and reproducing the simulated background noise in the reproduced speech of the second speaker in time order with each pause in the speech at the source terminal at the source terminal.
- View Dependent Claims (11, 12, 13, 14, 15, 16)
- - 11. The method of claim 10, further comprising:
    - providing the source terminal with a first voice profile of said first speaker; and
      
      providing the destination terminal with a second voice profile of said second speaker;
      
      communicating said first voice profile across said bi-directional digital telephony link to the destination terminal; and
      
      communicating said second voice profile across said bi-directional digital telephony link to said first terminal;
      
      generating said first reproduced speech using said first speaker's voice profile at the destination terminal;
      
      generating said second reproduced speech using said second speaker's voice profile at the source terminal, wherein said first speaker's voice profile contains the information needed to generate said first reproduced speech that substantially resembles the sound of said first speaker's voice, and said second speaker's voice profile contains the information needed to generate said second reproduced speech that substantially resembles the sound of said second speaker's voice.
  - 12. The method of claim 11, further comprising:
    - generating said first reproduced speech using a default voice profile of the destination terminal, until said first speaker's voice profile has been communicated across the digital telephony link;
      
      generating said second reproduced speech using a default voice profile of the source terminal, until said second speaker's voice profile has been communicated across the digital telephony link;
      
      generating said first reproduced speech using said first speaker's voice profile at said second terminal, after said first speaker's voice profile has been communicated across the digital telephony link; and
      
      generating said second reproduced speech using said second speaker's voice profile at the source terminal, after said second speaker's voice profile has been communicated across the digital telephony link.
  - 13. The method of claim 12, wherein:
    - said first text and said first speaker's voice profile are simultaneously communicated across the digital telephony link; and
      
      said second text and second speaker's voice profile are simultaneously communicated across the digital telephony link.
  - 14. The method of claim 12, wherein:
    - said first speaker's voice profile is provided by first training at the source terminal; and
      
      said second speaker's voice profile is provided by second training at the destination terminal, and wherein said first training comprises said first speaker speaking a number of words, which can be pre-determined words, expected words, or unexpected but recognized words, and said second training comprises said second speaker speaking a number of words, which can be pre-determined words, expected words, or unexpected but recognized words.
  - 15. The method of claim 10, wherein the translating the first speaker's speech into the first text characters comprises translating the first speaker's speech into first digitally coded symbols representing the first text characters, and the translating the second speaker's speech into the second text characters comprises translating the second speaker's speech into second digitally coded symbols representing the second text characters.
  - 16. The method of claim 10, wherein the translating the first speaker's speech into the first text comprises translating the first speaker's speech into first digitally coded symbols representing the first text characters, and the translating the second speaker's speech into the second text characters comprises translating the second speaker's speech into second digitally coded symbols representing the second text characters.
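Claims 10 through 12 describe a symmetric arrangement: each terminal is the source for its own user and the destination for the other, and each plays received text in a default voice until the other speaker's profile has arrived. A minimal sketch of that selection logic follows. The `Terminal` and `VoiceProfile` classes are hypothetical names for illustration, and actual speech synthesis is replaced by recording which profile would be used.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class VoiceProfile:
    speaker: str  # whose voice this profile approximates


@dataclass
class Terminal:
    """One end of the link: source for its own user, destination for the peer."""
    name: str
    local_profile: VoiceProfile
    default_profile: VoiceProfile = field(
        default_factory=lambda: VoiceProfile("default"))
    peer_profile: Optional[VoiceProfile] = None
    played: List[Tuple[str, str]] = field(default_factory=list)

    def receive_profile(self, profile: VoiceProfile) -> None:
        """Store the remote speaker's profile once it has crossed the link."""
        self.peer_profile = profile

    def receive_text(self, text: str) -> None:
        """Pick the voice for received text; default until the profile arrives."""
        if self.peer_profile is not None:
            profile = self.peer_profile
        else:
            profile = self.default_profile
        # A real terminal would synthesize audio; here we only record the choice.
        self.played.append((profile.speaker, text))


a = Terminal("A", VoiceProfile("alice"))
b = Terminal("B", VoiceProfile("bob"))

b.receive_text("hello")              # alice's profile has not arrived yet
b.receive_profile(a.local_profile)   # forward link: A's profile reaches B
b.receive_text("how are you")
a.receive_profile(b.local_profile)   # reverse link: B's profile reaches A
a.receive_text("fine, thanks")

print(b.played)  # [('default', 'hello'), ('alice', 'how are you')]
print(a.played)  # [('bob', 'fine, thanks')]
```

Keeping the fallback decision at the destination means text can start flowing immediately, with the voice switching as soon as the profile transfer completes.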

17. A system of communicating speech using very low digital data bandwidth, comprising:
- a digital packet network;
  
  a source terminal and a destination terminal operatively connected over the packet network on bi-directional digital telephony link;
  
  wherein the source terminal for a forward link serves as the destination terminal for a reverse link on the digital telephony link;
  
  wherein the source terminal;
  
  distinguishes between speech and a pause in the speech communication using an ITU voice activity detection module;
  
  provides a comfort noise simulation of background noise for each distinguished pause in the speech communication in time order with the speech, translates speech into text, communicates the text and the time-ordered simulated comfort noise of each pause in speech across the bi-directional digital telephony link to the destination terminal, determines a status of the telephony link, ends the communication if the telephony link is terminated, generates a voice profile by recognizing words spoken by a speaker for reproduction of audible speech corresponding to words spoken by the speaker having audible qualities approximating that of the speaker, and communicates the voice profile across the telephony link to the destination terminal, and wherein the speaker's voice profile contains the information needed to generate the reproduced speech that substantially resembles the sound of the speaker's voice; and
  
  wherein the destination terminal translates the text into reproduced speech using the speaker's voice profile and reproduces the simulated background noise in the reproduced speech in time order with each pause in the speech at the source terminal at the destination terminal.
- View Dependent Claims (18, 19, 20)
- - 18. The method of claim 17, wherein the source terminal translates said speech into text as digitally coded symbols.
  - 19. The method of claim 17, wherein the source terminal translates said speech into text as digitally coded characters.
  - 20. The method of claim 17, further comprising:
    - wherein, after the source terminal communicates the text the source terminal determines if the telephony link is re-connected, and upon detection of re-connection of the telephony link, continues to communicate the text to the destination terminal.
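The link-status behavior in claims 9 and 20 (stop communicating when the link is terminated, continue when it is re-connected) can be sketched as a small send loop. The claims do not say whether text produced during an outage is held or dropped; this sketch assumes it is held and sent after re-connection. The function name and the per-tick `link_up` status sequence are hypothetical.

```python
from typing import Iterable, List, Tuple


def send_while_connected(segments: List[str],
                         link_up: Iterable[bool]) -> List[Tuple[int, str]]:
    """Send queued text segments, one per tick, only while the link is up.

    Returns (tick, segment) pairs describing what was sent and when.
    """
    sent: List[Tuple[int, str]] = []
    pending = iter(segments)
    current = next(pending, None)
    for tick, up in enumerate(link_up):
        if current is None:
            break  # nothing left to send
        if not up:
            continue  # link terminated: pause until it is re-connected
        sent.append((tick, current))
        current = next(pending, None)
    return sent


# Link drops at ticks 1 and 2, then comes back.
log = send_while_connected(["a", "b", "c"], [True, False, False, True, True])
print(log)  # [(0, 'a'), (3, 'b'), (4, 'c')]
```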

Current Assignee
Telogy Networks, Inc. (Texas Instruments, Inc.)
Original Assignee
Texas Instruments, Inc.
Inventors
Krasnanski, Kieth; Taboada, William; Wescott, Doug
Primary Examiner(s)
Knepper; David D.

Application Number

US10/024,315
Publication Number

US 20030120489A1
Time in Patent Office

1,880 Days
Field of Search

None
US Class Current

704/201
CPC Class Codes

G10L 19/0018   Speech coding using phoneti...

H04L 69/04   Protocols for data compress...

H04M 7/006   Networks other than PSTN/IS...
