Speech transfer over packet networks using very low digital data bandwidths
First Claim
1. A method of speech communication using very low digital data bandwidth, comprising:
providing a bi-directional digital telephony link over a digital packet network between a source terminal and a destination terminal, wherein the source terminal for a forward link serves as the destination terminal for a reverse link on the digital telephony link;
distinguishing between speech and a pause in the speech communication using an ITU voice activity detection module;
providing a comfort noise simulation of background noise for each distinguished pause in the speech communication in time order with the speech;
translating said speech into text at the source terminal;
communicating said text and said time-ordered simulated comfort noise of each pause in speech across the bi-directional digital telephony link to the destination terminal;
determining a status of the telephony link;
ending the communicating if the telephony link is terminated;
generating a speaker voice profile by training the source terminal to recognize words spoken by a speaker for reproduction of audible speech corresponding to words spoken by the speaker having audible qualities approximating that of the speaker;
communicating the voice profile across the telephony link from the source terminal to the destination terminal, wherein the speaker's voice profile contains the information needed to generate the reproduced speech that substantially resembles the sound of the speaker's voice; and
translating said text into reproduced speech received at said destination terminal using the speaker's voice profile and reproducing the simulated background noise in the reproduced speech in time order with each pause in the speech at the source terminal at the destination terminal.
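The time-ordered interleaving of text and comfort noise in this claim can be pictured with a small sketch. It is not taken from the patent: the `Frame`, `SpeechSegment`, and `PauseSegment` types, the 20 ms frame length, and the use of a single noise level per pause are all assumptions made for illustration, and the frames are assumed to arrive already labelled by a voice activity detector and a speech recognizer.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

FRAME_MS = 20  # assumed frame length; not specified in the claim


@dataclass
class Frame:
    """One audio frame, already labelled by a VAD and recognizer (assumed)."""
    is_speech: bool
    word: Optional[str] = None      # word completed in this frame, if any
    noise_level_db: float = -60.0   # background level measured in a pause


@dataclass
class SpeechSegment:
    text: str


@dataclass
class PauseSegment:
    duration_ms: int
    noise_level_db: float


Segment = Union[SpeechSegment, PauseSegment]


def build_segments(frames: List[Frame]) -> List[Segment]:
    """Group consecutive frames into time-ordered speech and pause segments."""
    segments: List[Segment] = []
    words: List[str] = []
    pause: List[Frame] = []

    def flush_speech() -> None:
        if words:
            segments.append(SpeechSegment(" ".join(words)))
            words.clear()

    def flush_pause() -> None:
        if pause:
            mean = sum(f.noise_level_db for f in pause) / len(pause)
            segments.append(PauseSegment(len(pause) * FRAME_MS, mean))
            pause.clear()

    for frame in frames:
        if frame.is_speech:
            flush_pause()
            if frame.word:
                words.append(frame.word)
        else:
            flush_speech()
            pause.append(frame)
    flush_speech()
    flush_pause()
    return segments


def render(segments: List[Segment]) -> List[str]:
    """Destination side: describe what would be played, in the same order."""
    out = []
    for seg in segments:
        if isinstance(seg, SpeechSegment):
            out.append(f"speak: {seg.text}")
        else:
            out.append(f"noise: {seg.duration_ms} ms at {seg.noise_level_db} dB")
    return out
```

Only the short segment descriptors would cross the link in this sketch; the destination synthesizes speech from the text and regenerates noise for each pause at the position where it occurred.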
Abstract
A method of communicating speech across a communication link using very low digital data bandwidth is disclosed, having the steps of: translating speech into text at a source terminal; communicating the text across the communication link to a destination terminal; and translating the text into reproduced speech at the destination terminal. In a preferred embodiment, a speech profile corresponding to the speaker is used to reproduce the speech at the destination terminal so that the reproduced speech more closely approximates the original speech of the speaker. A default voice profile is used to recreate speech when a user profile is unavailable. User specific profiles can be created during training prior to communication or can be created during communication from actual speech. The user profiles can be updated to improve accuracy of recognition and to enhance reproduction of speech. The updated user profiles are transmitted to the destination terminals as needed.
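The profile handling described in the abstract (a default profile when no user profile is available, and updated profiles sent only as needed) might look roughly like the following at the destination terminal. This is a sketch under stated assumptions: the `VoiceProfile` fields, the integer version number, and the `ProfileCache` class are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class VoiceProfile:
    speaker_id: str
    version: int      # assumed: incremented whenever the profile is retrained
    pitch_hz: float   # illustrative parameter only


DEFAULT_PROFILE = VoiceProfile("default", 0, 120.0)


class ProfileCache:
    """Destination-side store of speaker profiles (hypothetical)."""

    def __init__(self) -> None:
        self._profiles: Dict[str, VoiceProfile] = {}

    def lookup(self, speaker_id: str) -> VoiceProfile:
        # Fall back to the default voice when no user profile is held.
        return self._profiles.get(speaker_id, DEFAULT_PROFILE)

    def needs_update(self, offered: VoiceProfile) -> bool:
        # A transfer is needed only if the offered profile is new or newer.
        held = self._profiles.get(offered.speaker_id)
        return held is None or held.version < offered.version

    def store(self, profile: VoiceProfile) -> bool:
        """Store the profile if it is needed; return True if it was stored."""
        if self.needs_update(profile):
            self._profiles[profile.speaker_id] = profile
            return True
        return False
```

Checking `needs_update` before sending keeps profile traffic off the link unless the destination lacks the current version, which fits the low-bandwidth goal.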
20 Claims
1. A method of speech communication using very low digital data bandwidth, comprising:
providing a bi-directional digital telephony link over a digital packet network between a source terminal and a destination terminal, wherein the source terminal for a forward link serves as the destination terminal for a reverse link on the digital telephony link;
distinguishing between speech and a pause in the speech communication using an ITU voice activity detection module;
providing a comfort noise simulation of background noise for each distinguished pause in the speech communication in time order with the speech;
translating said speech into text at the source terminal;
communicating said text and said time-ordered simulated comfort noise of each pause in speech across the bi-directional digital telephony link to the destination terminal;
determining a status of the telephony link;
ending the communicating if the telephony link is terminated;
generating a speaker voice profile by training the source terminal to recognize words spoken by a speaker for reproduction of audible speech corresponding to words spoken by the speaker having audible qualities approximating that of the speaker;
communicating the voice profile across the telephony link from the source terminal to the destination terminal, wherein the speaker's voice profile contains the information needed to generate the reproduced speech that substantially resembles the sound of the speaker's voice; and
translating said text into reproduced speech received at said destination terminal using the speaker's voice profile and reproducing the simulated background noise in the reproduced speech in time order with each pause in the speech at the source terminal at the destination terminal.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method of communicating speech for a telephone conversation across a bi-directional communication link using very low digital data bandwidth, comprising:
providing a bi-directional digital telephony link over a digital network between a source terminal and a destination terminal, wherein the source terminal for a forward link serves as the destination terminal for a reverse link on the digital telephony link;
distinguishing between a first speaker's speech and a pause in the speech communication of the first speaker using an ITU voice activity detection module;
providing a comfort noise simulation of background noise for each distinguished pause in speech in the speech communication of the first speaker in time order with the first speaker's speech;
translating a first speaker's speech into first text characters at the source terminal;
communicating, from the source terminal, said first text characters and said time-ordered simulated comfort noise of each pause in the first speaker's speech across the digital telephony link to the destination terminal;
determining, from the source terminal, whether the bi-directional digital telephony link has terminated, and if the digital telephony link has terminated then ending the translating and the communicating from the source terminal;
translating said first text characters into first reproduced speech and reproducing the simulated background noise in the reproduced speech of the first speaker in time order with each pause in the speech at the source terminal at the destination terminal;
distinguishing between a second speaker's speech and a pause in the speech communication of the second speaker using an ITU voice activity detection module;
providing a comfort noise simulation of background noise for each distinguished pause in speech in the speech communication of the second speaker in time order with the second speaker's speech;
translating a second speaker's speech into second text characters at the destination terminal;
communicating said second text characters and said time-ordered simulated comfort noise of each pause in the second speaker's speech across the digital telephony link to the source terminal;
determining, from the destination terminal, whether the bi-directional digital telephony link has terminated, and if the digital telephony link has terminated then ending the translating and the communicating from the destination terminal; and
translating said second text characters into second reproduced speech and reproducing the simulated background noise in the reproduced speech of the second speaker in time order with each pause in the speech at the source terminal at the source terminal.
Dependent claims: 11, 12, 13, 14, 15, 16
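The symmetric structure of this claim, where each terminal both sends and receives text and stops once the link has terminated, can be illustrated with an in-memory stand-in for the link. The `Link` and `Terminal` classes below are hypothetical and heavily simplified (no speech recognition, synthesis, or comfort noise), and are only meant to show the status check that gates sending in each direction.

```python
from collections import deque
from typing import Deque, Dict, List


class Link:
    """In-memory stand-in for the bi-directional telephony link (hypothetical)."""

    def __init__(self) -> None:
        self.active = True
        # One inbox per terminal; "A" and "B" are illustrative names.
        self.inboxes: Dict[str, Deque[str]] = {"A": deque(), "B": deque()}

    def send(self, to: str, text: str) -> None:
        if not self.active:
            raise ConnectionError("link terminated")
        self.inboxes[to].append(text)

    def receive(self, at: str) -> List[str]:
        inbox = self.inboxes[at]
        messages = list(inbox)
        inbox.clear()
        return messages


class Terminal:
    """Acts as source on the forward link and destination on the reverse link."""

    def __init__(self, name: str, peer: str, link: Link) -> None:
        self.name = name
        self.peer = peer
        self.link = link

    def talk(self, utterances: List[str]) -> int:
        """Send recognized text until done or the link ends; return count sent."""
        sent = 0
        for text in utterances:
            if not self.link.active:  # status check before each send
                break
            self.link.send(self.peer, text)
            sent += 1
        return sent

    def listen(self) -> List[str]:
        """Return what would be synthesized from the peer's text."""
        return [f"{self.peer} says: {t}" for t in self.link.receive(self.name)]
```

A short usage example: with `a = Terminal("A", "B", link)` and `b = Terminal("B", "A", link)`, calling `a.talk([...])` followed by `b.listen()` models the forward link, and swapping the roles models the reverse link.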
17. A system of communicating speech using very low digital data bandwidth, comprising:
a digital packet network;
a source terminal and a destination terminal operatively connected over the packet network on a bi-directional digital telephony link;
wherein the source terminal for a forward link serves as the destination terminal for a reverse link on the digital telephony link;
wherein the source terminal:
distinguishes between speech and a pause in the speech communication using an ITU voice activity detection module;
provides a comfort noise simulation of background noise for each distinguished pause in the speech communication in time order with the speech,
translates speech into text,
communicates the text and the time-ordered simulated comfort noise of each pause in speech across the bi-directional digital telephony link to the destination terminal,
determines a status of the telephony link,
ends the communication if the telephony link is terminated,
generates a voice profile by recognizing words spoken by a speaker for reproduction of audible speech corresponding to words spoken by the speaker having audible qualities approximating that of the speaker, and
communicates the voice profile across the telephony link to the destination terminal, and wherein the speaker's voice profile contains the information needed to generate the reproduced speech that substantially resembles the sound of the speaker's voice; and
wherein the destination terminal translates the text into reproduced speech using the speaker's voice profile and reproduces the simulated background noise in the reproduced speech in time order with each pause in the speech at the source terminal at the destination terminal.
Dependent claims: 18, 19, 20
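To give a rough sense of what "very low digital data bandwidth" could mean when text is sent instead of audio, the following back-of-the-envelope calculation compares a text stream with 64 kbit/s PCM telephony (8 kHz sampling at 8 bits per sample, the G.711 rate). The speaking rate of 150 words per minute, the 6 characters per word (including the space), and the 8 bits per character are assumptions chosen for illustration and do not come from the patent; packet headers, comfort noise descriptors, and profile transfers are ignored.

```python
def text_bitrate_bps(words_per_minute: float,
                     chars_per_word: float,
                     bits_per_char: int = 8) -> float:
    """Payload bit rate of speech sent as plain text (headers ignored)."""
    return words_per_minute / 60.0 * chars_per_word * bits_per_char


PCM_BPS = 8000 * 8  # 8 kHz x 8 bits/sample = 64,000 bit/s

# Assumed conversational figures, for illustration only.
rate = text_bitrate_bps(words_per_minute=150, chars_per_word=6)
ratio = PCM_BPS / rate

print(f"text payload: {rate:.0f} bit/s")
print(f"PCM is about {ratio:.0f}x larger")
```

Under these assumptions the text payload is on the order of a hundred bits per second, several hundred times smaller than uncompressed telephony audio; real figures would depend on packet overhead and on how pauses and profiles are encoded.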
Specification