Signal processing method and system utilizing logical speech boundaries

US 6,167,374 A
Filed: 02/13/1997
Issued: 12/26/2000
Est. Priority Date: 02/13/1997
Status: Expired due to Term

Abstract

A method and system of processing speech information includes segmenting the speech information based upon detection of logical speech boundaries, such as isolated words, prior to compressing and/or transmitting the speech information. In one embodiment, a continuous stream of voice data is analyzed to detect signal segments containing the characteristics of an isolated word, thereby forming frames of speech information. The frames are data compressed to form packets that are transmitted to a remote site. Preferably, the packets include error checking information. In a receive mode, incoming packets are error checked prior to packet decoding. If transmission errors are detected, repairable packets may be corrected. Non-correctable errors cause generation of notice data that are used to notify a listener of the location of lost speech information. Notice data are also generated if the duration between two arriving packets exceeds a preselected threshold.
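
The packet handling described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: `zlib` stands in for a speech codec, and the 8-byte header layout and the names `encode_packet` and `decode_packet` are assumptions made for illustration. The sketch only detects damaged packets; it does not attempt the repair step the abstract mentions.

```python
import struct
import zlib

# Hypothetical header: sequence number and CRC-32 of the payload, network byte order.
HEADER = struct.Struct("!II")


def encode_packet(seq: int, frame: bytes) -> bytes:
    """Compress one frame and prepend a sequence number and checksum."""
    payload = zlib.compress(frame)  # stand-in for a speech codec
    return HEADER.pack(seq, zlib.crc32(payload)) + payload


def decode_packet(packet: bytes):
    """Return (seq, frame), or (seq, None) when the payload fails its check."""
    seq, crc = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:]
    if zlib.crc32(payload) != crc:
        return seq, None  # error detected before any decoding is attempted
    return seq, zlib.decompress(payload)
```

Checking the payload before decompressing mirrors the order given in the abstract, where incoming packets are error checked prior to packet decoding.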

17 Claims

1. A method of processing speech information comprising steps of:
- converting analog voice signals representative of sound of a sequence of spoken words into digital voice signals representative of said sound of said sequence of spoken words;
  
  analyzing said digital voice signals representative of said sound of said sequence of spoken words to detect signal segments representative of isolated words within said sequence of spoken words;
  
  segmenting said digital voice signals representative of said sound of said sequence of spoken words at least partially based upon said detection of said signal segments representative of isolated words, thereby forming frames of digital voice signals; and
  
  data compressing said digital voice signals within said frames, said compressed digital voice signals within said frames having phonetic information that substantially preserves individual sounds of said isolated words within said sequence of spoken words.
  - 2. The method of claim 1 further comprising steps of forming said frames of data compressed digital voice signals into packets and transmitting said packets to a remote site.
  - 3. The method of claim 2 further comprising steps of receiving packets of data compressed digital voice signals from said remote site and error checking said received packets.
  - 4. The method of claim 3 further comprising steps of data decompressing said digital voice signals of said received packets to form a stream of digital voice signals and injecting notice data indicating detection of a transmission error into said stream at a place in said stream of digital voice signals where said step of error checking determines that digital voice signals have been lost.
  - 5. The method of claim 4 wherein said step of injecting notice data indicating detection of a transmission error includes generating continuous-tone data.
  - 6. The method of claim 2 further comprising steps of receiving packets of data compressed digital voice signals from said remote site and detecting when a packet has been lost in transmission from said remote site, including decompressing said data compressed digital voice signals of said received packets to form a continuous stream and injecting notice data indicating detection of a transmission error into said stream in place of a packet that has been lost in transmission.
  - 7. The method of claim 2 further comprising storing said packets of data compressed digital voice signals on a recording medium.
  - 8. The method of claim 1 wherein said step of segmenting includes establishing a time threshold and includes forming said frames based upon limiting each frame to containing the lesser of data specific to an isolated word of said sequence of words and data generated during passage of said time threshold.
  - 9. The method of claim 1 wherein said step of segmenting said digital voice signals representative of the sound of said sequence of words is a step of segmenting said digital voice signals by word, thereby forming single-word frames of data compressed digital voice signals, and further comprising the steps of forming each one of said single-word frames of data compressed digital voice signals into separate single-word packets, and transmitting said single-word packets to a remote site.
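
The segmenting rule of claim 8 (each frame holds the lesser of one isolated word and the data produced during a time threshold) can be sketched as follows. The claims do not say how isolated words are detected, so the sketch assumes a simple energy-based pause detector; `segment_frames`, `window`, `silence_level`, and `max_windows` are hypothetical names and values chosen for illustration.

```python
def segment_frames(samples, window=160, silence_level=500, max_windows=50):
    """Split PCM samples into frames that end at a pause or at the time cap.

    window        -- samples per analysis window (assumed: 20 ms at 8 kHz)
    silence_level -- mean absolute amplitude below which a window is a pause
    max_windows   -- time threshold, expressed in windows
    """
    frames, current = [], []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        loud = sum(abs(s) for s in chunk) / len(chunk) >= silence_level
        if loud:
            current.extend(chunk)
            if len(current) >= max_windows * window:
                frames.append(current)  # time threshold reached mid-word
                current = []
        elif current:
            frames.append(current)      # pause: treat as the end of a word
            current = []
    if current:
        frames.append(current)          # flush speech still open at end of input
    return frames
```

With a generous cap the frames follow word boundaries; with a small cap a long word is cut into several frames, which bounds the delay before each frame can be compressed and sent.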

10. A method of processing speech information for real-time voice communications comprising steps of:
- generating digital voice signals from analog voice signals in response to a voice input of a sequence of words, said digital voice signals containing phonetic information that is representative of individual sounds of said voice input;
  
  analyzing said digital voice signals to recognize logical speech boundaries relating to said sequence of words;
  
  establishing signal segments of said digital voice signals based upon said logical speech boundaries and a threshold time, including forming said signal segments based upon limiting each signal segment to containing the lesser of data specific to a detected isolated word contained in said voice input that is defined by said logical speech boundaries and data generated during passage of said threshold time;
  
  compressing said digital voice signals within each of said signal segments of said digital voice signals, said compressed digital voice signals within each of said signal segments being in a form to substantially preserve said phonetic information that is representative of said individual sounds of said voice input; and
  
  transmitting said signal segments of said compressed digital voice signals to a remote site.
  - 11. The method of claim 10 wherein said step of transmitting includes packetizing said signal segments of said compressed digital voice signals such that each signal segment is associated with a packet.
  - 12. The method of claim 11 further comprising a step of attaching error checking data to each packet to accommodate error checking at said remote site.
  - 13. The method of claim 10 further comprising receiving digital voice signals from said remote site in said signal segments, including implementing error checking to detect lost signal segments and injecting notice data indicating detection of a transmission error in place of a lost signal segment.
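
The receive-side behavior of claim 13, with the continuous-tone notice of claim 5, might look like the sketch below. It assumes segments carry sequence numbers and that the receiver knows how many to expect; neither detail comes from the claims. `notice_tone` and `rebuild_stream` are hypothetical names, and the 440 Hz tone at 8 kHz is an arbitrary choice.

```python
import math


def notice_tone(n_samples=800, freq=440.0, rate=8000, amplitude=8000):
    """Generate a continuous tone used as notice data for lost speech."""
    return [int(amplitude * math.sin(2 * math.pi * freq * i / rate))
            for i in range(n_samples)]


def rebuild_stream(received, expected_count, tone=None):
    """Rebuild the sample stream, injecting the tone where a segment is missing.

    received -- dict mapping sequence number to a decoded list of samples;
                absent keys or None values mark lost or uncorrectable segments
    """
    tone = notice_tone() if tone is None else tone
    stream = []
    for seq in range(expected_count):
        segment = received.get(seq)
        stream.extend(tone if segment is None else segment)
    return stream
```

Because segments follow word boundaries, the tone replaces a whole word rather than an arbitrary slice of audio, so the listener can tell where speech was lost.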

14. A system for processing speech information comprising:
- a speech input device for receiving an analog voice input;
  
  a signal generator responsive to said speech input device for forming digital voice signals at an output, said digital voice signals containing phonetic information that is representative of individual sounds of said analog voice input;
  
  speech recognition means coupled to said output of said signal generator for detecting signal segments within said digital voice signals that represent isolated words, said speech recognition means being configured to form said signal segments based upon limiting each signal segment to containing the lesser of data specific to an isolated word contained in said analog voice input and data generated during passage of a threshold time, said speech recognition means maintaining said digital voice signals as containing said phonetic information that is representative of said individual sounds of said analog voice input; and
  
  compression means, connected to said speech recognition means, for compressing said digital voice signals that are within said signal segments while maintaining said digital voice signals to contain said phonetic information that is representative of said individual sounds of said analog voice input.
  - 15. The system of claim 14 further comprising a transmitter connected to said compression means for transferring said signal segments of compressed digital voice signals to a remote site.
  - 16. The system of claim 15 further comprising a receiver connected to receive signal segments from said remote site, said receiver having error checking means for detecting a missing signal segment.
  - 17. The system of claim 16 wherein said speech input device is a telephone.

Current Assignee: Unify Inc. (Atos SE)
Original Assignee: Siemens Information And Communication Networks, Inc. (Siemens AG)
Inventors: Shaffer, Shmuel; Lai, Dan; Beyda, William J.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Storm, Donald L.
Application Number: US08/800,001
Time in Patent Office: 1,412 Days
Field of Search: 704/215, 704/241, 704/244, 704/243, 704/253, 704/232, 704/245, 704/226-228, 704/210, 375/233
US Class Current: 704/227
CPC Class Codes:
G10L 19/005 Correction of errors induce...
G10L 25/87 Detection of discrete point...
