Voice and data exchange over a packet based network with voice detection
Abstract
A signal processing system discriminates between voice signals and data signals modulated by a voiceband carrier. The signal processing system includes a voice exchange, a data exchange and a call discriminator. The voice exchange is capable of exchanging voice signals between a switched circuit network and a packet based network. The data exchange is capable of exchanging data signals modulated by a voiceband carrier on the switched circuit network with unmodulated data signal packets on the packet based network. The data exchange is performed by demodulating data signals from the switched circuit network for transmission on the packet based network, and modulating data signal packets from the packet based network for transmission on the switched circuit network. The call discriminator is used to selectively enable the voice exchange and the data exchange.
74 Claims
-
1. A method of detecting voice in a signal having first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time, the method comprising:
-
autocorrelating the signal; estimating a characteristic of the autocorrelated signal; detecting voice in the signal as a function of the estimated characteristic; and vacating the voice detection for the second frame if voice is not detected in both the first and third frames.
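The final step of claim 1 can be illustrated with a minimal sketch, which is not the patented implementation. It assumes per-frame boolean decisions are already available, and it reads "not detected in both the first and third frames" as meaning that neither neighbouring frame was classified as voice. The function name `vacate_isolated_detections` is hypothetical.

```python
def vacate_isolated_detections(frame_decisions):
    """Return final per-frame decisions for a list of booleans.

    Assumption: a detection on a frame is vacated (set to False) when
    neither the preceding nor the following frame was detected as voice.
    The first and last frames have only one neighbour and are left as-is.
    """
    final = list(frame_decisions)
    for i in range(1, len(frame_decisions) - 1):
        first = frame_decisions[i - 1]
        second = frame_decisions[i]
        third = frame_decisions[i + 1]
        if second and not first and not third:
            final[i] = False  # isolated detection: vacate it
    return final


if __name__ == "__main__":
    raw = [False, True, False, True, True, False]
    print(vacate_isolated_detections(raw))
```

The neighbours are read from the raw decisions rather than from the already-smoothed list, so vacating one frame does not cascade into the next.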
-
-
2. The method of claim 1 wherein the power threshold is in the range of −45 to −55 dBm.
-
3. The method of claim 1 wherein the characteristic comprises pitch period.
-
4. The method of claim 3 wherein the detection of voice in the signal is further based on the estimated pitch period of the autocorrelated signal being in the range of 60–400 Hz.
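The pitch test of claims 3 and 4 can be sketched as follows. This is one plausible reading, not the method disclosed in the specification: it assumes an 8 kHz sample rate, picks the highest local peak of the autocorrelation away from zero lag, and converts that lag to a frequency because the claim states the range in Hz. All function names are hypothetical.

```python
import math


def autocorrelate(frame, max_lag):
    """Unnormalised autocorrelation for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_pitch_hz(frame, sample_rate_hz=8000):
    """Return the frequency of the strongest autocorrelation peak, or None."""
    max_lag = len(frame) // 2
    r = autocorrelate(frame, max_lag + 1)  # one extra lag for the peak test
    best_lag = None
    for lag in range(1, max_lag + 1):
        is_peak = r[lag] > r[lag - 1] and r[lag] >= r[lag + 1]
        if is_peak and (best_lag is None or r[lag] > r[best_lag]):
            best_lag = lag
    return None if best_lag is None else sample_rate_hz / best_lag


def pitch_in_voice_range(pitch_hz, low_hz=60.0, high_hz=400.0):
    """Check the 60 to 400 Hz range recited in the claim."""
    return pitch_hz is not None and low_hz <= pitch_hz <= high_hz


def make_tone(freq_hz, n_samples=240, sample_rate_hz=8000):
    """Hypothetical test signal: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(n_samples)]


if __name__ == "__main__":
    for f in (200, 1000):
        p = estimate_pitch_hz(make_tone(f))
        print(f, p, pitch_in_voice_range(p))
```

A pure tone is used only to exercise the code; real speech would need a more robust peak picker.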
-
5. The method of claim 4 wherein the characteristic comprises amplitude, and the voice detection comprises detecting the amplitude of the autocorrelated signal with one period shift and with no shift, the voice detection being further based on the amplitude of autocorrelated signal with one period shift being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
6. The method of claim 4 wherein the characteristic comprises peak amplitude, and the voice detection comprises detecting the peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
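The amplitude tests of claims 5 and 6 compare the autocorrelation at a shift against the autocorrelation with no shift. The sketch below is an illustration under stated assumptions: "amplitude" is taken to be the raw autocorrelation value at the given lag, the 0.90 ceiling is one value picked from the claimed 0.75 to 0.90 span, and all names are hypothetical.

```python
def autocorr_at(frame, lag):
    """Unnormalised autocorrelation of the frame at a single lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def shifted_to_unshifted_ratio(frame, lag):
    """Autocorrelation at `lag` divided by the zero-lag value."""
    r0 = autocorr_at(frame, 0)
    if r0 == 0:
        return 0.0  # silent frame: no meaningful ratio
    return autocorr_at(frame, lag) / r0


def ratio_in_band(ratio, low=0.25, high=0.40):
    """Claim 5 style test: ratio at one period shift within 0.25 to 0.40."""
    return low <= ratio <= high


def ratio_below_ceiling(ratio, ceiling=0.90):
    """Claim 6 style test: shifted value below a ceiling (assumed 0.90)."""
    return ratio < ceiling


if __name__ == "__main__":
    ratio = shifted_to_unshifted_ratio([3, 0, 1, 0], 2)
    print(ratio, ratio_in_band(ratio), ratio_below_ceiling(ratio))
```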
-
7. A voice detector, comprising:
-
autocorrelation logic to autocorrelate a signal having first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time; frame based decision logic that detects voice in the signal as a function of the autocorrelated signal; and final decision logic which vacates the detection of voice in the signal for the second frame if voice is not detected by the frame based decision logic for both the first and third frames.
-
-
8. The voice detector of claim 7 further comprising a pitch period tracker to estimate a pitch period of the autocorrelated signal, and wherein the frame based decision logic detects voice in the signal as a function of the estimated pitch period of the autocorrelated signal.
-
9. The voice detector of claim 8 further comprising a power estimator which estimates power of the signal, and wherein the frame based decision logic further compares the estimated power of the signal to a power threshold, the detection of voice in the signal being further a function of the power comparison.
-
10. The voice detector of claim 9 wherein the power threshold is in the range of −45 to −55 dBm, and the detection of voice in the signal is further based on the estimated power exceeding the power threshold.
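The power comparison of claims 9 and 10 can be sketched as below. This is a hedged illustration, not the claimed power estimator: it assumes the samples are voltages across a 600 ohm load, which is a common telephony reference but is not stated in the claims, and it uses −50 dBm as an example threshold from the claimed range. The function names are hypothetical.

```python
import math


def power_dbm(samples_volts, impedance_ohms=600.0):
    """Mean power of the frame in dBm (decibels relative to 1 mW).

    Assumption: samples are volts across `impedance_ohms`.
    """
    if not samples_volts:
        return float("-inf")
    mean_square = sum(v * v for v in samples_volts) / len(samples_volts)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    watts = mean_square / impedance_ohms
    return 10.0 * math.log10(watts / 1e-3)


def exceeds_power_threshold(samples_volts, threshold_dbm=-50.0):
    """True when the estimated power is above the threshold."""
    return power_dbm(samples_volts) > threshold_dbm


if __name__ == "__main__":
    reference = [math.sqrt(0.6)] * 8  # about 0.775 V across 600 ohm is 1 mW
    print(power_dbm(reference), exceeds_power_threshold(reference))
```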
-
11. The voice detector of claim 9 wherein the detection of voice in the signal by the frame based decision logic is further based on the estimated pitch period for the autocorrelated signal being in the range of 60 to 400 Hz.
-
12. The voice detector of claim 11 wherein the frame based decision logic further detects an amplitude for the autocorrelated signal with one period shift and with no shift, the detection of voice in the signal being further based on the amplitude of the autocorrelated signal with one period being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
13. The voice detector of claim 11 wherein the frame based decision logic further detects a peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
-
14. A transmission system, comprising:
-
a telephony device which outputs a signal having first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time; and a voice detector having autocorrelation logic to autocorrelate the signal, frame based decision logic that detects voice in the signal as a function of the autocorrelated signal, and final decision logic which vacates the detection of voice in the signal for the second frame if voice is not detected by the frame based decision logic for both the first and third frames.
-
-
15. The transmission system of claim 14 wherein the voice detector further comprises a pitch period tracker to estimate a pitch period of the autocorrelated signal, and wherein the frame based decision logic detects voice in the signal as a function of the estimated pitch period of the autocorrelated signal.
-
16. The transmission system of claim 15 wherein the voice detector further comprises a power estimator which estimates power of the signal, and wherein the frame based decision logic further compares the estimated power of the signal to at least one power threshold, the detection of voice in the signal being further a function of the power comparison.
-
17. The transmission system of claim 16 wherein the power threshold is in the range of −45 to −55 dBm, and the detection of voice in the signal is further based on the estimated power exceeding the power threshold.
-
18. The transmission system of claim 15 wherein the detection of voice in the signal by the frame based decision logic is further based on the estimated pitch period for the autocorrelated signal being in the range of 60 to 400 Hz.
-
19. The transmission system of claim 18 wherein the frame based decision logic further detects an amplitude for the autocorrelated signal with one period shift and with no shift, the detection of voice in the signal being further based on the amplitude of the autocorrelated signal with one period being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
20. The transmission system of claim 18 wherein the frame based decision logic further detects a peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
-
21. The transmission system of claim 14 wherein the telephony device comprises a telephone.
-
22. The transmission system of claim 14 further comprising a public switched telephone network coupling the telephony device to the voice detector.
-
23. A system for processing a signal, comprising:
-
a voice exchange capable of exchanging voice in the signal between a telephony device and a network; a voiceband data exchange capable of exchanging data in the signal between a data device and the network; a voice detector to detect voice in the signal during the voiceband data exchange, wherein the voice detector comprises autocorrelation logic to autocorrelate the signal and frame based decision logic to detect voice in the signal as a function of the autocorrelated signal; and a resource manager which terminates the voiceband data exchange and invokes the voice exchange when the voice detector detects voice in the signal, wherein the signal comprises first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time, the voice detector further comprising final decision logic which vacates the detection of voice in the signal for the second frame if voice is not detected by the frame based decision logic for both the first and third frames.
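The resource manager behaviour recited in claim 23 can be sketched as a small state machine. This is an illustration only: the class name, the state labels and the per-frame `on_frame` interface are hypothetical, and the boolean passed in is assumed to be the voice detector's final decision for that frame.

```python
class ResourceManager:
    """Switches from the voiceband data exchange to the voice exchange."""

    VOICEBAND_DATA = "voiceband_data"
    VOICE = "voice"

    def __init__(self):
        # Assumption: the call starts in the voiceband data exchange.
        self.active = self.VOICEBAND_DATA

    def on_frame(self, voice_detected):
        """Process one frame decision and return the active exchange."""
        if self.active == self.VOICEBAND_DATA and voice_detected:
            # Terminate the data exchange and invoke the voice exchange.
            self.active = self.VOICE
        return self.active


if __name__ == "__main__":
    manager = ResourceManager()
    print([manager.on_frame(d) for d in (False, False, True, False)])
```

Once the voice exchange is invoked the sketch stays there; switching back to data is outside what this claim recites.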
-
-
24. The signal processing system of claim 23 wherein the voice detector further comprises a pitch period tracker to estimate a pitch period of the autocorrelated signal, and wherein the frame based decision logic detects voice in the signal as a function of the estimated pitch period of the autocorrelated signal to a threshold.
-
25. The signal processing system of claim 24 wherein the voice detector further comprises a power estimator which estimates power of the signal, and wherein the frame based decision logic further compares the estimated power of the signal to a power threshold, the detection of voice in the signal being further a function of the power comparison.
-
26. The signal processing system of claim 25 wherein the power threshold is in the range of −45 to −55 dBm, and the detection of voice in the signal is further based on the estimated power exceeding the power threshold.
-
27. The signal processing system of claim 24 wherein the detection of voice in the signal by the frame based decision logic is further based on the estimated pitch period for the autocorrelated signal being in the range of 60 to 400 Hz.
-
28. The signal processing system of claim 27 wherein the frame based decision logic further detects an amplitude for the autocorrelated signal with one period shift and with no shift, the detection of voice in the signal being further based on the amplitude of the autocorrelated signal with one period being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
29. The signal processing system of claim 27 wherein the frame based decision logic further detects a peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
-
30. A method of processing a signal, comprising:
-
invoking a data exchange service to exchange data in the signal between a data device and a network; invoking a voice detection service to detect voice in the signal when the data exchange service is invoked, wherein the invoked voice detection service comprises autocorrelating the signal, estimating a characteristic of the autocorrelated signal; and detecting voice in the signal as a function of the estimated characteristic; and terminating the data exchange service and invoking a voice exchange service when the voice detector detects voice in the signal, wherein the signal comprises first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time, the voice detector further comprising vacating the detection of voice in the signal for the second frame if voice is not detected by the frame based decision logic for both the first and third frames.
-
-
31. The method of claim 30 wherein the invoked voice detection service further comprising estimating power of the signal, and comparing the estimated power of the signal to a power threshold, the detection of voice in the signal being further a function of the estimated power comparison.
-
32. The method of claim 31 wherein the power threshold is in the range of −45 to −55 dBm.
-
33. The method of claim 30 wherein the characteristic comprises pitch period.
-
34. The method of claim 33 wherein the detection of voice in the signal is based on an autocorrelation pitch period in the range of 60 to 400 Hz.
-
35. The method of claim 34 wherein the characteristic comprises amplitude, and the invoked voice detection service further comprises detecting the amplitude of the autocorrelated signal with one period shift and with no shift, the detection of voice in the signal being further based on the amplitude of autocorrelated signal with one period shift being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
36. The method of claim 34 wherein the characteristic comprises peak amplitude, and the invoked voice detection service comprises detecting the peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
-
37. Computer-readable media embodying a program of instructions executable by a computer to perform a method of detecting voice in a signal having first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time, the method comprising:
-
autocorrelating the signal; estimating a characteristic of the autocorrelated signal; detecting voice in the signal as a function of the estimated characteristic; and vacating the voice detection for the second frame if voice is not detected in both the first and third frames.
-
-
38. The computer-readable media of claim 37 wherein the power threshold is in the range of −45 to −55 dBm.
-
39. The computer-readable media of claim 37 wherein the characteristic comprises pitch period.
-
40. The computer-readable media of claim 39 wherein the detection of voice in the signal is further based on the estimated pitch period of the autocorrelated signal being in the range of 60 to 400 Hz.
-
41. The computer-readable media of claim 40 wherein the characteristic comprises amplitude, and the voice detection comprises detecting the amplitude of the autocorrelated signal with one period shift and with no shift, the voice detection being further based on the amplitude of autocorrelated signal with one period shift being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
42. The computer-readable media of claim 40 wherein the characteristic comprises peak amplitude, and the voice detection comprises detecting the peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
-
43. A voice detector, comprising:
-
autocorrelation means for autocorrelating a signal having first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time; voice detection means for detecting voice in the signal as a function of the autocorrelated signal, the estimated power and the estimated pitch; and means for vacating the detection of voice in the signal for the second frame if the voice detection means does not detect voice for both the first and third frames.
-
-
44. The voice detector of claim 43 further comprising means for estimating a pitch period of the autocorrelated signal, and wherein the voice detection means detects voice in the signal as a function of the estimated pitch period of the autocorrelated signal.
-
45. The voice detector of claim 44 further comprising power estimation means for estimating power of the signal, and means for comparing the estimated power of the signal to a power threshold, wherein the voice detection means is further adapted to detect voice in the signal as a function of the power comparison.
-
46. The voice detector of claim 45 wherein the power threshold is in the range of −45 to −55 dBm, and the detection of voice in the signal is further based on the estimated power exceeding the power threshold.
-
47. The voice detector of claim 44 wherein the detection of voice in the signal by the voice detection means is further based on the estimated pitch period for the autocorrelated signal being in the range of 60 to 400 Hz.
-
48. The voice detector of claim 47 further comprising amplitude detection means for detecting an amplitude for the autocorrelated signal with one period shift and with no shift, the detection of voice in the signal by the voice detecting means being further based on the amplitude of the autocorrelated signal with one period being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
49. The voice detector of claim 47 further comprising amplitude detection means for detecting a peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal by the voice detection means being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
-
50. A transmission system, comprising:
-
a telephony device which outputs a signal having first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time; a voice detector having autocorrelation means for autocorrelating the signal, and voice detection means for detecting voice in the signal as a function of the autocorrelated signal; and means for vacating the detection of voice in the signal for the second frame if voice is not detected by the voice detection means for both the first and third frames.
-
-
51. The transmission system of claim 50 wherein the voice detector further comprises estimating a pitch period of the autocorrelated signal, and wherein the voice detection means detects voice in the signal as a function of the estimated pitch period of the autocorrelated signal.
-
52. The transmission system of claim 51 wherein the voice detector further comprises power estimation means for estimating power of the signal, and means for comparing the estimated power of the signal to at least one power threshold, wherein the detection means is further adapted to detect voice in the signal as a function of the power comparison.
-
53. The transmission system of claim 52 wherein the power threshold is in the range of −45 to −55 dBm, and the detection of voice in the signal is further based on the estimated power exceeding the power threshold.
-
54. The transmission system of claim 51 wherein the detection of voice in the signal by the voice detection means is further based on the estimated pitch period for the autocorrelated signal being in the range of 60 to 400 Hz.
-
55. The transmission system of claim 54 further comprising amplitude detection means for detecting an amplitude for the autocorrelated signal with one period shift and with no shift, the detection of voice in the signal by the voice detection means being further based on the amplitude of the autocorrelated signal with one period being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
56. The transmission system of claim 54 further comprising amplitude detection means for detecting a peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal by the voice detection means being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
-
57. The transmission system of claim 50 wherein the telephony device comprises a telephone.
-
58. The transmission system of claim 50 further comprising a public switched telephone network coupling the telephony device to the voice detector.
-
59. A system for processing a signal having first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time, comprising:
-
voice means for exchanging voice in the signal between a telephony device and a network; data means for exchanging data in the signal between a data device and the network; a voice detector to detect voice in the signal during the data exchange; means for terminating the data exchange and invoking the voice exchange when the voice detector detects voice in the signal; and means for vacating the detection of voice in the signal for the second frame if the voice detection means does not detect voice for both the first and third frames.
-
-
60. The signal processing system of claim 59 wherein the voice detector comprises autocorrelation means for autocorrelating the signal, and voice detection means for detecting voice in the signal as a function of the autocorrelated signal.
-
61. The signal processing system of claim 60 wherein the voice detector further comprises means for estimating a pitch period of the autocorrelated signal, and wherein the voice detection means detects voice in the signal as a function of the estimated pitch period of the autocorrelated signal.
-
62. The signal processing system of claim 61 wherein the detection of voice in the signal by the voice detection means is further based on the estimated pitch period for the autocorrelated signal being in the range of 60 to 400 Hz.
-
63. The signal processing system of claim 62 further comprising amplitude detection means for detecting an amplitude for the autocorrelated signal with one period shift and with no shift, the detection of voice in the signal by the voice detection means being further based on the amplitude of the autocorrelated signal with one period being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
64. The signal processing system of claim 62 further comprising amplitude detection means for detecting a peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal by the detection means being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
-
65. The signal processing system of claim 59 wherein the voice detector further comprises power estimation means for estimating power of the signal, means for comparing the estimated power of the signal to at least one power threshold, wherein the detection means is further adapted to detect voice as a function of the power comparison.
-
66. The signal processing system of claim 65 wherein the power threshold is in the range of −45 to −55 dBm, and the detection of voice in the signal is further based on the estimated power exceeding the power threshold.
-
67. Computer-readable media embodying a program of instructions executable by a computer to perform a method of processing a signal having first, second and third frames, the first frame preceding the second frame in time and the second frame preceding the third frame in time, the method comprising:
-
invoking a data exchange service to exchange data in the signal between a data device and a network; invoking a voice detection service to detect voice in the signal when the data exchange service is invoked; terminating the data exchange service and invoking a voice exchange service when the voice detector detects voice in the signal; and means for vacating the detection of voice in the signal for the second frame if the voice detection means does not detect voice for both the first and third frames.
-
-
68. The computer-readable media of claim 67 wherein the invoked voice detection service comprises autocorrelating the signal, estimating a characteristic of the autocorrelated signal; and detecting voice in the signal as a function of the estimated characteristic.
-
69. The method of claim 68 wherein the invoked voice detection service further comprising estimating power of the signal, and comparing the estimated power of the signal to a power threshold, the detection of voice in the signal being further a function of the estimated power comparison.
-
70. The computer-readable media of claim 68 wherein the power threshold is in the range of −45 to −55 dBm.
-
71. The computer-readable media of claim 68 wherein the characteristic comprises pitch period.
-
72. The computer-readable media of claim 71 wherein the detection of voice in the signal is further based on an autocorrelation pitch period in the range of 60 to 400 Hz.
-
73. The computer-readable media of claim 72 wherein the characteristic comprises amplitude, and the invoked voice detection service further comprises detecting the amplitude of the autocorrelated signal with one period shift and with no shift, the detection of voice in the signal being further based on the amplitude of autocorrelated signal with one period shift being in the range of 0.25 to 0.40 of the amplitude of the autocorrelated signal with no shift.
-
74. The computer-readable media of claim 72 wherein the characteristic comprises peak amplitude, and the invoked voice detection service comprises detecting the peak amplitude of the autocorrelated signal with no shift and with a shift, the detection of voice in the signal being further based on the peak amplitude of the shifted autocorrelated signal being less than 0.75 to 0.90 of the peak amplitude of the autocorrelated signal with no shift.
Specification