Dual uplink pre-processing paths for machine and human listening

US 9,449,602 B2
Filed: 12/03/2013
Issued: 09/20/2016
Est. Priority Date: 12/03/2013
Status: Active Grant

Abstract

In some implementations, a device for providing dual uplink processing paths may include a human listening (HL) input processing unit configured to receive an audio stream and pre-process the audio stream to create a first audio signal adapted for human listening via a first uplink processing path, a machine listening (ML) input processing unit configured to receive the audio stream and pre-process the audio stream to create a second audio signal adapted for machine listening via a second uplink processing path, and a network interface unit configured to transmit the first audio signal via the first uplink processing path and transmit the second audio signal via the second uplink processing path to a remote server.
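
As a rough illustration of the arrangement the abstract describes, the following Python sketch feeds one audio stream to two independent pre-processing functions in parallel and hands each result to its own uplink. The names `hl_preprocess`, `ml_preprocess`, and `process_dual_uplink`, the `tanh` soft limiter, the fixed gain of 0.5, and the in-memory lists standing in for uplinks are all assumptions made for illustration; the patent record does not specify these algorithms.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def hl_preprocess(samples):
    """Hypothetical human-listening path: level-dependent (non-linear) gain."""
    # tanh soft-limits loud samples more strongly than quiet ones.
    return [math.tanh(2.0 * s) for s in samples]


def ml_preprocess(samples):
    """Hypothetical machine-listening path: one fixed (linear) gain."""
    return [0.5 * s for s in samples]


def process_dual_uplink(samples, hl_uplink, ml_uplink):
    """Run both paths on the same input in parallel, then pass each result
    to its own uplink (plain lists stand in for network links here)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        hl_future = pool.submit(hl_preprocess, samples)
        ml_future = pool.submit(ml_preprocess, samples)
        hl_uplink.append(hl_future.result())
        ml_uplink.append(ml_future.result())


audio_stream = [0.0, 0.1, 0.5, -0.8]
hl_link, ml_link = [], []
process_dual_uplink(audio_stream, hl_link, ml_link)
```

Both functions receive the same unmodified input, so neither path sees the other's processing; this is the property the sketch is meant to show, not a realistic signal chain.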

19 Claims

1. A device for providing dual uplink processing paths, the device comprising:
- at least one semi-conductor processor;
  
  a non-transitory computer-readable medium storing instructions, when executed by the at least one semi-conductor processor, are configured to implement;
  
  parallel uplink processing paths including a first uplink processing path for processing and uploading audio signals adapted for human listening to at least one remote server and a second uplink processing path for processing and uploading audio signals adapted for machine listening to the at least one remote server, the first and second uplink processing paths being separate uplink communication links,
  
  the parallel uplink processing paths configured to receive an audio stream representing speech from a user and apply two different pre-processing algorithms separately and in parallel to generate a first audio signal adapted for human listening and a second audio signal adapted for machine listening, the two different pre-processing algorithms applying different noise reduction and compression techniques on the speech,
  
  the first uplink processing path configured to apply a first pre-processing algorithm to the audio stream to create the first audio signal adapted for human listening such that the first audio signal includes a non-linear gain, artifacts apart from the speech of the user, and multiple background sound levels, the second uplink processing path configured to apply a second pre-processing algorithm to the audio stream to create the second audio signal adapted for machine listening such that the second audio signal includes a linear gain and a substantially constant background sound level, the second audio signal being devoid of the artifacts of the first audio signal; and
  
  a network interface unit configured to concurrently and separately transmit the first audio signal and the second audio signal to the at least one remote server such that the first audio signal is transmitted via the first uplink processing path and the second audio signal is transmitted via the second uplink processing path.
  - 2. The device of claim 1, wherein the audio stream includes a background signal level change, the background signal level change being a change in background sound levels, the first uplink processing path configured to permit the background signal level change within the first audio signal when applying the first pre-processing algorithm to the audio stream, the second uplink processing path configured to tune the background signal level change to the substantially constant background sound level within the second audio signal when applying the second pre-processing algorithm to the audio stream.
  - 3. The device of claim 1, wherein the first uplink processing path is configured to apply the first pre-processing algorithm to create a non-linear audio signal as the first audio signal, and the second uplink processing path is configured to apply the second pre-processing algorithm to create a linear audio signal as the second audio signal.
  - 4. The device of claim 1, wherein the first uplink processing path is configured to apply the first pre-processing algorithm to permit insertion of the artifacts into the first audio signal such that the first audio signal includes the artifacts, the second uplink processing path configured to apply the second pre-processing algorithm to block the insertion of the artifacts into the second audio signal such that the second audio signal is devoid of the artifacts inserted into the first audio signal.
  - 5. The device of claim 1, further comprising at least one microphone configured to receive the audio stream, and provide the audio stream to the parallel uplink processing paths such that the first uplink processing path and the second uplink processing path pre-process the audio stream in parallel.
  - 6. The device of claim 1, wherein the noise reduction techniques include at least one of active noise control (ANC), active noise reduction (ANR), acoustic echo canceller (AEC), acoustic echo suppressor (AES), acoustic noise canceller (ANC), and noise suppressor (NS).
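
Claims 1 and 3 above distinguish a linear gain (machine listening) from a non-linear gain (human listening). The Python sketch below shows the general difference under stated assumptions: `linear_gain` applies one constant multiplier, while `compressor_gain` is a simple static compressor whose effective gain depends on the sample level. Both function names, the threshold, and the ratio are hypothetical choices for illustration and are not taken from the patent.

```python
def linear_gain(samples, gain=2.0):
    """Same multiplier for every sample, so output scales with input."""
    return [gain * s for s in samples]


def compressor_gain(samples, threshold=0.5, ratio=4.0):
    """Static compressor: the part of the magnitude above `threshold`
    is divided by `ratio`, so loud samples get less gain than quiet ones."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out


# Linear: every sample is multiplied by the same factor.
linear_out = linear_gain([0.25, -0.5])                 # [0.5, -1.0]

# Non-linear: the quiet sample passes unchanged, the loud ones are reduced.
compressed_out = compressor_gain([0.25, 1.0, -1.0])    # [0.25, 0.625, -0.625]
```

With a linear gain the relative levels of speech and background are preserved, which is one plausible reason a recognizer-facing path would prefer it; the record itself does not state the rationale.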

7. A method for processing an audio stream using dual pre-processing paths, the method being performed by at least one semi-conductor processor, the method including:
- providing parallel uplink processing paths including a first uplink processing path for processing and uploading audio signals adapted for human listening to at least one remote server and a second uplink processing path for processing and uploading audio signals adapted for machine listening to the at least one remote server, the machine listening including speech-to-text conversion, the first and second uplink processing paths being separate uplink communication links;
  
  receiving an audio stream representing speech from a user;
  
  applying two different pre-processing algorithms separately and in parallel to generate a first audio signal adapted for human listening and a second audio signal adapted for machine listening, the two different pre-processing algorithms applying different noise reduction techniques on the speech when a bandwidth of the audio stream is decreased, the applying including,
  
  applying a first pre-processing algorithm to the audio stream in the first uplink processing path to create the first audio signal adapted for human listening such that the first audio signal includes a non-linear gain, artifacts apart from the speech of the user, and multiple background sound levels,
  
  applying a second pre-processing algorithm to the audio stream in the second uplink processing path to create the second audio signal adapted for machine listening such that the second audio signal includes a linear gain and a substantially constant background sound level, the second audio signal being devoid of the artifacts of the first audio signal; and
  
  concurrently and separately transmitting the first audio signal and the second audio signal to the at least one remote server such that the first audio signal is transmitted via a first uplink processing link and the second audio signal is transmitted via a second uplink processing link.
  - 8. The method of claim 7, wherein the audio stream includes a background signal level change, the background signal level change being a change in background sound levels, the applying the first pre-processing algorithm to the audio stream in the first uplink processing path to create the first audio signal adapted for human listening includes permitting the background signal level change within the first audio signal, and the applying the second pre-processing algorithm to the audio stream in the second uplink processing path to create the second audio signal adapted for machine listening includes tuning the background signal level change to the substantially constant background sound level within the second audio signal.
  - 9. The method of claim 7, wherein the applying the first pre-processing algorithm to the audio stream in the first uplink processing path to create the first audio signal adapted for human listening includes creating a non-linear audio signal as the first audio signal, and the applying the second pre-processing algorithm to the audio stream in the second uplink processing path to create the second audio signal adapted for machine listening includes creating a linear audio signal as the second audio signal.
  - 10. The method of claim 7, wherein the applying the first pre-processing algorithm to the audio stream in the first uplink processing path to create the first audio signal adapted for human listening includes permitting insertion of the artifacts into the first audio signal such that the first audio signal includes the artifacts, and the applying the second pre-processing algorithm to the audio stream in the second uplink processing path to create the second audio signal adapted for machine listening includes blocking the insertion of the artifacts into the second audio signal such that the second audio signal is devoid of the artifacts inserted into the first audio signal.

11. A non-transitory computer-readable medium storing executable instructions, when executed by at least one semi-conductor processor, are configured to:
- provide parallel uplink processing paths including a first uplink processing path for processing and uploading server audio signals adapted for human listening to at least one remote server and a second uplink processing path for processing and uploading audio signals adapted for machine listening to the at least one remote server, the machine listening includes voice command recognition, the first and second uplink processing paths being separate uplink communication links;
  
  receive an audio stream representing speech from a user via at least one microphone;
  
  apply two different pre-processing algorithms separately and in parallel to generate a first audio signal adapted for human listening and a second audio signal adapted for machine listening, the two different pre-processing algorithms applying different noise reduction and suppression techniques on the speech, including,
  
  apply a first pre-processing algorithm to the audio stream in the first uplink processing path to create the first audio signal adapted for human listening such that the first audio signal includes a non-linear gain, artifacts apart from the speech of the user, and multiple background sound levels,
  
  apply a second pre-processing algorithm to the audio stream in the second uplink processing path to create the second audio signal adapted for machine listening such that the second audio signal includes a linear gain and a substantially constant background sound level, the second audio signal being devoid of the artifacts of the first audio signal; and
  
  concurrently and separately transmit the first audio signal and the second audio signal to the at least one remote server such that the first audio signal is transmitted via the first uplink processing path and the second audio signal is transmitted via the second uplink processing path.
  - 12. The non-transitory computer-readable medium of claim 11, wherein the audio stream includes a background signal level change, the background signal level change being a change in background sound levels, the executable instructions to apply the first pre-processing algorithm to the audio stream in the first uplink processing path to create the first audio signal adapted for human listening includes executable instructions to permit the background signal level change within the first audio signal, and the executable instructions to apply the second pre-processing algorithm to the audio stream in the second uplink processing path to create the second audio signal adapted for machine listening includes executable instructions to tune the background signal level change to the substantially constant background sound level within the second audio signal.
  - 13. The non-transitory computer-readable medium of claim 11, wherein the executable instructions to apply the first pre-processing algorithm to the audio stream in the first uplink processing path to create the first audio signal adapted for human listening includes executable instructions to create a non-linear audio signal as the first audio signal, and the executable instructions to apply the second pre-processing algorithm to the audio stream in the second uplink processing path to create the second audio signal adapted for machine listening includes executable instructions to create a linear audio signal as the second audio signal.
  - 14. The non-transitory computer-readable medium of claim 11, wherein the executable instructions to apply the first pre-processing algorithm to the audio stream in the first uplink processing path to create the first audio signal adapted for human listening includes executable instructions to permit insertion of the artifacts into the first audio signal such that the first audio signal includes the artifacts, and the executable instructions to apply the second pre-processing algorithm to the audio stream in the second uplink processing path to create the second audio signal adapted for machine listening includes executable instructions to block the insertion of the artifacts into the second audio signal such that the second audio signal is devoid of the artifacts inserted into the first audio signal.

15. A device comprising:
- at least one semi-conductor processor;
  
  a non-transitory computer-readable medium storing instructions, when executed by the at least one semi-conductor processor, are configured to implement;
  
  parallel uplink processing paths including a first uplink processing path for processing and uploading audio signals adapted for human listening to at least one remote server and a second uplink processing path for processing and uploading audio signals adapted for machine listening to the at least one remote server, the machine listening being associated with a speech-to-text conversion application, the human listening being associated with a voice application, the first and second uplink processing paths being separate uplink communication links,
  
  the parallel uplink processing paths configured to receive an audio stream representing speech from a user and apply two different pre-processing algorithms in parallel to generate a first audio signal adapted for human listening and a second audio signal adapted for machine listening, the two different pre-processing algorithms applying different noise reduction techniques on the speech when a bandwidth of the audio stream is decreased,
  
  the first uplink processing path configured to apply a first pre-processing algorithm to the audio stream to create the first audio signal adapted for human listening such that the first audio signal includes a non-linear gain, artifacts apart from the speech of the user, and multiple background sound levels, the second uplink processing path configured to apply a second pre-processing algorithm to the audio stream to create the second audio signal adapted for machine listening such that the second audio signal includes a linear gain and a substantially constant background sound level, the second audio signal being devoid of the artifacts of the first audio signal; and
  
  a network interface unit configured to concurrently and separately transmit, over a network, the first audio signal and the second audio signal to the at least one remote server such that the first audio signal is transmitted via the first uplink processing path and the second audio signal is transmitted via the second uplink processing path,
  
  the network interface unit configured to receive, over the network, text information of the speech of the user corresponding to the second audio signal from the at least one remote server while the voice application is actively processing the first audio signal.
  - 16. The device of claim 15, wherein the audio stream includes a background signal level change, the background signal level change being a change in background sound levels, the first uplink processing path configured to permit the background signal level change within the first audio signal when applying the first pre-processing algorithm to the audio stream, the second uplink processing path configured to tune the background signal level change to the substantially constant background sound level within the second audio signal when applying the second pre-processing algorithm to the audio stream.
  - 17. The device of claim 15, wherein the first uplink processing path is configured to apply the first pre-processing algorithm to create a non-linear audio signal as the first audio signal, and the second uplink processing path is configured to apply the second pre-processing algorithm to create a linear audio signal as the second audio signal.
  - 18. The device of claim 15, further comprising at least one microphone configured to receive the audio stream, and provide the audio stream to the parallel uplink processing paths such that the first uplink processing path and the second uplink processing path pre-process the audio stream in parallel.
  - 19. The device of claim 15, wherein the first uplink processing path is configured to apply the first pre-processing algorithm to permit insertion of the artifacts into the first audio signal such that the first audio signal includes the artifacts, the second uplink processing path configured to apply the second pre-processing algorithm to block the insertion of the artifacts into the second audio signal such that the second audio signal is devoid of the artifacts inserted into the first audio signal.
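
Claim 15 above adds that text for the machine-listening signal is received while the voice application is still handling the human-listening signal. A minimal Python sketch of that concurrency, using `asyncio`, might look like the following. The coroutine names, the placeholder frame labels, and the stand-in transcript string are all hypothetical; no real network or speech-to-text service is involved.

```python
import asyncio


async def send_voice(frames, received):
    """Hypothetical first uplink: stream human-listening frames to a voice app."""
    for frame in frames:
        await asyncio.sleep(0)   # yield control, standing in for network I/O
        received.append(frame)


async def send_for_transcription(frames):
    """Hypothetical second uplink: upload machine-listening frames and
    return text from a stand-in speech-to-text service."""
    await asyncio.sleep(0)
    return f"<transcript of {len(frames)} frames>"


async def run_call(hl_frames, ml_frames):
    voice_log = []
    # gather() runs both coroutines concurrently on one event loop.
    _, text = await asyncio.gather(
        send_voice(hl_frames, voice_log),
        send_for_transcription(ml_frames),
    )
    return voice_log, text


voice_log, text = asyncio.run(run_call(["hl0", "hl1"], ["ml0", "ml1"]))
```

The two coroutines never share data, which mirrors the claim language about separate uplink communication links.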

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Ooi, Leng; Eash, Aaron Matthew; Reid, Dylan
Primary Examiner(s): WOZNIAK, JAMES S
Application Number: US14/095,181
Publication Number: US 20150154964A1
Time in Patent Office: 1,022 Days
Field of Search: 704/200, 704/251, 704/270.1, 704/275, 704/500
US Class Current: 1/1
CPC Class Codes:

G10L 15/30   Distributed recognition, e....

G10L 19/008   Multichannel audio signal c...

G10L 19/02   using spectral analysis, e....

G10L 19/26   Pre-filtering or post-filte...

G10L 21/02   Speech enhancement, e.g. no...
