PRIVACY-SENSITIVE SPEECH MODEL CREATION VIA AGGREGATION OF MULTIPLE USER MODELS
Abstract
Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation data can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Because the audio data and transcriptions have already been processed, the statistics or acoustic model data are devoid of any human-readable or machine-readable information that would enable reconstruction of the audio data. Thus, the converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.
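The local processing step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics are derived, so this example assumes one common choice: per-component frame counts and feature sums, accumulated against the means of an existing acoustic model. The function names (`nearest_component`, `accumulate_statistics`, `build_payload`), the hard nearest-mean assignment, and the JSON payload are all hypothetical and are not taken from the patent. Whether any particular set of statistics actually prevents reconstruction of the audio depends on details the abstract does not give.

```python
import json
from typing import Dict, List, Sequence

Frame = Sequence[float]  # one acoustic feature vector (assumed already extracted on the device)


def nearest_component(frame: Frame, means: Sequence[Frame]) -> int:
    """Return the index of the model mean closest to the frame (squared Euclidean distance)."""
    def sq_dist(mean: Frame) -> float:
        return sum((x - m) ** 2 for x, m in zip(frame, mean))
    return min(range(len(means)), key=lambda k: sq_dist(means[k]))


def accumulate_statistics(frames: Sequence[Frame], means: Sequence[Frame]) -> Dict[str, List]:
    """Accumulate per-component counts and feature sums on the first-party device.

    Only the aggregates are returned; the individual frames are not part of the result.
    """
    dim = len(means[0])
    counts = [0] * len(means)
    sums = [[0.0] * dim for _ in means]
    for frame in frames:
        k = nearest_component(frame, means)  # hard assignment (an assumption of this sketch)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += frame[d]
    return {"counts": counts, "sums": sums}


def build_payload(frames: Sequence[Frame], means: Sequence[Frame]) -> str:
    """Serialize the adaptation data that would be transmitted instead of the audio."""
    return json.dumps(accumulate_statistics(frames, means))


if __name__ == "__main__":
    model_means = [[0.0, 0.0], [10.0, 10.0]]          # hypothetical two-component model
    features = [[1.0, 1.0], [9.0, 11.0], [0.0, 2.0]]  # hypothetical feature frames
    print(build_payload(features, model_means))
```

In this sketch the server would receive only the counts and sums, which it could combine with statistics from other devices to re-estimate the component means. A production system would more likely use soft (posterior-weighted) assignments and would need enough frames per component for the aggregates to be meaningful.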
20 Claims
1. A computer-implemented method of speech recognition processing, the computer-implemented method comprising:
receiving a spoken utterance;
storing audio data from the spoken utterance at a first-party device;
creating adaptation data from the audio data via processing at the first-party device, the adaptation data being in a format that prevents reconstruction of the audio data; and
transmitting the adaptation data to a third-party server.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A system for speech processing, the system comprising:
a processor; and
a memory coupled to the processor, the memory storing instructions that, when executed by the processor, cause the system to perform the operations of:
receiving a spoken utterance;
storing audio data from the spoken utterance at a first-party device;
creating adaptation data from the audio data via processing at the first-party device, the adaptation data being in a format that prevents reconstruction of the audio data; and
transmitting the adaptation data to a third-party server.
Dependent claims: 17, 18, 19
20. A computer program product including a non-transitory computer-storage medium having instructions stored thereon for processing data information, such that the instructions, when carried out by a processing device, cause the processing device to perform the operations of:
receiving a spoken utterance;
storing audio data from the spoken utterance at a first-party device;
creating adaptation data from the audio data via processing at the first-party device, the adaptation data being in a format that prevents reconstruction of the audio data; and
transmitting the adaptation data to a third-party server.
Specification