Methods and systems for providing speech recognition systems based on speech recordings logs
Abstract
Examples of methods and systems for providing speech recognition systems based on speech recordings logs are described. In some examples, a method may be performed by a computing device within a system to generate modified data logs to use as a training data set for an acoustic model for a particular language. A device may receive one or more data logs that comprise at least one or more recordings of spoken queries and transcribe the recordings. Based on comparisons, the device may identify any transcriptions that may be indicative of noise and may remove those transcriptions indicative of noise from the data logs. Further, the device may remove unwanted transcriptions from the data logs and the device may provide the modified data logs as a training data set to one or more acoustic models for particular languages.
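As a rough illustration of the flow the abstract describes, the sketch below transcribes each recording in a data log, drops those flagged as noise, and returns the surviving recordings paired with their transcriptions. The `LogEntry` type, the `transcribe` callable, and the `is_noise` predicate are hypothetical names introduced here for illustration; the patent text does not prescribe an implementation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class LogEntry:
    """One recording of a spoken query (hypothetical structure)."""
    recording_id: str
    audio: bytes


def build_training_set(
    logs: Iterable[LogEntry],
    transcribe: Callable[[LogEntry], str],
    is_noise: Callable[[LogEntry, str], bool],
) -> List[Tuple[LogEntry, str]]:
    """Transcribe each recording and keep only those not flagged as noise."""
    kept = []
    for entry in logs:
        text = transcribe(entry)
        # Recordings flagged as noise are left out of the modified data log.
        if not is_noise(entry, text):
            kept.append((entry, text))
    return kept
```

In practice `transcribe` would wrap a speech recognizer and `is_noise` would wrap whatever comparison is used to detect noise; both are passed in as plain callables here so the sketch stays self-contained.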
20 Claims
1. A method, comprising:
receiving one or more data logs, wherein the one or more data logs comprise at least one or more recordings of spoken queries;
transcribing the one or more recordings of spoken queries;
identifying within transcriptions of the one or more recordings of spoken queries transcriptions having an occurrence exceeding a threshold, wherein the threshold is based on a comparison of the transcriptions with previous transcribed queries;
processing, by a computing device, recordings of spoken queries corresponding to the identified transcriptions using both a language model and an acoustic model;
based on a comparison of the processing using the language model with the processing using the acoustic model, identifying, from the one or more data logs, one or more recordings of spoken queries corresponding to transcriptions deemed to be due to noise and a remainder of the one or more recordings of spoken queries;
generating one or more modified data logs including the remainder of the recordings of spoken queries; and
providing the one or more modified data logs and associated transcriptions of the one or more recordings of spoken queries within the one or more modified data logs as a training data set to update one or more acoustic models for particular languages.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer readable medium having stored therein instructions, that when executed by a computing device, cause the computing device to perform functions comprising:
receiving one or more data logs, wherein the one or more data logs comprise at least one or more recordings of spoken queries;
transcribing the one or more recordings of spoken queries;
identifying within transcriptions of the one or more recordings of spoken queries transcriptions having an occurrence exceeding a threshold, wherein the threshold is based on a comparison of the transcriptions with previous transcribed queries;
processing recordings of spoken queries corresponding to the identified transcriptions using both a language model and an acoustic model;
based on a comparison of the processing using the language model with the processing using the acoustic model, identifying, from the one or more data logs, one or more recordings of spoken queries corresponding to transcriptions deemed to be due to noise and a remainder of the one or more recordings of spoken queries;
generating one or more modified data logs including the remainder of the recordings of spoken queries; and
providing the one or more modified data logs and associated transcriptions of the one or more recordings of spoken queries within the one or more modified data logs as a training data set to update one or more acoustic models for particular languages.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A system, comprising:
at least one processor; and
data storage comprising program instructions executable by the at least one processor to cause the at least one processor to perform functions comprising:
receiving one or more data logs, wherein the one or more data logs comprise at least one or more recordings of spoken queries;
transcribing the one or more recordings of spoken queries;
identifying within transcriptions of the one or more recordings of spoken queries transcriptions having an occurrence exceeding a threshold, wherein the threshold is based on a comparison of the transcriptions with previous transcribed queries;
processing recordings of spoken queries corresponding to the identified transcriptions using both a language model and an acoustic model;
based on a comparison of the processing using the language model with the processing using the acoustic model, identifying, from the one or more data logs, one or more recordings of spoken queries corresponding to transcriptions deemed to be due to noise and a remainder of the one or more recordings of spoken queries;
generating one or more modified data logs including the remainder of the recordings of spoken queries; and
providing the one or more modified data logs and associated transcriptions of the one or more recordings of spoken queries within the one or more modified data logs as a training data set to update one or more acoustic models for particular languages.
Dependent claims: 16, 17, 18, 19, 20
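The identification steps shared by claims 1, 8, and 15 can be sketched as follows. This is one possible reading under stated assumptions, not the claimed implementation. The threshold is assumed to be a multiple of each transcription's relative frequency among previously transcribed queries, and a recording is assumed to be noise when its acoustic model score falls short of its language model score by more than a margin. The function names, the `ratio`, `floor`, and `margin` parameters, and the use of log-probability scores are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def overrepresented(
    current: Iterable[str],
    previous: Iterable[str],
    ratio: float = 3.0,
    floor: float = 0.01,
) -> Set[str]:
    """Return transcriptions whose share of the current logs exceeds a
    threshold derived from their share of previously transcribed queries."""
    cur, prev = Counter(current), Counter(previous)
    n_cur, n_prev = sum(cur.values()), sum(prev.values())
    flagged = set()
    for text, count in cur.items():
        prev_share = prev[text] / n_prev if n_prev else 0.0
        # Assumed threshold rule: a multiple of the historical share,
        # with a small floor for transcriptions never seen before.
        threshold = max(floor, ratio * prev_share)
        if count / n_cur > threshold:
            flagged.add(text)
    return flagged


def split_noise(
    scored: Dict[str, Tuple[str, float, float]],
    flagged: Set[str],
    margin: float = 5.0,
) -> Tuple[List[str], List[str]]:
    """Split recording ids into (noise, remainder).

    `scored` maps recording id -> (transcription, lm_score, am_score),
    where both scores are assumed to be log-probabilities. Only recordings
    whose transcription was flagged are compared.
    """
    noise, remainder = [], []
    for rec_id, (text, lm_score, am_score) in scored.items():
        # Assumed noise rule: the language model finds the text plausible,
        # but the acoustic model says the audio matches it poorly.
        if text in flagged and lm_score - am_score > margin:
            noise.append(rec_id)
        else:
            remainder.append(rec_id)
    return noise, remainder
```

The `remainder` list corresponds to the recordings that would be kept in the modified data logs. Other comparison rules, such as a score ratio or a learned classifier, would fit the same interface.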
Specification