SYSTEMS AND METHODS FOR IMPROVING THE ACCURACY OF A TRANSCRIPTION USING AUXILIARY DATA SUCH AS PERSONAL DATA
First Claim
1. A personal computing device for use with an automatic speech recognition engine, the device comprising:
a communications port configured to receive a data set from the automatic speech recognition engine, wherein the data set includes a word list for candidate words with confidence scores generated by the automatic speech recognition engine in response to audio data;
a display device for displaying information to a user;
memory for at least temporarily storing personal data and executable code for a re-recognition engine, wherein the re-recognition engine is similar to, but has less speech recognition functionality than, the automatic speech recognition engine; and
at least one processor coupled among the communications port, the display device, and the memory, wherein the at least one processor is configured to perform a re-recognition of the data set to generate a transcription of the audio data, and wherein the at least one processor is further configured to:
access the personal data from the memory,
rescore the data set received from the automatic speech recognition engine, using the re-recognition engine, based on the accessed personal data, and
present, via the display device, a transcription of the audio to the user based on the rescored data set, or transmit, via the communications port, the rescored data set to the automatic speech recognition engine.
Abstract
A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription of an audio recording is received. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data associated with the transcribed word is received. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
313 Citations
29 Claims
1. A personal computing device for use with an automatic speech recognition engine, the device comprising:
a communications port configured to receive a data set from the automatic speech recognition engine, wherein the data set includes a word list for candidate words with confidence scores generated by the automatic speech recognition engine in response to audio data;
a display device for displaying information to a user;
memory for at least temporarily storing personal data and executable code for a re-recognition engine, wherein the re-recognition engine is similar to, but has less speech recognition functionality than, the automatic speech recognition engine; and
at least one processor coupled among the communications port, the display device, and the memory, wherein the at least one processor is configured to perform a re-recognition of the data set to generate a transcription of the audio data, and wherein the at least one processor is further configured to:
access the personal data from the memory,
rescore the data set received from the automatic speech recognition engine, using the re-recognition engine, based on the accessed personal data, and
present, via the display device, a transcription of the audio to the user based on the rescored data set, or transmit, via the communications port, the rescored data set to the automatic speech recognition engine.
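The rescoring step of claim 1 can be illustrated with a minimal sketch. It assumes the data set is a list of word slots, each holding candidate words with ASR confidence scores in the range 0 to 1, and that rescoring adds a fixed boost to candidates found in the personal data. The names `Candidate`, `rescore`, `transcribe`, and the `boost` value are hypothetical; the claim does not prescribe any particular scoring rule.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    word: str
    confidence: float  # score from the ASR engine, assumed to lie in [0, 1]


def rescore(slots, personal_words, boost=0.3):
    """Return a rescored copy of the data set.

    Assumption: candidates that appear in the personal data get a fixed
    additive boost, capped at 1.0. This rule is illustrative only.
    """
    known = {w.lower() for w in personal_words}
    rescored = []
    for slot in slots:
        rescored.append([
            Candidate(
                c.word,
                min(1.0, c.confidence + boost) if c.word.lower() in known else c.confidence,
            )
            for c in slot
        ])
    return rescored


def transcribe(slots):
    """Pick the highest-scoring candidate in each slot."""
    return " ".join(max(slot, key=lambda c: c.confidence).word for slot in slots)


# Hypothetical data set: one list of candidates per spoken word.
data_set = [
    [Candidate("call", 0.95)],
    [Candidate("gene", 0.55), Candidate("jean", 0.40), Candidate("Jeanne", 0.35)],
]
personal_data = ["Jeanne", "Seattle"]  # e.g. names taken from a contact list

print(transcribe(data_set))                          # call gene
print(transcribe(rescore(data_set, personal_data)))  # call Jeanne
```

The rescored data set keeps the same shape as the input, so it could either drive the displayed transcription or be sent back to the ASR engine, matching the two alternatives in the claim.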
7. A method of generating a secondary transcription from a primary transcription generated by an automatic speech recognition (ASR) engine, wherein the method is performed by a computing system having a processor and a memory, the method comprising:
maintaining a personal vocabulary that includes replacement words, wherein the replacement words in the personal vocabulary are obtained from personal data associated with a user;
receiving primary transcription data from an audio recording, wherein the primary transcription data is generated by the ASR engine using an ASR vocabulary, wherein the primary transcription data includes a primary transcription and a score associated with words in the primary transcription, and wherein the score is generated by the ASR engine;
identifying at least one replacement word from the personal vocabulary;
comparing the replacement word to at least a portion of the received primary transcription;
producing a modified score associated with the portion of the received primary transcription based at least in part on the comparison; and
generating a secondary transcription using the modified score, wherein the secondary transcription includes at least the one replacement word.
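The compare, modify, and generate steps of claim 7 can be sketched as follows. This is one possible heuristic, not the claimed method itself: it assumes the comparison is a string-similarity ratio from Python's `difflib.SequenceMatcher`, and that the modified score is the ASR score minus `similarity * (1 - asr_score)`, so that a close personal-vocabulary match pulls down a word the ASR engine was already unsure about. The function names and the `threshold` value are hypothetical.

```python
from difflib import SequenceMatcher


def best_match(word, personal_vocabulary):
    """Compare a transcribed word against every replacement word.

    Returns (replacement, similarity); (None, 0.0) if the vocabulary is empty.
    """
    best, best_sim = None, 0.0
    for candidate in personal_vocabulary:
        sim = SequenceMatcher(None, word.lower(), candidate.lower()).ratio()
        if sim > best_sim:
            best, best_sim = candidate, sim
    return best, best_sim


def secondary_transcription(primary, personal_vocabulary, threshold=0.3):
    """Build a secondary transcription from (word, asr_score) pairs.

    Assumed heuristic: modified = asr_score - similarity * (1 - asr_score).
    A word is replaced when its modified score falls below the threshold.
    """
    words = []
    for word, asr_score in primary:
        replacement, similarity = best_match(word, personal_vocabulary)
        modified = asr_score - similarity * (1.0 - asr_score)
        if replacement is not None and modified < threshold:
            words.append(replacement)
        else:
            words.append(word)
    return " ".join(words)


# Hypothetical primary transcription data: words with ASR scores.
primary = [("call", 0.95), ("Jon", 0.45), ("tomorrow", 0.90)]
vocabulary = ["John", "Carl"]

print(secondary_transcription(primary, vocabulary))  # call John tomorrow
```

Weighting the similarity by `1 - asr_score` keeps high-confidence words such as "call" intact even though "Carl" is a fairly close string match.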
17. A method of replacing one or more words in a transcription generated by an automatic speech recognition (ASR) engine, wherein the method is performed by a personal computing system having a processor and a memory, the method comprising:
maintaining a personal vocabulary that includes replacement words, wherein the replacement words in the personal vocabulary are obtained from personal data associated with a user;
receiving a transcription of an audio recording, wherein the transcription is generated by an ASR engine using an ASR vocabulary, wherein the ASR vocabulary is separate from the personal vocabulary, and wherein the transcription includes at least one transcribed word that represents at least one spoken word in the audio recording;
receiving data associated with the transcribed word;
identifying a replacement word from the personal vocabulary; and
replacing the transcribed word with the replacement word.
19. A method of replacing one or more words in a transcription generated by an automatic speech recognition (ASR) engine, wherein the method is performed by a portable computing system having a processor and a memory, the method comprising:
maintaining a personal vocabulary that includes replacement words, wherein the replacement words in the personal vocabulary are obtained from personal data associated with a user, and wherein the personal data is obtained from: stored contact data for the user, stored calendar data for the user, text-based messages sent or received by the user, or a social network of which the user is a member;
receiving a transcription of an audio recording, wherein the transcription is generated by the ASR engine using an ASR vocabulary, wherein the transcription includes a transcribed word that represents a spoken word in the audio recording, and wherein the ASR engine is located geographically remotely from the portable computing system;
receiving data associated with the transcribed word, wherein the data associated with the transcribed word includes a confidence score, and wherein the confidence score is generated by the ASR engine;
identifying a replacement word from the personal vocabulary;
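The "maintaining a personal vocabulary" step, with personal data drawn from contacts, calendar entries, and messages, might look like the sketch below. It assumes each source is available as a list of plain-text strings and that words already covered by the ASR vocabulary are skipped; that filter is an assumption made here for illustration, as are the function name `build_personal_vocabulary` and the sample data.

```python
import re


def build_personal_vocabulary(sources, asr_vocabulary=frozenset()):
    """Collect replacement words from the user's personal data.

    `sources` maps a source name (contacts, calendar, messages, ...) to a
    list of text strings. Assumption: words the ASR vocabulary already
    contains are left out, since the ASR engine can produce them itself.
    """
    asr_known = {w.lower() for w in asr_vocabulary}
    vocabulary = set()
    for texts in sources.values():
        for text in texts:
            # Words may contain apostrophes or hyphens (e.g. "Quimby's").
            for token in re.findall(r"[A-Za-z][A-Za-z'-]*", text):
                if token.lower() not in asr_known:
                    vocabulary.add(token)
    return vocabulary


# Hypothetical personal data stored on the portable device.
sources = {
    "contacts": ["Jeanne Okafor", "Carl Lindqvist"],
    "calendar": ["Lunch with Jeanne at Quimby's"],
    "messages": ["Running late, see you at the Fremont office"],
}
asr_vocabulary = {"lunch", "with", "at", "running", "late",
                  "see", "you", "the", "office"}

personal_vocabulary = build_personal_vocabulary(sources, asr_vocabulary)
print(sorted(personal_vocabulary))
# ['Carl', 'Fremont', 'Jeanne', 'Lindqvist', 'Okafor', "Quimby's"]
```

A social network source could be added as another key in `sources` without changing the function.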
Specification