Syntactic re-ranking of potential transcriptions during automatic speech recognition
First Claim
1. A system for syntactic re-ranking in automatic speech recognition, the system comprising:
- a computer-readable memory storing computer-executable instructions that, when executed by one or more hardware processors, configure the system to:
access acoustic data for a recorded spoken language;
generate a plurality of potential transcriptions for the acoustic data;
score the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions; and
for each particular potential transcription in the plurality of transcriptions:
generate a syntactic likelihood score for the particular potential transcription, wherein the syntactic likelihood score is generated by evaluation of a syntactic structure for the particular potential transcription, and wherein the syntactic structure includes relationships between words included in the particular potential transcription; and
create an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription;
generate a reduced plurality of transcriptions through elimination of one or more particular potential transcripts based on respective adjusted scores of the one or more particular potential transcripts indicating the unlikelihood of the one or more particular potential transcripts; and
output a transcription from the reduced plurality of transcriptions based on the adjusted score of the transcription of the reduced plurality of transcriptions being greater than adjusted scores of other members of the reduced plurality of transcriptions.
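The steps recited in the claim can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the claim says the two scores are "combined" without fixing a formula, so the weighted sum, the `weight` and `threshold` parameters, the `Hypothesis` class, and the toy bigram-penalty scorer are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Hypothesis:
    text: str
    initial_score: float          # assumed: log-likelihood from the recognizer
    syntactic_score: float = 0.0
    adjusted_score: float = 0.0


def rerank(
    hypotheses: List[Hypothesis],
    syntactic_scorer: Callable[[str], float],
    weight: float = 0.5,          # hypothetical interpolation weight
    threshold: float = -6.0,      # hypothetical pruning cutoff
) -> Tuple[Hypothesis, List[Hypothesis]]:
    """Score each hypothesis syntactically, combine, prune, and pick the best."""
    for hyp in hypotheses:
        hyp.syntactic_score = syntactic_scorer(hyp.text)
        # Assumed combination rule: a weighted sum of the two scores.
        hyp.adjusted_score = (
            (1.0 - weight) * hyp.initial_score + weight * hyp.syntactic_score
        )
    # Eliminate hypotheses whose adjusted score marks them as unlikely.
    reduced = [h for h in hypotheses if h.adjusted_score >= threshold]
    if not reduced:
        raise ValueError("every hypothesis was pruned")
    # Output the survivor with the greatest adjusted score.
    best = max(reduced, key=lambda h: h.adjusted_score)
    return best, reduced


# Toy stand-in for a real syntactic model: penalise word pairs that are
# implausible neighbours. The pair list is invented for this example.
BAD_PAIRS = {("they", "our"), ("the", "our"), ("a", "the")}


def toy_syntactic_scorer(text: str) -> float:
    words = text.lower().split()
    violations = sum(1 for pair in zip(words, words[1:]) if pair in BAD_PAIRS)
    return -5.0 * violations


candidates = [
    Hypothesis("they are going to the store", -4.2),
    Hypothesis("they our going to the store", -4.0),
    Hypothesis("the our going to a the store", -9.0),
]
best, reduced = rerank(candidates, toy_syntactic_scorer)
print(best.text)
```

In this toy run the second candidate has the better initial score, but its syntactic penalty pushes its adjusted score below that of the first candidate, and the third candidate falls under the cutoff and is eliminated before the final selection.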
Abstract
A system and method for syntactic re-ranking of possible transcriptions generated by automatic speech recognition are disclosed. A computer system accesses acoustic data for a recorded spoken language and generates a plurality of potential transcriptions for the acoustic data. The computer system scores the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions. For a particular potential transcription in the plurality of transcriptions, the computer system generates a syntactic likelihood score. The computer system creates an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription.
20 Claims
-
7. A method for syntactic re-ranking in automatic speech recognition, the method comprising:
at a computer system with one or more processors:
accessing acoustic data for a recorded spoken language;
generating a plurality of potential transcriptions for the acoustic data;
scoring the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions; and
for each particular potential transcription in the plurality of potential transcriptions:
generating a syntactic likelihood score for the particular potential transcription, wherein generating the syntactic likelihood score includes evaluating a syntactic structure for the particular potential transcription, and wherein the syntactic structure includes relationships between words included in the particular potential transcription; and
creating an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription;
generating a reduced plurality of transcriptions through elimination of one or more particular potential transcripts based on respective adjusted scores of the one or more particular potential transcripts indicating the unlikelihood of the one or more particular potential transcripts; and
outputting a transcription from the reduced plurality of transcriptions based on the adjusted score of the transcription of the reduced plurality of transcriptions being greater than adjusted scores of other members of the reduced plurality of transcriptions.
-
10. The method of claim 7, wherein generating a syntactic likelihood score for the particular potential transcription further comprises, for the particular potential transcription:
analyzing the particular potential transcription to identify a plurality of words in the transcription; and
assigning a part of speech tag to an identified word in the plurality of words in the transcription.
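The two steps of claim 10, identifying the words and assigning a part-of-speech tag, might look like the following minimal sketch. The lexicon, the tag names, and both helper functions are hypothetical; a practical system would more likely use a trained tagger, which the claim does not specify.

```python
import re
from typing import List, Tuple

# Hypothetical toy lexicon mapping words to part-of-speech tags.
LEXICON = {
    "the": "DET",
    "a": "DET",
    "dog": "NOUN",
    "cat": "NOUN",
    "sees": "VERB",
    "barks": "VERB",
}


def identify_words(transcription: str) -> List[str]:
    """Split a transcription into lowercase word tokens."""
    return re.findall(r"[a-z']+", transcription.lower())


def tag_words(words: List[str]) -> List[Tuple[str, str]]:
    """Attach a tag to each word; unknown words get the placeholder 'UNK'."""
    return [(word, LEXICON.get(word, "UNK")) for word in words]


tagged = tag_words(identify_words("The dog sees a cat"))
print(tagged)
```

A tag sequence such as DET NOUN VERB DET NOUN could then feed whatever syntactic evaluation produces the syntactic likelihood score.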
-
16. At least one non-transitory computer-readable storage medium storing instructions that, when executed by the one or more processors of a machine, cause the machine to:
access acoustic data for a recorded spoken language;
generate a plurality of potential transcriptions for the acoustic data;
score the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions; and
for each particular potential transcription in the plurality of transcriptions:
generate a syntactic likelihood score for the particular potential transcription, wherein the syntactic likelihood score is generated by evaluation of a syntactic structure for the particular potential transcription, and wherein the syntactic structure includes relationships between words included in the particular potential transcription; and
create an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription;
generate a reduced plurality of transcriptions through elimination of one or more particular potential transcripts based on respective adjusted scores of the one or more particular potential transcripts indicating the unlikelihood of the one or more particular potential transcripts; and
output a transcription of the reduced plurality of transcriptions based on the adjusted score of the transcription of the reduced plurality of transcriptions being greater than adjusted scores of other members of the reduced plurality of transcriptions.
-
Specification