Syntactic re-ranking of potential transcriptions during automatic speech recognition

US 10,242,670 B2
Filed: 09/21/2016
Issued: 03/26/2019
Est. Priority Date: 09/21/2016
Status: Active Grant

Abstract

A system and method for syntactic re-ranking of possible transcriptions generated by automatic speech recognition are disclosed. A computer system accesses acoustic data for a recorded spoken language and generates a plurality of potential transcriptions for the acoustic data. The computer system scores the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions. For a particular potential transcription in the plurality of transcriptions, the computer system generates a syntactic likelihood score. The computer system creates an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription.
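
The abstract does not say how the initial and syntactic scores are combined. As a minimal sketch, assuming a weighted sum of log-likelihoods (one common choice, not stated in the patent), the combination might look like the following. The function name `adjusted_score`, the weight `syntax_weight`, and the candidate scores are all hypothetical.

```python
import math


def adjusted_score(initial_logp: float, syntactic_logp: float,
                   syntax_weight: float = 0.5) -> float:
    """Combine two log-likelihood scores with a weighted sum (assumed scheme)."""
    return (1.0 - syntax_weight) * initial_logp + syntax_weight * syntactic_logp


# Hypothetical candidates: (text, initial log-score, syntactic log-score).
candidates = [
    ("recognize speech", math.log(0.40), math.log(0.60)),
    ("wreck a nice beach", math.log(0.45), math.log(0.05)),
]

# Re-score every candidate, then pick the one with the highest adjusted score.
rescored = [(text, adjusted_score(init, syn)) for text, init, syn in candidates]
best_text, best_score = max(rescored, key=lambda pair: pair[1])
```

In this made-up example the second candidate has the higher initial score, but its low syntactic score pushes it below the first after combination.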

20 Claims

1. A system for syntactic re-ranking in automatic speech recognition, the system comprising:
- a computer-readable memory storing computer-executable instructions that, when executed by one or more hardware processors, configure the system to:
  
  access acoustic data for a recorded spoken language;
  
  generate a plurality of potential transcriptions for the acoustic data;
  
  score the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions; and
  
  for each particular potential transcription in the plurality of transcriptions:
  
  generate a syntactic likelihood score for the particular potential transcription, wherein the syntactic likelihood score is generated by evaluation of a syntactic structure for the particular potential transcription, and wherein the syntactic structure includes relationships between words included in the particular potential transcription; and
  
  create an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription;
  
  generate a reduced plurality of transcriptions through elimination of one or more particular potential transcripts based on respective adjusted scores of the one or more particular potential transcripts indicating the unlikelihood of the one or more particular potential transcripts; and
  
  output a transcription from the reduced plurality of transcriptions based on the adjusted score of the transcription of the reduced plurality of transcriptions being greater than adjusted scores of other members of the reduced plurality of transcriptions.
- View Dependent Claims (2, 3, 4, 5, 6)
- - 2. The system of claim 1, further comprising instructions to rank the plurality of potential transcriptions based on adjusted likelihood scores associated with each potential transcription.
  - 3. The system of claim 2, further comprising instructions to select a transcription from the plurality of potential transcriptions based on the ranking of the plurality of potential transcriptions.
  - 4. The system of claim 1, wherein the instructions to generate the syntactic likelihood score for the particular potential transcription further comprise instructions to, for the particular potential transcription:
    - analyze the particular potential transcription to identify a plurality of words in the transcription; and
      
      assign a part of speech tag to an identified word in the plurality of words in the transcript.
  - 5. The system of claim 4, further comprising instructions to, for the particular potential transcription:
    - construct a syntactic parse tree for the particular potential transcription, based at least in part on part of speech tags associated with the plurality of words in the particular potential transcription.
  - 6. The system of claim 5, further comprising instructions to, for the particular potential transcription:
    - extract a plurality of syntactic features from the syntactic parse tree; and
      
      using a syntactic coherency model, generate a syntactic likelihood score, wherein the syntactic likelihood score is based on syntactic coherency of the particular potential transcription.
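
Claim 1 ends with two steps: eliminating unlikely candidates based on their adjusted scores, and outputting the highest-scoring survivor. The claim does not specify the elimination criterion; the sketch below assumes a fixed score threshold purely for illustration. The names `prune` and `select_best` and all scores are hypothetical.

```python
from typing import List, Tuple

Scored = Tuple[str, float]  # (transcription, adjusted score)


def prune(scored: List[Scored], threshold: float) -> List[Scored]:
    """Drop candidates whose adjusted score falls below the threshold."""
    return [item for item in scored if item[1] >= threshold]


def select_best(reduced: List[Scored]) -> str:
    """Return the transcription with the highest adjusted score."""
    if not reduced:
        raise ValueError("no candidates survived pruning")
    return max(reduced, key=lambda item: item[1])[0]


# Hypothetical adjusted scores (log domain, higher is better).
scored = [
    ("I scream for ice cream", -1.2),
    ("ice cream for I scream", -4.8),
    ("eyes cream four ice cream", -7.5),
]
reduced = prune(scored, threshold=-5.0)
best = select_best(reduced)
```

Other elimination rules, such as keeping only the top k candidates, would fit the same two-step shape.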

7. A method for syntactic re-ranking in automatic speech recognition, the method comprising:
- at a computer system with one or more processors:
  
  accessing acoustic data for a recorded spoken language;
  
  generating a plurality of potential transcriptions for the acoustic data;
  
  scoring the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions; and
  
  for each particular potential transcription in the plurality of potential transcriptions:
  
  generating a syntactic likelihood score for the particular potential transcription, wherein generating the syntactic likelihood score includes evaluating a syntactic structure for the particular potential transcription, and wherein the syntactic structure includes relationships between words included in the particular potential transcription; and
  
  creating an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription;
  
  generating a reduced plurality of transcriptions through elimination of one or more particular potential transcripts based on respective adjusted scores of the one or more particular potential transcripts indicating the unlikelihood of the one or more particular potential transcripts; and
  
  outputting a transcription from the reduced plurality of transcriptions based on the adjusted score of the transcription of the reduced plurality of transcriptions being greater than adjusted scores of other members of the reduced plurality of transcriptions.
- View Dependent Claims (8, 9, 13, 14, 15)
- - 8. The method of claim 7, further comprising:
    - ranking the plurality of potential transcriptions based on adjusted likelihood scores associated with each potential transcription.
  - 9. The method of claim 8, further comprising selecting a potential transcription from the plurality of potential transcriptions based on the ranking of the plurality of potential transcriptions.
  - 13. The method of claim 7, wherein the initial likelihood score is based at least partially on an acoustic analysis of the acoustic data.
  - 14. The method of claim 7, wherein the initial likelihood score is based at least partially on analysis using a statistical word n-gram language model.
  - 15. The method of claim 7, further comprising, prior to generating a syntactic likelihood score for the particular potential transcription, generating a syntactic coherency model using existing syntactic data.
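
Claim 15 recites generating a syntactic coherency model from existing syntactic data but does not define the model. As a rough sketch, assuming the model is a smoothed bigram model over part-of-speech tags (a deliberate simplification, not the patent's stated method), training and scoring could look like this. The function names, the tag set, and the tiny corpus are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Model = Tuple[Counter, Counter, int]  # (bigram counts, context counts, tag-set size)


def train_tag_bigram_model(tagged_corpus: List[List[str]]) -> Model:
    """Count POS-tag bigrams and their left contexts in a tagged corpus."""
    bigrams, contexts, tagset = Counter(), Counter(), set()
    for tags in tagged_corpus:
        padded = ["<s>"] + tags + ["</s>"]
        tagset.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, len(tagset)


def tag_sequence_logp(tags: Sequence[str], model: Model) -> float:
    """Log-probability of a tag sequence under add-one smoothing."""
    bigrams, contexts, vocab = model
    padded = ["<s>"] + list(tags) + ["</s>"]
    return sum(math.log((bigrams[(p, c)] + 1) / (contexts[p] + vocab))
               for p, c in zip(padded, padded[1:]))


# Hypothetical "existing syntactic data": tag sequences of well-formed sentences.
corpus = [
    ["DET", "NOUN", "VERB"],
    ["DET", "ADJ", "NOUN", "VERB"],
    ["PRON", "VERB", "DET", "NOUN"],
]
model = train_tag_bigram_model(corpus)
plausible = tag_sequence_logp(["DET", "NOUN", "VERB"], model)
implausible = tag_sequence_logp(["VERB", "DET", "DET"], model)
```

A tag sequence that resembles the training data receives a higher log-probability than one that does not, which is the property a coherency score needs.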

10. The method of claim 7, wherein generating a syntactic likelihood score for the particular potential transcription further comprises, for the particular potential transcription:
- analyzing the particular potential transcription to identify a plurality of words in the transcription; and
  
  assigning a part of speech tag to an identified word in the plurality of words in the transcription.
- View Dependent Claims (11, 12)
- - 11. The method of claim 10, further comprising, for the particular potential transcription:
    - constructing a syntactic parse tree for the particular potential transcription, based at least in part on part of speech tags associated with the plurality of words in the particular potential transcription.
  - 12. The method of claim 11, further comprising, for a particular potential transcription:
    - extracting a plurality of syntactic features from the syntactic parse tree; and
      
      using a syntactic coherency model, generating a syntactic likelihood score, wherein the syntactic likelihood score is based on syntactic coherency of the particular potential transcription.
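
Claims 11 and 12 describe extracting syntactic features from a parse tree and scoring them with a syntactic coherency model, without naming the features or the model. The sketch below assumes the features are grammar productions (parent label and child labels) and the model is a hand-set table of per-production weights. The tree encoding, the weights, and the unseen-production penalty are all hypothetical.

```python
from collections import Counter
from typing import Dict

# A tree is a nested tuple: (label, child, ...); a leaf is (pos_tag, word).


def production_features(tree: tuple) -> Counter:
    """Count the productions (parent -> child labels) found in a tree."""
    features: Counter = Counter()
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return features  # leaf (pos_tag, word): no production to record
    features[f"{label} -> {' '.join(child[0] for child in children)}"] += 1
    for child in children:
        features.update(production_features(child))
    return features


def coherency_score(features: Counter, weights: Dict[str, float],
                    unseen_penalty: float = -2.0) -> float:
    """Sum per-production weights; unseen productions are penalized."""
    return sum(count * weights.get(name, unseen_penalty)
               for name, count in features.items())


# Hypothetical weights standing in for a trained coherency model.
weights = {"S -> NP VP": 1.0, "NP -> DET NOUN": 0.8, "VP -> VERB": 0.5}

good_tree = ("S", ("NP", ("DET", "the"), ("NOUN", "dog")), ("VP", ("VERB", "barks")))
bad_tree = ("S", ("VP", ("VERB", "barks")), ("NP", ("DET", "the"), ("NOUN", "dog")))

good = coherency_score(production_features(good_tree), weights)
bad = coherency_score(production_features(bad_tree), weights)
```

The two trees contain the same words and tags; only the top-level production differs, and that alone lowers the score of the second tree.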

16. At least one non-transitory computer-readable storage medium storing instructions that, when executed by the one or more processors of a machine, cause the machine to:
- access acoustic data for a recorded spoken language;
  
  generate a plurality of potential transcriptions for the acoustic data;
  
  score the plurality of potential transcriptions to create an initial likelihood score for the plurality of potential transcriptions; and
  
  for each particular potential transcription in the plurality of transcriptions:
  
  generate a syntactic likelihood score for the particular potential transcription, wherein the syntactic likelihood score is generated by evaluation of a syntactic structure for the particular potential transcription, and wherein the syntactic structure includes relationships between words included in the particular potential transcription; and
  
  create an adjusted score for the particular potential transcription by combining the initial likelihood score and the syntactic likelihood score for the particular potential transcription;
  
  generate a reduced plurality of transcriptions through elimination of one or more particular potential transcripts based on respective adjusted scores of the one or more particular potential transcripts indicating the unlikelihood of the one or more particular potential transcripts; and
  
  output a transcription of the reduced plurality of transcriptions based on the adjusted score of the transcription of the reduced plurality of transcriptions being greater than adjusted scores of other members of the reduced plurality of transcriptions.
- View Dependent Claims (17, 18, 19, 20)
- - 17. The computer-readable storage medium of claim 16, the instructions further comprising instructions to:
    - rank the plurality of potential transcriptions based on adjusted likelihood scores associated with each potential transcription.
  - 18. The computer-readable storage medium of claim 17, the instructions further comprising instructions to:
    - select a transcription from the plurality of potential transcriptions based on the ranking of the plurality of potential transcriptions.
  - 19. The computer-readable storage medium of claim 16, wherein the instructions to generate the syntactic likelihood score for the particular potential transcription further comprise instructions to, for the particular potential transcription:
    - analyze the particular potential transcription to identify a plurality of words in the transcription; and
      
      assign a part of speech tag to an identified word in the plurality of words in the transcription.
  - 20. The computer-readable storage medium of claim 19, the instructions further comprising instructions to, for the particular potential transcription:
    - construct a syntactic parse tree for the particular potential transcription, based at least in part on part of speech tags associated with the plurality of words in the particular potential transcription.
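
Claims 19 and 20 cover identifying words, assigning part of speech tags, and constructing a parse tree from those tags. The claims do not name a tagger or parser. The toy sketch below assumes a lookup-table tagger and a shallow parse that only groups determiner, adjective, and noun runs into noun phrases; a real system would use a trained tagger and a full parser. The lexicon, tag names, and function names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical toy lexicon; unknown words default to NOUN.
LEXICON = {"the": "DET", "a": "DET", "big": "ADJ", "dog": "NOUN",
           "cat": "NOUN", "chased": "VERB", "barks": "VERB"}
NP_TAGS = {"DET", "ADJ", "NOUN"}


def pos_tag(transcription: str) -> List[Tuple[str, str]]:
    """Split on whitespace and look each word up in the lexicon."""
    return [(LEXICON.get(word, "NOUN"), word)
            for word in transcription.lower().split()]


def shallow_parse(tagged: List[Tuple[str, str]]) -> tuple:
    """Build a flat tree, grouping DET/ADJ/NOUN runs into NP nodes."""
    children, chunk = [], []
    for tag, word in tagged:
        if tag in NP_TAGS:
            chunk.append((tag, word))
            continue
        if chunk:  # a non-NP tag closes the current noun phrase
            children.append(("NP", *chunk))
            chunk = []
        children.append((tag, word))
    if chunk:
        children.append(("NP", *chunk))
    return ("S", *children)


tree = shallow_parse(pos_tag("the dog chased a cat"))
```

The resulting nested tuple records which words were grouped together, which is the kind of word-to-word relationship the independent claims refer to as syntactic structure.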

Current Assignee
Intel Corporation
Original Assignee
Intel Corporation
Inventors
Pereg, Oren; Wasserblat, Moshe; Mamou, Jonathan; Assayag, Michel
Primary Examiner(s)
Hang, Vu B

Application Number

US15/272,078
Publication Number

US 20180082680A1
Time in Patent Office

916 Days
CPC Class Codes

G06F 40/253   Grammatical analysis; Style...

G10L 15/02   Feature extraction for spee...

G10L 15/1822   Parsing for meaning underst...

G10L 15/19   Grammatical context, e.g. d...

G10L 15/197   Probabilistic grammars, e.g...

G10L 15/20   Speech recognition techniqu...

G10L 15/22   Procedures used during a sp...