Input generation for classifier

US 10,540,963 B2
Filed: 02/02/2017
Issued: 01/21/2020
Est. Priority Date: 02/02/2017
Status: Active Grant

Abstract

A computer-implemented method for generating an input for a classifier is provided. The method includes obtaining n-best hypotheses, which are an output of automatic speech recognition (ASR) for an utterance; combining the n-best hypotheses horizontally in a predetermined order with a separator between each pair of hypotheses; and outputting the combined n-best hypotheses as a single text input to a classifier.
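The combining step in the abstract can be illustrated with a minimal Python sketch. The function name `combine_nbest`, the default separator, and the sample hypotheses are assumptions made for illustration only; they are not taken from the patent specification.

```python
from typing import Sequence


def combine_nbest(hypotheses: Sequence[str], separator: str = "EOS") -> str:
    """Join hypotheses left to right with the separator between each adjacent pair.

    The hypotheses are assumed to arrive already in the predetermined order.
    """
    return f" {separator} ".join(h.strip() for h in hypotheses)


# Hypothetical 3-best list for one utterance (illustrative data, not from the patent).
nbest = [
    "i want to check my balance",
    "i want to check my ballots",
    "i want a check my balance",
]
single_text = combine_nbest(nbest)
print(single_text)
```

The result is one line of text in which the separator marks every hypothesis boundary, so a text classifier can consume all n hypotheses as a single input.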

9 Claims

1. A computer-implemented method for improving performance of a speech recognition system, comprising:
- generating a single text data structure for a classifier of a speech recognition system, including:
  - obtaining first n-best hypotheses as an output of a speech recognition task performed by automatic speech recognition (ASR) for an utterance received by the speech recognition system; and
  - combining the first n-best hypotheses horizontally in a predetermined order with a separator between each pair of n-best hypotheses to generate the single text data structure, wherein each separator is set based on a classification algorithm of the classifier of the speech recognition system as a symbol that is usable by the classifier of the speech recognition system; and
- outputting the single text data structure as an input to the classifier to perform a classification task;
- wherein the classifier is trained to perform the classification task based on a single training text data structure by:
  - obtaining source training data including a plurality of second n-best hypotheses and a transcription for each utterance from a database;
  - arranging the source training data with the transcription at a head or at an end of the second n-best hypotheses depending on the predetermined order to generate the single training text data structure; and
  - outputting the single training text data structure to the classifier.

2. The method according to claim 1, wherein each of the first and second n-best hypotheses is associated with a corresponding confidence score, and wherein the predetermined order is determined based on the confidence scores.

3. The method according to claim 1, wherein each separator is selected from the group consisting of: EOS and </s>.
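The training-side arrangement in claim 1 and the confidence-based ordering in claim 2 can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: the helper names, the descending sort direction, and the `transcription_first` flag are all hypothetical, since the claims only say that the order is "determined based on the confidence scores" and that the transcription sits at the head or the end.

```python
from typing import List, Sequence, Tuple

Scored = Tuple[str, float]  # (hypothesis text, confidence score)


def order_by_confidence(scored: Sequence[Scored]) -> List[str]:
    """Return hypothesis texts from highest to lowest confidence (assumed direction)."""
    return [text for text, _ in sorted(scored, key=lambda pair: pair[1], reverse=True)]


def build_training_text(
    transcription: str,
    scored: Sequence[Scored],
    separator: str = "EOS",
    transcription_first: bool = True,
) -> str:
    """Place the reference transcription at the head or the end of the ordered n-best list."""
    ordered = order_by_confidence(scored)
    parts = [transcription] + ordered if transcription_first else ordered + [transcription]
    return f" {separator} ".join(parts)


# Illustrative training record; the texts and scores are made up.
record = [("pay my pill", 0.31), ("pay my bill", 0.62)]
print(build_training_text("pay my bill", record))
print(build_training_text("pay my bill", record, separator="</s>", transcription_first=False))
```

Whichever position is chosen for the transcription, the same ordering rule is assumed at training time and at inference time so that the classifier sees hypotheses in a consistent layout.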

4. An apparatus comprising:
- a processor; and
- one or more non-transitory computer readable storage mediums collectively including instructions that, when executed by the processor, cause the processor to perform a method for improving performance of a speech recognition system, the method comprising:
  - generating a single text data structure for a classifier of a speech recognition system, including:
    - obtaining first n-best hypotheses as an output of a speech recognition task performed by automatic speech recognition (ASR) for an utterance received by the speech recognition system; and
    - combining the first n-best hypotheses horizontally in a predetermined order with a separator between each pair of n-best hypotheses to generate the single text data structure, wherein each separator is set based on a classification algorithm of the classifier of the speech recognition system as a symbol that is usable by the classifier of the speech recognition system; and
  - outputting the single text data structure as an input to the classifier to perform a classification task;
  - wherein the classifier is trained to perform the classification task based on a single training text data structure by:
    - obtaining source training data including a plurality of second n-best hypotheses and a transcription for each utterance from a database;
    - arranging the source training data with the transcription at a head or at an end of the second n-best hypotheses depending on the predetermined order to generate the single training text data structure; and
    - outputting the single training text data structure to the classifier.

5. The apparatus according to claim 4, wherein each of the first and second n-best hypotheses is associated with a corresponding confidence score, and wherein the predetermined order is determined based on the confidence scores.

6. The apparatus according to claim 4, wherein each separator is selected from the group consisting of: EOS and </s>.
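The apparatus claims describe the same method running end to end: ASR output goes in, a single text goes to the classifier, and the separator depends on the classifier's algorithm. A rough Python sketch of that flow is given below. Everything in it is hypothetical: the class name, the stub ASR and classifier callables, and in particular the `SEPARATOR_BY_ALGORITHM` mapping, which is invented for illustration. The claims name only EOS and </s> as candidate symbols and do not say which algorithm uses which.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Invented mapping for illustration only; not taken from the patent.
SEPARATOR_BY_ALGORITHM: Dict[str, str] = {"rnn": "EOS", "cnn": "</s>"}


@dataclass
class ClassifierInputGenerator:
    recognize: Callable[[bytes], List[str]]  # stand-in ASR: audio -> ordered n-best texts
    classify: Callable[[str], str]           # stand-in classifier: single text -> label
    algorithm: str = "rnn"

    def run(self, audio: bytes) -> str:
        """Build the single text for one utterance and hand it to the classifier."""
        separator = SEPARATOR_BY_ALGORITHM[self.algorithm]
        single_text = f" {separator} ".join(self.recognize(audio))
        return self.classify(single_text)


def fake_asr(audio: bytes) -> List[str]:
    """Stub that ignores the audio and returns a fixed 2-best list."""
    return ["pay my bill", "pay my pill"]


def fake_classifier(text: str) -> str:
    """Toy keyword rule standing in for a trained classifier."""
    return "billing" if "bill" in text.split() else "other"


generator = ClassifierInputGenerator(fake_asr, fake_classifier)
print(generator.run(b""))
```

Passing the ASR and classifier in as callables keeps the input-generation step separate from both components, which mirrors how the claims treat it as its own stage.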

7. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, wherein the computer readable storage medium is not a transitory signal per se, the program instructions executable by a computer to cause the computer to perform a method for improving performance of a speech recognition system, the method comprising:
- generating a single text data structure for a classifier of a speech recognition system, including:
  - obtaining first n-best hypotheses as an output of a speech recognition task performed by automatic speech recognition (ASR) for an utterance received by the speech recognition system; and
  - combining the first n-best hypotheses horizontally in a predetermined order with a separator between each pair of n-best hypotheses to generate the single text data structure, wherein each separator is set based on a classification algorithm of the classifier of the speech recognition system as a symbol that is usable by the classifier of the speech recognition system; and
- outputting the single text data structure as an input to the classifier to perform a classification task;
- wherein the classifier is trained to perform the classification task based on a single training text data structure by:
  - obtaining source training data including a plurality of second n-best hypotheses and a transcription for each utterance from a database;
  - arranging the source training data with the transcription at a head or at an end of the second n-best hypotheses depending on the predetermined order to generate the single training text data structure; and
  - outputting the single training text data structure to the classifier.

8. The computer program product according to claim 7, wherein each of the first and second n-best hypotheses is associated with a corresponding confidence score, and wherein the predetermined order is determined based on the confidence scores.

9. The computer program product according to claim 7, wherein each separator is selected from the group consisting of: EOS and </s>.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Itoh, Nobuyasu; Kurata, Gakuto; Tachibana, Ryuki
Primary Examiner(s): Roberts, Shaun
Application Number: US15/422,507
Publication Number: US 20180218736A1
Time in Patent Office: 1,083 Days
Field of Search

704235
US Class Current
CPC Class Codes

G10L 15/01   Assessment or evaluation of...

G10L 15/08   Speech classification or se...

G10L 15/18   using natural language mode...

G10L 15/1815   Semantic context, e.g. disa...

G10L 15/26   Speech to text systems G10L...

G10L 2015/0631   Creating reference template...
