Calibration of a speech recognition engine using validated text
First Claim
1. A speech recognition system that can be acoustically trained with free text audio, the system comprising:
a speech recognition software application operating on a computing device having a processor, the speech recognition software application comprising:
a speech recognition engine configured to receive the free text audio at the speech recognition engine which is unknown to the speech recognition engine previous to acoustical training of the speech recognition engine in both spoken audio and text forms, translate the free text audio into text form for display to a user, and receive a reviewed version of the text form and convert the reviewed version of the text form into a context free grammar based on text indicated as validated text as indicated by the user;
a comparison module configured to receive an indication of the validated text and associate the validated text with at least one word from the free text audio; and
a plurality of voice models;
wherein upon receipt of a plurality of instances in which validated text is associated with the at least one word from the free text audio, the speech recognition software application selects a subset of voice models of the plurality of voice models in such a way that the subset of voice models shares a plurality of characteristics with the free text audio associated with the validated text.
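The selection step in the final clause of the claim can be illustrated with a minimal sketch. It is not the patented implementation. The `VoiceModel` class, the `select_voice_models` function, the characteristic labels, and both thresholds are hypothetical; the claim does not say how characteristics are represented or compared. The sketch reads "a plurality" as "at least two" and treats characteristics as plain string labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceModel:
    # Hypothetical representation: a name plus a set of characteristic labels.
    name: str
    characteristics: frozenset


def select_voice_models(models, audio_characteristics, validated_instances,
                        min_instances=2, min_shared=2):
    """Return the models that share at least `min_shared` characteristics
    with the free text audio.

    Nothing is selected until at least `min_instances` validated
    associations (validated text paired with a spoken word) exist.
    Both thresholds are assumptions standing in for "a plurality".
    """
    if len(validated_instances) < min_instances:
        return []
    audio = frozenset(audio_characteristics)
    return [m for m in models if len(m.characteristics & audio) >= min_shared]


# Illustrative data only; the labels are invented for this example.
models = [
    VoiceModel("model_a", frozenset({"female", "fast", "us_english"})),
    VoiceModel("model_b", frozenset({"male", "slow", "uk_english"})),
    VoiceModel("model_c", frozenset({"female", "slow", "us_english"})),
]

# Each instance pairs validated text with the word heard in the audio.
instances = [("patient", "patient"), ("presents", "presents")]

selected = select_voice_models(
    models, {"female", "fast", "us_english"}, instances
)
print([m.name for m in selected])  # ['model_a', 'model_c']
```

With only one validated instance the function returns an empty list, which mirrors the claim's requirement that selection happens upon receipt of a plurality of instances.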
Abstract
A system and method provide acoustic training of a voice or speech recognition engine and/or voice or speech recognition software application. Instead of requiring a user to read from a prepared or predetermined script, the system and method described herein enable acoustic training using any free text spoken phrases provided by the user directly, or by a previously recorded speech, presentation, or the like, performed by the user.
7 Claims
Claim 1 is reproduced in full under First Claim above. Claims 2, 3, 4, 5, 6, and 7 depend from claim 1.
Specification