System and method for speech-to-text conversion using constrained dictation in a speak-and-spell mode
Abstract
To improve the accuracy of a speech recognition system for the specific task of speech-to-text (dictation-style) translation, a constrained dictation methodology using a speak-and-spell mode is disclosed. The invention is well suited to modern-day “text-messaging” applications, in which the number of words being dictated is very small (limited by the message-length constraint of 140-160 characters). Additionally, the invention adds a control on the way users interact with machines, thereby making the speech recognition task easier and improving system accuracy.
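The following is a minimal illustrative sketch, not the patented implementation, of why a spoken spelling can help: candidate words from a recognizer are re-ranked by how closely each entry of a constrained lexicon matches the recognized letter sequence. The function names `edit_distance` and `resolve_word`, the score range, and the way the two costs are added together are all assumptions made for illustration; the source does not specify any of them.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def resolve_word(word_hyps, spelled_letters, lexicon):
    """Pick the lexicon entry best supported by the spoken word and its spelling.

    word_hyps: hypothetical recognizer scores in [0, 1], keyed by candidate word.
    spelled_letters: letters recognized from the spoken spelling (may be noisy).
    lexicon: the set of allowed words (the constrained lexicon).
    """
    best_word, best_cost = None, float("inf")
    for word in sorted(lexicon):  # sorted so ties resolve deterministically
        spelling_cost = edit_distance(word, spelled_letters)
        word_cost = 1.0 - word_hyps.get(word, 0.0)
        cost = spelling_cost + word_cost  # assumed, unweighted combination
        if cost < best_cost:
            best_word, best_cost = word, cost
    return best_word


if __name__ == "__main__":
    hyps = {"meat": 0.55, "meet": 0.45}  # the spoken word alone is ambiguous
    print(resolve_word(hyps, "meet", {"meet", "meat", "met"}))
```

In this toy case the spoken word alone slightly favours "meat", while the spelled letters move the decision to "meet". A real system would combine acoustic and spelling evidence probabilistically rather than by simple addition.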
6 Claims
1. A system for converting speech to text comprising:
a) an interface requesting a user to speak a word and to speak a spelling of said word for each of a plurality of words spoken during a communication;
b) an audio receiving module for receiving said spoken word and spoken spelling of said word;
c) a signal processing module for extracting specific features of the spoken word and said spelled word;
d) a search module that uses said extracted specific features in conjunction with an at least one acoustic module;
e) at least one language module that matches a result presented by the search module; and
f) at least one constrained lexicon that takes the resulting match and using a system module outputting the desired text matching the spoken word.
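As a rough illustration of what a signal processing module of the kind recited in element c) might compute, the sketch below splits audio samples into overlapping frames and derives two simple per-frame features. The claim does not say which features are extracted; log energy and zero-crossing rate, the function name `frame_features`, and the frame sizes (400 and 160 samples, roughly 25 ms and 10 ms at an assumed 16 kHz sampling rate) are assumptions chosen only to keep the example small.

```python
import math


def frame_features(samples, frame_len=400, hop=160):
    """Return a list of (log_energy, zero_crossing_rate) tuples, one per frame.

    samples: sequence of floats in [-1, 1] (assumed mono audio).
    frame_len, hop: frame length and step in samples (assumed values).
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Mean power of the frame; a small floor avoids log(0) on silence.
        energy = sum(x * x for x in frame) / frame_len
        log_energy = math.log(energy + 1e-10)
        # Fraction of adjacent sample pairs whose sign differs.
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        zcr = crossings / (frame_len - 1)
        features.append((log_energy, zcr))
    return features


if __name__ == "__main__":
    rate = 16000  # assumed sampling rate
    tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
    feats = frame_features(tone)
    print(len(feats), feats[0])
```

Production recognizers typically use richer spectral features; the point here is only the shape of the step: raw samples in, one feature vector per frame out, ready for a search module to consume.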
3. A method for converting speech to text comprising:
a) requesting a user to speak a word and to speak a spelling of said word for each of a plurality of words spoken during a communication;
b) receiving said spoken word and spoken spelling of said word using an audio receiving module;
c) extracting specific features of the spoken word and said spelling of said spoken word using a signal processing module;
d) using said extracted specific features in conjunction with an at least acoustic module using a search module;
e) matching a result presented by the search module using an at least one language module; and
f) taking the results that match an at least one constrained lexicon outputting the desired text matching the spoken word.
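Step e) of the method can be pictured with a small, hypothetical example in which a language model chooses among candidate words returned by a search step, based on the preceding word. The claim does not state what kind of language model is used; the add-one-smoothed bigram model, the helper names `train_bigrams` and `pick_with_language_model`, and the tiny training set below are assumptions for illustration only.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count unigrams and bigrams from tokenized example sentences."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + list(words)  # "<s>" marks the sentence start
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def pick_with_language_model(prev_word, candidates, unigrams, bigrams):
    """Return the candidate with the highest smoothed P(candidate | prev_word)."""
    vocab_size = len(unigrams)

    def prob(word):
        # Add-one smoothing keeps unseen word pairs from scoring zero.
        return (bigrams[(prev_word, word)] + 1) / (unigrams[prev_word] + vocab_size)

    # Sorting first makes tie-breaking deterministic (alphabetical).
    return max(sorted(candidates), key=prob)


if __name__ == "__main__":
    corpus = [
        ["see", "you", "at", "two"],
        ["meet", "you", "at", "two"],
        ["too", "late"],
    ]
    uni, bi = train_bigrams(corpus)
    # Acoustically similar candidates that a search step might return.
    print(pick_with_language_model("at", ["two", "too", "to"], uni, bi))
```

Here the homophones "two", "too", and "to" are indistinguishable by sound, so the word context decides. In the claimed method, the constrained lexicon of step f) would then further restrict which results can be output.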
5. A means for converting speech to text comprising:
a) means for requesting a user to speak a word and to speak a spelling of said word for each of a plurality of words spoken during a communication;
b) means for receiving said spoken word and spoken spelling of said word using an audio receiving module;
c) means for extracting specific features of the spoken word and said spelling of said spoken word using a signal processing module;
d) means for using said extracted specific features in conjunction with an at least one acoustic module using a search module;
e) means for matching a result presented by the search module using an at least one language module; and
f) means for taking the results that match an at least one constrained lexicon outputting the desired text matching the spoken word.
Specification