DICTATION WITH INCREMENTAL RECOGNITION OF SPEECH
First Claim
1. A method, performed by computing functionality, for providing a dictating service, comprising:
receiving a speech signal in response to vocalization, by a user, of an incremental portion of a complete utterance;
interpreting the incremental portion based on the speech signal, to provide recognized speech, prior to the user finishing the complete utterance;
providing rendered text associated with the recognized speech on an output presentation, for review by the user, prior to a user finishing the complete utterance;
modifying a selected part of the rendered text based on input from the user, when the user chooses to modify the selected part; and
repeating said receiving, interpreting, providing, and modifying at least one time.
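The claimed steps (receive, interpret, render, optionally modify, repeat) can be pictured as a loop over incremental portions. The sketch below is only an illustration under stated assumptions: the speech signal is simulated as text, and `DictationSession`, `step`, `fake_recognizer`, and the `(index, replacement)` correction format are hypothetical names that do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical type: a correction replaces the word at an index with new text.
Correction = Tuple[int, str]


@dataclass
class DictationSession:
    """Toy stand-in for the claimed method; every name here is illustrative."""
    recognize: Callable[[str], str]              # stand-in for a speech recognizer
    rendered: List[str] = field(default_factory=list)

    def step(self, speech_chunk: str, correction: Optional[Correction] = None) -> str:
        # Receive and interpret one incremental portion of the utterance.
        words = self.recognize(speech_chunk).split()
        # Render the recognized text before the utterance is complete.
        self.rendered.extend(words)
        # Optionally modify a user-selected part of the rendered text.
        if correction is not None:
            index, replacement = correction
            self.rendered[index] = replacement
        return " ".join(self.rendered)


def fake_recognizer(chunk: str) -> str:
    # Assumption: the "speech signal" is plain text with one simulated misrecognition.
    return chunk.replace("whether", "weather")


session = DictationSession(recognize=fake_recognizer)
print(session.step("I wonder whether"))                          # rendered mid-utterance
print(session.step("it will rain", correction=(2, "whether")))   # user fixes word 2
```

Each call to `step` handles one incremental portion, so the user can correct earlier text while still dictating the rest of the utterance.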
Abstract
A dictation module is described herein which receives and interprets a complete utterance of the user in incremental fashion, that is, one incremental portion at a time. The dictation module also provides rendered text in incremental fashion. The rendered text corresponds to the dictation module's interpretation of each incremental portion. The dictation module also allows the user to modify any part of the rendered text, as it becomes available. In one case, for instance, the dictation module provides a marking menu which includes multiple options by which a user can modify a selected part of the rendered text. The dictation module also uses the rendered text (as modified or unmodified by the user using the marking menu) to adjust one or more models used by the dictation module to interpret the user's utterance.
20 Claims
1. A method, performed by computing functionality, for providing a dictating service, comprising:
receiving a speech signal in response to vocalization, by a user, of an incremental portion of a complete utterance;
interpreting the incremental portion based on the speech signal, to provide recognized speech, prior to the user finishing the complete utterance;
providing rendered text associated with the recognized speech on an output presentation, for review by the user, prior to a user finishing the complete utterance;
modifying a selected part of the rendered text based on input from the user, when the user chooses to modify the selected part; and
repeating said receiving, interpreting, providing, and modifying at least one time.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15.
16. A dictation module, implemented by computing functionality, comprising:
a pre-processing module configured to extract features from a speech signal, the speech signal being received in response to vocalization, by a user, of an incremental portion of a complete utterance;
a decoder module configured to interpret the incremental portion based on the features extracted from the speech signal, prior to the user finishing the complete utterance, to provide recognized speech;
an acoustic model, for use by the decoder module, configured to acoustically interpret the speech signal;
a language model, for use by the decoder module, configured to linguistically interpret the speech signal;
a user interaction module configured to:
provide rendered text associated with the recognized speech on an output presentation, for review by the user, prior to the user finishing the complete utterance; and
provide a plurality of options to the user that give the user an opportunity to modify any part of the rendered text in different respective ways; and
an adaptation module configured to modify at least one of the acoustic model and the language model, based on the rendered text that is either unmodified or modified by the user via the user interaction module.
Dependent claims: 17.
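To make the division of labor among the components of claim 16 concrete, the following toy pipeline wires a pre-processing step, an acoustic model, a language model, a decoder, and an adaptation step together. It is a sketch under heavy assumptions, not the patented implementation: features are simulated as text tokens, the acoustic model is a hand-written table of sound-alike words, the language model is a plain word counter, and all function and class names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def extract_features(speech_signal: str) -> List[str]:
    # Pre-processing stand-in: real systems compute acoustic feature vectors;
    # here each whitespace-separated token plays the role of one feature.
    return speech_signal.lower().split()


class AcousticModel:
    def __init__(self, confusions: Dict[str, List[str]]) -> None:
        self.confusions = confusions

    def candidates(self, feature: str) -> List[str]:
        # Words that "sound like" the feature, with the literal token first.
        return [feature] + self.confusions.get(feature, [])


class LanguageModel:
    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def score(self, word: str) -> int:
        return self.counts[word]


def decode(features: List[str], acoustic: AcousticModel,
           language: LanguageModel) -> List[str]:
    # Per feature, pick the candidate the language model favors.
    # max() keeps the first candidate on ties, so the literal token wins by default.
    return [max(acoustic.candidates(f), key=language.score) for f in features]


def adapt(language: LanguageModel, final_text: List[str]) -> None:
    # Adaptation stand-in: reinforce words the user accepted or corrected to.
    language.counts.update(final_text)


acoustic = AcousticModel({"write": ["right"], "right": ["write"]})
language = LanguageModel()
first = decode(extract_features("turn write"), acoustic, language)
adapt(language, ["turn", "right"])   # the user corrected "write" to "right"
second = decode(extract_features("turn write"), acoustic, language)
print(first, second)
```

In this toy setup, the correction fed to `adapt` changes how the same input is decoded the next time, which is the feedback path the adaptation module describes.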
18. A computer readable storage medium for storing computer readable instructions, the computer readable instructions providing a dictation module when executed by one or more processing devices, the computer readable instructions comprising:
logic configured to present rendered text associated with a vocalization, by a user, of an incremental portion of a complete utterance, prior to the user finishing the complete utterance; and
logic configured to present a marking menu to the user that provides a plurality of options, the plurality of options giving the user an opportunity to modify any part of the rendered text in different respective ways.
Dependent claims: 19, 20.
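One way to picture a menu whose options modify a selected part of the text "in different respective ways" is a lookup from a menu choice to an edit function. The sketch below assumes direction-keyed options, as is common for marking menus in general. The specific options (`capitalize`, `delete`, `uppercase`) and their directions are invented for illustration and are not taken from the claims.

```python
from typing import Callable, Dict, List

# Hypothetical signature: an edit takes the rendered words and a selected index.
EditFn = Callable[[List[str], int], List[str]]


def capitalize(words: List[str], i: int) -> List[str]:
    return words[:i] + [words[i].capitalize()] + words[i + 1:]


def delete(words: List[str], i: int) -> List[str]:
    return words[:i] + words[i + 1:]


def uppercase(words: List[str], i: int) -> List[str]:
    return words[:i] + [words[i].upper()] + words[i + 1:]


# Assumed layout: each stroke direction selects one way of modifying the text.
MARKING_MENU: Dict[str, EditFn] = {
    "north": capitalize,
    "south": delete,
    "east": uppercase,
}


def apply_option(words: List[str], selected_index: int, direction: str) -> List[str]:
    # Look up the chosen option and apply it to the selected part only.
    return MARKING_MENU[direction](words, selected_index)


text = "meet me in paris today".split()
text = apply_option(text, 3, "north")   # capitalize "paris"
text = apply_option(text, 4, "south")   # delete "today"
print(" ".join(text))
```

Because every option shares one signature, further options could be added to the table without changing `apply_option`.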
Specification