Synchronous reproduction in a speech recognition system
Abstract
In a speech recognition system, the received speech and the sequence of words recognized in the speech by a recognizer (100) are stored in a memory (320, 330). Markers are stored as well, each indicating a correspondence between a recognized word and the segment of the received signal in which the word was recognized. In a synchronous reproduction mode, a controller (310) ensures that the speech is played back via speakers (350) and that, for each speech segment, the word recognized for that segment is indicated (e.g. highlighted) on a display (340). The controller (310) can detect whether the user has provided an editing instruction while the synchronous reproduction is active. If so, the synchronous reproduction is automatically paused and the editing instruction is executed.
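By way of illustration only, the marker-based correspondence described in the abstract could be sketched as follows. The source does not specify how markers are encoded; this sketch assumes each marker is a start/end offset in milliseconds into the stored audio, and the names `MarkedWord`, `word_at`, and `render` are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkedWord:
    """A recognized word stored with its marker (assumed: ms offsets into the audio)."""
    text: str
    start_ms: int  # assumed marker: start of the segment in which the word was recognized
    end_ms: int    # assumed marker: end of that segment (exclusive)


def word_at(words, position_ms):
    """Return the index of the word whose segment contains position_ms, or None.

    Assumes the words are sorted by start_ms and their segments do not overlap.
    """
    starts = [w.start_ms for w in words]
    i = bisect_right(starts, position_ms) - 1
    if i >= 0 and position_ms < words[i].end_ms:
        return i
    return None  # silence between segments, or past the end of the audio


def render(words, highlighted):
    """Textual representation with the currently reproduced word bracketed."""
    return " ".join(
        f"[{w.text}]" if i == highlighted else w.text
        for i, w in enumerate(words)
    )


# Example data (invented for this sketch).
words = [
    MarkedWord("the", 0, 180),
    MarkedWord("quick", 180, 520),
    MarkedWord("fox", 600, 950),
]
```

During play-back, a display loop would call `word_at` with the current audio position and pass the result to `render`, so that the highlighted word follows the segment being audibly reproduced.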
18 Claims
-
1. A speech recognition system comprising:
-
an input device for receiving a speech representative signal;
a first memory for storing a representation of the received signal suitable for audible reproduction;
a speech recognizer operative to represent the received signal as a sequence of recognized words;
a second memory for storing the sequence of recognized words, with each recognized word being stored in association with a marker indicating a correspondence between the word and a segment of the received signal in which the word was recognized; and
a controller operative to:
generate a synchronous reproduction of an audible and visible representation of at least part of the sequence of recognized words to enable a user to visually review the part of the sequence of recognized words in conjunction with audio play-back of the received signal corresponding to the part of the sequence of recognized words, the synchronous reproduction including audibly reproducing a corresponding part of the received signal stored in the first memory and for each segment of the corresponding part of the received signal, at the moment when the segment is being audibly reproduced, indicating on a display a textual representation of a recognized word which corresponds to the segment, the correspondence being given by the markers stored in the second memory;
detect whether the user has provided an editing instruction while the synchronous reproduction is active and the part of the sequence of recognized words is being audibly and visually reproduced;
pause the synchronous reproduction in response to detection of an editing instruction during the synchronous reproduction; and
cause the detected editing instruction to be performed during the pause in the synchronous reproduction.
2. A system as claimed in claim 1, wherein the controller is further operative to:
-
detect when a user has finished editing during the pause in the synchronous reproduction; and
automatically restart the synchronous reproduction in response to the detection that the user has finished editing.
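The pause-on-edit and restart-after-edit behaviour recited above can be pictured as a two-state controller. The following is a minimal sketch under stated assumptions, not the claimed implementation: the class `ReviewController`, its method names, and the representation of words as a plain list of strings are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    PLAYING = auto()
    PAUSED = auto()


class ReviewController:
    """Hypothetical controller: pauses on an editing instruction, resumes when editing ends."""

    def __init__(self, words):
        self.words = list(words)
        self.state = State.PLAYING
        self.current = 0  # index of the word currently being reproduced

    def advance(self):
        """Move to the next word; has no effect while reproduction is paused."""
        if self.state is State.PLAYING and self.current < len(self.words) - 1:
            self.current += 1

    def on_edit(self, index, new_text):
        """Pause the reproduction, then perform the detected editing instruction."""
        self.state = State.PAUSED
        self.words[index] = new_text

    def on_editing_finished(self):
        """Restart the reproduction once the user is detected to have finished editing."""
        if self.state is State.PAUSED:
            self.state = State.PLAYING
```

How the end of editing is detected is left open here; a time-out on user input is one option.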
-
-
3. A system as claimed in claim 2, wherein the controller is further operative to automatically restart a paused synchronous reproduction in response to the absence of received input from the user for a predetermined time-out period.
-
4. A system as claimed in claim 3, wherein the controller is further operative to enable a user to configure the predetermined time-out period.
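The time-out based restart, with a user-configurable period, might look like the sketch below. It assumes the caller supplies timestamps in seconds (for example from `time.monotonic()`), and the name `TimeoutResume` and its methods are hypothetical.

```python
class TimeoutResume:
    """Hypothetical helper: restart paused reproduction after a period without user input."""

    def __init__(self, timeout_s=2.0):
        self.timeout_s = timeout_s   # configurable time-out period (default value is assumed)
        self.paused = False
        self.last_input_at = None

    def on_user_input(self, now):
        """Any editing input pauses reproduction and resets the time-out."""
        self.paused = True
        self.last_input_at = now

    def tick(self, now):
        """Call periodically; returns True if reproduction restarts on this call."""
        if self.paused and now - self.last_input_at >= self.timeout_s:
            self.paused = False
            return True
        return False
```

Passing the clock value in, rather than reading it inside the class, keeps the sketch deterministic and easy to test.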
-
5. A system as claimed in claim 2, wherein the controller is further operative to automatically restart a paused synchronous reproduction at a word that was being reproduced at the moment the synchronous reproduction was paused.
-
6. A system as claimed in claim 2, wherein the controller is further operative to automatically restart a paused synchronous reproduction at a word which in the sequence of recognized words immediately follows a word that has been edited last by the user.
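The two restart positions described above (the word being reproduced at the moment of pausing, or the word immediately following the last edited word) can be expressed as a small selection function. This is an illustrative sketch only: the function name, the `policy` strings, the fallback when nothing was edited, and the clamping to the last word are assumptions, not taken from the source.

```python
def resume_index(paused_at, last_edited, word_count, policy="paused_word"):
    """Pick the word index at which a paused reproduction restarts.

    paused_at   -- index of the word being reproduced when the pause occurred
    last_edited -- index of the word edited last, or None if nothing was edited
    word_count  -- number of words in the recognized sequence
    """
    if policy == "paused_word":
        idx = paused_at
    elif policy == "after_last_edit":
        # Assumption: fall back to the paused word when no word was edited.
        idx = paused_at if last_edited is None else last_edited + 1
    else:
        raise ValueError(f"unknown policy: {policy}")
    # Assumption: clamp to the final word when the last word itself was edited.
    return min(idx, word_count - 1)
```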
-
7. A system as claimed in claim 2, wherein the controller is further operative to, during synchronous reproduction:
-
detect that the user has indicated on the display a position different from a position of a word being indicated during the synchronous reproduction at that moment; and
continue the synchronous reproduction with a word corresponding to the position indicated by the user.
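Continuing the reproduction at a position indicated by the user amounts to mapping the indicated word back to its audio segment through the stored marker. A minimal sketch follows, assuming markers are `(start_ms, end_ms)` pairs per word; the function name `seek_target` is hypothetical.

```python
def seek_target(markers, current, indicated):
    """Return (word index, audio offset in ms) to continue from, or None if no seek is needed.

    markers   -- list of (start_ms, end_ms) pairs, one per recognized word (assumed encoding)
    current   -- index of the word being indicated by the reproduction at this moment
    indicated -- index of the word at the position the user indicated on the display
    """
    if indicated == current:
        return None  # the user indicated the word already being reproduced
    if not 0 <= indicated < len(markers):
        return None  # the indicated position does not correspond to a recognized word
    return indicated, markers[indicated][0]
```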
-
-
8. A system as claimed in claim 3, wherein the controller is further operative to:
-
detect that, while the synchronous reproduction is paused, the user has supplied editing instructions for more than a predetermined period without an interruption of more than the time-out period; and
enter a dictation mode upon detection of the editing instructions for more than the predetermined period without an interruption of more than the time-out period.
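One way to picture the switch to dictation mode is to track a continuous burst of editing input: the burst is restarted whenever the gap between two inputs exceeds the time-out period, and dictation mode is entered once a burst has lasted longer than the predetermined period. The sketch below makes those assumptions explicit; `DictationDetector` and its attributes are hypothetical names, and timestamps are supplied by the caller.

```python
class DictationDetector:
    """Hypothetical detector for sustained editing while reproduction is paused."""

    def __init__(self, timeout_s, period_s):
        self.timeout_s = timeout_s    # maximum gap allowed inside one editing burst
        self.period_s = period_s      # burst length beyond which dictation mode is entered
        self.burst_start = None
        self.last_input = None
        self.dictation_mode = False

    def on_edit_input(self, now):
        """Record an editing instruction at time `now`; return True once in dictation mode."""
        if self.last_input is None or now - self.last_input > self.timeout_s:
            # The interruption exceeded the time-out: a new editing burst begins.
            self.burst_start = now
        self.last_input = now
        if now - self.burst_start > self.period_s:
            self.dictation_mode = True
        return self.dictation_mode
```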
-
-
11. A system as claimed in claim 1, wherein the controller is further operative to indicate on the display the textual representation of the recognized word which corresponds to the segment by highlighting the textual representation.
-
12. A system as claimed in claim 1, wherein the input device includes an audio card and a microphone connected to the audio card.
-
13. A system as claimed in claim 12, further comprising an output device coupled to and controlled by the controller to audibly reproduce the received signal, the output device being coupled to the audio card.
-
14. A system as claimed in claim 1, further comprising an output device coupled to and controlled by the controller to audibly reproduce the received signal.
-
15. A system as claimed in claim 14, wherein the output device is at least one speaker.
-
16. A system as claimed in claim 1, wherein the controller is further operative to generate the synchronous reproduction of the audible and visible representation of the part of the sequence of recognized words after completion of the entire input of the speech representative signal.
-
17. A system as claimed in claim 1, wherein the input device is separate from the speech recognizer and the first memory is separate from the second memory such that the speech representative signal is recordable off-line and transferable into the speech recognizer at a later stage.
-
18. A system as claimed in claim 1, wherein the controller is further operative to enable a user to select a position in the sequence of recognized words at which the synchronous reproduction starts.
-
9. A method of enabling reviewing a sequence of words recognized by a speech recognizer in a speech representative input signal, the method including:
-
storing a representation of the received signal suitable for audible reproduction;
using a speech recognizer to represent the received signal as a sequence of recognized words;
storing the sequence of recognized words, with each recognized word being stored in association with a marker indicating a correspondence between the word and a segment of the received signal in which the word was recognized;
generating a synchronous reproduction of an audible and visible representation of at least part of the sequence of recognized words to enable a user to visually review the part of the sequence of recognized words in conjunction with audio play-back of the received signal corresponding to the part of the sequence of recognized words, the synchronous reproduction including audibly reproducing a corresponding part of the received signal stored in the first memory and for each segment of the corresponding part of the received signal, at the moment when the segment is being audibly reproduced, indicating on a display a textual representation of a recognized word which corresponds to the segment, the correspondence being given by the markers stored in the second memory;
detecting whether the user has provided an editing instruction while the synchronous reproduction is active and the part of the sequence of recognized words is being audibly and visually reproduced;
pausing the synchronous reproduction in response to detection of an editing instruction during the synchronous reproduction; and
causing the detected editing instruction to be performed during the pause in the synchronous reproduction.
-
Specification