Method and apparatus for processing the output of a speech recognition engine
Abstract
A data processing apparatus is disclosed for receiving recognition data from a speech recognition engine and its corresponding dictated audio data, where the recognition data includes recognized words or characters. A display displays the recognized words or characters, and the recognized words or characters are stored as a file together with the corresponding audio data. The recognized words or characters can be processed, and link data is formed to link the position of the words or characters in the file and the position of the corresponding audio component in the audio data.
78 Claims
-
1. Data processing apparatus comprising:
input means for receiving recognition data from a speech recognition engine and audio data, said recognition data including a string of recognised characters and audio identifiers identifying audio components corresponding to a character component of the recognised characters;
storage means for storing said audio data received from said input means;
processing means for receiving and processing the input recognised characters to at least one of replace, insert, move and position the recognised characters to form a processed character string;
link means for forming link data linking the audio identifiers to the character component positions in the character string and for updating said link data after processing to maintain the link between the audio identifiers and the character component positions in the processed character string;
display means for displaying the characters received and processed by said processing means;
user operable selection means for selecting characters in the displayed characters for audio playback, where said link data identifies any selected audio components, if present, which are linked to the selected characters; and
audio playback means for playing back the selected audio components in the order of the character component positions in the character string or the processed character string.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15, 18, 20, 21, 57
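The link data recited in claim 1 can be pictured with a small illustrative sketch. It is not taken from the patent: the `LinkTable` class, its method names, and the choice of a word index as the "character component position" are all assumptions made for the example. It shows links being formed, updated after an insertion or deletion, and used to fetch audio identifiers in text order for a selection.

```python
from dataclasses import dataclass, field


@dataclass
class LinkTable:
    """Hypothetical link data: audio identifier -> current word position."""
    positions: dict = field(default_factory=dict)

    @classmethod
    def from_recognition(cls, audio_ids):
        # Initially the n-th recognised word is linked to the n-th audio identifier.
        return cls({audio_id: pos for pos, audio_id in enumerate(audio_ids)})

    def insert_words(self, at, count):
        # Typed words carry no audio; shift every link at or after the insertion point.
        self.positions = {
            audio_id: (pos + count if pos >= at else pos)
            for audio_id, pos in self.positions.items()
        }

    def delete_words(self, start, count):
        # Drop links for deleted words and shift the links that follow them.
        end = start + count
        updated = {}
        for audio_id, pos in self.positions.items():
            if pos < start:
                updated[audio_id] = pos
            elif pos >= end:
                updated[audio_id] = pos - count
        self.positions = updated

    def audio_for_selection(self, start, end):
        # Audio identifiers linked to positions in [start, end), in text order.
        hits = [(pos, a) for a, pos in self.positions.items() if start <= pos < end]
        return [a for _, a in sorted(hits)]
```

In this sketch a selection that spans a typed word simply yields no audio for that word, which corresponds to the "if present" wording of the claim. Moving text is not modelled here.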
-
-
13. A data processing arrangement comprising:
a data processing apparatus, the data processing apparatus comprising;
input means for receiving recognition data from a speech recognition engine and audio data, said recognition data including a string of recognised characters and audio identifiers identifying audio components corresponding to a character component of the recognised characters;
processing means for receiving and processing the input recognised characters to at least one of replace, insert, move and position the recognised characters to form a processed character string;
link means for forming link data linking the audio identifiers to the character component positions in the character string, and for updating said link data after processing to maintain the link between the audio identifiers and the character component positions in the processed character string;
storage means for storing said recognition data and audio data received from said input means, and for storing said link data;
display means for displaying the characters received and processed by said processing means;
user operable selection means for selecting characters in the displayed characters for audio playback, where said link data identifies any selected audio components, if present, which are linked to the selected characters; and
audio playback means for playing back the selected audio components in the order of the character component positions in the character string or the processed character string; and
an editor work station comprising;
data reading means for reading the characters, link data, and audio data from said data processing apparatus;
editor processing means for processing the characters;
editor link means for linking the audio data to the character component position using the link data;
editor display means for displaying the characters being processed;
editor correction means for selecting and correcting any displayed characters which have been incorrectly recognised;
editor audio playback means for playing back any audio component corresponding to the selected characters to aid correction;
editor speech recognition update means for storing the corrected characters and the audio identifier for the audio component corresponding to the corrected character in a character correction file; and
data transfer means for transferring the character correction file to said data processing apparatus for later updating of models used by said speech recognition engine;
said data processing apparatus including correction file reading means for reading said character correction file to pass the data contained therein to said speech recognition engine for the updating of the models used by said speech recognition engine.
Dependent claims: 14, 16, 17, 19, 22, 24, 25, 26, 28, 30, 35, 38, 39, 41, 42
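As an illustration of the character correction file in claim 13, the sketch below writes corrected text together with its audio identifier on the editor side and reads it back on the apparatus side. The JSON layout, the function names, and the `engine_update` callback are assumptions for the example; the claim does not specify a file format or an engine interface.

```python
import json


def write_correction_file(path, corrections):
    """Editor side: store (corrected text, audio identifier) pairs."""
    records = [{"corrected": text, "audio_id": audio_id} for text, audio_id in corrections]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)


def read_correction_file(path):
    """Apparatus side: read the pairs back from the transferred file."""
    with open(path, encoding="utf-8") as f:
        return [(r["corrected"], r["audio_id"]) for r in json.load(f)]


def update_models(corrections, engine_update):
    # engine_update is a hypothetical callback standing in for whatever
    # interface the speech recognition engine offers for model updates.
    for text, audio_id in corrections:
        engine_update(text, audio_id)
```

Keeping the audio identifier next to each correction lets the engine locate the audio that was misrecognised without the editor work station needing access to the engine itself.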
-
23. A data processing method comprising the steps of:
receiving recognition data from a speech recognition engine and audio data, said recognition data including a string of recognised characters and audio identifiers identifying audio components corresponding to a character component of the recognised characters;
storing the received audio data;
inputting the recognised characters to a processor for the processing of the characters to at least one of replace, insert, move and position the characters to form a processed character string;
forming link data linking the audio identifiers to the character component positions in the character string and updating said link data after processing to maintain the link between the audio identifiers and the character component positions in the processed character string;
displaying the characters input to and processed by the processor;
selecting displayed characters for audio playback, whereby said link data identifies any selected audio components, if present, which are linked to the selected characters; and
playing back the selected audio components in the order of the character component positions in the character string or processed character string.
Dependent claims: 27, 29, 31, 32, 33, 34, 36, 37, 40
-
-
43. Data processing apparatus comprising:
means for receiving recognition data from a speech recognition engine and corresponding audio data, the recognition data including recognised characters;
display means for displaying the recognised characters;
storage means for storing the recognised characters as a file;
means for selectively disabling one of the display and storage of the recognised characters and the speech recognition engine for a period of time; and
means for storing the received audio data during said period of time in said storage means as an audio message associated with the file.
Dependent claims: 44, 45, 47, 48
-
46. Data processing apparatus comprising:
means for receiving data from a speech recognition engine and corresponding audio data, the recognition data including recognised characters;
display means for displaying the recognised characters;
storage means for storing the recognised characters as a file and for storing the corresponding audio data.
-
-
49. Data processing apparatus comprising:
means for receiving recognition data from a speech recognition engine and corresponding audio data, said recognition data including recognised characters representing the recognised characters and audio identifier identifying the audio component corresponding to a character in the recognised characters;
storage means for storing said audio data and the recognised characters;
display means for displaying the recognised characters received from said speech recognition means or retrieved from said storage means;
user operable selection and correction means for selecting and correcting any displayed recognised characters;
audio playback means for playing back any audio component corresponding to the selected characters to aid correction; and
speech recognition update means for sending the corrected character and the audio identifier for the audio component corresponding to the corrected character to the speech recognition engine.
-
-
50. Data correction apparatus comprising:
means for receiving recognition data from a speech recognition engine, said recognition data including recognised characters representing the most likely characters, and a likelihood indicator for each character indicating the likelihood that the character is correct;
display means for displaying the recognised characters;
automatic error detection means for detecting possible errors in recognition of characters in the recognised characters by scanning the likelihood indicators for the recognised characters and detecting if the likelihood indicator for a character is below a likelihood threshold, whereby said display means highlights at least the first, if any, character having a likelihood indicator below the likelihood threshold;
user operable selection means for selecting a character to replace an incorrectly recognised character highlighted in the recognised characters; and
correction means for replacing the incorrectly recognised character with the selected character to correct the recognised characters.
Dependent claims: 51
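The automatic error detection of claim 50 amounts to scanning likelihood indicators against a threshold. The following sketch is illustrative only: the function names, the bracket-style highlighting, and the use of words with scores between 0 and 1 are assumptions, not details from the patent.

```python
def find_low_confidence(likelihoods, threshold):
    """Return the indices whose likelihood indicator is below the threshold."""
    return [i for i, p in enumerate(likelihoods) if p < threshold]


def highlight_first(words, likelihoods, threshold):
    """Mark the first possibly misrecognised word with brackets, if there is one."""
    flagged = find_low_confidence(likelihoods, threshold)
    if not flagged:
        return " ".join(words)
    first = flagged[0]
    return " ".join(f"[{w}]" if i == first else w for i, w in enumerate(words))


def replace_word(words, index, replacement):
    """Apply the user's chosen replacement and return the corrected word list."""
    return words[:index] + [replacement] + words[index + 1:]
```

Highlighting only the first flagged item mirrors the "at least the first, if any" wording of the claim; a real interface could equally highlight every index returned by `find_low_confidence`.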
-
-
52. A computer usable medium having computer readable instructions stored therein for causing a processor in a data processing apparatus to process signals defining a string of characters and corresponding audio data to display the characters and selectively play the audio data, the instructions comprising instructions for:
a) causing the processor to receive the signals from a speech recognition engine, the recognition signals including recognised characters and audio identifier identifying the audio components corresponding to character components in the recognised characters;
b) causing the processor to process the signals to manipulate the characters;
c) causing the processor to process the signals to form link data linking the audio identifier to the character component positions in the character string;
d) causing the processor to generate an image of the characters on a display;
e) causing the processor to receive a selection signal generated by a user and to identify any audio components corresponding to the selected characters; and
f) causing the processor to send the identified audio components in the order of the character component positions in the characters to an audio play back device.
-
-
53. A computer usable medium having computer readable instructions stored therein for causing the processor in a data processing apparatus to process signals defining a string of characters and audio data to store the characters and the audio data, the instructions comprising instructions for:
a) causing the processor to receive the signals from a speech recognition engine;
b) causing the processor to generate an image of the characters on a display;
c) causing the processor to store the characters as a file;
d) causing the processor to selectively disable one of the display and storage of the characters and the speech recognition engine for a period of time; and
e) causing the processor to store the audio signal for the period of time as an audio message associated with the file.
Dependent claims: 54
-
-
55. A computer usable medium having computer readable instructions stored therein for causing a processor in a data processing apparatus to process signals defining a string of characters and corresponding audio data to store the characters and the audio data, the instructions comprising instructions for:
a) causing the processor to receive the signals from a speech recognition engine;
b) causing the processor to generate an image of the characters for display; and
c) causing the processor to store the characters as a file and to store the corresponding audio signal.
Dependent claims: 58, 60, 61, 62
-
-
56. A computer usable medium having computer readable instructions stored therein for causing a processor in a data processing apparatus to process signals defining a string of characters and corresponding audio data from a speech recognition engine to update the models used by speech recognition engine, the instructions comprising instructions for:
a) causing the processor to receive the characters, audio data, and audio identifiers from the speech recognition engine, said audio identifier identifying audio components corresponding to components in the characters;
b) causing the processor to store the audio data and the characters, in a storage device;
c) causing the processor to generate an image for display of the characters received from the speech recognition engine or retrieved from the storage device;
d) causing the processor to receive a selection signal generated by a user to select characters which have been incorrectly recognised by the speech recognition engine;
e) causing the processor to retrieve any audio component from the storage device corresponding to the selected characters and to send the retrieved audio to an audio play back device;
f) causing the processor to receive corrected characters input by a user and to replace the incorrect characters with the corrected characters; and
g) causing the processor to send the corrected characters and the audio identifier for the audio component corresponding to the corrected characters to the speech recognition engine for the correction of models used by the speech recognition engine.
-
-
59. A data processing arrangement comprising:
data processing apparatus comprising;
input means for receiving recognition data from a speech recognition engine and corresponding audio data, said recognition data including a string of recognised characters and audio identifiers identifying audio components corresponding to character components of the recognised characters;
link means for forming link data linking the audio identifiers to the character component positions in the character string;
storage means for storing said audio data received from said input means, said link data, and said recognised characters; and
display means for displaying the recognised characters; and
an editor work station comprising;
data reading means for obtaining the characters, link data, and audio data from said data processing apparatus;
editor processing means for processing the characters;
editor link means for linking the audio data to the character component position using the link data;
editor display means for displaying the characters being processed;
editor correction means for selecting and correcting any displayed characters which have been incorrectly recognised;
editor audio playback means for playing back any audio component corresponding to the selected characters to aid correction;
editor speech recognition update means for storing the corrected characters and the audio identifier for the audio component corresponding to the corrected character in a character correction file; and
data transfer means for transferring the character correction file to said data processing apparatus for later updating of models used by said speech recognition engine;
said data processing apparatus including correction file reading means for reading said character correction file to pass the data contained therein to said speech recognition engine.
-
63. A computer usable medium having computer readable instructions stored therein for causing the processor in a data processing apparatus to process signals defining a string of characters and audio data to store the characters and the audio data, the instructions comprising instructions for:
a) causing the processor to receive the signals from a speech recognition engine;
b) causing the processor to generate an image of the characters on a display;
c) causing the processor to store the characters as a file;
d) causing the processor to selectively disable one of the display and storage of the characters and the speech recognition engine for a period of time; and
e) causing the processor to store the received audio data during said period of time as an audio message associated with the file.
Dependent claims: 64
-
-
65. Data processing apparatus comprising:
input means for inputting audio data;
means for receiving recognition data from a speech recognition engine, said recognition data including recognised characters corresponding to input audio data;
storage means for storing the recognised characters in a file and for storing the audio data; and
user operable selection means for selecting one of the recognised characters and corresponding audio data for storage in said storage means, or the audio data for which there are no corresponding recognised characters for storage in said storage means in association with a file of recognised characters.
Dependent claims: 66, 68, 70
-
-
67. A data processing method comprising the steps of:
inputting audio data; and
selecting one of receiving recognition data from a speech recognition engine and storing the recognition data in a file, said recognition data including recognised characters corresponding to input audio data, or storing the input audio data for which there is no corresponding recognition data in association with a file of recognition data.
-
-
69. Speech recognition apparatus comprising:
input means for inputting speech data;
recogniser means for receiving input speech data and for selectively performing speech recognition to generate recognised characters;
output means for visibly outputting the recognised characters;
storage means for storing the input speech data and recognised characters;
storage control means for controlling said storage means to store recognised characters from said recogniser means corresponding to a portion of input speech data as a file when said recogniser means is operating, and for controlling said storage means to store a portion of the input speech data in association with a file of recognised characters as an audio message when said recogniser means is not operating.
Dependent claims: 71
-
-
72. A speech recognition method comprising the steps of:
inputting speech data;
selectively performing speech recognition on input speech data to generate the recognised characters;
visibly outputting the recognised characters;
storing recognised characters corresponding to a portion of input speech data as a file when speech recognition is performed; and
storing a portion of the input speech data in association with a file of recognised characters as an audio message when speech recognition is not performed.
Dependent claims: 73, 74
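The selective storage in claim 72 can be sketched as a loop that either recognises a portion of speech or keeps it as an audio message attached to the same file. Everything named here (`DictationFile`, `process_portions`, the `recognise` callable, the per-portion `enabled` flag) is an assumption for illustration, and the visible output step is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class DictationFile:
    text: list = field(default_factory=list)            # recognised characters
    audio_messages: list = field(default_factory=list)  # unrecognised audio kept with the file


def process_portions(portions, recognise, doc=None):
    """portions: iterable of (audio_bytes, recognition_enabled) pairs.

    recognise is a hypothetical callable standing in for the speech
    recognition engine; it maps audio bytes to recognised text.
    """
    if doc is None:
        doc = DictationFile()
    for audio, enabled in portions:
        if enabled:
            # Recognition is performed: store the recognised characters.
            doc.text.append(recognise(audio))
        else:
            # Recognition is not performed: store the audio as a message.
            doc.audio_messages.append(audio)
    return doc
```

Holding both lists on one object is one simple way to express "in association with a file of recognised characters"; a separate audio file referenced from the text file would serve equally well.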
-
-
75. A computer usable medium having computer readable instructions stored therein for causing the processor in a data processing apparatus to process signals defining recognition data from a speech recognition engine and audio data to store the recognition data and the audio data, the instructions comprising instructions for:
a) causing the processor to receive audio data signals;
b) causing the processor to receive recognition data signals from a speech recognition engine; and
c) selectively causing the processor to store the recognition data signals in a file and to store corresponding audio data signals in storage means, or to store the audio data signals for which there is no corresponding recognition data signals in association with a file of recognition data signals.
Dependent claims: 76
-
-
77. A computer usable medium having computer readable instructions stored therein for causing a processor in a speech recognition apparatus to process signal defining recognised characters and audio data to store the recognised characters and audio data, the instructions comprising instructions for:
a) causing the processor to receive input speech data;
b) causing the processor to perform speech recognition on the input speech data to generate recognised characters;
c) causing the processor to visibly output the recognised characters;
d) causing the processor to store the input speech data and recognised characters, in storage means;
e) causing the processor to control said storage means to store recognised characters corresponding to a portion of input speech data as a file when speech recognition is being carried out, and to control said storage means to store a portion of the input speech data in association with a file of recognised characters as an audio message when speech recognition is not being carried out.
Dependent claims: 78
-
Specification