Automated proofreading using interface linking recognized words to their audio data while text is being changed
Abstract
Data processing apparatus is disclosed for receiving recognition data from a speech recognition engine and its corresponding dictated audio data where the recognition data includes recognised words or characters. A display displays the recognised words or characters and the recognised words or characters are stored as a file together with the corresponding audio data. Link data is formed to link the position of the words or characters in the file and the position of the corresponding audio component in the audio data. The recognised words or characters can be processed without losing the audio data.
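The link data described in the abstract can be pictured as a small table that pairs each audio component with the position of its word in the text. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `Link`, `build_links` and `audio_for_selection`, the use of character offsets as positions, and the single-space word separator are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    audio_id: int  # identifier of the audio component for one recognised word
    start: int     # character offset of the word in the text (assumed position unit)
    length: int    # number of characters in the word


def build_links(words):
    """Place (word, audio_id) pairs into a text and record where each word lands."""
    text, links, pos = "", [], 0
    for word, audio_id in words:
        if text:
            text += " "  # assumed separator between recognised words
            pos += 1
        links.append(Link(audio_id, pos, len(word)))
        text += word
        pos += len(word)
    return text, links


def audio_for_selection(links, sel_start, sel_end):
    """Return audio identifiers of words overlapping [sel_start, sel_end), in text order."""
    hits = [l for l in links if l.start < sel_end and l.start + l.length > sel_start]
    return [l.audio_id for l in sorted(hits, key=lambda l: l.start)]


text, links = build_links([("the", 10), ("quick", 11), ("fox", 12)])
print(text)
print(audio_for_selection(links, 4, 13))  # selection covering "quick fox"
```

Sorting the hits by position is what gives playback "in the order of the word positions" rather than in the order the words were dictated.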
78 Claims
1. Data processing apparatus comprising:
input means for receiving recognition data from a speech recognition engine and corresponding audio data, said recognition data including a string of recognised words and audio identifiers identifying audio components corresponding to each recognised word;
storage means for storing said audio data received from said input means;
interface application program means comprising means for receiving the input recognised words, means for placing the recognised words into positions in text in a processing application program means to allow the processing of the recognised words to change the positions of the recognised words to form a processed word string, means for determining the positions of the recognised words in said processing application program means, means for monitoring changes in the positions of the recognised words, and means for forming link data linking the audio data to the recognised words, said link data comprising the audio identifiers and the determined positions of corresponding recognised words, said interface application program means including means for updating said link data in response to monitored changes in positions of the recognised words;
display means for displaying the recognised words received and processed by said processing application program means;
user operable selection means for selecting at least one word in the displayed words, said interface application program means including means for identifying any audio components, if present, which are linked to the at least one selected word; and
audio playback means for playing back any identified audio components in the order of the word positions in the word string or the processed word string.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
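Claim 1 requires the link data to be updated when the monitored word positions change. A minimal sketch of such an update is given here, assuming that links are `(audio_id, start, length)` tuples with character offsets and that the edit is already known as an insertion or deletion at a given offset. The function names and the decision to drop the link of a word touched by a deletion are illustrative assumptions, not details taken from the claim.

```python
def apply_insert(links, at, count):
    """Shift links after `count` characters are inserted at offset `at`."""
    return [
        (audio_id, start + count if start >= at else start, length)
        for audio_id, start, length in links
    ]


def apply_delete(links, at, count):
    """Update links after `count` characters starting at offset `at` are deleted.

    Words entirely before the deleted span are unchanged, words entirely after
    it move left, and words overlapping it lose their link (an assumption).
    """
    end = at + count
    updated = []
    for audio_id, start, length in links:
        if start + length <= at:
            updated.append((audio_id, start, length))
        elif start >= end:
            updated.append((audio_id, start - count, length))
    return updated


# Links for the text "the quick fox".
links = [(10, 0, 3), (11, 4, 5), (12, 10, 3)]
print(apply_insert(links, 4, 5))  # e.g. "very " typed before "quick"
print(apply_delete(links, 4, 6))  # "quick " deleted
```

Typed text gets no link in this sketch, which matches the claim wording that audio components are identified "if present".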
15. A data processing arrangement comprising:
a data processing apparatus, the data processing apparatus comprising;
input means for receiving recognition data from a speech recognition engine and corresponding audio data, said recognition data including a string of recognised words and audio identifiers identifying audio components corresponding to each recognised word;
interface application program means comprising means for receiving the input recognised words, means for placing the recognised words into positions in text in a processing application program means to allow the processing of the recognised words to change the positions of the recognised words to form a processed word string, means for determining the positions of the recognised words in said processing application program means, means for monitoring changes in the positions of the recognised words, and means for forming link data linking the audio data to the recognised words, said link data comprising the audio identifiers and the determined positions of corresponding recognised words, said interface application program means including means for updating said link data in response to monitored changes in positions of the recognised words;
storage means for storing said recognition data and audio data received from said input means, and for storing said link data;
display means for displaying the recognised words received and processed by said processing application program means;
user operable selection means for selecting at least one word in the displayed words, said interface application program means including means for identifying any audio components, if present, which are linked to the at least one selected word; and
audio playback means for playing back any identified audio components in the order of the word positions in the word string or the processed word string; and
an editor work station comprising;
data reading means for reading the words, link data, and audio data from said data processing apparatus;
editor processing means for processing the words;
editor link means for linking the audio data to the word positions using the link data;
editor display means for displaying the words being processed;
editor correction means for selecting and correcting any displayed words which have been incorrectly recognised;
editor audio playback means for playing back an audio component corresponding to any selected words to aid correction;
editor speech recognition update means for storing the corrected words and the audio identifier for the audio component corresponding to the corrected word in a word correction file; and
data transfer means for transferring the word correction file to said data processing apparatus for later updating of models used by said speech recognition engine;
said data processing apparatus including correction file reading means for reading said word correction file to pass the data contained therein to said speech recognition engine for the updating of the models used by said speech recognition engine.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
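The word correction file of claim 15 pairs each corrected word with the audio identifier of the component it was dictated as, so that the recognition engine can later update its models. The claim does not specify a file format; the sketch below assumes one JSON record per line, and the function names and field names are hypothetical.

```python
import io
import json


def write_correction_file(corrections, fp):
    """Write one JSON record per corrected word: audio identifier plus corrected text."""
    for audio_id, corrected_word in corrections:
        fp.write(json.dumps({"audio_id": audio_id, "word": corrected_word}) + "\n")


def read_correction_file(fp):
    """Read the records back as (audio_id, corrected_word) pairs."""
    pairs = []
    for line in fp:
        if line.strip():
            record = json.loads(line)
            pairs.append((record["audio_id"], record["word"]))
    return pairs


# Editor work station: the word linked to audio component 11 was misrecognised.
buffer = io.StringIO()
write_correction_file([(11, "quick")], buffer)

# Author work station: read the file back; passing the pairs to the
# recognition engine is not modelled here.
buffer.seek(0)
print(read_correction_file(buffer))
```

An in-memory buffer stands in for the transferred file so the example runs without touching the file system.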
28. A data processing method comprising:
receiving recognition data from a speech recognition engine and corresponding audio data in an interface application program, said recognition data including a string of recognised words and audio identifiers identifying audio components corresponding to each recognised word;
storing the audio data;
inputting the recognised words into a processing application program which places the words in positions in the application, and which processes the recognised words such that positions of the recognised words are changed to form a processed word string;
using the interface application program to determine the positions of the recognised words in the processing application program, monitor changes in the positions of the recognised words, and to form link data linking the audio data to the recognised words, said link data comprising the audio identifiers and the determined positions of corresponding recognised words, said link data being updated in response to monitored changes in positions of the recognised words;
displaying the recognised words input to and processed by the processor application;
selecting at least one displayed word, whereby said link data identifies any audio components, if present, which are linked to the at least one selected word; and
playing back any selected audio components in the order of the word positions in the word string.
Dependent claims: 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
51. A computer usable medium having computer readable instructions stored therein for causing a processor in a data processing apparatus to process recognition signals defining a string of recognised words and corresponding audio data signals to display the words and selectively play the audio data, the instructions comprising instructions for:
a) causing the processor to receive the recognition signals from a speech recognition engine and the audio data signals, the recognition signals including a string of recognised words and audio identifiers identifying audio components corresponding to each recognised word;
b) causing the processor to store the audio data;
c) causing the processor to implement an interface application program which receives the recognised words and places the words in positions in a processing application program which can process the recognised words such that the positions of the recognised words are changed to form a processed word string;
d) causing the processor to implement the interface application program to determine the positions of the recognised words in the processing application program and to monitor changes in the positions of the recognised words;
e) causing the processor to implement the interface application program to form link data linking the audio data to the recognized words, wherein said link data comprises the audio identifiers and the determined positions of corresponding recognised words, and to update said link data in response to monitored changes in positions of the recognised words;
f) causing the processor to generate an image of the recognised words on a display;
g) causing the processor to receive a selection signal generated by a user for selecting at least one word and to identify audio components corresponding to the at least one selected word; and
h) causing the processor to send the identified audio components in the order of the word positions in the word string to an audio play back device.
52. Data processing apparatus comprising:
input means for receiving recognition data and corresponding audio data from a speech recognition engine, said recognition data including a string of recognised characters and audio identifiers identifying audio components corresponding to a character component of the recognised characters;
storage means for storing said audio data received from said input means;
processing means for receiving and processing the input recognised characters to at least one of replace, insert move and position the recognised characters to form a processed character string;
link means for forming link data linking the audio identifiers to the character component positions in the character string and for updating said link data after processing to maintain the link between the audio identifiers and the character component positions in the processed character string;
display means for displaying the characters received by said processing means;
user operable selection means for selecting characters in the displayed characters for audio playback, where said link data identifies any selected audio components, if present, which are linked to the selected characters;
audio playback means for playing back the selected audio components in the order of the character component positions in the character string or the processed character string;
file storage means for storing the recognised characters in a file;
means for selectively disabling one of the receipt of the recognised characters by said processing means and the recognition of speech by said speech recognition engine for a period of time,
means for storing the audio data for the period of time in said storage means as an audio message associated with the file; and
storage reading means for reading said file for input to said processing means, and for reading said audio message for playback by said audio playback means.
Dependent claims: 53
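The audio message feature of claim 52 amounts to a switch: while recognition is disabled, incoming audio is kept verbatim as a spoken note attached to the file instead of being turned into text. The sketch below is one hedged reading of that behaviour. The class `DictationSession`, its method names, and the representation of audio as byte chunks are assumptions for illustration only.

```python
class DictationSession:
    """Routes incoming audio either to recognition or to a stored audio message."""

    def __init__(self):
        self.recognition_enabled = True
        self.sent_to_recogniser = []  # chunks that would be passed on for recognition
        self.audio_message = []       # chunks kept as a spoken note attached to the file

    def pause_recognition(self):
        """Start the period during which audio is stored instead of recognised."""
        self.recognition_enabled = False

    def resume_recognition(self):
        """End the period; later audio is recognised again."""
        self.recognition_enabled = True

    def feed(self, chunk: bytes):
        """Route one chunk of audio according to the current mode."""
        if self.recognition_enabled:
            self.sent_to_recogniser.append(chunk)
        else:
            self.audio_message.append(chunk)

    def message_bytes(self) -> bytes:
        """Return the stored audio message as one byte string for later playback."""
        return b"".join(self.audio_message)


session = DictationSession()
session.feed(b"dictated text ")
session.pause_recognition()
session.feed(b"please check the figures")
session.resume_recognition()
print(session.message_bytes())
```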
54. A data processing arrangement comprising:
a data processing apparatus, the data processing apparatus comprising;
input means for receiving recognition data and corresponding audio data from a speech recognition engine, said recognition data including a string of recognised characters and audio identifiers identifying audio components corresponding to a character component of the recognised characters;
processing means for receiving and processing the input recognised characters to at least one of replace, insert move and position the recognised characters to form a processed character string;
link means for forming link data linking the audio identifiers to the character component positions in the character string and for updating said link data after processing to maintain the link between the audio identifiers and the character component positions in the processed character string;
storage means for storing said recognition data and audio data received from said input means, and for storing said link data;
display means for displaying the characters received by said processing means;
user operable selection means for selecting characters in the displayed characters for audio playback, where said link data identifies any selected audio components, if present, which are linked to the selected characters; and
audio playback means for playing back the selected audio components in the order of the character component positions in the character string or the processed character string;
file storage means for storing the recognised characters in a file;
means for selectively disabling one of the receipt of the recognised characters by said processing means and the recognition of speech by said speech recognition engine for a period of time with means for storing the audio data for the period of time in said storage means as an audio message associated with the document;
storage reading means for reading said document for input to said processing means, and for reading said audio message for playback by said audio playback means; and
an editor work station comprising;
data reading means for reading the characters, link data, and audio data from said data processing apparatus;
editor processing means for processing the characters;
editor link means for linking the audio data to the character component position using the link data;
editor display means for displaying the characters being processed;
editor correction means for selecting and correcting any displayed characters which have been incorrectly recognised;
editor audio playback means for playing back any audio component corresponding to the selected characters to aid correction;
editor speech recognition update means for storing the corrected characters and the audio identifier for the audio component corresponding to the corrected character in a character correction file;
data transfer means for transferring the character correction file to said data processing apparatus for later updating of models used by said speech recognition engine; and
audio message reading means for reading the audio message associated with characters being processed by said editor processing means for playback by said editor audio playback means;
said data processing apparatus including correction file reading means for reading said character correction file to pass the data contained therein to said speech recognition engine for the updating of the models used by said speech recognition engine.
Dependent claims: 55
56. A data processing method comprising:
receiving recognition data and corresponding audio data from a speech recognition engine, said recognition data including recognised characters and audio identifiers identifying audio components corresponding to text components in the recognised text;
storing the audio data;
inputting the recognised characters to a processor for the processing of the characters to at least one of replace, insert move and position the characters to form a processed character string;
forming link data linking the audio identifiers to the character component positions in the characters and updating said link data after processing to maintain the link between the audio identifiers and the character component positions in the processed character string;
displaying the characters input to the processor;
selecting displayed characters for audio playback, whereby said link data identifies any selected audio components, if present, which are linked to the selected characters;
playing back the selected audio components in the order of the character component positions in the character string;
storing the characters as a file;
selectively disabling one of the importation of recognised characters into the processor and the recognition of speech by said speech recognition engine for a period of time;
storing the audio data for the period of time as an audio message associated with the file;
at a later time, reading said file for input to the processor; and
allowing a user to select whether to read and playback said audio message associated with said file.
Dependent claims: 57
58. A method of processing data comprising:
at an author work station;
receiving recognition data and corresponding audio data from a speech recognition engine, said recognition data including recognised characters and audio identifiers identifying audio components corresponding to text components in the recognised text;
storing the audio data;
inputting the recognised characters to a processor for the processing of the characters to at least one of replace, insert move and position the characters to form a processed character string;
forming link data linking the audio identifiers to the character component positions in the characters and updating said link data after processing to maintain the link between the audio identifiers and the character component positions in the processed character string;
displaying the characters input to the processor;
selecting displayed characters for audio playback, whereby said link data identifies any selected audio components, if present, which are linked to the selected characters; and
playing back the selected audio components in the order of the character component positions in the character string;
wherein the characters, the link data, and the audio data are stored; and
at an editor work station;
obtaining the stored characters, link data and audio data from the author work station;
inputting the characters into a processor;
linking the audio data to the character component positions using the link data;
displaying the characters being processed;
selecting any displayed characters which have been incorrectly recognised;
playing back any audio component corresponding to the selected characters to aid correction;
correcting the incorrectly recognised characters;
storing the corrected characters and the audio identifier for the audio component corresponding to the corrected character in a character correction file; and
transferring the character correction file to the author work station for later updating of models used by said speech recognition engine;
wherein, at a later time, said character correction file is read at said author work station to pass the data contained therein to said speech recognition engine for updating of said models;
wherein, at said author work station,
storing the characters as a file;
selectively disabling one of the importation of recognised characters into the processor and the recognition of speech by said speech recognition engine for a period of time;
storing the audio data for the period of time as an audio message associated with the file; and
at a later time, reading said file for input to the processor; and
at said editor work station, reading the audio message associated with the file being processed by the processor, and playing back the read audio message.
Dependent claims: 59
60. A universal speech-recognition interface that enables operative coupling of a speech-recognition engine to at least any one of a plurality of different computer-related applications, the universal speech-recognition interface comprising:
input means for receiving speech-recognition data including recognised words;
output means for outputting the recognised words into at least any one of the plurality of different computer-related applications to allow processing of the recognised words as input text; and
audio playback means for playing audio data associated with the recognised words.
Dependent claims: 61, 62, 63
64. A speech-recognition interface that enables operative coupling of a speech-recognition engine to a computer-related application, the interface comprising:
input means for receiving speech-recognition data including recognised words;
output means for outputting the recognised words into a computer-related application to allow processing of the recognised words as input text, including changing positions of the recognised words; and
means, independent of the computer-related application, for determining positions of the recognised words in the computer-related application.
Dependent claims: 65, 66, 67, 68
69. Data processing apparatus comprising
input means for receiving recognition data from a speech recognition engine and corresponding audio data, said recognition data including a string of recognised words and audio identifiers identifying audio components corresponding to each of the recognised words;
processing means for implementing an interface application program which receives the input recognised words, inputs the recognised words into a processing application program to process the input recognised words to cause the recognised words to be moved, and forms link data linking the audio data to the recognised words, said link data comprising the audio identifiers and information identifying the corresponding recognised words;
display means for displaying the words received and processed by said processing application program;
user operable selection means for selectively identifying a word in the displayed words, wherein said interface application program is operative to compare the identity of the selected word with said link data to identify any corresponding audio component; and
audio playback means for playing back any identified corresponding audio component.
Dependent claims: 70
71. A data processing method comprising:
inputting recognition data from a speech recognition engine and corresponding audio data, said recognition data including a string of recognised words and audio identifiers identifying audio components corresponding to each of the recognised words;
inputting the recognised words to a processor implementing an interface application program to receive the input recognised words, to pass the recognised words to a processing application program for processing the recognised words to cause the recognised words to be moved, and to form link data linking the audio data to the recognised words, said link data comprising the audio identifiers and information identifying the corresponding recognised words;
displaying the recognised words input to and processed by the processor application program;
selectively identifying a word in the displayed words;
using the interface application program to compare the identity of the selected word with said link data to identify any corresponding audio component; and
playing back any identified corresponding audio component.
Dependent claims: 72
73. A computer usable medium having computer readable instructions stored therein for causing a processor in a data processing apparatus to process recognition signals defining a string of recognised words and corresponding audio data to display the words and selectively play the audio data, the instructions comprising instructions for:
a) causing the processor to input the recognition signals from a speech recognition engine and the audio data, the recognition signals including a string of recognised words and audio identifiers identifying audio components corresponding to each recognised word;
b) causing the processor to implement an interface application program to receive the input recognised words and to input the recognised words into a processing application program to process the recognised words to cause the recognised words to be relatively moved;
c) causing the processor to implement the interface application program to form link data linking the audio data to the recognised words, said link data comprising the audio identifiers and information identifying the corresponding recognised words;
d) causing the processor to generate an image of the recognised words on a display;
e) causing the processor to receive a selection signal generated by a user for selectively identifying a word in the displayed words;
f) causing the processor to implement the interface application program to compare the identity of the selected word with said link data to identify any corresponding audio component; and
g) causing the processor to send the identified corresponding audio component to an audio playback device.
Dependent claims: 74
75. Data processing apparatus comprising:
input means for receiving recognition data from a speech recognition engine and corresponding audio data, said recognition data including a string of recognised words and audio identifiers including audio components corresponding to each recognised word;
storage means for storing the audio data received from said input means;
processing means operative under the control of an operating system to implement a first application program which receives the input recognised words in text positions, and which processes the recognised words such that the positions of the recognised words are changed to form a processed word string, and a second application program which determines the positions of and monitors changes in the positions of the recognised words in said first application program using operating system functions communicated via the computer operating system, and which forms link data linking the audio data to the recognised words and updates said link data in response to monitored changes in the positions of the recognised words, said link data comprising the audio identifiers and the determined positions of corresponding recognised words;
display means for displaying the recognised words;
user operable selection means for selecting at least one word in the displayed words, wherein said second application program is operative to identify any selected audio components, if present, which are linked to the at least one selected word; and
audio playback means for playing back any selected audio component.
Dependent claims: 76
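Claim 75 has a second program track word positions inside a first program from the outside, through operating system functions. Those functions are not named in the claim, so the sketch below swaps in a different, simpler technique: it compares two snapshots of the first program's text with Python's standard `difflib.SequenceMatcher` and carries each link across unchanged regions. The function name `remap_links` and the `(audio_id, start, length)` tuple layout are hypothetical.

```python
import difflib


def remap_links(old_text, new_text, links):
    """Carry (audio_id, start, length) links from old_text over to new_text.

    A link survives only if its whole word lies inside a region that the
    diff reports as unchanged; otherwise the link is dropped.
    """
    matcher = difflib.SequenceMatcher(None, old_text, new_text, autojunk=False)
    opcodes = matcher.get_opcodes()
    remapped = []
    for audio_id, start, length in links:
        for tag, i1, i2, j1, _j2 in opcodes:
            if tag == "equal" and i1 <= start and start + length <= i2:
                remapped.append((audio_id, j1 + (start - i1), length))
                break
    return sorted(remapped, key=lambda link: link[1])


links = [(10, 0, 3), (11, 4, 5), (12, 10, 3)]  # "the", "quick", "fox"
print(remap_links("the quick fox", "the very quick fox", links))
print(remap_links("the quick fox", "the slow fox", links))
```

A limitation of this snapshot approach is that a word moved elsewhere in the text appears to the diff as a deletion plus an insertion, so its link is lost, whereas the claim contemplates position changes being monitored as they happen.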
77. A data processing method comprising:
inputting recognition data from a speech recognition engine and corresponding audio data, said recognition data including a string of recognised words and audio identifiers identifying audio components corresponding to each of the recognised words;
storing the audio;
implementing a first application program within a computer operating system to receive the input recognised words in text positions, and to process the recognised words such that the positions of the recognised words are changed to form a processed word string;
implementing a second application program from within the computer operating system to determine the positions of the recognised words and monitor changes in the positions of the recognised words in the first application program using operating system functions communicated via the computer operating system, to form link data linking the audio data to the recognised words, and to update the link data in response to monitored changes in the positions of the recognised words, wherein said link data comprises the audio identifiers and the determined positions of corresponding recognised words;
displaying the recognised words;
selecting at least one word in the displayed words, wherein the second application program identifies any selected audio components, if present, which are linked to the at least one selected word; and
playing back any selected audio component.
Dependent claims: 78
Specification