Speech recognition by automated context creation
Abstract
A method for speech recognition can include generating a context-enhanced database from a system input. A voice-generated output can be generated from a speech signal by performing a speech recognition task to convert the speech signal into computer-processable segments. During the speech recognition task, the context-enhanced database can be accessed to improve the speech recognition rate. Accordingly, the speech signal can be interpreted with respect to words included within the context-enhanced database. Additionally, a user can edit or correct an output to generate the final voice-generated output, which can then be made available.
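As a rough illustration of the idea in the abstract, the sketch below builds a word list from a piece of non-voice input and uses it to choose between competing recognition hypotheses. It is not the patented implementation: the function names, the `boost` parameter, the additive scoring rule, and the sample hypotheses with their scores are all assumptions made for the example.

```python
import re

def build_word_list(text):
    """Collect lowercase word tokens from non-voice input such as an e-mail body."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def pick_hypothesis(hypotheses, context_words, boost=0.1):
    """Return the hypothesis text with the best score after a per-word context bonus.

    `hypotheses` is a list of (text, score) pairs from a recognizer (assumed input).
    The additive bonus is a hypothetical scoring rule, not taken from the patent.
    """
    def adjusted(item):
        text, score = item
        hits = sum(1 for word in build_word_list(text) if word in context_words)
        return score + boost * hits
    return max(hypotheses, key=adjusted)[0]

# Hypothetical e-mail text acting as the system input.
email = "Please review the Kowalczyk contract before Friday."
context = build_word_list(email)

# Hypothetical recognizer output: the raw scores slightly favor the wrong spelling.
candidates = [
    ("send the koval chick contract", 0.52),
    ("send the kowalczyk contract", 0.48),
]
best = pick_hypothesis(candidates, context)
```

With these sample values, the second candidate shares three words with the e-mail and the first only two, so the context bonus tips the choice toward the name that appeared in the e-mail.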
38 Claims
1. In a speech recognition system, a method of speech recognition comprising:
(a) receiving non-voice input in a computer system communicatively linked to the speech recognition system, said input having been sent to a user from a different user and comprising at least one of text contained in an e-mail sent or received by the user, information in a document attached to an e-mail sent or received by the user, information in a document viewed by the user on a display of the computer system, information in a plurality of linked documents accessible to the computer system, information in a spread sheet executing on the computer system, facsimile information received via a facsimile device connected to the computer system, call center information received via calling device connected to the computer system, and information recorded by a web browser executing on the computer system;
(b) creating a word list defining a context-enhanced database based upon said input or modifying an existing context-enhanced database by adding a word list created based upon said input, wherein said created and modified context-enhanced databases are dynamically generated based upon at least one of a current activity performed by the user on the computer system and a past activity performed by the user on the computer system within a predetermined time interval, said current and past activities comprising at least one of sending or receiving an e-mail, displaying a document contained in an e-mail, displaying information contained in a spread sheet executing on the computer system, receiving facsimile information via a facsimile device connected to the computer system, receiving call center information via a calling device connected to the computer system, and receiving information recorded by a web browser executing on the computer system;
(c) preparing a first textual output from the speech signal by performing a speech recognition task to convert a speech signal into said first textual output, wherein said context-enhanced database is accessed to improve the speech recognition rate, wherein said speech signal is parsed into a plurality of computer processable speech segments, wherein said first textual output comprises a plurality of text segments, each corresponding to one of the computer processable speech segments, and wherein selective ones of the text segments are generated by matching a computer processable speech segment against an entry within the context-enhanced database, said context-enhanced database including a plurality of entries, each entry comprising a speech utterance and a corresponding textual segment for the speech utterance;
(d) enabling editing of said first textual output to generate a final voice-generated output; and
(e) making said final voice-generated output available.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
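Step (b) of claim 1 ties the context-enhanced database to user activity within a predetermined time interval. A minimal sketch of that windowing idea follows. The `Activity` and `ContextDatabase` classes, the activity labels, the one-hour window, and the timestamps are hypothetical and chosen only for illustration; the claim does not prescribe any particular data structure.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Activity:
    kind: str         # hypothetical label, e.g. "email_received" or "document_viewed"
    text: str         # non-voice content associated with the activity
    timestamp: float  # seconds on an arbitrary clock

@dataclass
class ContextDatabase:
    window_seconds: float  # stands in for the "predetermined time interval"
    activities: list = field(default_factory=list)

    def record(self, activity):
        """Store an activity so its text can contribute to later word lists."""
        self.activities.append(activity)

    def word_list(self, now):
        """Return the words from activities that fall inside the time window."""
        words = set()
        for activity in self.activities:
            age = now - activity.timestamp
            if 0 <= age <= self.window_seconds:
                words.update(re.findall(r"[a-z0-9']+", activity.text.lower()))
        return words

db = ContextDatabase(window_seconds=3600)
db.record(Activity("email_received", "Quarterly forecast attached", timestamp=1000.0))
db.record(Activity("document_viewed", "Obsolete pricing memo", timestamp=100.0))

# At time 4000 the e-mail is 3000 s old (inside the window); the memo is 3900 s old (outside).
recent = db.word_list(now=4000.0)
```

Recomputing the word list on demand, rather than storing it, is one simple way to keep the list aligned with the current window; an implementation could equally expire entries as they age out.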
17. A machine-readable storage, having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform the steps of:
(a) receiving non-voice input in a computer system communicatively linked to the speech recognition system, said input having been sent to a user from a different user and comprising at least one of text contained in an e-mail sent or received by the user, information in a document attached to an e-mail sent or received by the user, information in a document viewed by the user on a display of the computer system, information in a plurality of linked documents accessible to the computer system, information in a spread sheet executing on the computer system, facsimile information received via a facsimile device connected to the computer system, call center information received via calling device connected to the computer system, and information recorded by a web browser executing on the computer system;
(b) creating a word list defining a context-enhanced database based upon said input or modifying an existing context-enhanced database by adding a word list created based upon said input, wherein said created and modified context-enhanced databases are dynamically generated based upon at least one of a current activity performed by the user on the computer system and a past activity performed by the user on the computer system within a predetermined time interval, said current and past activities comprising at least one of sending or receiving an e-mail, displaying a document contained in an e-mail, displaying information contained in a spread sheet executing on the computer system, receiving facsimile information via a facsimile device connected to the computer system, receiving call center information via a calling device connected to the computer system, and receiving information recorded by a web browser executing on the computer system;
(c) preparing a first textual output from a speech signal by performing a speech recognition task to convert said speech signal into said first textual output, wherein said context-enhanced database is accessed to improve the speech recognition rate, wherein said speech signal is parsed into a plurality of computer processable speech segments, wherein said first textual output comprises a plurality of text segments, each corresponding to one of the computer processable speech segments, and wherein selective ones of the text segments are generated by matching a computer processable speech segment against an entry within the context-enhanced database, said context-enhanced database including a plurality of entries, each entry comprising a speech utterance and a corresponding textual segment for the speech utterance;
(d) enabling editing of said first textual output to generate a final voice-generated output; and
(e) making said final voice-generated output available.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32
33. In a speech recognition system, a method of speech recognition comprising the steps of:
receiving non-voice input in a computer system communicatively linked to the speech recognition system, said input having been sent to a user from a different user and comprising at least one of text contained in an e-mail sent or received by the user, information in a document attached to an e-mail sent or received by the user, information in a document viewed by the user on a display of the computer system, information in a plurality of linked documents accessible to the computer system, information in a spread sheet executing on the computer system, facsimile information received via a facsimile device connected to the computer system, call center information received via calling device connected to the computer system, and information recorded by a web browser executing on the computer system;
creating a word list defining a context-enhanced database based upon the input or modifying an existing context-enhanced database by adding a word list created based upon the input, wherein said created and modified context-enhanced databases are dynamically generated based upon at least one of a current activity performed by the user on the computer system and a past activity performed by the user on the computer system within a predetermined time interval, said current and past activities comprising at least one of sending or receiving an e-mail, displaying a document contained in an e-mail, displaying information contained in a spread sheet executing on the computer system, receiving facsimile information via a facsimile device connected to the computer system, receiving call center information via a calling device connected to the computer system, and receiving information recorded by a web browser executing on the computer system;
parsing a received speech signal into a plurality of speech segments;
comparing said speech segments against entries in the context-enhanced database;
when matching entries are found in the comparing step, for each matching entry retrieving a textual segment from the context-enhanced database that is associated with the matching entry; and
generating textual output for the speech signal that includes the retrieved textual segments.
Dependent claims: 34, 35, 36, 37, 38
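The parse, compare, retrieve, and generate steps of claim 33 can be pictured with a toy lookup, sketched below under heavy simplification. Real speech segments are acoustic data; here each segment is represented by a made-up pseudo-phonetic key string so the example stays runnable. The `parse_segments` and `transcribe` functions, the key format, and the `fallback` callback for unmatched segments are all assumptions, not elements of the claim.

```python
def parse_segments(signal):
    """Stand-in parser: split a space-separated string of pseudo-phonetic keys."""
    return signal.split()

def transcribe(signal, entries, fallback):
    """Build textual output from database matches, deferring to `fallback` otherwise.

    `entries` maps an utterance key to its textual segment (hypothetical layout).
    `fallback` handles segments with no matching entry; the claim leaves this open.
    """
    output = []
    for segment in parse_segments(signal):
        if segment in entries:
            # Matching entry found: retrieve the associated textual segment.
            output.append(entries[segment])
        else:
            output.append(fallback(segment))
    return " ".join(output)

# Hypothetical context-enhanced database entries.
entries = {
    "K-AH-N-T-R-AE-K-T": "contract",
    "F-R-AY-D-EY": "Friday",
}

text = transcribe(
    "S-AY-N K-AH-N-T-R-AE-K-T F-R-AY-D-EY",
    entries,
    fallback=lambda segment: "<" + segment + ">",
)
```

Here the first segment has no entry and is passed through by the fallback in angle brackets, while the other two are replaced by the text stored in the database. An exact dictionary lookup is used only for brevity; a practical matcher would compare acoustic features approximately.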
Specification