PHONETIC ALIGNMENT FOR USER-AGENT DIALOGUE RECOGNITION
First Claim
1. A method for speech to text transcription comprising:
providing access to a knowledge base containing solution descriptions, each solution description including a textual description of a solution to a respective problem;
generating a preliminary transcription of at least an agent's part of an audio recording of a dialogue between the agent and a user in which the agent had access to the knowledge base, the generating comprising:
identifying a sequence of phonemes based on the agent's part of the audio recording, and
based on the identified sequence of phonemes, generating the preliminary transcription, the preliminary transcription including a sequence of words recognized as corresponding to phonemes in the sequence of phonemes and unrecognized phonemes from the phoneme sequence that are not recognized as corresponding to one of the recognized words; and
revising the preliminary transcription, the revising comprising replacement of unrecognized phonemes with at least one word from a solution description, the solution description including words which match words of the sequence of recognized words,
wherein at least one of the generating of the preliminary transcription and the revising of the preliminary transcription is performed with a processor.
Abstract
A method for speech to text transcription uses a knowledge base containing solution descriptions, each describing, in words, a solution to a respective problem. An audio recording of a dialogue between an agent and a user in which the agent had access to the knowledge base is received. A sequence of phonemes based on the agent's part of the audio recording is identified and from this, a preliminary transcription is made which includes a sequence of words recognized as corresponding to phonemes in the identified sequence of phonemes together with any unrecognized phonemes from the phoneme sequence that are not recognized as corresponding to one of the recognized words. The preliminary transcription is revised by replacing one or more of the unrecognized phonemes with a word or words from a solution description that includes words which match adjacent words of the sequence of recognized words.
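The revision step summarized above can be illustrated with a minimal sketch. It is not the patented implementation: the names `Word`, `Unrecognized`, `words_between` and `revise` are hypothetical, the phoneme symbols are assumed to be ARPAbet-style, and the sketch matches only on the adjacent recognized words, without checking that the inserted words sound like the unrecognized phonemes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Word:
    """A word the decoder recognized (hypothetical token type)."""
    text: str


@dataclass(frozen=True)
class Unrecognized:
    """A run of phonemes the decoder could not map to a word."""
    phonemes: Tuple[str, ...]


Token = Union[Word, Unrecognized]


def words_between(solution: List[str], left: str, right: str) -> Optional[List[str]]:
    """Return the solution words lying between `left` and the next `right`, if any."""
    lowered = [w.lower() for w in solution]
    for i, w in enumerate(lowered):
        if w != left.lower():
            continue
        # Start at i + 2 so that at least one word sits between the neighbours.
        for j in range(i + 2, len(lowered)):
            if lowered[j] == right.lower():
                return solution[i + 1:j]
    return None


def revise(prelim: List[Token], solution: List[str]) -> List[Token]:
    """Replace unrecognized runs whose adjacent words both appear in the solution text."""
    out: List[Token] = []
    for i, tok in enumerate(prelim):
        if isinstance(tok, Unrecognized) and 0 < i < len(prelim) - 1:
            left, right = prelim[i - 1], prelim[i + 1]
            if isinstance(left, Word) and isinstance(right, Word):
                filler = words_between(solution, left.text, right.text)
                if filler:
                    out.extend(Word(w) for w in filler)
                    continue
        out.append(tok)  # keep recognized words and unresolved runs unchanged
    return out
```

With an invented solution text "Open the toner cartridge door and remove the cartridge", a preliminary transcription of "open the [T OW N ER] cartridge door" would be revised to "open the toner cartridge door", because "the" and "cartridge" bracket "toner" in the solution description.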
20 Claims
1. A method for speech to text transcription comprising:
providing access to a knowledge base containing solution descriptions, each solution description including a textual description of a solution to a respective problem;
generating a preliminary transcription of at least an agent's part of an audio recording of a dialogue between the agent and a user in which the agent had access to the knowledge base, the generating comprising:
identifying a sequence of phonemes based on the agent's part of the audio recording, and
based on the identified sequence of phonemes, generating the preliminary transcription, the preliminary transcription including a sequence of words recognized as corresponding to phonemes in the sequence of phonemes and unrecognized phonemes from the phoneme sequence that are not recognized as corresponding to one of the recognized words; and
revising the preliminary transcription, the revising comprising replacement of unrecognized phonemes with at least one word from a solution description, the solution description including words which match words of the sequence of recognized words,
wherein at least one of the generating of the preliminary transcription and the revising of the preliminary transcription is performed with a processor.
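The generating step of claim 1 yields a mixed sequence of recognized words and leftover phonemes. The following toy sketch shows one way such a sequence could be produced, assuming a greedy longest-match lookup against a small pronunciation lexicon. A real decoder would rely on acoustic and language models instead; `LEXICON`, `preliminary_transcription` and the ARPAbet-style symbols are hypothetical and not taken from the patent.

```python
from typing import Dict, List, Sequence, Tuple, Union

# Hypothetical toy pronunciation lexicon: phoneme tuple -> word.
LEXICON: Dict[Tuple[str, ...], str] = {
    ("OW", "P", "AH", "N"): "open",
    ("DH", "AH"): "the",
    ("D", "AO", "R"): "door",
}

# A recognized word is a str; a run of unrecognized phonemes is a tuple of str.
Token = Union[str, Tuple[str, ...]]


def preliminary_transcription(
    phonemes: Sequence[str],
    lexicon: Dict[Tuple[str, ...], str] = LEXICON,
) -> List[Token]:
    """Greedy longest-match decoding that keeps unmatched phonemes in place."""
    max_len = max((len(k) for k in lexicon), default=0)
    out: List[Token] = []
    pending: List[str] = []  # unrecognized phonemes seen since the last word
    i = 0
    while i < len(phonemes):
        for n in range(min(max_len, len(phonemes) - i), 0, -1):
            key = tuple(phonemes[i:i + n])
            if key in lexicon:
                if pending:  # flush the unrecognized run before the word
                    out.append(tuple(pending))
                    pending = []
                out.append(lexicon[key])
                i += n
                break
        else:
            # No lexicon entry starts here: keep the phoneme as unrecognized.
            pending.append(phonemes[i])
            i += 1
    if pending:
        out.append(tuple(pending))
    return out
```

Keeping the unrecognized phonemes in position, rather than dropping them, is what lets a later revision step use the neighbouring recognized words as context.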
18. A system for speech to text transcription comprising:
a speech to text decoder for generating a preliminary transcription of at least an agent's part of an audio recording of a dialogue between the agent and a user, the agent having access to an associated knowledge base of solution descriptions, each solution description including a textual description of a solution to a respective problem, the decoder configured for:
identifying a sequence of phonemes based on the agent's part of the audio recording, and
based on the identified sequence of phonemes, generating the preliminary transcription, the preliminary text transcription including a sequence of words recognized as corresponding to phonemes in the sequence of phonemes and unrecognized phonemes from the phoneme sequence that are not recognized as corresponding to one of the recognized words;
a revision component for revising the preliminary transcription, the revision component configured for:
comparing recognized words in the preliminary transcription with words in solution descriptions in the knowledge base to identify candidate solution descriptions which each include a sequence of text which includes words which are determined to match at least some of the identified words in the preliminary transcription, and
using a phoneme sequence corresponding to a sequence of text in one of the candidate solution descriptions, replacing unrecognized phonemes in the preliminary transcription with at least one word of the sequence of text in the candidate solution description to generate a revised transcription; and
a processor which implements at least one of the generating of the preliminary transcription and the revising of the preliminary transcription.
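The two operations of the revision component in claim 18 can be sketched as follows, under stated assumptions: candidates are ranked by simple word overlap, and the replacement word is the one whose pronunciation has the smallest Levenshtein distance to the unrecognized phonemes. The claim does not prescribe either measure. The function names, the `min_overlap` and `max_dist` thresholds, and the pronunciation table are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def candidate_solutions(
    recognized: Sequence[str],
    knowledge_base: Dict[str, str],
    min_overlap: int = 2,  # assumed threshold, not from the source
) -> List[str]:
    """Return ids of solution descriptions sharing enough words, best first."""
    rec = {w.lower() for w in recognized}
    scored = []
    for sid, text in knowledge_base.items():
        overlap = len(rec & {w.lower() for w in text.split()})
        if overlap >= min_overlap:
            scored.append((overlap, sid))
    return [sid for _, sid in sorted(scored, key=lambda t: (-t[0], t[1]))]


def best_replacement(
    unrecognized: Sequence[str],
    candidate_text: str,
    pronunciations: Dict[str, Tuple[str, ...]],
    max_dist: int = 1,  # assumed tolerance for decoder errors
) -> Optional[str]:
    """Pick the candidate word whose phonemes are closest to the unrecognized run."""
    best: Optional[Tuple[int, str]] = None
    for word in candidate_text.split():
        phones = pronunciations.get(word.lower())
        if phones is None:
            continue  # no known pronunciation for this word
        d = edit_distance(unrecognized, phones)
        if d <= max_dist and (best is None or d < best[0]):
            best = (d, word)
    return best[1] if best else None
```

Comparing phonemes instead of spellings lets a slightly misheard run such as T OW N AH still resolve to "toner" when a candidate solution description contains that word.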
20. A method for providing a system for speech to text transcription comprising:
with a processor, for each of a set of solution descriptions in a knowledge base which includes a textual description of a solution to a respective problem with a device, associating the solution description with a sequence of phonemes corresponding to at least a part of the textual description;
providing access to a speech to text converter which is configured for generating a preliminary transcription of at least an agent's part of an audio recording of a dialogue between the agent and a user in which the agent has access to the knowledge base, the generating comprising:
identifying a sequence of phonemes based on the agent's part of the audio recording, and
based on the identified sequence of phonemes, generating the preliminary transcription, the preliminary transcription including a sequence of words recognized as corresponding to phonemes in the sequence of phonemes and any unrecognized phonemes from the phoneme sequence that are not recognized as corresponding to one of the recognized words; and
providing instructions for revising the preliminary transcription when there are unrecognized phonemes from the phoneme sequence, the instructions providing for replacement of unrecognized phonemes with text from a solution description which includes words from the sequence of recognized words.
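The first step of claim 20, associating each solution description with a phoneme sequence, might be precomputed along the following lines. This is a sketch under assumptions: pronunciations come from a dictionary lookup, and words with no known pronunciation are skipped, which the claim's "at least a part of the textual description" appears to allow. `associate_phonemes` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Phones = Tuple[str, ...]


def associate_phonemes(
    knowledge_base: Dict[str, str],
    pronunciations: Dict[str, Phones],
) -> Dict[str, List[Tuple[str, Phones]]]:
    """Map each solution id to (word, phonemes) pairs for the words we can pronounce."""
    index: Dict[str, List[Tuple[str, Phones]]] = {}
    for sid, text in knowledge_base.items():
        entries: List[Tuple[str, Phones]] = []
        for raw in text.split():
            word = raw.strip(".,;:!?").lower()  # crude normalization
            phones = pronunciations.get(word)
            if phones is not None:
                entries.append((word, phones))
        index[sid] = entries
    return index
```

Building this index once, when the knowledge base is set up, means the revision step does not have to convert solution text to phonemes each time a transcription is processed.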
Specification