Transcription system for multiple speakers, using and establishing identification
Abstract
A method and apparatus for transcribing text from multiple speakers in a computer system having a speech recognition application. The system receives speech from one of a plurality of speakers through a single channel, assigns a speaker ID to the speaker, transcribes the speech into text, and associates the speaker ID with the speech and text. In order to detect a speaker change, the system monitors the speech input through the channel for a speaker change.
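The flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: `recognize` (audio chunk to text) and `voice_key` (audio chunk to a hashable voice label) are hypothetical stand-ins for a speech recognition model and a speaker identification step, and the `Segment` type is an assumption introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Iterable, List


@dataclass
class Segment:
    """Speech and text attributed to one speaker between two speaker changes."""
    speaker_id: int
    audio: List[bytes]
    text: List[str]


def transcribe_channel(
    frames: Iterable[bytes],
    recognize: Callable[[bytes], str],       # hypothetical recognition model
    voice_key: Callable[[bytes], Hashable],  # hypothetical speaker identifier
) -> List[Segment]:
    """Transcribe a single channel, tagging speech and text with speaker IDs."""
    document: List[Segment] = []
    ids: Dict[Hashable, int] = {}
    for frame in frames:
        # Assign a unique ID the first time a voice is heard; reuse it afterwards.
        speaker_id = ids.setdefault(voice_key(frame), len(ids) + 1)
        # Monitor for a speaker change: start a new segment when the ID differs.
        if not document or document[-1].speaker_id != speaker_id:
            document.append(Segment(speaker_id, [], []))
        # Associate both the speech signal and its text with the speaker ID.
        document[-1].audio.append(frame)
        document[-1].text.append(recognize(frame))
    return document
```

In this sketch a speaker change is detected simply by comparing the voice label of each incoming chunk with that of the current segment; a real system would presumably rely on acoustic analysis of the signal instead.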
36 Claims
1. In a computer system having a text independent speech recognition application, a method of transcribing text from multiple speakers comprising the steps of:
receiving a speech signal from one of a plurality of speakers through a single channel;
assigning a unique speaker ID to said speaker providing said speech signal through said channel;
processing said speech signal into text using a speech recognition model;
creating a document containing said text;
associating said processed speech signal and said text in said document with said unique speaker ID assigned to said speaker; and
monitoring said speech signal for a speaker change to a different one of said plurality of speakers.
2. In a method of transcribing text from multiple speakers as claimed in claim 1, further comprising the step:
assigning a different unique speaker ID to said different speaker.
3. In a method of transcribing text from multiple speakers as claimed in claim 2, further comprising the step:
associating said processed speech signal and said text in said document from said different speaker with said different unique speaker ID.
4. In a method of transcribing text from multiple speakers as claimed in claim 1, wherein said speech signal is buffered.
5. In a method of transcribing text from multiple speakers as claimed in claim 1, wherein at least one of said speakers is an enrolled speaker, and a preassigned unique speaker ID is assigned to said enrolled speaker.
6. In a method of transcribing text from multiple speakers as claimed in claim 1, wherein text in said document which has been processed from portions of said speech signal which can be attributed to one speaker is distinguished from text in said document which has been processed from other portions of said speech signal which can be attributed to different speakers.
7. In a method of transcribing text from multiple speakers as claimed in claim 6, wherein text in said document which can be attributed to different speakers is distinguished by starting a new paragraph in said document for every speaker change.
8. In a method of transcribing text from multiple speakers as claimed in claim 1, wherein at least one of said speakers is an unenrolled speaker.
9. In a method of transcribing text from multiple speakers as claimed in claim 8, wherein a speech signal and corresponding processed text from said unenrolled speaker is used to enroll said speaker.
10. In a method of transcribing text from multiple speakers as claimed in claim 8, wherein at least a portion of said speech signal and corresponding processed text from said unenrolled speaker is used to develop a speaker dependent speech recognition model.
11. In a method of transcribing text from multiple speakers as claimed in claim 10, wherein said speaker dependent model is used to reprocess the text in said document for said unenrolled speaker.
12. In a method of transcribing text from multiple speakers as claimed in claim 1, wherein a different speech recognition model is used to reprocess said text in said document.
22. In a system as claimed in claim 8, wherein at least a portion of said speech signal and corresponding processed text from said unenrolled speaker is used to develop a speaker dependent speech recognition model.
23. In a system as claimed in claim 22, wherein said speaker dependent model is used to reprocess the text in said document for said unenrolled speaker.
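Claims 6 and 7 distinguish text from different speakers by starting a new paragraph at every speaker change. As an illustration only, the sketch below assumes the transcript is available as a sequence of `(speaker_id, text)` pairs; the function name `render_paragraphs` and that input shape are assumptions made for this example, not terms from the claims.

```python
from typing import Iterable, List, Optional, Tuple


def render_paragraphs(utterances: Iterable[Tuple[int, str]]) -> str:
    """Join utterances into paragraphs, breaking at every speaker change."""
    paragraphs: List[List[str]] = []
    last_id: Optional[int] = None
    for speaker_id, text in utterances:
        if speaker_id != last_id:
            # A different speaker ID marks a speaker change: open a new paragraph.
            paragraphs.append([])
            last_id = speaker_id
        paragraphs[-1].append(text)
    # Paragraphs are separated by a blank line; sentences within one by a space.
    return "\n\n".join(" ".join(p) for p in paragraphs)
```

Consecutive utterances by the same speaker stay in one paragraph, so a speaker who returns later in the conversation starts a fresh paragraph rather than being merged into an earlier one.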
13. In a computer system having a text independent speech recognition application adapted for transcribing text from multiple speakers comprising:
means for receiving a speech signal from one of a plurality of speakers through a single channel;
means for assigning a unique speaker ID to said speaker providing said speech signal through said channel;
means for processing said speech signal into text using a speech recognition model;
means for creating a document containing said text;
means for associating said processed speech signal and said text in said document with said unique speaker ID assigned to said speaker; and
means for monitoring said speech signal for a speaker change to a different one of said plurality of speakers.
14. In a system as claimed in claim 13, further comprising:
means for assigning a different unique speaker ID to said different speaker.
15. In a system as claimed in claim 14, further comprising:
means for associating said processed speech signal and said text in said document from said different speaker with said different unique speaker ID.
16. In a system as claimed in claim 13, wherein said speech signal is buffered.
17. In a system as claimed in claim 13, wherein at least one of said speakers is an enrolled speaker, and a preassigned unique speaker ID is assigned to said enrolled speaker.
18. In a system as claimed in claim 13, wherein text in said document which has been processed from portions of said speech signal which can be attributed to one speaker is distinguished from text in said document which has been processed from other portions of said speech signal which can be attributed to different speakers.
19. In a system as claimed in claim 18, wherein text in said document which can be attributed to different speakers is distinguished by starting a new paragraph in said document for every speaker change.
20. In a system as claimed in claim 13, wherein at least one of said speakers is an unenrolled speaker.
21. In a system as claimed in claim 20, wherein a speech signal and corresponding processed text from said unenrolled speaker is used to enroll said speaker.
24. In a system as claimed in claim 13, wherein a different speech recognition model is used to reprocess said text in said document.
25. A machine readable storage, having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform the steps of:
receiving a speech signal from one of a plurality of speakers through a single channel;
assigning a unique speaker ID to said speaker providing said speech signal through said channel;
processing said speech signal into text using a speech recognition model;
creating a document containing said text;
associating said processed speech signal and said text in said document with said unique speaker ID assigned to said speaker; and
monitoring said speech signal for a speaker change to a different one of said plurality of speakers.
26. The machine readable storage as claimed in claim 25, further including a plurality of code sections executable by a machine for causing the machine to perform the step of:
assigning a different unique speaker ID to said different speaker.
27. The machine readable storage as claimed in claim 26, further including a plurality of code sections executable by a machine for causing the machine to perform the step of:
associating said processed speech signal and said text in said document from said different speaker with said different unique speaker ID.
28. The machine readable storage as claimed in claim 25, wherein said speech signal is buffered.
29. The machine readable storage as claimed in claim 25, wherein at least one of said speakers is an enrolled speaker, and a preassigned unique speaker ID is assigned to said enrolled speaker.
30. The machine readable storage as claimed in claim 25, wherein text in said document which has been processed from portions of said speech signal which can be attributed to one speaker is distinguished from text in said document which has been processed from other portions of said speech signal which can be attributed to different speakers.
31. The machine readable storage as claimed in claim 30, wherein text in said document which can be attributed to different speakers is distinguished by starting a new paragraph in said document for every speaker change.
32. The machine readable storage as claimed in claim 25, wherein at least one of said speakers is an unenrolled speaker.
33. The machine readable storage as claimed in claim 32, wherein a speech signal and corresponding processed text from said unenrolled speaker is used to enroll said speaker.
34. The machine readable storage as claimed in claim 32, wherein at least a portion of said speech signal and corresponding processed text from said unenrolled speaker is used to develop a speaker dependent speech recognition model.
35. The machine readable storage as claimed in claim 34, wherein said speaker dependent model is used to reprocess the text in said document for said unenrolled speaker.
36. The machine readable storage as claimed in claim 25, wherein a different speech recognition model is used to reprocess said text in said document.
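Claims 10 and 11, and their counterparts among the system and storage claims, describe using an unenrolled speaker's speech and processed text to develop a speaker dependent model and then reprocessing that speaker's text with it. The sketch below is one hedged reading of that loop. The `train` callable, the `Model` type, and the `Segment` record are hypothetical names introduced for illustration; how a speaker dependent model is actually built is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical model type: maps a speech signal to text.
Model = Callable[[bytes], str]


@dataclass
class Segment:
    speaker_id: int
    audio: bytes
    text: str


def reprocess_speaker(
    document: List[Segment],
    speaker_id: int,
    train: Callable[[Sequence[Tuple[bytes, str]]], Model],  # hypothetical trainer
) -> Model:
    """Build a speaker dependent model and re-transcribe that speaker's segments."""
    # Collect the speaker's speech signal together with its first-pass text.
    samples = [(s.audio, s.text) for s in document if s.speaker_id == speaker_id]
    model = train(samples)
    # Reprocess only the text attributed to this speaker; others are untouched.
    for segment in document:
        if segment.speaker_id == speaker_id:
            segment.text = model(segment.audio)
    return model
```

Keeping the speech signal associated with each speaker ID, as the independent claims require, is what makes this second pass possible: the stored audio can be run through the new model without asking the speaker to repeat anything.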
Specification