SPEECH PROCESSING SYSTEM AND METHOD
First Claim
1. A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers,
the method comprising:
receiving speech;
dividing the speech into segments as it is received;
processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising:
performing primary decoding of the segment using an acoustic model and a language model;
obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding;
comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker;
updating the selected speaker profile;
performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile;
outputting the decoded speech for the identified speaker,
wherein the speaker profiles are updated as further segments of speech relating to a speaker profile are processed.
Abstract
A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers,
the method comprising:
- receiving speech;
- dividing the speech into segments as it is received;
- processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising:
- performing primary decoding of the segment using an acoustic model and a language model;
- obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding;
- comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker;
- updating the selected speaker profile;
- performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile;
- outputting the decoded speech for the identified speaker,
wherein the speaker profiles are updated as further segments of speech relating to a speaker profile are processed.
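
The steps above can be sketched as a streaming loop. This is a minimal illustration under stated assumptions, not the patented implementation: the names `SpeakerProfile`, `process_stream`, `toy_primary_decode` and `toy_adapted_decode` are hypothetical, the segment parameters are assumed to be a plain vector of floats, the comparison is assumed to be Euclidean distance to each profile's mean, and the profile update is assumed to be a running mean. The source does not specify any of these choices, and both decoders are stand-ins with no real acoustic or language model behind them.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple

Vector = List[float]


@dataclass
class SpeakerProfile:
    """Hypothetical stored profile: a running mean of segment parameters."""
    speaker_id: str
    mean: Vector
    count: int = 1  # number of parameter vectors already folded into `mean`

    def update(self, params: Vector) -> None:
        # Incremental mean (assumed update rule): new = old + (x - old) / n
        self.count += 1
        self.mean = [m + (p - m) / self.count for m, p in zip(self.mean, params)]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance, an assumed similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def process_stream(
    segments: Iterable[List[float]],
    profiles: List[SpeakerProfile],
    primary_decode: Callable[[List[float]], Tuple[str, Vector]],
    adapted_decode: Callable[[List[float], SpeakerProfile], str],
) -> Iterator[Tuple[str, str]]:
    """Process segments in arrival order, yielding (speaker_id, decoded_text)."""
    for segment in segments:
        # Primary decoding also yields the segment parameters.
        _first_pass_text, params = primary_decode(segment)
        # Compare with the stored profiles and select the closest one.
        profile = min(profiles, key=lambda p: distance(p.mean, params))
        # Update the selected profile before the second pass.
        profile.update(params)
        # Further decoding, adapted using the updated profile.
        yield profile.speaker_id, adapted_decode(segment, profile)


def toy_primary_decode(segment: List[float]) -> Tuple[str, Vector]:
    # Stand-in decoder: no text, and the "parameters" are just the sample mean.
    return "", [sum(segment) / len(segment)]


def toy_adapted_decode(segment: List[float], profile: SpeakerProfile) -> str:
    # Stand-in for the speaker-adapted second pass.
    return f"<{len(segment)} samples decoded with profile {profile.speaker_id}>"


if __name__ == "__main__":
    stored = [SpeakerProfile("A", [0.0]), SpeakerProfile("B", [10.0])]
    stream = [[1.0, 1.0], [9.0, 11.0], [2.0, 2.0]]
    for speaker, text in process_stream(
        stream, stored, toy_primary_decode, toy_adapted_decode
    ):
        print(speaker, text)
```

Because `process_stream` is a generator, each segment is identified and decoded as it arrives, and a profile that is selected again for a later segment has already absorbed the earlier ones.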
85 Citations
14 Claims
1. A method for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers,
the method comprising:
receiving speech;
dividing the speech into segments as it is received;
processing the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising:
performing primary decoding of the segment using an acoustic model and a language model;
obtaining segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding;
comparing the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker;
updating the selected speaker profile;
performing a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile;
outputting the decoded speech for the identified speaker,
wherein the speaker profiles are updated as further segments of speech relating to a speaker profile are processed.
14. A system for identifying a plurality of speakers in audio data and for decoding the speech spoken by said speakers,
the system comprising:
a receiver for audio containing speech; and
a processor, said processor being adapted to:
divide the speech into segments as it is received;
process the received speech segment by segment in the order received to identify the speaker and to decode the speech, processing comprising:
perform primary decoding of the segment using an acoustic model and a language model;
obtain segment parameters indicating the differences between the speaker of the segment and a base speaker during the primary decoding;
compare the segment parameters with a plurality of stored speaker profiles to determine the identity of the speaker, and selecting a speaker profile for said speaker;
update the selected speaker profile; and
perform a further decoding of the segment using a speaker independent acoustic model, adapted using the updated speaker profile,
the system further comprising an output for outputting the decoded speech for the identified speaker,
wherein the speaker profiles are updated as further segments of speech relating to a speaker profile are processed.
Specification