Speech recognition system and method for generating a mask of the system
Abstract
The speech recognition system of the present invention includes: a sound source separating section which separates mixed speeches from multiple sound sources; a mask generating section which generates a soft mask which can take continuous values between 0 and 1 for each separated speech according to reliability of separation in separating operation of the sound source separating section; and a speech recognizing section which recognizes speeches separated by the sound source separating section using soft masks generated by the mask generating section.
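As a rough illustration of a mask that "can take continuous values between 0 and 1", the sketch below maps a grid of reliability values R(f,t) to mask weights. The sigmoid form and the parameter names `k` (steepness) and `b` (midpoint) are assumptions made for illustration only; the text above states only that the mask depends on the reliability of separation. A binary (hard) mask is included for contrast.

```python
import math

def soft_mask(reliability, k=10.0, b=0.5):
    """Map reliability values R(f,t) to mask weights strictly between 0 and 1.

    The sigmoid and its parameters k and b are hypothetical choices,
    not taken from the source text.
    """
    return [[1.0 / (1.0 + math.exp(-k * (r - b))) for r in frame]
            for frame in reliability]

def hard_mask(reliability, threshold=0.5):
    """Binary mask for comparison: each feature is either kept (1) or dropped (0)."""
    return [[1.0 if r > threshold else 0.0 for r in frame]
            for frame in reliability]

# Toy reliability grid: rows are time frames, columns are frequency bands.
R = [[0.1, 0.5, 0.9],
     [0.45, 0.55, 0.99]]
M_soft = soft_mask(R)
M_hard = hard_mask(R)
```

With the soft mask, features whose reliability is close to the midpoint receive intermediate weights instead of being switched fully on or off.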
10 Claims
1. A speech recognition system comprising:
- multiple sound sources;
- a sound source separating section which separates mixed speeches from the multiple sound sources; and
- at least one processor configured to:
  - generate a soft mask which can take continuous values between 0 and 1 for each separated speech according to reliability of separation in separating operation of the sound source separating section, and
  - recognize speeches separated by the sound source separating section using the soft masks,
- wherein the reliability of separation R(f,t) is defined as [equation omitted in source]
Dependent claims: 2, 3, 10

4. A method for generating a soft mask for a speech recognition system, the method comprising:
- separating, at a sound source separating section of the speech recognition system, mixed speeches from multiple sound sources;
- generating, at a mask generating section of the speech recognition system, a soft mask which can take continuous values between 0 and 1 for each separated speech according to reliability of separation in separating operation of the sound source separating section;
- recognizing, at a speech recognizing section of the speech recognition system, speeches separated by the sound source separating section using soft masks generated by the mask generating section, the soft mask being determined using a function of the reliability of separation, which has at least one parameter;
- determining a search space of said at least one parameter;
- obtaining a speech recognition rate of the speech recognition system while changing a value of the speech recognition system in the search space; and
- setting the value which maximizes a speech recognition rate of the speech recognition system to said at least one parameter,
- wherein the reliability of separation R(f,t) is defined as [equation omitted in source]

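The parameter-setting steps of claim 4 (define a search space, measure the recognition rate at each candidate value, keep the value that maximizes it) can be sketched as an exhaustive grid search. Everything named here is hypothetical: `toy_recognition_rate` is a stand-in for running the recognizer on evaluation data, and the parameters `k` and `b` are assumed mask-function parameters, not ones named in the claim.

```python
from itertools import product

def grid_search(recognition_rate, search_space):
    """Evaluate every parameter combination and return the best one.

    recognition_rate: callable taking the parameters as keyword arguments
                      and returning a score (higher is better).
    search_space:     dict mapping parameter name -> list of candidate values.
    """
    names = list(search_space)
    best_params, best_rate = None, float("-inf")
    for values in product(*(search_space[n] for n in names)):
        params = dict(zip(names, values))
        rate = recognition_rate(**params)
        if rate > best_rate:
            best_params, best_rate = params, rate
    return best_params, best_rate

def toy_recognition_rate(k, b):
    # Hypothetical stand-in for a real evaluation run; peaks at k=20, b=0.3.
    return 1.0 - 0.001 * (k - 20) ** 2 - (b - 0.3) ** 2

space = {"k": [5, 10, 20, 40], "b": [0.1, 0.2, 0.3, 0.4]}
best, rate = grid_search(toy_recognition_rate, space)
```

An exhaustive search is only practical when the search space is small, since each evaluation in a real system would mean a full recognition pass over the test utterances.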
5. A method for generating a soft mask for a speech recognition system, the method comprising:
- separating, at a sound source separating section of the speech recognition system, mixed speeches from multiple sound sources;
- generating, at a mask generating section of the speech recognition system, a soft mask which can take continuous values between 0 and 1 for each separated speech according to reliability of separation in separating operation of the sound source separating section;
- recognizing, at a speech recognizing section of the speech recognition system, speeches separated by the sound source separating section using soft masks generated by the mask generating section, the soft mask being determined using a function of the reliability of separation, which has at least one parameter;
- obtaining a histogram of the reliability of separation; and
- determining a value of said at least one parameter from a form of the histogram of the reliability of separation,
- wherein the reliability of separation R(f,t) is defined as [equation omitted in source]
Dependent claims: 6, 7, 8, 9

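Claim 5 derives the parameter from the form of a histogram of reliability values without running a recognition-rate search. The claim text here does not say which feature of the histogram is used, so the sketch below makes an explicit assumption: the reliabilities form two clusters (unreliable and reliable features), and a midpoint parameter `b` is placed at the emptiest bin between the two peaks. The function names and the toy data are hypothetical.

```python
def histogram(values, bins=10, lo=0.0, hi=1.0):
    """Count values into equal-width bins; returns (counts, bin_centers)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so out-of-range values fall into the first or last bin.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    return counts, centers

def valley_between_peaks(counts, centers):
    """Return the center of the least-populated bin between the two peaks.

    Assumes one peak in the lower half of the range and one in the upper half.
    """
    half = len(counts) // 2
    left = max(range(half), key=counts.__getitem__)
    right = max(range(half, len(counts)), key=counts.__getitem__)
    valley = min(range(left, right + 1), key=counts.__getitem__)
    return centers[valley]

# Toy reliability values with a low cluster and a high cluster.
reliabilities = [0.05, 0.12, 0.15, 0.18, 0.22, 0.35, 0.45,
                 0.65, 0.75, 0.82, 0.85, 0.88, 0.93, 0.97]
counts, centers = histogram(reliabilities)
b = valley_between_peaks(counts, centers)
```

Other readings of "form of the histogram" are equally plausible, for example fitting a distribution to each cluster and using their crossing point; the valley rule is used here only because it is short to state.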
Specification