Speech recognition system, training arrangement and method of calculating iteration values for free parameters of a maximum-entropy speech model
Abstract
The invention relates to a speech recognition system and a method of calculating iteration values for free parameters $\lambda_\alpha^{\mathrm{ortho}}(n)$ of a maximum-entropy speech model MESM with the aid of the generalized iterative scaling training algorithm in a computer-supported speech recognition system in accordance with the formula
$$\lambda_\alpha^{\mathrm{ortho}}(n+1) = G\left(\lambda_\alpha^{\mathrm{ortho}}(n),\ m_\alpha^{\mathrm{ortho}},\ \ldots\right)$$
where $n$ is an iteration parameter, $G$ is a mathematical function, $\alpha$ is an attribute in the MESM and $m_\alpha^{\mathrm{ortho}}$ is a desired orthogonalized boundary value in the MESM for the attribute $\alpha$. It is an object of the invention to further develop the system and method so that they make a fast computation of the free parameters $\lambda$ possible without a change of the original training object. According to the invention, this object is achieved in that the desired orthogonalized boundary value $m_\alpha^{\mathrm{ortho}}$ is calculated by a linear combination of the desired boundary value $m_\alpha$ with desired boundary values $m_\beta$ from attributes $\beta$ that have a larger range than the attribute $\alpha$. $m_\alpha$ and $m_\beta$ are then desired boundary values of the original training object.
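As an illustration of the linear combination described above, the following is a minimal sketch under stated assumptions, not the patented formula. It assumes that attributes are word n-grams encoded as tuples, that an attribute β "has a larger range than and includes" α when β is longer and ends with α, and that the combination takes the form of subtracting the already orthogonalized values of all such β. The names `includes` and `orthogonalized_boundary_values` are hypothetical.

```python
from typing import Dict, Tuple

Attribute = Tuple[str, ...]  # hypothetical encoding: an n-gram of words


def includes(beta: Attribute, alpha: Attribute) -> bool:
    """True if beta is longer than alpha and ends with alpha (assumed relation)."""
    return len(beta) > len(alpha) and beta[len(beta) - len(alpha):] == alpha


def orthogonalized_boundary_values(m: Dict[Attribute, float]) -> Dict[Attribute, float]:
    """Assumed rule: m_ortho[a] = m[a] - sum of m_ortho[b] over all b including a."""
    m_ortho: Dict[Attribute, float] = {}
    # Process the largest ranges first so every needed m_ortho[b] already exists.
    for alpha in sorted(m, key=len, reverse=True):
        correction = sum(v for beta, v in m_ortho.items() if includes(beta, alpha))
        m_ortho[alpha] = m[alpha] - correction
    return m_ortho


# Made-up desired boundary values for a trigram, a bigram and a unigram.
m = {
    ("a", "b", "c"): 2.0,
    ("b", "c"): 5.0,
    ("c",): 9.0,
}
m_ortho = orthogonalized_boundary_values(m)
print(m_ortho)
```

Because each orthogonalized value is built only from the original values $m_\alpha$ and $m_\beta$, unrolling the recursion yields a linear combination of the original boundary values, which is the property the abstract relies on.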
15 Claims
1. A method of calculating iteration values for free parameters $\lambda_\alpha^{\mathrm{ortho}}(n)$ of a maximum-entropy speech model MESM in a speech recognition system with the aid of the generalized iterative scaling training algorithm in accordance with the following formula:
$$\lambda_\alpha^{\mathrm{ortho}}(n+1) = G\left(\lambda_\alpha^{\mathrm{ortho}}(n),\ m_\alpha^{\mathrm{ortho}},\ \ldots\right)$$
where:
$n$ is an iteration parameter;
$G$ is a mathematical function;
$\alpha$ is an attribute in the MESM; and
$m_\alpha^{\mathrm{ortho}}$ is a desired orthogonalized boundary value in the MESM for the attribute $\alpha$,
characterized in that the desired orthogonalized boundary value $m_\alpha^{\mathrm{ortho}}$ is calculated by linearly combining the desired boundary value $m_\alpha$ with desired boundary values $m_\beta$ of attributes $\beta$ that have a larger range than the attribute $\alpha$.
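To make the iteration in claim 1 concrete, here is a hedged sketch of a single update in the textbook form of generalized iterative scaling. It is not the function G of this patent, whose exact form (claims 12 and 13) is not reproduced in this text. The function name `gis_step`, the step size `t`, and the argument `m_model` (the boundary value produced by the current model) are assumptions introduced for illustration.

```python
import math


def gis_step(lam: float, m_desired: float, m_model: float, t: float) -> float:
    """One textbook GIS update: shift lam by t * log(desired / model value).

    lam       -- current iteration value of the free parameter
    m_desired -- desired (here: orthogonalized) boundary value
    m_model   -- boundary value the current model produces (assumed input)
    t         -- convergence step size (assumed input)
    """
    return lam + t * math.log(m_desired / m_model)


# The parameter grows while the model underestimates the desired value ...
lam_next = gis_step(lam=0.0, m_desired=4.0, m_model=2.0, t=0.5)
# ... and stays put once the model reproduces it.
lam_fixed = gis_step(lam=1.25, m_desired=3.0, m_model=3.0, t=0.5)
print(lam_next, lam_fixed)
```

The fixed point of this update is reached when the model value equals the desired value, which is why changing the desired values to orthogonalized ones changes what each parameter is trained against.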
4. A method as claimed in claim 3, characterized in that the calculation of the desired orthogonalized boundary value $m_{\beta_i}^{\mathrm{ortho}}$ is made in step g) according to the following formula:
5. A method as claimed in claim 2, characterized in that the calculation of the desired boundary values $m_{\beta_i}$ for the attributes $\beta_i$ with $i = 0, \ldots, g$ is made in step b) by respectively calculating the frequency $N(\beta_i)$ with which the attribute $\beta_i$ occurs in a training corpus and by subsequently smoothing the calculated frequency value $N(\beta_i)$.
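The two stages named in claim 5, counting and then smoothing, might be sketched as follows. The claim does not say which smoothing method is used, so the fixed-discount rule here is an assumption chosen only for illustration, as are the names `count_attributes` and `smooth`, the discount of 0.1, and the toy corpus.

```python
from collections import Counter
from typing import Dict, List, Tuple

Attribute = Tuple[str, ...]  # hypothetical encoding: an n-gram of words


def count_attributes(corpus: List[str], max_len: int) -> Counter:
    """N(beta): how often each n-gram of length 1..max_len occurs in the corpus."""
    counts: Counter = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(corpus) - n + 1):
            counts[tuple(corpus[i:i + n])] += 1
    return counts


def smooth(counts: Counter, discount: float = 0.1) -> Dict[Attribute, float]:
    """Assumed smoothing: subtract a fixed discount from every observed count."""
    return {beta: n - discount for beta, n in counts.items()}


corpus = "the cat sat on the mat".split()
counts = count_attributes(corpus, max_len=2)
m = smooth(counts)
print(counts[("the",)], m[("the",)])
```

Any other smoothing scheme could be substituted for `smooth` without changing the counting stage.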
6. A method as claimed in claim 5, characterized in that the calculation of the frequency $N(\beta_i)$ is made by applying a binary attribute function $f_{\beta_i}$ to the training corpus, where $f_{\beta_i}$ is defined as:
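The definition of the binary attribute function is not reproduced in this text. As a sketch under a common assumption, the function below returns 1 when the attribute (a non-empty word n-gram) matches the end of the word sequence formed by a history h and a word w, and 0 otherwise. Summing it over every position of the corpus then yields the frequency. The names `attribute_function` and `frequency` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Attribute = Tuple[str, ...]  # hypothetical encoding: a non-empty n-gram of words


def attribute_function(beta: Attribute) -> Callable[[Sequence[str], str], int]:
    """Build a binary function f_beta(h, w) under the assumed suffix-match rule."""
    def f(history: Sequence[str], word: str) -> int:
        seq = tuple(history) + (word,)
        return 1 if seq[-len(beta):] == beta else 0
    return f


def frequency(beta: Attribute, corpus: List[str]) -> int:
    """N(beta): sum of f_beta over every (history, word) position in the corpus."""
    f = attribute_function(beta)
    return sum(f(corpus[:i], corpus[i]) for i in range(len(corpus)))


corpus = "the cat sat on the mat".split()
print(frequency(("the",), corpus), frequency(("the", "mat"), corpus))
```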
7. A method as claimed in claim 1, characterized in that the mathematical function G has as a further variable the magnitude of a convergence step $t_\alpha^{\mathrm{ortho}}$ with:
$$t_\alpha^{\mathrm{ortho}} = \frac{1}{M^{\mathrm{ortho}}}$$
where $M^{\mathrm{ortho}}$ represents, for binary functions $f_\alpha^{\mathrm{ortho}}$, the maximum number of functions which yield the value 1 for the same argument $(h, w)$.
8. A method as claimed in claim 7, characterized in that the attribute function $f_\alpha^{\mathrm{ortho}}$ is calculated by linearly combining an attribute function $f_\alpha$ with orthogonalized attribute functions $f_\beta^{\mathrm{ortho}}$ calculated from attributes $\beta$ that have a larger range than the attribute $\alpha$.
9. A method as claimed in claim 8, characterized in that the calculation of the orthogonalized attribute function $f_\alpha^{\mathrm{ortho}}$ for the attribute $\alpha = \beta_0$ comprises the following steps:
a) Selecting all the attributes $\beta_i$ with $i = 1 \ldots g$ in the speech model that have a larger range RW than the attribute $\alpha = \beta_0$ and include the latter;
b) Calculating boundary values $f_{\beta_i}$ for the attributes $\beta_i$ with $i = 0 \ldots g$;
c) Sorting the attributes $\beta_i$ with $i = 0 \ldots g$ according to their RW;
d) Selecting one of the attributes $\beta_i$ having the largest RW;
e) Checking whether there are other attributes $\beta_k$ which include the attribute $\beta_i$ and have a larger RW than the selected attribute $\beta_i$;
f1) If so, defining a function F as a linear combination of the orthogonalized attribute functions $f_{\beta_k}^{\mathrm{ortho}}$ calculated in step g) during the last run of the steps e) to g) for all the attributes $\beta_k$ that have a larger range determined in the most recently run step e);
f2) If not, defining the function F as F = 0;
g) Calculating the orthogonalized attribute function $f_{\beta_i}^{\mathrm{ortho}}$ for the attribute $\beta_i$ by arithmetically combining the attribute function $f_{\beta_i}$ with the function F; and
h) Repeating the steps e) to g) for the attribute $\beta_{i-1}$ whose range is smaller than or equal to the range of the attribute $\beta_i$ until the orthogonalized attribute function $f_{\beta_0}^{\mathrm{ortho}} = f_\alpha^{\mathrm{ortho}}$ with $i = 0$ has been calculated in step g).
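Steps a) to h) above can be sketched as a single loop from the largest range down to α. This is an illustration under assumptions, not the claimed procedure verbatim: attributes are word n-grams, "includes" means "is longer and ends with", the linear combination F in step f1) is taken to be a plain sum (the formula of claim 10 is not reproduced in this text), and step g) uses the subtraction given in claim 11. The names `f_plain`, `includes` and `orthogonalize` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Attribute = Tuple[str, ...]
Fn = Callable[[Sequence[str], str], int]


def f_plain(beta: Attribute) -> Fn:
    """Binary attribute function: 1 if the sequence (h, w) ends with beta."""
    def f(history: Sequence[str], word: str) -> int:
        return 1 if (tuple(history) + (word,))[-len(beta):] == beta else 0
    return f


def includes(beta_k: Attribute, beta_i: Attribute) -> bool:
    """True if beta_k has a larger range than beta_i and ends with it."""
    return len(beta_k) > len(beta_i) and beta_k[-len(beta_i):] == beta_i


def orthogonalize(alpha: Attribute, attributes: List[Attribute]) -> Fn:
    # a) keep alpha (= beta_0) and every attribute that includes it
    selected = [alpha] + sorted({b for b in attributes if includes(b, alpha)})
    # c), d) sort by range and start with the largest
    selected.sort(key=len, reverse=True)
    f_ortho: Dict[Attribute, Fn] = {}
    for beta_i in selected:  # h) repeat down to beta_0
        # e) already orthogonalized functions of attributes that include beta_i
        larger = [f_ortho[b] for b in selected if includes(b, beta_i)]

        # f1), f2) F is the sum over `larger` (0 if empty); g) f_ortho = f - F
        def fn(history: Sequence[str], word: str,
               base: Fn = f_plain(beta_i), larger: List[Fn] = larger) -> int:
            return base(history, word) - sum(g(history, word) for g in larger)

        f_ortho[beta_i] = fn
    return f_ortho[alpha]


attributes = [("c",), ("b", "c"), ("a", "b", "c")]
f_c = orthogonalize(("c",), attributes)
f_bc = orthogonalize(("b", "c"), attributes)
print(f_c(("a", "b"), "c"), f_bc(("x", "b"), "c"), f_c(("x", "y"), "c"))
```

Under these assumptions, each orthogonalized function fires only when no included attribute with a larger range also matches, so at most one of them yields 1 for a given argument (h, w).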
10. A method as claimed in claim 9, characterized in that the calculation of the function F in step f1) is made according to the following formula:
11. A method as claimed in claim 9, characterized in that the calculation of the orthogonalized attribute function $f_{\beta_i}^{\mathrm{ortho}}$ in step g) is made according to the following formula:
$$f_{\beta_i}^{\mathrm{ortho}} = f_{\beta_i} - F$$
12. A method as claimed in claim 1, characterized in that the mathematical function G has the following form:
13. A method as claimed in claim 1, characterized in that the mathematical function has the following form:
14. A speech recognition system (10) comprising:
a recognition device (12) for recognizing the semantic content of an acoustic signal captured and rendered available by a microphone (20), more particularly a speech signal, by mapping parts of this signal onto predefined recognition symbols as they are offered by the implemented maximum-entropy speech model MESM, and for generating output signals which represent the recognized semantic content; and
a training system (14) for adapting the MESM to recurrent statistical patterns in the speech of a certain user of the speech recognition system (10);
characterized in that the training system (14) calculates free parameters $\lambda$ in the MESM in accordance with the method as claimed in claim 1.
15. A training system (14) for adapting the maximum-entropy speech model MESM in a speech recognition system (10) to recurrent statistical patterns in the speech of a certain user of this speech recognition system (10), characterized in that the training system (14) calculates free parameters $\lambda$ in the MESM in accordance with the method as claimed in claim 1.
Specification