Method for statistical language modeling in speech recognition
Abstract
A system for generating language modeling data for a speech recognition system includes an expression extractor to extract expression from domain-specific data of an existing domain using a base of linguistic knowledge, a concept structure mapper to map extracted expression to expression in a new domain using vocabulary for the new domain, a concatenation module to concatenate extracted expression with domain-general data, and a filter arrangement to identify and filter out unrealistic expression in the mapped or concatenated expression.
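The four stages named in the abstract can be sketched as a small pipeline. This is a minimal illustration only, not the patented implementation: every function name, the toy data, and the banned-bigram filter rule are assumptions made for the example, and a plain phrase list stands in for the "base of linguistic knowledge".

```python
from typing import Callable, Dict, List

def extract_expressions(sentences: List[str], known_phrases: List[str]) -> List[str]:
    """Keep each known phrase that occurs (on word boundaries) in the existing-domain data."""
    found: List[str] = []
    for sentence in sentences:
        for phrase in known_phrases:
            if f" {phrase} " in f" {sentence} " and phrase not in found:
                found.append(phrase)
    return found

def map_expressions(expressions: List[str], concept_map: Dict[str, str]) -> List[str]:
    """Swap existing-domain words for new-domain vocabulary, word by word."""
    return [" ".join(concept_map.get(w, w) for w in e.split()) for e in expressions]

def concatenate(expressions: List[str], general_phrases: List[str]) -> List[str]:
    """Prefix each extracted expression with each domain-general phrase."""
    return [f"{g} {e}" for e in expressions for g in general_phrases]

def filter_unrealistic(candidates: List[str], is_realistic: Callable[[str], bool]) -> List[str]:
    """Drop candidates that the supplied predicate rejects."""
    return [c for c in candidates if is_realistic(c)]

def generate_lm_data(sentences, known_phrases, concept_map, general_phrases, is_realistic):
    extracted = extract_expressions(sentences, known_phrases)
    mapped = map_expressions(extracted, concept_map)
    concatenated = concatenate(extracted, general_phrases)
    return filter_unrealistic(mapped + concatenated, is_realistic)

# Hypothetical toy data: air travel as the existing domain, driving as the new one.
BANNED_BIGRAMS = {("to", "flights"), ("please", "flights")}

def is_realistic(candidate: str) -> bool:
    tokens = candidate.split()
    return not any(pair in BANNED_BIGRAMS for pair in zip(tokens, tokens[1:]))

result = generate_lm_data(
    sentences=["i want to fly to boston", "show flights to denver"],
    known_phrases=["fly to boston", "flights to denver"],
    concept_map={"fly": "drive", "flights": "routes"},
    general_phrases=["i want to", "please"],
    is_realistic=is_realistic,
)
```

In this sketch the filter is applied to both the mapped and the concatenated candidates, which is one reading of "at least one of mapped and concatenated expression".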
29 Claims
1. A system for generating language modeling data for a speech recognition system, comprising:
an expression extractor to extract expression from domain-specific data of an existing domain using a base of linguistic knowledge;
a concept structure mapper to map extracted expression to expression in a new domain using vocabulary for the new domain;
a concatenation module to concatenate extracted expression with domain-general data; and
a filter arrangement to identify and filter out unrealistic expression in at least one of mapped and concatenated expression.

12. A method for generating language modeling data for a speech recognition system, comprising:
extracting expression from domain-specific data for an existing domain using a base of linguistic knowledge;
mapping an extracted expression to an expression in a new domain using vocabulary for the new domain and a concept mapping table;
concatenating the extracted expression using domain-general data; and
filtering at least one of the mapped and concatenated expression.
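One way to picture the mapping step of claim 12 is a lookup through a concept mapping table followed by substitution from the new-domain vocabulary. The sketch below is an assumption-laden illustration: the table contents, the concept labels, and the function name map_expression are all hypothetical and do not come from the patent text.

```python
from itertools import product
from typing import Dict, List

def map_expression(
    expression: str,
    word_to_concept: Dict[str, str],
    concept_map: Dict[str, str],
    new_vocab: Dict[str, List[str]],
) -> List[str]:
    """Return every new-domain variant of an existing-domain expression.

    Each word is looked up as: word -> existing concept -> new concept -> new words.
    Words with no mapping are kept unchanged.
    """
    slots: List[List[str]] = []
    for word in expression.split():
        existing_concept = word_to_concept.get(word)
        new_concept = concept_map.get(existing_concept)
        slots.append(new_vocab.get(new_concept, [word]))
    return [" ".join(choice) for choice in product(*slots)]

# Hypothetical tables for an air-travel to navigation mapping.
WORD_TO_CONCEPT = {"fly": "TRAVEL_ACTION", "boston": "AIRPORT_CITY"}
CONCEPT_MAP = {"TRAVEL_ACTION": "MOVE_ACTION", "AIRPORT_CITY": "DESTINATION"}
NEW_VOCAB = {"MOVE_ACTION": ["drive", "walk"], "DESTINATION": ["downtown"]}

variants = map_expression("fly to boston", WORD_TO_CONCEPT, CONCEPT_MAP, NEW_VOCAB)
```

Keeping the concept table separate from the vocabulary means a new domain can be supported by supplying new vocabulary entries without touching the extracted expressions.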

18. A method for generating language modeling data for a speech recognition system, comprising:
extracting expression from domain-specific data for an existing domain using a base of linguistic knowledge;
mapping an extracted expression to an expression in a new domain using vocabulary for the new domain;
concatenating the extracted expression using domain-general data; and
filtering at least one of the mapped and concatenated expression,
wherein the step of mapping the extracted expression includes performing a neighboring word collocation verification test on the mapped expression to verify a naturalness of the mapped expression.
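The neighboring word collocation verification test of claim 18 could, for example, be approximated by checking that every adjacent word pair in a mapped expression has been observed in some reference text. The claim does not say how the test is computed, so the bigram-count criterion, the min_count threshold, and the toy corpus below are assumptions.

```python
from collections import Counter
from typing import Iterable

def bigram_counts(corpus: Iterable[str]) -> Counter:
    """Count adjacent word pairs over a reference corpus of sentences."""
    counts: Counter = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        counts.update(zip(tokens, tokens[1:]))
    return counts

def passes_collocation_test(expression: str, counts: Counter, min_count: int = 1) -> bool:
    """Accept the expression only if each neighboring word pair is attested often enough."""
    tokens = expression.split()
    return all(counts[pair] >= min_count for pair in zip(tokens, tokens[1:]))

# Hypothetical reference corpus for the new domain.
reference = bigram_counts(["please drive to downtown", "walk to the station"])

natural = passes_collocation_test("drive to downtown", reference)
unnatural = passes_collocation_test("fly to downtown", reference)
```

A stricter variant might test only the pairs that touch a substituted word, since the untouched words were already attested in the existing domain.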

19. A method for generating language modeling data for a speech recognition system, comprising:
extracting expression from domain-specific data for an existing domain using a base of linguistic knowledge;
mapping an extracted expression to an expression in a new domain using vocabulary for the new domain;
concatenating the extracted expression using domain-general data; and
filtering at least one of the mapped and concatenated expression,
wherein the step of concatenating includes performing a statistical collocation measurement of the concatenated expression to ensure a smoothness of at least one of neighboring words and neighboring phrases and chaining highly-collocated pairs to form candidate sentences for the new domain.
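The collocation measurement and chaining in claim 19 can be illustrated with pointwise mutual information (PMI) scored at phrase boundaries. PMI is a swapped-in choice made for this sketch, since the claim does not name a specific statistic; the function names, the threshold, and the corpus are likewise hypothetical, and only two-phrase chains are built.

```python
import math
from collections import Counter
from typing import Callable, Iterable, List

def make_pmi(corpus: Iterable[str]) -> Callable[[str, str], float]:
    """Build a PMI scorer for adjacent word pairs from a reference corpus."""
    unigrams: Counter = Counter()
    bigrams: Counter = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    def pmi(left: str, right: str) -> float:
        if bigrams[(left, right)] == 0:
            return float("-inf")  # unseen pair: treated as not collocated
        p_pair = bigrams[(left, right)] / n_bi
        p_left = unigrams[left] / n_uni
        p_right = unigrams[right] / n_uni
        return math.log2(p_pair / (p_left * p_right))

    return pmi

def chain_phrases(phrases: List[str], pmi: Callable[[str, str], float],
                  threshold: float = 0.0) -> List[str]:
    """Join phrase pairs whose boundary words score above the threshold."""
    candidates: List[str] = []
    for left in phrases:
        for right in phrases:
            if left == right:
                continue
            if pmi(left.split()[-1], right.split()[0]) > threshold:
                candidates.append(f"{left} {right}")
    return candidates

# Hypothetical corpus and phrase inventory.
pmi = make_pmi([
    "i want to drive downtown",
    "i want to find a restaurant",
    "please find a restaurant",
])
sentences = chain_phrases(["i want to", "find a restaurant", "drive downtown"], pmi)
```

Longer candidate sentences could be formed by repeatedly extending a chain with another phrase whose boundary pair also clears the threshold.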

21. A storage medium having a set of instructions residing therein, the set of instructions being executable by a processor to implement a method for performing:
extracting expression from domain-specific data for an existing domain using a base of linguistic knowledge;
mapping an extracted expression to an expression in a new domain using vocabulary for the new domain and a concept mapping table;
concatenating the extracted expression using domain-general data; and
filtering at least one of the mapped and concatenated expression.

27. A storage medium having a set of instructions residing therein, the set of instructions being executable by a processor to implement a method for performing:
extracting expression from domain-specific data for an existing domain using a base of linguistic knowledge;
mapping an extracted expression to an expression in a new domain using vocabulary for the new domain;
concatenating the extracted expression using domain-general data; and
filtering at least one of the mapped and concatenated expression,
wherein the step of mapping the extracted expression includes performing a neighboring word collocation verification test on the mapped expression to verify a naturalness of the mapped expression.

28. A storage medium having a set of instructions residing therein, the set of instructions being executable by a processor to implement a method for performing:
extracting expression from domain-specific data for an existing domain using a base of linguistic knowledge;
mapping an extracted expression to an expression in a new domain using vocabulary for the new domain;
concatenating the extracted expression using domain-general data; and
filtering at least one of the mapped and concatenated expression,
wherein the step of concatenating includes performing a statistical collocation measurement of the concatenated expression to ensure a smoothness of at least one of neighboring words and neighboring phrases and chaining highly-collocated pairs to form candidate sentences for the new domain.
Specification