LANGUAGE MODEL ADAPTATION BASED ON FILTERED DATA
First Claim
1. A method for adapting a language model for a context of a domain, comprising:
obtaining textual contents from a large source by a request directed to the context of the domain;
discarding at least a part of the textual contents that contain textual terms determined as irrelevant to the context of the domain, thereby retaining, as retained data, at least a part of the textual contents that contain textual terms determined as relevant to the context of the domain; and
adapting the language model by incorporating therein at least a part of the textual terms of the retained data, wherein the method is performed on an at least one computerized apparatus configured to perform the method and equipped for communication with the large source.
3 Assignments
0 Petitions
Abstract
A method for adapting a language model for a context of a domain, comprising obtaining textual contents from a large source by a request directed to the context of the domain, discarding at least a part of the textual contents that contain textual terms determined as irrelevant to the context of the domain, thereby retaining, as retained data, at least a part of the textual contents that contain textual terms determined as relevant to the context of the domain, and adapting the language model by incorporating therein at least a part of the textual terms of the retained data, wherein the method is performed on an at least one computerized apparatus configured to perform the method and equipped for communication with the large source, and an apparatus for performing the same.
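The discard-then-adapt steps summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the unigram count model, the regex tokenizer, the fixed sets of relevant and irrelevant terms, and the function names `filter_documents` and `adapt_unigram_model` are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def filter_documents(documents, relevant_terms, irrelevant_terms):
    """Discard documents containing an irrelevant term; retain those that
    contain at least one relevant term (hypothetical relevance rule)."""
    retained = []
    for doc in documents:
        tokens = set(tokenize(doc))
        if tokens & irrelevant_terms:
            continue  # contains a term judged irrelevant to the domain
        if tokens & relevant_terms:
            retained.append(doc)
    return retained


def adapt_unigram_model(base_counts, retained_docs):
    """Return a new count table with the terms of the retained data
    incorporated; a unigram model is an assumption made for brevity."""
    adapted = Counter(base_counts)
    for doc in retained_docs:
        adapted.update(tokenize(doc))
    return adapted


documents = [
    "Reset my router password",
    "Router recipes for pasta",
    "Weather today",
]
retained = filter_documents(
    documents,
    relevant_terms={"router", "password"},
    irrelevant_terms={"pasta", "recipes"},
)
adapted = adapt_unigram_model(Counter({"my": 1}), retained)
```

In this toy run only the first document is retained: the second contains terms marked irrelevant, and the third contains no relevant term. A real system would determine relevance from the domain context rather than from hand-written sets.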
36 Citations
15 Claims
1. A method for adapting a language model for a context of a domain, comprising:
obtaining textual contents from a large source by a request directed to the context of the domain;
discarding at least a part of the textual contents that contain textual terms determined as irrelevant to the context of the domain, thereby retaining, as retained data, at least a part of the textual contents that contain textual terms determined as relevant to the context of the domain; and
adapting the language model by incorporating therein at least a part of the textual terms of the retained data, wherein the method is performed on an at least one computerized apparatus configured to perform the method and equipped for communication with the large source.
Dependent claims: 2, 3, 4, 5, 6
7. A method for adapting a baseline language model for a context of a domain by data of the Web, comprising:
obtaining, from the domain, data representative of the context of the domain;
based on the data representative of the context of the domain, forming a query that is provided to an at least one search engine of the Web, thereby acquiring an at least one result comprising textual contents;
discarding at least a part of the at least one result in which the textual contents includes at least one textual term that does not pertain to the data representative of the context of the domain;
adapting the baseline language model to an adapted language model by incorporating therein textual terms of the at least one result that pertain to the data representative of the context of the domain, wherein the method is performed on an at least one computerized apparatus configured to perform the method and equipped for communication with at least one computerized server linkable to the Web.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15
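The flow of claim 7 (domain data, then a query, then filtered results, then an adapted model) might be sketched as follows. Everything specific here is an assumption rather than something the claim states: the frequency-based query builder, the overlap threshold used to decide whether a result pertains to the domain, the linear interpolation with the baseline, and the `search` callable, which is a hypothetical stand-in for a real search engine client.

```python
import re
from collections import Counter

STOPWORDS = frozenset({"the", "a", "to", "my", "is", "of", "and"})


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def form_query(domain_texts, top_k=2):
    """Build a query from the most frequent non-stopword domain terms."""
    counts = Counter(
        t for text in domain_texts for t in tokenize(text) if t not in STOPWORDS
    )
    return [term for term, _ in counts.most_common(top_k)]


def keep_pertinent(results, domain_vocab, min_overlap=0.5):
    """Discard results whose share of in-domain tokens is below min_overlap
    (an assumed criterion for 'pertaining' to the domain data)."""
    kept = []
    for text in results:
        tokens = tokenize(text)
        if tokens and sum(t in domain_vocab for t in tokens) / len(tokens) >= min_overlap:
            kept.append(text)
    return kept


def interpolate(baseline_probs, kept_results, weight=0.3):
    """Mix baseline unigram probabilities with those estimated from the
    kept results: (1 - weight) * baseline + weight * domain."""
    counts = Counter(t for text in kept_results for t in tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return dict(baseline_probs)  # nothing to incorporate
    vocab = set(baseline_probs) | set(counts)
    return {
        w: (1 - weight) * baseline_probs.get(w, 0.0) + weight * counts[w] / total
        for w in vocab
    }


def adapt_from_web(domain_texts, baseline_probs, search):
    """End-to-end sketch; `search` is a hypothetical callable that takes a
    list of query terms and returns a list of result texts."""
    domain_vocab = {t for text in domain_texts for t in tokenize(text)}
    results = search(form_query(domain_texts))
    return interpolate(baseline_probs, keep_pertinent(results, domain_vocab))


domain_texts = ["router login failed", "router password reset", "router login page"]
fake_search = lambda terms: ["router login help", "cheap flights to paris"]
adapted = adapt_from_web(domain_texts, {"router": 0.5, "hello": 0.5}, fake_search)
```

Interpolation is used here only because it keeps the adapted model a valid probability distribution whenever the baseline is one; the claim itself requires only that pertinent textual terms be incorporated into the baseline model.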
Specification