Language model adaptation based on filtered data
First Claim
1. A method for adapting a language model for a context of a domain, comprising:
- from a source having textual information with a variety of phrases related to the context of the domain, obtaining textual contents as data directed to the context of the domain by querying the source with phrases representative of the subject matter of the domain regardless and irrespective of any language model;
responsive to a state of a provided selector, determining in one state semantic relevancy or in another state semantic relevancy and lexical relevancy of the textual contents to the context of the domain;
discarding at least a part of the textual contents that contain textual terms determined as irrelevant to the context of the domain, thereby retaining, as retained data, at least a part of the textual contents that contain textual terms determined as relevant to the context of the domain; and
adapting the language model by incorporating therein at least a part of the textual terms of the retained data, wherein the method is performed on an at least one computerized apparatus configured to perform the method and equipped for communication with the source.
3 Assignments
0 Petitions
Abstract
A method for adapting a language model for a context of a domain, comprising obtaining textual contents from a large source by a request directed to the context of the domain, discarding at least a part of the textual contents that contain textual terms determined as irrelevant to the context of the domain, thereby retaining, as retained data, at least a part of the textual contents that contain textual terms determined as relevant to the context of the domain, and adapting the language model by incorporating therein at least a part of the textual terms of the retained data, wherein the method is performed on an at least one computerized apparatus configured to perform the method and equipped for communication with the large source, and an apparatus for performing the same.
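The discard-then-adapt flow summarized in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the unigram model, the term-overlap relevance test, the linear interpolation, and the names `filter_relevant`, `adapt_unigram_model`, `min_overlap`, and `weight` are all assumptions introduced here.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the source names no tokenizer)."""
    return text.lower().split()


def filter_relevant(documents, domain_terms, min_overlap=0.2):
    """Retain documents whose share of domain terms reaches min_overlap.

    The overlap ratio is a hypothetical relevance test standing in for
    whatever relevancy determination the method actually uses.
    """
    retained = []
    for doc in documents:
        tokens = tokenize(doc)
        if not tokens:
            continue  # nothing to judge, so discard
        overlap = sum(1 for t in tokens if t in domain_terms) / len(tokens)
        if overlap >= min_overlap:
            retained.append(doc)
    return retained


def adapt_unigram_model(baseline, retained, weight=0.3):
    """Interpolate a baseline unigram model with counts from the retained data.

    Linear interpolation is one common adaptation technique, assumed here
    for illustration; the source does not specify how terms are incorporated.
    """
    counts = Counter(t for doc in retained for t in tokenize(doc))
    total = sum(counts.values())
    if total == 0:
        return dict(baseline)  # nothing retained, model unchanged
    vocab = set(baseline) | set(counts)
    return {
        w: (1 - weight) * baseline.get(w, 0.0) + weight * counts[w] / total
        for w in vocab
    }


baseline = {"the": 0.5, "bank": 0.5}
documents = ["bank loan interest rate", "the cat sat on the mat"]
domain_terms = {"bank", "loan", "interest", "rate"}

retained = filter_relevant(documents, domain_terms)
adapted = adapt_unigram_model(baseline, retained)
```

In this toy run only the first document is retained, so the adapted model gains probability mass for "loan", "interest", and "rate" while the off-domain text contributes nothing.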
23 Citations
27 Claims
1. A method for adapting a language model for a context of a domain, comprising:
from a source having textual information with a variety of phrases related to the context of the domain, obtaining textual contents as data directed to the context of the domain by querying the source with phrases representative of the subject matter of the domain regardless and irrespective of any language model;
responsive to a state of a provided selector, determining in one state semantic relevancy or in another state semantic relevancy and lexical relevancy of the textual contents to the context of the domain;
discarding at least a part of the textual contents that contain textual terms determined as irrelevant to the context of the domain, thereby retaining, as retained data, at least a part of the textual contents that contain textual terms determined as relevant to the context of the domain; and
adapting the language model by incorporating therein at least a part of the textual terms of the retained data, wherein the method is performed on an at least one computerized apparatus configured to perform the method and equipped for communication with the source.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A method for adapting a baseline language model for a context of a domain by data of the Web, comprising:
obtaining, from the domain, textual data as data representative of the context of the domain;
based on the data representative of the context of the domain and regardless and irrespective of any language model, forming a query that is provided to an at least one search engine of the Web, thereby acquiring an at least one result comprising textual contents;
responsive to a state of a provided selector, determining in one state semantic relevancy or in another state semantic relevancy and lexical relevancy of the at least one result to the context of the domain;
discarding at least a part of the at least one result in which the textual contents includes at least one textual term that does not pertain to the data representative of the context of the domain;
adapting the baseline language model to an adapted language model by incorporating therein textual terms of the at least one result that pertain to the data representative of the context of the domain, wherein the method is performed on an at least one computerized apparatus configured to perform the method and equipped for communication with at least one computerized server linkable to the Web.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
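The query-forming and selector-controlled relevancy steps of claim 16 can be illustrated with a short sketch. Everything concrete here is an assumption made for illustration: the `Selector` enum, the `form_query` and `is_relevant` helpers, the thresholds `sem_min` and `lex_min`, the use of bag-of-words cosine similarity as a crude stand-in for semantic relevancy, and the use of a vocabulary-coverage ratio for lexical relevancy. The claim itself does not prescribe any of these measures, and no real search engine is called.

```python
import math
from collections import Counter
from enum import Enum


class Selector(Enum):
    """Hypothetical two-state selector mirroring the claim language."""
    SEMANTIC = 1
    SEMANTIC_AND_LEXICAL = 2


def form_query(domain_text, n_terms=3):
    """Build a query from the most frequent domain terms; no language model involved."""
    counts = Counter(domain_text.lower().split())
    return " ".join(word for word, _ in counts.most_common(n_terms))


def cosine(tokens_a, tokens_b):
    """Cosine similarity of term-count vectors (assumed proxy for semantic relevancy)."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def is_relevant(result, domain_text, selector, sem_min=0.3, lex_min=0.5):
    """Apply the semantic check always, and the lexical check only in the second state."""
    r_tokens = result.lower().split()
    d_tokens = domain_text.lower().split()
    if cosine(r_tokens, d_tokens) < sem_min:
        return False
    if selector is Selector.SEMANTIC_AND_LEXICAL:
        vocab = set(d_tokens)
        coverage = sum(t in vocab for t in r_tokens) / len(r_tokens) if r_tokens else 0.0
        return coverage >= lex_min
    return True


domain_text = "loan rate loan bank rate loan"
query = form_query(domain_text, n_terms=2)
results = ["bank loan rate", "loan shark movie review loan", "cat sat mat"]
kept_semantic = [r for r in results if is_relevant(r, domain_text, Selector.SEMANTIC)]
kept_both = [
    r for r in results if is_relevant(r, domain_text, Selector.SEMANTIC_AND_LEXICAL)
]
```

With these toy inputs the second result passes the semantic check alone but is discarded once the lexical check is also enabled, which is the kind of behavioral difference the two selector states are meant to produce.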
Specification