Language model adaptation for specific texts
First Claim
1. A computerized method for adapting a baseline language model, comprising:
- obtaining a textual corpus of documents that comprise textual expressions;
- incorporating in the baseline language model textual expressions from documents which are determined as relevant to a provided target text based on a plurality of different relevancy determinations between the documents and the provided target text, thereby adapting the baseline language model to form an adapted language model for recognizing terms of a context of the provided target text,
- wherein a first relevancy determination comprises a determination by sufficiently small evaluated perplexities of the baseline language model with respect to the target text, and
- wherein a second relevancy determination comprises a determination of matches between stems of words in the textual expressions in the documents and stems of words of the target text, and
- wherein a third relevancy determination comprises a determination of matches between words in the textual expressions of the documents with words of the target text that for the matching have been converted to synonyms thereof according to a preset dictionary of synonyms, and
- wherein a fourth relevancy determination comprises a determination of semantic similarities of words in the textual expressions of the documents and words of the target text based on semantic distance and reduction thereof, and
- wherein the method is automatically performed on an at least one computerized apparatus configured to perform the method.
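As an illustration only, the stem-matching and synonym-matching determinations recited in the claim could be sketched as simple overlap scores. Everything named here is hypothetical and not taken from the patent: the `SYNONYMS` table, the naive `crude_stem` suffix stripper, and the choice to score relevance as the fraction of target-text terms matched. The sketch also normalizes both the document and the target text through the synonym table, which is one possible reading of the claim language.

```python
import re

# Hypothetical toy synonym dictionary: maps a word to one canonical synonym.
SYNONYMS = {"car": "automobile", "auto": "automobile", "purchase": "buy"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(word):
    """Naive suffix stripping; a real system would use a proper stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_overlap(document, target):
    """Fraction of distinct target-text stems that also occur in the document."""
    doc_stems = {crude_stem(w) for w in tokenize(document)}
    tgt_stems = {crude_stem(w) for w in tokenize(target)}
    return len(doc_stems & tgt_stems) / len(tgt_stems) if tgt_stems else 0.0


def synonym_overlap(document, target, synonyms=SYNONYMS):
    """Fraction of distinct target-text words matched after synonym conversion."""
    doc_words = {synonyms.get(w, w) for w in tokenize(document)}
    tgt_words = {synonyms.get(w, w) for w in tokenize(target)}
    return len(doc_words & tgt_words) / len(tgt_words) if tgt_words else 0.0
```

A document would then be treated as relevant when a score exceeds some chosen threshold; the claim does not specify how the matches are scored or combined, so that threshold is likewise an assumption.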
Abstract
A computerized method for adapting a baseline language model, comprising obtaining a textual corpus of documents that comprise textual expressions, incorporating in the baseline language model textual expressions from documents which are determined as relevant to a provided target text based on a plurality of different relevancy determinations between the documents and the provided target text, thereby adapting the baseline language model to form an adapted language model for recognizing terms of a context of the provided target text, wherein the method is automatically performed on an at least one computerized apparatus configured to perform the method.
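The overall flow described in the abstract (select relevant documents, then incorporate their textual expressions into the baseline model) might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the language model is reduced to unigram counts, relevance is judged by a single perplexity test (a unigram model estimated from each document and evaluated on the target text, with add-one smoothing), and the names `unigram_perplexity`, `adapt`, and `threshold` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def unigram_perplexity(counts, tokens, vocab_size):
    """Add-one smoothed unigram perplexity of `tokens` under `counts`."""
    total = sum(counts.values())
    log_prob = 0.0
    for tok in tokens:
        log_prob += math.log((counts[tok] + 1) / (total + vocab_size))
    return math.exp(-log_prob / len(tokens))


def adapt(baseline_counts, corpus, target, threshold):
    """Merge counts from documents whose perplexity on the target is small.

    Assumes `target` contains at least one token.
    Returns the adapted counts and the list of selected documents.
    """
    target_tokens = tokenize(target)
    # Shared vocabulary so that smoothing is comparable across documents.
    vocab = set(baseline_counts) | set(target_tokens)
    for doc in corpus:
        vocab |= set(tokenize(doc))

    adapted = Counter(baseline_counts)
    selected = []
    for doc in corpus:
        doc_counts = Counter(tokenize(doc))
        ppl = unigram_perplexity(doc_counts, target_tokens, len(vocab))
        if ppl <= threshold:  # "sufficiently small" perplexity (assumed cutoff)
            adapted.update(doc_counts)
            selected.append(doc)
    return adapted, selected
```

In this sketch "incorporating" simply means adding the selected documents' word counts to the baseline counts; a production system would more likely interpolate n-gram models, and would combine several relevancy determinations rather than one.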
6 Claims
Specification