Enhancing QA System Cognition With Improved Lexical Simplification Using Multilingual Resources
First Claim
1. A method implemented by an information handling system that includes a processor and a memory accessible by the processor, the method comprising:
returning a simplified set of text to a user of a natural language processing (NLP) system, wherein the simplified set of text comprises text appropriate to a reading level of the user, wherein a text simplification process retrieves the simplified set of text from a corpus using a plurality of words that have a complexity level appropriate to the reading level, and wherein the complexity level is based on a multi-language word mapping performed on at least a selected one of the plurality of words using a process comprising:
receiving the selected word, wherein the selected word belongs to a first natural language;
retrieving a first set of complexity data pertaining to the selected word in the first natural language, wherein the first set of complexity data comprises a first word length and a first word frequency;
translating the selected word to one or more translated words, wherein each of the translated words corresponds to one or more second natural languages;
retrieving one or more second sets of complexity data, wherein each of the second sets of complexity data corresponds to a different one of the translated words, and wherein the one or more second sets of complexity data comprise one or more second word lengths and one or more second word frequencies; and
determining a complexity of the selected word in the first natural language based on an overall word length and an overall word frequency, wherein the overall word length is based on the first word length and the one or more second word lengths, and wherein the overall word frequency is based on the first word frequency and the one or more second word frequencies, and wherein the determined complexity of the word is utilized to enhance the multi-language word mapping.
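The steps recited in the claim can be illustrated with a minimal Python sketch. The claim states only that the overall word length and overall word frequency are "based on" the first-language and second-language values; the arithmetic mean, the equal weighting, the `MAX_LENGTH` and `REFERENCE_FREQUENCY` constants, and the small `TRANSLATIONS` and `FREQUENCIES` tables below are all assumptions made for illustration and are not taken from the patent.

```python
from statistics import mean

# Hypothetical stand-in for a translation service: (word, language) -> translations.
TRANSLATIONS = {
    ("ubiquitous", "en"): {"es": "ubicuo", "it": "onnipresente"},
    ("house", "en"): {"es": "casa", "it": "casa"},
}

# Hypothetical word frequencies (occurrences per million words) per language.
FREQUENCIES = {
    ("ubiquitous", "en"): 2.0, ("ubicuo", "es"): 1.0, ("onnipresente", "it"): 3.0,
    ("house", "en"): 500.0, ("casa", "es"): 600.0, ("casa", "it"): 550.0,
}

MAX_LENGTH = 20              # assumed length at which a word counts as maximally long
REFERENCE_FREQUENCY = 100.0  # assumed frequency at which a word counts as fully common


def complexity_data(word, language):
    """Return (word length, word frequency) for a word in one language."""
    return len(word), FREQUENCIES.get((word, language), 0.0)


def word_complexity(word, language):
    """Score a word in [0, 1] using its own data plus data for its translations."""
    # First set of complexity data: the selected word in the first language.
    data = [complexity_data(word, language)]
    # Second sets: one per translated word in each second language.
    for other_language, translated in TRANSLATIONS.get((word, language), {}).items():
        data.append(complexity_data(translated, other_language))

    # Assumed aggregation: arithmetic mean across all languages.
    overall_length = mean(length for length, _ in data)
    overall_frequency = mean(frequency for _, frequency in data)

    # Assumed scoring: longer and rarer words are treated as more complex.
    length_score = min(overall_length / MAX_LENGTH, 1.0)
    rarity_score = 1.0 - min(overall_frequency / REFERENCE_FREQUENCY, 1.0)
    return 0.5 * length_score + 0.5 * rarity_score


if __name__ == "__main__":
    for candidate in ("house", "ubiquitous"):
        print(candidate, round(word_complexity(candidate, "en"), 3))
```

Under these assumed tables, "ubiquitous" scores as more complex than "house" because its translations are also long and rare, which is the cross-language signal the claim relies on.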
Abstract
An approach is provided that returns a simplified set of text to a user of a natural language processing (NLP) system, with the simplified set of text having a complexity appropriate to the reading level of the user. The approach receives a word that belongs to a first natural language and retrieves a first set of complexity data pertaining to the word in the first natural language. The approach translates the word to one or more translated words, with each of the translated words corresponding to one or more second natural languages. The approach then retrieves one or more second sets of complexity data, with each second set corresponding to a different one of the translated words. The approach determines a complexity of the word in the first natural language based on an analysis of the first and second sets of complexity data.
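As a rough illustration of the first sentence of the abstract, the sketch below selects corpus passages whose words do not exceed a complexity limit tied to the user's reading level. The passage-level rule (the most complex word decides), the `LEVEL_LIMITS` thresholds, the `WORD_COMPLEXITY` scores, and the function names are hypothetical; the source does not specify how word complexity is mapped to a reading level.

```python
import re

# Hypothetical per-word complexity scores in [0, 1], e.g. produced by a
# multi-language length/frequency analysis.
WORD_COMPLEXITY = {
    "phones": 0.20,
    "are": 0.05,
    "ubiquitous": 0.72,
    "everywhere": 0.30,
}

# Hypothetical mapping from reading level to the highest allowed word complexity.
LEVEL_LIMITS = {"elementary": 0.4, "intermediate": 0.6, "advanced": 1.0}

UNKNOWN_WORD_SCORE = 0.5  # assumption: unscored words count as moderately complex


def passage_complexity(passage):
    """Return the complexity of the most complex word in the passage."""
    words = re.findall(r"[a-z']+", passage.lower())
    return max((WORD_COMPLEXITY.get(w, UNKNOWN_WORD_SCORE) for w in words), default=0.0)


def simplified_text(corpus, reading_level):
    """Return the passages from the corpus that suit the given reading level."""
    limit = LEVEL_LIMITS[reading_level]
    return [passage for passage in corpus if passage_complexity(passage) <= limit]


if __name__ == "__main__":
    corpus = ["Phones are ubiquitous.", "Phones are everywhere."]
    print(simplified_text(corpus, "elementary"))
    print(simplified_text(corpus, "advanced"))
```

Using the maximum word score is a deliberately strict choice: a single hard word excludes a passage. An average or percentile over the words would be a more lenient alternative.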
7 Claims
1. Independent claim, recited in full under First Claim above. Dependent claims: 4, 5, 6, 7.
2. (canceled)
3. (canceled)
Specification