System and Method Using Data Reduction Approach and Nonlinear Algorithm to Construct Chinese Readability Model
First Claim
1. A method for constructing a Chinese readability model by using a data reduction approach and a smart/advanced artificial intelligence algorithm, comprising the steps of:
(A) collecting at least one Chinese text for each grade level, comparing the features of each text with the texts in a corpus to perform word segmentation, and tagging the part of speech of the segmented words, where each Chinese text has at least one readability feature;
(B) analyzing the segmented words and the part-of-speech tags of each text to compute the values of the readability features;
(C) determining at least one reading comprehension factor for the readability features through the data reduction method, where each reading comprehension factor is represented as a linear combination of the readability features; and
(D) applying the reading comprehension factors through the smart/advanced artificial intelligence algorithm to construct a Chinese readability model that determines the readability level of a text.
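Steps (B) and (C) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the part-of-speech tag names, the four feature names, and the weights in `weights` are hypothetical, and the claim does not say which features or loadings the method actually uses.

```python
from collections import Counter

# Hypothetical input: one text already segmented and POS-tagged (step A).
# The tag names (N, V, ADV, PRON) are illustrative, not a real tagset.
tagged_text = [
    [("小明", "N"), ("喜歡", "V"), ("讀", "V"), ("書", "N")],
    [("他", "PRON"), ("每天", "ADV"), ("去", "V"), ("圖書館", "N")],
]

def compute_features(sentences):
    """Step (B): derive readability feature values from segmented, tagged words."""
    words = [word for sent in sentences for word, _ in sent]
    tags = Counter(tag for sent in sentences for _, tag in sent)
    n_words = len(words)
    return {
        "word_count": float(n_words),
        "avg_sentence_len": n_words / len(sentences),
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "noun_ratio": tags["N"] / n_words,
    }

def factor_score(features, weights):
    """Step (C): a reading comprehension factor as a linear combination."""
    return sum(weights[name] * features[name] for name in weights)

# Assumed weights; in the claimed method they would come from data reduction.
weights = {"avg_sentence_len": 0.6, "avg_word_len": 0.4}

features = compute_features(tagged_text)
score = factor_score(features, weights)
```

In step (D), factor scores of this kind, computed for texts of known grade level, would serve as the inputs to the nonlinear algorithm.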
Abstract
The invention constructs a Chinese readability model with a data reduction approach and a smart/advanced artificial intelligence algorithm. The model contains: 1) a word segmentation unit, which segments words and tags their parts of speech; 2) a readability indicator unit, which analyzes readability features based on the word segmentation and part-of-speech tagging; and 3) an evolution algorithm unit, which constructs a Chinese text readability model using the data reduction approach and the smart/advanced artificial intelligence algorithm. The invention assesses the readability of Chinese texts based on a small amount of Chinese text and identifies the suitable readers.
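As a rough illustration of what the word segmentation unit does, the sketch below segments a sentence by forward maximum matching against a small lexicon and attaches a part-of-speech tag to each word. Forward maximum matching is a stand-in chosen for brevity; the abstract does not name the segmentation technique. The `LEXICON` entries, the tag names, and the `UNK` fallback are hypothetical.

```python
# Hypothetical toy lexicon mapping words to part-of-speech tags.
LEXICON = {"我們": "PRON", "喜歡": "V", "圖書館": "N", "圖書": "N", "去": "V"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)

def segment_and_tag(text):
    """Forward maximum matching: at each position take the longest lexicon word."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in LEXICON:
                tokens.append((candidate, LEXICON[candidate]))
                i += size
                break
        else:
            # No lexicon entry starts here: emit a single-character token.
            tokens.append((text[i], "UNK"))
            i += 1
    return tokens

tokens = segment_and_tag("我們喜歡去圖書館")
```

The longest-match rule is why "圖書館" is kept whole rather than split into "圖書" plus an unknown character.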
18 Citations
12 Claims
-
2. The method of claim 1, wherein, in step (C), the data reduction method overcomes the issue of collinearity among the readability features.
-
3. The method of claim 2, wherein, in step (D), the smart/advanced artificial intelligence algorithm nonlinearly forms at least one reading comprehension factor.
-
4. The method of claim 1, wherein, in step (A), the corpus is the CKIP Chinese Electronic Dictionary, the Sinica Corpus, or the Sinica Treebank, and the corpus serves as a criterion for comparing Chinese features.
-
5. The method of claim 1, wherein, in step (A), the at least one readability feature comprises a word feature, a semantic feature, a syntactic feature, and an article coherence feature, and the readability feature serves as a criterion for determining the reading comprehension factors.
-
6. The method of claim 5, wherein, in step (C), the at least one reading comprehension factor represents the features in the same feature category, as classified through the data reduction method, and each reading comprehension factor is represented as a linear combination of the readability features in the same feature category.
-
7. A system for constructing a Chinese readability model by using a data reduction approach and a smart/advanced artificial intelligence algorithm, comprising:
-
a word segmentation unit for receiving at least one Chinese text suitable for a predetermined reading level, and comparing it with the Chinese features of a corpus to segment the words and to tag the part of speech of the segmented words, where each Chinese text is assigned a readability feature;
a readability feature unit for receiving the results of the word segmentation and part-of-speech tagging to calculate the feature values; and
an evolution algorithm unit for receiving the readability features, determining at least one reading comprehension factor through a data reduction method, and using the smart/advanced artificial intelligence algorithm to construct a Chinese readability model based on the at least one reading comprehension factor, where the model evaluates whether a Chinese text is suitable for a predetermined reading level, and the at least one reading comprehension factor is represented as a linear combination of at least one readability feature.
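The collinearity issue addressed by the data reduction method in the claims can be shown with a small worked example. This is only a sketch under stated assumptions: principal component analysis is used as a stand-in because the claims do not name the data reduction method, and the feature names and values are made up. For two standardized features, the principal components have the closed form (z1 + z2)/√2 and (z1 − z2)/√2, each a linear combination of the original features.

```python
import math

def standardize(xs):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / sd for x in xs]

def correlation(xs, ys):
    """Pearson correlation, computed as the mean product of z-scores."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

# Made-up feature values for five texts; the two features move together.
avg_sentence_len = [6.0, 8.0, 11.0, 13.0, 17.0]
difficult_word_ratio = [0.05, 0.09, 0.12, 0.18, 0.21]

r = correlation(avg_sentence_len, difficult_word_ratio)

# Closed-form principal components of two standardized features.
z1 = standardize(avg_sentence_len)
z2 = standardize(difficult_word_ratio)
factor_1 = [(a + b) / math.sqrt(2) for a, b in zip(z1, z2)]
factor_2 = [(a - b) / math.sqrt(2) for a, b in zip(z1, z2)]

r_factors = correlation(factor_1, factor_2)
```

Here `r` is close to 1, so the two raw features are nearly collinear, while `r_factors` is zero up to rounding error. The derived factors carry the same information without the redundancy.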
-
Specification