Abstract

The assessment of Chinese text readability plays a significant role in Chinese language education. Due to the intrinsic differences between alphabetic languages and Chinese character representations, the readability assessment becomes more challenging in terms of the language’s inherent complexity in vocabulary, syntax, and semantics. The article proposed the conceptual analogy between Chinese readability assessment and music’s rhythm and tempo patterns, in which the syntactic structures of the Chinese sentences could be transformed into an image. The Chinese Knowledge and Information Processing Tagger (CkipTagger) tool developed by Sinica-Taiwan is utilized to decompose the Chinese text into a set of tokens. These tokens are then refined through a user-defined token pool to retain meaningful units. An image with part-of-speech (POS) information will be generated by using the token versus syntax alignment. A discrete cosine transform (DCT) is then applied to extract the temporal characteristics of the text. Moreover, the study integrated four categories: linguistic features–type–token ratio, average sentence length, total word, and difficulty level of vocabulary for the readability assessment. Finally, these features were fed into the Support Vector Machine (SVM) network for the classifications. Furthermore, a bidirectional long short-term memory (Bi-LSTM) network is adopted for quantitative comparisons. In simulation, a total of 774 Chinese texts fitted with Taiwan Benchmarks for the Chinese Language were selected and graded by Chinese language experts, consisting of equal amounts of basic, intermediate, and advanced levels. The finding indicated the proposed POS with the linguistic features work well in the SVM network, and the performance matches with the more complex architectures like the Bi-LSTM network in Chinese readability assessments.

Affiliated Institutions

Related Publications

Publication Info

Year
2025
Type
article
Volume
18
Issue
12
Pages
777-777
Citations
0
Access
Closed

External Links

Social Impact

Social media, news, blog, policy document mentions

Citation Metrics

0
OpenAlex

Cite This

Chi-Yi Hsieh, Jingyan Lin, Chi‐Wen Hsieh et al. (2025). Chinese Text Readability Assessment Based on the Integration of Visualized Part-of-Speech Information with Linguistic Features. Algorithms , 18 (12) , 777-777. https://doi.org/10.3390/a18120777

Identifiers

DOI
10.3390/a18120777