He, Li and Liu, Feng and Liu, Jie and Duan, Jianyong and Wang, Hao (2024) Self-Distillation and Pinyin Character Prediction for Chinese Spelling Correction Based on Multimodality. Applied Sciences, 14 (4). p. 1375. ISSN 2076-3417
applsci-14-01375.pdf - Published Version
Abstract
Chinese spelling correction (CSC) is a long-standing task in natural language processing: it detects and corrects spelling errors in text and serves as a foundation for many other language-related tasks. Many CSC methods use multimodal information (the character itself, its sound, and its shape) to connect incorrect characters with correct ones. Research indicates that a majority of spelling errors stem from pinyin similarity, with character similarity accounting for half of the errors. Effectively modeling the relationship between pinyin and characters is therefore a key challenge in CSC. In this study, we enhance CSC by introducing a pinyin character prediction task. We apply an adaptive weighting method to this task so that predictions are handled at a finer granularity and the two prediction tasks remain balanced. The proposed model, SPMSpell, uses ChineseBERT as an encoder to capture multimodal features simultaneously, and incorporates three parallel decoders: one for character prediction, one for pinyin prediction, and one for the self-distillation module. To mitigate potential overfitting to pinyin, the self-distillation method prioritizes character information in predictions. Extensive experiments on three SIGHAN benchmarks show that the proposed model achieves strong performance, which supports the effectiveness of both the adaptively weighted pinyin character prediction task and the self-distillation module.
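The abstract does not give the training objective, so the following is only a minimal sketch of how a per-token loss with a character term, an adaptively weighted pinyin term, and a self-distillation term could be combined. Every name here (`token_loss`, `pinyin_weight`, `distill_coeff`, `teacher_logits`) is hypothetical, and both the additive form of the loss and the direction of the distillation term are assumptions, not the formulation used in SPMSpell.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cross_entropy(logits, gold):
    """Negative log-probability of the gold class index."""
    return -log_softmax(logits)[gold]


def kl_divergence(teacher_logits, student_logits):
    """KL(teacher || student), computed from raw logits."""
    log_p = log_softmax(teacher_logits)
    log_q = log_softmax(student_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def token_loss(char_logits, pinyin_logits, teacher_logits,
               gold_char, gold_pinyin, pinyin_weight, distill_coeff=1.0):
    """Hypothetical combined loss for a single token position.

    pinyin_weight is an assumed per-token scalar that scales the pinyin
    prediction loss; how the paper actually derives it is not stated in
    the abstract. teacher_logits stands in for an assumed branch that
    emphasizes character information.
    """
    char_loss = cross_entropy(char_logits, gold_char)
    pinyin_loss = cross_entropy(pinyin_logits, gold_pinyin)
    distill_loss = kl_divergence(teacher_logits, char_logits)
    return char_loss + pinyin_weight * pinyin_loss + distill_coeff * distill_loss


# Toy usage: 3 candidate characters, 2 candidate pinyin classes.
loss = token_loss(
    char_logits=[2.0, 0.5, -1.0],
    pinyin_logits=[0.3, 1.2],
    teacher_logits=[2.5, 0.1, -1.5],
    gold_char=0,
    gold_pinyin=1,
    pinyin_weight=0.4,
)
print(f"combined loss: {loss:.4f}")
```

In this sketch a smaller `pinyin_weight` lets the character term dominate for that token, which is one simple way to express the balance between the two prediction tasks that the abstract describes.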
Item Type: | Article |
---|---|
Subjects: | OA Open Library > Multidisciplinary |
Depositing User: | Unnamed user with email support@oaopenlibrary.com |
Date Deposited: | 08 Feb 2024 09:38 |
Last Modified: | 08 Feb 2024 09:38 |
URI: | http://archive.sdpublishers.com/id/eprint/2493 |