Shiquan Wang


2024

Sentence Segmentation and Punctuation for Ancient Books Based on Supervised In-context Training
Shiquan Wang | Weiwei Fu | Mengxiang Li | Zhongjiang He | Yongxiang Li | Ruiyu Fang | Li Guan | Shuangyong Song
Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024

This paper describes the participation of team “TeleAI” in the third International Ancient Chinese Language Information Processing Evaluation (EvalHan24). The competition comprises a joint task of sentence segmentation and punctuation, with open and closed tracks distinguished by the models and data used. In the final evaluation, our system achieved significantly better results than the baseline. Specifically, in the closed-track sentence segmentation task we obtained an F1 score of 0.8885, and in the sentence punctuation task an F1 score of 0.7129.

2023

CCL23-Eval 任务1系统报告:基于持续预训练方法与上下文增强策略的古籍命名实体识别 (System Report for CCL23-Eval Task 1: Named Entity Recognition for Ancient Books Based on Continual Pre-training Method and Context Augmentation Strategy)
Shiquan Wang (王士权) | Lingling Shi (石玲玲) | Luwen Pu (蒲璐汶) | Ruiyu Fang (方瑞玉) | Yu Zhao (赵宇) | Shuangyong Song (宋双永)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

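This paper describes the system submitted by team “翼智团” to the CCL23 evaluation on named entity recognition for ancient books. The task aims to automatically identify important entities in ancient book texts, such as person names, book titles, and official titles, which are the basic building blocks of events. It is divided into an open track and a closed track according to whether the model used has more than 10B parameters. For this task, we first perform continual pre-training and fine-tuning of an open-source pre-trained model on domain data and task data related to ancient books, which significantly improves the base model's performance on named entity recognition for ancient books. We then propose a low-confidence entity filtering algorithm based on pair-wise voting to obtain candidate entities, and correct the recognition of these candidates with a context augmentation strategy. In the final evaluation, our system ranked second in the closed track with an F1 score of 95.8727.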