Xiting Wang


2024

pdf bib
Distillation with Explanations from Large Language Models
Hanyu Zhang | Xiting Wang | Xiang Ao | Qing He
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Free-text explanations are crucial for enhancing the interpretability of AI models. However, training models to generate high-quality free-text explanations is challenging, primarily due to the requirement of a substantial amount of human-written explanations, which can be expensive. Recently, Large language models (LLMs) like ChatGPT and GPT-4 have made remarkable progress in various NLP tasks while also providing explanations alongside their answers. Leveraging LLMs for data labeling offers a more cost-effective alternative. However, a key concern arises from the fact that the answers provided by LLMs are not entirely accurate, potentially introducing noise to both task outputs and explanation generation. To remedy this, we propose a new mechanism, Distillation with Explanations from LLMs. we observe that despite the incorrectness in LLMs-generated answers, their explanations are consistent with their answers. Leveraging this consistency, our method combines the ground truth labels and answers-explanations generated by LLMs, to simultaneously generate more accurate answers and the corresponding free-text explanations. Experimental results demonstrate that our approach achieves improved predictive performance and also generates explanations that exhibit greater alignment with the model’s task outputs.

2023

pdf bib
DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation
Yuxi Feng | Xiaoyuan Yi | Xiting Wang | Laks Lakshmanan, V.S. | Xing Xie
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Self-training (ST) has prospered again in language understanding by augmenting the fine-tuning of big pre-trained models when labeled data is insufficient. However, it remains challenging to incorporate ST into attribute-controllable language generation. Augmented only by self-generated pseudo text, generation models over-exploit the previously learned text space and fail to explore a larger one, suffering from a restricted generalization boundary and limited controllability. In this work, we propose DuNST, a novel ST framework to tackle these problems. DuNST jointly models text generation and classification as a dual process and further perturbs and escapes from the collapsed space by adding two kinds of flexible noise. In this way, our model could construct and utilize both pseudo text generated from given labels and pseudo labels predicted from available unlabeled text, which are gradually refined during the ST phase. We theoretically demonstrate that DuNST can be regarded as enhancing the exploration of the potentially larger real text space while maintaining exploitation, guaranteeing improved performance. Experiments on three controllable generation tasks show that DuNST significantly boosts control accuracy with comparable generation fluency and diversity against several strong baselines.

2022

pdf bib
Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization
Dongmin Hyun | Xiting Wang | Chayoung Park | Xing Xie | Hwanjo Yu
Findings of the Association for Computational Linguistics: EMNLP 2022

Sentence summarization shortens given texts while maintaining core contents of the texts. Unsupervised approaches have been studied to summarize texts without ground-truth summaries. However, recent unsupervised models are extractive, which remove words from texts and thus they are less flexible than abstractive summarization. In this work, we devise an abstractive model based on reinforcement learning without ground-truth summaries. We formulate the unsupervised summarization based on the Markov decision process with rewards representing the summary quality. To further enhance the summary quality, we develop a multi-summary learning mechanism that generates multiple summaries with varying lengths for a given text, while making the summaries mutually enhance each other. Experimental results show that the proposed model substantially outperforms both abstractive and extractive models, yet frequently generating new words not contained in input texts.

2021

pdf bib
PENS: A Dataset and Generic Framework for Personalized News Headline Generation
Xiang Ao | Xiting Wang | Ling Luo | Ying Qiao | Qing He | Xing Xie
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this paper, we formulate the personalized news headline generation problem whose goal is to output a user-specific title based on both a user’s reading interests and a candidate news body to be exposed to her. To build up a benchmark for this problem, we publicize a large-scale dataset named PENS (PErsonalized News headlineS). The training set is collected from user impressions logs of Microsoft News, and the test set is manually created by hundreds of native speakers to enable a fair testbed for evaluating models in an offline mode. We propose a generic framework as a preparatory solution to our problem. At its heart, user preference is learned by leveraging the user behavioral data, and three kinds of user preference injections are proposed to personalize a text generator and establish personalized headlines. We investigate our dataset by implementing several state-of-the-art user modeling methods in our framework to demonstrate a benchmark score for the proposed dataset. The dataset is available at https://msnews.github.io/pens.html.