Xirui Ke


2024

pdf bib
Diversifying Question Generation over Knowledge Base via External Natural Questions
Shasha Guo | Jing Zhang | Xirui Ke | Cuiping Li | Hong Chen
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Previous methods on knowledge base question generation (KBQG) primarily focus on refining the quality of a single generated question. However, considering the remarkable paraphrasing ability of humans, we believe that diverse texts can express identical semantics through varied expressions. The above insights make diversifying question generation an intriguing task, where the first challenge is evaluation metrics for diversity. Current metrics inadequately assess the aforementioned diversity. They calculate the ratio of unique n-grams in the generated question, which tends to measure duplication rather than true diversity. Accordingly, we devise a new diversity evaluation metric, which measures the diversity among top-k generated questions for each instance while ensuring their relevance to the ground truth. Clearly, the second challenge is how to enhance diversifying question generation. To address this challenge, we introduce a dual model framework interwoven by two selection strategies to generate diverse questions leveraging external natural questions. The main idea of our dual framework is to extract more diverse expressions and integrate them into the generation model to enhance diversifying question generation. Extensive experiments on widely used benchmarks for KBQG show that our approach can outperform pre-trained language model baselines and text-davinci-003 in diversity while achieving comparable performance with ChatGPT.

2022

pdf bib
Knowledge-augmented Self-training of A Question Rewriter for Conversational Knowledge Base Question Answering
Xirui Ke | Jing Zhang | Xin Lv | Yiqi Xu | Shulin Cao | Cuiping Li | Hong Chen | Juanzi Li
Findings of the Association for Computational Linguistics: EMNLP 2022

The recent rise of conversational applications such as online customer service systems and intelligent personal assistants has promoted the development of conversational knowledge base question answering (ConvKBQA). Different from the traditional single-turn KBQA, ConvKBQA usually explores multi-turn questions around a topic, where ellipsis and coreference pose great challenges to the single-turn KBQA systems which require self-contained questions. In this paper, we propose a rewrite-and-reason framework to first produce a full-fledged rewritten question based on the conversation history and then reason the answer by existing single-turn KBQA models. To overcome the absence of the rewritten supervision signals, we introduce a knowledge-augmented self-training mechanism to transfer the question rewriter from another dataset to adapt to the current knowledge base. Our question rewriter is decoupled from the subsequent QA process, which makes it easy to be united with either retrieval-based or semantic parsing-based KBQA models. Experiment results demonstrate the effectiveness of our method and a new state-of-the-art result is achieved. The code and dataset are available online now.

2021

pdf bib
P-INT: A Path-based Interaction Model for Few-shot Knowledge Graph Completion
Jingwen Xu | Jing Zhang | Xirui Ke | Yuxiao Dong | Hong Chen | Cuiping Li | Yongbin Liu
Findings of the Association for Computational Linguistics: EMNLP 2021

Few-shot knowledge graph completion is to infer the unknown facts (i.e., query head-tail entity pairs) of a given relation with only a few observed reference entity pairs. Its general process is to first encode the implicit relation of an entity pair and then match the relation of a query entity pair with the relations of the reference entity pairs. Most existing methods have thus far encoded an entity pair and matched entity pairs by using the direct neighbors of concerned entities. In this paper, we propose the P-INT model for effective few-shot knowledge graph completion. First, P-INT infers and leverages the paths that can expressively encode the relation of two entities. Second, to capture the fine grained matches, P-INT calculates the interactions of paths instead of mix- ing them for each entity pair. Extensive experimental results demonstrate that P-INT out- performs the state-of-the-art baselines by 11.2– 14.2% in terms of Hits@1. Our codes and datasets are online now.