Knowledge-Guided Cross-Topic Visual Question Generation

Hongfei Liu, Guohua Wang, Jiayuan Xie, Jiali Chen, Wenhao Fang, Yi Cai


Abstract
The visual question generation (VQG) task aims to generate high-quality questions based on an input image. Current methods primarily focus on generating questions with specified content, using answers or question types as constraints. However, these constraints make it difficult to control the topic of the generated questions (e.g., conversation or test subject topics) across applications. It is therefore necessary to use topics as constraints to guide question generation. Because there are many topics and human annotation can hardly cover them all, we propose the cross-topic learning VQG (CTL-VQG) task, which aims to generate questions related to unseen topics in cross-topic scenarios. In this paper, we propose a knowledge-guided cross-topic visual question generation (KC-VQG) model that extracts information related to unseen topics for question generation. Specifically, our model introduces an image-topic feature extractor to extract topic-related intuitive visual features, and an image-topic knowledge extractor to extract and select the most appropriate topic-related implicit knowledge from large language models for generating questions. Extensive experiments show that our model outperforms baselines and can effectively generate questions related to unseen topics in cross-topic scenarios.
Anthology ID:
2024.lrec-main.861
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
9854–9864
URL:
https://aclanthology.org/2024.lrec-main.861
Cite (ACL):
Hongfei Liu, Guohua Wang, Jiayuan Xie, Jiali Chen, Wenhao Fang, and Yi Cai. 2024. Knowledge-Guided Cross-Topic Visual Question Generation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 9854–9864, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Knowledge-Guided Cross-Topic Visual Question Generation (Liu et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.861.pdf