Overcoming Language Priors with Counterfactual Inference for Visual Question Answering

Zhibo Ren; Huizhen Wang; Muhua Zhu; Yichao Wang; Tong Xiao (肖桐); Jingbo Zhu

Abstract

“Recent years have seen a lot of efforts in attacking the issue of language priors in the field of Visual Question Answering (VQA). Among the extensive efforts, causal inference is regarded as a promising direction to mitigate language bias by weakening the direct causal effect of questions on answers. In this paper, we follow the same direction and attack the issue of language priors by incorporating counterfactual data. Moreover, we propose a two-stage training strategy which is deemed to make better use of counterfactual data. Experiments on the widely used benchmark VQA-CP v2 demonstrate the effectiveness of the proposed approach, which improves the baseline by 21.21% and outperforms most of the previous systems.”

Anthology ID: 2023.ccl-1.52
Volume: Proceedings of the 22nd Chinese National Conference on Computational Linguistics
Month: August
Year: 2023
Address: Harbin, China
Editors: Maosong Sun, Bing Qin, Xipeng Qiu, Jing Jiang, Xianpei Han
Venue: CCL
Publisher: Chinese Information Processing Society of China
Pages: 600–610
Language: English
URL: https://aclanthology.org/2023.ccl-1.52/
Bibkey: zhibo-etal-2023-overcoming
Cite (ACL): Zhibo Ren, Huizhen Wang, Muhua Zhu, Yichao Wang, Tong Xiao, and Jingbo Zhu. 2023. Overcoming Language Priors with Counterfactual Inference for Visual Question Answering. In Proceedings of the 22nd Chinese National Conference on Computational Linguistics, pages 600–610, Harbin, China. Chinese Information Processing Society of China.
Cite (Informal): Overcoming Language Priors with Counterfactual Inference for Visual Question Answering (Ren et al., CCL 2023)
PDF: https://aclanthology.org/2023.ccl-1.52.pdf

Export citation

BibTeX

@inproceedings{zhibo-etal-2023-overcoming,
    title = "Overcoming Language Priors with Counterfactual Inference for Visual Question Answering",
    author = "Ren, Zhibo  and
      Wang, Huizhen  and
      Zhu, Muhua  and
      Wang, Yichao  and
      Xiao, Tong  and
      Zhu, Jingbo",
    editor = "Sun, Maosong  and
      Qin, Bing  and
      Qiu, Xipeng  and
      Jiang, Jing  and
      Han, Xianpei",
    booktitle = "Proceedings of the 22nd Chinese National Conference on Computational Linguistics",
    month = aug,
    year = "2023",
    address = "Harbin, China",
    publisher = "Chinese Information Processing Society of China",
    url = "https://aclanthology.org/2023.ccl-1.52/",
    pages = "600--610",
    language = "eng",
    abstract = "``Recent years have seen a lot of efforts in attacking the issue of language priors in the field of Visual Question Answering (VQA). Among the extensive efforts, causal inference is regarded as a promising direction to mitigate language bias by weakening the direct causal effect of questions on answers. In this paper, we follow the same direction and attack the issue of language priors by incorporating counterfactual data. Moreover, we propose a two-stage training strategy which is deemed to make better use of counterfactual data. Experiments on the widely used benchmark VQA-CP v2 demonstrate the effectiveness of the proposed approach, which improves the baseline by 21.21{\%} and outperforms most of the previous systems.''"
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="zhibo-etal-2023-overcoming">
    <titleInfo>
        <title>Overcoming Language Priors with Counterfactual Inference for Visual Question Answering</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Zhibo</namePart>
        <namePart type="family">Ren</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Huizhen</namePart>
        <namePart type="family">Wang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Muhua</namePart>
        <namePart type="family">Zhu</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Yichao</namePart>
        <namePart type="family">Wang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Tong</namePart>
        <namePart type="family">Xiao</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Jingbo</namePart>
        <namePart type="family">Zhu</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2023-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <language>
        <languageTerm type="text">eng</languageTerm>
    </language>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 22nd Chinese National Conference on Computational Linguistics</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Maosong</namePart>
            <namePart type="family">Sun</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Bing</namePart>
            <namePart type="family">Qin</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Xipeng</namePart>
            <namePart type="family">Qiu</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jing</namePart>
            <namePart type="family">Jiang</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Xianpei</namePart>
            <namePart type="family">Han</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Chinese Information Processing Society of China</publisher>
            <place>
                <placeTerm type="text">Harbin, China</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>“Recent years have seen a lot of efforts in attacking the issue of language priors in the field of Visual Question Answering (VQA). Among the extensive efforts, causal inference is regarded as a promising direction to mitigate language bias by weakening the direct causal effect of questions on answers. In this paper, we follow the same direction and attack the issue of language priors by incorporating counterfactual data. Moreover, we propose a two-stage training strategy which is deemed to make better use of counterfactual data. Experiments on the widely used benchmark VQA-CP v2 demonstrate the effectiveness of the proposed approach, which improves the baseline by 21.21% and outperforms most of the previous systems.”</abstract>
    <identifier type="citekey">zhibo-etal-2023-overcoming</identifier>
    <location>
        <url>https://aclanthology.org/2023.ccl-1.52/</url>
    </location>
    <part>
        <date>2023-08</date>
        <extent unit="page">
            <start>600</start>
            <end>610</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Overcoming Language Priors with Counterfactual Inference for Visual Question Answering
%A Ren, Zhibo
%A Wang, Huizhen
%A Zhu, Muhua
%A Wang, Yichao
%A Xiao, Tong
%A Zhu, Jingbo
%Y Sun, Maosong
%Y Qin, Bing
%Y Qiu, Xipeng
%Y Jiang, Jing
%Y Han, Xianpei
%S Proceedings of the 22nd Chinese National Conference on Computational Linguistics
%D 2023
%8 August
%I Chinese Information Processing Society of China
%C Harbin, China
%G eng
%F zhibo-etal-2023-overcoming
%X “Recent years have seen a lot of efforts in attacking the issue of language priors in the field of Visual Question Answering (VQA). Among the extensive efforts, causal inference is regarded as a promising direction to mitigate language bias by weakening the direct causal effect of questions on answers. In this paper, we follow the same direction and attack the issue of language priors by incorporating counterfactual data. Moreover, we propose a two-stage training strategy which is deemed to make better use of counterfactual data. Experiments on the widely used benchmark VQA-CP v2 demonstrate the effectiveness of the proposed approach, which improves the baseline by 21.21% and outperforms most of the previous systems.”
%U https://aclanthology.org/2023.ccl-1.52/
%P 600-610

Markdown (Informal)

[Overcoming Language Priors with Counterfactual Inference for Visual Question Answering](https://aclanthology.org/2023.ccl-1.52/) (Ren et al., CCL 2023)
