• Alexander Bondarenko
• Magdalena Wolska
• Stefan Heindorf
• Lukas Blübaum
• Axel-Cyrille Ngonga Ngomo
• Benno Stein
• Pavel Braslavski
• Matthias Hagen
• Martin Potthast
At least 5% of questions submitted to search engines ask about cause–effect relationships in some way. To support the development of tailored approaches that can answer such questions, we construct Webis-CausalQA-22, a benchmark corpus of 1.1 million causal questions with answers. We distinguish different types of causal questions using a novel typology derived from a data-driven, manual analysis of questions from ten large question answering (QA) datasets. Using high-precision lexical rules, we extract causal questions of each type from these datasets to create our corpus. As an initial baseline, the state-of-the-art QA model UnifiedQA achieves a ROUGE-L F1 score of 0.48 on our new benchmark.
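The abstract mentions high-precision lexical rules for extracting causal questions. The following is a minimal illustrative sketch of what such rules could look like; the patterns below are hypothetical examples and do not reproduce the actual rules used to build Webis-CausalQA-22.

```python
import re

# Hypothetical lexical rules for spotting causal questions.
# Each pattern targets a surface cue that usually signals causality.
CAUSAL_PATTERNS = [
    re.compile(r"^\s*why\b", re.IGNORECASE),                       # "Why does ice float?"
    re.compile(r"\bwhat\s+(?:causes?|caused)\b", re.IGNORECASE),   # "What causes rain?"
    re.compile(r"\bwhat\s+happens\s+if\b", re.IGNORECASE),         # "What happens if ...?"
    re.compile(r"\bwhat\s+(?:is|are)\s+the\s+effects?\s+of\b", re.IGNORECASE),
]

def is_causal_question(question: str) -> bool:
    """Return True if any lexical rule matches the question."""
    return any(p.search(question) for p in CAUSAL_PATTERNS)
```

For example, `is_causal_question("Why does ice float?")` returns `True`, while `is_causal_question("Who wrote Hamlet?")` returns `False`. Rules of this kind trade recall for precision: they only fire on unambiguous causal cues, which keeps false positives out of the extracted corpus.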
Original language: English
Pages (from-to): 3296 - 3308
Number of pages: 13
Journal: Proceedings - International Conference on Computational Linguistics, COLING
Volume: 29
Issue number: 1
Publication status: Published - 2022
