An LLM-Based System for Argument Reconstruction
2026-05-13 • Computation and Language
AI summary
The authors created a system that uses large language models to turn everyday text into argument maps that show how different points support or challenge each other. The system processes text in stages to find the key parts of an argument and how they relate, organizing them as graphs in which premises and conclusions are connected by support, attack, or undercut links. They tested the system in two ways: by having humans check its output on textbook examples, and by comparing its results against standard benchmark datasets. The results suggest it can reasonably reconstruct argument structures and adapt to different annotation formats, which shows promise for using LLM-based pipelines to analyze arguments at scale.
arguments, large language models, argument graphs, premises, conclusions, support relation, attack relation, undercut relation, argumentation theory, benchmark datasets
Authors
Paulo Pirozelli, Victor Hugo Nascimento Rocha, Fabio G. Cozman, Douglas Aldred
Abstract
Arguments are a fundamental aspect of human reasoning, in which claims are supported, challenged, and weighed against one another. We present an end-to-end large language model (LLM)-based system for reconstructing arguments from natural language text into abstract argument graphs. The system follows a multi-stage pipeline that progressively identifies argumentative components, selects relevant elements, and uncovers their logical relations. These elements are represented as directed acyclic graphs consisting of two component types (premises and conclusions) and three relation types (support, attack, and undercut). We conduct two complementary experiments to evaluate the system. First, we perform a manual evaluation on arguments drawn from an argumentation theory textbook to assess the system's ability to recover argumentative structure. Second, we conduct a quantitative evaluation on benchmark datasets, allowing comparison with prior work by mapping our outputs to established annotation schemes. Results show that the system can adequately recover argumentative structures and, when adapted to different annotation schemes, achieve reasonable performance across benchmark datasets. These findings highlight the potential of LLM-based pipelines for scalable argument reconstruction.
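To make the graph representation described in the abstract concrete, the following is a minimal sketch of a directed graph with two component types and three relation types, plus an acyclicity check. It is an illustration under stated assumptions, not the authors' implementation: the names `ArgumentGraph`, `ComponentType`, `RelationType`, `add_component`, `add_relation`, and `is_acyclic` are hypothetical, and every relation (including undercut) is modeled here as a plain edge between two components, which the abstract does not specify. The example sentences are invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ComponentType(Enum):
    PREMISE = "premise"
    CONCLUSION = "conclusion"


class RelationType(Enum):
    SUPPORT = "support"
    ATTACK = "attack"
    UNDERCUT = "undercut"


@dataclass
class ArgumentGraph:
    """Hypothetical container; not the data model used in the paper."""

    # component id -> (component type, text)
    components: dict = field(default_factory=dict)
    # list of (source id, target id, relation type)
    relations: list = field(default_factory=list)

    def add_component(self, cid, ctype, text):
        self.components[cid] = (ctype, text)

    def add_relation(self, source, target, rtype):
        # Assumption: every relation links two existing components.
        if source not in self.components or target not in self.components:
            raise KeyError("both endpoints must be existing components")
        self.relations.append((source, target, rtype))

    def is_acyclic(self):
        # Kahn's algorithm: repeatedly remove nodes with no incoming edges.
        # If every node can be removed this way, the graph has no cycle.
        indegree = {cid: 0 for cid in self.components}
        for _, target, _ in self.relations:
            indegree[target] += 1
        ready = [cid for cid, degree in indegree.items() if degree == 0]
        visited = 0
        while ready:
            node = ready.pop()
            visited += 1
            for source, target, _ in self.relations:
                if source == node:
                    indegree[target] -= 1
                    if indegree[target] == 0:
                        ready.append(target)
        return visited == len(self.components)


# Invented toy argument: one conclusion, one supporting and one attacking premise.
graph = ArgumentGraph()
graph.add_component("c1", ComponentType.CONCLUSION, "The city should expand its bike lanes.")
graph.add_component("p1", ComponentType.PREMISE, "Bike lanes reduce traffic congestion.")
graph.add_component("p2", ComponentType.PREMISE, "Expanding bike lanes is expensive.")
graph.add_relation("p1", "c1", RelationType.SUPPORT)
graph.add_relation("p2", "c1", RelationType.ATTACK)
print(graph.is_acyclic())
```

Keeping relations as typed edges in a flat list makes it straightforward to map the same graph onto a different annotation scheme by relabeling or filtering edges, which is one plausible way to picture the adaptation step the abstract mentions.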