Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA

2026-04-10, Information Retrieval

Information Retrieval, Artificial Intelligence, Computation and Language, Machine Learning
AI summary

The authors studied two-hop question answering (QA), where finding answers depends on two linked pieces of information. They identified two types of questions based on where the second key entity appears: in the question itself or only in a connecting passage. They proved three main points about these question types and used this understanding to create RegimeRouter, a simple tool that decides the best way to search for answers by examining specific text features. Tested on multiple datasets, RegimeRouter improved retrieval accuracy in finding relevant information for these complex questions.

Two-hop question answering, Entity retrieval, Cosine similarity, Bridge passage, Predicate logic, Text encoding, Retrieval accuracy, Regime routing, Zero-shot learning, Multi-hop QA datasets
Authors
Andre Bacellar
Abstract
Two-hop QA retrieval splits queries into two regimes determined by whether the hop-2 entity is explicitly named in the question (Q-dominant) or only in the bridge passage (B-dominant). We formalize this split with three theorems: (T1) per-query AUC is a monotone function of the cosine separation margin, with R^2 >= 0.90 for six of eight type-encoder pairs; (T2) regime is characterized by two surface-text predicates, with P1 decisive for routing and P2 qualifying the B-dominant case, holding across three encoders and three datasets; and (T3) bridge advantage requires the relation-bearing sentence, not entity name alone, with removal causing an 8.6-14.1 pp performance drop (p < 0.001). Building on this theory, we propose RegimeRouter, a lightweight binary router that selects between question-only and question-plus-relation-sentence retrieval using five text features derived directly from the predicate definitions. Trained on 2WikiMultiHopQA (n = 881, 5-fold cross-fitted) and applied zero-shot to MuSiQue and HotpotQA, RegimeRouter achieves +5.6 pp (p < 0.001), +5.3 pp (p = 0.002), and +1.1 pp (non-significant, no-regret) R@5 improvement, respectively, with artifact-driven.
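To make the relationship in (T1) concrete, the following is a minimal sketch of how a cosine separation margin and a per-query AUC could be computed for one query. The abstract does not give the paper's exact definitions, so the margin here is assumed to be the mean cosine similarity to gold passages minus the mean cosine similarity to distractors, and the AUC is the usual pairwise ranking probability. The function names and the toy two-dimensional vectors are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def separation_margin(query, gold, distractors):
    """Assumed definition: mean cosine to gold minus mean cosine to distractors."""
    pos = [cosine(query, p) for p in gold]
    neg = [cosine(query, d) for d in distractors]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def per_query_auc(query, gold, distractors):
    """Fraction of (gold, distractor) pairs ranked correctly; ties count as 0.5."""
    pos = [cosine(query, p) for p in gold]
    neg = [cosine(query, d) for d in distractors]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy embeddings: one gold hop-2 passage and two distractors.
gold = [[0.9, 0.1]]
distractors = [[0.0, 1.0], [0.5, 0.5]]

# Two candidate query encodings, e.g. question-only vs. question plus relation sentence.
query_aligned = [1.0, 0.0]      # points toward the gold passage
query_misaligned = [0.0, 1.0]   # points toward a distractor

margin_aligned = separation_margin(query_aligned, gold, distractors)
auc_aligned = per_query_auc(query_aligned, gold, distractors)
margin_misaligned = separation_margin(query_misaligned, gold, distractors)
auc_misaligned = per_query_auc(query_misaligned, gold, distractors)

print(f"aligned:    margin={margin_aligned:.3f}, AUC={auc_aligned:.2f}")
print(f"misaligned: margin={margin_misaligned:.3f}, AUC={auc_misaligned:.2f}")
```

In this toy setup the query with the larger margin also has the higher AUC, which is the direction of the monotone relationship the abstract states. A router in the spirit of RegimeRouter would choose between the two query forms per question, although the paper's router uses text features rather than these margins directly.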