Quantum Maximum Likelihood Prediction via Hilbert Space Embeddings

2026-02-20 · Information Theory

Information Theory · Machine Learning
AI summary

The authors offer a new way to understand how large language models (LLMs) can learn from the context they're given. They use ideas from quantum physics and statistics to think of training as embedding probability information into quantum states. Then, figuring out what comes next (in-context learning) is like making the best guess using these quantum models. They also provide mathematical guarantees about how well this method works and apply their ideas to both classical and quantum versions of LLMs.
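Schematically, the pipeline described above can be written as follows. The embedding map, the measurement, and the symbols below are illustrative placeholders chosen for this summary, not the paper's exact definitions:

\[
p \;\mapsto\; \rho_p \in \mathcal{D}(\mathcal{H}),
\qquad
\hat{\rho}_n \;=\; \operatorname*{arg\,max}_{\rho \in \mathcal{M}} \;\sum_{t=1}^{n} \log \operatorname{Tr}\!\left[\rho\, E_{x_t}\right],
\]

where \(\mathcal{D}(\mathcal{H})\) is the set of density operators on a Hilbert space \(\mathcal{H}\), \(\mathcal{M}\) is the class of quantum models, and \(\{E_x\}\) is a measurement (POVM) indexed by tokens \(x_1,\dots,x_n\) of the context. When \(\mathcal{M}\) is sufficiently expressive, the abstract below interprets \(\hat{\rho}_n\) as a quantum reverse information projection, with guarantees stated in trace norm and in the Umegaki quantum relative entropy \(D(\rho\|\sigma) = \operatorname{Tr}[\rho(\log\rho - \log\sigma)]\).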

large language models, in-context learning, quantum density operators, maximum-likelihood prediction, quantum reverse information projection, quantum Pythagorean theorem, trace norm, quantum relative entropy, convergence rates, concentration inequalities
Authors
Sreejith Sreekumar, Nir Weinberger
Abstract
Recent works have proposed various explanations for the ability of modern large language models (LLMs) to perform in-context prediction. We propose an alternative conceptual viewpoint from an information-geometric and statistical perspective. Motivated by Bach [2023], we model training as learning an embedding of probability distributions into the space of quantum density operators, and in-context learning as maximum-likelihood prediction over a specified class of quantum models. We provide an interpretation of this predictor in terms of a quantum reverse information projection and a quantum Pythagorean theorem when the class of quantum models is sufficiently expressive. We further derive non-asymptotic performance guarantees in terms of convergence rates and concentration inequalities, both in trace norm and quantum relative entropy. Our approach provides a unified framework to handle both classical and quantum LLMs.
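As a concrete toy illustration of the objects named in the abstract, here is a minimal numerical sketch. It assumes a simple feature-based embedding and a computational-basis likelihood; all function names and modeling choices here are illustrative assumptions, not the paper's construction:

import numpy as np

rng = np.random.default_rng(0)

def embed(p, features):
    """Embed a distribution p into a density operator
    rho = sum_i p_i |phi_i><phi_i| for unit-norm feature vectors phi_i
    (a covariance-type embedding in the spirit of Bach [2023];
    the paper's admissible embeddings may differ)."""
    return sum(pi * np.outer(f, f.conj()) for pi, f in zip(p, features))

def trace_distance(rho, sigma):
    """Trace-norm distance ||rho - sigma||_1: sum of singular values."""
    return np.linalg.svd(rho - sigma, compute_uv=False).sum()

def rel_entropy(rho, sigma, eps=1e-12):
    """Umegaki relative entropy D(rho||sigma) = Tr[rho (log rho - log sigma)]."""
    def mlog(m):
        w, v = np.linalg.eigh(m)  # m is Hermitian and positive semidefinite
        return (v * np.log(np.clip(w, eps, None))) @ v.conj().T
    return float(np.real(np.trace(rho @ (mlog(rho) - mlog(sigma)))))

def ml_predict(tokens, candidates, features):
    """Maximum-likelihood prediction over a finite model class: return the
    candidate distribution whose embedded state best explains the context,
    scoring token t by the Born probability Tr[rho |t><t|] = rho[t, t]."""
    def log_lik(q):
        rho = embed(q, features)
        return sum(np.log(max(np.real(rho[t, t]), 1e-12)) for t in tokens)
    return max(candidates, key=log_lik)

# Toy usage: two candidate token distributions over a 3-token alphabet.
d = 3
feats = [f / np.linalg.norm(f) for f in rng.normal(size=(d, d))]
p = np.array([0.6, 0.3, 0.1])
q = np.array([0.2, 0.3, 0.5])
rho, sigma = embed(p, feats), embed(q, feats)
print("trace distance:", trace_distance(rho, sigma))
print("rel. entropy  :", rel_entropy(rho, sigma))
print("ML pick for context [0, 0, 1, 0]:", ml_predict([0, 0, 1, 0], [p, q], feats))

In this toy model the in-context predictor reduces to comparing diagonal Born probabilities across a two-element model class; the paper works with general model classes and derives non-asymptotic guarantees for both of the discrepancy measures computed above.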