PRAGMA: Revolut Foundation Model

2026-04-09, Machine Learning

Machine Learning; Computational Engineering, Finance, and Science; Computation and Language; Information Retrieval
AI summary

The authors developed PRAGMA, a model that learns from large amounts of banking event data like transactions by predicting missing parts in the data. PRAGMA uses a Transformer architecture to understand complex, irregular financial records without needing explicit labels. Once trained, the model creates useful summaries (embeddings) of banking events that help with tasks such as credit scoring and fraud detection. Their tests show PRAGMA performs well across different financial applications, making it a flexible tool for analyzing banking data.

Transformer, foundation models, masked modeling, self-supervised learning, financial event data, credit scoring, fraud detection, embedding, fine-tuning, lifetime value prediction
Authors
Maxim Ostroukhov, Ruslan Mikhailov, Vladimir Iashin, Artem Sokolov, Andrei Akshonov, Vitaly Protasov, Dmitrii Beloborodov, Vince Mullin, Roman Yokunda Enzmann, Georgios Kolovos, Jason Renders, Pavel Nesterov, Anton Repushko
Abstract
Modern financial systems generate vast quantities of transactional and event-level data that encode rich economic signals. This paper presents PRAGMA, a family of foundation models for multi-source banking event sequences. Our approach pre-trains a Transformer-based architecture with masked modelling on a large-scale, heterogeneous banking event corpus using a self-supervised objective tailored to the discrete, variable-length nature of financial records. The resulting model supports a wide range of downstream tasks such as credit scoring, fraud detection, and lifetime value prediction: strong performance can be achieved by training a simple linear model on top of the extracted embeddings and can be further improved with lightweight fine-tuning. Through extensive evaluation on downstream tasks, we demonstrate that PRAGMA achieves superior performance across multiple domains directly from raw event sequences, providing a general-purpose representation layer for financial applications.
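To make the masked-modelling idea concrete, the following is a minimal sketch of the input-corruption step only, not the authors' implementation. The abstract does not specify the tokenisation, the masking rate, or the exact objective, so everything here is an assumption for illustration: the `[MASK]` token, the event token strings, the function name `mask_events`, and the default 15% rate (borrowed from the BERT convention).

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token; the paper's vocabulary is not given


def mask_events(tokens, mask_prob=0.15, seed=None):
    """Hide a random subset of event tokens.

    Returns (corrupted, targets). targets[i] holds the original token at
    masked positions and None elsewhere, so a model would be trained to
    predict only the hidden positions.
    """
    rng = random.Random(seed)
    corrupted, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            corrupted.append(MASK)   # the model sees the mask token here
            targets.append(tok)      # and is asked to recover the original
        else:
            corrupted.append(tok)    # unmasked tokens pass through unchanged
            targets.append(None)     # no prediction target at this position
    return corrupted, targets


# Illustrative, made-up event tokens for a single customer sequence.
events = [
    "card_payment", "merchant:grocery", "amount_bucket:3",
    "transfer_in", "amount_bucket:7",
]
corrupted, targets = mask_events(events, mask_prob=0.3, seed=0)
```

In a full pipeline of this kind, a Transformer encoder would consume `corrupted`, and the loss would be computed only at positions where `targets` is not `None`. The encoder's pooled output could then serve as the embedding on which a linear model is trained for a downstream task.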