Forecasting Conceptual Diffusion in Science: The Case of Quantum Computing

2026-06-02 · Social and Information Networks

Social and Information Networks, Computers and Society, Digital Libraries, Machine Learning
AI summary

The authors studied how scientific ideas change by looking at when ideas get stronger within a field (endogenous consolidation) versus when they spread to new areas (exogenous diffusion). Using data from quantum computing and other fields, they built models to predict these patterns based on how diverse and widespread the ideas’ connections were. They found that predicting spreading to new areas works well, but predicting idea reinforcement inside a field varies by discipline. Their work shows that tracking idea diversity and citation patterns can help forecast big shifts and new frontiers in science.

endogenous consolidation, exogenous diffusion, concept co-occurrence network, LightGBM, citation lineage, quantum computing, diffusion entropy, SHAP analysis, scientometrics, technology foresight
Authors
Thomas Maillart, Thibaut Chataing, David Dosu, Paul Bagourd, Julian Jang-Jaccard, Alain Mermoud
Abstract
Understanding and anticipating scientific change requires models that distinguish between endogenous consolidation and exogenous diffusion of scientific concepts. Using the quantum computing subtree of concepts in OpenAlex, we construct a temporally resolved concept co-occurrence network and track each concept pair through its upstream citation lineage and downstream diffusion. We train LightGBM models on distributional and diversity-aware features to predict four outcomes: endogenous reinforcement, exogenous diffusion, their ratio, and diffusion entropy. After controlling for overall publication growth of the scientific body, endogenous reinforcement proves largely unpredictable in the primary quantum-computing benchmark. In contrast, exogenous diffusion and entropy are strongly predictable ($R^2$ up to $0.78$) and are driven by upstream heterogeneity, citation breadth, and distributional dispersion, as shown by SHAP analyses; replications on robotics, advanced materials, and neuro implants confirm that exogenous diffusion remains the top-ranked target across fields ($R^2_{\mathrm{test}} \sim 0.60$ to $0.87$), while endogenous predictability rises markedly in neuro implants ($R^2_{\mathrm{test}} = 0.83$), indicating that the quantum-computing asymmetry does not generalise uniformly. Case studies reveal that sharp entropy increases coincide with the opening of new conceptual frontiers, while entropy collapses signal technological convergence or paradigm displacement. These results demonstrate that conceptual diffusion is governed by stable structural regularities embedded in semantic and citation environments. By identifying early diversity-based signals of cross-domain uptake, the approach provides a scalable foundation for anticipatory scientometrics, technology foresight, and innovation-oriented policy analysis in rapidly evolving research fields.
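The abstract does not spell out how diffusion entropy or the endogenous/exogenous split is computed. As a rough illustration only, the sketch below assumes diffusion entropy is the Shannon entropy (in bits) of the distribution of downstream citing fields for a concept pair, and that a citation counts as endogenous when it comes from the pair's home field. The function names, the `home_field` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def diffusion_entropy(downstream_fields):
    """Shannon entropy in bits of the downstream field distribution.

    Assumption: this is one plausible reading of "diffusion entropy";
    the paper's exact definition may differ.
    """
    counts = Counter(downstream_fields)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum_i p_i * log2(1 / p_i), with p_i = count_i / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def endo_exo_counts(downstream_fields, home_field):
    """Split downstream uptake into (endogenous, exogenous) counts.

    Assumption: uptake inside `home_field` is endogenous reinforcement,
    uptake anywhere else is exogenous diffusion.
    """
    endo = sum(1 for field in downstream_fields if field == home_field)
    exo = len(downstream_fields) - endo
    return endo, exo


# Hypothetical toy data: the field of each downstream citing paper.
concentrated = ["quantum computing"] * 8
spread = ["quantum computing", "cryptography",
          "chemistry", "machine learning"] * 2

print(diffusion_entropy(concentrated))                # single field: zero entropy
print(diffusion_entropy(spread))                      # four equally likely fields
print(endo_exo_counts(spread, "quantum computing"))   # (endogenous, exogenous)
```

Under these assumptions, a concept pair cited only within its home field has zero entropy, while uptake spread evenly over four fields gives two bits. This mirrors the contrast the abstract draws between entropy collapses and sharp entropy increases.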