Structural Feature Engineering for Generative Engine Optimization: How Content Structure Shapes Citation Behavior

2026-03-31 • Computation and Language

Computation and Language, Human-Computer Interaction, Information Retrieval

AI summary

The authors explain that AI search engines now answer queries directly and cite only selected sources, which makes it harder for content to gain visibility. They propose GEO-SFE, a framework that engineers content structure at three levels: macro-structure (document architecture), meso-structure (information chunking), and micro-structure (visual emphasis). The framework models how these features affect citation probability across different generative engines and optimizes structure while keeping the meaning intact. In experiments across six mainstream generative engines, the approach improved citation rate by 17.3 percent and subjective quality by 18.5 percent, suggesting that structural optimization matters for visibility in AI-driven search.

Generative Engine Optimization, AI-powered search engines, Content structure, Citation behavior, Macro-structure, Meso-structure, Micro-structure, Large Language Models, Information chunking, Source citation

Authors

Junwei Yu, Mufeng Yang, Yepeng Ding, Hiroyuki Sato

Abstract

The proliferation of AI-powered search engines has shifted information discovery from traditional link-based retrieval to direct answer generation with selective source citation, creating new challenges for content visibility. While existing Generative Engine Optimization (GEO) approaches focus primarily on semantic content modification, the role of structural features in influencing citation behavior remains underexplored. In this paper, we propose GEO-SFE, a systematic framework for structural feature engineering in generative engine optimization. Our approach decomposes content structure into three hierarchical levels: macro-structure (document architecture), meso-structure (information chunking), and micro-structure (visual emphasis), and models their impact on citation probability across different generative engine architectures. We develop architecture-aware optimization strategies and predictive models that preserve semantic integrity while improving structural effectiveness. Experimental evaluation across six mainstream generative engines demonstrates consistent improvements in citation rate (17.3 percent) and subjective quality (18.5 percent), validating the effectiveness and generalizability of the proposed framework. This work establishes structural optimization as a foundational component of GEO, providing a data-driven methodology for enhancing content visibility in LLM-powered information ecosystems.
