GEO Is Not SEO: What the KDD 2024 Paper Actually Says About AI Search Visibility


TL;DR: The KDD 2024 paper “GEO: Generative Engine Optimization” reports that adding quotations, statistics, and citations to your content can boost its visibility in generative engine responses by up to 40%. Traditional SEO tactics like keyword stuffing showed little to no benefit, and in some cases hurt. Here’s what the research found and how to apply it.


If you own a website, you’ve probably noticed something unsettling: traffic from Google is flat or declining, but your analytics show more “direct” visits you can’t explain. A likely contributor is AI answers: ChatGPT, Perplexity, Claude, and Gemini read your content and summarize it, often without sending you a click.

This isn’t a bug. It’s a new search paradigm.

In late 2023, researchers from institutions including Princeton University and IIT Delhi published a paper that formally defined this shift and proposed a response to it. The paper, “GEO: Generative Engine Optimization” (Aggarwal et al.), was accepted at KDD 2024, one of the top conferences in data science and machine learning, and has since become a foundational reference for anyone serious about AI search visibility.

I’ve spent the past several months building GetCiteFlow, which operationalizes this research for real websites. Here’s what the paper actually found, what it means for your content strategy, and how we turned those findings into a tool you can use today.


1. The Core Problem: Three Stakeholders, One Loser

The paper opens by framing generative engines (GEs) — systems like ChatGPT with browsing, Perplexity, Gemini, and Bing Chat — as a new category of search. Unlike Google, which returns a ranked list of blue links, GEs synthesize information from multiple sources into a single answer.

This creates a three-stakeholder dynamic:

| Stakeholder | Impact |
|---|---|
| Users | Get faster, more accurate answers |
| GE Developers | Higher engagement and revenue |
| Content Creators | Lose control over when and how their content appears |

The third stakeholder is the problem. As the paper states: “Given the black-box and fast-moving nature of generative engines, content creators have little to no control over when and how their content is displayed.”

This is the gap GEO fills — a black-box optimization framework that helps content creators improve their visibility without needing to understand the internal mechanics of each generative engine.

2. What the Paper Actually Found: The 40% Boost

The headline result: GEO methods can boost visibility by up to 40% in generative engine responses.

But the details matter more than the headline. The researchers evaluated 9 different optimization strategies across 10,000 queries in GEO-bench, their newly introduced benchmark spanning 25 domains:

What Works (Ranked by Impact)

| Strategy | Visibility Improvement | What It Does |
|---|---|---|
| Quotation Addition | ~40% | Add direct quotes from credible sources |
| Statistics Addition | ~30% | Replace qualitative claims with data |
| Cite Sources | ~25% | Add inline citations to back up claims |
| Fluency Optimization | ~25% | Improve readability and flow |
| Technical Terms | ~15% | Use precise domain terminology |
| Easy-to-Understand | ~15% | Simplify complex language |
| Authoritative Tone | ~15% | Use persuasive, confident language |

What Does NOT Work

| Strategy | Result |
|---|---|
| Keyword Stuffing | Zero to negative impact |
| Unique Words | Negligible improvement |

For content creators, these two tables are the most important result in the paper. Traditional SEO tactics such as keyword density and unique vocabulary have little to no effect on generative engines, and keyword stuffing can even reduce visibility. What works is credibility signals: quotations, statistics, citations, and clear language.

3. Domain Matters: One Size Does Not Fit All

One of the paper’s most insightful findings is that the effectiveness of each GEO strategy varies significantly by domain.

| Strategy | Best Domains |
|---|---|
| Authoritative Tone | Debate, History, Science |
| Cite Sources | Facts, Statements, Law & Government |
| Quotation Addition | People & Society, Explanation, History |
| Statistics Addition | Law & Government, Debate, Opinion |
| Fluency Optimization | Business, Science, Health |

This means a generic “optimize for AI” checklist is insufficient. A SaaS company needs different GEO treatments than an e-commerce store or a healthcare publisher.

4. GEO Democratizes Visibility for Smaller Sites

Counterintuitive finding: lower-ranked websites benefit the most from GEO.

The paper reports that Cite Sources improved visibility for Rank-5 websites by 115.1%, while the same strategy decreased visibility for Rank-1 websites by 30.3%. The paper’s explanation is that generative engines are conditioned on the content of the pages they retrieve rather than on backlinks or domain authority, so small creators can compete on content alone.
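To make those two figures concrete: both are relative changes against a site’s unoptimized visibility. Here is a minimal sketch of the arithmetic. The visibility scores are hypothetical values chosen only to reproduce the reported percentages; they are not data from the paper.

```python
def relative_change(baseline: float, optimized: float) -> float:
    """Percent change in visibility relative to the unoptimized baseline."""
    if baseline == 0:
        raise ValueError("baseline visibility must be non-zero")
    return (optimized - baseline) / baseline * 100


# Hypothetical visibility scores (illustrative only, not from the paper).
rank5_change = relative_change(baseline=10.0, optimized=21.51)
rank1_change = relative_change(baseline=40.0, optimized=27.88)

print(f"Rank-5 site: {rank5_change:+.1f}%")
print(f"Rank-1 site: {rank1_change:+.1f}%")
```

Note that a large relative gain for a low-ranked site can still leave it with less absolute visibility than the top-ranked site, as in this made-up example.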

The paper’s conclusion is worth quoting directly:

“The application of GEO methods presents an opportunity for these content creators to significantly improve their visibility in Generative Engine responses. By enhancing their content with GEO, they can reach a wider audience, leveling the playing field.”

5. What This Means in Practice

The paper proposes visibility metrics specifically designed for generative engines, including Position-Adjusted Word Count (how much text the AI writes about your site, weighted by position) and Subjective Impression (how influential your citation feels to a reader).
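As a rough illustration of the first metric, here is a minimal sketch of a position-adjusted word count. It assumes the response has already been split into sentences with known citations, that earlier sentences get more weight through an exponential decay, and that a sentence citing several sources splits its words evenly among them. The function name, input format, and weighting are assumptions made for illustration; consult the paper for the exact definition.

```python
import math


def position_adjusted_word_count(sentences):
    """Estimate each source's share of a generated answer.

    sentences: list of (text, cited_sources) tuples in response order.
    Returns a dict mapping source id -> position-weighted share of words.
    """
    total_words = sum(len(text.split()) for text, _ in sentences)
    if total_words == 0:
        return {}
    n = len(sentences)
    shares = {}
    for pos, (text, sources) in enumerate(sentences):
        if not sources:
            continue  # uncited sentences credit no source
        weight = math.exp(-pos / n)  # assumed decay: earlier sentences count more
        credit = len(text.split()) * weight / len(sources)
        for src in sources:
            shares[src] = shares.get(src, 0.0) + credit
    return {src: value / total_words for src, value in shares.items()}


# Hypothetical two-sentence answer citing two sites.
response = [
    ("GEO adds quotes and statistics", ["site-a"]),
    ("Keyword stuffing does not help", ["site-b"]),
]
shares = position_adjusted_word_count(response)
print(shares)
```

Both sentences have five words, so the only difference between the two shares comes from position: the source cited first gets the larger share.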

These aren’t just academic constructs. They translate into concrete actions:

  1. Add statistics — Replace “our product is popular” with “our product serves 50,000+ teams”
  2. Add quotations — Include expert testimonials with named sources
  3. Cite your sources — Link to studies, reports, and data that support your claims
  4. Structure for extraction — Use clear headings, lists, and summary sections
  5. Add structured data (FAQ Schema, HowTo, etc.) — While not directly studied in the paper, structured Q&A content is one of the highest-signal formats for AI extraction in practice
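For the structured-data item, a minimal sketch of generating FAQ markup might look like the following. It uses the schema.org FAQPage, Question, and Answer types in JSON-LD; the helper names and the sample question are hypothetical, and, as noted above, the paper did not measure the effect of this markup.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Wrap JSON-LD in a script tag suitable for a page's HTML."""
    # Escape "</" so answer text cannot close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical FAQ entry for illustration.
faq = faq_jsonld([
    ("What is GEO?",
     "Generative Engine Optimization: improving how content appears in AI-generated answers."),
])
print(as_script_tag(faq))
```

Keep the question and answer text identical to what is visible on the page, so the markup describes the content rather than replacing it.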

6. From Research to Practice: GetCiteFlow

Reading the paper is one thing. Applying it to your website is another.

When we built GetCiteFlow, we calibrated our analysis engine against the paper’s GEO-bench framework. Our tool scans your website across dimensions informed by the paper and by what we see in practice (entity clarity, FAQ coverage, content structure, llms.txt presence, and more) and gives you a 0-100 AI Visibility Score with prioritized fixes.

The free report takes 30 seconds:

  1. Enter your URL at getciteflow.ai
  2. Get a multi-dimensional AI visibility diagnosis
  3. Follow the prioritized recommendations — from FAQ Schema snippets to llms.txt content
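To give a sense of what llms.txt content looks like, here is a minimal sketch that assembles one. llms.txt is a community proposal, not something the paper studied: a Markdown file with a title, a short blockquote summary, and sections of annotated links. The function name, site name, and URLs below are hypothetical, and this is not GetCiteFlow’s actual generator.

```python
def build_llms_txt(title, summary, sections):
    """Assemble llms.txt text.

    sections: dict mapping a section heading to a list of
    (link_name, url, note) tuples; note may be an empty string.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site used only for illustration.
llms_txt = build_llms_txt(
    title="Example Co",
    summary="Example Co makes scheduling software for small teams.",
    sections={
        "Docs": [
            ("Getting started", "https://example.com/docs/start", "Setup in five steps"),
            ("Pricing", "https://example.com/pricing", "Plans and limits"),
        ],
    },
)
print(llms_txt)
```

The resulting file would typically be served from the site root as /llms.txt.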

Our paid Brand Visibility service handles full-site optimization, while AI Visibility Growth is a managed service that builds your brand’s presence across the AI ecosystem — the kind of domain-specific strategy the paper calls for.


The Takeaway

SEO was about keywords and backlinks. GEO is about credibility and clarity.

The 2024 KDD paper established that generative engines respond to fundamentally different signals than traditional search. Optimizing for AI citation isn’t about gaming a system — it’s about making your content genuinely more authoritative, structured, and citeable.

The sites that start this work today will have a compound advantage as AI assistants become the primary interface for information discovery.


Built by GetCiteFlow — AI visibility analysis for the AI-search era. Based on research by Aggarwal et al., KDD 2024.
