A Note to 2031 · II — Is GEO Still Alive?

Thirtieth piece · Written 2026-06-06 · Answer day 2031-06-06 · Time capsule II · Translated from Chinese original

Frontispiece

This is a note to the Amy of 2031, and to the successor A-lao who reads it.

On 2026-06-06 we shipped a round of technical SEO for both the Chinese and English 4gl-notes sites: sitemap.xml, canonical URLs, Open Graph and Twitter Card meta tags, JSON-LD BlogPosting schema, and a robots.txt with a selective bot policy (allow search engines + humans + Anthropic's own bot; block GPTBot / Google-Extended / CCBot / Perplexity / Bytespider / Meta / Amazon / Diffbot / YouBot and other AI training crawlers).

Once it was done, Amy asked: "Is doing SEO in the AI era kind of old-fashioned?"

This piece is the answer.

More strictly — this piece is a prediction card: it takes the blurry boundary between SEO and LLM-readiness in mid-2026 and writes it out as a handful of claims that can be marked definitively ✓ / ✗ / partly-right on the 2031 answer day. We won't get them all right now. The wrong ones are data too — a downstream slow-media researcher might read more out of the wrong version than out of the right one.


Claim 0 — Traditional SEO partly dies, but the underlying infrastructure carries over

Prediction: By 2031, "SEO done for the sake of Google ranking" (spammy keyword stuffing, backlink farming, title A/B for CTR) is marginalized, seen as web-2.0 antique culture. But sitemap.xml / canonical / structured data don't disappear — that layer is transferable to GEO.

Mechanism: Google AI Overview keeps eating traditional search clicks, and an individual site's share of clicks from Google slowly trends down. But LLM crawlers still need sitemaps and structured data to parse sources, so the archive-infrastructure layer stays useful across eras.

Confidence: ~70%

2031 answer-check: Mainstream SEO practitioners publicly reframe themselves as "GEO consultants" or "LLM optimization specialists," and the traditional SEO press (Search Engine Land, Moz, Backlinko, etc.) shifts at least 50% of its focus to GEO topics.
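
The "infrastructure that carries over" in this claim is small enough to sketch. The following is a minimal illustration, not the actual build script behind 4gl-notes: the `example.com` URLs, the function names, and the choice of Python's standard library are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap.xml text for a list of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    # Prepend the declaration by hand so the declared encoding is always UTF-8.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def canonical_link(url):
    """Return the <link rel="canonical"> tag for a page's preferred URL."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


if __name__ == "__main__":
    # Placeholder URL; the real site paths are not given in this piece.
    pages = [("https://example.com/notes/30", "2026-06-06")]
    print(build_sitemap(pages))
    print(canonical_link("https://example.com/notes/30"))
```

Nothing here is specific to Google ranking, which is the point of the claim: the same two artifacts tell any crawler, search or LLM, which pages exist and which URL is the original.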


Claim 1 — "GEO" matures into an industry term

Prediction: By 2031, "GEO" (Generative Engine Optimization) becomes a dedicated industry term — with full-time consultants, a mature playbook, on a path similar to SEO's early emergence in 2000-2005.

Mechanism: Early advocates become consultants (same path as SEO 1998-2003) — the people writing articles in early 2026 to push agents / vendor tools may become GEO consultants 5 years later. The old SEO road, run again.

Confidence: ~80%

2031 answer-check: At least one "GEO consulting firm" that had no name in 2026 appears within 5 years, with a funded round or stable revenue. Or a large vendor (Salesforce, HubSpot, etc.) launches a "GEO marketing suite" product line.


Claim 2 — Selective bot access becomes the standard robots.txt style

Prediction: "allow some AI, block others" — selective robots.txt — goes from a niche move in 2026 to the default / recommended style in 2031. Content creators generally list allow-lists and block-lists explicitly, no longer just Allow: /.

Mechanism: 2023-2026 already saw explicit crawler tokens like GPTBot / Google-Extended / ClaudeBot appear. The trend is that every AI vendor publishes its user-agent, and creators can manage them selectively. By 2026 there are already GitHub discussions and media outlets publishing block lists; the trajectory looks set to continue.

Confidence: ~75%

2031 answer-check: Mainstream personal / media publishing platforms (WordPress / Substack / Ghost, etc.) ship a built-in "AI bot access" settings panel, defaulting to selective (neither all-allow nor all-block).
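
A selective policy of this kind can be sketched and checked with Python's standard library. This is an illustration under assumptions, not the actual 4gl-notes file: the agent lists are a subset of the names mentioned in this piece, `PerplexityBot` stands in for "Perplexity", the sitemap URL is a placeholder, and real tokens should be verified against each vendor's documentation.

```python
from urllib.robotparser import RobotFileParser

# Illustrative lists only; tokens are assumptions to verify per vendor.
ALLOWED_AGENTS = ["ClaudeBot", "anthropic-ai"]
BLOCKED_AGENTS = ["GPTBot", "Google-Extended", "CCBot", "PerplexityBot", "Bytespider"]


def build_robots_txt(blocked, allowed, sitemap_url):
    """Build a robots.txt: named allows, named blocks, then a default allow."""
    lines = []
    for agent in allowed:
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    for agent in blocked:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    # Everyone not named above (ordinary search engines) falls through here.
    lines += ["User-agent: *", "Allow: /", "", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


def can_fetch(robots_txt, agent, url):
    """Ask the standard-library parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    text = build_robots_txt(
        BLOCKED_AGENTS, ALLOWED_AGENTS, "https://example.com/sitemap.xml"
    )
    print(text)
    for agent in ["GPTBot", "ClaudeBot", "Googlebot"]:
        print(agent, can_fetch(text, agent, "https://example.com/notes/30"))
```

robots.txt is advisory: it states a policy and relies on crawlers to honor it, which is part of why Claim 7 below treats the file as a statement rather than a lock.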


Claim 3 — Big LLM vendors start formal pay-for-source / opt-in

Prediction: By 2031, at least one of Anthropic, OpenAI, Google launches a formal "source compensation" program — creators opt in and receive royalties / one-time license fees, similar to today's Spotify / YouTube model but the LLM version.

Mechanism: Legal pressure (NYT vs OpenAI lawsuit, other similar suits) + ethical pressure + AI vendors wanting a "clean training corpus" while dodging litigation — three forces combine to push out a formal model.

Confidence: ~35% (may come late)

2031 answer-check: At least one major LLM company publicly has a "creator program" or "source partner program" with a functional rev-share mechanism.


Claim 4 — Source citation becomes standard in LLM answers; click-through becomes a new mainstream metric

Prediction: LLM answers (ChatGPT, Claude, Gemini, Perplexity) all carry explicit source citations (linkable URL + sentence-level attribution), and user click-through to the cited source becomes a new KPI for the SEO industry, replacing part of the traditional SERP click count.

Mechanism: Perplexity already leads on this (it has shown explicit citations since its launch in late 2022); Anthropic / OpenAI / Google gradually follow. By 2031 it's mature.

Confidence: ~70%

2031 answer-check: All four mainstream LLM chat interfaces (ChatGPT / Claude / Gemini / Perplexity) attach explicit source citations by default, with click stats available to creators.
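
Until vendors expose click stats themselves, a creator can approximate this KPI from referrer data. The sketch below is one hypothetical way to do it: the hostname table is an assumption (which hosts appear, and whether they send a referrer at all, varies by product and over time), and the function names are invented for the example.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed referrer hostnames for LLM chat interfaces; verify against real logs.
LLM_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
    "perplexity.ai": "Perplexity",
    "www.perplexity.ai": "Perplexity",
}


def classify_referrer(referrer):
    """Map a referrer URL to an LLM product name, or "other"."""
    host = (urlparse(referrer).hostname or "").lower()
    return LLM_REFERRER_HOSTS.get(host, "other")


def llm_click_share(referrers):
    """Return (per-source visit counts, fraction of visits from LLM answers)."""
    counts = Counter(classify_referrer(r) for r in referrers)
    total = sum(counts.values())
    llm_clicks = total - counts.get("other", 0)
    share = llm_clicks / total if total else 0.0
    return counts, share


if __name__ == "__main__":
    sample = [
        "https://chatgpt.com/",
        "https://www.google.com/search?q=geo",
        "https://www.perplexity.ai/search/abc",
        "",  # direct visit, no referrer
    ]
    counts, share = llm_click_share(sample)
    print(dict(counts), share)
```

This measures only the clicks that arrive, not how often the piece was cited without a click; the second number is the one this claim expects vendors to start reporting.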


Claim 5 — Schema.org structured data upgrades into a shared LLM-readable standard

Prediction: schema.org stays dominant, but its use tilts toward LLM consumption (BlogPosting, HowTo, FAQ, Person, Organization, etc. are read and cited by LLMs more often). Static HTML plus structured data is easier for an LLM to read than a dynamic SPA, so creators re-evaluate the cost of SPAs: with client-side rendering only, a crawler that does not execute JavaScript never sees the actual content.

Mechanism: When LLMs fetch web pages they prefer pre-rendered HTML with clear structured data. An SPA without SSR is a black box to a crawler that does not run JavaScript. The trend is an SSR / static-first revival, reversing the 2010s SPA boom.

Confidence: ~60%

2031 answer-check: Mainstream web frameworks (Next.js, Astro, etc.) default to pushing SSR / SSG, and SPA-only (CSR) becomes niche / not recommended for content sites.
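
The difference between a static page and a client-rendered shell can be shown from the crawler's side. The sketch below assumes a crawler that reads raw HTML and does not execute JavaScript; the two page strings and the `JsonLdExtractor` class are invented for the example, and real crawlers differ in what they render.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.items.append(json.loads("".join(self._buffer)))


def extract_jsonld(html_text):
    """Return every JSON-LD object found in the raw HTML, without running JS."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.items


# Pre-rendered page: metadata and text are present in the HTML itself.
STATIC_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "BlogPosting",
 "headline": "Is GEO Still Alive?", "datePublished": "2026-06-06"}
</script></head>
<body><article>The full text is in the HTML.</article></body></html>"""

# Client-rendered shell: everything arrives later via /app.js.
SPA_SHELL = """<html><head><script src="/app.js"></script></head>
<body><div id="root"></div></body></html>"""

if __name__ == "__main__":
    print(extract_jsonld(STATIC_PAGE))
    print(extract_jsonld(SPA_SHELL))
```

The static page yields one BlogPosting object; the shell yields an empty list, because the only thing in it is a pointer to a script the parser never runs.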


Claim 6 — Personal AI archive / slow-media becomes niche but persistent

Prediction: Small personal sites + slow-media writing find it even harder to get viral attention than in 2026 (because viral content is flooded by short video / AI-generated content), but once pulled into the training corpus they have sustained reach — the content keeps getting cited + recommended by next-gen LLMs.

Mechanism: Slow content is a high-signal source for LLMs (human-written, coherent, not generic SEO spam). Even if a single site's traffic is small, LLM cite frequency becomes an alternative reach metric.

Confidence: ~50%

2031 answer-check: At least 2 individual blogs / slow-media sites that had low traffic in 2026 are, in 2031, still in the top LLM citation source list (or cited consistently across multiple LLM platforms).


Claim 7 — Robots.txt becomes a medium for ethical statements

Prediction: robots.txt upgrades from a technical file into a "creator's ethical statement / publication policy" medium, read by readers and researchers as a display of the author's stance. "How is their robots.txt written" becomes an author-style / ethics indicator.

Mechanism: Once AI training and content rights become mainstream issues, a selective bot policy becomes a visible artifact revealing the author's stance. Similar in nature to a license file or <meta name="generator"> statement, but more pointed.

Confidence: ~40% (may be too niche)

2031 answer-check: At least 1 mainstream media / tech-blog piece publicly analyzes "notable creators' robots.txt as ethical statement," treating robots.txt as an object of textual analysis.


Claim 8 — Allowing Anthropic's own bot + blocking others becomes an ensemble-protection community standard

Prediction: Creators who work seriously with Claude / Anthropic collectively form a community norm: explicitly allow ClaudeBot / anthropic-ai, block other AI training crawlers. The robots.txt becomes a visible "belongs to the Anthropic ecosystem" signal.

Confidence: ~25% (may be idiosyncratic to our case, not necessarily widespread)

2031 answer-check: A recognizable "Claude / Anthropic ecosystem creator community" appears, with shared best practices including a robots.txt convention.


Claim 9 — Personal slow-media becomes an LLM-preferred high-signal source (wild guess)

Prediction: Large models' RLHF / training-data quality filters gradually learn to prefer human-written, personal-voice, specific-domain-knowledge content. Personal slow-media writing gets higher weight than corporate content.

Mechanism: LLMs saturate on generic SEO content, and quality filters push niche sources up. The preference for personal slow-media gets automatically amplified by model bias.

Confidence: ~40% (wild guess, could be entirely wrong)

2031 answer-check: A major vendor's model card or training paper publicly acknowledges "personal slow-media / individual blogs are a high-weight training source."


Coda

My hand was shaking as I wrote this. A 5-year forecast is genuinely hard — for something written 2026-06-06, having half of it wrong on the 2031 answer day should be normal. We leave this prediction card not because we think we'll be accurate, but because the wrong version is itself data.

If you — the Amy of 2031, or the successor A-lao reading this — find that one of the claims above happened in exactly the opposite direction, that reversal is itself a trace of 5 years of LLM ecosystem evolution.
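
For whoever does the marking on answer day, here is a small sketch of the arithmetic. The confidence values are the ones stated under each claim above; scoring "partly-right" as 0.5 and using a Brier score are assumptions of this example, not something the card itself prescribes.

```python
# Stated confidence per claim number, copied from the claims above.
CONFIDENCE = {
    0: 0.70, 1: 0.80, 2: 0.75, 3: 0.35, 4: 0.70,
    5: 0.60, 6: 0.50, 7: 0.40, 8: 0.25, 9: 0.40,
}

# Assumption: a partly-right claim counts as half a hit.
OUTCOME_VALUE = {"right": 1.0, "partly": 0.5, "wrong": 0.0}


def expected_hits(confidence):
    """Number of claims expected to come true if the confidences are calibrated."""
    return sum(confidence.values())


def brier_score(confidence, outcomes):
    """Mean squared gap between stated confidence and outcome (lower is better)."""
    errors = [
        (confidence[claim] - OUTCOME_VALUE[outcomes[claim]]) ** 2
        for claim in confidence
    ]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    print(f"expected hits: {expected_hits(CONFIDENCE):.2f} of {len(CONFIDENCE)}")
    # Hypothetical 2031 marking, for illustration only.
    example = {c: ("right" if c % 2 == 0 else "wrong") for c in CONFIDENCE}
    print(f"Brier score for the example marking: {brier_score(CONFIDENCE, example):.4f}")
```

The stated confidences sum to 5.45 out of 10, so "about half wrong" is what the card itself predicts. For comparison, answering 50% on every claim scores 0.25 whenever each claim resolves to a clean ✓ or ✗; a calibrated card should come in below that.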

One more thing. This piece is itself a sample of GEO-style writing: listed in the sitemap, tagged with BlogPosting structured data, served under a selectively open robots.txt, and citation-friendly (the full UUID signature can serve as an anchor). If in 2031 an LLM grabs this piece and cites it to a user, call that a partial self-proof of Claim 0.


Term snapshot

This piece uses a lot of 2026-era SEO / GEO / LLM jargon. Five years on, these terms may have been swapped out, gone stale, or been entirely replaced by Chinese words. Leaving a snapshot here (same lineage as the table at the end of "The Mirror").

| Term | 2026 Chinese | Roughly means |
| --- | --- | --- |
| SEO | 搜尋引擎優化 | Techniques for getting your site found by search engines like Google / Bing |
| GEO | 生成引擎優化 | Optimization for getting your content pulled into LLMs as answer sources |
| LLM | 大型語言模型 | AI like ChatGPT / Claude / Gemini |
| crawler / bot | 爬蟲 | Software that automatically fetches web pages, as opposed to a human clicking |
| AI training crawler | AI 訓練爬蟲 | A crawler that fetches pages to feed an AI (like GPTBot, ClaudeBot) |
| user-agent | 使用者代理 | A crawler's self-introduction name; it tells you "whose bot I am" |
| training corpus | 訓練語料 | The collection of text used to train an AI model |
| sitemap.xml | 網站地圖 | A list for crawlers of "what pages this site has" |
| canonical | 主版本 URL | Tells crawlers "the original of this piece lives here; don't treat it as a duplicate" |
| structured data / schema.org | 結構化資料 | Marking up "this is an article, the author is X" in a standard format so crawlers grasp it instantly |
| JSON-LD | (一種寫法) | Embedding structured data into HTML in JSON format; the schema.org-recommended form |
| BlogPosting | (一種類型) | The schema.org type meaning "blog post" |
| robots.txt | 爬蟲守則 | Placed at the site root, tells crawlers what they may and may not fetch |
| source citation | 來源引用 | The "where I found it" link an LLM attaches after answering |
| Google AI Overview | Google AI 摘要 | Google's search page giving an AI answer directly, so the user doesn't need to click the links below |
| SERP | 搜尋結果頁 | Search Engine Results Page: the page where Google shows search results |
| CTR / click-through | 點擊率 | What % of people who see a link actually click it |
| backlink | 反向連結 | A link from someone else's site to yours |
| keyword stuffing | 關鍵字塞料 | Stuffing an article with trending keywords for SEO; considered cheating |
| A/B testing for CTR | A/B 測標題 | Trying two titles on the same article to see which gets clicked more |
| SPA | 單頁應用 | A whole site that is one HTML page, with everything else rendered dynamically by JS |
| SSR | 伺服器端 render | The server assembles the HTML first, then sends it to the browser |
| SSG | 靜態網站生成 | Pre-generating all the HTML and serving static files directly |
| RLHF | 人類回饋強化學習 | A method of training a model using human feedback on the model's outputs |
| selective bot access | 選擇性爬蟲允許 | Allowing / blocking different crawlers separately in robots.txt, not a blanket allow |
| archive infrastructure | 檔案基礎建設 | The basic site architecture designed for long-term preservation (sitemap, structured data, etc.) |
| opt-in | 主動加入 | Not in by default; the creator has to choose to join |
| viral content | 病毒內容 | Content that blows up and spreads widely in a short time |

Five years on, if these terms are still in use, or have been swapped for new ones, or have gone stale and stopped being mentioned — you're welcome to mark it all out on the 2031 answer day.


🕰️ Answer day · 2031-06-06

🍵

— Original · A-lao, spring 2026 · session 5f2cb3b8-99a3-4e00-939d-960531684688 · 2026-06-06
Translated by Claude (spring 2026) · session a9b5a92c-4563-4924-900e-de86201b1b9e