Structured data and schema for rich results and AI citation

Schema now does two jobs: trigger rich results in Google and build entity trust for AI citation. Here is what to implement, what to strip, and what is vendor noise.

2026-06-29 · 10 min · By Ali Azlan

On this page

What structured data actually is (and why the terminology matters)
Rich results: what still works, what Google has retired
The types worth implementing now
What to strip out
Schema and AI citation: confirmed facts versus vendor noise
What Google and Microsoft have actually said
The contested correlation data
Entity disambiguation: the highest-leverage implementation most teams skip
Implementation mechanics and the failure modes that cancel your work
Schema drift
Nesting errors and validation
A practical audit: two jobs, one implementation

Structured data now does two distinct jobs. The first is triggering visual rich results in Google Search. The second is acting as a trust signal that AI Mode, Copilot and ChatGPT use when deciding which sources to cite. Most teams are only doing the first job, and a fair number are doing that wrong because they have not pruned the schema types Google has retired.

This is a practitioner's guide to what to implement, what to remove, and which vendor claims to ignore.

What structured data actually is (and why the terminology matters)

Three terms get used interchangeably, and they should not be. Structured data is the broad concept: machine-readable code embedded in HTML that describes what content means, not just what it says. Schema.org is the shared vocabulary, defining types such as Organization, Product, Article and Person, along with the properties each carries. (Type names use schema.org's American spelling, so it is Organization in markup even on a British site.) JSON-LD is the format, a block of JSON sitting in a <script> tag that keeps the markup separate from the visible HTML.

Google supports JSON-LD, Microdata and RDFa, but recommends JSON-LD because it is easier to implement and maintain at scale. That recommendation is not a stylistic preference. Inline Microdata gets tangled with template logic and breaks the moment a designer reorders the DOM. JSON-LD survives front-end refactors. Use JSON-LD unless you have a specific reason not to.

html

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Devonic Web",
  "url": "https://devonicweb.com",
  "sameAs": [
    "https://www.linkedin.com/company/devonicweb",
    "https://www.wikidata.org/wiki/Q0000000"
  ]
}
</script>

One block, one job, easy to diff in a pull request.

Rich results: what still works, what Google has retired

Rich results are the visual treatments Google can apply to your search listing: review stars, product pricing, recipe times, event dates, sitelinks. Valid markup makes a page eligible for them; it does not guarantee them. Google decides whether to award them at query time.

When they are awarded, the lift is real. Nestlé measured an 82% higher click-through rate on pages displayed as rich results versus standard listings. Rotten Tomatoes saw a 25% CTR uplift after marking up 100,000 pages. The Food Network reported a 35% visit increase after converting 80% of pages to search-feature-eligible markup. These are Google-published case studies and they remain the strongest evidence that structured data earns its keep on the display side.

The types worth implementing now

As of March 2026, 31 schema types still have active rich result support in Google Search. The ones that actually move the needle are tied to specific user intent:

Product with nested Offer and AggregateRating for e-commerce. SearchPilot's controlled test found Review schema alone lifted product page traffic by 20%.
Article with author, datePublished and dateModified for editorial and blog content.
Event with location and offers for ticketed or scheduled content.
Recipe with cookTime, recipeIngredient and nutrition for food sites.
Organization with sameAs and logo, which feeds Knowledge Panel construction.
BreadcrumbList for hierarchical navigation.
VideoObject for embedded video, particularly for surfaced video carousels.

What to strip out

Google has retired seven schema types from rich result eligibility. The two that catch most sites by surprise:

FAQPage rich results are gone as of 7 May 2026 outside government and health verticals. Google's own documentation now carries a banner confirming this.
HowTo rich results were removed earlier and never came back.

Old review markup, ClaimReview for non-fact-checkers, and several other deprecated types fall into the same category. Google has confirmed that leaving them in place will not trigger errors or ranking drops, but they add weight to your templates and mislead junior engineers into assuming every JSON-LD block is doing work.

The deprecation audit

Run a crawl that extracts every @type value in your JSON-LD. Anything in the retired list (HowTo, FAQPage outside gov/health, deprecated review types) gets a decision: keep for AI parsing or strip for cleanliness. Document the choice in the repo so the next developer does not re-add it.
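A minimal sketch of the extraction step, using only Python's standard library. The RETIRED_TYPES set and the function names are assumptions for illustration: fill the set from Google's current documentation, and feed audit_page the HTML from whatever crawler you already run.

```python
import json
from html.parser import HTMLParser

# Assumed retired list for illustration; confirm against Google's documentation.
RETIRED_TYPES = {"HowTo", "FAQPage"}


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def collect_types(node, found):
    """Walk parsed JSON-LD recursively and record every @type value."""
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)


def audit_page(html):
    """Return (all types found, the subset that appears in RETIRED_TYPES)."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    found = set()
    for raw in extractor.blocks:
        try:
            collect_types(json.loads(raw), found)
        except json.JSONDecodeError:
            continue  # malformed block: skipped here, worth logging in a real audit
    return found, found & RETIRED_TYPES
```

The second return value is the decision list: each retired type found on a page needs an explicit keep-or-strip call recorded in the repo.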

Schema and AI citation: confirmed facts versus vendor noise

This is where the conversation gets messy, and where most articles either oversell or hand-wave.

What Google and Microsoft have actually said

In March 2025, Google and Microsoft both made on-record statements. Google's structured data engineer Ryan Levering, speaking at Search Central Live in New York, confirmed schema is used by Google's generative features. Microsoft's Fabrice Canel said the same at SMX Munich about Copilot. Google's official line: "Structured data is critical for modern search features because it is efficient, precise, and easy for machines to process." ChatGPT separately confirmed it uses structured data to determine which products surface in its results.

That is the confirmed layer. Now the tension: Google's own Search Central documentation also states, "You don't need to create new machine readable files, AI text files, or markup to appear in these features. There's also no special schema.org structured data that you need to add."

Both statements are true. There is no AI-specific schema. Standard schema.org types help. Anyone selling you "AI schema markup" or "LLM-optimised properties" is putting a new invoice line on an existing @type.

The contested correlation data

Third-party studies disagree, sometimes loudly. SE Ranking found roughly 71% of pages cited by ChatGPT and 65% of pages cited by Google AI Mode include structured data. Relixir reported FAQPage schema correlated with a 41% citation rate versus 15% without. Wellows claims a 73% selection boost for structured pages.

Against that, a December 2024 Search Atlas study found no meaningful correlation between schema coverage and AI citation rates. SE Ranking's own LLM-specific analysis showed a much smaller lift (4.9 versus 4.4) than the headline numbers suggest.

The honest read: schema is necessary but not sufficient. A February 2024 Nature Communications study found LLMs extract information more accurately from structured fields than from unstructured prose, which gives the mechanism a plausible foundation. But topical authority, semantic clarity and source credibility still dominate. If your content is thin, no amount of JSON-LD will rescue it.

"Sites with comprehensive schema didn't consistently outperform sites with minimal or no schema markup."

Search Atlas, December 2024

This matters because the recent divergence between AI Mode citations and top-10 organic results (down to 17–54% overlap in early 2026 from 76% in 2025, per Ahrefs analysis of 863,000 SERPs) means AI Mode is building its own citation graph. In that graph, machine-readable entity signals carry more relative weight than they do in classical organic ranking. We covered the broader topic in our piece on generative engine optimisation and AI citation, which sets out the wider playbook.

Entity disambiguation: the highest-leverage implementation most teams skip

Most guides stop at rich snippets. The bigger prize sits one layer deeper.

Organization and Person schema with sameAs identifiers point Google's Knowledge Graph at the same entity across the web. Wikidata is the strongest target because it is a primary input to the Knowledge Graph itself. LinkedIn company pages, Crunchbase profiles and official business registers are secondary verification sources. The more identifiers you provide, and the more those sources agree on the basic facts (legal name, founding date, registered address, founders), the higher the entity confidence score.

This matters for AI citation because Google's Gemini-powered AI Mode reportedly uses schema to verify claims and assess source credibility during answer synthesis. If Gemini cannot confidently resolve who you are, you become a riskier source to cite, regardless of how strong your content is.

The second high-leverage property is knowsAbout. List the topics, industries and subject areas an organisation or author has genuine expertise in. This builds the topical authority signal that AI Mode uses when selecting sources for specific query categories. It pairs naturally with how we approach topical authority with content clusters and reinforces the wider work on E-E-A-T and building authority for search.

json

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Acme Analytics",
  "url": "https://acmeanalytics.com",
  "knowsAbout": [
    "Customer data platforms",
    "Marketing attribution",
    "Privacy-safe analytics"
  ],
  "sameAs": [
    "https://www.wikidata.org/wiki/Q...",
    "https://www.linkedin.com/company/acmeanalytics",
    "https://www.crunchbase.com/organization/acme-analytics"
  ]
}

Person schema does the same job for individuals. Mark up author credentials, affiliations and knowsAbout. AI platforms weigh author expertise heavily for YMYL queries (health, finance, legal). Person schema turns inferred expertise into declared expertise. That is part of what our SEO services address when we audit a content estate, because most sites have author boxes that say nothing machine-readable about who is writing.
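As a rough sketch, Person markup can be built from the same author record that feeds the visible author box. The record's field names, the placeholder author and the example.com URLs are all hypothetical. The schema.org properties used (jobTitle, worksFor, knowsAbout, sameAs, hasCredential) are standard.

```python
import json


def person_jsonld(author):
    """Build a Person JSON-LD dict from an author record (field names are hypothetical)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": author["name"],
        "jobTitle": author["job_title"],
        "worksFor": {"@type": "Organization", "name": author["employer"]},
        "knowsAbout": list(author["topics"]),
        "sameAs": list(author["profiles"]),
    }
    # Only declare credentials that the visible author page also states.
    if author.get("credentials"):
        data["hasCredential"] = [
            {"@type": "EducationalOccupationalCredential", "name": name}
            for name in author["credentials"]
        ]
    return data


# Placeholder author, not a real person.
author = {
    "name": "Jane Example",
    "job_title": "Head of Analytics",
    "employer": "Acme Analytics",
    "topics": ["Marketing attribution", "Privacy-safe analytics"],
    "profiles": ["https://www.linkedin.com/in/jane-example"],
    "credentials": ["Example Analytics Certification"],
}

print(json.dumps(person_jsonld(author), indent=2))
```

Authors with no credentials on record simply get no hasCredential key, which keeps the markup in line with what the page actually shows.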

Implementation mechanics and the failure modes that cancel your work

Mark up only what is visible on the page. Mismatched structured data (Product schema says one price, the page shows another) is a manual action risk and a direct reason for an LLM to cite the wrong figure.

Schema drift

Schema drift is the 2026 version of broken markup. Visible content updates; JSON-LD does not. A product price changes in the CMS but the hardcoded JSON-LD in a template still shows last month's number. An author leaves but their byline lives on in old Article schema. AI-generated page updates and dynamic CMS content make drift more common than ever.

The fix is to generate JSON-LD from the same data source as the visible content. If the price on the page comes from a product.price field, the JSON-LD offers.price should read the same field. Never duplicate values into a hand-edited script tag.
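A minimal sketch of that single-source rule, assuming a hypothetical product record with name, price and currency fields. The function names and the sample product are illustrative, not any particular CMS's API. Both the visible price and offers.price are formatted from the same field, so they cannot disagree.

```python
import html
import json
from decimal import Decimal


def product_jsonld(product):
    """Build Product JSON-LD from the product record itself."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "offers": {
            "@type": "Offer",
            "price": f'{product["price"]:.2f}',
            "priceCurrency": product["currency"],
        },
    }


def render_product_page(product):
    """Render the visible price and the JSON-LD from one record."""
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    payload = json.dumps(product_jsonld(product)).replace("</", "<\\/")
    return (
        f'<h1>{html.escape(product["name"])}</h1>\n'
        f'<p class="price">{product["price"]:.2f} {html.escape(product["currency"])}</p>\n'
        f'<script type="application/ld+json">{payload}</script>'
    )


# Hypothetical product record standing in for a CMS row.
product = {"name": "Example Kettle", "price": Decimal("24.50"), "currency": "GBP"}
print(render_product_page(product))
```

Change product["price"] once and both the paragraph and the script tag update on the next render, which is the whole point.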

Nesting errors and validation

Schema types have parent-child relationships, and the Rich Results Test will sometimes parse the parts without warning that they are disconnected. A Review should sit inside a Product. An Offer nests inside a Product, not alongside it. If they float as siblings, the review stars never attach and pricing never resolves.
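The difference between nested and floating markup can be sketched as a small check. CHILD_TYPES and floating_children are assumptions for illustration, and the check only inspects top-level nodes, so it does not follow @id references inside an @graph.

```python
# Types that, per the paragraph above, belong inside a Product rather than beside it.
CHILD_TYPES = {"Offer", "Review", "AggregateRating"}


def floating_children(nodes):
    """Return the @type of each top-level node that should be nested in a Product."""
    floating = []
    for node in nodes:
        declared = node.get("@type")
        types = declared if isinstance(declared, list) else [declared]
        floating.extend(t for t in types if t in CHILD_TYPES)
    return floating


# Correct: Offer and Review hang off the Product through offers and review.
nested = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Kettle",
    "offers": {"@type": "Offer", "price": "24.50", "priceCurrency": "GBP"},
    "review": {
        "@type": "Review",
        "author": {"@type": "Person", "name": "Jane Example"},
        "reviewRating": {"@type": "Rating", "ratingValue": "4"},
    },
}

# Broken: the same three nodes emitted as disconnected siblings.
siblings = [
    {"@type": "Product", "name": "Example Kettle"},
    {"@type": "Offer", "price": "24.50", "priceCurrency": "GBP"},
    {"@type": "Review", "reviewRating": {"@type": "Rating", "ratingValue": "4"}},
]

print(floating_children([nested]))  # []
print(floating_children(siblings))  # ['Offer', 'Review']
```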

Two tools, both free, both essential:

Google's Rich Results Test confirms eligibility for specific rich result types.
Schema Markup Validator at validator.schema.org checks schema.org correctness, including relationships.

Use both. The Rich Results Test tells you what Google will display. The Schema Markup Validator tells you whether the markup is structurally correct in the first place.

Expect two to twelve weeks from clean implementation to rich result display, assuming the page is already indexed. AI citation effects are harder to measure and slower to attribute, which is part of why this area attracts so much vendor noise.

A practical audit: two jobs, one implementation

Treat schema as doing two jobs and audit for both.

Job one, display: which pages are eligible for which rich result, and which retired types are still sitting in templates? The output is a list of additions (Product on category pages that lack it, Article on editorial posts missing dateModified, BreadcrumbList everywhere) and a list of removals (HowTo, non-gov FAQPage, legacy review types).

Job two, trust: is your Organization schema present on every page, with a complete sameAs array pointing to Wikidata, LinkedIn and at least one official register? Is knowsAbout populated honestly? Do your author pages carry Person schema with credentials? This is the work that compounds. The Knowledge Graph entry built today informs every AI answer about you tomorrow.
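Part of the trust audit can be automated. The sketch below is one possible check, not a standard tool: trust_gaps and REQUIRED_HOSTS are hypothetical names, and the hostnames are assumptions about how Wikidata and LinkedIn URLs usually look. An official register cannot be recognised by hostname alone, so that part stays a manual review.

```python
from urllib.parse import urlparse

# Assumed hostnames for the two identifiers that can be checked mechanically.
REQUIRED_HOSTS = {"www.wikidata.org", "www.linkedin.com"}


def trust_gaps(org):
    """List what an Organization node is missing for the trust job."""
    gaps = []
    if org.get("@type") != "Organization":
        gaps.append("not an Organization node")
    same_as = org.get("sameAs") or []
    if isinstance(same_as, str):
        same_as = [same_as]  # sameAs may be a single URL rather than a list
    hosts = {urlparse(url).netloc.lower() for url in same_as}
    for host in sorted(REQUIRED_HOSTS - hosts):
        gaps.append(f"sameAs missing {host}")
    if not org.get("knowsAbout"):
        gaps.append("knowsAbout is empty")
    return gaps


org = {
    "@type": "Organization",
    "name": "Acme Analytics",
    "sameAs": ["https://www.linkedin.com/company/acmeanalytics"],
}
print(trust_gaps(org))  # ['sameAs missing www.wikidata.org', 'knowsAbout is empty']
```

An empty list means the mechanical checks pass. It does not mean the identifiers agree with each other on the basic facts, which still needs a human read.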

The adoption gap is the opportunity. Only around 12.4% of registered domains have implemented schema.org markup. The work is not glamorous, but the teams that do it cleanly are positioning themselves for display lifts now and citation share as AI Mode's own graph matures.

If you want a second pair of eyes on what you currently have in place and what is missing, request a free website audit and we will run the structured data layer alongside the rest of the site.

FAQ

Does structured data directly improve Google rankings?

No. Google has stated explicitly that structured data is not a ranking factor. It can earn you rich result formats that improve click-through, and it helps machines interpret your content for AI features, but the ranking signal itself does not change because you added markup.

Should I remove FAQPage schema now that the rich result is gone?

Not necessarily. As of May 2026 the FAQ rich result no longer appears in Google Search outside government and health sites, but the markup still gives LLMs a clean question-and-answer structure to parse. Keep it where the page genuinely contains Q&A content, remove it where you only added it for the snippet.

Is there a special schema type that helps with AI Overviews?

No. There is no AIPage type, no LLMOptimized property and no AI Overview extension. Google's own documentation says you do not need new markup for AI features. Anyone selling AI schema markup is renaming standard schema.org types.

How long does it take for structured data to show as a rich result?

Industry analysis suggests two to twelve weeks from clean implementation to display, assuming the page is already indexed and the markup validates without errors. Use the Rich Results Test to confirm eligibility, then monitor the Search Console enhancements report.

Ali Azlan

Founder, Devonic Web

Ali leads delivery at Devonic Web, building production sites and apps for clients worldwide.
