
Schema Markup for AI Citations: The Only Types That Actually Work in 2026

The four schema types that drive measurable AI citation lift, why generic schema underperforms no schema at all, and how to implement each type with complete attribute population.

Published: April 15, 2026
Reading time: 17 min
Tags: GEO, Schema Markup, AI Search, JSON-LD, Technical SEO, FAQPage, Article Schema

Schema markup for AI citations in 2026 is different from what most SEO blogs still teach. Current research shows that AI engines tokenize JSON-LD as raw text during live retrieval rather than parsing it as structured data, and that generic or minimal schema actually reduces citation rates compared to pages with no schema at all. This guide lays out the four schema types that actually drive AI citation lift in 2026, how to implement each one with complete attribute population, and the specific mistakes that cost citations when teams cut corners. It is written for SaaS teams who want evidence-based schema decisions, not another "add FAQPage to everything" checklist.

Introduction

Half of the schema advice online is outdated. The other half is subtly wrong.

The outdated half still talks about schema as if it were primarily a Google rich-results play. Add FAQ schema. Get stars in the SERP. Done. That framing is still correct for parts of traditional SEO, but it misses the mechanical reality of how AI engines actually use structured data in 2026. The subtly wrong half tells you to add as many schema types as possible. More schema is better, the logic goes. The data says otherwise.

The SearchVIU cross-platform test in October 2025 covered eight products with pricing information distributed across visible HTML, JavaScript-rendered content, JSON-LD, Microdata, and RDFa. Gemini found four of eight prices (the highest score). ChatGPT found three. Claude found zero. JSON-LD was not extracted by any system during direct fetch. The critical finding: no major AI system parses JSON-LD during live retrieval. The schema is tokenized as text.

The Growth Marshal study (n=730 citations, 1,006 pages, 75 queries, DOI 10.5281/zenodo.18728697) went further. Pages with attribute-rich, fully populated Product plus Review schema achieved a 61.7% citation rate. Pages with no schema at all achieved 59.8%. Pages with generic, minimal Article plus Organization schema achieved 41.6%. Half-implemented schema underperforms no schema at all.

This guide walks through what actually works in 2026. The Schema Citation Tier framework separates the schema types that drive citations, the types that provide indirect signals, and the types Google has deprecated and that no longer help. Each tier has its own implementation detail, complete with templates.

Quick Summary

| Item | Summary |
| --- | --- |
| The myth | AI engines parse JSON-LD directly during live retrieval. |
| The reality | No major AI system extracts JSON-LD during live fetch. The schema value is indirect, through search engine indexing. |
| The data on impact | Attribute-rich schema: 61.7% citation rate. No schema: 59.8%. Generic minimal schema: 41.6%. |
| The Schema Citation Tier framework | Tier 1 (direct impact): FAQPage, Article plus Person, Product plus Review, HowTo. Tier 2 (indirect signals): Organization, BreadcrumbList, LocalBusiness. Tier 3 (deprecated): Course, ClaimReview, VehicleListing. |
| The rule | If you cannot fully populate a schema type, don't add it at all. Half-done is worse than not done. |

Do AI Systems Actually Read Schema Markup?

This is where most schema advice for AI goes sideways. The short answer is no, not during live retrieval. The longer answer has more nuance.

SearchVIU's test was definitive on the live-retrieval question. Five AI systems (ChatGPT, Perplexity, Gemini, Claude, Google AI Mode) were tested on whether they extract JSON-LD data during direct fetch. None of them did. Every system that found prices found them in visible HTML or in JavaScript-rendered content (Gemini only). The JSON-LD was tokenized as raw text and ignored.

But schema still matters, for two indirect reasons.

First, Google's Knowledge Graph does read JSON-LD during indexing. Every AI engine that uses Google's search API (Gemini, Google AI Overviews, any product retrieving through Google) benefits from better indexed content, which means schema still drives part of the retrieval funnel. Frase has reported that pages with FAQPage schema appear in Google AI Overviews 3.2 times more often than pages without, and the mechanism is this indirect indexing path.

Second, the visible content that sits alongside well-structured schema tends to be extracted more often. A page with FAQPage schema usually has visible Q&A content. That visible content is what AI retrieval actually pulls. The schema is not the extraction target. It is a marker that tends to travel with extractable content.

Practical implication: schema is valuable, but only when it is paired with visible content in the same format. Hidden schema with no corresponding on-page content is worse than nothing, because it signals inconsistency to Google's indexer.
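One way to check the pairing described above is to compare the questions declared in a page's FAQPage JSON-LD against the text a reader can actually see. The sketch below is a minimal illustration using only the Python standard library. The names `PageScanner`, `questions_missing_from_page`, and the sample page about a product called "Acme" are hypothetical. It works on static HTML only, so it will not notice content hidden by CSS or injected by JavaScript.

```python
import json
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects JSON-LD script bodies and visible text separately."""

    def __init__(self):
        super().__init__()
        self.jsonld_blocks = []  # raw text of each application/ld+json script
        self.visible_text = []   # text found outside <script> and <style>
        self._in_jsonld = False
        self._skip_depth = 0     # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
            if tag == "script" and dict(attrs).get("type") == "application/ld+json":
                self._in_jsonld = True
                self.jsonld_blocks.append("")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.jsonld_blocks[-1] += data
        elif self._skip_depth == 0:
            self.visible_text.append(data)


def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore layout."""
    return " ".join(text.lower().split())


def questions_missing_from_page(html):
    """Return FAQPage question names that do not appear in the visible text."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    visible = normalize(" ".join(scanner.visible_text))
    missing = []
    for block in scanner.jsonld_blocks:
        data = json.loads(block)
        # Assumption: each block is a single JSON object, not a list or @graph.
        if not isinstance(data, dict) or data.get("@type") != "FAQPage":
            continue
        for question in data.get("mainEntity", []):
            if normalize(question["name"]) not in visible:
                missing.append(question["name"])
    return missing


# Hypothetical page: one question is visible, one exists only in the schema.
SAMPLE_HTML = """
<html><body>
<h2>How much does Acme cost?</h2>
<p>Acme starts at 49 USD per month.</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": [
  {"@type": "Question", "name": "How much does Acme cost?",
   "acceptedAnswer": {"@type": "Answer", "text": "Acme starts at 49 USD per month."}},
  {"@type": "Question", "name": "Is Acme a good fit for agencies?",
   "acceptedAnswer": {"@type": "Answer", "text": "Yes, for small teams."}}
]}
</script>
</body></html>
"""

print(questions_missing_from_page(SAMPLE_HTML))
```

Any question the function returns is declared in the schema but absent from the page, which is the inconsistency worth fixing before deployment.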

Which Schema Types Drive the Most AI Citations?

The Schema Citation Tier is built on three years of evidence: the Growth Marshal 2026 study on citation-rate differentials, Frase research on FAQPage impact in Google AI Overviews, SearchVIU's live-fetch test, and our own client audits across 40 B2B SaaS deployments.

Tier 1: Direct Impact (Always Implement)

These four schema types produce measurable citation lift when fully populated. Partial implementation is worse than nothing.

FAQPage. The single highest-impact schema type for AI citation. Frase research shows a 3.2x lift in Google AI Overviews for pages with complete FAQPage schema. The mechanism is indirect, via Knowledge Graph retrieval, but the correlation is consistent. Implementation rule: 6 to 10 Q&A entries per page, each answer at least 40 words, with natural question phrasing that matches how users would type the query.

Article plus Person. Required for blog posts. The Article schema provides datePublished, dateModified, author, publisher, and mainEntityOfPage signals. The Person schema under "author" needs a complete sameAs array pointing to LinkedIn, Crunchbase, personal website, Wikidata (if applicable), and any professional profiles. A thin Person schema with just a name underperforms having no Person schema at all. The Growth Marshal data is clear: the gap between attribute-rich and generic schema is about 20 points (61.7% versus 41.6%).

Product plus Review. The schema that produced the 61.7% citation rate in the Growth Marshal study. Fully populated means SKU, brand, pricing, availability, aggregateRating, review count, and individual review snippets. Product schema with just a name and image underperforms no schema at all. For B2B SaaS specifically, Product plus Review on your pricing page and product tour pages is the single highest-leverage schema investment.

HowTo. For step-by-step content. Google's AI answer structure often mirrors HowTo schema's step format directly, which means well-implemented HowTo schema produces extractable step-by-step citations. Implementation rule: every step needs a name, text, and optionally an image. Time, cost, and supply fields add value when relevant.

Tier 2: Indirect Signals (Foundation Layer)

These schema types do not produce direct citation lift but provide entity signals that help AI systems understand your brand.

Organization. Required for home page. Full sameAs array linking to LinkedIn, all review platforms (Clutch, G2, Trustpilot), Crunchbase, and every social profile. The Organization schema is the entity anchor. AI models use it to disambiguate your brand from similarly-named entities and to build consistent representation across queries.

BreadcrumbList. Every page. Simple, structural. Helps both traditional search and AI retrieval understand site hierarchy.

LocalBusiness. For agencies and services with physical presence. Less relevant for pure SaaS brands, highly relevant for agencies and consultancies like Red-engage.

Person (standalone). For team pages and executive profiles. Same attribute-rich rule applies. Thin Person schema hurts more than it helps.
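The Tier 2 types are simple enough to generate from data you already hold. The sketch below builds Organization and BreadcrumbList entities as plain Python dictionaries and serializes them to JSON-LD. The function names, the "Example Co" organization, and every URL are placeholders for illustration, not values from any real deployment.

```python
import json


def organization_schema(name, url, logo_url, same_as):
    """Build an Organization entity; refuses an empty sameAs list."""
    if not same_as:
        raise ValueError("sameAs is empty; a thin Organization entity is not worth shipping")
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo_url,
        "sameAs": list(same_as),
    }


def breadcrumb_schema(trail):
    """trail is an ordered list of (name, url) pairs from the home page down."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Placeholder values; swap in your real profiles and page hierarchy.
org = organization_schema(
    name="Example Co",
    url="https://example.com",
    logo_url="https://example.com/logo.png",
    same_as=[
        "https://www.linkedin.com/company/example",
        "https://www.crunchbase.com/organization/example",
    ],
)
crumbs = breadcrumb_schema([
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog/"),
    ("Schema Markup for AI Citations", "https://example.com/blog/schema-markup/"),
])

print(json.dumps(org, indent=2))
print(json.dumps(crumbs, indent=2))
```

Generating the breadcrumb positions from the trail order avoids hand-numbering mistakes when the same template is reused site-wide.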

Tier 3: Deprecated (Do Not Implement)

As of January 2026, Google deprecated several schema types for rich results generation. These no longer produce search rich results and provide no measurable AI citation lift.

Course, ClaimReview, LearningVideo, SpecialAnnouncement, VehicleListing, PracticeProblems. If your site has any of these, remove them. They add JSON-LD weight without producing any measurable benefit, and in some cases may confuse Google's Knowledge Graph.

Why Does Attribute-Rich Schema Outperform Minimal Schema?

This is the counter-intuitive finding from the Growth Marshal study, and it changes how schema should be planned.

The study tested 730 AI citations across 1,006 pages and 75 queries, measuring citation rate based on schema implementation depth. The three implementations compared were:

| Implementation | Citation rate |
| --- | --- |
| Attribute-rich schema (fully populated Product + Review, or Article + Person with complete sameAs) | 61.7% |
| No schema at all | 59.8% (baseline) |
| Generic or minimal schema (Article with only headline and datePublished, or Organization with just a name) | 41.6% |
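The size of each gap is easier to judge as arithmetic against the no-schema baseline. The short sketch below uses only the three rates reported above. The helper names `point_gap` and `relative_change` are illustrative.

```python
# Citation rates (percent) as reported in the Growth Marshal comparison above.
RATES = {
    "attribute_rich": 61.7,
    "no_schema": 59.8,
    "generic_minimal": 41.6,
}


def point_gap(rate, baseline):
    """Difference in percentage points."""
    return round(rate - baseline, 1)


def relative_change(rate, baseline):
    """Change relative to the baseline, in percent."""
    return round((rate - baseline) / baseline * 100, 1)


baseline = RATES["no_schema"]
for label, rate in RATES.items():
    print(label, point_gap(rate, baseline), relative_change(rate, baseline))

# Gap between the best and worst implementation, in percentage points.
print(point_gap(RATES["attribute_rich"], RATES["generic_minimal"]))
```

On these figures, full population gains 1.9 points over no schema, while generic schema loses 18.2 points, roughly a 30% relative drop. The downside of partial schema is far larger than the upside of complete schema.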

The hypothesis for why generic schema underperforms: Google's Knowledge Graph signals trust based on the completeness of the structured data. A partially populated schema signals that the content is either amateur or autogenerated, which lowers the Knowledge Graph's confidence in the entity. That reduced confidence then flows through to retrieval systems that query the Knowledge Graph.

The practical rule: if you cannot populate every recommended field for a schema type, do not add it. Partial schema is worse than nothing.

For B2B SaaS, this means a triage exercise. Go through your current schema implementation. For each schema type, ask: are all recommended fields populated with accurate, specific data? If yes, keep and enhance. If no, either populate it fully or remove it.
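The triage can be scripted once you decide which fields count as recommended. The sketch below is one possible version: the field lists are drawn from this guide's own descriptions of complete Article, Person, Product, and HowTo schema, and are an assumption rather than an official Schema.org or Google requirement. The function names are hypothetical.

```python
# Assumed field lists, based on this guide's descriptions of "complete" schema.
RECOMMENDED_FIELDS = {
    "Article": ["headline", "description", "image", "datePublished", "dateModified",
                "mainEntityOfPage", "publisher", "author", "keywords"],
    "Person": ["name", "givenName", "familyName", "image", "jobTitle",
               "worksFor", "sameAs"],
    "Product": ["name", "description", "image", "sku", "brand", "offers",
                "aggregateRating", "review"],
    "HowTo": ["name", "step"],
}


def is_populated(value):
    """Treat None, blank strings, and empty lists or dicts as unpopulated."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, (list, dict)):
        return len(value) > 0
    return True


def triage(entity):
    """Return (decision, missing_fields) for one parsed JSON-LD entity."""
    expected = RECOMMENDED_FIELDS.get(entity.get("@type"))
    if expected is None:
        return ("unknown", [])
    missing = [field for field in expected if not is_populated(entity.get(field))]
    if missing:
        return ("populate_or_remove", missing)
    return ("keep", [])


# A generic Article of the kind the study flags as underperforming.
thin_article = {"@type": "Article", "headline": "Title", "datePublished": "2026-04-20"}
print(triage(thin_article))
```

The check only confirms that a field is present and non-empty. Whether the value is accurate and specific still needs a human review.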

How Should FAQPage Schema Be Implemented for Maximum Extraction?

FAQPage is worth calling out separately because it is the most over-implemented and simultaneously the most under-optimized schema type in B2B SaaS.

The basic implementation pattern most teams use: add 3 to 5 short questions at the bottom of a page, each with a 30-word answer, wrap in FAQPage JSON-LD, ship. This produces minimal lift.

The optimized pattern: 6 to 10 questions per page, each question phrased in natural conversational language (how a user would actually ask it), each answer at least 40 words long and self-contained (readable independently of the rest of the page), with visible Q&A content that mirrors the schema exactly. This is what the 3.2x lift in Frase research is associated with.

Specific implementation details that move the needle:

Question phrasing. Match how users would type the query into ChatGPT or Google. "What does [product] do?" beats "Description." "How much does [product] cost?" beats "Pricing." "Is [product] a good fit for [use case]?" beats "Best fit."

Answer structure. Lead with a direct answer in the first sentence. The first sentence should work as a standalone citation. Subsequent sentences add context, examples, or links. Keep total answer length between 60 and 150 words for maximum extractability.

Visible on-page Q&A. The FAQ content must be visible on the rendered page, not hidden inside collapsed accordions. AI systems extract visible content. Hidden content does not help.

No keyword stuffing in answers. AI models detect and penalize unnatural keyword density. Write answers the way you would write them for a human.
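The numeric parts of these rules can be checked automatically before a page ships. The sketch below audits a parsed FAQPage object against the thresholds given above: 6 to 10 questions, a 40-word minimum per answer, and a 60 to 150 word target. The function name `audit_faq` and the question-mark test for natural phrasing are simplifying assumptions. It cannot judge whether the wording sounds natural or keyword-stuffed.

```python
def word_count(text):
    return len(text.split())


def audit_faq(faq, min_questions=6, max_questions=10, min_words=40, target=(60, 150)):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    questions = faq.get("mainEntity", [])
    if not min_questions <= len(questions) <= max_questions:
        problems.append(
            f"{len(questions)} questions; expected {min_questions} to {max_questions}"
        )
    for question in questions:
        name = question.get("name", "")
        answer = question.get("acceptedAnswer", {}).get("text", "")
        words = word_count(answer)
        # Crude proxy for natural phrasing: labels like "Pricing" lack a question mark.
        if not name.strip().endswith("?"):
            problems.append(f"'{name}' is not phrased as a question")
        if words < min_words:
            problems.append(f"'{name}' answer has {words} words; minimum is {min_words}")
        elif not target[0] <= words <= target[1]:
            problems.append(
                f"'{name}' answer has {words} words; target is {target[0]} to {target[1]}"
            )
    return problems


# A typical under-optimized FAQ: one label-style question with a short answer.
weak_faq = {
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "Pricing",
         "acceptedAnswer": {"@type": "Answer", "text": "It costs money."}},
    ],
}
for problem in audit_faq(weak_faq):
    print(problem)
```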

What Does a Complete Article + Person Schema Look Like?

This is the schema pairing that most B2B SaaS sites either skip entirely or implement badly. Complete implementation requires commitment.

The Article side needs: headline, description, image (1200 x 630 minimum), datePublished, dateModified, mainEntityOfPage, publisher (Organization), author (Person), and keywords. Every field populated accurately.

The Person side is where most sites cut corners. A complete Person schema for a blog author needs: name, givenName, familyName, image, jobTitle, worksFor (Organization reference), and a sameAs array. The sameAs array is the critical piece. It should contain URLs to LinkedIn, any personal site, Crunchbase or similar professional profile, Twitter or X if active, and Wikipedia or Wikidata if the person is notable enough to have a presence there.

The Growth Marshal study implication: if the Person schema has fewer than 3 sameAs entries, consider whether it's better to leave the Person off entirely. A thin author signal hurts more than no author signal.

For B2B SaaS teams where the author is a founder or executive who does not yet have a LinkedIn Pulse cadence and cross-platform presence, the first move is building that presence before adding thin Person schema. The schema only works when the underlying entity signals are real.
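The three-entry threshold can be enforced at template level, so that a thin author is dropped instead of published. The sketch below is one way to do that, assuming the Article is held as a Python dictionary before serialization. The function names and the threshold default are illustrative, and the URLs are placeholders.

```python
def distinct_same_as(person):
    """Return the unique, non-empty sameAs URLs for a Person entity."""
    urls = person.get("sameAs", [])
    if isinstance(urls, str):
        urls = [urls]
    # Trailing slashes are stripped so the same profile is not counted twice.
    return sorted({url.strip().rstrip("/") for url in urls if url and url.strip()})


def drop_thin_author(article, min_profiles=3):
    """Return the article without its author when the author has too few profiles."""
    author = article.get("author")
    if isinstance(author, dict) and len(distinct_same_as(author)) < min_profiles:
        return {key: value for key, value in article.items() if key != "author"}
    return article


# Placeholder author with only two distinct profiles (the LinkedIn URL repeats).
article = {
    "@type": "Article",
    "headline": "The exact post title",
    "author": {
        "@type": "Person",
        "name": "Full Name",
        "sameAs": [
            "https://www.linkedin.com/in/slug",
            "https://www.linkedin.com/in/slug/",
            "https://authorpersonalsite.com",
        ],
    },
}
print("author" in drop_thin_author(article))
```

Treat the output as a prompt to build the missing profiles, not as a permanent fix. The goal is an author entity that passes the check.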

When Should You Use HowTo Schema for SaaS Content?

HowTo schema fits a specific type of content and should not be overused.

The ideal candidates: tutorial content with clear step-by-step structure, implementation guides with ordered deliverables, setup documentation. HowTo requires each step to have a meaningful name and descriptive text. If you can write the content without numbered steps, HowTo is not the right schema.

Poor candidates: marketing content, feature descriptions, comparison articles. Do not force HowTo schema onto content that is narrative in nature. Google is increasingly skeptical of HowTo schema on non-tutorial content, and over-use can lead to a manual action.

When HowTo fits, the citation payoff is real. Google's AI Overviews specifically favor step-format answers for "how do I..." queries, and well-structured HowTo schema surfaces directly as a step-by-step answer in the AI response.

Which Schema Types Should You Not Waste Time On?

As of January 2026, the following schema types have been deprecated by Google for rich results generation.

| Schema type | Status | Why |
| --- | --- | --- |
| Course | Deprecated | No longer generates rich results |
| ClaimReview | Deprecated | Limited use cases, high misuse |
| LearningVideo | Deprecated | Consolidated into other video schemas |
| SpecialAnnouncement | Deprecated | COVID-era schema, no longer maintained |
| VehicleListing | Deprecated | Vertical-specific, low adoption |
| PracticeProblems | Deprecated | Narrow use case |

If your site has any of these, remove them. They add weight to your HTML without producing benefit.
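Finding leftover deprecated markup by hand is tedious on a large site. The sketch below walks a parsed JSON-LD document, including nested entities and `@graph` arrays, and reports any type from the list above. The type names are copied from this guide's table. Confirm them against the `@type` values actually used on your pages, since a site's markup may use different names. The function names and the sample document are hypothetical.

```python
import json

# Names taken from the table above; verify against your own markup.
DEPRECATED_TYPES = {
    "Course", "ClaimReview", "LearningVideo",
    "SpecialAnnouncement", "VehicleListing", "PracticeProblems",
}


def find_types(node):
    """Yield every @type value found anywhere in a parsed JSON-LD document."""
    if isinstance(node, dict):
        declared = node.get("@type", [])
        if isinstance(declared, str):
            declared = [declared]
        yield from declared
        for value in node.values():
            yield from find_types(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_types(item)


def deprecated_types_in(jsonld_text):
    """Return the deprecated types present in one JSON-LD script body, sorted."""
    found = set(find_types(json.loads(jsonld_text)))
    return sorted(found & DEPRECATED_TYPES)


# Hypothetical script body with a Course entity nested inside @graph.
SAMPLE = """
{"@context": "https://schema.org", "@graph": [
  {"@type": "Organization", "name": "Example Co"},
  {"@type": "Course", "name": "Onboarding 101",
   "provider": {"@type": "Organization", "name": "Example Co"}}
]}
"""
print(deprecated_types_in(SAMPLE))
```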

Additionally, the Speakable schema is worth flagging. It is technically in beta. Its use is limited to English-language news content in the US. There is no published evidence that Speakable improves AI citations. Skip it unless you are a news publisher with direct guidance from Google about implementing it.

What Does Each Schema Type Actually Look Like in JSON-LD?

Templates matter. Here are the four Tier 1 types in working JSON-LD format, each with the fields that need to be populated for the schema to drive real citation lift.

FAQPage (for a strategic content page):

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How does X actually work?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A complete 60 to 120 word answer written as standalone citation-ready text."
      }
    }
  ]
}
```

The implementation rule: the question phrased naturally, the answer written to work as extractable standalone text, and the visible page content mirroring the schema exactly.

Article plus Person (for a blog post):

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "The exact post title",
  "description": "The excerpt, 120 to 160 words.",
  "image": "https://yourdomain.com/path/to/featured-image.webp",
  "datePublished": "2026-04-20T09:00:00Z",
  "dateModified": "2026-05-15T14:30:00Z",
  "mainEntityOfPage": "https://yourdomain.com/blog/slug",
  "keywords": ["tag 1", "tag 2"],
  "author": {
    "@type": "Person",
    "name": "Full Name",
    "givenName": "First",
    "familyName": "Last",
    "image": "https://yourdomain.com/team/author-photo.webp",
    "jobTitle": "Specific title",
    "worksFor": {
      "@type": "Organization",
      "name": "Your Organization"
    },
    "sameAs": [
      "https://www.linkedin.com/in/slug",
      "https://crunchbase.com/person/slug",
      "https://authorpersonalsite.com",
      "https://twitter.com/handle"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "Your Organization",
    "logo": {
      "@type": "ImageObject",
      "url": "https://yourdomain.com/logo.png"
    }
  }
}
```

The sameAs array is the most important field. Skip LinkedIn and the schema underperforms. Skip Crunchbase and the entity signal is thin. Include all four and the author entity is indexable.

Product plus Review (for pricing or product pages):

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Your Product",
  "sku": "YOUR-SKU",
  "description": "What the product does, 60 to 120 words, written in the same voice as the page.",
  "image": "https://yourdomain.com/product-image.webp",
  "brand": {"@type": "Brand", "name": "Your Brand"},
  "offers": {
    "@type": "Offer",
    "price": "49.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "url": "https://yourdomain.com/pricing"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.8",
    "reviewCount": "127",
    "bestRating": "5"
  },
  "review": [
    {
      "@type": "Review",
      "reviewRating": {"@type": "Rating", "ratingValue": "5", "bestRating": "5"},
      "author": {"@type": "Person", "name": "Reviewer Name"},
      "reviewBody": "Specific review text, 40 to 80 words."
    }
  ]
}
```

The aggregateRating and reviewCount should match real data from Clutch, G2, Trustpilot, or your own review system. Accuracy matters. Inflated ratings get flagged and can trigger a manual action.

HowTo (for tutorial content):

```json
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to do X",
  "totalTime": "PT30M",
  "estimatedCost": {"@type": "MonetaryAmount", "currency": "USD", "value": "0"},
  "step": [
    {
      "@type": "HowToStep",
      "position": 1,
      "name": "Step name",
      "text": "Step description, 30 to 60 words."
    }
  ]
}
```

Every step with a name and meaningful text. Generic step names ("Step 1," "Step 2") signal low quality and reduce rich result eligibility.

How Should B2B SaaS Teams Prioritize Schema Work?

If you are starting from zero or near-zero schema implementation, the priority order is:

First week: Organization schema on the home page. Complete, with full sameAs array. This is the entity anchor for everything else.

Second week: Article plus Person schema on every blog post. Full attribute population. If author Person schemas cannot be fully populated (missing LinkedIn, missing Crunchbase, missing personal site), the prerequisite work is building those entity presences before deploying the schema.

Third week: FAQPage schema on 5 to 7 strategic pages. Home, pricing, primary product, two or three highest-traffic landing pages. 6 to 10 Q&A entries per page, fully populated.

Fourth week: Product plus Review on pricing and product tour pages. If you have review volume on Clutch, G2, or Trustpilot, the aggregateRating and review count should be accurate. If you do not have review volume, defer until you do.

Ongoing: BreadcrumbList on every page. Simple and easy to implement site-wide.

The Schema Citation Tier framework keeps the prioritization clear. Tier 1 types drive measurable lift. Tier 2 types provide foundation signals. Tier 3 types are deprecated. Focus effort accordingly.

Key Takeaways

  • AI engines do not parse JSON-LD during live retrieval. Schema value is indirect, via Google's Knowledge Graph and the visible content that typically sits alongside structured data.
  • Attribute-rich schema (fully populated) drives citation lift. Generic or minimal schema underperforms no schema at all. This is counter-intuitive, but it is what the Growth Marshal data shows.
  • Tier 1 schema types for AI citation: FAQPage, Article plus Person, Product plus Review, HowTo. Implement these fully or not at all.
  • Tier 2 schema types provide entity signals: Organization, BreadcrumbList, LocalBusiness, standalone Person.
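  • Deprecated schema types (Course, ClaimReview, LearningVideo, SpecialAnnouncement, VehicleListing, PracticeProblems) no longer drive meaningful benefit. Remove them. Speakable is still in beta rather than deprecated, but there is no published evidence that it improves AI citations, so skip it unless you are a news publisher.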
  • For B2B SaaS, the schema priority order is Organization first, then Article plus Person, then FAQPage, then Product plus Review, then BreadcrumbList.

Frequently Asked Questions (FAQs)

Does JSON-LD matter if AI engines don't read it during live retrieval?

Yes, indirectly. Google's Knowledge Graph reads JSON-LD during indexing, and most AI engines query Google's search API or a derivative of it during retrieval. Schema helps the Knowledge Graph understand your content, which then flows through to AI retrieval quality. The schema is valuable at the indexing stage, not the retrieval stage.

Should I add every schema type Schema.org lists?

No. Add schema types that have direct relevance to your content and that you can populate fully. Over-implementing generic schemas reduces citation rate compared to pages with no schema. Focus on Tier 1 types (FAQPage, Article plus Person, Product plus Review, HowTo) and ignore the rest unless they are directly relevant.

What tools should I use to validate schema?

Google Rich Results Test for direct validation of rich-result eligibility. Schema.org validator for standards compliance. Both are free. Run both before deploying new schema. For ongoing monitoring, Google Search Console's Enhancements section flags schema errors automatically.

How do I know if my schema is helping citations?

Track baseline citation rate for a prompt set before deploying schema. Deploy. Wait 30 to 60 days for re-indexation and retrieval updates. Re-measure. Peec AI and Bing Webmaster Tools AI Performance are the two main measurement surfaces for this.
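The before-and-after comparison reduces to simple arithmetic once each prompt is recorded as cited or not cited. The sketch below assumes you have already collected those results by hand or from a measurement tool; it does not call any tool's API. The prompts and outcomes shown are made-up illustration data, and the function names are hypothetical.

```python
def citation_rate(results):
    """results maps each prompt to True if the brand was cited in the answer."""
    if not results:
        raise ValueError("no prompts to measure")
    return sum(results.values()) / len(results)


def lift_in_points(before, after):
    """Percentage-point change, using only prompts measured in both runs."""
    shared = before.keys() & after.keys()
    rate_before = citation_rate({prompt: before[prompt] for prompt in shared})
    rate_after = citation_rate({prompt: after[prompt] for prompt in shared})
    return round((rate_after - rate_before) * 100, 1)


# Made-up illustration data: the same four prompts, measured twice.
baseline = {
    "best crm for startups": False,
    "how much does acme cost": True,
    "acme vs competitor": False,
    "what does acme do": False,
}
after_60_days = {
    "best crm for startups": False,
    "how much does acme cost": True,
    "acme vs competitor": True,
    "what does acme do": False,
}
print(lift_in_points(baseline, after_60_days))
```

Restricting the comparison to prompts present in both runs keeps a changed prompt set from masquerading as lift. With a set this small the result is noisy, so a larger prompt set gives a more trustworthy reading.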

Can I have multiple schema types on the same page?

Yes. A blog post page commonly has Article, Person (via author), Organization (via publisher), BreadcrumbList, and FAQPage schema all on the same page. Each one should be fully populated. Multiple fully-populated schemas on a page is fine and often helpful.
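One common way to keep several entities on a page tidy is a single JSON-LD script that lists them under the `@graph` keyword. Separate script blocks per entity are also valid. The sketch below merges already-built entities into one graph. The `combine` helper and the abbreviated placeholder entities are illustrative, and real entities should carry their full field sets.

```python
import json


def combine(entities):
    """Merge standalone entities into one document with a shared @context."""
    graph = []
    for entity in entities:
        # The shared top-level @context replaces each entity's own copy.
        graph.append({key: value for key, value in entity.items() if key != "@context"})
    return {"@context": "https://schema.org", "@graph": graph}


# Abbreviated placeholders; populate every field before deploying.
article = {"@context": "https://schema.org", "@type": "Article",
           "headline": "The exact post title"}
breadcrumbs = {"@context": "https://schema.org", "@type": "BreadcrumbList",
               "itemListElement": []}
faq = {"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}

page_schema = combine([article, breadcrumbs, faq])
print(json.dumps(page_schema, indent=2))
```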

Does Speakable schema help with voice search or AI visibility?

There is no published evidence that Speakable schema improves AI citations or voice search rankings. It remains in beta, limited to English US news content. Skip it unless you are a news publisher and have direct guidance from Google about implementing it.

