How to become an expert in AI search platforms (AMERS)

In this webinar

Your audience isn't searching the way they used to. They're asking AI search platforms like ChatGPT and Perplexity. And when they do, one question becomes dozens. AI search platforms silently generate a series of follow-up questions behind the scenes, and if your content doesn't already answer them, you're invisible.

In this webinar, you'll get a clear checklist of what to look for on your pages, plus a practical path to reduce the risk of your organization becoming invisible as AI models evolve.

Watch the webinar

Press play to watch the webinar on demand.

Poll results

Yes, I understand it well – 0%
I have a rough idea – 64%
I have a vague idea – 27%
No idea – 9%

AI visibility report

Get a free AI visibility report of your content. See where you stand today and get clear recommendations on next steps.

Get your free report

Webinar Q&A

Content and structure

Abbreviations can hurt your AI search visibility if they're not explained. AI language models learn from text, so if your content only uses an abbreviation without spelling it out, the model may not connect it to the underlying concept a user is asking about. For example, if a user asks about "content management systems" and your site only ever uses "CMS", the AI may not surface your content in response.

The fix is straightforward: spell out abbreviations on first use (e.g. "content management system (CMS)"), or include both forms naturally in your content. This helps AI models understand what your content is about and match it to relevant queries.
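
If you want to spot-check pages for this, the idea can be sketched in a few lines of Python. This is a minimal illustration only: the glossary, the sample sentences, and the function name are hypothetical, and a simple text match is an assumption about how you might audit, not a description of how any AI platform works.

```python
import re

def unexplained_abbreviations(text: str, glossary: dict[str, str]) -> list[str]:
    """Return abbreviations that appear in text without their long form."""
    missing = []
    lowered = text.lower()
    for abbreviation, long_form in glossary.items():
        # Match the abbreviation as a whole word ("CMS", not "CMSX").
        used = re.search(rf"\b{re.escape(abbreviation)}\b", text) is not None
        # Treat the abbreviation as explained if the long form appears anywhere.
        explained = long_form.lower() in lowered
        if used and not explained:
            missing.append(abbreviation)
    return missing

# Hypothetical glossary and page copy, for illustration only.
glossary = {"CMS": "content management system"}
page_a = "Our CMS makes publishing easy."
page_b = "A content management system (CMS) makes publishing easy."

print(unexplained_abbreviations(page_a, glossary))
print(unexplained_abbreviations(page_b, glossary))
```

The first page is flagged because it only ever says "CMS"; the second passes because it spells the term out once.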

AI search platforms are more likely to retrieve and cite content that's clearly structured around questions and direct answers. But that's not the same as saying you need formal FAQ pages.

The Q&A format helps because AI is increasingly matching specific questions to specific answers. But the traditional "dump every question onto one page" FAQ isn't what makes that work, and it can be actively unhelpful. Poor FAQ pages are usually a content quality problem, not a format problem — if the questions are shallow, repetitive, or written to game search, they won't help humans or AI.

A few things to keep in mind:

Topic depth beats topic breadth. A focused page that thoroughly covers one topic will outperform a single page listing 30 loosely related questions. One primary topic per page, answered properly. FAQ pages that sprawl across unrelated topics dilute the signal for both readers and AI.
The Q&A structure is what matters, not the FAQ label. AI search breaks a user's question into many sub-questions behind the scenes and looks for content that directly answers each one. A heading written as the question your audience would actually ask, followed by a passage that fully answers it, does the same job as a formal FAQ block — and is often the better choice when an FAQ page would feel forced.
You don't have to abandon content design best practice. Use real FAQ pages for genuinely common, narrow, transactional questions (entry requirements, fees, deadlines, opening hours). For everything else, weave the Q&A pattern into the page itself through clear question-style headings and direct answers underneath.
Know how hidden content patterns behave. Accordions, tabs, and other "click to reveal" elements are usually fine for AI search, as long as the content sits in the page's HTML at load and is just visually collapsed. The problem is when content is only injected after a click, hover, or other JavaScript event. Many major AI crawlers (including those used by ChatGPT and Perplexity) don't execute JavaScript, so anything loaded in dynamically may be invisible to them. Screen readers generally work against the rendered DOM and can read this content; AI crawlers often can't. To check, view the page source (the raw HTML the server sends) rather than the rendered DOM shown in your browser's element inspector.
Consistent terminology and natural variation aren't in conflict. Use the same name for the same thing — your course title, your product name, the terms your organisation has decided on — every time it appears. That builds a coherent signal across your site. Within the surrounding content, it's fine to reflect the different ways your audience naturally phrases things. The rule: be consistent on identity, flexible on phrasing.
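
The "view the page source" check for hidden content can also be scripted. The sketch below is a rough illustration using Python's standard library: it looks for a phrase in the text of the raw HTML and ignores anything inside script tags, on the assumption that a crawler that doesn't run JavaScript would never see that content rendered. The class name, function name, and sample markup are all hypothetical.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text from raw HTML, skipping script and style contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that sits outside script/style blocks.
        if self._skip_depth == 0:
            self.chunks.append(data)

def text_in_initial_html(raw_html: str, phrase: str) -> bool:
    """True if the phrase is present as text in the HTML as delivered."""
    parser = TextExtractor()
    parser.feed(raw_html)
    parser.close()
    page_text = " ".join(" ".join(parser.chunks).split())
    return phrase.lower() in page_text.lower()

# Hypothetical markup: one accordion that is only visually collapsed,
# one container that is filled in by JavaScript after load.
collapsed = "<details><summary>Fees</summary><p>Tuition is due in March.</p></details>"
injected = (
    '<div id="fees"></div>'
    '<script>document.getElementById("fees").textContent = "Tuition is due in March.";</script>'
)

print(text_in_initial_html(collapsed, "Tuition is due in March"))
print(text_in_initial_html(injected, "Tuition is due in March"))
```

The collapsed accordion passes because its text is in the markup; the injected version fails because the text only exists inside a script.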

In practice: think less about "should we build an FAQ page" and more about "are the questions our audience asks clearly answered somewhere on our site, in language they'd recognise, with the answer right next to the question." That's the pattern AI is rewarding.

General

Yes – the recording and Q&A will be shared after the webinar. Keep an eye out for a follow-up email with the link.

Video and media

AI models can't watch video – they can only read text. So if your video content isn't accompanied by text, it's effectively invisible to AI search.

Here's what you can do:

Add accurate transcripts to your videos and publish them on the same page as body text
Write a clear summary of what the video covers
Use descriptive titles and metadata

Subtitles help with accessibility, but on their own they may not be read by AI in the same way as page body text. A transcript that's part of the page content is the most reliable approach.

A transcript on the same page as the video or on a separate page can work, as long as it is in HTML and reachable by AI crawlers. AI search platforms don't watch videos or process audio, so the major crawlers rely on the text version. If the transcript isn't there or isn't readable, the content effectively doesn't exist as far as AI is concerned.

The simplest, strongest option is to put the transcript directly on the same page as the video — either visible underneath, or in an expandable section that's still present in the page source. That keeps the topic, video, and text together, and AI sees one cohesive page with clear context.

Separate transcript pages can also work, but they have to be indexable. AI search platforms find content by running Google or Bing searches behind the scenes and pulling from the top results — so if a transcript lives on its own page, that page needs to be crawlable and indexable. A "hidden" page (set to noindex, behind a login, or unlinked from the rest of the site) won't help your AI visibility, even if you link to it from an indexed page.

If you go this route, link clearly and contextually — for example, "Read the full transcript of [video title]" — rather than vague link text like "click here," so both humans and AI understand the relationship between the two pages.
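
A quick way to find vague link text at scale is to scan your HTML for it. The following is a minimal sketch using Python's standard library; the list of "vague" phrases, the class and function names, and the sample markup are assumptions chosen for illustration, so adjust them to your own content.

```python
from html.parser import HTMLParser

# Illustrative list of vague phrases; this is an assumption, not a standard.
VAGUE_LINK_TEXT = {"click here", "here", "read more", "more", "link"}

class LinkTextAuditor(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None

def vague_links(html: str) -> list[str]:
    """Return the href of every link whose text is on the vague list."""
    auditor = LinkTextAuditor()
    auditor.feed(html)
    auditor.close()
    return [
        href for href, text in auditor.links
        if text.lower().strip(" .") in VAGUE_LINK_TEXT
    ]

# Hypothetical page fragment.
sample = (
    '<p>Watch the video. <a href="/transcript">Click here</a>.</p>'
    '<p><a href="/campus-tour-transcript">Read the full transcript of Campus tour</a></p>'
)
print(vague_links(sample))
```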

The same logic applies to PDFs. If you're publishing a plain-text version of a PDF, that text needs to live on an indexable web page — not a hidden file or an unlinked URL. Where you have the choice, publishing the content directly as a properly structured web page (with the PDF as a downloadable supplement) is the stronger option for both AI and users.

In practice: treat a transcript like any other piece of content on your site. It needs to be in HTML, indexable, and clearly linked from the related video or PDF — ideally on the same page where it's most useful. A transcript locked away on a hidden page won't help, because AI can only cite what it can find.

Technical and metadata

There are a few ways AI can tell when a page was published or last updated. The most reliable is structured metadata – specifically, schema markup (a standardised way of embedding machine-readable information into your page's code) that includes a datePublished or dateModified field. AI crawlers can read this directly.

Beyond that, some AI tools look at signals like HTTP headers, which can indicate when a page was last modified, or rely on when a page was last crawled and indexed.

Not all AI models weight recency the same way, and some don't surface publish dates at all. But if keeping content current is part of your strategy – which it should be – using schema markup to make your publish and update dates explicit is a good habit.
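
As a rough illustration of what that markup can look like, the sketch below builds a JSON-LD block with datePublished and dateModified using Python's standard library. The headline, the dates, the choice of the schema.org Article type, and the function name are assumptions for the example; your CMS may well generate this for you.

```python
import json
from datetime import date

def article_json_ld(headline: str, published: date, modified: date) -> str:
    """Build a JSON-LD script block carrying explicit publish and update dates."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",  # assumed type; pick the one that fits your page
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"

# Hypothetical page details.
snippet = article_json_ld("How to apply", date(2024, 1, 15), date(2025, 3, 2))
print(snippet)
```

The resulting block would sit in the page's HTML so that it is present in the initial response.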

llms.txt is worth knowing about, but it's not something to lose sleep over. The far bigger priority is making sure your actual content is high-quality, well-structured, and genuinely useful.

Why llms.txt isn't a strategic priority right now:

There's a lot of noise around llms.txt, but the evidence is thin. Google's John Mueller has explicitly stated that llms.txt is not a ranking signal, not an SEO tool, and not endorsed by Google Search. He's even compared it to the old meta keywords tag, noting that no major AI search platforms currently use it. Search Engine Land tracked 10 sites after implementing llms.txt and found 8 of 9 showed no measurable change in AI-driven traffic.

It's worth noting that llms.txt does have real adoption among well-known tech companies. Anthropic, Perplexity, Cloudflare, Vercel, Hugging Face, and hundreds of others publish llms.txt files. However, there's an important distinction: that adoption is almost entirely on the publishing side, not the consumption side. These companies publish llms.txt files on their developer documentation sites so that AI coding agents (like Claude Code or Cursor) can quickly ingest their API docs. But their AI search platforms don't look for llms.txt files on other websites when deciding what to cite. Anthropic publishes one for its API docs, but Claude's search feature uses its own retrieval pipeline. Perplexity publishes one on its docs site, but its search engine uses RAG to find and cite content, not llms.txt files. None of the major AI companies, including Google, OpenAI, Anthropic, or Mistral, have formally adopted llms.txt as a standard their search products consume.

So if you're a developer platform with API documentation, there's a legitimate case for publishing one. For general website content aimed at being discovered and cited by AI search platforms, the evidence doesn't support it as a priority.

How AI search actually works (and why good content wins):

This is the key thing to understand. AI search engines don't work like traditional crawlers methodically indexing every file on your site. They use a technique called retrieval-augmented generation (RAG), where they search the live web in real time, pull relevant pages, and synthesize an answer. Perplexity, for instance, defaults to a retrieval-first approach: it runs a live search, retrieves documents, synthesizes an answer, and cites its sources inline. ChatGPT sends queries to Bing's API, fetches full page content from selected URLs at runtime, and processes them directly for synthesis.

In other words, these AI search platforms are browsing the web much like a human researcher would: they're looking for the best, clearest answer to a question, pulling from real web pages. They don't need a special text file to find you. They need your content to be good enough to be worth citing.

As Vercel's SEO team puts it, depth and clarity matter more than repetition or scale, because LLMs don't match keywords – they interpret meaning. AI search platforms don't rank pages the way Google does. They retrieve structured, trustworthy, citable content.

What actually matters:

Rather than spending time maintaining an llms.txt file, focus on:

Writing content that directly and clearly answers the questions your audience is asking
Using clean heading structures and semantic HTML so content is easy for both humans and machines to parse
Keeping content accurate and up to date
Making sure pages load fast – if your server response time is slow, an AI search platform retrieving information in real time will likely abandon the request and pull data from a faster competitor
Ensuring robots.txt isn't blocking AI crawlers (this is the genuinely important technical step)
Building authority through quality backlinks and being cited on trusted third-party sites
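
The robots.txt check in particular is easy to automate. Here is a minimal sketch using Python's built-in robots.txt parser. The robots.txt content and URL are made up for the example, and GPTBot and ClaudeBot are used simply because they are the AI user agents named elsewhere in this Q&A; check each platform's documentation for its current agent names.

```python
from urllib.robotparser import RobotFileParser

def blocked_agents(robots_txt: str, url: str, agents: list[str]) -> list[str]:
    """Return the user agents that this robots.txt stops from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]

# Hypothetical robots.txt that blocks one AI crawler and allows everyone else.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

agents = ["GPTBot", "ClaudeBot"]
print(blocked_agents(robots_txt, "https://example.com/courses", agents))
```

In this example only GPTBot is reported as blocked, because ClaudeBot falls through to the catch-all rule.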

If you build content that a smart, time-pressed human would find useful, clear, and trustworthy, you're building content that AI will find useful, clear, and trustworthy. That's the game now. llms.txt is not a waste of time if your CMS generates it automatically, but it shouldn't take priority over the fundamentals. The real moat is content quality.

There are technical changes that help – but they're secondary to the quality of the content itself.

The most impactful thing you can do is ensure your content is clear, complete, and explicitly answers the questions your audience is asking. Technical optimisation supports that foundation by making it easier for AI search platforms to access, interpret, and extract content that's already worth citing.

With that in place, the technical areas worth focusing on are:

Semantic HTML and heading hierarchy – a logical structure (H1 → H2 → H3) helps AI search platforms understand what your content is about and how ideas relate. Pages that skip heading levels or use headings purely for styling make this harder.
Content in HTML, not locked away – AI search platforms primarily process HTML. While formats like PDFs can be read, they're less reliable and harder to interpret structurally. Key information should live in accessible, well-structured web pages.
Crawler access and indexability – ensure your site isn't blocking important crawlers via robots.txt, including AI-specific user agents (e.g. GPTBot, ClaudeBot).
Clear, self-contained passages – AI search platforms retrieve and evaluate content in chunks. Content that directly and concisely answers a question is more likely to be extracted and cited than content that buries the answer or relies heavily on surrounding context.
Internal linking and contextual signals – strong internal linking helps AI search platforms understand relationships between pages and reinforces which content is most important.
Consistent terminology – using consistent language to describe the same concepts strengthens semantic signals and reduces ambiguity.
Descriptive metadata and link text – many AI search platforms reference, at least in part, search infrastructure. Clear titles, descriptions, and anchor text still play a role in how content is discovered and prioritised.
Structured data (where appropriate) – schema markup can help reinforce meaning, signalling to AI search platforms whether a page is an FAQ, a how-to guide, a service page, and so on. It can be a valuable layer to add once your content foundations are in place.
Fast server response – AI search platforms retrieve content in real time when answering a question. If your server responds slowly, the AI is more likely to pull from a faster source. This makes Time to First Byte (TTFB) more directly consequential than in traditional search.
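
Of the items above, heading hierarchy is one of the simplest to test automatically. The sketch below uses Python's standard library to list places where a page jumps more than one heading level at a time. The class name, function name, and sample markup are hypothetical, and it treats a page that starts below H1 as a skip, which is a choice made for this example.

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Record the level of each h1-h6 element in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))

def skipped_heading_levels(html: str) -> list[tuple[int, int]]:
    """Return (previous_level, level) pairs where a heading level was skipped."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    problems = []
    previous = 0  # 0 means "no heading seen yet"
    for level in collector.levels:
        if level > previous + 1:
            problems.append((previous, level))
        previous = level
    return problems

# Hypothetical page outline: the H3 directly after the H1 skips a level.
sample = "<h1>Courses</h1><h3>Fees</h3><h2>Entry requirements</h2><h3>Deadlines</h3>"
print(skipped_heading_levels(sample))
```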

This builds on the JavaScript question, but the short answer is: AI search platforms generally favour static, HTML-first content that's available on page load.

Dynamic content – anything that loads, changes, or appears based on user interaction, personalisation, or real-time data – can present challenges. Most AI crawlers make a single HTTP request and process the initial response. They typically don't click, scroll, or wait for additional content to load.

Common types of dynamic content and how they're handled:

Personalised content (e.g. content that varies by user, location, or session) – AI crawlers usually see a default, non-personalised version. If important content only appears for certain segments, it may not be visible to AI at all.
Content loaded on interaction (e.g. tabs, accordions, "read more" expandables) – if content isn't present in the initial HTML, AI crawlers may not reliably see it. This is common on service pages and FAQ sections.
Real-time or frequently updating content (e.g. live feeds, stock levels, event listings) – AI crawlers capture a snapshot at the time of crawling, which can lead to outdated or inconsistent information in responses.
Gated or login-protected content – generally invisible to AI crawlers. If key information sits behind a login or form, it won't be discovered or cited.

Where content is inherently dynamic, structured data can help provide AI search platforms with stable, machine-readable signals about what the page represents, even if the visible content changes between visits.

Your core content – the information you most want AI search platforms to find and cite – should be available as static HTML in the initial page response. Dynamic features can enhance the user experience, but they shouldn't be the primary delivery mechanism for critical content. Where dynamic content is unavoidable, server-side rendering can help ensure AI crawlers receive a complete version of the page rather than an empty or partial shell.

Focus on ranking well in search engines, writing clearly and completely, keeping content consistent and authoritative across your site, and measuring change over weeks rather than days. That's the formula that makes you reliably citable.

AI search platforms work in two layers. The first is the model's own training data — the information it was trained on, frozen at a point in time. The second, and the one that matters most for visibility, is live retrieval: when you ask a question, the platform runs a search behind the scenes against a major search index (typically Google, Bing, or the platform's own index) and drafts an answer from the top results.

That second layer is where most of the citation decisions get made — and it has a few important consequences:

Search rankings still matter a lot. When the AI runs its retrieval search, it typically looks at a handful of top-ranked links per sub-question. If your content doesn't rank, the AI generally won't go hunting – it'll work with what it pulled back.

Sub-query fan-out multiplies this. Every user question is broken into multiple sub-questions, each with its own search. Those handfuls of links can stack up to dozens of distinct results stitched together. The more sub-questions your content directly answers, the more chances you have to be cited.

On Reddit: Reddit shows up disproportionately often, for two reasons. Early models (including the original ChatGPT) were trained heavily on Reddit, so it's embedded in the underlying language patterns. More importantly today, Reddit content tends to be written as a clear question with a direct answer, which is exactly what AI is looking for. If a Reddit post answers a question more clearly than your site does, Reddit wins. The fix is to make your own content clearer, more direct, and more complete. (Note: the recent Google core update that reportedly reduced Reddit's authority may flow through to AI platforms over time, but it's not something to plan around. The underlying lesson is still to control what you can, which is your own content.)

On link signals: AI platforms use external references and citations as an authority signal, similar to how SEO treats backlinks. Multiple credible sources pointing to the same information on your site strengthen the trust signal, but link-building isn't a substitute for clarity and accuracy on your own pages.

On duplicate content and canonicalisation: canonical tags tell search engines which version of a duplicated page is the authoritative one. Without that signal, your ranking gets diluted across multiple URLs, which weakens your chances of appearing in the results AI pulls from. Canonicalisation helps AI visibility for the same reason it helps SEO.
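
If you want to confirm which canonical URL a page actually declares, the check can be sketched with Python's standard library. The class name, function name, sample markup, and URLs below are hypothetical. A page would normally declare exactly one canonical, so an empty list or more than one entry is worth a closer look.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])

def canonical_urls(html: str) -> list[str]:
    """Return all canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonicals

# Hypothetical duplicate page pointing at its authoritative version.
sample = (
    "<html><head><title>Nursing</title>"
    '<link rel="canonical" href="https://example.com/courses/nursing">'
    "</head><body><h1>Nursing</h1></body></html>"
)
print(canonical_urls(sample))
```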

Audits and strategy

Yes – but your audits need to evolve. Traditional SEO focuses on things like keyword density, backlinks, and page rankings. AI search visibility requires a different lens: is your content clear, accurate, well-structured, and does it directly answer questions?

Some SEO fundamentals still apply (technical health, structured data, page speed), but you'll want to add checks for things like content clarity, the presence of direct answers, and whether your pages are being cited by AI tools. Think of it as expanding your audit criteria rather than replacing them.

Focus on high-leverage, sitewide fixes first; then prioritise your highest-traffic, highest-intent pages; then bring in tooling to scale beyond what your team can do manually. Trying to fix everything page by page is a trap for a small team on a large site.

If you had to pick three things to focus on, in this order:

1. Fix structural and accessibility issues at the template or platform level. This is the highest-leverage work you can do, because a single fix applies across thousands of pages. It includes correcting heading hierarchy in your templates, making sure your CMS outputs semantic HTML, ensuring shared components (header, footer, navigation) have proper ARIA labels and accessible markup, and making sure content isn't being hidden behind JavaScript that AI crawlers can't read. These fixes are invisible to most users but transformative for AI: they make the entire site readable in a way it wasn't before. They also tend to surface where your CMS or platform itself is contributing to the problem, which is far more efficient to fix at the source than page by page.

2. Prioritise your highest-traffic, highest-intent pages and fix the content on them. You won't get to all 10,000 pages manually. Pick the 20 to 50 pages that drive the most value (homepage, key service or course pages, top-converting content) and apply the content checklist there: lead with a direct answer, keep to one primary topic per page, use consistent terminology, replace vague marketing language with specifics, and make sure the questions your audience actually asks are clearly answered. These pages do disproportionate work for your visibility, so fixing them well delivers more impact than spreading effort thinly across the long tail.

3. Bring in tooling to scale beyond what humans can audit manually. This is where small teams genuinely can't keep up. AI search platforms test your content against thousands of sub-questions behind the scenes; no human team can anticipate or audit that volume. A content auditing tool — Squiz Content Intelligence does this — simulates how AI interprets your site, identifies the gaps, recommends prioritised fixes, and lets you work through them systematically rather than guessing where to start. It also makes the work repeatable, which matters because the AI landscape and your content both keep changing.

A few practical notes that go alongside this:

Audit before you act. Start with a clear picture of where you are — which pages are strong, which are weak, where the structural issues are concentrated. Without that, prioritisation is guesswork.
Measure as you go. Pick a baseline (referral traffic from AI platforms, content audit scores, ranking on key topics) and re-check after fixes go live. Expect a lag of a few weeks for changes to flow through.
Repeat. This isn't a one-off project. Models, search platforms, and audience expectations are all moving. Build a quarterly or biannual review into the team's rhythm so you stay ahead rather than catching up.
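
For the referral-traffic baseline mentioned above, one simple starting point is to measure what share of your referrers come from AI platforms. The sketch below is illustrative only: the list of referrer hostnames is an assumption (it is not from the webinar and will need maintaining as platforms change), and the function names and sample referrers are hypothetical stand-ins for data from your own analytics export.

```python
from urllib.parse import urlparse

# Assumed hostnames for AI platforms; review and extend for your own reporting.
AI_REFERRER_HOSTS = {
    "chatgpt.com",
    "perplexity.ai",
    "gemini.google.com",
    "copilot.microsoft.com",
}

def is_ai_referral(referrer: str) -> bool:
    """True if the referrer URL's host is on the assumed AI platform list."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host in AI_REFERRER_HOSTS

def ai_referral_share(referrers: list[str]) -> float:
    """Fraction of referrers that come from an AI platform (0.0 if empty)."""
    if not referrers:
        return 0.0
    hits = sum(1 for referrer in referrers if is_ai_referral(referrer))
    return hits / len(referrers)

# Hypothetical referrer log.
referrers = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=entry+requirements",
    "https://www.google.com/",
    "https://example.org/blog",
]
print(ai_referral_share(referrers))
```

Recording this number before your fixes go live, then again a few weeks later, gives you a like-for-like comparison.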

Fix the foundation once, focus the manual effort where it matters most, and use tooling to cover the ground a small team can't. That's how you get the biggest visibility return for the smallest team capacity.

AI models and algorithms

Yes, there is variation. Different AI models are trained on different datasets, use different retrieval methods, and weight content signals differently. One model might prioritise structured, authoritative content, while another might weight recency more heavily.

That said, the fundamentals tend to hold across models: clear, well-structured, accurate content that directly answers questions performs well regardless of which model is surfacing it. Chasing the specifics of any one model's preferences isn't a sustainable strategy – it's better to focus on content quality and structure.

Question 1: Do you know how AI search platforms decide what content to show in their answers?

Question 2: Do you know how ready your content is for AI search visibility?

Is your content invisible in AI search platforms?

Content and structure

We use a lot of abbreviations across our site. How does that affect AI search visibility?

We have step-by-step content split across several pages, connected by button links. This has performed well in usability testing – is it a problem for AI visibility, and how can we make sure AI understands the pages are connected?

For simple infographics and flowcharts – is it worth moving completely to HTML/CSS/JavaScript if the aesthetics are unaffected, so they can be read more easily by AI?

FAQ pages versus FAQ sections within content pages: which approach is more accessible and effective for LLMs?

Are FAQs good or bad for AI visibility, and how do we use Q&A patterns without creating low-quality ‘FAQ pages’?

Do you have any recommendations for putting directional content in dropdown menus over having a short content page which links to an online form?

How should we structure information (homepage, navigation, topic pages, repeated content) so AI can reliably retrieve the right answer?

We refer to student accommodation as "residence halls", "residential communities", and "program houses" – but students might search for "dorm". Do we need to account for that?

Is it ok to have both a marketing passage and a specific, factual passage on the same page – or does that confuse AI?

General

Will the webinar be recorded and available to play back?

I've read that AI search pulls content from Wikipedia and Reddit. Is that true, and how can we help curate those messages?

Video and media

We've invested heavily in video content. Is there anything we can do to make video more visible to AI?

Is alt text just as important for stock imagery?

Where should transcripts live (same page vs separate), and do they need to be indexable to help AI visibility?

Technical and metadata

Can AI see when a webpage was last updated? How do LLMs know the last publish date of content?

Is there a way of clearing the AI search cache, or forcing AI to re-research a topic?

How important is llms.txt?

Are there any specific technical SEO changes that are critical for LLM discoverability?

What role does JavaScript play?

How do AI search platforms treat, read, and interpret dynamic content?

Where do AI search platforms retrieve information from, and what determines what they cite (search rankings, authority signals, platforms like Reddit, links)?

Can AI search platforms read PDFs (including PDF forms), and what’s the best practice for making PDF-based information discoverable and usable?

Can AI search platforms access content that’s hidden or interaction-based (accordions/tabs/pop-ups), and how should we implement it safely?

Is AI search pulling from live websites right now, or is it using content it was trained on previously?

We've heard that adding synonyms or alternate terms to metadata might help – for example, adding "dorm" to meta keywords even though we don't use that word on the page. Is that worth doing?

Audits and strategy

We already run SEO audits – do we still need to?

AI search visibility tracking and reporting – which tools would you recommend for tracking specific metrics?

For a large site with a small optimisation team, what changes deliver the biggest impact on AI visibility – and why?

How can we learn the questions people are asking (search + on-site + AI), and measure whether our changes improved AI visibility?

Can Squiz Content Intelligence be used in place of an accessibility tool like DubBot or Monsido?

AI models and algorithms

How much does discoverability vary per LLM – do some models favour certain metrics or aspects over others?

When a new model comes out, is it the same impact as a Google algorithm update?

How much does optimisation vary across models, and how do we balance human-first content with AI visibility?