# Web4Agents.org

> The authoritative reference for Generative Engine Optimization (GEO).

## Glossary

- [Actionable DOM](https://web4agents.org/en/glossary/actionable-dom): HTML structure that exposes clear actions (forms, links, buttons) so agents can complete tasks.
- [Agent-Readable Web](https://web4agents.org/en/glossary/agent-readable-web): The vision of a web where content and structure are optimized for consumption by autonomous AI agents.
- [Anthropic](https://web4agents.org/en/glossary/anthropic): AI safety and research company behind Claude and agentic AI systems.
- [AutoGPT](https://web4agents.org/en/glossary/autogpt): Open-source autonomous agent that can browse the web and perform multi-step tasks.
- [Backlink](https://web4agents.org/en/glossary/backlink): Inbound link from another site to yours; a core SEO and authority signal for search engines and AI systems.
- [Breadcrumbs](https://web4agents.org/en/glossary/breadcrumbs): Navigation trail showing the page's position in the site hierarchy; implement with Schema.org BreadcrumbList for agents.
- [Cache-Control](https://web4agents.org/en/glossary/cache-control): HTTP header that defines how long and where a response can be cached (browsers, CDNs, intermediaries).
- [Canonical URL](https://web4agents.org/en/glossary/canonical-url): The preferred, authoritative URL for a page when multiple URLs point to the same or similar content.
- [Chunking](https://web4agents.org/en/glossary/chunking): Splitting content into smaller segments (chunks) for RAG indexing; strategy determines retrieval quality.
- [Citation (AI citation)](https://web4agents.org/en/glossary/citation): When an AI system attributes or links to your content in its answer; the AI equivalent of a search result position.
- [Code Mode](https://web4agents.org/en/glossary/code-mode): MCP design that exposes few tools (e.g. search + execute) and runs agent-supplied code server-side to keep token usage fixed.
- [Content negotiation](https://web4agents.org/en/glossary/content-negotiation): HTTP mechanism by which the client indicates preferred response format (e.g. Accept: text/markdown) and the server returns the matching format.
- [Content Signals](https://web4agents.org/en/glossary/content-signals): Open framework for declaring content usage preferences to AI systems via HTTP headers or meta tags (ai-train, search, ai-input).
- [Core Web Vitals](https://web4agents.org/en/glossary/core-web-vitals): Google's set of metrics for page experience: LCP (loading), INP (interactivity), CLS (visual stability).
- [Crawl budget](https://web4agents.org/en/glossary/crawl-budget): The limited time or number of URLs a crawler allocates to a site; sites that respond quickly and without errors tend to get more URLs crawled.
- [E-E-A-T](https://web4agents.org/en/glossary/eeat): Google's framework for Experience, Expertise, Authoritativeness, and Trustworthiness — used to assess content and source quality.
- [ETag](https://web4agents.org/en/glossary/etag): HTTP header that identifies a specific version of a resource; enables conditional requests and 304 Not Modified to save bandwidth.
- [GEO (Generative Engine Optimization)](https://web4agents.org/en/glossary/geo): GEO is the set of practices for optimizing web content and structure for generative AI systems — agents, LLMs, and AI-powered search engines.
- [hreflang](https://web4agents.org/en/glossary/hreflang): HTML link attribute that signals language and regional variants of a page to crawlers for multilingual and multi-region targeting.
- [HSTS (HTTP Strict Transport Security)](https://web4agents.org/en/glossary/hsts): HTTP response header that tells browsers and agents to use only HTTPS for the site, preventing protocol downgrade attacks.
- [humans.txt](https://web4agents.org/en/glossary/humans-txt): Plain-text file at the site root that describes the people behind the project; supports E-E-A-T and identity signals.
- [IndexNow](https://web4agents.org/en/glossary/indexnow): Open protocol to notify search engines and crawlers of URL changes in real time for faster indexing.
- [Internal linking](https://web4agents.org/en/glossary/internal-linking): Links between pages on the same site; they define the content graph for crawlers and AI agents and distribute authority.
- [JSON-LD](https://web4agents.org/en/glossary/json-ld): JavaScript Object Notation for Linked Data — the recommended format for embedding Schema.org structured data in HTML pages.
- [LangChain](https://web4agents.org/en/glossary/langchain): Framework for building applications with LLMs and agents, including web and tool use.
- [LLM Indexing](https://web4agents.org/en/glossary/llm-indexing): The process by which content is ingested and made available to large language models and agentic systems.
- [llms.txt](https://web4agents.org/en/glossary/llms-txt): Proposed convention for a Markdown file at the site root (/llms.txt) that describes a site for LLMs and agents.
- [Machine-Readable Content](https://web4agents.org/en/glossary/machine-readable-content): Content formatted so that software and agents can parse and use it without human interpretation.
- [Markdown for Agents](https://web4agents.org/en/glossary/markdown-for-agents): Serving Markdown to AI agents via content negotiation (Accept: text/markdown), reducing token usage and improving extraction.
- [Model Context Protocol (MCP)](https://web4agents.org/en/glossary/mcp): Open standard for connecting AI agents to external tools, resources, and prompts — the 'USB-C of AI integrations'.
- [n8n](https://web4agents.org/en/glossary/n8n): Open-source workflow automation tool that connects apps and services.
- [nofollow](https://web4agents.org/en/glossary/nofollow): Link attribute that tells search engines not to pass PageRank or endorse the destination (rel="nofollow").
- [noindex](https://web4agents.org/en/glossary/noindex): Directive that instructs search engines and crawlers not to include a page in their index or search results.
- [Open Graph](https://web4agents.org/en/glossary/open-graph): Meta tags (og:title, og:description, og:image, etc.) for link previews and machine-readable page summaries, read by AI crawlers and social platforms.
- [OpenAI](https://web4agents.org/en/glossary/openai): AI research and deployment company behind GPT and the OpenAI API.
- [OpenAPI](https://web4agents.org/en/glossary/openapi): Standard for describing REST APIs in a machine-readable format; enables AI agents to discover and call APIs autonomously.
- [PageRank](https://web4agents.org/en/glossary/pagerank): Google's core algorithm signal that weighs inbound links; high-quality backlinks from authoritative sites improve rankings.
- [Perplexity](https://web4agents.org/en/glossary/perplexity): AI-powered search and answer engine that cites web sources in its responses.
- [Playwright](https://web4agents.org/en/glossary/playwright): Browser automation library for testing and scraping; used by agents to interact with web pages.
- [RAG (Retrieval-Augmented Generation)](https://web4agents.org/en/glossary/rag): Technique where an LLM retrieves relevant documents or chunks from an external store before generating a response, instead of relying only on training data.
- [robots.txt](https://web4agents.org/en/glossary/robots-txt): Standard file that instructs crawlers and agents which paths they may or may not access.
- [schema.org](https://web4agents.org/en/glossary/schema-org): Vocabulary of structured data types for describing entities and relationships on the web.
- [Semantic density](https://web4agents.org/en/glossary/semantic-density): Ratio of meaningful information to total tokens; high-density content is preferred by RAG systems for retrieval.
- [Semantic Fragment](https://web4agents.org/en/glossary/semantic-fragment): A self-contained unit of meaning that agents can parse and use independently.
- [SEO (Search Engine Optimization)](https://web4agents.org/en/glossary/seo): Practices to improve visibility in traditional search engine results (ranking, crawlability, relevance).
- [sitemap.xml](https://web4agents.org/en/glossary/sitemap): XML file that lists a site's URLs for crawlers and AI agents; supports discovery, priority, and lastmod for efficient indexing.
- [Token (LLM)](https://web4agents.org/en/glossary/token): Unit of text (roughly a word or subword) that LLMs process; context and cost are measured in tokens.
- [TTFB (Time to First Byte)](https://web4agents.org/en/glossary/ttfb): Time from the start of an HTTP request to the receipt of the first byte of the response; a core latency metric.
- [User-agent](https://web4agents.org/en/glossary/user-agent): HTTP request header that identifies the crawler or client (e.g. GPTBot, ClaudeBot); used in robots.txt rules and analytics.
- [Vector database](https://web4agents.org/en/glossary/vector-database): Database that stores content (e.g. text chunks) as vector embeddings for similarity search; used by RAG systems to retrieve relevant passages.
- [YMYL (Your Money or Your Life)](https://web4agents.org/en/glossary/ymyl): Google's category for content that could affect health, finances, safety, or wellbeing; subject to stricter E-E-A-T evaluation.

## Documentation

- [Docs overview](https://web4agents.org/en/docs)

- [What Are AI Agents?](https://web4agents.org/en/docs/what-are-ai-agents): A practical introduction to AI agents, how they browse the web, and why they are changing how content must be structured.
- [Bot Management & Scraping](https://web4agents.org/en/docs/bot-management): How to differentiate legitimate AI agents from abusive scrapers and strategies to protect your content.
- [GEO vs SEO](https://web4agents.org/en/docs/geo-vs-seo): The differences and overlaps between traditional SEO and Generative Engine Optimization (GEO).
- [HTTPS & Security Headers](https://web4agents.org/en/docs/https-and-security): Why HTTPS and security headers matter for AI agent trust and site credibility.
- [llms.txt](https://web4agents.org/en/docs/llms-txt): How to create and serve an llms.txt file so AI agents understand your site.
- [Model Context Protocol (MCP)](https://web4agents.org/en/docs/mcp): What MCP is, how it works, and how to expose your services to AI agents through a standardized protocol.
- [Overview of AI Crawlers](https://web4agents.org/en/docs/ai-crawlers-overview): A reference guide to the main AI crawlers: who they are, what they do, and how to identify them.
- [Schema.org](https://web4agents.org/en/docs/schema-org): How to use Schema.org vocabulary to help AI agents understand your content.
- [Technical SEO for Agents](https://web4agents.org/en/docs/technical-seo): Core technical SEO practices — canonical URLs, meta tags, pagination, hreflang — and how they apply to AI agent optimization.
- [Tracking Agent Traffic](https://web4agents.org/en/docs/tracking-agent-traffic): How to identify and measure AI crawler and agent traffic in your analytics and server logs.
- [Writing for AI Agents](https://web4agents.org/en/docs/writing-for-agents): How to structure and write content so AI agents can extract, summarize, and cite it accurately.
- [Indirect Prompt Injection](https://web4agents.org/en/docs/prompt-injection): Understanding the risks of malicious instructions hidden in HTML source code and how to protect visiting LLMs.
- [Internal Linking](https://web4agents.org/en/docs/internal-linking): How internal linking helps both traditional SEO and AI agents navigate your site's content graph.
- [Introduction](https://web4agents.org/en/docs/introduction): What is the agent-ready web, and why does it matter for your website?
- [JSON-LD](https://web4agents.org/en/docs/json-ld): How to implement JSON-LD structured data in your HTML for AI agents and search engines.
- [Monitoring Citations](https://web4agents.org/en/docs/monitoring-citations): How to track when your site is cited in AI-generated answers and measure your AI visibility.
- [OpenAPI for Agents](https://web4agents.org/en/docs/openapi-for-agents): How to design and expose an OpenAPI specification that AI agents can discover, understand, and use autonomously.
- [Performance & Core Web Vitals](https://web4agents.org/en/docs/performance): How page performance affects AI agent crawling and content retrieval.
- [RAG Optimization](https://web4agents.org/en/docs/rag-optimization): How to optimize your content for Retrieval-Augmented Generation (RAG) pipelines used by AI agents and LLM applications.
- [robots.txt](https://web4agents.org/en/docs/robots-txt): Complete guide to robots.txt: syntax, directives, per-AI-crawler rules, and best practices for 2026.
- [Actionable DOM](https://web4agents.org/en/docs/actionable-dom): How to structure your DOM so AI agents can identify and interact with key elements.
- [CDN & Caching for Agents](https://web4agents.org/en/docs/cdn-and-caching): How to configure your CDN and caching strategy to serve both human visitors and AI agents efficiently.
- [Content Freshness](https://web4agents.org/en/docs/content-freshness): How to signal content freshness to AI crawlers and search engines, and why recency matters for AI-generated answers.
- [Data Privacy & LLM Ingestion](https://web4agents.org/en/docs/data-privacy): How to prevent sensitive data ingestion and maintain GDPR compliance when dealing with AI crawlers.
- [LLM Indexing](https://web4agents.org/en/docs/llm-indexing): How large language models index and use web content, and what it means for your site.
- [Markdown for Agents](https://web4agents.org/en/docs/markdown-for-agents): How to serve Markdown directly to AI agents instead of HTML, reducing token usage by up to 80% and improving content quality.
- [Monitoring Tools for SEO & GEO](https://web4agents.org/en/docs/monitoring-tools): An overview of the main tools to monitor your site's SEO and GEO: Google Search Console, Bing Webmaster Tools, Google Analytics 4, Ahrefs, Semrush, Cloudflare Radar, and more.
- [Open Graph & Meta Tags](https://web4agents.org/en/docs/open-graph): How Open Graph and meta tags help AI agents understand your pages.
- [Quick-start checklist](https://web4agents.org/en/docs/checklist): A practical checklist to make your website agent-ready, step by step.
- [AI Overviews and AI Mode (Google)](https://web4agents.org/en/docs/ai-overviews): How Google's AI Overviews and AI Mode work, how to appear in them, and what publishers can do to optimize their visibility.
- [Backlinks & Link Authority](https://web4agents.org/en/docs/backlinks): How backlinks work for traditional SEO and AI authority signals, and how to earn quality links in 2026.
- [Content Signals](https://web4agents.org/en/docs/content-signals): The Content Signals framework: how to declare your content usage preferences to AI agents using HTTP headers and HTML meta tags.
- [Domain Names & SEO](https://web4agents.org/en/docs/domain-names-seo): How to choose a domain name that benefits SEO and AI discoverability: TLD, keywords, age, HTTPS, and best practices.
- [Rate Limiting Agents](https://web4agents.org/en/docs/rate-limiting-agents): Protecting your server resources from the exponential growth of automated AI traffic.
- [Semantic HTML](https://web4agents.org/en/docs/semantic-html): How semantic HTML elements help AI agents understand your content structure.
- [sitemap.xml](https://web4agents.org/en/docs/sitemap-xml): Why sitemap.xml matters for AI agents and how to optimize it.
- [Accessible Forms](https://web4agents.org/en/docs/accessible-forms): How to build forms that AI agents and automated tools can understand and interact with.
- [E-E-A-T: Trust Signals](https://web4agents.org/en/docs/eeat): How Google's E-E-A-T framework (Experience, Expertise, Authoritativeness, Trustworthiness) applies to SEO and GEO in 2026.
- [Entities and Knowledge Graph](https://web4agents.org/en/docs/entities-knowledge-graph): How to be recognized as an entity by Google and AI systems, and why the Knowledge Graph is central to generative search visibility.
- [AI Platform Optimization](https://web4agents.org/en/docs/ai-platforms): The specifics of each AI search platform (Google, Perplexity, ChatGPT, Claude, Bing Copilot) and how to adapt your optimization strategy.
- [JavaScript Rendering & AI Crawlers](https://web4agents.org/en/docs/javascript-rendering): How AI agents process JavaScript (CSR vs SSR/SSG) and why server/static rendering is vital for GEO.