XML Sitemap
An XML sitemap is a file that lists all the URLs on your website that you want search engines to discover and index. Submitting it to Google Search Console is the most direct way to prompt crawling of new content.
The file itself is a structured list of your site's URLs, optionally annotated with metadata such as last modification date and relative priority. You submit it to Google Search Console, and Googlebot uses it as a crawl guide, not a restriction: Google can still reach pages that aren't in your sitemap through link discovery, but the sitemap ensures important URLs aren't missed, especially on new sites with thin internal linking.
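As a minimal sketch of the format (the URLs and dates below are placeholders, not recommendations), a sitemap is a plain XML file that follows the sitemaps.org schema:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- One <url> entry per canonical, indexable page -->
      <url>
        <loc>https://example.com/blog/new-article</loc>
        <lastmod>2024-05-01</lastmod>
        <!-- Optional hint; Google has said it ignores priority -->
        <priority>0.8</priority>
      </url>
      <url>
        <loc>https://example.com/glossary/xml-sitemap</loc>
        <lastmod>2024-04-18</lastmod>
      </url>
    </urlset>

Google treats <lastmod> as a signal only when it is consistently accurate, so update it when a page genuinely changes, not on every deploy.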
A sitemap matters most for large sites and new sites. Large sites with thousands of pages benefit from the structured inventory. New sites with few inbound links benefit because Googlebot may not discover all their content through link following alone. For established sites with strong internal linking, the sitemap is still valuable but less critical.
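The protocol caps a single sitemap at 50,000 URLs and 50 MB uncompressed, so large sites usually publish a sitemap index that points at multiple child sitemaps. A sketch with placeholder file names:

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- Each <sitemap> entry points to a child sitemap file -->
      <sitemap>
        <loc>https://example.com/sitemaps/posts.xml</loc>
        <lastmod>2024-05-01</lastmod>
      </sitemap>
      <sitemap>
        <loc>https://example.com/sitemaps/glossary.xml</loc>
        <lastmod>2024-04-18</lastmod>
      </sitemap>
    </sitemapindex>

You submit the index URL to Search Console once; Google follows it to every child file, and splitting by content type makes per-section indexing reports easier to read.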
What not to include: noindex pages, redirected URLs, pages returning non-200 status codes, and paginated archive pages beyond the first. Listing pages in your sitemap that you're simultaneously telling search engines to ignore creates conflicting signals. Your sitemap should only contain the canonical, indexable URLs you actually want to appear in search results.
Sitemap hygiene matters more than most teams realize. Google Search Console reports on which sitemap URLs were crawled, indexed, or flagged with errors. A sitemap full of 404s, redirects, or noindex pages signals poor site maintenance and can dilute the crawl attention given to your real content.
A well-maintained sitemap pays off in three ways:

- Accelerates discovery of new content: submitting an updated sitemap to Search Console is the most direct mechanism available for prompting Googlebot to crawl and index newly published articles (a robots.txt pointer, shown after this list, covers other crawlers).
- Surfaces indexing problems before they damage performance: Search Console's sitemap reports flag which URLs weren't indexed and why, often revealing technical issues that are invisible in a normal site review.
- Ensures full coverage of your content library: without a sitemap, deep pages on sites with limited internal linking may never be discovered at all, leaving them effectively invisible in search despite being published.
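Search Console is Google's channel, but the sitemaps.org protocol also lets you advertise the file to any crawler through a Sitemap directive in robots.txt. A minimal sketch, assuming the sitemap sits at the site root:

    # https://example.com/robots.txt
    User-agent: *
    Allow: /

    # Absolute URL required; point at the sitemap or sitemap index
    Sitemap: https://example.com/sitemap.xml

This line doesn't replace a Search Console submission, but it gives Bing and other engines the same discovery shortcut without any extra setup.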
Explore More Terms
Crawl Budget
Crawl budget is the number of pages Googlebot will crawl on your site within a given timeframe — determined by your server's capacity and how much Google values your content. For most B2B content sites it isn't a daily concern, but at scale it becomes critical.
Robots.txt
Robots.txt is a plain text file at the root of your website that tells search engine crawlers which pages they should and shouldn't access. It controls crawler access — not whether a page appears in the index.
Pillar Page
A pillar page is a comprehensive, long-form piece of content that covers a broad topic in depth and serves as the anchor for a topic cluster.
Keyword Intent
Keyword intent (also called search intent) is the underlying goal a searcher has when they type a query — informational, navigational, commercial, or transactional.
GEO (Generative Engine Optimization)
Generative Engine Optimization (GEO) is the practice of structuring content so it gets retrieved and cited by AI tools like ChatGPT, Perplexity, and Google AI Overviews.
Internal Linking
Internal linking is the practice of linking from one page on your website to another, used to pass authority between pages and guide readers through related content.
