Your XML sitemap is one of the most important SEO files on your site — and one of the most ignored. A well-optimized sitemap helps Google discover and prioritize your content. A bloated, broken one wastes crawl budget and hurts indexing.

What Is an XML Sitemap?

A sitemap is a structured XML file that lists URLs on your site you want search engines to index. It's essentially a roadmap that helps crawlers find content efficiently — especially important for large sites or those with poor internal linking.

Why Sitemaps Matter

  • Faster discovery: crawlers find new and updated pages sooner
  • Better indexing: accurate lastmod values help Google decide what to re-crawl
  • Status visibility: Search Console shows indexing issues for submitted URLs
  • Crawl budget optimization: point crawlers at your important pages
  • Search Console monitoring: a sitemap isn't mandatory, but the Sitemaps report only covers sitemaps you submit

Sitemap Basics

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
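
A file like this can also be produced programmatically. The sketch below is one minimal way to do it with Python's standard library; the function name build_sitemap and the (loc, lastmod) tuple shape are illustrative assumptions, not part of any sitemap tooling.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build a urlset document from (loc, lastmod) tuples.

    lastmod may be None, in which case the element is omitted.
    Returns UTF-8 bytes including the XML declaration.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

xml_bytes = build_sitemap([("https://example.com/", "2026-01-15")])
print(xml_bytes.decode("utf-8"))
```

changefreq and priority are left out here on purpose, since they are optional in the protocol.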

Sitemap Limits

  • Max 50,000 URLs per sitemap
  • Max 50MB (52,428,800 bytes) uncompressed; gzip saves bandwidth, but the limit applies to the uncompressed file
  • Use sitemap index for sites over 50K URLs
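
To stay under the URL limit, a generator can split the full URL list into batches before writing files. This is a rough sketch; chunk_urls and the example URLs are hypothetical names used only for illustration.

```python
from itertools import islice

MAX_URLS_PER_SITEMAP = 50_000

def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield successive lists of at most `size` URLs."""
    iterator = iter(urls)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

# 120,000 URLs need three sitemap files.
urls = [f"https://example.com/page-{i}" for i in range(120_000)]
chunks = list(chunk_urls(urls))
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each batch would then become its own file, referenced from a sitemap index.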

What to Include

Only include URLs you want indexed:

  • Canonical URLs (not parameter variants)
  • Pages whose canonical tag points to themselves
  • 200 status pages only
  • Non-noindex pages
  • High-value content pages

What to EXCLUDE

Common mistakes — these should NOT be in your sitemap:

  • Redirected URLs (301, 302)
  • 404 pages
  • Noindex pages
  • Pages blocked by robots.txt
  • Non-canonical duplicates
  • Parameter URLs (unless intentional)
  • Pages requiring authentication
  • Admin pages
  • Internal search results

Including these wastes crawl budget and sends Google conflicting signals about which URLs you actually want indexed.
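
The include and exclude rules above can be expressed as a single filter. The sketch assumes you already have crawl data for each page; the Page record and its field names are hypothetical and would map onto whatever your CMS or crawler exports.

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    status: int            # HTTP status code returned for the URL
    canonical: str         # URL declared in the page's rel=canonical
    noindex: bool          # True if the page carries a noindex directive
    robots_blocked: bool   # True if robots.txt disallows the URL

def belongs_in_sitemap(page):
    """Keep only indexable, self-canonical pages that return 200."""
    return (
        page.status == 200
        and not page.noindex
        and not page.robots_blocked
        and page.canonical == page.url
    )

pages = [
    Page("https://example.com/", 200, "https://example.com/", False, False),
    Page("https://example.com/old", 301, "https://example.com/new", False, False),
    Page("https://example.com/shoes?sort=price", 200, "https://example.com/shoes", False, False),
    Page("https://example.com/admin", 200, "https://example.com/admin", True, True),
]
sitemap_urls = [p.url for p in pages if belongs_in_sitemap(p)]
print(sitemap_urls)  # ['https://example.com/']
```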

About lastmod, changefreq, priority

lastmod (Most Important)

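Date of the last meaningful update, in W3C Datetime format (a profile of ISO 8601), for example 2026-01-15 or 2026-01-15T09:30:00+00:00. Google uses lastmod for re-crawl decisions only when it proves consistently accurate. Be honest: if your lastmod values don't match real changes, Google stops trusting them.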
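
A small helper can keep lastmod values in a valid format. This is a sketch under one stated assumption: timestamps without a timezone are treated as UTC. The name format_lastmod is illustrative.

```python
from datetime import date, datetime, timezone

def format_lastmod(value):
    """Return a W3C Datetime string for a date or datetime."""
    # datetime is a subclass of date, so check it first.
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: naive timestamps are stored in UTC.
            value = value.replace(tzinfo=timezone.utc)
        # Drop microseconds; sitemaps don't need that precision.
        return value.replace(microsecond=0).isoformat()
    return value.isoformat()

print(format_lastmod(date(2026, 1, 15)))               # 2026-01-15
print(format_lastmod(datetime(2026, 1, 15, 9, 30)))    # 2026-01-15T09:30:00+00:00
```

Feed it the timestamp of the last content change, not the time the sitemap was generated.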

changefreq

Google ignores this value. Other engines may still read it, so if you set it, set it realistically:

  • Homepage: weekly
  • Blog posts: monthly
  • Product pages: monthly
  • About/static pages: yearly

priority

Also ignored by Google. If you set it anyway, a common scheme is 1.0 for the homepage, 0.8 for important pages, and 0.5 for everything else. Don't make every page 1.0: priority is relative within your own site, so identical values carry no information.

Sitemap Types

1. Standard XML Sitemap

Lists pages — most common use case.

2. Sitemap Index

Lists multiple sitemaps. Used for large sites or organizational separation.

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-products.xml</loc></sitemap>
</sitemapindex>
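
An index like this is easy to generate alongside the child sitemaps. The sketch below uses Python's standard library; build_sitemap_index and the child file names are assumptions for illustration. It also writes the optional lastmod element for each child, which the protocol allows.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(children):
    """Build a sitemapindex from (loc, lastmod) tuples; lastmod may be None."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in children:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)

index_bytes = build_sitemap_index([
    ("https://example.com/sitemap-posts.xml", "2026-01-15"),
    ("https://example.com/sitemap-products.xml", None),
])
print(index_bytes.decode("utf-8"))
```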

3. Image Sitemap

Helps Google discover images. Useful for image-heavy sites.
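
Image entries live inside the normal url element, using Google's image namespace. The following sketch generates such a file; build_image_sitemap and the sample URLs are illustrative, and only the image:loc child is emitted.

```python
import xml.etree.ElementTree as ET

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMG = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize the sitemap namespace as the default and images as "image:".
ET.register_namespace("", SM)
ET.register_namespace("image", IMG)

def build_image_sitemap(pages):
    """pages maps each page URL to the list of image URLs shown on it."""
    urlset = ET.Element(f"{{{SM}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SM}}}url")
        ET.SubElement(url, f"{{{SM}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMG}}}image")
            ET.SubElement(image, f"{{{IMG}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")

image_xml = build_image_sitemap({
    "https://example.com/gallery": [
        "https://example.com/img/a.jpg",
        "https://example.com/img/b.jpg",
    ],
})
print(image_xml)
```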

4. Video Sitemap

Lists video content with metadata.

5. News Sitemap

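For news publishers. Google's documentation says to list only articles published in the last two days, with at most 1,000 per news sitemap.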

6. Hreflang Sitemap

For multilingual sites. Each URL entry declares its language and regional variants with xhtml:link elements, including a link to itself.
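
As a sketch of what hreflang annotations look like when generated, the helper below takes one group of translated pages and emits a url entry per variant, each listing every variant including itself. The function name and the language-to-URL mapping are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML = "http://www.w3.org/1999/xhtml"

ET.register_namespace("", SM)
ET.register_namespace("xhtml", XHTML)

def build_hreflang_sitemap(variants):
    """variants maps an hreflang code to the URL of that language version."""
    urlset = ET.Element(f"{{{SM}}}urlset")
    for page_url in variants.values():
        url = ET.SubElement(urlset, f"{{{SM}}}url")
        ET.SubElement(url, f"{{{SM}}}loc").text = page_url
        # Every entry lists all variants, itself included.
        for lang, href in variants.items():
            ET.SubElement(
                url, f"{{{XHTML}}}link",
                rel="alternate", hreflang=lang, href=href,
            )
    return ET.tostring(urlset, encoding="unicode")

hreflang_xml = build_hreflang_sitemap({
    "en": "https://example.com/en/",
    "de": "https://example.com/de/",
})
print(hreflang_xml)
```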

Submitting Your Sitemap

  1. Add to robots.txt:
    Sitemap: https://example.com/sitemap.xml
  2. Submit to Google Search Console — Sitemaps section
  3. Submit to Bing Webmaster Tools
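  4. Don't rely on pinging. Google retired its sitemap ping endpoint in 2023, so requests to
    https://www.google.com/ping?sitemap=URL
    no longer do anything. Keep lastmod accurate instead.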
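
To confirm the robots.txt step, you can check which sitemaps a robots.txt file actually declares. This sketch parses an inline sample with Python's urllib.robotparser (site_maps() needs Python 3.8 or later); the robots.txt content shown is made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content; in practice you would read your live file.
robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns None when no Sitemap lines are present.
declared = parser.site_maps() or []
print(declared)  # ['https://example.com/sitemap.xml']

expected = "https://example.com/sitemap.xml"
print("declared" if expected in declared else "missing from robots.txt")
```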

Common Sitemap Mistakes

1. Including 404s and Redirects

Auto-generated sitemaps often include broken or redirected URLs. Audit regularly.

2. Not Updating lastmod

If lastmod never changes when the content does, Google has no signal that a page is worth re-crawling.

3. Submitting Multiple Versions

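Submitting sitemaps for both HTTP and HTTPS, or for both www and non-www, sends Google conflicting signals. Pick the canonical origin and list only its URLs.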
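
One quick way to catch mixed versions is to count the distinct scheme and host combinations inside a sitemap. The sketch below does that for an inline sample; origin_counts is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def origin_counts(sitemap_xml):
    """Count URLs per scheme://host found in a urlset document."""
    root = ET.fromstring(sitemap_xml)
    counts = Counter()
    for loc in root.findall("sm:url/sm:loc", NS):
        if loc.text:
            parts = urlsplit(loc.text.strip())
            counts[f"{parts.scheme}://{parts.netloc}"] += 1
    return counts

sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>http://www.example.com/contact</loc></url>
</urlset>"""

counts = origin_counts(sample)
print(dict(counts))
if len(counts) > 1:
    print("Mixed origins found; keep only the canonical one.")
```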

4. Hundreds of Tiny Sitemaps

Organize logically, but don't over-fragment.

5. Forgetting Index Updates

The sitemap index must reference the current child sitemaps. Remove entries for files that no longer exist and add new ones as you create them.

6. Including Pagination Pages with Same Content

For paginated lists, pages 2, 3, 4 and beyond rarely need to be listed. Generally include only the first page.

Monitoring Sitemap Health

In Google Search Console, check:

  • Submitted vs. indexed: aim for a 90%+ indexing rate as a rule of thumb
  • Errors and warnings: address them quickly
  • Page indexing report (formerly Coverage): see which URLs aren't indexed and why
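
The indexing rate is simply indexed URLs divided by submitted URLs. Here is a small sketch that applies this article's rule-of-thumb thresholds (90% as the target, below 80% as the point to investigate); the function names and labels are illustrative, and the counts would come from your Search Console reports.

```python
def indexing_ratio(submitted, indexed):
    """Return indexed / submitted as a float between 0 and 1."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of URLs")
    return indexed / submitted

def classify(ratio):
    """Map a ratio onto this article's rule-of-thumb thresholds."""
    if ratio >= 0.9:
        return "healthy"
    if ratio < 0.8:
        return "investigate"
    return "watch"

ratio = indexing_ratio(submitted=1200, indexed=1020)
print(f"{ratio:.0%} -> {classify(ratio)}")  # 85% -> watch
```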

Use our Sitemap Operations Tool to validate, fetch, and analyze sitemaps.

Pro Tips

  1. Generate dynamically — Sitemaps should reflect current site state
  2. Compress with gzip — Save bandwidth, name it sitemap.xml.gz
  3. One sitemap per content type — Better debugging
  4. Test before submitting — Validate XML, check for errors
  5. Monitor indexing ratio — Below 80% = investigate
  6. Update on every publish — Don't wait for scheduled regeneration
  7. Use lastmod truthfully — Build trust with crawlers
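
For the gzip tip, the size limit should be checked before compressing, since it applies to the uncompressed file. A minimal sketch, with gzip_sitemap as a hypothetical helper name:

```python
import gzip

MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024  # 52,428,800 bytes

def gzip_sitemap(xml_bytes):
    """Compress sitemap bytes, refusing files over the protocol limit."""
    if len(xml_bytes) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError("sitemap exceeds 50MB uncompressed; split it")
    return gzip.compress(xml_bytes)

xml_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://example.com/</loc></url>"
    b"</urlset>"
)
compressed = gzip_sitemap(xml_bytes)

# Round-trip check: decompressing returns the original document.
assert gzip.decompress(compressed) == xml_bytes
# The result would typically be written to a file named sitemap.xml.gz.
```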

Sitemap for SPAs (React, Vue, etc.)

Client-side rendered apps expose their routes only after JavaScript runs, so generate the sitemap on the server or at build time:

  • Generate sitemap server-side
  • Include all dynamic routes
  • Ensure pages return 200 server-side
  • Consider prerendering critical pages
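
Covering dynamic routes usually means expanding route patterns with real records at build time. The sketch below assumes routes are written as patterns like /blog/{slug} and that the records come from your database or CMS; expand_routes and the sample data are hypothetical.

```python
from urllib.parse import quote

def expand_routes(base_url, static_routes, dynamic_routes):
    """Return absolute URLs for static paths and filled-in route patterns.

    dynamic_routes maps a pattern such as "/blog/{slug}" to a list of
    dicts holding the values for each placeholder.
    """
    base = base_url.rstrip("/")
    urls = [base + path for path in static_routes]
    for pattern, records in dynamic_routes.items():
        for record in records:
            # Percent-encode values so they are safe inside a URL path.
            safe = {key: quote(str(value), safe="") for key, value in record.items()}
            urls.append(base + pattern.format(**safe))
    return urls

urls = expand_routes(
    "https://example.com",
    ["/", "/pricing"],
    {"/blog/{slug}": [{"slug": "xml-sitemaps"}, {"slug": "crawl budget"}]},
)
for url in urls:
    print(url)
```

The resulting list can be fed to whatever writes your sitemap file during the build or on the server.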

Conclusion

Your sitemap is a high-leverage SEO asset. A clean, accurate, regularly-updated sitemap helps Google index your best content efficiently. A bloated sitemap full of errors confuses crawlers and wastes crawl budget. Audit your sitemap today, exclude what shouldn't be there, and monitor it monthly. Your indexing rate will thank you.