Sitemaps in Headless WordPress with Next.js

Francis Agulto

In this tutorial, you will learn how to create sitemaps in headless WordPress using Next.js and the WP Sitemap Rest API plugin. (A quick note: if you need a foundational understanding of Next.js, please follow this tutorial by my main man Jeff Everhart.)

Previously, I wrote about what sitemaps are and why they are beneficial in this article, which you can reference here. In this blog post, we will jump right into creating them. By the end of this tutorial you will be able to:

  • Create a Sitemap in Next.js that dynamically pulls your URLs from WordPress
  • Add your Next.js pages and paths on your Sitemap including dynamic routes

Set Up Your WordPress Site

The first thing we are going to do is configure your WordPress site so that we have access to the endpoints our sitemaps need. We are going to use the WP Sitemap Rest API plugin. Download the plugin from the repo as a zip file. Next, in WP Admin, go to the Plugins > Add Plugins option at the top right and upload the plugin.

Once you install and activate that plugin, it exposes these 4 endpoints in your WordPress REST API:

/wp-json/sitemap/v1/totalpages
/wp-json/sitemap/v1/author?pageNo=1&perPage=1000
/wp-json/sitemap/v1/taxonomy?pageNo=1&perPage=1000&taxonomyType=category or tag
/wp-json/sitemap/v1/posts?pageNo=1&perPage=1000&postType=post or page

With this plugin in place, visiting “your.wordpresssite.com” with any of the 4 endpoints appended returns JSON data describing what that path reflects. An example from my demo site with the first endpoint added:

Easy peasy. Now on to the Next.js frontend.

Install Dependencies

Before we go into the code, we need to install a dependency called axios in our Next.js project. Axios gives us some helper functionality for making HTTP requests to WordPress from our Next.js front-end. To install it, run npm i axios in your terminal.

Environment Variables

There are a couple of endpoints we will have to set up as environment variables in our project before we create the functions needed.

In the root of your project, create a .env.local file. In this file, add your endpoints and how many items you want per sitemap. Note that for the frontend URL, because I am using a development environment on my local machine, I am using the Next.js default of localhost with port 3000.

NEXT_PUBLIC_WORDPRESS_API_URL=https://yourwordpresssite.com
NEXT_PUBLIC_FRONTEND_URL=http://localhost:3000
NEXT_PUBLIC_ITEM_PER_SITEMAP=50

Now that we have added our environment variables in the properly named file, let’s allow our project access to them via the process.env object.

In the root of your project, create a utils folder. In the utils folder, create a file called variables.js. In this file, add this code block:

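// Base URL of the WordPress site; the name must match the key in .env.local
export const wordpressUrl = process.env.NEXT_PUBLIC_WORDPRESS_API_URL;
// Base URL of the Next.js front-end
export const frontendUrl = process.env.NEXT_PUBLIC_FRONTEND_URL;
// Env values are strings, so convert to a number; fall back to 50 items per sitemap
export const sitemapPerPage = Number(process.env.NEXT_PUBLIC_ITEM_PER_SITEMAP) || 50;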

These exports read the values from .env.local through the process.env object, so the rest of the project can import them from one place.
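Values read from process.env are always strings (or undefined), so it can help to normalize them in one place. Here is a minimal sketch of that idea, assuming a hypothetical helper named readSitemapConfig that is not part of the original project:

```javascript
// Hypothetical helper (not in the original project): normalizes the three
// environment variables used in this tutorial into predictable values.
function readSitemapConfig(env) {
  // Remove trailing slashes so paths can be appended with a single "/".
  const stripTrailingSlash = (value) => (value || "").replace(/\/+$/, "");

  // Environment values are strings; parse the number explicitly.
  const perPage = Number.parseInt(env.NEXT_PUBLIC_ITEM_PER_SITEMAP, 10);

  return {
    wordpressUrl: stripTrailingSlash(env.NEXT_PUBLIC_WORDPRESS_API_URL),
    frontendUrl: stripTrailingSlash(env.NEXT_PUBLIC_FRONTEND_URL),
    // Fall back to 50 when the value is missing or not a positive integer.
    sitemapPerPage: Number.isInteger(perPage) && perPage > 0 ? perPage : 50,
  };
}

// Example with made-up values; NEXT_PUBLIC_ITEM_PER_SITEMAP is left unset.
const config = readSitemapConfig({
  NEXT_PUBLIC_WORDPRESS_API_URL: "https://yourwordpresssite.com/",
  NEXT_PUBLIC_FRONTEND_URL: "http://localhost:3000",
});
console.log(config);
```

On the server you could call this with process.env. In browser bundles, Next.js only inlines direct process.env.NEXT_PUBLIC_* references, so a helper like this is best kept to server-side code such as getServerSideProps.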

Next.js sitemap functions

We are going to need a few functions to dynamically grab the total number of WordPress URLs, make them available to your sitemap index page, and show them in the browser along with our Next.js front-end URLs.

The first step is to create a folder called lib in the root of the project. In the new lib folder, create a file called getTotalCounts.js and add this code block:

import axios from "axios";
import { wordpressUrl } from "../utils/variables";

export default async function getTotalCounts() {
  const res = await axios.get(`${wordpressUrl}/wp-json/sitemap/v1/totalpages`);
  const data = res.data; // axios has already parsed the JSON body
  if (!data) return [];
  const propertyNames = Object.keys(data);
  let excludeItems = ["user"];
  //if you want to remove any item from sitemap, add it to excludeItems array
  let totalArray = propertyNames
    .filter((name) => !excludeItems.includes(name))
    .map((name) => {
      return { name, total: data[name] };
    });

  return totalArray;
}

This function fetches the total counts for each content type from your WordPress site and returns them as an array of { name, total } objects. If you want to exclude any content type from your sitemap, add its name to the excludeItems array as commented.
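To see what that transformation does without a WordPress site, here is a small sketch of the same filter-and-map step as a pure function. The helper name toTotalArray and the sample response are assumptions for illustration; the real keys depend on what your totalpages endpoint returns.

```javascript
// Pure version of the transformation inside getTotalCounts, so it can be
// tried without a network call. toTotalArray is a hypothetical name.
function toTotalArray(data, excludeItems = ["user"]) {
  if (!data) return [];
  return Object.keys(data)
    .filter((name) => !excludeItems.includes(name))
    .map((name) => ({ name, total: data[name] }));
}

// Assumed response shape: one key per content type, each holding a count.
const sampleResponse = { post: 120, page: 8, category: 5, user: 3 };

const totals = toTotalArray(sampleResponse);
// totals now holds entries for post, page and category;
// "user" was filtered out by the excludeItems default.
console.log(totals);
```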

The next thing we need to create is a way to get the sitemap pages and return them for the specific type.

Back in our utils folder, create a file called getSitemapPages.js and add this code block:

import { frontendUrl, sitemapPerPage } from "./variables";

export default function getSitemapPages(item) {
  const items = [];
  for (let i = 1; i <= Math.ceil(item.total / sitemapPerPage); i++) {
    let url = `${frontendUrl}/sitemap/${item.name}_sitemap${i}.xml`;
    items.push(
      `
        <sitemap>
          <loc>${url}</loc>
        </sitemap>
      `
    );
  }
  return items.join("");
}

This file imports the variables we declared in our variables.js file. It then exports a default function that receives one item from the array returned by the total counts function. The function divides the item’s total by the number of items per sitemap, rounds up, and builds one front-end URL per page from the item name and page number. Each URL is wrapped in <sitemap> and <loc> XML tags, and the entries are joined into a single string.
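The paging arithmetic is easier to see with numbers. This sketch uses a hypothetical helper, listSitemapUrls, that returns plain URLs instead of XML, with made-up totals:

```javascript
// Hypothetical helper: lists the child sitemap URLs for one content type,
// using the same Math.ceil paging logic as getSitemapPages.
function listSitemapUrls(item, perPage, baseUrl) {
  const pageCount = Math.ceil(item.total / perPage);
  const urls = [];
  for (let i = 1; i <= pageCount; i++) {
    urls.push(`${baseUrl}/sitemap/${item.name}_sitemap${i}.xml`);
  }
  return urls;
}

// 120 posts at 50 per sitemap: 120 / 50 = 2.4, rounded up to 3 sitemaps.
const urls = listSitemapUrls(
  { name: "post", total: 120 },
  50,
  "http://localhost:3000"
);
console.log(urls);
```

The last sitemap in the list holds the remainder, in this case 20 posts.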

We now have the necessary functions in place to generate our sitemap pages and the total number of them. The next step is to create a sitemap index page.

Sitemap Index Page

We are going to take advantage of Next.js and its page-based file routing. This means that any file you add to the pages directory of your project automatically corresponds to an available route.

Create a file in the pages directory of your project called sitemap.xml.js and add this code block:

import getSitemapPages from "../utils/getSitemapPages";
import getTotalCounts from "../lib/getTotalCounts";
export default function SitemapIndexPage() {
  return null;
}
export async function getServerSideProps({ res }) {
  const details = await getTotalCounts();

  let sitemapIndex = `<?xml version='1.0' encoding='UTF-8'?>
  <sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"
           xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
     ${details.map((item) => getSitemapPages(item)).join("")}
  </sitemapindex>`;
  res.setHeader("Content-Type", "text/xml; charset=utf-8");
  res.setHeader(
    "Cache-Control",
    "public, s-maxage=600, stale-while-revalidate=600"
  );
  res.write(sitemapIndex);
  res.end();
  return { props: {} };
}


The first thing to notice at the top of the file is the import of the two functions we just created, which generate the sitemap entries for each WordPress content type and fetch the total counts. The page component itself returns null. The real work happens in getServerSideProps, which runs on the server at request time so the sitemap index stays up to date (the Cache-Control header lets a CDN serve a cached copy for up to 600 seconds).

Next, I copied a sitemap index example and formatted it as a template string. The function sets the response headers, writes the XML to the response, ends it, and returns empty props.
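Since the same header-and-write steps come up in every sitemap page in this tutorial, they could be pulled into one function. This is only a sketch: sendXml is a hypothetical helper, and the fake response object exists just so the example can run outside Next.js.

```javascript
// Hypothetical helper: writes an XML string to a Node.js response object
// with the same headers used in the sitemap index page.
function sendXml(res, xml, maxAgeSeconds = 600) {
  res.setHeader("Content-Type", "text/xml; charset=utf-8");
  res.setHeader(
    "Cache-Control",
    `public, s-maxage=${maxAgeSeconds}, stale-while-revalidate=${maxAgeSeconds}`
  );
  res.write(xml);
  res.end();
}

// Minimal stand-in for the response object, used only to try the helper.
const fakeRes = {
  headers: {},
  body: "",
  ended: false,
  setHeader(name, value) {
    this.headers[name] = value;
  },
  write(chunk) {
    this.body += chunk;
  },
  end() {
    this.ended = true;
  },
};

sendXml(fakeRes, "<sitemapindex></sitemapindex>");
console.log(fakeRes.headers, fakeRes.ended);
```

Inside getServerSideProps you would pass the real res object and still return { props: {} } afterwards.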

Let’s get stoked now and see what this looks like in the browser. Go to your terminal, run npm run dev, and visit http://localhost:3000/sitemap.xml. You should see this:

Awesome!! We now have a dynamic sitemap index in a headless WordPress setup. The next step is to make each URL listed in the index route to its own sitemap detail page.

Dynamic Sitemap Page Routes

We need a couple of things to make the dynamic page routes work for our sitemap. First, we need a way to generate the sitemap paths to their detail page.

Go back to the utils folder we created earlier. In this folder, create a generateSitemapPaths.js file and add this code block:

import { frontendUrl } from "./variables";

export default function generateSitemapPaths(array) {
  const items = array.map(
    (item) =>
      `
            <url>
                <loc>${frontendUrl + item?.url}</loc>
                ${
                  item?.post_modified_date
                    ? `<lastmod>${
                        new Date(item?.post_modified_date)
                          .toISOString()
                          .split("T")[0]
                      }</lastmod>`
                    : ""
                }
            </url>
            `
  );
  return items.join("");
}

This file imports our front-end URL variable and then exports a default function that takes an array of items. The function maps over the array and, for each item, returns a <url> entry whose <loc> is the item’s URL appended to your front-end URL. If the item has a modified date, a <lastmod> tag with that date in YYYY-MM-DD format is included. Now that we have the function we need, let’s create the dynamic route page.
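One thing worth considering here: the sitemap protocol expects characters such as & in URLs to be entity-escaped. Below is a minimal sketch of how a single entry could be built with escaping. The helpers escapeXml and toUrlEntry are hypothetical, and the sample item (including its date format) is made up; only the url and post_modified_date field names come from the code above.

```javascript
// Hypothetical helper: escapes the five characters that are special in XML.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: builds one <url> entry, mirroring generateSitemapPaths.
function toUrlEntry(baseUrl, item) {
  const lastmod = item.post_modified_date
    ? `<lastmod>${
        new Date(item.post_modified_date).toISOString().split("T")[0]
      }</lastmod>`
    : "";
  return `<url><loc>${escapeXml(baseUrl + item.url)}</loc>${lastmod}</url>`;
}

// Made-up item with a query string to show the escaping of "&".
const entry = toUrlEntry("http://localhost:3000", {
  url: "/?p=1&preview=true",
  post_modified_date: "2022-03-01T10:00:00Z",
});
console.log(entry);
```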

Next.js makes it easy to create a dynamic route when part of the path, like a slug, changes per request. The naming convention for this is the bracket syntax. Create a folder in the pages directory called sitemap. In this folder, create a file called [slug].js.

In the [slug].js file, add this code block:

import getSitemapPageUrls from "../../lib/getSitemapPageUrls";
import getTotalCounts from "../../lib/getTotalCounts";
import generateSitemapPaths from "../../utils/generateSitemapPaths";
export default function SitemapTagPage() {
  return null;
}
export async function getServerSideProps({ res, params: { slug } }) {
  let isXml = slug.endsWith(".xml");
  if (!isXml) {
    return {
      notFound: true,
    };
  }
  let slugArray = slug.replace(".xml", "").split("_");
  let type = slugArray[0];
  // Extract the page number; ?. avoids a crash when the slug has no digits
  let pageNo = slugArray[1]?.match(/(\d+)/)?.[0] ?? null;
  let page = pageNo ? parseInt(pageNo, 10) : null;
  let possibleTypes = await getTotalCounts();
  if (!possibleTypes.some((e) => e.name === type)) {
    return {
      notFound: true,
    };
  }
  let pageUrls = await getSitemapPageUrls({ type, page });
  if (!pageUrls?.length) {
    return {
      notFound: true,
    };
  }
  let sitemap = `<?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
  xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    ${generateSitemapPaths(pageUrls)}
  </urlset>`;
  res.setHeader("Content-Type", "text/xml; charset=utf-8");
  res.setHeader(
    "Cache-Control",
    "public, s-maxage=600, stale-while-revalidate=600"
  );
  res.write(sitemap);
  res.end();
  return { props: {} };
}

There is quite a bit of code in this file. At the top, we import the functions we need for the dynamic sitemap paths.
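One of those imports, getSitemapPageUrls, lives in the lib folder and is not shown in this walkthrough. Here is a minimal sketch of what such a function could look like. The mapping from sitemap type to endpoint is an assumption based on the four endpoints listed earlier, and the helper names and injected fetchJson parameter are hypothetical.

```javascript
// Sketch only: picks one of the plugin endpoints listed earlier based on the
// sitemap type. Which types map to which endpoint is an assumption.
function buildSitemapEndpoint({ wordpressUrl, type, page, perPage }) {
  const paging = `pageNo=${page}&perPage=${perPage}`;
  const safeType = encodeURIComponent(type);
  if (type === "category" || type === "tag") {
    return `${wordpressUrl}/wp-json/sitemap/v1/taxonomy?${paging}&taxonomyType=${safeType}`;
  }
  if (type === "user") {
    return `${wordpressUrl}/wp-json/sitemap/v1/author?${paging}`;
  }
  // Everything else is treated as a post type, such as "post" or "page".
  return `${wordpressUrl}/wp-json/sitemap/v1/posts?${paging}&postType=${safeType}`;
}

// fetchJson is injected so the sketch can run without a network call.
// In the project it could be: (url) => axios.get(url).then((res) => res.data)
async function getSitemapPageUrls({ type, page }, { wordpressUrl, perPage, fetchJson }) {
  const url = buildSitemapEndpoint({ wordpressUrl, type, page, perPage });
  const data = await fetchJson(url);
  // The page code expects an array, so fall back to an empty one.
  return Array.isArray(data) ? data : [];
}

const endpoint = buildSitemapEndpoint({
  wordpressUrl: "https://yourwordpresssite.com",
  type: "post",
  page: 1,
  perPage: 50,
});
console.log(endpoint);
```

In the actual lib/getSitemapPageUrls.js file, the second argument would come from utils/variables and axios, and the function would be the default export, so the call in [slug].js can stay getSitemapPageUrls({ type, page }).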

Next, a default function called SitemapTagPage is exported. It returns null, and the async getServerSideProps function does the work, since we want this page to be server-side rendered on every request as well. From its context argument, we destructure the response object and the slug from params.

Our variable isXml is true when the slug ends with .xml. If the slug is not an XML path, we return notFound so that Next.js responds with a 404.

In the middle of the file, we declare several variables: the array of slug parts, the type, the page number (extracted from the string with a regex), and the possible types returned by the total counts function.
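The slug parsing can also be done with a single regex, which makes the expected URL shape explicit. This is a sketch with a hypothetical helper named parseSitemapSlug. It assumes type names contain only letters, digits, and hyphens, which may not hold for every custom post type.

```javascript
// Hypothetical helper: parses slugs shaped like "<type>_sitemap<page>.xml".
// Returns null when the slug does not match that shape.
function parseSitemapSlug(slug) {
  const match = /^([a-z0-9-]+)_sitemap(\d+)\.xml$/i.exec(slug);
  if (!match) return null;
  return { type: match[1], page: Number.parseInt(match[2], 10) };
}

const parsed = parseSitemapSlug("post_sitemap1.xml");
console.log(parsed); // type is "post", page is 1

// Slugs without a page number or without the .xml ending are rejected.
console.log(parseSitemapSlug("post_sitemap.xml")); // null
console.log(parseSitemapSlug("post.html")); // null
```

A null result maps naturally onto the notFound return in getServerSideProps.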

Lastly, we validate that the requested type matches one listed on our sitemap index page and that the page returned some URLs; otherwise, we return a 404. Once that is done, we format our XML tags and send the appropriate response.

Let’s try this now and see if it works in the browser. Go back to the terminal and run npm run dev. Once the sitemap index page shows in the browser, grab any URL on the sitemap and paste it into the address bar:

I chose to grab my post page URL at sitemap/post_sitemap1.xml. I followed that link in the address bar and got the post sitemap details! Stoked!! 🎉

Next.js Pages

Finally, we need to account for our Next.js pages on our front-end. In order to do this, let’s go to our pages directory and add a file called sitemapnextjs.xml.js. In this file, add this code block:

import React from "react";
import * as fs from "fs";
const Sitemap = () => {
  return null;
};

export const getServerSideProps = async ({ res }) => {
  const BASE_URL = "http://localhost:3000";

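  // Read the top-level entries of the pages directory
  const staticPaths = fs
    .readdirSync("pages")
    .filter((staticPage) => {
      // Files and folders that should not be listed in this sitemap
      return ![
        "api",
        "about",
        "_app.js",
        "_document.js",
        "404.js",
        "sitemap",
        "sitemap.xml.js",
        "sitemapnextjs.xml.js",
      ].includes(staticPage);
    })
    .map((staticPagePath) => {
      // Strip the file extension so "contact.js" becomes /contact;
      // "index.js" maps to the site root
      const route = staticPagePath.replace(/\.(js|jsx|ts|tsx)$/, "");
      return `${BASE_URL}/${route === "index" ? "" : route}`;
    });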

  const dynamicPaths = [`${BASE_URL}/name/1`, `${BASE_URL}/name/2`];

  const allPaths = [...staticPaths, ...dynamicPaths];

  const sitemap = `<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      ${allPaths
        .map((url) => {
          return `
            <url>
              <loc>${url}</loc>
              <lastmod>${new Date().toISOString()}</lastmod>
              <changefreq>monthly</changefreq>
              <priority>1.0</priority>
            </url>
          `;
        })
        .join("")}
    </urlset>
  `;

  res.setHeader("Content-Type", "text/xml");
  res.write(sitemap);
  res.end();

  return {
    props: {},
  };
};

export default Sitemap;

At the top of the file, we import React, along with fs to access the file system. The component returns null because we do not want it to render anything. Instead, we export getServerSideProps as a const and pass in the response object, just like we did in the sitemap index page, so the sitemap is server-side rendered with fresh, up-to-date data.

Then we declare our base URL variable, in this case, I am using the dev server off localhost:3000.

Next, we use fs to read the entries in our pages directory and filter out the ones we do not want on the sitemap. I just added the default pages Next.js comes with, plus the sitemap files themselves, as an example.

The next step is to account for the dynamic paths that you have in Next.js, apart from WordPress. We set a const for our dynamic paths and make it equal to an array that combines our base URL with each hard-coded path. (I just named it “name”, but you can name it after whatever your dynamic route file is.)
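If you have more than a couple of dynamic pages, the array can be generated from a list of ids instead of typed out by hand. This is a small sketch; buildDynamicPaths is a hypothetical helper, and where the ids come from (a file, an API, a database) is up to your project.

```javascript
// Hypothetical helper: builds full URLs for one dynamic route from a list
// of ids, for example pages/name/[id].js.
function buildDynamicPaths(baseUrl, routeName, ids) {
  return ids.map((id) => `${baseUrl}/${routeName}/${encodeURIComponent(id)}`);
}

// Produces the same two URLs as the hard-coded dynamicPaths array.
const dynamicPaths = buildDynamicPaths("http://localhost:3000", "name", [1, 2]);
console.log(dynamicPaths);
```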

The last thing we need to do is merge the static and dynamic paths into a single array with the spread operator.

Finally, we generate the sitemap in XML format, send it with the appropriate response header, and return our props. Jamstoked!! We now have all the paths and URLs in our Next.js front-end!!! 💥

Combining Next.js pages and WordPress pages in your sitemap index page

All that is left is adding our Next.js pages to our sitemap index alongside our WordPress pages. Go back to the sitemap.xml.js file and update it to match this code block:

import getSitemapPages from "../utils/getSitemapPages";
import getTotalCounts from "../lib/getTotalCounts";
export default function SitemapIndexPage() {
  return null;
}
export async function getServerSideProps({ res }) {
  const details = await getTotalCounts();

  let sitemapIndex = `<?xml version='1.0' encoding='UTF-8'?>
  <sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"
           xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
     ${details.map((item) => getSitemapPages(item)).join("")}
     <sitemap>
       <loc>http://localhost:3000/sitemapnextjs.xml</loc>
     </sitemap>
  </sitemapindex>`;
  res.setHeader("Content-Type", "text/xml; charset=utf-8");
  res.setHeader(
    "Cache-Control",
    "public, s-maxage=600, stale-while-revalidate=600"
  );
  res.write(sitemapIndex);
  res.end();
  return { props: {} };
}


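The one difference in this file is the <sitemap> entry we added, with the URL http://localhost:3000/sitemapnextjs.xml inside the <loc></loc> element. This will allow us to show the Next.js pages on our sitemap index. When you deploy, replace the localhost address with your production front-end URL.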
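To avoid a hard-coded host in the index, the entry could be built from the same front-end URL the WordPress entries use. Here is a rough sketch under that assumption. buildSitemapIndex is a hypothetical helper, and the frontendUrl constant stands in for the value exported from utils/variables.

```javascript
// Hypothetical helper: wraps a list of sitemap URLs in a sitemap index.
function buildSitemapIndex(sitemapUrls) {
  const entries = sitemapUrls
    .map((url) => `  <sitemap><loc>${url}</loc></sitemap>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</sitemapindex>",
  ].join("\n");
}

// Stand-in for the frontendUrl export from utils/variables.
const frontendUrl = "http://localhost:3000";

const index = buildSitemapIndex([
  `${frontendUrl}/sitemap/post_sitemap1.xml`,
  `${frontendUrl}/sitemapnextjs.xml`,
]);
console.log(index);
```

With this approach, changing NEXT_PUBLIC_FRONTEND_URL for production updates every entry in the index at once.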

Let’s see this in action in the browser. Go back to the terminal and run npm run dev. You should see this in the browser:

We now have our sitemap for Next.js pages on our index. Let’s see what all the details of our pages are in Next.js when we follow that URL:

Done!! Jamstoked! 🎉

We are finished with this walk-through tutorial! Jamstoke! We hope you take away an understanding of why sitemaps matter and how they inform search engines of your site’s page and path layout.

This is just one way of approaching sitemaps in headless WordPress. I would love to hear your thoughts and your own approaches and builds. Hit me up in our Discord!

Here is the demo once again on my GitHub so you can follow along, make it your own, and add on to it if you like.

I want to give a major shoutout and credit to Dipankar Maikap, who created the WP Sitemap Rest API plugin and assisted me in the creation of this tutorial.