Rails: Easy Sitemaps
A simple way to improve your SEO is to tell Google which pages are available. Sure, they crawl you automatically, but no crawler is perfect. The easiest way to give them the info they need is to create sitemaps and submit them to Google's Search Console tool.
The Controller
The simplest way to start with sitemaps in Rails is to create a sitemaps controller. To begin, it'll have two actions, but it can easily grow over time if necessary.
class SitemapsController < ApplicationController
  def index
    respond_to do |format|
      format.xml
    end
  end

  def pages
    respond_to do |format|
      format.xml
    end
  end
end
Note: If you require authentication in ApplicationController, make sure you skip it for this controller so crawlers can reach the sitemaps.
The index action will return a sitemap of sitemaps (a sitemap index). You can substitute the lastmod time with whatever makes sense for your app.
# index.xml.builder
xml.instruct!
xml.sitemapindex xmlns: "http://www.sitemaps.org/schemas/sitemap/0.9" do
  xml.sitemap do
    xml.loc sitemap_pages_url
    xml.lastmod Time.utc(2020, 12, 21, 11).strftime("%Y-%m-%dT%H:%M:%S+00:00")
  end
end
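One thing to watch with the hard-coded +00:00 in that format string: the time being formatted has to be in UTC already, or the offset will be wrong. Here is a minimal sketch of a helper that converts first. The name sitemap_lastmod is made up for illustration; it is not part of Rails or of the code above.

```ruby
# Hypothetical helper: formats a Time as a W3C datetime for <lastmod>.
# The "+00:00" suffix is literal text, so the time is converted to UTC first.
def sitemap_lastmod(time)
  time.getutc.strftime("%Y-%m-%dT%H:%M:%S+00:00")
end

# A time that is already in UTC keeps its clock value.
utc_example = sitemap_lastmod(Time.utc(2020, 12, 21, 11))

# A time with a -05:00 offset is shifted to UTC before formatting.
offset_example = sitemap_lastmod(Time.new(2020, 12, 21, 6, 0, 0, "-05:00"))
```

Both examples describe the same instant, so they format to the same string.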
The pages action will act as the individual sitemap referenced above in our sitemap of sitemaps. Here is an example of the sitemap for www.flippercloud.io.
# pages.xml.builder
xml.instruct!
xml.urlset xmlns: "http://www.sitemaps.org/schemas/sitemap/0.9" do
  xml.url do
    xml.loc root_url
    xml.lastmod Time.utc(2020, 12, 21, 11).strftime("%Y-%m-%dT%H:%M:%S+00:00")
  end
  xml.url do
    xml.loc documentation_url
    xml.lastmod Time.utc(2020, 12, 21, 11).strftime("%Y-%m-%dT%H:%M:%S+00:00")
  end
  xml.url do
    xml.loc sign_up_url
    xml.lastmod Time.utc(2020, 12, 21, 11).strftime("%Y-%m-%dT%H:%M:%S+00:00")
  end
  xml.url do
    xml.loc sign_in_url
    xml.lastmod Time.utc(2020, 12, 21, 11).strftime("%Y-%m-%dT%H:%M:%S+00:00")
  end
  xml.url do
    xml.loc password_reset_url
    xml.lastmod Time.utc(2020, 12, 21, 11).strftime("%Y-%m-%dT%H:%M:%S+00:00")
  end
end
Right now, we don't have many pages, so it is quite simple. Just remember, this builder view is plain old Ruby. If your pages are in the database, you can query and iterate them and set the lastmod time to updated_at or whatever makes sense.
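To make the "query and iterate" idea concrete, here is a rough sketch in plain Ruby that runs outside Rails. The Page struct is a hypothetical stand-in for a database-backed model, and urlset_xml is a made-up helper that builds the XML by hand. In a real builder view you would more likely loop with something like Page.find_each and call xml.url inside the block.

```ruby
# Hypothetical stand-in for a model with a URL and an updated_at timestamp.
Page = Struct.new(:url, :updated_at)

# Builds a urlset document with one <url> entry per page.
def urlset_xml(pages)
  entries = pages.map do |page|
    loc = page.url.encode(xml: :text) # escape &, < and > for XML
    lastmod = page.updated_at.getutc.strftime("%Y-%m-%dT%H:%M:%S+00:00")
    "  <url>\n    <loc>#{loc}</loc>\n    <lastmod>#{lastmod}</lastmod>\n  </url>\n"
  end

  %(<?xml version="1.0" encoding="UTF-8"?>\n) +
    %(<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n) +
    entries.join +
    "</urlset>\n"
end

pages = [
  Page.new("https://www.example.com/", Time.utc(2020, 12, 21, 11)),
  Page.new("https://www.example.com/search?q=a&page=2", Time.utc(2021, 1, 5, 9, 30))
]
sitemap = urlset_xml(pages)
```

The example.com URLs are placeholders. The second one is there to show that an ampersand in a URL has to be escaped, which the builder view does for you.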
Down the road, if you add more sitemaps (say one for blog posts and categories) all you need to do is create another action and add the sitemap to your sitemaps index.
The last piece to make your sitemaps work is a few routes:
get "/sitemap.xml", to: "sitemaps#index", as: :sitemaps
get "/sitemap-pages.xml", to: "sitemaps#pages", as: :sitemap_pages
The Tests
We can test these routes manually in a browser, but since we want to ensure we don't accidentally break these, we'll drop some tests in.
require 'test_helper'

class SitemapsControllerTest < ActionDispatch::IntegrationTest
  test "GET index renders list of sitemaps" do
    get sitemaps_path, env: {"HOST" => "www.flippercloud.io"}
    assert_response :success
    assert_select "sitemapindex sitemap loc", "http://www.flippercloud.io/sitemap-pages.xml"
  end

  test "GET pages renders sitemap" do
    get sitemap_pages_path, env: {"HOST" => "www.flippercloud.io"}
    assert_response :success
    assert_select "urlset url loc", "http://www.flippercloud.io/"
    assert_select "urlset url loc", "http://www.flippercloud.io/docs"
    assert_select "urlset url loc", "http://www.flippercloud.io/signup"
    assert_select "urlset url loc", "http://www.flippercloud.io/signin"
    assert_select "urlset url loc", "http://www.flippercloud.io/password-reset"
  end
end
Submitting to Google Search Console
While these changes are deploying, you can add your app as a property in Search Console. Then, once they are out in production, you can head to the Sitemaps page and submit them to Google.
A couple minutes of work and now Google has a much better idea of what is available on your site and when it was last updated.
Don't Forget Other Robots
Now that you've submitted these to Google, the last step is to declare them in your robots.txt file. This makes it easy for other robots (DuckDuckGo, etc.) to pick them up.
It's as easy as adding a few lines like this:
Sitemap: https://www.flippercloud.io/sitemap.xml
Sitemap: https://www.flippercloud.io/sitemap-pages.xml
More Complex Example
Sitemaps can be as complex as you need. For example, the ones for speakerdeck.com are quite a bit more complicated.
class SitemapsController < ApplicationController
  layout nil

  def index
    start_date = Time.utc(2010, 10)
    end_date = Time.now.to_date

    # Collect the first day of every month from the start date through now.
    @months = []
    date = start_date.beginning_of_month
    while date <= end_date.beginning_of_month do
      @months << date.to_date
      date = date.advance(months: 1)
    end

    respond_to do |format|
      format.xml
    end
  end

  def month
    headers['Content-Type'] = 'application/xml'
    month_start = Time.utc(params[:year], params[:month], 1).beginning_of_month
    month_end = month_start.end_of_month

    # A single sitemap file may list at most 50,000 URLs.
    @talks = Talk.published
      .where("created_at BETWEEN :month_start AND :month_end", month_start: month_start, month_end: month_end)
      .sorted
      .limit(50_000)
      .includes(:owner)

    respond_to do |format|
      format.xml
    end
  end
end
This generates a sitemap of sitemaps with a sitemap per month of all published and publicly viewable talks. The index view then iterates the months:
xml.instruct!
xml.sitemapindex xmlns: "http://www.sitemaps.org/schemas/sitemap/0.9" do
  @months.each do |month|
    xml.sitemap do
      xml.loc sitemap_url(year: month.year, month: month.month)
      xml.lastmod month
    end
  end
end
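The month list that feeds this view is easy to get subtly wrong (off by one at either end), so here is the same idea as a small plain-Ruby sketch using only the standard library. The months_between name is made up for illustration; the controller above uses ActiveSupport's beginning_of_month and advance instead.

```ruby
require "date"

# Returns the first day of every month from start_date's month through
# end_date's month, inclusive at both ends.
def months_between(start_date, end_date)
  current = Date.new(start_date.year, start_date.month, 1)
  last = Date.new(end_date.year, end_date.month, 1)

  months = []
  while current <= last
    months << current
    current = current >> 1 # Date#>> moves forward by one month
  end
  months
end

months = months_between(Date.new(2010, 10, 1), Date.new(2011, 1, 15))
month_labels = months.map(&:iso8601)
```

Each entry is a Date, so its string form is a plain YYYY-MM-DD date, which is one of the formats lastmod accepts.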
And the individual month sitemap queries for the talks published during that time frame and iterates them to generate the urlset:
xml.instruct!
xml.urlset xmlns: "http://www.sitemaps.org/schemas/sitemap/0.9" do
  @talks.each do |talk|
    xml.url do
      xml.loc owner_talk_url(talk.owner, talk)
      xml.lastmod talk.updated_at.strftime("%Y-%m-%dT%H:%M:%S+00:00")
    end
  end
end
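The limit(50_000) in the month action lines up with the sitemap protocol's cap of 50,000 URLs per sitemap file. If a single month ever had more talks than that, the extras would silently be left out. One possible way around it, sketched here in plain Ruby with a hypothetical sitemap_chunks helper, is to split the list into several sitemaps and list each one in the index.

```ruby
# The sitemap protocol allows at most 50,000 URLs in one sitemap file.
MAX_URLS_PER_SITEMAP = 50_000

# Hypothetical helper: splits a list of URLs into groups that each fit
# in a single sitemap file.
def sitemap_chunks(urls, max: MAX_URLS_PER_SITEMAP)
  urls.each_slice(max).to_a
end

# Placeholder URLs, just to show the chunk sizes.
urls = (1..120_001).map { |id| "https://www.example.com/talks/#{id}" }
chunk_sizes = sitemap_chunks(urls).map(&:size)
```

With 120,001 URLs this produces three groups: two full ones and a final group with the remainder.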
Hat tip to Esteve Castells for the recommendation to do a sitemap per month for Speaker Deck.