Redirects for SEO
The importance of redirects in search engine rankings.
In today's world of a bazillion webpages, your ranking in search results is critical. But one area of site administration and search engine optimization that is often overlooked is the site's URLs. Consistency is key, but how do you enforce consistency when others might link to your site using URL forms that are not uniform in terms of upper/lower case, trailing slashes, file extensions, etc.?
The answer is using redirects and/or rewrites of your URLs so that search engines resolve to a single, canonical URL for each page. In this article we will be discussing the Apache directives Redirect and RedirectMatch.
DIFFERENCES & MEANINGS:
The difference between Redirect and RedirectMatch is that Redirect only matches a simple URL-path, while RedirectMatch allows you to use regex pattern matching. This makes RedirectMatch similar in functionality to rewrites (mod_rewrite), but RedirectMatch is generally lighter in terms of server load.
Also, a 301 is a permanent redirect, and a 302 is a temporary redirect.
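As a minimal sketch of that difference, assuming Apache's mod_alias is loaded (the paths and the example.com host below are hypothetical, chosen only for illustration):

```apache
# Redirect matches a plain URL-path prefix; anything after the prefix
# is appended to the target, so /old-blog/post-1 goes to /blog/post-1.
Redirect 301 /old-blog/ https://www.example.com/blog/

# RedirectMatch takes a regular expression; captured groups are
# available in the target as $1, $2, and so on.
# Here /news/2019/some-story goes to /archive/2019/some-story.
RedirectMatch 301 ^/news/(\d{4})/(.*)$ https://www.example.com/archive/$1/$2

# With no status given, Redirect issues a temporary (302) redirect.
Redirect /sale/ https://www.example.com/holiday-sale/
```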
SEARCHING FOR GODOT
To preserve search engine ranking you should always use 301 redirects. Ideally for SEO you want some consistency in protocol and FQDN, so it goes well beyond just enforcing SSL/TLS.
So let's say your "complete" URL for your home page is:
https://www.example.com/index.php
Your server, though, may not be case sensitive, may alias all subdomains to the root domain, and may serve the index file (index.php here) when no file is named explicitly in the path, such that you get to the same location by entering only
example.com
into the browser's location field. While that certainly makes it easy for a user to directly type in just example.com
it creates potential problems for SEO. If this is how your server is set up then ALL of these URLs resolve to the exact same content:
https://www.example.com/index.php
https://www.example.com/index
https://www.example.com/
https://example.com/index.php
https://example.com/index
https://example.com/
http://www.example.com/index.php
http://www.example.com/index
http://www.example.com/
http://example.com/index.php
http://example.com/index
http://example.com/
But even though your server may think these are all the same, and the content served is identical, Google considers them all to be unique URLs. When it determines that these 12 URLs serve duplicate content, link and ranking signals can be split among them, and your position in search results can suffer.
And it's not enough to just ensure all your internal links are specified as the preferred URL. Some fan of your site is undoubtedly going to post a link, and write the link as http://example.com
when you'd prefer https://www.example.com
so you need Google to know that http://example.com
should be interpreted as your preferred URL, and one key way to do this is with permanent redirects.
You can make a Redirect
or a RedirectMatch
permanent (i.e. code 301) just by adding 301 to the line:
Redirect 301 /here/ https://www.example.com/there/
RedirectMatch 301 /here/(.*) https://www.example.com/there/$1
Also, for a permanent redirection these variations:
Redirect 301
Redirect Permanent
RedirectPermanent
all mean exactly the same thing.
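To make the scheme and hostname part concrete, here is one possible layout, sketched under several assumptions: the rules live in the main server configuration as virtual hosts, mod_alias and mod_ssl are loaded, https://www.example.com is the preferred form, and the certificate paths are placeholders rather than real locations.

```apache
# Plain HTTP, either hostname: send everything to the canonical HTTPS host.
# The rest of the requested path is appended to the target automatically.
<VirtualHost *:80>
    ServerName www.example.com
    ServerAlias example.com
    Redirect permanent / https://www.example.com/
</VirtualHost>

# HTTPS on the bare domain: redirect to the www form.
<VirtualHost *:443>
    ServerName example.com
    SSLEngine on
    # Placeholder paths; substitute your own certificate and key.
    SSLCertificateFile /path/to/example.com.crt
    SSLCertificateKeyFile /path/to/example.com.key
    Redirect permanent / https://www.example.com/
</VirtualHost>
```

A separate virtual host for www.example.com on port 443 would then serve the actual content. These rules only collapse the scheme and hostname variations; the /index and /index.php forms would still need their own handling.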
a ROSE is not a Rose is not a rose is not a RoSe
I haven't even gotten into trailing slashes and case sensitivity on directories or parameters, but these make a difference too. The only time Google does not care about case or trailing slash is in the root domain.
All of these are identical to Google:
www.MyFunDomain.com/
www.MyFunDomain.com
www.myfundomain.com/
www.myfundomain.com
WWW.MyFuNdOmAiN.COM
This is because the spec for domain names is case insensitive. But these:
ex.com/MyPath/
ex.com/MyPath
ex.com/mypath/
ex.com/mypath
ex.com/mYpAtH
are all considered DIFFERENT even if your system or server considers them to be the same. While a trailing slash is not required after the domain name, it IS significant on every path:
ex.com/mypath/
implies ex.com/mypath/index.html
but
ex.com/mypath
implies ex.com/mypath.html
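The slash handling above could be enforced with rules along these lines. This is only a sketch: mypath is the directory name from the example, www.example.com is assumed to be the preferred host, and the second rule assumes that every URL on the site without a file extension is a directory, which may not hold for your site.

```apache
# Send the slashless form of one known directory to the slashed form.
RedirectMatch 301 ^/mypath$ https://www.example.com/mypath/

# Broader alternative: add a trailing slash to any path that contains
# no dot and does not already end in a slash.
# /a/b is redirected to /a/b/ ; /a/b/ and /page.html are left alone.
RedirectMatch 301 ^(/[^.]*[^/.])$ https://www.example.com$1/
```

For directories that actually exist on disk, Apache's mod_dir (the DirectorySlash setting, on by default) normally issues this kind of redirect already, so explicit rules like these matter mostly for paths the server maps in some other way.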
BEST PRACTICES
Some solutions to this issue are:
1) Make a house standard that all paths and file names follow a strict set of case rules. The easiest is to require lower case ONLY. Or UPPERCASE for paths and lowercase for files.
2) Set up rewrite rules to make permanent 301 redirects for all variations in scheme, subdomain, trailing path slash, and file extension. If you make everything lowercase, it makes it easy to use regex to force the URL to all lowercase, regardless of how it was requested.
Because the possibilities are practically endless, and Redirect
requires case sensitive paths, RedirectMatch
or RewriteRule (mod_rewrite)
are better choices, so that every possible variation in:
https://www.example.com/sitepath/
is shown to the Google crawlers exactly that way, and not
http://example.com/SitePath/index.php
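One possible way to force lowercase paths is sketched below. It uses mod_rewrite's built-in tolower map, because a RedirectMatch target can only reuse captured text as-is. The sketch assumes mod_rewrite is loaded, that the rules sit in the main server or virtual host configuration (RewriteMap is not permitted in .htaccess), and that www.example.com is the preferred host; the map name lc is an arbitrary label.

```apache
RewriteEngine On

# Define a map named "lc" backed by the internal tolower function.
RewriteMap lc int:tolower

# Only act when the requested path contains an uppercase letter,
# so an already-lowercase URL is never redirected to itself.
RewriteCond %{REQUEST_URI} [A-Z]

# Redirect permanently to the same path, lowercased.
# /SitePath/ becomes https://www.example.com/sitepath/
RewriteRule ^(.*)$ https://www.example.com${lc:$1} [R=301,L]
```

Only the path is lowercased here; any query string is passed along unchanged.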
The next installment of this series on SEO covers using redirect or rewrite rules in the main conf file as opposed to the .htaccess file, and using regex for Rewrite and RedirectMatch rules.