WordPress SEO – The Why & How of the Robots.txt File

June 30, 2010 · 21 comments


Setting up a robots.txt file for your WordPress website or blog is an SEO trick that sounds full of mystique but is really straightforward.

There are a number of reasons a robots.txt file affects WordPress SEO, but the biggest one is that Google HATES (read: penalizes) duplicate content.

You may say “but I only post stuff once, ever”… however, there are certain components within websites that can give the opposite impression.

The biggest culprits are categories and tags, which are important to use but pose a risk because each one gets its own readable page.

Let’s say I wasn’t paying close attention and ended up with two tags: “social media resources” and “social media tools”. Because these are so similar, there is a good chance that the exact same posts have been tagged with both, so if you opened the two tag pages they would show identical lists of posts. Unfortunately, when two pages look the same visually (even though they are logically different)… Google starts slapping hands!

So, How Is a Robots.txt File Going To Help WordPress SEO?

A robots.txt file is a simple little file that goes in the root directory of your blog/website and tells Google (and other “bots” or “spiders”) which parts of your site they may and may not crawl. This essentially hangs a “do not disturb” sign on dangerous areas. It is a request that well-behaved bots honor rather than a lock, and it does not stop them from reading the rest of your site.

Think of a robots.txt file for your site as a “duplicate page filter” for the Search Engine Bots.
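
If you are curious how a bot actually reads these rules, here is a minimal sketch using Python's built-in `urllib.robotparser`. The three-line rule set, the `is_allowed` helper, and the example.com addresses are my own made-up illustrations, not the full file from this post. The sample sticks to plain path prefixes because wildcard lines such as `/*?` are a search engine extension that not every parser understands.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, shortened rule set for illustration (not the full file from this post).
SAMPLE_RULES = """\
User-agent: *
Disallow: /tag/
Disallow: /category/
Disallow: /wp-admin/
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse from a string instead of fetching over HTTP
    return parser.can_fetch(user_agent, url)


# A tag page falls under "Disallow: /tag/"; a regular post matches no rule.
tag_page_ok = is_allowed(SAMPLE_RULES, "Googlebot", "https://example.com/tag/social-media-tools/")
post_ok = is_allowed(SAMPLE_RULES, "Googlebot", "https://example.com/2010/06/my-post/")
```

The point of the sketch is simply that a rule is matched against the start of the page's path, so one `Disallow: /tag/` line covers every tag page at once.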

You can find a link to my robots.txt file here and then save the page if you do not want to fuss with opening Notepad to create your own. This is a very generic robots.txt file and you can use it on just about any self-hosted WordPress blog.

What’s Inside the Robots.txt File?

The inside of the file looks simply like this. Should you choose to retype it, be careful about stray characters and spaces.

User-agent: *
Request-rate: 1/5 # Maximum request rate of 1 page every 5 seconds (12/minute)
Crawl-delay: 5 # Used by several bots (12/minute)

User-agent: Googlebot
Disallow: /wp-content/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /feed/
Disallow: /archives/
Disallow: /index.php
Disallow: /*?
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: */feed/
Disallow: */trackback/
Disallow: /page/
Disallow: /tag/
Disallow: /category/
Disallow: /go/
Disallow: /recommends/

User-agent: Googlebot-Image
Disallow: /wp-includes/
Allow: /wp-content/uploads/

User-agent: Mediapartners-Google*
Disallow:

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /

User-Agent: Googlebot
Disallow: /link.php
Disallow: /gallery2
Disallow: /gallery2/
Disallow: /category/
Disallow: /page/
Disallow: /pages/
Disallow: /feed/
Disallow: /feed

User-agent: ia_archiver
Disallow: /

User-agent: duggmirror
Disallow: /

You can see that there are a whole lot of Disallow “do not disturb” signs! I don’t want to get into what each line does because you can find that on 100 other websites, complete with the associated headache. All you really need to know is that, barring special circumstances, this is all you need for now.
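
If you retyped the file by hand, a quick check for stray characters can save some grief. The following is only a rough sketch: the `find_suspect_lines` helper and its list of known field names are my own assumptions for illustration, not an official validator, and it only looks at the shape of each line, not whether the rules make sense.

```python
# Field names this sketch treats as valid (an assumption; other bots may accept more).
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "crawl-delay", "request-rate", "sitemap"}

# Characters that word processors like to sneak in: curly quotes and a byte order mark.
SUSPECT_CHARS = "\u201c\u201d\u2018\u2019\ufeff"


def find_suspect_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, reason) pairs for lines that look mistyped."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue  # blank and comment-only lines are fine
        if any(ch in raw for ch in SUSPECT_CHARS):
            problems.append((number, "curly quote or invisible character"))
            continue
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
    return problems


# Example: one misspelled field and one missing colon.
typo_report = find_suspect_lines("User-agent: *\nDisalow: /tag/\nDisallow /feed/\n")
```

An empty list back means nothing looked obviously wrong; anything else tells you which line to retype.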

So… How To Get The Robots.txt File Onto My Blog Or Website

This is set-it-and-forget-it information, and you will generally only do this once for a particular blog or website. Bookmark this post so you can find it again when you want to set up a second site!

Steps To Install A Robots.txt File

1. Open your self-hosted blog or website using either FTP or your hosting provider’s cPanel File Manager.

2. Navigate to the root of your site. If you only have a single blog or site on the account, that is generally the inside of the public_html folder. If you have multiple sites, you may have a folder with the domain name inside of the public_html folder. (Accidentally putting this in the wrong place won’t generally hurt anything, but it simply won’t work.)

3. Either open the link to my robots.txt file and save it as a .txt file – or copy and paste the contents from above into a .txt file using a simple editor like Notepad. Do NOT use Word. Save the file, naming it robots.txt, if you haven’t done so already.

4. Now upload the robots.txt file to your website host using either FTP or the File Manager. On FTP that is simply dragging it from the left side (your computer’s files) to the right side (your server’s files).

5. Sit back, content in knowing just how damn smart you are!
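
To double-check step 2, it helps to know exactly where bots will look for the file: at the very top of the domain, no matter which page they start from. Here is a small sketch of that rule. The `robots_url` helper and the example.com address are placeholders of my own, so swap in your own domain.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(any_page_url: str) -> str:
    """Build the robots.txt address for the site that serves `any_page_url`."""
    parts = urlsplit(any_page_url)
    # Keep only the scheme and host; the path is always /robots.txt at the site root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# example.com is a placeholder domain.
expected_location = robots_url("https://example.com/blog/2010/06/some-post/?p=42")
```

Once the upload is done, opening that address in your browser should show the contents of the file. If you get a "not found" page instead, the file most likely landed in the wrong folder.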

A Few Notes

1) The Thesis theme has already built the most critical components into the theme. However, there is more you can accomplish by uploading this anyway.

2) This only works on a self-hosted WordPress.org blog… aka no, you cannot install it on a wordpress.com or blogger.com blog. There are, however, similar files for other self-hosted platforms such as Joomla.

3) There are a few different theories on what should be in a robots.txt file and you can read more on that here if you’re bored and up for scratching your head.

4) Yes, there are ways to further build on the robots.txt file, but these are the basics – what you need to keep you out of trouble.

5) Yes, there are plugins that do something similar, but we don’t want to take an SEO penalty for further slowing down our site, as every single plugin installed does.

Summary of the Robots.txt

The robots.txt file is a simple little file that gets placed in the root (home) directory of your particular site and keeps Google from crawling stuff it’s not smart enough to understand.

This benefits SEO in WordPress because it keeps us from taking a penalty for having what appears to be duplicate content (considered SPAM by Google) on our sites.

Simply save the file. Upload it via FTP/File Manager to the root folder of your site. Be pleased with how SEO savvy you’ve become!

If you found this post helpful, I hope you’ll share it with your friends, followers and community! I look forward to your questions below and what other SEO tips you can share with the community here!

Kimberly
