How to Use Blogger’s New Crawlers and Indexing Feature Effectively for Improved SEO

Posted : Sunday, June 03, 2012  |  Post Author : Paul Crowe | 51 comments

Guest Post – Today’s host is Chris, who provides in-depth information on using the new search preferences settings on Blogger. Want to get involved? See How To Become a guest author on Spice Up Your Blog.

After Blogger updated its interface, it added several useful features, such as editing meta tags, modifying crawlers and indexing settings, and adding authors. In this post I’m going to show you how to use Blogger’s new crawlers and indexing feature effectively to improve your blog’s SEO.
Before starting this guide, I’d like to briefly explain what robots.txt is. A robots.txt file gives instructions to search engine bots about how to crawl your site’s content, and it can also point them to a sitemap. For instance, you can use it to keep crawlers away from certain pages so they stay out of search results, which can also improve your site’s SEO.

Important – If you are unsure of any of the steps do not make any edits.

How to create a custom robots.txt on blogger

1) Log in to your Blogger account.
2) Go to your blog.
3) Now go to “Settings > Search Preferences”.
4) Under “Crawlers and indexing” you will find “Custom robots.txt”.
5) Click the “Edit” link.
Now you can add instructions for different crawlers. In this post I have included several useful robots.txt snippets. First, let’s see how this code works.

User-agent: *
Disallow: /search
Allow: /

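In the code above:

User-agent: – Names the crawler the rules apply to, for instance Googlebot. An asterisk (*) means all crawlers.
Disallow: – Specifies which paths should not be crawled.
Allow: – Specifies which paths may be crawled.
/ (slash) – Stands for the root of your blog. Rules match by path prefix, so “/” covers the home page and every other page.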
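
One way to sanity-check rules like these before pasting them into Blogger is Python’s standard urllib.robotparser module. The sketch below is only an illustration: the blog address example.blogspot.com, the sample paths and the helper name can_crawl are made up for this example. Python’s parser applies the first rule that matches, which may differ from Google’s own matching in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The rules from the article, as they would be pasted into "Custom robots.txt".
RULES = """\
User-agent: *
Disallow: /search
Allow: /
"""

# Hypothetical blog address, used only for illustration.
BLOG = "http://example.blogspot.com"


def can_crawl(rules: str, agent: str, path: str) -> bool:
    """Return True if `agent` may crawl `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, BLOG + path)


if __name__ == "__main__":
    # Sample paths (assumptions): home page, a post, a label page, a search page.
    for path in ("/", "/2012/06/my-post.html", "/search/label/SEO", "/search?q=seo"):
        verdict = "allowed" if can_crawl(RULES, "Googlebot", path) else "blocked"
        print(f"{path}: {verdict}")
```

With these rules, the home page and the post should come back as allowed, while both /search paths should come back as blocked.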

Set up instructions for all robots

The following code applies to every crawler and allows the entire blog, including label and search pages, to be crawled and indexed.

User-agent: *
Allow: /

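Block crawling of label and search pages.

If you use the code above, it can cause duplicate content issues, because label and search pages repeat content from your posts, and your site’s ranking may drop. To avoid that, use the following code. It allows the entire blog to be crawled except label and search pages. On Blogger, label pages live under /search/label/, so “Disallow: /search” covers both.
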
User-agent: *
Disallow: /search
Allow: /

Block certain page(s).

For some reason, you may need to hide a selected page or pages from the search engines. In that case, use the following code.

User-agent: *
Disallow: /p/page-one.html
Allow: /

If you need to block more than one page, add each page’s path on its own Disallow line, like below.

User-agent: *
Disallow: /p/page-one.html
Disallow: /p/page-two.html
Allow: /
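
If you have many pages to hide, you could generate the Disallow lines instead of typing them by hand. The following is a minimal sketch, not part of Blogger itself: the function name block_pages is made up for this example, and it assumes you pass paths such as /p/page-one.html rather than full URLs.

```python
def block_pages(paths):
    """Build a robots.txt group that hides the given pages from all crawlers."""
    lines = ["User-agent: *"]
    for path in paths:
        # Each rule must be a path starting with "/", not a full URL.
        if not path.startswith("/"):
            raise ValueError(f"expected a path such as /p/page-one.html, got {path!r}")
        lines.append(f"Disallow: {path}")
    # Everything that is not listed above stays crawlable.
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(block_pages(["/p/page-one.html", "/p/page-two.html"]))
```

The printed text can then be pasted into the “Custom robots.txt” box.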

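Allow all but block a specific crawler.

If you want to block a single crawler, add a group for it with “Disallow: /”, as in the first group below. A group with an empty “Disallow:”, like the Googlebot-News one, blocks nothing for that crawler.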

User-agent: <bot name>
Disallow: /

User-agent: Googlebot-News
Disallow:
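
To see how per-crawler groups are matched, here is a small sketch using Python’s standard urllib.robotparser. The names BadBot and Bingbot, the post URL and the helper decisions are assumptions chosen for illustration; BadBot simply stands in for the <bot name> placeholder above.

```python
from urllib.robotparser import RobotFileParser

# Same structure as the snippet above, with a made-up bot name filled in.
RULES = """\
User-agent: BadBot
Disallow: /

User-agent: Googlebot-News
Disallow:
"""


def decisions(rules, agents, url):
    """Map each crawler name to True (may crawl `url`) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    url = "http://example.blogspot.com/2012/06/my-post.html"  # hypothetical post
    result = decisions(RULES, ["BadBot", "Googlebot-News", "Bingbot"], url)
    for agent, allowed in result.items():
        print(agent, "allowed" if allowed else "blocked")
```

A crawler that matches no group at all, like Bingbot here, is treated as allowed, which is why only the named bot ends up blocked.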

Set up AdSense crawler instructions.

To improve your Google AdSense performance, you can specify how the AdSense bot crawls your site. There is actually no need to block anything, so the Disallow line is left empty.

User-agent: Mediapartners-Google
Disallow:

Block image indexing.

If you don’t want your blog posts’ images to appear in Google search results, you can remove them by using the following code.

User-agent: Googlebot-Image
Disallow: /

If you want to allow only one crawler and block every other bot from crawling your site, use the following code. Put the name of the crawler you want to allow in the first “User-agent:” line.

User-agent: <required crawler name>
Disallow:

User-agent: *
Disallow: /

You can find more crawlers and their user agent information here and here.

Adding a sitemap.

Apart from the crawl instructions above, you can add a sitemap. By default, Blogger’s sitemap lists only 26 posts, so you can add a complete sitemap using “Custom robots.txt”. This is an example of how to add a sitemap.

Sitemap: http://www.spiceupyourblog.com/atom.xml?redirect=false&start-index=1&max-results=500

Sitemap: http://www.spiceupyourblog.com/atom.xml?redirect=false&start-index=501&max-results=500
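
For a blog with many posts, the Sitemap lines can be generated rather than written by hand. This is a rough sketch under a few assumptions: the function name sitemap_lines is made up, you know roughly how many posts the blog has, and max-results is treated as the number of posts requested per line (500 here), with start-index marking the first post of each block.

```python
def sitemap_lines(blog_url, total_posts, page_size=500):
    """Return one Sitemap line per block of `page_size` posts."""
    base = blog_url.rstrip("/")
    lines = []
    # start-index is 1-based: 1, 501, 1001, ...
    for start in range(1, total_posts + 1, page_size):
        lines.append(
            f"Sitemap: {base}/atom.xml?redirect=false"
            f"&start-index={start}&max-results={page_size}"
        )
    return lines


if __name__ == "__main__":
    # 800 posts is an example figure, not a real count.
    for line in sitemap_lines("http://www.spiceupyourblog.com", 800):
        print(line)
```

For 800 posts this produces two lines, one starting at post 1 and one starting at post 501.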

Final result demo.

If you want to improve your blog’s search engine visibility and have the entire blog crawled except label and search pages, use the following code as your robots.txt.

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: <paste your blog’s sitemap URL here>
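
As a final check, the complete file can be assembled and parsed locally before you save it in Blogger. The sketch below is only one possible way to do it: build_robots_txt is a made-up helper, the example.blogspot.com addresses are placeholders, and site_maps() assumes Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(sitemap_urls):
    """Assemble the final robots.txt text from the article plus Sitemap lines."""
    parts = [
        "User-agent: Mediapartners-Google",
        "Disallow:",
        "",
        "User-agent: *",
        "Disallow: /search",
        "Allow: /",
        "",
    ]
    parts += [f"Sitemap: {url}" for url in sitemap_urls]
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    # Hypothetical sitemap URL; replace it with your own blog's feed address.
    text = build_robots_txt(
        ["http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500"]
    )
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    label_page = "http://example.blogspot.com/search/label/SEO"
    # The AdSense crawler has its own group with nothing blocked.
    print("AdSense bot on label page:", parser.can_fetch("Mediapartners-Google", label_page))
    # Every other crawler falls back to the "*" group, which blocks /search.
    print("Googlebot on label page:", parser.can_fetch("Googlebot", label_page))
    print("Sitemaps:", parser.site_maps())
```

The AdSense crawler should still be able to reach label pages, while other crawlers are kept out of /search.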

By Guest Author - This article was written by Chris, a part-time blogger who writes Android rooting guides and covers lots of Droid stuff and apps on his “Spicy Gadgematic” blog.

51 comments:

  1. Thank you for this great post.

    JayRyan’sBlog

  2. Oluvil Campus Blog, June 4, 2012 at 5:12 AM

    Chris,This is a great post.This information is useful to any blogger who doen’t know about robots.txt. thanks Chris and Paul.

  3. Super useful info Chris. Using the Robots file to your blog’s advantage is very essential for SEO purposes.

  4. Wonderful! tips to grab. Great job!

  5. thanks for sharing, It is so useful to me since everything were detail about the robots.txt

  6. can you post a video about this tutorial ??

    Replies

  7. will you plz provide a video tutorial about this as aziz said , thanks

  9. Thanks for such a nice post !

  10. Thks sir.. this is more helpful

  11. but that’s the default robot.txt code for all blogspot blogs…that means we dont need to make any changes in it..

    Replies

  12. Default robots.txt has 26 post site map (feeds/posts/default?orderby=UPDATED), now we can easily edit blogger robots.txt so we can add correct site map. If you need more customizations follow above methods.

  14. This is good ..

  15. very good post i used this in my blog quizvook.blogspot.com

  16. very helpful..

  17. I don’t understand properly.

    Which is the perfect robot.txt code for my blog,
    my duplicate content is increasing day by day,
    Anyone help me please ?

  18. very helpful for blogger

  19. SaikiranReddySama, July 9, 2012 at 1:49 PM

    Great Tutorial.I’ve Used this on my blog.For First Time..I’ve Enabled Indexed Labels on My Blog..But I got Too many 404 Error Pages By Doing that. So I’ve Now Added the Disallow:/search tag to the Code..

  20. amazing for the info. thanx for sharing. happy blogging

  21. read your article on robot texts what does it mean in plain English?

    Replies

  22. Custom robots.txt is a way for you to instruct the search engine that you don’t want it to crawl certain pages of your blog (“crawl” means that crawlers, like Googlebot, go through your content, and index it so that other people can find it when they search for it)

  24. usman mohiuddin sultan, July 23, 2012 at 7:08 PM

    great post.after i inserted this code.my traffic increases 50 up.

  25. Very good post and valuable information

  26. Thanks now just waiting for the results

  27. Hi! Paul Crowe,
    Thank you for the useful information that I have been searching for.

  28. I want to index my posts using labels which robots tags i should use

  29. hi great tytorial, is help me to confing my robots.txt file in to my blog . thanks

  30. Thanks so much,

    After some other Google warnings inside the Google webmaster account I was scared about this new blogger warning! I just wanted to thank you..

  31. Thanks bro to share important article to us

  32. Thank You..

  33. very useful.Thanks for sharing this.

  34. Thank You for this awesome guide.

  35. Thank you for the post

  36. thanks alot for this wonderful article, i really appreciate it

  37. Hi Chris, when I updated the following code it has blocked my lots of urls.

    User-agent: Mediapartners-Google
    Disallow:

    User-agent: *
    Disallow: /search
    Allow: /

    Now I have updated the following code.

    User-agent: *
    Allow: /

    Will it unblock the url and increase the traffic. Plz help me.

    http://www.entertainmentinfo.in

  38. Thanks for this great tutorial. Adding sitemap via robot.txt is easiest way to index your post.

  39. Nice tutorial. It really worked for my blog. Thanx a lot.

  40. thanks very informative article, will try to follow your tutorial 🙂

  41. I pleased to see this post thanks for sharing this at one place.

  42. Very clear and nicely explained thanks

  43. Great bro..im looking,this type of answer to get cleared my doubts…

  44. This is really very informative. Thanks for this post.

  45. I was looking for this problem about robot.txt and found useful information and clear my doubts here. Thanks for sharing

  46. It is important to add
    User-agent: Mediapartners-Google
    Disallow:

    to my robot.txt if i don’t do this it will affect my adsense account like get ban by google adsense or not?

    Replies

  47. nope, i think it just allow google adsense bot to crawl your webpage, it do help.

  49. useful article. love it

  50. nice post,thanks a lot.

  51. I love the post. Thanks for sharing about Crawlers and indexing in blogger.
    Thank you Author.

  52. Thanks for Useful Info..

  53. Nice Article helpfull for newbies thanks for sharing with us.

  54. nice seo tips.thanks for sharing.keep posting

  55. Thank U very Much
