Magento Expert Forum - Improve your Magento experience


What is robots.txt?

  1. #1

  2. #2
    Junior Member
    Join Date
    Mar 2016
    Posts
    209
    Thanks
    0
    Thanked 2 Times in 2 Posts

    Default

    Robots.txt is a text file that webmasters create to instruct robots how to crawl and index pages on their website.

  3. #3
    Junior Member
    Join Date
    Feb 2016
    Posts
    190
    Thanks
    0
    Thanked 2 Times in 2 Posts

    Default

    Robots.txt is a text file that is placed on your website and contains instructions for search engine robots. The file lists which pages search engine crawlers are allowed and not allowed to crawl.

  4. #4
    Junior Member
    Join Date
    Apr 2016
    Posts
    116
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    Robots.txt is a file that controls how search engine bots access the whole website.

  5. #5
    Junior Member
    Join Date
    Mar 2016
    Posts
    168
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    The robots exclusion protocol (REP), or robots.txt, is a text file webmasters create to instruct robots (typically search engine robots) how to crawl and index pages on their website.

  6. #6
    Junior Member
    Join Date
    Jun 2016
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    Robots.txt is a file that controls search engine bots.

  7. #7

  8. #8
    Junior Member
    Join Date
    Jul 2016
    Posts
    91
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    The robots exclusion protocol, or robots.txt, is a text file webmasters create to teach robots how to crawl and index pages on their website.

  9. #9
    Junior Member
    Join Date
    Feb 2016
    Location
    Chennai
    Posts
    11
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    What is a robots.txt file?
    The robots.txt file is a simple text file placed on your web server which tells webcrawlers like Googlebot if they should access a file or not.
    User-agent: *
    Disallow: /folder/
    Disallow: /file.html
    Disallow: /image.png
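    As a rough illustration of how a crawler might read the example rules above, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed, the "Googlebot" user agent string, and the www.example.com domain are placeholders chosen for this example, not anything taken from a real site.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this post, one directive per line.
RULES = """\
User-agent: *
Disallow: /folder/
Disallow: /file.html
Disallow: /image.png
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/folder/page.html", "/file.html", "/image.png"):
        url = "https://www.example.com" + path  # placeholder domain
        verdict = "allowed" if is_allowed(RULES, "Googlebot", url) else "blocked"
        print(path, verdict)
```

    With these rules, /index.html is allowed and the other three paths are blocked. Note that the standard library parser matches rules by simple path prefix, so real search engines may treat edge cases differently.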
    Why should you learn about robots.txt?
    Improper usage of the robots.txt file can hurt your ranking
    The robots.txt file controls how search engine spiders see and interact with your webpages
    This file is mentioned in several of the Google guidelines
    This file, and the bots it interacts with, are fundamental parts of how search engines work
    Tip: To see if your robots.txt is blocking any important files used by Google, use the Google guidelines tool.
    Priorities for your website
    There are three important things that any webmaster should do when it comes to the robots.txt file.

    Determine if you have a robots.txt file
    If you have one, make sure it is not harming your ranking or blocking content you don't want blocked
    Determine if you need a robots.txt file
    Determine if your robots.txt is blocking important files
    You can use the Google guidelines tool, which will warn you if you are blocking certain page resources that Google needs to understand your pages.

    If you have access and permission you can use the Google search console to test your robots.txt file. Instructions to do so are found here (tool not public - requires login).

    To be sure your robots.txt file is not blocking anything you want crawled, you need to understand what it is saying. We cover that below.
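    One way to check this yourself, sketched here under some assumptions: list the paths you consider important and test each one against your rules with Python's standard urllib.robotparser. The rules, the paths, and the function name blocked_paths below are all made up for illustration; substitute your own file contents.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules that accidentally block a stylesheet folder.
SAMPLE_RULES = """\
User-agent: *
Disallow: /admin/
Disallow: /css/
"""


def blocked_paths(rules: str, user_agent: str, paths: list[str]) -> list[str]:
    """Return the subset of `paths` that `user_agent` may not fetch."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    # Hypothetical list of resources a search engine needs to render pages.
    important = ["/", "/css/site.css", "/js/app.js"]
    for path in blocked_paths(SAMPLE_RULES, "Googlebot", important):
        print("Blocked but important:", path)
```

    In this made-up case the script would flag /css/site.css, which suggests the Disallow: /css/ line should be removed or narrowed.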

    Do you need a robots.txt file?
    You may not even need to have a robots.txt file on your site. In fact, you often do not need one.

    Reasons you may want to have a robots.txt file:

    You have content you want blocked from search engines
    You are using paid links or advertisements that need special instructions for robots
    You want to fine tune access to your site from reputable robots
    You are developing a site that is live, but you do not want search engines to index it yet
    It helps you follow Google guidelines in certain situations
    You need some or all of the above, but do not have full access to your webserver and how it is configured
    Each of the above situations can be controlled by other methods, however the robots.txt file is a good central place to take care of them and most webmasters have the ability and access required to create and use a robots.txt file.
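    To make the "fine tune access" and "block content" cases above a little more concrete, here is a small sketch with one rule group for a specific robot and a default group for everyone else, checked with Python's standard urllib.robotparser. The bot name "BadBot", the /private/ folder, and the helper name can_crawl are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one robot is shut out entirely,
# every other robot is only kept out of /private/.
PER_BOT_RULES = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(rules: str, user_agent: str, path: str) -> bool:
    """Return True if `user_agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for bot in ("BadBot", "Googlebot"):
        for path in ("/index.html", "/private/report.html"):
            verdict = "allowed" if can_crawl(PER_BOT_RULES, bot, path) else "blocked"
            print(bot, path, verdict)
```

    A single Disallow: / line under User-agent: * would cover the "site is live but should not be indexed yet" case in the same way. Keep in mind these rules are requests, so they only affect robots that choose to honor them.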

    Reasons you may not want to have a robots.txt file:

    Not having one is simple and error free
    You do not have any files you want or need to be blocked from search engines
    You do not find yourself in any of the situations listed in the above reasons to have a robots.txt file
    It is okay to not have a robots.txt file.

    When you do not have a robots.txt file, search engine robots such as Googlebot will have full access to your site. This is a normal, simple, and very common setup.

    How to make a robots.txt file
    If you can type or copy and paste, you can also make a robots.txt file.

    The file is just a text file, which means that you can use notepad or any other plain text editor to make one. You can also make them in a code editor. You can even "copy and paste" them.

    Instead of thinking "I am making a robots.txt file", just think "I am writing a note". They are pretty much the same process.
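    If you would rather generate the file from a script than type it by hand, the following is a minimal sketch in Python. The function names build_robots_txt and write_robots_txt are invented for this example, and the temporary directory stands in for your real web root, where the file has to be named robots.txt.

```python
import tempfile
from pathlib import Path


def build_robots_txt(user_agent: str, disallowed: list[str]) -> str:
    """Build the text of a robots.txt file with one rule group."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed]
    return "\n".join(lines) + "\n"


def write_robots_txt(directory: Path, content: str) -> Path:
    """Write `content` to robots.txt inside `directory` and return its path."""
    target = directory / "robots.txt"
    target.write_text(content, encoding="utf-8")
    return target


if __name__ == "__main__":
    content = build_robots_txt("*", ["/folder/", "/file.html", "/image.png"])
    # A temporary directory is used here only so the demo leaves no files behind.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_robots_txt(Path(tmp), content)
        print(path.read_text(encoding="utf-8"))
```

    The result is the same plain text you would get from a text editor; the script only saves typing when the list of paths is long or changes often.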

  10. #10
    Junior Member
    Join Date
    Jul 2016
    Posts
    91
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    The robots exclusion protocol, or robots.txt, is a text file webmasters develop to teach robots how to crawl and index pages on their website.

  11. #11
    New member
    Join Date
    Jun 2016
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    Robots.txt is a file that lists the URLs you do not want search engines to crawl.

  12. #12
    Junior Member
    Join Date
    Apr 2016
    Location
    Delhi
    Posts
    112
    Thanks
    1
    Thanked 1 Time in 1 Post

    Default

    Robots.txt is a text file. Its basic job is to help block crawlers from pages and directories.

  13. #13
    Junior Member
    Join Date
    Sep 2016
    Posts
    46
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    Robots.txt is a text (not html) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines but generally search engines obey what they are asked not to do.

  14. #14
    Junior Member
    Join Date
    Aug 2016
    Posts
    37
    Thanks
    0
    Thanked 1 Time in 1 Post

    Default

    Robots.txt is a standard used by websites to communicate with web crawlers and other web robots.

  15. #15
    Junior Member
    Join Date
    Jun 2016
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default

    The robots.txt is a text file webmasters create to instruct robots how to crawl and index pages on their website.

  16. #16
    Expert
    Join Date
    Mar 2016
    Location
    india
    Posts
    570
    Thanks
    0
    Thanked 1 Time in 1 Post

    Default

    Robots.txt is a text (not html) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall, or a kind of password protection) and the fact that you put a robots.
