A good rule to follow is to set up Webmaster Tools (AKA Google Search Console). There, FIRST set your preferred domain (www or non-www). You will then confirm this choice with the rel="canonical" link in the head section of your pages.
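
If you want to double-check that a page's canonical link really points at your preferred domain, a rough sketch in Python (standard library only) might look like the one below. The sample HTML, the www.example.com host, and the names CanonicalFinder, find_canonical and uses_preferred_host are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical page markup, invented for this example.
SAMPLE_HTML = (
    "<html><head><title>Example</title>"
    '<link rel="canonical" href="https://www.example.com/page">'
    "</head><body></body></html>"
)


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def uses_preferred_host(html, preferred_host):
    """True if the canonical URL exists and its host is the preferred one."""
    canonical = find_canonical(html)
    return canonical is not None and urlparse(canonical).netloc == preferred_host


print(find_canonical(SAMPLE_HTML))
print(uses_preferred_host(SAMPLE_HTML, "www.example.com"))
```

Comparing only the host name keeps the check focused on the www vs non-www decision; it does not validate the rest of the URL.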

Now let us define the two topics we are to examine:

What is a robots.txt file?

A robots.txt file is a file at the root of your site that indicates those parts of your site you don’t want accessed by search engine crawlers. The file uses the Robots Exclusion Standard, which is a protocol with a small set of commands that can be used to indicate access to your site by section and by specific kinds of web crawlers (such as mobile crawlers vs desktop crawlers).
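
To make the "by section and by crawler" idea concrete, here is a minimal sketch that feeds a small robots.txt to Python's standard urllib.robotparser module and asks which paths each crawler may fetch. The robots.txt content, the paths, the www.example.com host, and the helper name allowed are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration:
# one section for a specific crawler and one for everybody else.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /photos/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Ask the parsed rules whether `agent` may fetch `path`."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


# The general section blocks /private/ for crawlers without their own section.
print(allowed("Googlebot", "/private/report.html"))   # False
print(allowed("Googlebot", "/photos/cat.jpg"))        # True

# The image crawler has its own section, so only those rules apply to it.
print(allowed("Googlebot-Image", "/photos/cat.jpg"))  # False
```

Note that this only shows how Python's parser reads the rules; individual search engines may interpret edge cases differently.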

What is a sitemap?

A sitemap is a file where you can list the web pages of your site to tell Google and other search engines about the organization of your site content. Search engine web crawlers like Googlebot read this file to crawl your site more intelligently.

Also, your sitemap can provide valuable metadata associated with the pages you list in it. Metadata is information about a webpage, such as when the page was last updated, how often the page changes, and the importance of the page relative to other URLs in the site.
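
For a feel of what that metadata looks like in practice, the sketch below builds a tiny sitemap with Python's standard xml.etree.ElementTree module. The tag names (loc, lastmod, changefreq, priority) come from the sitemaps.org protocol; the page URLs, dates, values, and the function name build_sitemap are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical pages and metadata, invented for illustration.
PAGES = [
    {"loc": "https://www.example.com/", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": "1.0"},
    {"loc": "https://www.example.com/about", "lastmod": "2023-11-02",
     "changefreq": "yearly", "priority": "0.5"},
]


def build_sitemap(pages):
    """Return sitemap XML (as UTF-8 bytes) with one <url> entry per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # loc = page address, lastmod = last update,
        # changefreq = how often it changes, priority = relative importance.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            ET.SubElement(url, tag).text = page[tag]
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml.decode("utf-8"))
```

To save the result you could write the bytes to a file named sitemap.xml at the root of your site. The xml_declaration argument needs Python 3.8 or newer.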

Now you are ready to create your robots.txt file (always make robots.txt before sitemap.xml), following this great guide from Google: URL blocking commands to use in your robots.txt file

Next you are ready to make your sitemap.xml file. For this you may wish to search for a “sitemap generator tool”. Then edit your robots.txt file to add the location of your sitemap(s):

  • Sitemap:
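
As a hedged sketch of that last step, the Python below appends a Sitemap line to some robots.txt text and then reads it back with the standard urllib.robotparser module. The sitemap URL, the starting robots.txt content, and the helper name add_sitemap_line are placeholders invented for the example; use your own sitemap location.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical sitemap location, invented for illustration.
SITEMAP_URL = "https://www.example.com/sitemap.xml"


def add_sitemap_line(robots_txt, sitemap_url):
    """Append a 'Sitemap:' line unless that exact line is already present."""
    line = "Sitemap: " + sitemap_url
    existing = [entry.strip() for entry in robots_txt.splitlines()]
    if line in existing:
        return robots_txt
    if not robots_txt.strip():
        return line + "\n"
    # A blank line keeps the Sitemap entry visually apart from the rules.
    return robots_txt.rstrip("\n") + "\n\n" + line + "\n"


robots_txt = "User-agent: *\nDisallow: /private/\n"
updated = add_sitemap_line(robots_txt, SITEMAP_URL)
print(updated)

# Read the file back to confirm the sitemap location is picked up.
parser = RobotFileParser()
parser.parse(updated.splitlines())
print(parser.site_maps())
```

The site_maps() method is available from Python 3.8 onward and returns None when no Sitemap line is found.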

Now you are ready to go to Webmaster Tools (AKA Google Search Console) and submit the location of your sitemap and the content of your robots.txt.

Note: You can follow similar steps to be indexed well on Bing by signing up and using:

Learn how Google discovers, crawls, and serves web pages

