In a recent Google hangout, John Mueller explained how to stop Google from crawling and indexing a staging server.
What is a Staging Site or Server?
A staging site or server is a separate copy of a website, often on its own section of a server, that is used to test changes or a new version of the site before it goes live.
How to Stop Google from Crawling a Staging Server
The best way to stop Google from crawling and indexing a staging site or server, John Mueller said, is to block its web pages with authentication.
This is how John Mueller answered the question:
“There are multiple ways… people do it in multiple ways. I think that the important part is that you don’t link to it. Because if we don’t find it then we can’t crawl it. But sometimes that still happens.”
John then goes on to recommend a better way to block both crawling and indexing of a staging server.
“Ideally, what you would want to do is provide some kind of server side authentication on the server so that normal users when they go there they would get blocked from being able to see the content; that would include GoogleBot.
And you can do that on an IP address basis, you can do it with a cookie, you can do it with normal authentication on the server.
Anything where you have to prove that you’re the right person and you can actually look at that content.
I think that’s generally the best approach for staging servers…”
“…it’s something that means you don’t have to change the normal settings on the site itself, in particular robots.txt but also noindex meta tags for example.”
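To make the idea concrete, here is a minimal sketch of server-side authentication using Python's built-in `http.server`. The username, password, and port are hypothetical; in practice you would use your web server's own auth mechanism (e.g. HTTP Basic Auth in Apache or nginx). The key point is the one Mueller makes: any visitor without credentials, Googlebot included, is blocked before seeing any content.

```python
import base64
from http.server import BaseHTTPRequestHandler, HTTPServer

USERNAME, PASSWORD = "staging", "secret"  # hypothetical credentials


def is_authorized(auth_header):
    """Return True only if the Authorization header carries the right
    Basic-auth credentials. Googlebot never sends credentials, so it
    is rejected along with every other unauthenticated visitor."""
    expected = base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()
    return auth_header == f"Basic {expected}"


class StagingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization")):
            # No valid credentials: block the request entirely,
            # so there is nothing for Google to crawl or index.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="Staging"')
            self.end_headers()
            return
        # Authenticated users see the staging content as normal.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(b"<h1>Staging site</h1>")


# To run the staging server locally:
#   HTTPServer(("", 8000), StagingHandler).serve_forever()
```

Because the block happens at the server level, nothing on the pages themselves (robots.txt, meta tags) needs to change between staging and production.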
Using a password-protected solution also eliminates the risk of a mistake carrying over to your live website, because it is easy to make mistakes with the noindex meta tag or robots.txt approaches.
Traditionally, the most common way to keep a staging site out of Google was a robots.txt file that blocks crawling. Note, however, that robots.txt only prevents crawling, not indexing: a blocked URL can still appear in search results if other pages link to it.
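For reference, the traditional robots.txt approach looks like the fragment below, placed at the root of the staging domain. It asks all crawlers to stay away, but as Mueller notes, it depends on site settings that are easy to get wrong when moving to production.

```
User-agent: *
Disallow: /
```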
Google’s John Mueller offered the safer solution: password-protect the staging server.
Watch John Mueller’s answer here: