How can I make sure my WHMCS is 100% not searched by Google?

rob2 · June 14, 2011

Hello,

I hope to use WHMCS's announcements and knowledgebase features to publish information,

and I want search engines (Google, Bing, Yahoo, etc.) to index my articles under announcements and knowledgebase to improve SEO.

But I worry that search engines will also index data from my client area,

which may include clients' billing, product, or admin area data.

How can I prevent that?

Thank you.

m8internet · June 14, 2011

Simply list the paths you want to restrict in the robots.txt file as Disallow rules.
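As a rough sketch of what such rules do, the Python snippet below feeds a sample robots.txt to the standard library's `urllib.robotparser` and asks which URLs a well-behaved crawler may fetch. The listed paths (`/admin/`, `/clientarea.php`, `/cart.php`) and the `example.com` host are assumptions for illustration only; check them against your own install.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a WHMCS install at the site root.
# The paths are assumptions for illustration; adjust them to your layout.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /clientarea.php
Disallow: /cart.php
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for path in ("/announcements.php", "/knowledgebase.php", "/clientarea.php", "/admin/"):
    allowed = parser.can_fetch("Googlebot", "https://example.com" + path)
    print(path, "allowed" if allowed else "disallowed")
```

Anything not matched by a Disallow line stays crawlable, so the announcements and knowledgebase pages can still be indexed.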

Alistair · June 14, 2011

Google will not display data relating to your restricted areas.

rob2 · June 15, 2011

But if some bots do not follow robots.txt, how can I keep my WHMCS safe?

ckh · June 15, 2011

A bot would need to have the login information in order to view the member area. Additionally, any bot would have to find a link in one of your pages it has access to in order to find another page.

So, if you created a directory, say XYZ, and put some files in it but never linked to it from any of your pages or any site map, then no bot will be able to find it unless it guesses the directory name and the names of the files in it.

A bot is no different from a visitor to your site. It has no special access to your site. It can't view or index your member area without a valid username and password, nor find any pages that aren't linked from pages it has access to.

m8internet · June 15, 2011

ckh wrote: "So, if you created a directory, say XYZ, and put some files in it but never linked to it from any of your pages or any site map, then no bot will be able to find it unless it guesses the directory name and the names of the files in it."

I suggested you try this, as the bot will process them as separate websites

If any website page is live, it can be found unless it is protected by a login or by robots.txt.

sparky · June 15, 2011

This may give you a bit of a start... it goes into your .htaccess file:

# deny bots access
# Patterns containing a space must be quoted, or Apache reads the second word as a variable name.
SetEnvIfNoCase User-Agent ".*Indy Library.*" stayout=1
SetEnvIfNoCase User-Agent .*libwww-perl.* stayout=1
SetEnvIfNoCase User-Agent ".*Offline Navigator.*" stayout=1
SetEnvIfNoCase User-Agent ".*Xaldon WebSpider.*" stayout=1
SetEnvIfNoCase User-Agent .*Wget.* stayout=1
SetEnvIfNoCase User-Agent .*WebImages.* stayout=1
SetEnvIfNoCase User-Agent .*WebCapture.* stayout=1
SetEnvIfNoCase User-Agent .*LiteFinder.* stayout=1
SetEnvIfNoCase User-Agent .*disco.* stayout=1
SetEnvIfNoCase User-Agent .*SBIder.* stayout=1
SetEnvIfNoCase User-Agent .*MJ12bot.* stayout=1
SetEnvIfNoCase User-Agent .*Baiduspider.* stayout=1
# Apache 2.2 access syntax; Apache 2.4 needs mod_access_compat for the next three lines.
Order Allow,Deny
Allow from all
Deny from env=stayout
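To get a feel for what rules like these catch, here is a minimal Python sketch that imitates the matching: case-insensitive and unanchored, which is how `SetEnvIfNoCase` applies its regex. The `is_blocked` helper and the sample user-agent strings are made up for illustration, and only a few of the patterns are copied over.

```python
import re

# A few of the substrings from the .htaccess rules, matched the same way:
# case-insensitively and anywhere in the User-Agent header.
BLOCKED_PATTERNS = ["Indy Library", "libwww-perl", "Wget", "MJ12bot", "Baiduspider", "disco"]
BLOCKED_RE = re.compile("|".join(re.escape(p) for p in BLOCKED_PATTERNS), re.IGNORECASE)


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent contains any blocked substring."""
    return BLOCKED_RE.search(user_agent) is not None


samples = [
    "Wget/1.21.3",                              # matches "Wget"
    "Mozilla/5.0 (compatible; MJ12bot/v1.4.8)", # matches "MJ12bot"
    "Mozilla/5.0 MyDiscoveryTool/1.0",          # also matches, via "disco"
    "Mozilla/5.0 (compatible; Googlebot/2.1)",  # no match
]
for ua in samples:
    print(ua, "->", "blocked" if is_blocked(ua) else "allowed")
```

Two things follow from this: a short pattern such as `disco` also catches unrelated clients whose name merely contains it, and a bot that sends a browser-like User-Agent is not caught at all, so this filter only stops bots that identify themselves honestly.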

ckh · June 15, 2011

m8internet wrote: "I suggested you try this, as the bot will process them as separate websites. If any website page is live, it can be found unless it is protected by a login or by robots.txt."

Any live page can be found by a bot just like it can be found by a person, but a bot has no special access to your site to find pages, directories, or files.

If you put up a page xxyy3kdls.html and don't tell anyone about it or reference it in any of your pages, it will be live, but a bot isn't going to be able to find it except by guessing or brute force, so it's highly unlikely to be found by a bot or a person. But if you link to it or reference it from another page on your site, then it can be found by both a bot and a live person.

The point I'm trying to make is a bot doesn't have any more access to your site than a live person does. If there is a member area that requires a login, then neither a bot nor person can access it unless they have a login. If you have a page on your site that isn't referenced anywhere in your other pages, neither bot nor live person will be able to find it except by guessing.

ckh · June 15, 2011

m8internet wrote: "I suggested you try this, as the bot will process them as separate websites. If any website page is live, it can be found unless it is protected by a login or by robots.txt."

The robots.txt file is just a text file. If you protect it, then nothing will be able to access it, so you would be better off just not having one.

Also, honoring the robots.txt file is voluntary. Bad bots will often read the robots.txt file to see what is on your site. A person can display the robots.txt file just by pointing their browser at it. If you list a directory there that you aren't referencing anywhere, like in my example above, you have just told the world where it is, so it's best to leave it out. If you have a page that you don't want indexed by a spider but that needs to be linked so people can find it on your website, then you would want to tell the bot to exclude it from being indexed.
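To illustrate how little effort that takes, here is a short Python sketch that pulls every Disallow path out of a robots.txt file, which is all a bad bot or a curious person needs to do. The `disallowed_paths` helper and the sample file contents are made up for the example; `/XYZ/` stands in for the unlinked directory mentioned earlier.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect the path from every non-empty Disallow line."""
    paths = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical robots.txt that tries to "hide" an unlinked directory.
SAMPLE = """\
User-agent: *
Disallow: /XYZ/          # not linked anywhere on the site
Disallow: /clientarea.php
"""

revealed = disallowed_paths(SAMPLE)
print(revealed)
```

The unlinked directory shows up in the result alongside everything else, so anything that must stay private needs a login in front of it rather than a robots.txt entry.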
