It's good that we're getting important enough for a known
"google hacker" site to post about our uniqueness...
It's bad that we're getting important enough for a known
"google hacker" site to post about our uniqueness...
Just a heads up... they know our "google parts". How do you stop this?
First off... don't click the button in the connect to internet wizard to
“expose the entire web site”. Next... if you are stupid enough to do
THAT one, I'm copying a post from Alan Billharz:
Some customers may wish to exclude their SBS 2003 installation
from the scope of Web search sites such as Google.com. This
may be because you would prefer to restrict knowledge of your
installation only to those who can use it, or, you may want to
keep some portions of your site (e.g. Business Website)
searchable while keeping other portions under the radar
of Web search sites. There is a way to do this using
the Robots Exclusion Protocol.
By placing a simple text file at the root of your Web site,
you can tell Web search robots which parts of the
Web site are open for search. I've attached
two versions of robots.txt that I've whipped up
for my SBS2003 server:
1. robots.txt - Allows search of your business Web site
but hides SBS-specific sites from search robots.
2. robots2.txt - (Must be renamed to robots.txt)
Denies search of your entire Web site.
For more information, check out these sources:
http://www.robotstxt.org/wc/robots.html
http://www.searchtools.com/robots/robots-txt.html
http://www.searchengineworld.com/robots/robots_tutorial.htm
Many Web sites implement this functionality.
For example, you can check out
http://www.cnn.com/robots.txt .
Please respond to this post if you have any questions
or comments - let us know how this works out for you!
Thanks,
Alan Billharz
--------------------------------------------------------------------------------
# Place this file at the root of the Default Web Site (%SystemDrive%\inetpub\wwwroot)
# to allow search engines to catalog your Business Web site, but not catalog the other
# SBS-specific Web sites.
#
# Note that you must choose to publish the root of your Web site to allow the search
# engine robot to read this file. In the Configure E-mail and Internet Connection Wizard,
# choose to publish Business Web site (wwwroot).
User-agent: *
Disallow: /_vti_bin/
Disallow: /clienthelp/
Disallow: /exchweb/
Disallow: /remote/
Disallow: /tsweb/
Disallow: /aspnet_client/
Disallow: /images/
Disallow: /_private/
Disallow: /_vti_cnf/
Disallow: /_vti_log/
Disallow: /_vti_pvt/
Disallow: /_vti_script/
Disallow: /_vti_txt/
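
Before publishing, it can help to check that the rules above behave the way you expect. The following is only a rough sketch using Python's standard urllib.robotparser module; the helper name is_allowed, the agent name "ExampleBot", and the sample paths are made up for illustration, and the rule list is abbreviated to a few of the entries above.

```python
from urllib.robotparser import RobotFileParser

# A few of the rules from the robots.txt above (abbreviated for the example).
SELECTIVE_RULES = """\
User-agent: *
Disallow: /exchweb/
Disallow: /remote/
Disallow: /tsweb/
Disallow: /_vti_bin/
"""


def is_allowed(rules: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a compliant robot may fetch `path` under `rules`.

    `is_allowed` and "ExampleBot" are hypothetical names for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse the text directly, no HTTP fetch
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Sample paths are illustrative, not taken from a real SBS install.
    for path in ("/", "/default.htm", "/exchweb/", "/remote/logon.aspx"):
        print(path, is_allowed(SELECTIVE_RULES, path))
```

With these rules, the root and /default.htm should come back as allowed while anything under /exchweb/ or /remote/ comes back as blocked. Keep in mind this only models a robot that chooses to honor robots.txt.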
--------------------------------------------------------------------------------
# Place this file at the root of the Default Web Site (%SystemDrive%\inetpub\wwwroot)
# to prevent all search engines from cataloging your Web site.
#
# Note that you must choose to publish the root of your Web site to allow the search
# engine robot to read this file. In the Configure E-mail and Internet Connection Wizard,
# choose to publish Business Web site (wwwroot).
User-agent: *
Disallow: /
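
As a quick sanity check of the deny-everything version, here is a small sketch with Python's standard urllib.robotparser module. The helper name blocked_paths, the agent name "ExampleBot", and the sample paths are assumptions made up for this example, not part of Alan's post.

```python
from urllib.robotparser import RobotFileParser

# The deny-all rules from the second robots.txt above.
DENY_ALL_RULES = """\
User-agent: *
Disallow: /
"""


def blocked_paths(rules, paths, agent="ExampleBot"):
    """Return the subset of `paths` a compliant robot must not fetch.

    `blocked_paths` and "ExampleBot" are hypothetical names for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse the text directly, no HTTP fetch
    return [p for p in paths if not parser.can_fetch(agent, p)]


if __name__ == "__main__":
    # Every sample path should be reported as blocked under "Disallow: /".
    print(blocked_paths(DENY_ALL_RULES, ["/", "/default.htm", "/exchweb/"]))
```

Because "Disallow: /" matches every path, each sample path should show up in the blocked list.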
P.S. This will be included in the SBS 2003 advanced
book by Harry Brelsford.