Are Server Errors Keeping Your Site Out of Search Engines?
Adult Webmasters have a lot on their plate, and often there is little time to learn proper syntax for Web server configuration files. Yet if you make a simple mistake in the syntax of your .htaccess file, you could lose your ranking on Google. Mark Jervis is a technical expert for ProHosters.com, and after recently investigating a problem relating to a mysterious drop in available hard disk space he discovered several errors that Webmasters commonly make which effectively exclude them from consideration for the Google search engine.
Unlike most articles on search engine marketing, this one has nothing to do with Meta tags. Instead it discusses the rights and wrongs of “redirects.” The use of redirects in adult Web sites is a very popular tactic, but just how many Webmasters are using them properly?
Being a tech with a well-known Web hosting company, I recently got to watch our Web server error logs while a Google index spider crawled the sites we host. The spider entered our network and started listing pages, but much to my dismay I found it was also doing something else: it was following the redirects, which meant it wasn’t following the links everybody had hoped it would follow.
DISCOVERING THE PROBLEM
First you may ask how I knew I was watching the Google spider, since Google doesn’t announce when it’s going to spider. That was easy; the monitoring alarms for our servers started going off. I found disk space problems all over the place, and while checking these disk space problems I found that the error log for our Web server software was growing on all our servers. Coincidence? I think not.
While watching the logs grow, I started to get a feel for the most common error – it was redirect errors being caused by .htaccess files. Upon further investigation I found that the Google spider was skipping these sites when it received the redirect error. I then started looking at the .htaccess files to find out what was causing the error, and the things I found there prompted me to write this article in an attempt to educate Webmasters on the correct usage of .htaccess files.
Listed below are the problems that I found and the correct way to fix each problem:
REDIRECTS: THE MOST COMMON ERROR
ErrorDocument 404 http://www.mydomain.com
Do you see any problems with the above statement? Most Webmasters will say no, but it is a mistake. Because the target is a full URL, Apache answers every request for a missing page with a redirect to that URL instead of a 404 status, which forces a spider off your page and into an error. Not only will the spider not list your site, it will also stop following your links, which hurts you in search engine listings.
The proper way to make this statement is as follows:
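If you’re sending people from a page on the current domain to another domain, use Redirect with a redirect status (301 or 302), the path being moved and the destination URL. Redirect accepts a destination URL only with a 3xx status, so it cannot be combined with 404. In the line below, /oldpage.html is only an example path:
Redirect 301 /oldpage.html http://www.mydomain.com/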
If you’re keeping surfers on the domain where the 404 error occurred but sending them to a custom error page then the syntax is as follows:
ErrorDocument 404 /errors/404.html
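The above path means there is a folder called “errors” and an HTML document called “404.html” in that folder. Because the path begins with a slash, it is read relative to the site’s document root, so the “errors” folder sits directly under the top-level Web folder. When the .htaccess file lives in the document root, that is the first branch off the folder holding the .htaccess file.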
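To make the difference concrete, the two forms can be set side by side. This is only a sketch based on how Apache documents the ErrorDocument directive; the domain and the /errors/ path are the placeholders used in this article, not real locations.

```apache
# Form to avoid: a full URL makes Apache answer with a redirect to that
# address, so the client (or spider) never sees the 404 status itself.
# ErrorDocument 404 http://www.mydomain.com

# Preferred form: a local path with a leading slash, read relative to the
# document root. The page is served in place of the missing one and the
# response keeps its 404 status.
ErrorDocument 404 /errors/404.html
```

The custom page has to exist at that path. If it does not, the server falls back to its built-in error message.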
ANOTHER PROBLEM WITH REDIRECTS
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://yourdomain.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com/.*$ [NC]
RewriteRule .*\.(gif|GIF|Gif|jpg|JPG|jpeg)$ http://www.mydomain.com [R,L]
Some people try to use the above syntax to catch “hotlinkers” attempting to steal bandwidth from their sites by linking directly to images. This syntax does not do the job: instead of refusing the request, the [R] flag redirects every hotlinked image request to a Web page.
The correct way to do this is as follows:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://yourdomain.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com/.*$ [NC]
RewriteRule .*\.(gif|GIF|Gif|jpg|JPG|jpeg)$ - [F]
If anybody is hotlinking to you, the request will now get a 403 (Forbidden) error, so under this statement you can use ErrorDocument from the lesson above to customize the 403 error page. Here’s how that might look:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://yourdomain.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com/.*$ [NC]
RewriteRule .*\.(gif|GIF|Gif|jpg|JPG|jpeg)$ - [F]
ErrorDocument 403 /error/403.html
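The separate RewriteCond lines can also be folded together. What follows is a sketch of a roughly equivalent, more compact rule set. It assumes mod_rewrite is enabled and that yourdomain.com stands in for your own domain. The condensed regular expression is an illustration rather than part of the rules above, and unlike them it also accepts https referers, so try it on a test site first.

```apache
RewriteEngine on

# Let through requests that send no referer at all (direct requests and
# clients that strip the referer header).
RewriteCond %{HTTP_REFERER} !^$

# Let through referers from your own domain, with or without "www.",
# over http or https. [NC] makes the match case-insensitive.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yourdomain\.com(/.*)?$ [NC]

# Any other request for an image is refused with 403 Forbidden.
# "-" means "no substitution"; [NC] also covers GIF, Jpg and so on.
RewriteRule \.(gif|jpe?g)$ - [F,NC]

# Custom page sent along with the 403 response.
ErrorDocument 403 /error/403.html
```

Escaping the dots in the domain name (\.) keeps the pattern from matching look-alike hosts, since an unescaped dot in a regular expression matches any character.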
I hope this helps you ensure that traffic gets to your most profitable pages. If you find your .htaccess file is in error then please fix it. You will reap the rewards of better-directed traffic and reduced bandwidth bills with less risk of server downtime.
Mark Jervis is a guest writer for The ADULTWEBMASTER Magazine and a tech for ProHosters.com.