Tips On Preventing Bandwidth Theft, Hotlinking And File Leeching
Bandwidth theft, hotlinking, file leeching, bandwidth leeching, external linking, remote linking, deep linking and direct linking are all words and phrases used to describe a single problem faced by many Webmasters. They describe the practice of building Web pages that contain unauthorized content links to files hosted by another site.
Notice that I said content links and not navigation links that lead to another site. Content links are file references that the browser fetches to draw the page such as images, style sheets, scripts or even complete Web pages that are rendered within a frame. In other words, these are embedded content or embedded objects within an HTML page.
The result of hotlinking is that the offending site is able to present its pages without paying for the bandwidth needed to serve up the stolen content. The victim’s site ends up paying the bandwidth expense for serving up the files without gaining any page views. Many Webmasters would not mind if an image were copied and hosted by another site, especially if permission was sought in advance. The objection in the case of hotlinking is paying bandwidth bills for someone else’s benefit.
There are two levels at which you can apply controls to prevent hotlinking. One option is to control it at the Web server level. In Apache, this is typically implemented using mod_rewrite, while in IIS it would be implemented using an ISAPI filter. The other possibility is to use a scripting facility such as Apache with PHP or IIS with ASP to control access to the resources to be protected from hotlinking. Whatever bandwidth protection tool or technique is picked to combat hotlinks, the task remains the same. First, decide whether the request is a permitted, legitimate link or a hotlink originating from another site. Second, send the file or drop the hotlinked request. Studying the available solutions shows that the mechanisms they rely on are the HTTP referrer header, browser cookies, dynamic session identifiers and dynamic link manipulation.
The HTTP referrer (spelled "Referer" in the actual header name) is an HTTP request header sent by the browser that tells the server or script which site and page the current request came from. There are certain notable exceptions that must be accommodated. The referrer value will be blank if the request was a URL type-in, if an intervening proxy server deleted it, if the request is an http:// reference originating from an https:// page, if the request is being masked by Internet privacy software or if the request is being modified by browser privacy settings. The referrer can also be a nonsensical string if it is being masked by Internet privacy software or browser privacy settings.
The biggest security hole in depending on the HTTP referrer header is that a blank referrer must almost always be permitted in the server settings. This is necessary to accommodate legitimate users who are reaching the site through normal means but presenting a blank referrer string. In this scenario it is trivial to create a Web page that will always present a blank referrer. One method is to use JavaScript to write the image links at the client browser. A second method is to do a meta-refresh. Depending on the browser, either method can cause a blank referrer to be sent to the server.
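To make the referrer check and its blank-referrer weakness concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the names ALLOWED_HOSTS and is_permitted, and the host names themselves, are hypothetical and do not come from any particular product or server module.

```python
from urllib.parse import urlsplit

# Hypothetical list of hosts that are allowed to embed our files.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_permitted(referer, allow_blank=True):
    """Return True to serve the file, False to treat the request as a hotlink."""
    # Blank referrer: type-ins, proxies and privacy tools all produce this.
    if referer is None or referer.strip() == "":
        return allow_blank
    try:
        host = urlsplit(referer.strip()).hostname
    except ValueError:
        host = None
    # A nonsensical or masked value has no usable host; treat it like a blank.
    if host is None:
        return allow_blank
    return host in ALLOWED_HOSTS
```

With allow_blank left at True, a request carrying no referrer at all is served, which is exactly the hole described above. Setting it to False closes the hole but also turns away legitimate visitors whose browsers or proxies strip the header.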
A browser cookie is just another HTTP request header. It returns information that the server previously asked the client to store and send back with every HTTP request. When client cookies are available, they can be a very reliable tracking device.
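A cookie-based check might look like the following Python sketch. The assumption is that the site sets a cookie when it serves one of its own HTML pages and then requires that cookie on requests for protected files. The cookie name site_visit and both function names are hypothetical.

```python
from http.cookies import CookieError, SimpleCookie

# Hypothetical cookie name, set when a visitor loads one of our own pages.
COOKIE_NAME = "site_visit"


def make_set_cookie_header(token):
    """Build the value of a Set-Cookie header to send with an HTML page."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = token
    cookie[COOKIE_NAME]["path"] = "/"
    return cookie[COOKIE_NAME].OutputString()


def has_valid_cookie(cookie_header, expected_token):
    """Check the Cookie request header that accompanies a file request."""
    if not cookie_header:
        return False  # cookies turned off, or the visitor never loaded our page
    cookie = SimpleCookie()
    try:
        cookie.load(cookie_header)
    except CookieError:
        return False
    morsel = cookie.get(COOKIE_NAME)
    return morsel is not None and morsel.value == expected_token
```

A request for an image embedded in another site's page would normally arrive without this cookie and could be dropped. The same is true of a legitimate visitor who has disabled cookies, so a real deployment would need a fallback for that case.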
However, as concerns for privacy grow on the Internet, increasing numbers of users are sending blank or masked referrer headers and turning off browser cookies. Of course, this reduces the effectiveness of depending on these features as identifiers for bandwidth protection purposes.
Dynamic session identifiers and dynamic link manipulation refer to the technique of modifying parts of URLs for each unique client. The limiting factor is that the pages containing such links cannot be static HTML. Each page must be generated per request by a scripting engine such as PHP, ASP, ASP.NET, ColdFusion or Java. The server works harder, the user cannot cache the page and search engines may have a hard time crawling these pages if query strings are involved.
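One common way to do dynamic link manipulation is to append an expiring, signed token to each file URL when the page is generated. The Python sketch below illustrates that idea and is not the method used by any specific product. SECRET_KEY, the function names and the choice of the client's IP address as the client identifier are all assumptions made for the example.

```python
import hashlib
import hmac
import time

# Hypothetical server-side secret; it must never be sent to the client.
SECRET_KEY = b"change-me"


def sign_path(path, client_id, expires):
    """Compute an HMAC over the file path, the client and the expiry time."""
    message = f"{path}|{client_id}|{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_link(path, client_id, ttl_seconds=300, now=None):
    """Build the per-client URL that the scripting engine writes into the page."""
    now = int(time.time()) if now is None else int(now)
    expires = now + ttl_seconds
    token = sign_path(path, client_id, expires)
    return f"{path}?expires={expires}&token={token}"


def verify_link(path, client_id, expires, token, now=None):
    """Decide whether to send the file when the URL is requested."""
    now = int(time.time()) if now is None else int(now)
    if now > int(expires):
        return False  # the link has expired
    expected = sign_path(path, client_id, int(expires))
    # Constant-time comparison of the expected and supplied tokens.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

A link copied into another site's page stops working once it expires or when it is requested by a different client. The token travels in the query string here, which is the form that can cause the caching and crawling trouble mentioned above.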
There is now an ISAPI filter-based product for IIS that overcomes the difficulties normally inherent in dynamic URL manipulation. The company is aptly called coldlink.com, and they have an in-depth bandwidth protection demonstration site where you can see their product prevent hotlinking on an IIS server. It protects any kind of file content and works with both static and dynamic HTML pages without depending on cookies or HTTP referrer headers.
Internet searches on Google, AllTheWeb, Yahoo or MSN that will yield useful updated instructions and source code for implementing bandwidth protection include:
· hotlink .htaccess
· hotlink mod_rewrite
· hotlink php symlink
· hotlink webmasterworld
· leechblocker
As an aside, some Webmasters have also been making use of non-technical means by enforcing their legal rights. This is particularly true if the offending action falls within the scope of the Digital Millennium Copyright Act (DMCA). Success in following this avenue can be mixed. The first step is a strongly worded cease and desist letter. In some jurisdictions, a formal cease and desist letter is a necessary step before further legal action. A particularly nice touch is to have your lawyer send the initial email and include your ISP.
Bob can be contacted by email through his Web site at http://coldlink.com/. The original version of this article can be viewed at http://wordworx.com/ and may have additional information added in the future.