“Google Killer” Killing Websites Instead
CYBERSPACE — Cuil, a search engine that has named itself the “Google Killer” because of its ability to index three times as many sites as Google and 10 times as many as Microsoft, now stands accused of killing websites because its bots require too much bandwidth to do their meticulous jobs.
Tech Crunch reported last week that Cuil’s Twiceler indexing bot is so zealous about thoroughly indexing websites that site servers sometimes roll over and give up. According to Cuil’s website, the bot “stays on a page and analyzes … its content, its concepts, their inter-relationships and the page’s coherency” until it has a complete picture of everything the page offers.
“I don’t know what spawned [the assault], but when Cuil attempts to index a site, it does so by completely hammering it with traffic,” a tipster wrote to Tech Crunch. “So much, that it completely brings the site down. We’re 24 hours into this ‘index’ of the site, and I’ve had to restrict traffic to the site down to two packets per second, while discarding the rest, or otherwise it makes the site unusable.”
Forums at The Admin Zone are brimming with reports of “Cuil attacks.” According to one poster, Twiceler “leeched enormous amounts of bandwidth — nearly 2GB this month until it was blocked. It visited nearly 70,000 times!”
According to victims of Twiceler’s indexing, the bot’s activities seem somewhat random and amateurish. Often, some noted, the bot attempts to guess URLs of pages that aren’t public in order to index those, too.
Cuil Operational Engineer James Akers attempted to quell the furor by admitting Twiceler is a beta product and may not be quite ready for primetime.
“Twiceler is an experimental crawler that we are developing for our new search engine,” he responded to a blogger and site developer known as Jazzy Chad. “It is important to us that it obey robots.txt, and that it not crawl sites that do not wish to be crawled. If you wish I will be glad to add your site to our list of sites to exclude.”
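For site owners who want to confirm what their own robots.txt tells a crawler identifying itself as Twiceler, the check can be sketched with Python's standard urllib.robotparser module. This is only an illustration: the robots.txt rules, the example.com URLs and the is_allowed helper are hypothetical, and the sketch shows what a compliant crawler should do, not how Twiceler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Twiceler entirely and keep
# every other crawler away from /private/ only.
ROBOTS_TXT = """\
User-agent: Twiceler
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Twiceler", "OtherBot"):
        for path in ("/index.html", "/private/report.html"):
            url = "http://example.com" + path
            print(agent, path, is_allowed(ROBOTS_TXT, agent, url))
```

A crawler that honors robots.txt would skip every URL for which the helper returns False. A crawler that ignores the file has to be stopped at the server or firewall instead.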
He also warned system administrators that occasionally Cuil has encountered imposters masquerading as Twiceler. Akers provided the URL of a page listing Cuil’s IP addresses (http://www.cuil.com/info/webmaster_info/) so sysadmins could be sure they were under attack by Twiceler and not someone else.
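A sysadmin could compare the source address of a suspicious request against such a published list along these lines. The sketch uses Python's standard ipaddress module. The ranges below are reserved documentation addresses standing in for the real list, which is not reproduced here, and the is_published_crawler_ip helper is a hypothetical name.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT Cuil's real
# addresses. Substitute the ranges the crawler operator publishes.
PUBLISHED_CRAWLER_RANGES = ["192.0.2.0/24", "198.51.100.16/28"]


def is_published_crawler_ip(ip: str, ranges=PUBLISHED_CRAWLER_RANGES) -> bool:
    """Return True if ip falls inside any of the published ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in ranges)


if __name__ == "__main__":
    # Addresses as they might appear in an access log (made up).
    for ip in ("192.0.2.55", "198.51.100.40", "203.0.113.9"):
        verdict = "published range" if is_published_crawler_ip(ip) else "possible imposter"
        print(ip, verdict)
```

A request that claims the Twiceler user agent but arrives from outside the published ranges would be a candidate imposter, assuming the published list is complete and current.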
“It doesn’t look like the pelting of sites by [Cuil’s] Twiceler bot is an isolated incident,” Don Reisinger wrote at Tech Crunch. “And if it’s true that Twiceler is trying to find pages on sites that don’t even exist to simply increase the index size, Cuil should work quickly to modify the bot before it receives even more negative publicity.”