Creating The Ultimate Adult Toy Store – Part Nine: Programming A Database Search Engine
BUSINESS STRATEGY
Over the past eight chapters we have delved into the details of creating an online adult toy store. Those details include setting up credit card transactions, the design and navigation of the store, and the structure of the product database. Are you ready to discuss the search engine?
THE SEARCH ENGINE
As mentioned in the previous article, there are three main components to a database-driven site – the database file, the search program, and the dynamic results page. We have already discussed the database and now it’s time to get into the most complex part of creating a toy store from scratch. This chapter may be a little more technical than the previous chapters, but this chapter is also what separates the amateur toy stores from the professional ones.
Programming a database search engine obviously requires knowing a little bit about a programming language (or hiring someone who does). A few options include Perl (run as CGI scripts), JSP, ASP, and PHP. NileXXX.com is programmed almost exclusively as Perl CGI scripts, which this series simply calls CGI, because, well, that’s the language I happen to know. Detailing the differences between all possible languages is beyond the scope of this article, but the main advice is this: If you are unfamiliar with programming languages, hire someone to help you. We used the consultants at http://OnTheOutskirts.com for components of the NileXXX.com site, and there are many companies out there that can help you.
There are a number of factors to keep in mind when programming a search engine: security, speed, and counting. Let’s examine each one.
SECURITY
Security is important to protect your database and server from malicious users. There are two ways to program for security in CGI: the inclusive way and the exclusive way. The inclusive method specifies all of the characters that the program will allow to be entered by the user. On the other hand, the exclusive method specifies all the characters that are NOT allowed to be entered by the user. I recommend the inclusive method because it gives you exact control over which characters reach your program and removes the risk of “missing one,” which is always possible with the exclusive method.
Fortunately, current versions of Perl (the programming language our CGI scripts are written in) have a very nifty feature known as “taint mode.” Basically, taint mode treats all user input, and every variable derived from it, as untrusted until your code has checked it. If you make a practice of using the taint switch (-T), security issues will become second nature to you, simply because Perl will stop your program whenever unchecked input is about to be used in a risky operation, such as running a shell command or writing to a file.
Therefore, to create a secure environment for our search engine, we are going to use taint mode and an inclusive method of censoring user-input. So, what do those two things look like in CGI? I’m glad you asked:
#!/usr/lib/bin/perl -wT
This is how the taint switch is introduced. It sits on the first line of the script, after the location of the Perl interpreter and together with the warning flag (w). The location of Perl on your server may differ, so consult your web server administrator. It has to be exact.
unless ( $string =~ /^[\w .!?-]+$/ ) { $string = ""; }
This is an inclusive character check at its most basic level. It only allows certain characters to be entered into the program (letters, numbers, underscores, spaces, periods, exclamation points, question marks, and hyphens). If any other character is entered (unless), the program simply changes the input ($string) to nothing (= "";).
If you’re really anal, and since we’re in the adult business, we all should be, you may also want to add the following line just for extra precautions:
delete @ENV{ 'IFS', 'CDPATH', 'ENV', 'BASH_ENV' }; # secures the server environment
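The inclusive check described above can be wrapped in one small helper. What follows is only a minimal sketch, not the NileXXX.com code: the name clean_input is made up for illustration, and the allow-list is the same set of characters listed above. One detail worth knowing is that, under taint mode, the usual way to mark a value as checked is to take it from a regular-expression capture (the parentheses and $1), so the sketch returns the captured text rather than the original variable. The -T switch is left off here so the snippet can be run directly from the command line.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption for this example).
# Returns the input if every character is on the allow-list,
# otherwise returns an empty string.
sub clean_input {
    my ($string) = @_;
    return '' unless defined $string;

    # Letters, numbers, underscores, spaces, periods, exclamation
    # points, question marks, and hyphens only. Returning the capture
    # ($1) is what counts as "checked" when the script runs under -T.
    if ( $string =~ /^([\w .!?-]+)$/ ) {
        return $1;
    }
    return '';
}

print "[", clean_input('blue feather'),   "]\n";   # allowed characters only
print "[", clean_input('toys; rm -rf /'), "]\n";   # contains ; and /
```

The first call prints the text unchanged inside the brackets; the second prints empty brackets, because the semicolon and slash are not on the list.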
SPEED
CGI is popular largely for two reasons: it is relatively simple and relatively fast. Speed is important, because adult-site surfers are notoriously impatient. You don’t want them to wait very long for anything because there’s too much competition out there.
On the other hand, you want to offer them a lot of products. The more products, the more crunching time your program will require. Optimizing your CGI code is important in order to make your site as fast as possible. For example, the search engine for NileXXX.com examines its entire product database in just under two seconds. In that span of time it itemizes all 8,314 products, including 9 different variables for each one. Let’s examine how it does that.
open (FILE, "database") || die "Can't connect: $!\n"; # opens the database file we discussed in the previous article
foreach $data (sort(<FILE>)) # opens a loop that sorts the rows of the database and places one row at a time into a string-variable
{
chomp ($data); # removes the line ending from that row
if ($data =~ /\Q$string\E/i) # compares the row against user input, ignoring case (\Q...\E makes punctuation in the input match literally)
{ # opens a block only for those products of the database that meet the user input
&sort_row; # calls a subroutine that counts the results
&print_row($data); # calls another subroutine that displays the results
} # closes the block for products meeting the user input
} # closes the loop that sorts the data within the database
close (FILE); # closes the database
&close_row; # calls a subroutine that determines maximum results
That’s it! Fairly simple, huh? This section of code does three basic things: 1) It opens a database file and compares the contents of each row against the user input; 2) It sends matching results to two different sub-routines – one that begins counting the results and the other that prints the results; and 3) It closes the database file and opens another sub-routine that neatly organizes all the results into chunks. To reiterate, let’s examine each of the above three tasks in turn.
1) The database file is what we discussed in the previous article. The code above specifically indicates how the search engine scans the database.
2) We’ll discuss one of the sub-routines below, counting the results. The other sub-routine, printing the results, is the topic of the next article.
3) Once the search engine performs its function with the database, it closes it. Then it finalizes the counting process. Let’s discuss that right now.
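For readers who want to try the scanning step on its own, here is a minimal, self-contained sketch of the same idea. It is not the NileXXX.com code: the name search_rows, the sample products, and the pipe-delimited row layout are all assumptions made up for this example, and the rows are read from a string in memory so that no real database file is needed.

```perl
use strict;
use warnings;

# Hypothetical helper (name assumed for this example). Reads every row
# from an open filehandle, sorts the rows, and returns the ones that
# contain the search text, ignoring case.
sub search_rows {
    my ( $fh, $string ) = @_;
    my @matches;
    foreach my $data ( sort <$fh> ) {
        chomp $data;
        # \Q...\E makes punctuation in the user's text match literally
        push @matches, $data if $data =~ /\Q$string\E/i;
    }
    return @matches;
}

# Made-up sample rows; the real database layout may differ.
my $demo = "102|Silk Blindfold|9.95\n"
         . "101|Feather Tickler|12.50\n"
         . "103|Feather Boa|19.00\n";

# Opening a reference to a string gives an in-memory filehandle.
open( my $fh, '<', \$demo ) or die "Can't connect: $!\n";
my @found = search_rows( $fh, 'feather' );
close $fh;

print "$_\n" for @found;
```

Run as is, this prints the two Feather rows, with row 101 first because the rows are sorted before they are compared. With a real database file, the open line would name the file instead of the in-memory string.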
COUNTING
There are two good reasons for your search engine to count and organize your results. One reason is for the convenience of the user. If they are told that their search returned 3,300 results, they’re likely to conduct a more specific search to lower that number. The second reason is directly related to speed. While it only takes two seconds for the NileXXX.com search engine to scan and organize its entire database, it takes considerably longer for a webpage to display 8,314 items. Therefore, we create a sub-routine to organize the results into chunks, more typically referred to as the maximum results per page or, in the case of our program, $maxShow.
Two sub-routines perform this function. The first counts every matching row and decides whether that row belongs on the page currently being displayed. It does this by incrementing the result counter and then making a decision based upon the resulting row number:
Sub-Routine Number One:
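$searchRow++; # adds 1 to the number of results meeting user input
if (($searchRow>=$startRec)&&($searchRow<$endRec)) # is this result on the current page? ($endRec is assumed to start out as $startRec + $maxShow)
{
$shownRows++; # adds 1 to the number of total rows displayed
} # returns to main code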
Sub-Routine Number Two organizes the results into $maxShow chunks. Let’s look at it:
my $priorRec = $startRec - $maxShow; # first record of the previous page
if ($priorRec < 1)
{
$priorRec = 1;
}
my $nextRec = $startRec + $maxShow; # first record of the next page
if ($endRec>$searchRow)
{
$endRec = $searchRow;
}
else
{
$endRec--;
}
if ($startRec!=1)
{
print "PREVIOUS Link to search program with start=$priorRec";
}
print "Displaying records $startRec thru $endRec of $searchRow";
if (($shownRows == $maxShow)&&($endRec != $searchRow))
{
print "NEXT Link to search program with start=$nextRec";
}
# end of program
That’s it! What does all that mean? Well, let’s look at it. The first line sets the previous page’s starting record to the current starting record minus the maximum results per page ($maxShow). If that number would fall below one, it is set to one.
The next few lines set the next page’s starting record to the current starting record plus the maximum results per page, and then adjust the ending record so that it never runs past the total number of results. After that, the program decides what to print based upon the current record numbers.
If the current page starts at any record other than the first, a “PREVIOUS” link appears which allows the user to travel back one page of results.
Either way, the range of records currently displayed is printed along with the total number of records.
If the page is full and its ending record is not the last result (or in other words, if there are still results that have not been shown), a “NEXT” link appears which allows the user to travel ahead one page of results.
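The paging arithmetic described above can be tried out in isolation. The following is a minimal sketch under stated assumptions, not the NileXXX.com code: the name page_window and the has_prev and has_next keys are made up for this example, and it works everything out from the total number of matches instead of counting $shownRows row by row as the sub-routines do.

```perl
use strict;
use warnings;

# Hypothetical helper (name assumed for this example). Given the first
# record of the current page, the page size, and the total number of
# matches, works out what the paging footer should show.
sub page_window {
    my ( $startRec, $maxShow, $searchRow ) = @_;

    my $priorRec = $startRec - $maxShow;    # start of the previous page
    $priorRec = 1 if $priorRec < 1;         # never go below record 1
    my $nextRec = $startRec + $maxShow;     # start of the next page

    my $endRec = $nextRec - 1;              # last record on a full page
    $endRec = $searchRow if $endRec > $searchRow;   # cap at the total

    return {
        priorRec => $priorRec,
        nextRec  => $nextRec,
        endRec   => $endRec,
        has_prev => ( $startRec != 1       ? 1 : 0 ),
        has_next => ( $endRec < $searchRow ? 1 : 0 ),
    };
}

# Second page of 25 results, 10 per page.
my $page = page_window( 11, 10, 25 );
print "Displaying records 11 thru $page->{endRec} of 25\n";
print "PREVIOUS starts at $page->{priorRec}\n" if $page->{has_prev};
print "NEXT starts at $page->{nextRec}\n"      if $page->{has_next};
```

With those sample numbers, the page covers records 11 thru 20, the PREVIOUS link starts at record 1, and the NEXT link starts at record 21.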
And there you have the basic components of a search engine. Of course, many details are missing, such as defining $maxShow to begin with, but that’s beyond the scope of this series. I told you this article would be complicated. I’m confused, too; but I re-read it a few times and it started to make sense. Give that a try. I will discuss printing dynamic results to the page in the next article. With all three components of the program working together, your comprehension will increase.
Or you can just open a free adult toy store franchise which handles all the details for you and gives you 50% of the profit. Go to http://nilexxx.com/store.htm if you’re interested.
Article written by Richard at NileXXX.com, home to the world’s sexiest selection of adult toys, DVDs, and clothing.