Chapter 10. String and Custom Functions: Better living through functions

image with no caption

Functions take your applications to a whole new level. You’ve already been using PHP’s built-in functions to accomplish things. Now it’s time to take a look at a few more really useful built-in functions. And then you’ll learn to build your very own custom functions to take you farther than you ever imagined it was possible to go. Well, maybe not to the point of raising laser sharks, but custom functions will streamline your code and make it reusable.

The Internet startup, RiskyJobs.biz, is designed to help companies find the right people to fill their riskiest jobs. The business model is simple: for each risky job we can fill with the right candidate, we get a commission. The more successful matches, the bigger our profit.

Risky Jobs needs help improving its site’s job-search functionality. Right now, there’s a database full of risky jobs just waiting to be discovered by the right people. Let’s take a look at the Risky Jobs search form and the underlying database of available jobs.

image with no caption
image with no caption

The SELECT query in the Risky Jobs script is very rigid, only resulting in a match if the two strings being compared are identical. This presents a problem for our job search because people need to be able to enter search terms that match job listings even if the job title isn’t an exact match.

Let’s go back to Ernesto’s search, which results in a query that searches the title column of the riskyjobs table for the text “Bull Fighter Matador”:

image with no caption

See the problem? This query is only going to match rows in the table where the title column contains the exact text “Bull Fighter Matador”. The job with the title “Matador” isn’t matched, and neither are “Firefighter” or “Electric Bull Repairer”. OK, maybe it’s good those last two were missed, but the search still isn’t working as intended. And it’s not the mixed case that presents the problem (MySQL searches are by default case-insensitive), it’s the fact that the entire search string must be an exact match due to the equality (=) operator in the WHERE clause.

What we really need is a way to search the database for a match on any portion of a search string. SQL lets us do just that with the LIKE keyword, which adds flexibility to the types of matches returned by a WHERE clause. You can think of LIKE as a more forgiving version of the = operator. Take a look at the following query, which uses LIKE to match rows where the word “fighter” appears anywhere in the title column:

image with no caption

LIKE makes it much easier to find matches, especially when you need to match the search string as part of a larger word or phrase. Check out these examples of strings that match up with the above query:

image with no caption

LIKE clauses typically work in conjunction with wildcard characters, which are stand-ins for characters in the data we’re matching. In SQL, the percent sign (%) wildcard stands for any group of zero or more characters. Placing this wildcard in a query before and after a search term, as in the SELECT statement above, tells SQL to return results whenever the term appears somewhere in the data, no matter how many characters appear before or after it.

image with no caption

Time out! Take a moment to familiarize yourself with the Risky Jobs database... and try out a few searches.

Download the riskyjobs.sql file for the Risky Jobs application from the Head First Labs web site at www.headfirstlabs.com/books/hfphp. This file contains SQL statements that build and populate the riskyjobs table with sample data.

After you’ve executed the statements in riskyjobs.sql in a MySQL tool, try out a few queries to simulate job searches. Here are some to get you started.

image with no caption
image with no caption
image with no caption

Handy indeed! We just need to figure out how to match each of the individual keywords in the search phrase.

Taking what people type in the Risky Jobs search field and matching it exactly won’t always work. The search would be much more effective if we searched on each search term entered, as opposed to searching on the entire search phrase. But how do you search on multiple terms? We could store each of the search terms in an array, and then tweak the SELECT query to search for each keyword individually.

To make Risky Jobs search functionality more effective, we need a way to break apart users’ search strings when they enter multiple words in the form field. The data our risky job seekers enter into the search form is text, which means we can use any of the built-in PHP string functions to process it. One extremely powerful function is explode(), and it breaks apart a string into an array of separate strings. Here’s an example:

image with no caption

The explode() function requires two parameters. The first is the delimiter, which is a character or characters that indicates where to break up the string. We’re using a space character as our delimiter, which means the search string is broken everywhere a space appears. The delimiter itself is not included in the resulting substrings. The second parameter is the string to be exploded.

image with no caption

Incorporating the array of search terms into Risky Jobs involves adding a line of code before we run the query on the Risky Jobs database. Now, if someone enters “Tipper Cow” into the search field, this code breaks it into two words and stores each word in an array ($search_words).

image with no caption

What we really need to do is only put OR between the LIKEs in our WHERE clause, but not after the last one. So how exactly can that be done? How about a special case inside the loop to see if we’re on the last search term, and then not include OR for that one? That would work but it’s a little messy. A much cleaner solution involves a function that does the reverse of the explode() function. The implode() function takes an array of strings and builds a single string out of them.

image with no caption

But how does that help the dangling OR problem in our query? Well, implode() lets you specify a delimiter to stick between the strings when combining them together. If we use ' OR ' as the delimiter, we can construct a WHERE clause that only has OR between each LIKE clause.

image with no caption

Are the search terms to blame or are there really no jobs for tightrope walkers?

image with no caption

The explode() function lets you explode a single string into substrings, but in this case we already have substrings.

The first call to the explode() function leaves us with multiple strings stored in an array, so there’s isn’t a single string left to explode. And attempting to explode each string in the array would likely just create more problems. Instead of trying to solve the delimiter problem with multiple calls to explode(), we need to preprocess the search string to get it down to a single delimiter before we ever even call explode(). Then it can do what it does best—break apart the string using one delimiter.

We want to hand the explode() function a string that it can break apart cleanly in one shot. How do we do that? By making sure that explode() only has to worry with a single delimiter, such as a space character. This means we need to preprocess the search string so that each search term is separated by a space, even if the user entered commas.

image with no caption

If you think about it, preprocessing the Risky Jobs search string is a lot like using find-and-replace in a word processor. In our case, we want to find commas and replace them with spaces. PHP’s str_replace() lets you do just that by supplying it three parameters: the text you want to find, the text you want to replace it with, and the string you want to perform the find-and-replace on. Here’s an example of str_replace() in action:

image with no caption

But what about those commas in the search string? The str_replace() function works just as well at replacing individual characters:

image with no caption

After this code runs, the variable $clean_string will contain the string “tightrope walker circus”.

image with no caption

Uh, no. Preprocessing gets rid of unwanted characters, but unfortunately, it doesn’t result in an array containing all good search terms.

Remember, our goal is to end up with a string where each search term is separated by exactly the same delimiter, a space. Take another look at what happened in the last three examples on the facing page. Some of the elements in the $search_words array are empty. If we try to build our WHERE clause with the empty search elements, we might end up with something like this:

image with no caption
image with no caption

Wrong! They will match everything.

If there’s a space anywhere in a job description (which is pretty much a given), this query will match it and return it as a result. So every job in the Risky Jobs database will be matched by this query. We need to get rid of those empty array elements before we construct the SQL query in order to make the search script useful again.

The good news it that it’s not too difficult to clean up our search terms before using them in a query. We’ll need to create a new array that only contains the real search terms. So we’ll copy all the non-empty elements from our first array into the second array, and then use that array to construct the SELECT query.

To construct the new array, we can use a foreach loop to cycle through each element in the original array, and then use an if statement to find non-empty elements. When we find a non-empty element, we just add it to the new array. Here’s what this process looks like:

image with no caption

We need to add code to our script that builds a new array containing only non-empty search terms.

Now let’s look at the code that will copy the non-empty elements from our $search_words array to the new $final_search_words array.

image with no caption

After checking to make sure there is at least one search term in the $search_words array, the foreach loop cycles through the array looking for non-empty elements. When it finds a non-empty element, it uses the [] operator to add the element onto the end of the $final_search_words array. This is how the new array is assembled.

Then what? Well, then we generate the SELECT query just as before, except now we use the $final_search_words array instead of $search_words:

image with no caption

This code gives us a search query that no longer has empty elements. Here’s the new query for the search “bull, matador, cape”:

SELECT * FROM riskyjobs
  WHERE description LIKE '%bull%' OR
  description LIKE '%matador%' OR
  description LIKE '%cape%'
image with no caption

Although Risky Jobs is doing much better at finding jobs, the huge job descriptions are a bit much.

What’s really irking Selma is her inability to see more job listings in her browser without doing a bunch of scrolling. It isn’t necessary to show the entire description of each job in the search results. Ideally, we really need to show part of the description of each job, maybe just the first few sentences.

Write down how you think we could trim the job descriptions so that they aren’t quite so huge in the search results:

__________________________________________

__________________________________________

__________________________________________

Since the lengths of the job descriptions in the Risky Jobs database vary quite a bit and some are quite long, we could clean up the search results by chopping all the descriptions down to a smaller size. And to keep from confusing users, we can just stick an ellipsis (...) on the end of each one to make it clear that they’re seeing only part of each description.

The PHP substr() function is perfect for extracting part of a string. You pass the “substring” function the original string and two integers. The first integer is the index of where you want the substring to start, and the second integer is its length, in characters. Here’s the syntax:

image with no caption

When it comes to the substr() function, you can think of a string as being like an array where each character is a different element. Consider the following string:

$job_desc = 'Are you a practioner of the lost art of cat juggling? ';

Similar to elements in an array, each character in this string has an index, starting at 0 and counting up until the end of the string.

image with no caption

We can use these character indexes with the substr() function to grab portions of the string:

image with no caption

The substr() function is not limited to grabbing substrings from the start of a string. You can also extract characters starting from the end of a string. The extraction still works left to right; you just use a negative index to identify the start of the substring.

image with no caption

Here are a couple of examples:

image with no caption

In order to allow visitors to sort their search results, we need a way for them to identify how they want their results ordered. Maybe a form... or a button? It’s actually way simpler than that. We can use HTML to turn each of the column headings in the search result table into links. Users can click a link to indicate which column they want to sort the results by.

image with no caption

We can use these links to reload the same Search script but with a query that sorts the results according to the link clicked. We already know how to use ORDER BY to structure a query with sorted results. If we create different SQL queries to ORDER BY each individual column, we can allow the user to sort the search results alphabetically by job title, description, or state, or chronologically by date posted.

Here is the SQL query to sort results alphabetically by job description:

image with no caption
image with no caption

Yes. While it’s true that we’ll need to run a different query when a user clicks a different link, it’s possible to construct a single query based on the link clicked.

The first time the results are displayed, no links have been clicked so we don’t have to worry about sorting. We can just take the keywords submitted into our form and build a query without an ORDER BY. The results are displayed with clickable headings, each of which links back to the script, but with a different sort order. So each link consists of a URL with the original keywords and a parameter named sort that indicates which order the results should be in.

What would really help in pulling this off is if we create our very own custom function that takes information about how to sort the job data, and returns a string with the WHERE clause and ORDER BY in place. Our new custom function takes a look at the sort parameter to figure out how to sort the search results. Here are the steps the function has to follow:

This might look like a lot of work, but we already have most of the code written. We just need to turn it into a function. But before we do that, let’s take a look at how to put together custom functions...

A function is a block of code separate from the rest of your code that you can execute wherever you want in your script. Until now, you’ve used built-in functions that PHP has already created. explode(), substr(), and mysqli_query() are all functions that are predefined by PHP and can be used in any script.

But you can also write your own custom functions to provide features not supplied by the language. By creating a custom function, you can use your own code again and again without repeating it in your script. Instead, you just call the function by name when you want to run its code.

Following is an example of a custom function called replace_commas() that replaces commas with spaces in a string:

image with no caption

When you’re ready to use a custom function, just call it by name and enter any values that it expects in parentheses. If the function is designed to return a value, you can assign it to a new variable, like this:

image with no caption

We’ve already written much of the code we need to create the custom function that generates a Risky Jobs search query. All that’s left is dropping the code into a PHP function skeleton. Here’s the custom build_query() function:

image with no caption

The build_query() function returns a complete SQL query based on the search string passed into it via the $user_search argument. To use the function, we just pass along the search data entered by the user, and then store the result in a new string that we’ll call $search_query:

image with no caption
image with no caption

Absolutely. We can pass the build_query() function two parameters instead of just one.

We’re already passing the function the $user_search argument, which contains the user’s search terms. Now we need another argument, $sort, that indicates how to sort the data. The new $sort argument needs to control the order of data returned by the query in the six ways we came up with back in Multiple queries can sort our results: sorting by the job_title, state, and date_posted columns of the riskyjobs table in both ascending and descending order.

We could store the actual ORDER BY strings in $sort to indicate the sort order. Or we could use the numbers 1 through 6 to represent each of the sorts, like this:

image with no caption

But aren’t integers more cryptic when reading through your code? Without helpful comments, yes, but there’s a more important reason to go with integers here. If we used ORDER BY strings, our data would show up in the URL of the script as part of each heading link. This would inadvertently reveal our table’s column names, which you’d rather not make public for security reasons.

image with no caption

Yes, users must specify how to sort the search results, just as they specify the search terms themselves.

The good news is we already know how we want to implement this functionality: we’re going to turn the column headings on our results page into hyperlinks. When a user clicks on a given heading, like “State,” we’ll pass the number for sorting by state into our build_query() function.

But we still have to get the sort order from the link to the script. We can do this when generating custom links for the headings by tacking on a sort parameter to the URL:

image with no caption

When the results page is generated, each heading link (except “Job Description”) has its own customized URL, including a sort value for how the results should be sorted.

image with no caption
image with no caption

Joe: Normally, the same heading would allow the user to sort in either ascending or descending order.

Jill: That’s right. Each time they click a heading it just reverses the order.

Frank: Doesn’t that mean we now have to somehow keep up with the state of the headings each time the user clicks them because they now have to link differently depending on what link they currently hold.

Joe: I don’t see what you mean.

Frank: Well, the headings don’t always do the same sort. For example, if you click the “Job Title” heading and it sorts the results by ascending job titles, then the link must change to now sort on descending job titles the next time it is clicked.

Jill: That’s right. But keep in mind that each type of sort has a number in the link URL to let the script know what kind of sort to carry out. And since we’re generating those links, we can control exactly what sort numbers get put in them.

Joe: I see. So the challenge for us is to somehow structure our code to be able to generate the correct link based on the current state of the sort, right?

Frank: Ah, I’ve got it! Isn’t that something we can solve with a few if statements? I mean, that’s the kind of decision making they’re good for, right?

Joe: Yes, that would work but we’re talking about several decisions involving the exact same piece of data, the sort type. It would really be nice if we could come up with a better way to make those decisions other than a bunch of nested if-else statements.

Jill: That’s a great point, and it’s a perfect opportunity to try out a new statement I heard about. The switch statement lets you make multiple decisions, way more than two, based solely on a single value.

Frank: That sounds great. Let’s give it a try.

Joe: I agree. Anything to avoid complicated if-else statements. Those things give me a headache!

Jill: Yeah, me too. I think the switch statement might just be the ticket...

The switch statement offers an efficient way to check a value and execute one of several different blocks of code depending on that value. This is something that would require a small army of if-else statements, especially in situations involving quite a few options.

Instead of writing nested if-else statements to check for each possible value, you instead write a switch statement that has a case label corresponding to each possible value. At the end of each case label, you put the statement break;, which instructs PHP to drop out of the entire switch statement and not consider any other cases. This ensures that PHP will execute the code in no more than one case.

Let’s take a look at an example that uses switch:

image with no caption

We now have two functions to handle Risky Jobs searches. build_query() constructs an SQL query based on search terms entered by the user, and generate_sort_links() generates hyperlinks for the search result headings that allow the user to sort the results. But build_query() isn’t quite finished since the query it generates doesn’t yet sort. The function needs to append an ORDER BY clause to the query. But it has to be the correct ORDER BY clause, as determined by a new $sort argument:

image with no caption
image with no caption

We’re displaying all of our results on a single page right now, which is a problem when a search matches lots of jobs. Instead of forcing users to scroll up and down a huge page to see all the job matches, we can use a technique called pagination to display the search results. When you paginate results, you break the collection of job matches into groups, and then display each group on a separate web page, like this:

image with no caption
image with no caption

We need a query that will return just a subset of the results, not all of them.

Luckily, SQL already gives us a way to do that: the LIMIT clause. Let’s revisit LIMIT and see how we can use it to split our results up into groups of five...

The key to controlling which rows we display on any given page is to add another clause to our search query, a LIMIT clause. To get a maximum of five rows, we add LIMIT 5 to the end of our query, like this:

image with no caption

If you recall, we use the custom build_query() function to create our Risky Jobs query. To force it to only display the first five matches, we just concatenate LIMIT 5 to the end of the query string after it’s built:

image with no caption

This works well for getting the first five rows of results, but what about the next five rows, and the five rows after that? To pull out rows deeper in the result set, we have to change our LIMIT up a bit. But how? LIMIT 10 would get the first 10 rows, so that wouldn’t work. We need to get rows 6 through 10, and to do that we use LIMIT with different syntax. When you add two arguments to LIMIT, the first arguments controls how many rows you skip, and the second argument controls how many rows you get back. For example, here’s how you get rows 11 through 25, which would be the third page of results:

An important part of pagination is providing links that allow the user to move back and forth among the different pages of results. We can use the LIMIT clause to set up the naviagation links for the bottom of each page of results. For example, the “next” and “previous” links each have their own LIMIT. The same thing applies to the numeric links that allow the user to jump straight to a specific page of results.

Here are the LIMIT clauses for the first three pages of search results, along with LIMITs for some of the page links:

image with no caption
image with no caption

Sort of. We need a different LIMIT depending on the page and link, but we can generate it instead of writing multiple queries.

All we need to do is modify our build_query() function a little further to add the correct LIMIT at the end of the query it constructs.

In order to add the new pagination functionality to build_query(), we need to set up and keep track of some variables that determine which search results to query and show on a given page. These variables are also important in determining how the navigation links at the bottom of the page are generated.

image with no caption

Most of the pagination variables can be set up purely through information provided via the URL, which is accessible through the $_GET superglobal. For example, the $sort, $user_search, and $cur_page variables all flow directly from GET data. We can then use these variables to calculate how many rows to skip to get to the first row of data, $skip. The $results_per_page variable is a little different in that we just set it to however many search results we want to appear on each page, which is more of a personal preference given the layout of the results page.

image with no caption

We’re still missing a couple of important variables: $total and $num_pages. These variables can only be set after performing an initial query to find out how many matches are found in the database. Once we know how many matches we have, it’s possible to set these variables and then LIMIT the results...

Now that we’ve got our variables set up, we need to revise the Search script so that instead of querying for all results, it queries for just the subset of results we need for the page the user is currently viewing. This involves first doing a query so that the $total variable can be set and the $num_pages variable can be calculated. Then we follow up with a second query that uses $skip and $results_per_page to generate a LIMIT clause that we add to the end of the query. Here’s the revised section of the search.php script with these new additions highlighted:

image with no caption

So we’ve set up some variables and built a new SQL query that returns a subset of results for the page. All that’s left to do is to generate the page navigation links for the bottom of the search results page: the “previous” link, numerical links for each page of results, and the “next” link. We already have all the information we need to put together the links. Let’s go over it again to make sure it’s clear how it will be used.

image with no caption

OK, we know what information we need in order to generate the page navigation links, so we’re ready to crank out the PHP code to make it happen. This code could just be dropped into the search.php script, but what if we put it in its own custom function? That way the main script code that generates the search results can be much simpler, requiring only a single line of code to generate the page links—a call to to this new function, which we’ll call generate_page_links().

The only catch is that we don’t want this function to get called if there is only one page of results. So we need to do a check on the number of pages before calling the new generate_page_links() function. Here’s how we can perform the check and call the function, making sure to pass along the required information as function arguments:

And finally we arrive at a complete Risky Jobs Search script that displays the appropriate search results based on the user’s search terms, generates clickable result heading links for sorting, paginates those results, and generates page navigation links along the bottom of the page.

image with no caption
image with no caption

The Risky Jobs Search script required quite a few new PHP and MySQL techniques. Let’s recap some of the most important ones.

image with no caption