If you create forms, sooner or later you'll need to create the server-side application that processes them. Don't panic. There is nothing magic about server-side programming, nor is it overly difficult. With a little practice and some perseverance, you'll be cranking out forms applications.
The most important advice we can give about forms programming is easy to remember: copy others' work. Writing a forms application from scratch is fairly hard; copying a functioning forms application and modifying it to support your form is far easier.
Fortunately, server vendors know this, and they usually supply sample forms applications with their server. Rummage about for a directory named cgi-src, and you should discover a number of useful examples you can easily copy and reuse.
We can't hope to replicate all the useful stuff that came with your server or provide a complete treatise on forms programming. What we can do is offer a simple example of GET and POST applications, giving you a feel for the work involved and hopefully getting you moving in the right direction.
Before we begin, keep in mind that not all servers invoke these applications in the same manner. Our examples cover the broad class of servers derived from the original National Center for Supercomputing Applications (NCSA) HTTP server. They should also work with the very popular, open source Apache server. In all cases, consult your server documentation for complete details. You will find even more detailed information in CGI Programming with Perl, by Scott Guelich, Gunther Birznieks, and Shishir Gundavaram, and Webmaster in a Nutshell, by Stephen Spainhour and Robert Eckstein, both published by O'Reilly.
One alternative to CGI programming is the Java servlet model, covered in Java Servlet Programming, by Jason Hunter with William Crawford (O'Reilly). Servlets can be used to process GET and POST form submissions, although they are actually more general objects. There are no examples of servlets in this book.
Before we begin, we need to discuss how server-side applications end. All server-side applications pass their results back to the server (and on to the user) by writing those results to the application's standard output as a MIME-encoded file. Hence, the first line of the application's output must be a MIME Content-Type descriptor. If your application returns an HTML document, the first line is:
Content-type: text/html
The second line must be completely empty. Your application can return other content types, too—just include the correct MIME type. A GIF image, for example, is preceded with:
Content-type: image/gif
Generic text that is not to be interpreted as HTML can be returned with:
Content-type: text/plain
This is often useful for returning the output of other commands that generate plain text rather than HTML.
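As a rough illustration of the output rule described above, the following C sketch builds the header line and the mandatory empty line for any MIME type. The function name format_content_type is our own invention for this example and is not part of any server's library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format the CGI response header for mime_type into
 * buf, followed by the required empty line.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_content_type(char *buf, size_t size, const char *mime_type)
{
    int n;

    assert(buf != NULL && mime_type != NULL);

    /* The second "\n" produces the completely empty second line. */
    n = snprintf(buf, size, "Content-type: %s\n\n", mime_type);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

An application would write the resulting string to its standard output before anything else, then follow it with the document itself.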
With the GET method, the browser passes form parameters as part of the URL that invokes the server-side forms application. A typical invocation of a GET-style application might use a URL like this:
http://www.kumquat.com/cgi-bin/dump_get?name=bob&phone=555-1212
When the www.kumquat.com server processes this URL, it invokes the application named dump_get that is stored in the directory named cgi-bin. Everything after the question mark is passed to the application as parameters.
Things diverge a bit at this point, due to the nature of the GET-style URL. While forms place name/value pairs in the URL, it is possible to invoke a GET-style application with only values in the URL. Thus, the following is a valid invocation as well, with parameters separated by plus signs (+):
http://www.kumquat.com/cgi-bin/dump_get?bob+555-1212
This is a common invocation when the browser references the application via a searchable document with the <isindex> tag. The parameters typed by the user into the document's text-entry field get passed to the server-side application as unnamed parameters separated by plus signs.
If you invoke your GET application with named parameters, your server passes those parameters to the application in one way; unnamed parameters are passed differently.
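To make the distinction concrete, here is a small C sketch of the test a server might apply. It assumes the usual CGI convention that a query string with no equals sign is treated as a list of unnamed values; check your server documentation, since this is an assumption about its behavior, and the function name is_unnamed_query is ours.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the query string holds unnamed,
 * <isindex>-style values (no '=' anywhere), and 0 if it is empty or
 * holds name=value pairs. */
int is_unnamed_query(const char *query)
{
    assert(query != NULL);
    return query[0] != '\0' && strchr(query, '=') == NULL;
}
```

With the two example URLs above, the query bob+555-1212 would count as unnamed, while name=bob&phone=555-1212 would not.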
Named parameters are passed to GET applications by creating an environment variable named QUERY_STRING and setting its value to the entire portion of the URL following the question mark. Using our previous example, the value of QUERY_STRING would be set to:
name=bob&phone=555-1212
Your application must retrieve this variable and extract from it the parameter name/value pairs. Fortunately, most servers come with a set of utility routines that perform this task for you, so a simple C program that just dumps the parameters might look like this:
#include <stdio.h>
#include <stdlib.h>

#define MAX_ENTRIES 10000

typedef struct {
    char *name;
    char *val;
} entry;

/* Utility routines supplied with the server */
char *makeword(char *line, char stop);
char x2c(char *what);
void unescape_url(char *url);
void plustospace(char *str);

int main(int argc, char *argv[])
{
    entry entries[MAX_ENTRIES];
    int num_entries, i;
    char *query_string;

    /* Get the value of the QUERY_STRING environment variable */
    query_string = getenv("QUERY_STRING");
    if (query_string == NULL)
        query_string = "";

    /* Extract the parameters, building a table of entries */
    for (num_entries = 0;
         query_string[0] && num_entries < MAX_ENTRIES;
         num_entries++) {
        entries[num_entries].val = makeword(query_string, '&');
        plustospace(entries[num_entries].val);
        unescape_url(entries[num_entries].val);
        entries[num_entries].name = makeword(entries[num_entries].val, '=');
    }

    /* Spit out the HTML boilerplate */
    printf("Content-type: text/html\n");
    printf("\n");
    printf("<html>\n");
    printf("<head>\n");
    printf("<title>Named Parameter Echo</title>\n");
    printf("</head>\n");
    printf("<body>\n");
    printf("You entered the following parameters:\n");
    printf("<ul>\n");

    /* Echo the parameters back to the user */
    for (i = 0; i < num_entries; i++)
        printf("<li> %s = %s\n", entries[i].name, entries[i].val);

    /* And close out with more boilerplate */
    printf("</ul>\n");
    printf("</body>\n");
    printf("</html>\n");
    return 0;
}
The example program begins with a few declarations for the utility routines that scan through a character string and extract the parameter names and values.[*] The body of the program obtains the value of the QUERY_STRING environment variable using the getenv( ) library function, uses the utility routines to extract the parameters from that value, and then generates a simple HTML document that echoes those values back to the user.
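If your server does not supply routines such as makeword, plustospace, and unescape_url, the decoding they perform is not hard to approximate. The sketch below is not the vendor code; decode_in_place and split_pair are hypothetical stand-ins that we wrote for illustration, assuming the standard URL encoding in which a plus sign means a space and %XX is a hexadecimal byte value.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Convert one hexadecimal digit to its value; caller has checked isxdigit(). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Hypothetical stand-in for plustospace() plus unescape_url(): decode a
 * URL-encoded string in place. Malformed % sequences are copied unchanged. */
void decode_in_place(char *s)
{
    char *out = s;

    assert(s != NULL);
    while (*s) {
        if (*s == '+') {
            *out++ = ' ';
            s++;
        } else if (*s == '%' && isxdigit((unsigned char)s[1])
                             && isxdigit((unsigned char)s[2])) {
            *out++ = (char)(hex_value(s[1]) * 16 + hex_value(s[2]));
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Hypothetical helper: split "name=value" at the first '='. The name is
 * terminated in place; the return value points at the value, or at an
 * empty string if there is no '='. */
char *split_pair(char *pair)
{
    char *eq;

    assert(pair != NULL);
    eq = strchr(pair, '=');
    if (eq == NULL)
        return pair + strlen(pair);
    *eq = '\0';
    return eq + 1;
}
```

In this sketch we split each pair before decoding it, so that an encoded equals sign (%3D) inside a name or value is not mistaken for the separator. That ordering is our design choice, not something the server's routines are known to do.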
For real applications, you should insert your actual processing code after the parameter extraction and before the HTML generation. Of course, you'll also need to change the HTML generation to match your application's functionality.
Unnamed parameters get passed to the application as command-line parameters. This makes writing the server-side application almost trivial. Here is a simple shell script that dumps the parameter values back to the user:
#!/bin/csh -f
#
# Dump unnamed GET parameters back to the user
echo "Content-type: text/html"
echo
echo '<html>'
echo '<head>'
echo '<title>Unnamed Parameter Echo</title>'
echo '</head>'
echo '<body>'
echo 'You entered the following parameters:'
echo '<ul>'
foreach i ($*)
    echo '<li>' $i
end
echo '</ul>'
echo '</body>'
echo '</html>'
exit 0
Again, we follow the same general style: output a generic document header, including the MIME Content-Type, followed by the parameters and some closing boilerplate. To convert this to a real application, replace the foreach loop with commands that actually do something.
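If you would rather write this kind of application in C, the same command-line parameters arrive in the argv array of main( ). The following sketch mirrors the foreach loop of the shell script; the function name echo_unnamed is hypothetical, and the output stream is passed in so that the routine can be exercised without a server.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write each command-line parameter, argv[1] onward,
 * as an HTML list item. Returns the number of items written. */
int echo_unnamed(FILE *out, int argc, char *argv[])
{
    int i;

    assert(out != NULL);
    for (i = 1; i < argc; i++)
        fprintf(out, "<li>%s\n", argv[i]);
    return argc > 1 ? argc - 1 : 0;
}
```

A complete program would print the Content-type header and opening boilerplate, call echo_unnamed(stdout, argc, argv) from main( ), and then print the closing tags.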
Forms-processing applications that accept HTML/XHTML POST-style parameters expect to read encoded parameters from their standard input. Like GET-style applications with named parameters, they can take advantage of the server's utility routines to parse these parameters.
Here is a program that echoes the POST-style parameters back to the user:
#include <stdio.h>
#include <stdlib.h>

#define MAX_ENTRIES 10000

typedef struct {
    char *name;
    char *val;
} entry;

/* Utility routines supplied with the server */
char *makeword(char *line, char stop);
char *fmakeword(FILE *f, char stop, int *len);
char x2c(char *what);
void unescape_url(char *url);
void plustospace(char *str);

int main(int argc, char *argv[])
{
    entry entries[MAX_ENTRIES];
    int num_entries, i;
    int cl;                 /* bytes of POST data left to read */
    char *content_length;

    /* The server reports the size of the POST data in CONTENT_LENGTH */
    content_length = getenv("CONTENT_LENGTH");
    cl = content_length ? atoi(content_length) : 0;

    /* Parse parameters from stdin, building a table of entries */
    for (num_entries = 0;
         cl > 0 && !feof(stdin) && num_entries < MAX_ENTRIES;
         num_entries++) {
        entries[num_entries].val = fmakeword(stdin, '&', &cl);
        plustospace(entries[num_entries].val);
        unescape_url(entries[num_entries].val);
        entries[num_entries].name = makeword(entries[num_entries].val, '=');
    }

    /* Spit out the HTML boilerplate */
    printf("Content-type: text/html\n");
    printf("\n");
    printf("<html>\n");
    printf("<head>\n");
    printf("<title>Named Parameter Echo</title>\n");
    printf("</head>\n");
    printf("<body>\n");
    printf("You entered the following parameters:\n");
    printf("<ul>\n");

    /* Echo the parameters back to the user */
    for (i = 0; i < num_entries; i++)
        printf("<li> %s = %s\n", entries[i].name, entries[i].val);

    /* And close out with more boilerplate */
    printf("</ul>\n");
    printf("</body>\n");
    printf("</html>\n");
    return 0;
}
Again, we follow the same general form. The program starts by declaring the various utility routines needed to parse the parameters, along with a data structure to hold the parameter list. The actual code begins by reading the parameter list from the standard input and building a list of parameter names and values in the array named entries. Once this is complete, a boilerplate document header is written to the standard output, followed by the parameters and some closing boilerplate.
Like the other examples, this program is handy for checking the parameters being passed to the server application early in the forms- and application-debugging process. You can also use it as a skeleton for other applications by inserting appropriate processing code after the parameter list is built up and altering the output section to send back the appropriate results.
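If the server's utility routines are not available, an application can read the encoded POST data itself. The sketch below assumes the common CGI convention that the server sets a CONTENT_LENGTH environment variable giving the number of bytes waiting on standard input; parse_content_length and read_post_body are hypothetical names of our own.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: convert the text of CONTENT_LENGTH to a byte count.
 * Returns 0 if the value is missing or is not a plain decimal number. */
size_t parse_content_length(const char *value)
{
    char *end;
    unsigned long n;

    if (value == NULL || !isdigit((unsigned char)value[0]))
        return 0;
    n = strtoul(value, &end, 10);
    return *end == '\0' ? (size_t)n : 0;
}

/* Hypothetical helper: read exactly length bytes from in into a newly
 * allocated, NUL-terminated buffer. Returns NULL if memory runs out or
 * the input ends early. The caller frees the result. */
char *read_post_body(FILE *in, size_t length)
{
    char *body;

    assert(in != NULL);
    body = malloc(length + 1);
    if (body == NULL)
        return NULL;
    if (fread(body, 1, length, in) != length) {
        free(body);
        return NULL;
    }
    body[length] = '\0';
    return body;
}
```

An application might call read_post_body(stdin, parse_content_length(getenv("CONTENT_LENGTH"))) and then split the resulting string at each ampersand, just as it would for a GET-style QUERY_STRING.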
[*] These routines are usually supplied by the server vendor. They are not part of the standard C or Unix library.