You want to extract the input parameters that were submitted as part of a form or specified at the end of a URL.
Use the capabilities of your API that provide a means of accessing the names and values of the input parameters in the execution environment of a web script.
Earlier recipes in this chapter discussed how to retrieve information from MySQL and use it to generate various forms of output, such as static text, lists, hyperlinks, or form elements. In this recipe, we discuss the opposite problem—how to collect input from the Web. Applications for such input are many. For example, you can use the techniques shown here to extract the contents of a form submitted by a user. You might interpret the information as a set of search keywords, and then run a query against a product catalog to show the matching items to a customer. In this case, you use the Web to collect information from which you can determine the client’s interests. From that you construct an appropriate search statement and display the results. If a form represents a survey, a mailing list sign-up sheet, or a poll, you might just store the values, using the data to create a new database record (or perhaps to update an existing record).
A script that receives input over the Web and uses it to interact with MySQL generally processes the information in a series of stages:
Extract the input from the execution environment. When a request arrives that contains input parameters, the web server places the input into the environment of the script that handles the request, and the script queries its environment to obtain the parameters. It may be necessary to decode special characters in the parameters to recover the actual values submitted by the client, if the extraction mechanism provided by your API doesn’t do it for you. (For example, you might need to convert %20 to a space.)
Validate the input to make sure that it’s legal. You cannot trust users to send legal values, so it’s a good idea to check input parameters to make sure they look reasonable. For example, if you expect a user to enter a number into a field, you should check the value to be sure that it’s really numeric. If a form contains a pop-up menu that was constructed using the allowable values of an ENUM column, you might expect the value that you actually get back to be one of these values. But there’s no way to be sure except to check. Remember, you don’t even know that there is a real user on the other end of the network connection. It might be a malicious script roving your web site, trying to hack into your site by exploiting weaknesses in your form-processing code.
If you don’t check your input, you run the risk of entering garbage into your database or corrupting existing content. It is true that you can prevent entry of values that are invalid for the data types in your table columns by enabling strict SQL mode. However, there might be additional semantic constraints on what your application considers legal, in which case it is still useful to check values in your script before attempting to enter them. Also, by checking in your script, you may be able to present more meaningful error messages to users about problems in the input than the messages returned by the MySQL server when it detects bad data. For these reasons, it might be best to consider strict SQL mode a valuable additional level of protection, but one that is not necessarily sufficient in itself. That is, you can combine strict mode on the server side with validation in your script, which is the client side of the MySQL connection. See Using the SQL Mode to Control Bad Input Data Handling for additional information about setting the SQL mode for strict input value checking.
Construct a statement based on the input. Typically, input parameters are used to add a record to a database, or to retrieve information from the database for display to the client. Either way, you use the input to construct a statement and send it to the MySQL server. Statement construction based on user input should be done with care, using proper escaping to avoid creating malformed or dangerous SQL statements. Use of placeholders is a good idea here.
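The three stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the price_request table and the handle_request() function are hypothetical names, and the standard sqlite3 module stands in for MySQL so that the example runs without a server. A MySQL driver would typically use %s rather than ? as its placeholder marker.

```python
import re
import sqlite3
from urllib.parse import parse_qs


def handle_request(raw_query, conn):
    """Run the three stages for one request and return the stored row."""
    # Stage 1: extract. parse_qs() also decodes %XX escapes and '+'.
    params = parse_qs(raw_query)
    item = params.get("item", [""])[0]
    quantity = params.get("quantity", [""])[0]

    # Stage 2: validate. Reject anything that does not look reasonable.
    if not item:
        raise ValueError("item is required")
    if not re.fullmatch(r"[0-9]+", quantity):
        raise ValueError("quantity must be a non-negative integer")

    # Stage 3: construct the statement with placeholders, never by
    # pasting input into the SQL text. (sqlite3 is a stand-in for MySQL.)
    cur = conn.execute(
        "INSERT INTO price_request (item, quantity) VALUES (?, ?)",
        (item, int(quantity)),
    )
    return conn.execute(
        "SELECT item, quantity FROM price_request WHERE rowid = ?",
        (cur.lastrowid,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_request (item TEXT, quantity INTEGER)")
print(handle_request("item=D-0214&quantity=60", conn))
```

Because the values travel separately from the statement text, a quote or other special character in the input cannot change the structure of the SQL.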
The rest of this recipe explores the first of these three stages of input processing. Later recipes cover the second and third stages. The first stage (pulling input from the execution environment) has little to do with MySQL, but is covered here because it’s how you obtain the information that is used in the later stages.
Input obtained over the Web can be received in several ways, two of which are most common:
As part of a get request, in which case input parameters are appended to the end of the URL. For example, the following URL invokes a PHP script named price_quote.php and specifies item and quantity parameters with values D-0214 and 60:
http://localhost/mcb/price_quote.php?item=D-0214&quantity=60
Such requests commonly are received when a user selects a hyperlink or submits a form that specifies method="get" in the <form> tag. A parameter list in a URL begins with ? and consists of name=value pairs separated by ; or & characters. (It’s also possible to place information in the middle of a URL, but this book doesn’t cover that.)
As part of a post request, such as a form submission that specifies method="post" in the <form> tag. The contents of a form for a post request are sent as input parameters in the body of the request, rather than at the end of the URL.
You may also have occasion to process other types of input, such as uploaded files. Those are sent using post requests, but as part of a special kind of form element. Processing File Uploads discusses file uploading.
When you gather input for a web script, you should consider how the input was sent. (Some APIs distinguish between input sent via get and post; others do not.) However, after you have pulled out the information that was sent, the request method doesn’t matter. The validation and statement construction stages do not need to know whether parameters were sent using get or post.
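To make the point concrete, here is a minimal Python sketch showing that the same name=value pairs can be recovered from either kind of request. The extract_params() function is a hypothetical helper, not part of any API discussed in this recipe, and it assumes that a post body uses the application/x-www-form-urlencoded format that an ordinary (non-upload) HTML form sends.

```python
from urllib.parse import parse_qs, urlsplit


def extract_params(method, url, body=""):
    """Return a dict mapping each parameter name to a list of values."""
    if method.lower() == "get":
        encoded = urlsplit(url).query   # text after the '?' in the URL
    else:
        encoded = body                  # same name=value format, in the body
    return parse_qs(encoded)


via_get = extract_params(
    "get", "http://localhost/mcb/price_quote.php?item=D-0214&quantity=60")
via_post = extract_params(
    "post", "http://localhost/mcb/price_quote.php",
    body="item=D-0214&quantity=60")

# The later stages see identical data either way.
print(via_get == via_post)
```

Only the extraction step looks at the request method. Everything after it works with the resulting dictionary.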
The recipes distribution includes some scripts in the apache/params directory (tomcat/mcb for JSP) that process input parameters. Each script enables you to submit get or post requests, and shows how to extract and display the parameter values thus submitted. Examine these scripts to see how the parameter extraction methods for the various APIs are used. Utility routines invoked by the scripts can be found in the library modules in the lib directory of the distribution.
To obtain input parameters passed to a script, you should familiarize yourself with your API’s conventions so that you know what it does for you, and what you must do yourself. For example, you should know the answers to these questions:
How do you determine which parameters are available?
How do you pull a parameter value from the environment?
Are values thus obtained the actual values submitted by the client, or do you need to decode them further?
How are multiple-valued parameters handled (for example, when several items in a checkbox group are selected)?
For parameters submitted in a URL, which separator character does the API expect between parameters? This may be & for some APIs and ; for others. The ; character is preferable as a parameter separator because it’s not special in HTML like & is, but many browsers or other user agents separate parameters using &. If you construct a URL within a script that includes parameters at the end, be sure to use a parameter-separator character that the receiving script will understand.
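As one illustration of how an API answers these questions, the following sketch uses parse_qs() from Python's standard urllib.parse module. The query strings are invented for the example. It assumes a recent Python 3 release, in which parse_qs() splits only on & unless you pass a different separator argument.

```python
from urllib.parse import parse_qs

# Hypothetical query string: an encoded space (%20) and a checkbox
# group ("accessories") for which two items were selected.
query = "name=Bessie%20Mae&accessories=bell&accessories=hat"

params = parse_qs(query)

names = sorted(params)                # which parameters are available
name = params["name"][0]              # value arrives already decoded
accessories = params["accessories"]   # all values of a multiple-valued parameter

# A URL that separates parameters with ';' must be parsed accordingly.
semi = parse_qs("item=D-0214;quantity=60", separator=";")

print(names, name, accessories, semi)
```

Every value is returned as a list, so single-valued and multiple-valued parameters are handled the same way.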
Perl. The Perl CGI.pm module makes input parameters available to scripts through the param() function. param() provides access to input submitted via either get or post, which simplifies your task as the script writer. You don’t need to know which method a form used for submitting parameters. You don’t need to perform any decoding, either; param() handles that as well.
To obtain a list of names of all available parameters, call param() with no arguments:
@param_names = param ();
To obtain the value of a specific parameter, pass its name to param(). In scalar context, param() returns the parameter value if it is single-valued, the first value if it is multiple-valued, or undef if the parameter is not available. In array context, param() returns a list containing all the parameter’s values, or an empty list if the parameter is not available:
$id = param ("id");
@options = param ("options");
A parameter with a given name might not be available if the form field with that name was left blank, or if there isn’t any field with that name. Note too that a parameter value may be defined but empty. For good measure, you may want to check both possibilities. For example, to check for an age parameter and assign a default value of unknown if the parameter is missing or empty, you can do this:
$age = param ("age");
$age = "unknown" if !defined ($age) || $age eq "";
CGI.pm understands both ; and & as URL parameter separator characters.
Ruby. For Ruby scripts that use the cgi module, web script parameters are available through the same cgi object that you create for generating HTML elements. Its params method returns a hash of parameter names and values, so you can access this hash or get the parameter names as follows:
params = cgi.params
param_names = cgi.params.keys
The value of a particular parameter is accessible either through the hash of parameter names and values or directly through the cgi object:
id = cgi.params["id"]
id = cgi["id"]
The two access methods differ slightly. The params method returns each parameter value as an array. The array contains multiple entries if the parameter has multiple values, and is empty if the parameter is not present. The cgi object returns a single string. If the parameter has multiple values, only the first is returned. If the parameter is not present, the value is the empty string. For either access method, use the has_key? method to test whether a parameter is present.
The following listing shows how to get the parameter names and loop through each parameter to print its name and value, printing multiple-valued parameters as a comma-separated list:
params = cgi.params
param_names = cgi.params.keys
param_names.sort!
page << cgi.p { "Parameter names:" + param_names.join(", ") }
list = ""
param_names.each do |name|
  val = params[name]
  list << cgi.li {
            "type=#{val.class}, name=#{name}, value=" +
            CGI.escapeHTML(val.join(", "))
          }
end
page << cgi.ul { list }
The cgi module understands both ; and & as URL parameter separator characters.
PHP. Input parameters can be available to PHP in several ways, depending on your configuration settings:
If the track_vars variable is on, parameters are available in the $HTTP_GET_VARS and $HTTP_POST_VARS arrays. For example, if a form contains a field named id, the value will be available as $HTTP_GET_VARS["id"] or $HTTP_POST_VARS["id"], depending on whether the form was submitted via GET or POST. If you access $HTTP_GET_VARS and $HTTP_POST_VARS in a nonglobal scope, such as within a function, you must declare them using the global keyword to make them accessible.
As of PHP 4.1, parameters also are available in the $_GET and $_POST arrays if track_vars is on. These are analogous to $HTTP_GET_VARS and $HTTP_POST_VARS except that they are “superglobal” arrays that are automatically available in any scope. (For example, it is unnecessary to declare $_GET and $_POST with global inside of functions.) The $_GET and $_POST arrays are the preferred means of accessing input parameters.
If the register_globals variable is on, parameters are assigned to global variables of the same name. In this case, the value of a field named id will be available as the variable $id, regardless of whether the request was sent via GET or POST. This is dangerous, for reasons described shortly.
The track_vars and register_globals settings can be compiled into PHP or configured in the PHP php.ini file. track_vars is always enabled as of PHP 4.0.3, so I’ll assume that this is true for your PHP installation.
register_globals makes it convenient to access input parameters through global variables, but the PHP developers recommend that it be disabled for security reasons. Suppose that you write a script that requires the user to supply a password, which is represented by the $password variable. In the script, you might check the password like this:
if (check_password ($password))
  $password_is_ok = 1;
The intent here is that if the password matches, the script sets $password_is_ok to 1. Otherwise $password_is_ok is left unset (which compares false in Boolean expressions). But suppose that someone invokes your script as follows:
http://your.host.com/chkpass.php?password_is_ok=1
If register_globals is enabled, PHP sees that the password_is_ok parameter is set to 1 and sets the corresponding $password_is_ok variable to 1. The result is that when your script executes, $password_is_ok is 1 no matter what password was given, or even if no password was given! The problem with register_globals is that it enables outside users to supply default values for global variables in your scripts. The best solution is to disable register_globals, in which case you need to check the global arrays for input parameter values. If you cannot disable register_globals, take care not to assume that PHP variables have no value initially. Unless you’re expecting a global variable to be set from an input parameter, it’s best to initialize it explicitly to a known value. The password-checking code should be written as follows to make sure that only $password (and not $password_is_ok) can be set from an input parameter. That way, $password_is_ok is assigned a value by the script itself whatever the result of the test:
$password_is_ok = 0;
if (check_password ($password))
  $password_is_ok = 1;
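The hazard is not specific to PHP: it arises in any program that lets request parameters populate its own variables. The following Python sketch emulates it under stated assumptions. The check_password(), unsafe_login(), and safe_login() functions are hypothetical, the hardcoded password exists only for the demonstration, and a dictionary stands in for the variables that register_globals would create.

```python
def check_password(password):
    # Stand-in check for illustration only; never hardcode a password.
    return password == "secret"


def unsafe_login(params):
    """Emulate register_globals: every input parameter becomes a variable."""
    variables = dict(params)            # the client controls all of these
    if check_password(variables.get("password")):
        variables["password_is_ok"] = 1
    # If the check failed, the flag keeps whatever the client supplied.
    return bool(variables.get("password_is_ok"))


def safe_login(params):
    """Initialize the flag explicitly so that input cannot preset it."""
    password_is_ok = 0
    if check_password(params.get("password")):
        password_is_ok = 1
    return bool(password_is_ok)


# Corresponds to invoking the script with ?password_is_ok=1
attack = {"password_is_ok": "1"}
print(unsafe_login(attack), safe_login(attack))
```

The unsafe version accepts the request even though no password was sent. The safe version rejects it because the script, not the client, decides the flag's initial value.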
The PHP scripts in this book do not rely on the register_globals setting. Instead, they obtain input through the global parameter arrays.
Another complicating factor when retrieving input parameters in PHP is that they may need some decoding, depending on the value of the magic_quotes_gpc configuration variable. If magic quotes are enabled, any quote, backslash, and NUL characters in input parameter values accessed by your scripts will be escaped with backslashes. I suppose that this is intended to save you a step by allowing you to extract values and use them directly in SQL statement strings. However, that’s only useful if you plan to use web input in a statement with no preprocessing or validity checking, which is dangerous. You should check your input first, in which case it’s necessary to strip out the slashes, anyway. This means that having magic quotes turned on isn’t really very useful.
Given the various sources through which input parameters may be available, and the fact that they may or may not contain extra backslashes, extracting input in PHP scripts can be an interesting problem. If you have control of your server and can set the values of the various configuration settings, you can of course write your scripts based on those settings. But if you do not control your server or are writing scripts that need to run on several machines, you may not know in advance what the settings are. Fortunately, with a bit of effort it’s possible to write reasonably general-purpose parameter extraction code that works correctly with very few assumptions about your PHP operating environment. The following utility function, get_param_val(), takes a parameter name as its argument and returns the corresponding parameter value. If the parameter is not available, the function returns an unset value. (get_param_val() uses a helper function, strip_slash_helper(), which is discussed shortly.)
function get_param_val ($name)
{
global $HTTP_GET_VARS, $HTTP_POST_VARS;

  $val = NULL;
  if (isset ($_GET[$name]))
    $val = $_GET[$name];
  else if (isset ($_POST[$name]))
    $val = $_POST[$name];
  else if (isset ($HTTP_GET_VARS[$name]))
    $val = $HTTP_GET_VARS[$name];
  else if (isset ($HTTP_POST_VARS[$name]))
    $val = $HTTP_POST_VARS[$name];
  if (isset ($val) && get_magic_quotes_gpc ())
    $val = strip_slash_helper ($val);
  return ($val);
}
To use this function to obtain the value of a single-valued parameter named id, call it like this:
$id = get_param_val ("id");
You can test $id to determine whether the id parameter was present in the input:
if (isset ($id))
  ... id parameter is present ...
else
  ... id parameter is not present ...
For a form field that might have multiple values (such as a checkbox group or a multiple-pick scrolling list), you should represent it in the form using a name that ends in []. For example, a list element constructed from the SET column accessories in the cow_order table has one item for each allowable set value. To make sure PHP treats the element value as an array, name the field accessories[], not accessories. (See Creating Multiple-Pick Form Elements from Database Content for an example.) When the form is submitted, PHP places the array of values in a parameter named without the []. To access it, do this:
$accessories = get_param_val ("accessories");
The value of the $accessories variable will be an array, whether the parameter has multiple values, a single value, or even no values. The determining factor is not whether the parameter actually has multiple values, but whether you named the corresponding field in the form using [] notation.
The get_param_val() function checks the $_GET, $_POST, $HTTP_GET_VARS, and $HTTP_POST_VARS arrays for parameter values. Thus, it works correctly regardless of whether the request was made by GET or POST, or whether register_globals is enabled. The only thing that the function assumes is that track_vars is enabled.
get_param_val() also works correctly regardless of whether magic quoting is enabled. It uses a helper function strip_slash_helper() that performs backslash stripping from parameter values if necessary:
function strip_slash_helper ($val)
{
  if (!is_array ($val))
    $val = stripslashes ($val);
  else
  {
    foreach ($val as $k => $v)
      $val[$k] = strip_slash_helper ($v);
  }
  return ($val);
}
strip_slash_helper() checks whether a value is a scalar or an array and processes it accordingly. The reason it uses a recursive algorithm for array values is that in PHP it’s possible to create nested arrays from input parameters.
To make it easy to obtain a list of all parameter names, use another utility function:
function get_param_names ()
{
global $HTTP_GET_VARS, $HTTP_POST_VARS;

  # construct an array in which each element has a parameter name as
  # both key and value. (Using names as keys eliminates duplicates.)
  $keys = array ();
  if (isset ($_GET))
  {
    foreach ($_GET as $k => $v)
      $keys[$k] = $k;
  }
  else if (isset ($HTTP_GET_VARS))
  {
    foreach ($HTTP_GET_VARS as $k => $v)
      $keys[$k] = $k;
  }
  if (isset ($_POST))
  {
    foreach ($_POST as $k => $v)
      $keys[$k] = $k;
  }
  else if (isset ($HTTP_POST_VARS))
  {
    foreach ($HTTP_POST_VARS as $k => $v)
      $keys[$k] = $k;
  }
  return ($keys);
}
get_param_names() returns a list of parameter names present in the HTTP variable arrays, with duplicate names removed if there is overlap between the arrays. The return value is an array with its keys and values both set to the parameter names. This way you can use either the keys or the values as the list of names. The following example prints the names, using the values:
$param_names = get_param_names ();
foreach ($param_names as $k => $v)
  print (htmlspecialchars ($v) . "<br /> ");
To construct URLs that point to PHP scripts and that have parameters at the end, you should separate the parameters by & characters. To use a different character (such as ;), change the separator that PHP accepts by means of the arg_separator.input configuration variable in the PHP initialization file. (The related arg_separator.output variable controls the separator in URLs that PHP itself generates.)
Python. The Python cgi module provides access to the input parameters that are present in the script environment. Import that module, and then create a FieldStorage object:
import cgi
params = cgi.FieldStorage ()
The FieldStorage object contains information for parameters submitted via either GET or POST requests, so you need not know which method was used to send the request. The object also contains an element for each parameter present in the environment. Its keys() method returns a list of available parameter names:
param_names = params.keys ()
If a given parameter, name, is single-valued, the value associated with it is a scalar that you can access as follows:
val = params[name].value
If the parameter is multiple-valued, params[name] is a list of MiniFieldStorage objects that have name and value attributes. Each of these has the same name (it will be equal to name) and one of the parameter’s values. To create a list containing all the values for such a parameter, do this:
val = []
for item in params[name]:
  val.append (item.value)
You can distinguish single-valued from multiple-valued parameters by checking the type. The following listing shows how to get the parameter names and loop through each parameter to print its name and value, printing multiple-valued parameters as a comma-separated list. This code requires that you import the types module in addition to the cgi module.
params = cgi.FieldStorage ()
param_names = params.keys ()
param_names.sort ()
print "<p>Parameter names:", param_names, "</p>"
items = []
for name in param_names:
  if type (params[name]) is not types.ListType: # it's a scalar
    ptype = "scalar"
    val = params[name].value
  else:                                         # it's a list
    ptype = "list"
    val = []
    for item in params[name]:     # iterate through MiniFieldStorage
      val.append (item.value)     # items to get item values
    val = ",".join (val)          # convert to string for printing
  items.append ("type=" + ptype + ", name=" + name + ", value=" + val)
print make_unordered_list (items)
Python raises an exception if you try to access a parameter that is not present in the FieldStorage object. To avoid this, use has_key() to find out if the parameter exists:
if params.has_key (name):
  print "parameter " + name + " exists"
else:
  print "parameter " + name + " does not exist"
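The type check and the existence check can be folded into one helper that always returns a list. The sketch below is written in current Python 3 syntax and does not import the cgi module, so it runs without a live request. The Field tuple is a stand-in that models only the name and value attributes described above, and param_values() is a hypothetical helper name.

```python
from collections import namedtuple

# Stand-in for the objects held by a FieldStorage object; only the
# name and value attributes are modeled here.
Field = namedtuple("Field", ["name", "value"])


def param_values(params, name):
    """Return all values of a parameter as a list (empty if absent)."""
    if name not in params:
        return []
    entry = params[name]
    if isinstance(entry, list):          # multiple-valued parameter
        return [item.value for item in entry]
    return [entry.value]                 # single-valued parameter


params = {
    "color": Field("color", "red"),
    "accessories": [Field("accessories", "bell"),
                    Field("accessories", "hat")],
}
print(param_values(params, "color"))
print(param_values(params, "accessories"))
print(param_values(params, "size"))
```

Callers can then loop over the result without caring how many values were submitted or whether the parameter was present at all.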
Single-valued parameters have attributes other than value. For example, a parameter representing an uploaded file has additional attributes you can use to get the file’s contents. Processing File Uploads discusses this further.
The cgi module expects URL parameters to be separated by & characters. If you generate a hyperlink to a Python script based on the cgi module and the URL includes parameters, don’t separate them by ; characters.
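When generating such a hyperlink, it helps to let library routines do both the URL encoding and the HTML escaping. The following sketch uses urlencode() and html.escape() from the Python 3 standard library. The script URL and the note parameter are invented for the example.

```python
from html import escape
from urllib.parse import urlencode

base = "http://localhost/mcb/price_quote.py"    # hypothetical script URL
params = [("item", "D-0214"), ("note", "rush order")]

# urlencode() joins the pairs with '&' and encodes special characters
# (the space in "rush order" becomes '+').
url = base + "?" + urlencode(params)

# '&' is special in HTML, so escape the URL before writing it into a page.
link = '<a href="%s">Get quote</a>' % escape(url, quote=True)

print(url)
print(link)
```

The URL itself still contains a plain & separator, which is what the receiving script sees. Only its representation inside the HTML page is escaped.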
Java. Within JSP pages, the implicit request object provides access to the request parameters through the following methods:
getParameterNames()
Returns an enumeration of String objects, one for each parameter name present in the request.
getParameterValues(String name)
Returns an array of String objects, one for each value associated with the parameter, or null if the parameter does not exist.
getParameter(String name)
Returns the first value associated with the parameter, or null if the parameter does not exist.
The following example shows one way to use these methods to display request parameters:
<%@ page import="java.util.*" %>

<ul>
<%
  Enumeration e = request.getParameterNames ();
  while (e.hasMoreElements ())
  {
    String name = (String) e.nextElement ();
    // use array in case parameter is multiple-valued
    String[] val = request.getParameterValues (name);
    out.println ("<li> name: " + name + "; values:");
    for (int i = 0; i < val.length; i++)
      out.println (val[i]);
    out.println ("</li>");
  }
%>
</ul>
Request parameters are also available within JSTL tags, using the special variables param and paramValues. param[name] returns the first value for a given parameter and thus is most suited for single-valued parameters:
color value:
<c:out value="${param['color']}"/>
paramValues[name] returns an array of values for the parameter, so it’s useful for parameters that can have multiple values:
accessory values:
<c:forEach items="${paramValues['accessories']}" var="val">
  <c:out value="${val}"/>
</c:forEach>
If a parameter name is legal as an object property name, you can also access the parameter using dot notation:
color value:
<c:out value="${param.color}"/>
accessory values:
<c:forEach items="${paramValues.accessories}" var="val">
  <c:out value="${val}"/>
</c:forEach>
To produce a list of parameter objects with key and value attributes, iterate over the paramValues variable:
<ul>
<c:forEach items="${paramValues}" var="p">
  <li>
    name:
    <c:out value="${p.key}"/>;
    values:
    <c:forEach items="${p.value}" var="val">
      <c:out value="${val}"/>
    </c:forEach>
  </li>
</c:forEach>
</ul>
To construct URLs that point to JSP pages and that have parameters at the end, you should separate the parameters by & characters.