PHP 5 Unleashed

Form Data Integrity

In this section I'll discuss methods you can use to protect data passed in HTML forms. Often when you're working with forms, it is necessary to pass data in the form of hidden input tags. For instance, let's assume that a form that you are working on requires that the user submits it back to the server within five minutes. Unless you are using sessions (discussed later in the book in Chapter 6, "Persistent Data Using Sessions and Cookies") the only method available to you is to create a hidden form element containing the time at which the form was created (see Listing 5.2):

Listing 5.2. Time-Sensitive Form Example
<FORM ACTION="process.php" METHOD=GET>
<INPUT TYPE="hidden" NAME="time" VALUE="<?php echo time(); ?>">
Enter your message (5 minute time limit):<INPUT TYPE="text" NAME="mytext" VALUE="">
<INPUT TYPE="submit" Value="Send Data">
</FORM>

When this form is submitted, the time can be checked by ensuring that the time hidden element is no more than 300 seconds (5 minutes) smaller than the current value returned by time():

if($_GET['time']+300 < time()) {
     echo "You took too long!<BR>";
     exit;
}

The major flaw with this system is that there is no way to verify that the time element sent to the server was actually the same value that was originally sent when the form was created. When this form is submitted, in fact, the following is a sample URL that would be displayed in the user's browser:

http://somewhere.com/process.php?time=1037613504

This URL could be easily modified by the user to make it look like the form was created two minutes later than it really was (and so buy two extra minutes) by adding 120 (60 * 2) seconds to the time URL parameter:

http://somewhere.com/process.php?time=1037613624
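
The arithmetic behind this kind of tampering can be sketched in a few lines. The helper name form_expired() and the 400-second delay are assumptions made purely for illustration; they do not appear in the chapter's listings.

```php
<?php
// Hypothetical helper: true when more than $limit seconds have passed
// since the timestamp carried in the form.
function form_expired($created, $now, $limit = 300) {
    return ($created + $limit) < $now;
}

$created  = 1037613504;       // value originally placed in the form
$now      = $created + 400;   // the user submits 400 seconds later
$tampered = $created + 120;   // the user edits the URL parameter

var_dump(form_expired($created, $now));    // bool(true): the honest value is rejected
var_dump(form_expired($tampered, $now));   // bool(false): the tampered value slips through
```

Nothing in the request tells the server which of the two timestamps it originally issued, which is the gap the rest of this section closes.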

In situations like this, data validation can prove most useful. In the text to come, I will demonstrate how PHP can be used to ensure that any hidden data is submitted exactly as it was created.

Securing Hidden Elements

The secret to data validation in this case is the MD5 algorithm. This algorithm is used to create a message digest (a sort of "digital fingerprint") of the data provided to it. As with the fingerprints found on a person, the digital fingerprint generated by the MD5 algorithm can be treated as unique to the string that it represents. Although there is a slight chance (1 in 3.40282e+38) that two arbitrary strings will accidentally produce an identical fingerprint, for the purposes of this section it can be assumed that the fingerprint is unique. (Inputs that collide can be constructed deliberately, so MD5 should not be relied on where resistance to such collisions matters.) The fingerprint is also predictable: for any given string, MD5 will generate the same fingerprint every time.

In PHP, using the MD5 algorithm is as simple as calling the md5() function. The syntax for this function is

md5($string)

$string represents the string to generate the fingerprint for. The md5() function will return a 32-character fingerprint based on the data provided in $string.
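
As a quick illustration of these properties, the following sketch fingerprints the sample string 'hello' (an arbitrary choice) and shows that the result is 32 characters long, repeatable, and sensitive to any change in the input:

```php
<?php
$first  = md5('hello');
$second = md5('hello');
$other  = md5('hello!');

echo $first, "\n";               // 5d41402abc4b2a76b9719d911017c592
var_dump(strlen($first));        // int(32)
var_dump($first === $second);    // bool(true): same input, same fingerprint
var_dump($first === $other);     // bool(false): any change alters the fingerprint
```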

So how will the md5() function help us ensure that our data remains unchanged between the creation of a form and when it is submitted? By creating an MD5 fingerprint for each hidden element in your document and then checking those fingerprints when the form is submitted, you can be confident that the data submitted is the data the server originally sent.

When creating an MD5 fingerprint for these purposes, it is critical to remember that one of the major benefits of the algorithm can also be its downfall. Because the MD5 algorithm is completely predictable, simply hashing some combination of the hidden element's name and value (called $name and $value here) could be hazardous. For instance, consider the following code snippet:

$fingerprint = md5($name.$value);

Although $fingerprint is indeed an MD5 fingerprint based on the passed values, a malicious (and fairly observant) user could figure out the string used to generate the fingerprint with relative ease and then compute a matching fingerprint for any altered value. For our MD5 fingerprint to resist forgery, a value completely unknown to the outside user must be included:

$fingerprint = md5($name.$value.'mysecretword');

Using this method, the malicious user would have to not only decipher the way the string was created for the MD5 algorithm, but would have to know the additional value. For simplicity's sake, let's define a constant in PHP called PROTECTED_KEY using the PHP define statement to store our secret word:

define("PROTECTED_KEY", "mysecretword");
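
To see why the secret matters, consider the following sketch of a forgery attempt. The variable names and the altered timestamp are assumptions for illustration; the attacker is assumed to know the field name and value (both are visible in the form) but not PROTECTED_KEY.

```php
<?php
define('PROTECTED_KEY', 'mysecretword');

$name = 'time';

// The attacker changes the value and hashes only what is visible in the form.
$forged_value = '1037613624';
$forged_print = md5($name . $forged_value);

// A server using the unkeyed scheme recomputes the same string and is fooled.
var_dump($forged_print === md5($name . $forged_value));                   // bool(true)

// A server that mixes in the secret computes a different fingerprint.
var_dump($forged_print === md5($name . $forged_value . PROTECTED_KEY));   // bool(false)
```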

NOTE

When a constant is defined using the define statement, it behaves as any other PHP constant. This means that it is referenced by PROTECTED_KEY (no leading $ symbol) and can be accessed from anywhere in the script automatically, regardless of scope.
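
A minimal sketch of that scoping behavior follows; the function name show_key() is hypothetical and exists only to demonstrate access from inside a function.

```php
<?php
define('PROTECTED_KEY', 'mysecretword');

function show_key() {
    // No "global" statement is needed to reach a constant.
    return PROTECTED_KEY;
}

var_dump(show_key());                 // string(12) "mysecretword"
var_dump(defined('PROTECTED_KEY'));   // bool(true)
```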


The protect() Function

To facilitate the generation of MD5 fingerprints and form elements, in this section I will construct a helper function that generates the digital fingerprints for an HTML form. This function is called protect(), and it has the following syntax:

protect($name, $value, $secret)

$name represents the NAME attribute of a hidden HTML form element, $value represents the actual corresponding value of that element, and $secret represents a secret string used in fingerprint generation. This function, when executed, will return a string representing two individual hidden form elements: one containing the actual value and the other containing the MD5 fingerprint. The NAME attribute of the MD5 fingerprint will be defined by this function as <name>_checksum, where <name> represents the name of the actual value being passed to the form. This function is shown in Listing 5.3:

Listing 5.3. The protect() MD5 Form Fingerprint Generator
<?php

    define('PROTECTED_KEY', 'mysecretword');

    function my_addslashes($string) {
        return (get_magic_quotes_gpc() == 1) ? $string : addslashes($string);
    }

    function protect($name, $value, $secret) {

        $tag = "";
        $seed = md5($name.$value.$secret);
        $html_name = $name."_checksum";
        $tag = "<INPUT TYPE='hidden' NAME='$name' VALUE='" .
               urlencode(my_addslashes($value))."'>\n";
        $tag .= "<INPUT TYPE='hidden' NAME='$html_name' VALUE='$seed'>\n";
        return $tag;

    }
?>

NOTE

Don't know what my_addslashes() or urlencode() are? The purpose behind these functions is discussed in previous sections of this chapter ("Dealing with Magic Quotes" and "Data Conversion and Encoding," respectively).


In practice, the protect() function would be used anytime a hidden form element is required:

<FORM ACTION="process.php" METHOD=GET>
<?php echo protect('time', time(), PROTECTED_KEY); ?>
Enter your message (5 minute time limit):
<INPUT TYPE="text" NAME="mytext" VALUE="">
<INPUT TYPE="submit" Value="Send Data">
</FORM>

When processed by PHP, the following is the HTML that is sent to the client browser (reformatted here for readability):

<FORM ACTION="process.php" METHOD=GET>
<INPUT TYPE="hidden" NAME="time" VALUE="1037613504">
<INPUT TYPE="hidden" NAME="time_checksum"
       VALUE="3b6f5fa33bb4fb99e68cf1e3f5bf5478">
Enter your message (5 minute time limit):
<INPUT TYPE="text" NAME="mytext" VALUE="">
<INPUT TYPE="submit" Value="Send Data">
</FORM>

Now, by checking to ensure that the time hidden form element matches the MD5 fingerprint stored in time_checksum (with our secret string) the validity of the data can be ensured.
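
For a single element, that check can be sketched as follows. The function name check_field() is hypothetical, and the sketch assumes the value arrives exactly as it was hashed; the encoding steps used by protect() are left out for brevity.

```php
<?php
// Hypothetical single-field check: recompute the fingerprint from the
// submitted value and compare it with the submitted checksum.
function check_field($input, $name, $secret) {
    if (!isset($input[$name], $input[$name . '_checksum'])) {
        return false;   // the value or its fingerprint is missing
    }
    return md5($name . $input[$name] . $secret) === $input[$name . '_checksum'];
}

$secret = 'mysecretword';
$good = array(
    'time'          => '1037613504',
    'time_checksum' => md5('time' . '1037613504' . $secret),
);
$bad = $good;
$bad['time'] = '1037613624';   // value altered, checksum left as issued

var_dump(check_field($good, 'time', $secret));   // bool(true)
var_dump(check_field($bad, 'time', $secret));    // bool(false)
```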

The validate() Function

After the form has been submitted, the fingerprint for each element must be confirmed for the data to be considered valid. To do this, we must construct the validate() function. This function has the following syntax:

validate($input, $secret)

$input represents the appropriate superglobal array ($_GET, $_POST, and so on) and $secret represents the secret string used to create the fingerprint (in this case, the string defined as PROTECTED_KEY). Unlike protect(), which is a fairly simple function, the validate() function is considerably more complex, because it must account for all the ways a malicious user could attempt to manipulate the data, including (but not limited to) the following:

  • Modifying one or more of the protected values

  • Modifying one or more of the protected value fingerprints

  • Removing one or more of the protected values or fingerprints

To determine whether a user has removed or manipulated a protected value, the validate() function must know what values are supposed to be protected. To facilitate this, the validate() function looks for a hidden value (and its corresponding checksum) whose NAME attribute is protected_list. The value of this hidden form element is a serialized array listing the names of protected keys. If this parameter is not found, the validate() function should check all parameters with the following exceptions:

  • The name of the form element is submit.

  • The name of the form element ends in _checksum.

NOTE

If you are wondering why the validate() function ignores form elements named submit during validation, it is for circumstances where the form is being processed by the same script that displayed it. In these circumstances, often a hidden form element named "submit" will be included in the form to indicate to the script that it should process the form rather than display it.
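
The fallback rule can be sketched on its own as shown next. The helper name default_protected() and the sample field values are hypothetical; the pattern here is anchored so that only a field named exactly submit is skipped, which is my reading of the rule stated above.

```php
<?php
// Hypothetical helper mirroring the fallback rule: every field is treated as
// protected unless it is named "submit" or its name ends in "_checksum".
function default_protected($input) {
    $protected = array();
    foreach ($input as $key => $val) {
        if (!preg_match('/^submit$|_checksum$/i', $key)) {
            $protected[] = $key;
        }
    }
    return $protected;
}

$fields = array(
    'submit'        => '1',
    'time'          => '1037613504',
    'time_checksum' => 'abc',
    'username'      => 'John',
);
print_r(default_protected($fields));   // lists "time" and "username"
```

Note that under this rule an unprotected text field such as username would also be expected to carry a checksum, which is why most forms supply an explicit list of protected fields.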


For most cases, you'll need to provide a list of fields that are considered protected. To do this, create an array containing a list of element names that are protected and serialize it; then protect that list itself using the previously discussed protect() function:

$protected = serialize(array('myvar1', 'myvar2', 'myvar3'));
echo protect('protected_list', $protected, PROTECTED_KEY);
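
The round trip for the list itself can be sketched as follows. This is a simplified illustration that assumes the serialized string reaches the processing script unchanged, so the URL encoding and slash handling are omitted.

```php
<?php
$secret = 'mysecretword';

// Form creation: serialize the list and fingerprint it.
$list     = serialize(array('myvar1', 'myvar2', 'myvar3'));
$checksum = md5('protected_list' . $list . $secret);

// Form processing: trust the list only if its fingerprint still matches.
if (md5('protected_list' . $list . $secret) === $checksum) {
    $protected = unserialize($list);
} else {
    $protected = false;
}

print_r($protected);   // the original three names when the check passes
```

Verifying the fingerprint before calling unserialize() means the function is only ever handed a string the server produced itself.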

For the sake of avoiding repetition and confusion during my explanation of the validate() function, Listing 5.4 displays this function in its entirety and will be heavily referenced as I explain how the function actually works:

Listing 5.4. The validate() Function
<?php

    function validate($input, $secret) {

        if(!is_array($input)) {
            return false;
        }

        // Start with an empty list so the foreach below always has an array.
        $protected = array();

        if(!isset($input['protected_list']) &&
           !isset($input['protected_list_checksum'])) {

            foreach($input as $key=>$val) {

                if(!preg_match('/(^submit$|_checksum$)/i', $key)) {

                   $protected[] = $key;

                }

            }

        } else {

            if(!isset($input['protected_list']) ||
               !isset($input['protected_list_checksum'])) {

                return false;

            }

            $checkval = 'protected_list' .
                        stripslashes(urldecode($input['protected_list'])) .
                        $secret;

            $checksum = md5($checkval);
            if($checksum !== $input['protected_list_checksum']) {
                return false;
            }

            $protected = unserialize(stripslashes(urldecode(
              $input['protected_list'])));

        }

        foreach($protected as $val) {


            if(isset($input[$val."_checksum"]) && isset($input[$val])) {

                $temp = urldecode($input[$val]);

                $checksum = md5($val.stripslashes($temp).$secret);

                if($checksum !== $input[$val."_checksum"]) {

                    return false;

                }

            } else {

                return false;

            }

        }

        return true;
    }
?>

When the validate() function is called, its first task is a very basic check: ensuring that the $input variable it was provided is actually an array. The next step is to determine which fields will be validated. To determine this, validate() first looks for a valid (with checksum) protected_list element in the $input array. If this element is found and validated against its MD5 fingerprint, the array is reconstructed using the unserialize() function. If the protected_list element is not provided in the form data, a simple regular expression is used to construct the array dynamically, following the previously discussed rules. In either case, the $protected variable is populated with a list of all the form elements in the $input array to validate.

With the $protected array now containing a list of the form elements that should be validated, the array is iterated through using a foreach statement. For each element, the validate() function first checks that both the element itself and its fingerprint value exist. Assuming both exist, an MD5 fingerprint is generated from the passed values and compared to the original fingerprint provided in the form submission. If the fingerprints are identical, the element's validity is confirmed and the script moves on to the next element. If at any time a particular element fails to validate or does not exist, the validate() function returns a Boolean false, indicating this failure. Upon successful validation of all the required elements, the validate() function returns true.
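
As an aside, the same generate-and-compare step can be written with PHP's keyed-hash functions instead of md5(). This is a variant sketch, not the method used in the listings: it assumes a PHP version that provides hash_hmac() and hash_equals(), and the function names fingerprint() and fingerprint_ok() are hypothetical.

```php
<?php
// Variant sketch: HMAC-SHA256 in place of md5($name.$value.$secret).
function fingerprint($name, $value, $secret) {
    return hash_hmac('sha256', $name . '=' . $value, $secret);
}

function fingerprint_ok($name, $value, $given, $secret) {
    // hash_equals() compares the two strings in constant time.
    return hash_equals(fingerprint($name, $value, $secret), $given);
}

$secret = 'mysecretword';
$issued = fingerprint('time', '1037613504', $secret);

var_dump(strlen($issued));                                          // int(64)
var_dump(fingerprint_ok('time', '1037613504', $issued, $secret));   // bool(true)
var_dump(fingerprint_ok('time', '1037613624', $issued, $secret));   // bool(false)
```

The overall flow is unchanged: a fingerprint is issued with the form and recomputed on submission; only the hashing primitive differs.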

Putting protect() and validate() into Action

Now that you understand both the theory and implementation of hidden element validation, let's put the complete script into action. Listing 5.5 creates a time-sensitive form that the user must submit within 5 minutes, using the protect() and validate() functions described in this section:

Listing 5.5. A Time-Sensitive Form Using protect() and validate()
<?php

    define('PROTECTED_KEY', 'mysecretword');
    function my_addslashes($string) {
        return (get_magic_quotes_gpc() == 1) ? $string : addslashes($string);
    }

    function protect($name, $value, $secret) {

        $tag = "";
        $seed = md5($name.$value.$secret);
        $html_name = $name."_checksum";
        $tag = "<INPUT TYPE='hidden' NAME='$name' VALUE='" .
               urlencode(my_addslashes($value)) .
               "'>\n";
        $tag .= "<INPUT TYPE='hidden' NAME='$html_name' VALUE='$seed'>\n";
        return $tag;

    }


    function validate($input, $secret) {

        if(!is_array($input)) {
            return false;
        }

        // Start with an empty list so the foreach below always has an array.
        $protected = array();

        if(!isset($input['protected_list']) &&
           !isset($input['protected_list_checksum'])) {

            foreach($input as $key=>$val) {

                if(!preg_match('/(^submit$|_checksum$)/i', $key)) {

                   $protected[] = $key;

                }

            }

        } else {

            if(!isset($input['protected_list']) ||
               !isset($input['protected_list_checksum'])) {

                return false;

            }

            $checkval = 'protected_list' .
                        stripslashes(urldecode($input['protected_list'])) .
                        $secret;

            $checksum = md5($checkval);
            if($checksum !== $input['protected_list_checksum']) {
                return false;
            }

            $protected = unserialize(stripslashes(urldecode(
              $input['protected_list'])));

        }

        foreach($protected as $val) {


            if(isset($input[$val."_checksum"]) && isset($input[$val])) {

                $temp = urldecode($input[$val]);

                $checksum = md5($val.stripslashes($temp).$secret);

                if($checksum !== $input[$val."_checksum"]) {

                    return false;

                }

            } else {

                return false;

            }

        }

        return true;
    }

    if(isset($_GET['submit'])) {
        if(validate($_GET, PROTECTED_KEY)) {
            if($_GET['time']+300 > time()) {
                // Escape user-supplied text before echoing it into the page.
                echo "Thank you " . htmlspecialchars($_GET['username']) .
                     " for submitting this form on-time!";
            } else {
                echo "Sorry, you took too long!";
            }
        } else {
            echo "Data was invalid!";
        }
    }

    $protect_str = serialize(array('time'));
?>
<HTML><HEAD><TITLE>Validating Hidden elements example</TITLE></HEAD>
<BODY>
Please fill out the form below within 5 minutes:<BR>
<FORM ACTION="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" METHOD=GET>
<INPUT TYPE="hidden" NAME="submit" VALUE="1">
<?php echo protect('time', time(), PROTECTED_KEY); ?>
<?php echo protect('protected_list', $protect_str, PROTECTED_KEY); ?>
What is your name: <INPUT TYPE="text" NAME="username" SIZE=30>
<INPUT TYPE="submit" VALUE="Send">
</FORM>
</BODY>
</HTML>