PHP 5 Unleashed

POSIX Regular Expressions

The regular expression syntax defined by the POSIX standard is perhaps the simplest form of regex available to PHP programmers. As such, it makes a great learning tool, because the functions that implement it offer no particularly advanced features.

In addition to the standard rules that we have already discussed, the POSIX regex standard defines character classes, which make it even easier to specify character ranges. A character class name is always enclosed in colons and square brackets, as in [:alpha:], and it can appear only inside a bracket expression, as in [[:alpha:]]. There are 12 character classes:

  • alpha represents a letter of the alphabet (either upper- or lowercase). This is equivalent to [A-Za-z].

  • digit represents a digit between 0 and 9 (equivalent to [0-9]).

  • alnum represents an alphanumeric character, just like [0-9A-Za-z].

  • blank represents "blank" characters, normally space and Tab.

  • cntrl represents "control" characters, such as DEL, ESC, and so forth.

  • graph represents all the printable characters except the space.

  • lower represents lowercase letters of the alphabet only.

  • upper represents uppercase letters of the alphabet only.

  • print represents all printable characters.

  • punct represents punctuation characters such as "." or ",".

  • space represents whitespace characters, such as the space, Tab, and newline.

  • xdigit represents hexadecimal digits.

This makes it possible, for example, to rewrite our email validation regex as follows:

[[:alnum:]_]+@[[:alnum:]_]+\.[[:alnum:]_]{2,4}

This notation is much simpler, and it makes mistakes a little easier to spot.
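
The character classes above can be tried out with a short script. This is only a minimal sketch under a few assumptions: it targets PHP 7 or later, where the ereg* functions are not available, so it uses preg_match(), which accepts the same POSIX character classes but requires pattern delimiters. The ^ and $ anchors and the sample inputs are additions made for illustration.

```php
<?php

    // Assumption: PHP 7+, so preg_match() stands in for ereg().
    // The pattern is the email regex shown above, wrapped in '/' delimiters
    // and anchored with ^ and $ so that the whole string must match.
    $pattern = '/^[[:alnum:]_]+@[[:alnum:]_]+\.[[:alnum:]_]{2,4}$/';

    // Hypothetical sample inputs, for illustration only.
    $candidates = array('marcot@tabini.ca', 'not an email', 'user_1@example.org');

    $results = array();
    foreach ($candidates as $candidate)
    {
        // preg_match() returns 1 when the pattern matches and 0 when it does not.
        $results[$candidate] = preg_match($pattern, $candidate);
        echo $candidate, ': ', ($results[$candidate] ? 'match' : 'no match'), "\n";
    }
```

Only the second input is rejected, because it contains no @ character for the pattern to anchor on.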

Another important concept introduced by the POSIX extension is the reference. Earlier in the chapter, we saw how parentheses can be used to group regular expressions. When you do so in a POSIX regex, the interpreter assigns a numeric identifier to each grouped expression that is matched when the expression is executed. This identifier can later be used in various operations, such as finding and replacing.

For example, consider the following string and regular expression:

marcot@tabini.ca

([[:alpha:]]+)@([[:alpha:]]+)\.([[:alpha:]]{2,4})

The regex should match the preceding email address. However, because we have grouped the username, the domain name, and the domain extension, each of them will become a reference, as shown in Table 3.1.

Table 3.1. Regex References

Reference Number    Value
0                   marcot@tabini.ca (the string matched by the entire regex)
1                   marcot
2                   tabini
3                   ca


PHP provides support for POSIX regular expressions through the ereg* family of functions. The simplest form of regex matching is performed through the ereg() function:

ereg (pattern, string[, matches])

The ereg function works by compiling the regular expression stored in pattern and then comparing it against string. If the regex matches string, the function returns a value that evaluates to TRUE; otherwise, it returns FALSE. If the matches parameter is specified, it is filled with an array containing all the references specified by pattern that were found in string (see Listing 3.1).

Listing 3.1. Filling Patterns with ereg
<?php

    $s = 'marcot@tabini.ca';

    if (ereg ('([[:alpha:]]+)@([[:alpha:]]+)\.([[:alpha:]]{2,4})', $s, $matches))
    {
      echo "Regular expression successful. Dumping matches\n";
      var_dump ($matches);
    }
    else
    {
      echo "Regular expression unsuccessful.\n";
    }

?>

If you execute the preceding script, you should see this result:

Regular expression successful. Dumping matches
array(4) {
  [0]=>
  string(16) "marcot@tabini.ca"
  [1]=>
  string(6) "marcot"
  [2]=>
  string(6) "tabini"
  [3]=>
  string(2) "ca"
}

This indicates that the regular expression was successfully matched against the string stored in $s and returned the various references in the $matches array.
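
On PHP 7 and later, where the ereg* functions are no longer available, the same references can be collected with preg_match(). The following is a minimal sketch under that assumption rather than a drop-in replacement: the pattern is the one from Listing 3.1 with '/' delimiters added, and the variable names $whole, $user, $domain, and $extension are illustrative.

```php
<?php

    $s = 'marcot@tabini.ca';

    // Same pattern as Listing 3.1, wrapped in '/' delimiters as PCRE requires.
    $pattern = '/([[:alpha:]]+)@([[:alpha:]]+)\.([[:alpha:]]{2,4})/';

    if (preg_match($pattern, $s, $matches))
    {
        // $matches[0] holds the whole match; 1 through 3 hold the references.
        list($whole, $user, $domain, $extension) = $matches;
        echo "user=$user domain=$domain extension=$extension\n";
    }
    else
    {
        echo "Regular expression unsuccessful.\n";
    }
```

The $matches array is laid out the same way as in Table 3.1, so code that reads references by number carries over unchanged.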

If you're not interested in case-sensitive matching (and you don't want to have to specify all characters twice when creating a regular expression), you can use the eregi function instead. It accepts the same parameters and behaves the same way as ereg(), with the exception that it ignores the case when matching a regular expression against a string (see Listing 3.2):

Listing 3.2. Case-insensitive Pattern Matching
<?php

    $a = "UPPERCASE";

    echo (int) ereg ('uppercase', $a);
    echo "\n";
    echo (int) eregi ('uppercase', $a);
    echo "\n";

?>

The first regex will fail because ereg() performs a case-sensitive match against the contents of $a. The second regex, however, will be successful, because the eregi function performs its matches using an algorithm that is not case sensitive.
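
To see what "specifying all characters twice" would involve, the sketch below compares three approaches. It assumes PHP 7 or later and therefore uses preg_match() in place of ereg() and eregi(); the "i" pattern modifier stands in for eregi() here, and the variable names are illustrative.

```php
<?php

    $a = "UPPERCASE";

    // Case-sensitive: the lowercase pattern does not match.
    $sensitive = preg_match('/uppercase/', $a);

    // Spelling out both cases for every letter works, but is tedious.
    $spelledOut = preg_match('/[Uu][Pp][Pp][Ee][Rr][Cc][Aa][Ss][Ee]/', $a);

    // The PCRE "i" modifier plays the role that eregi() plays for POSIX regexes.
    $insensitive = preg_match('/uppercase/i', $a);

    echo $sensitive, "\n";    // 0
    echo $spelledOut, "\n";   // 1
    echo $insensitive, "\n";  // 1
```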

References make regular expressions an even more effective tool for handling search-and-replace operations. For this purpose, PHP provides the ereg_replace function, and its cousin eregi_replace(), which is not case sensitive:

ereg_replace (pattern, replacement, string);

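The ereg_replace() function first matches the regular expression pattern against string. Then, it replaces each matched portion of string with replacement, substituting every placeholder in replacement (a backslash followed by a reference number, such as \1) with the text of the corresponding reference, and returns the resulting string. Here's an example (see Listing 3.3):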

Listing 3.3. Using ereg_replace
<?php

    $s = 'marcot@tabini.ca';

    echo ereg_replace ('([[:alpha:]]+)@([[:alpha:]]+)\.([[:alpha:]]{2,4})',
      '\1 at \2 dot \3', $s);

?>

If you execute this script, it will output the following string:

marcot at tabini dot ca

As you can see, the three references are extracted from the contents of $s by the regex compiler and used to substitute the placeholders in the replacement string.
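
The same substitution can be sketched for PHP 7 and later, where ereg_replace() is not available, by using preg_replace(), which also accepts \1-style placeholders in the replacement string. This is an illustration under that assumption; the second and third calls, the sample sentence, and the address admin@example.org are hypothetical additions showing that references can be reordered and that every match in the string is replaced.

```php
<?php

    // Pattern from Listing 3.3, wrapped in '/' delimiters as PCRE requires.
    $pattern = '/([[:alpha:]]+)@([[:alpha:]]+)\.([[:alpha:]]{2,4})/';

    // Same replacement as Listing 3.3.
    $spoken = preg_replace($pattern, '\1 at \2 dot \3', 'marcot@tabini.ca');
    echo $spoken, "\n";    // marcot at tabini dot ca

    // References may appear in any order in the replacement string.
    $reordered = preg_replace($pattern, '\2.\3 (user: \1)', 'marcot@tabini.ca');
    echo $reordered, "\n"; // tabini.ca (user: marcot)

    // Every match in the subject string is replaced, not just the first one.
    $text = 'Write to marcot@tabini.ca or admin@example.org';
    $all = preg_replace($pattern, '\1 at \2 dot \3', $text);
    echo $all, "\n";
```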