Chapter 4. Strings

Strings are one of the fundamental building blocks of data in PHP. Each string represents an ordered sequence of bytes. Strings can range from human-readable sections of text - like “To be or not to be” - to sequences of raw bytes encoded as integers - such as “1101451541541574012715716215414441”.1 Every element of data read or written by a PHP application is represented as strings.

In PHP, strings are typically encoded as ASCII values, although you can convert between ASCII and other formats (like UTF-8) as necessary. Strings can contain null bytes when needed, and are essentially limitless in terms of storage so long as the PHP process has adequate memory available.

The most basic way to create a string in PHP is with single quotes. Single-quoted strings are treated as literal statements - there are no special characters or any kind of interpolation of variables. To include a literal single quote within a single-quoted string, you must escape that quote by prefixing it with a backslash - for example, \'. In fact, the only two characters that need to be - or even can be - escaped are the single quote itself and the backslash. Example 4-1 provides examples of single-quoted strings along with their corresponding printed output.


Variable interpolation is the practice of referencing a variable directly by name within a string and letting the interpreter replace the variable with its value at runtime. Interpolation allows for more flexible strings as you can write a single string but dynamically replace some of its contents to fit the context of where it’s being used in code.

Example 4-1. Single-quoted strings
print 'Hello, world!';
// Hello, world!

print 'You\'ve only got to escape single quotes and the \\ character.';
// You've only got to escape single quotes and the \ character.

print 'Variables like $blue are printed as literals.';
// Variables like $blue are printed as literals.

print '1101451541541574012715716215414441';
// 1101451541541574012715716215414441

More complicated strings might need to interpolate variables or reference special characters, like a newline or tab. For these more complicated use cases, PHP requires the use of double quotes instead and allows for various escape sequences as shown in Table 4-1.

Table 4-1. Double-quoted string escape sequences
Escape sequence     Character               Example
\n                  Newline                 "This string ends in a new line.\n"
\r                  Carriage return         "This string ends with a carriage return.\r"
\t                  Tab                     "Lots\tof\tspace"
\\                  Backslash               "You must escape the \\ character."
\$                  Dollar sign             "A movie ticket is \$10."
\"                  Double quote            "Some quotes are \"scare quotes.\""
\0 through \777     Octal character value
\x0 through \xFF    Hex character value
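As a minimal sketch of the octal and hexadecimal escapes, the following snippet writes the same three bytes in both notations and contrasts them with a single-quoted string, where the backslashes stay literal. The variable names are illustrative only.

```php
<?php
// Octal and hexadecimal escapes are only interpreted inside double quotes.
$octal = "\120\110\120";   // octal 120, 110, 120
$hex   = "\x50\x48\x50";   // hexadecimal 50, 48, 50
$raw   = '\x50\x48\x50';   // single quotes: every backslash stays literal

print $octal . PHP_EOL;        // PHP
print $hex . PHP_EOL;          // PHP
print strlen($raw) . PHP_EOL;  // 12
```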


Outside of special characters that are explicitly escaped with leading backslashes, PHP will automatically substitute the value of any variable passed within a double-quoted string. Further, PHP will interpolate entire expressions within a double-quoted string if they’re wrapped in curly braces ({}) and treat them as a variable. Example 4-2 presents examples of how variables, complex or otherwise, are treated within double-quoted strings.

Example 4-2. Variable interpolation within double-quoted strings
print "The value of \$var is $var"; // (1)
print "Properties of objects can be interpolated, too. {$obj->value}"; // (2)
print "Prints the value of the variable returned by getVar(): {${getVar()}}"; // (3)

(1) The first reference to $var is escaped, but the second will be replaced by its actual value. If $var = 'apple', the string will print The value of $var is apple.

(2) Using curly braces enables the direct reference of object properties within a double-quoted string as if these properties were locally defined variables.

(3) Assuming getVar() returns the name of a defined variable, this line will both execute the function and print the value behind that variable.

Both single- and double-quoted strings are typically written on a single line. Often, though, a program will need to represent multiple lines of text (or multiple lines of encoded binary) as a string. In such a situation, the best tool at a developer’s disposal is a Heredoc.

A Heredoc is a literal block of text that’s started with three angle brackets (the <<< operator), followed by a named identifier, followed by a newline. Every subsequent line of text (including newlines) is part of the string, up to a completely independent line containing nothing but the Heredoc’s named identifier and a semicolon. Example 4-3 presents an illustration of how a Heredoc might look in code.


The identifier used for a Heredoc does not need to be capitalized. However, it is common convention in PHP to always capitalize these identifiers to help distinguish them from the text definition of the string.

Example 4-3. String definition using the Heredoc syntax
$poem = <<<POEM
To be or not to be,
That is the question
POEM;

Heredocs function just like double-quoted strings and permit variable interpolation (or special characters like escaped hexadecimal) within them. This can be particularly powerful when encoding blocks of HTML within an application, as variables can be used to make the strings dynamic.

In some situations, though, you might want a string literal rather than something open to variable interpolation. In that case, PHP’s Nowdoc syntax provides a single-quoted style alternative to Heredoc’s double-quoted string analog. A Nowdoc looks almost exactly like a Heredoc, except the identifier itself is enclosed in single quotes as in Example 4-4.

Example 4-4. String definition using the Nowdoc syntax
$poem = <<<'POEM'
To be or not to be,
That is the question
POEM;

Both single and double quotes can be used within Heredoc and Nowdoc blocks without additional escaping. Nowdocs, however, will not interpolate or dynamically replace any values, whether they are escaped or otherwise.
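To make the difference concrete, here is a small sketch that feeds the same line of text through both syntaxes. The $name variable and the TEXT identifier are hypothetical and exist only for this illustration.

```php
<?php
$name = 'Hamlet'; // hypothetical value, for illustration only

// Heredoc: behaves like a double-quoted string, so $name is interpolated.
$heredoc = <<<TEXT
Speaker: $name
TEXT;

// Nowdoc: behaves like a single-quoted string, so $name stays literal.
$nowdoc = <<<'TEXT'
Speaker: $name
TEXT;

print $heredoc . PHP_EOL; // Speaker: Hamlet
print $nowdoc . PHP_EOL;  // Speaker: $name
```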

The recipes that follow help further illustrate how strings can be used in PHP and the various problems they can solve.

4.1 Accessing substrings within a larger string


Problem

You want to identify whether or not a string contains a specific substring. For example, you want to know if a URL contains the text “/secret/”.


Solution

Use strpos():

if (strpos($url, '/secret/') !== false) {
    // A secret fragment was detected, run additional logic
    // ...
}


Discussion

PHP’s strpos() function will scan through a given string and identify the starting position of the first occurrence of a given substring. This function literally looks for a needle in a haystack, as the function’s arguments are named $haystack and $needle respectively. If the substring ($needle) is not found, then the function returns a Boolean false.

It’s important in this case to use strict equality comparison, as strpos() will return 0 if the substring appears at the very beginning of the string being searched. Remember from Recipe 2.3 that comparing values with only two equals signs will attempt to re-cast the types, converting an integer 0 into a Boolean false - we must always use strict comparison operators (either === for equality or !== for inequality) to avoid confusion.
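The pitfall is easiest to see with a needle that sits at position 0. The following is a minimal sketch; the URL is a made-up value chosen so that the match occurs at the very start.

```php
<?php
$url = '/secret/plans'; // hypothetical URL that starts with the needle

$pos = strpos($url, '/secret/'); // int(0): found at the very start

$loose  = ($pos != false);   // false: 0 == false under loose comparison
$strict = ($pos !== false);  // true: int 0 is not identical to bool false

var_dump($loose);  // bool(false)
var_dump($strict); // bool(true)
```

With the loose comparison, the match would be silently treated as a miss.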

If the $needle appears multiple times within a string, strpos() only returns the position of the first occurrence. You can search for additional occurrences by adding an optional position offset as a third parameter to the function call as in Example 4-5. Defining an offset also allows you to search the latter part of a string for a substring you know already appears earlier in the string.

Example 4-5. Count all occurrences of a substring
function count_occurrences($haystack, $needle)
{
    $occurrences = 0;
    $offset = 0;
    $pos = 0; // (1)

    do {
        $pos = strpos($haystack, $needle, $offset);

        if ($pos !== false) { // (2)
            $occurrences += 1;
            $offset = $pos + 1; // (3)
        }
    } while ($pos !== false); // (4)

    return $occurrences;
}

$str = 'How much wood would a woodchuck chuck if a woodchuck could chuck wood?';

print count_occurrences($str, 'wood'); // 4
print count_occurrences($str, 'nutria'); // 0

(1) All of our variables are initially set to 0 so we can track new string occurrences.

(2) If and only if the string was found do we want to count an occurrence.

(3) If the string was found we update our offset, but also increment by 1 so we don’t repeatedly re-count the occurrence we’ve already found.

(4) Once we’ve reached the last occurrence of our substring, we exit the loop and return our count.

See Also

PHP documentation on strpos().

4.2 Extracting one string from within another


Problem

You want to extract a small string from a much larger string. For example, you want to extract the domain name from an email address.


Solution

Use substr() to select the part of the string you want to extract as follows:

$string = '[email protected]';
$start = strpos($string, '@');

$domain = substr($string, $start + 1);


Discussion

PHP’s substr() function returns the portion of a given string, based on an initial offset (the second parameter) up to an optional length. The full function signature is as follows:

function substr(string $string, int $offset, ?int $length = null): string

If the $length parameter is omitted, then substr() will return the entire remainder of the string. If the $offset parameter is greater than the length of the input string, an empty string is returned.
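One thing the Solution glosses over is what happens when the string contains no @ at all: strpos() returns false, and false + 1 evaluates to 1, so substr() would quietly return everything after the first character. The sketch below wraps the two calls in a hypothetical helper, extract_domain(), that guards against that case. The helper name and the sample address are assumptions made for this illustration.

```php
<?php
// Hypothetical helper: returns the text after the first "@", or null if none exists.
function extract_domain(string $email): ?string
{
    $start = strpos($email, '@');

    if ($start === false) {
        return null; // no "@" present, so there is no domain to extract
    }

    // Omitting $length returns the remainder of the string.
    return substr($email, $start + 1);
}

var_dump(extract_domain('user@example.com')); // string(11) "example.com"
var_dump(extract_domain('not-an-email'));     // NULL
```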

You can also specify a negative offset to return a subset starting from the end of the string instead of the beginning as in Example 4-6.

Example 4-6. Substring with a negative offset
$substring = substr('phpcookbook', -3); // (1)
$substring = substr('phpcookbook', -2); // (2)
$substring = substr('phpcookbook', -8, 4); // (3)

(1) Returns ook

(2) Returns ok

(3) Returns cook

There are also some other edge cases regarding offsets and string lengths to be aware of with substr(). It is possible for the offset to legitimately start within the string, but for $length to run past the end of it. PHP catches this discrepancy and merely returns the remainder of the original string, even if the final return is less than the specified length. Example 4-7 details some potential outputs of substr() based on various specified lengths:

Example 4-7. Various substring lengths
$substring = substr('Four score and twenty', 11, 3); // (1)
$substring = substr('Four score and twenty', 99, 3); // (2)
$substring = substr('Four score and twenty', 20, 3); // (3)

(1) Returns and

(2) Returns an empty string

(3) Returns y

Another edge case is a negative $length supplied to the function. When requesting a substring with a negative length, PHP will subtract that many characters from the substring it returns, as illustrated in Example 4-8:

Example 4-8. Substring with a negative length
$substring = substr('Four score and twenty', 5); // (1)
$substring = substr('Four score and twenty', 5, -11); // (2)

(1) Returns score and twenty

(2) Returns score

See Also

PHP documentation for substr() and for strpos().

4.3 Replacing part of a string


Problem

You want to replace just one part of a string with another string. For example, you want to obfuscate the first six digits of a phone number before printing it to the screen.


Solution

Use substr_replace() to replace a component of an existing string based on its position:

$string = '555-123-4567';
$replace = 'xxx-xxx';

$obfuscated = substr_replace($string, $replace, 0, strlen($replace));
// xxx-xxx-4567


Discussion

PHP’s substr_replace() function operates on a part of a string, similar to substr(), defined by some integer offset up to a specific length. The full function signature is as follows:

function substr_replace(
    array|string $string,
    array|string $replace,
    array|int $offset,
    array|int|null $length = null
): string|array

Unlike its substr() analog, substr_replace() can operate either on individual strings or on collections of strings. If an array of strings is passed in with scalar values for $replace and $offset, then the function will run the replacement on each string as in Example 4-9.

Example 4-9. Replacing multiple substrings at once.
$phones = [
    // ...
];

$obfuscated = substr_replace($phones, 'xxx-xxx', 0, 7);

// xxx-xxx-5555
// xxx-xxx-1234
// xxx-xxx-9955

In general, developers have a lot of flexibility with the parameters in this function. Similar to substr():

  • $offset can be negative, in which case replacements begin that many characters from the end of the string.

  • $length can be negative, representing the number of characters from the end of the string at which to stop replacing.

  • If $length is null, it will internally become the same as the length of the input string itself.

  • If $length is 0, $replace will be inserted into the string at the given $offset and no replacement will take place at all.
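The following sketch exercises three of the cases in the list above against a single phone number. The number and the replacement strings are illustrative values, not data from a real system.

```php
<?php
$phone = '555-123-4567'; // illustrative number

// Negative offset: start replacing four characters from the end.
// With $length omitted, everything from the offset onward is replaced.
$a = substr_replace($phone, 'xxxx', -4);

// Negative length: stop replacing five characters before the end.
$b = substr_replace($phone, 'xxx-xxx', 0, -5);

// Zero length: insert at the offset without removing anything.
$c = substr_replace($phone, '+1-', 0, 0);

print $a . PHP_EOL; // 555-123-xxxx
print $b . PHP_EOL; // xxx-xxx-4567
print $c . PHP_EOL; // +1-555-123-4567
```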

Finally, if $string is provided as an array, all other parameters can be provided as arrays as well. Each element will represent a setting for the string in the same position in $string as illustrated by Example 4-10.

Example 4-10. Replacing multiple substrings with array parameters
$phones = [
    // ...
];

$offsets = [0, 0, 4];

$replace = [
    // ...
];

$lengths = [7, 7, 8];

$obfuscated = substr_replace($phones, $replace, $offsets, $lengths);

// xxx-xxx-5555
// xxx-xxx-1234
// 555-991-9955

It is not a hard requirement that the arrays passed in for $string, $replace, $offset, and $length all be the same size. PHP will not throw an error or warning if you pass arrays with different dimensions. Doing so will, however, result in unexpected output during the replacement operation - for example, truncating a string rather than replacing its contents. It’s a good idea to validate that the dimensions of all four arrays match.

The substr_replace() function is convenient if you know exactly where you need to replace characters within a string. In some situations, you might not know the position of a substring that needs to be replaced, but want to instead replace occurrences of a specific substring. In those circumstances you would want to instead use either str_replace() or str_ireplace().

These two functions will search a specified string to find an occurrence (or many occurrences) of a specified substring and replace it with something else. The functions are identical in their call pattern, but the extra i in str_ireplace() indicates that it searches for a pattern in a case-insensitive fashion. Example 4-11 provides an illustrative example of both functions in use.

Example 4-11. Searching and replacing within a string
$string = 'How much wood could a Woodchuck chuck if a woodchuck could chuck wood?';

$beaver = str_replace('woodchuck', 'beaver', $string); // (1)
$ibeaver = str_ireplace('woodchuck', 'beaver', $string); // (2)

(1) How much wood could a Woodchuck chuck if a beaver could chuck wood?

(2) How much wood could a beaver chuck if a beaver could chuck wood?

Both str_replace() and str_ireplace() accept an optional $count parameter that is passed by reference. If specified, this variable will be updated with the number of replacements the function performed. In Example 4-11 this count would have been 1 and 2, respectively, due to the capitalization of “Woodchuck.”
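As a brief sketch of the by-reference counter, the snippet below repeats the two calls from Example 4-11 with a fourth argument. The variable names $count and $icount are chosen here for illustration.

```php
<?php
$string = 'How much wood could a Woodchuck chuck if a woodchuck could chuck wood?';

// The fourth argument is passed by reference and receives the replacement count.
$beaver  = str_replace('woodchuck', 'beaver', $string, $count);
$ibeaver = str_ireplace('woodchuck', 'beaver', $string, $icount);

print $count . PHP_EOL;  // 1 (case-sensitive: only the lowercase "woodchuck")
print $icount . PHP_EOL; // 2 (case-insensitive: "Woodchuck" matches as well)
```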

See Also

PHP documentation on substr_replace(), str_replace(), and str_ireplace().

4.4 Processing a string one byte at a time


Problem

You need to process a string from beginning to end, one character at a time.


Solution

Loop through each character of the string as if it were an array. Example 4-12 will count the number of capital letters in a string.

Example 4-12. Count capital characters in a string
$capitals = 0;

$string = 'The Carriage held but just Ourselves - And Immortality';
for ($i = 0; $i < strlen($string); $i++) {
    if (ctype_upper($string[$i])) {
        $capitals += 1;
    }
}

// $capitals = 5


Discussion

Strings are not arrays in PHP so you cannot loop over them directly. However, they do provide array-like access to individual characters within the string based on their position. You can reference individual characters by their integer offset (starting with 0), or even by a negative offset to start at the end of the string.
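A short sketch of both kinds of offset follows. The sample string is arbitrary, and negative string offsets assume PHP 7.1 or later.

```php
<?php
$string = 'PHP Cookbook';

$first = $string[0];   // offsets count from 0 at the start of the string
$last  = $string[-1];  // negative offsets count back from the end
$cap   = $string[-8];  // eight bytes from the end: the "C" in "Cookbook"

print $first . PHP_EOL; // P
print $last . PHP_EOL;  // k
print $cap . PHP_EOL;   // C
```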

Array-like access isn’t read-only, though. You can just as easily replace a single character in a string based on its position as follows:

$string = 'A new recipe made my coffee stronger this morning';
$string[31] = 'a';

// A new recipe made my coffee stranger this morning

It is also possible to convert a string directly to an array using str_split(), then iterate over all items in the resulting array. This will work as an update to our Solution example as illustrated in Example 4-13.

Example 4-13. Converting a string into an array directly.
$capitals = 0;

$string = 'The Carriage held but just Ourselves - And Immortality';
$stringArray = str_split($string);
foreach ($stringArray as $char) {
    if (ctype_upper($char)) {
        $capitals += 1;
    }
}

// $capitals = 5

The downside of Example 4-13 is that PHP now has to maintain two copies of our data: both the original string itself and the resultant array. This isn’t a problem when handling small strings like in our example; if your strings instead represent entire files on disk, you will rapidly exhaust the memory available to PHP.

PHP makes accessing individual bytes (characters) within a string relatively easy without any changes in data type. Splitting a string into an array works, but might be unnecessary unless you actually need an array of characters. Example 4-14 reimagines Example 4-13 using an array reduction technique rather than by counting the capital letters in a string directly.

Example 4-14. Counting capital letters in a string with array reduction
$str = 'The Carriage held but just Ourselves - And Immortality';

$caps = array_reduce(str_split($str), fn($c, $i) => ctype_upper($i) ? $c+1: $c, 0);

While Example 4-14 is functionally equivalent to Example 4-13, it is much more concise and, consequently, more difficult to understand. While it is tempting to reimagine complex logic as one-line functions, unnecessary refactoring of your code for the sake of conciseness can be dangerous. The code might appear elegant, but over time becomes more difficult to maintain.

The simplified reduction introduced in Example 4-14 is functionally accurate, but still requires splitting our string into an array. It saves on lines of code in our program, but still results in creating a second copy of our data. As mentioned before, if the strings over which we’re iterating are large (e.g., massive binary files), this will rapidly consume the memory available to PHP.

See Also

PHP documentation on string access and modification, as well as documentation on ctype_upper().

4.5 Generating random strings


Problem

You want to generate a string of random characters.


Solution

Use PHP’s native random_int() function:

function random_string($length = 16)
{
    $characters = '0123456789abcdefghijklmnopqrstuvwxyz';

    $string = '';
    while (strlen($string) < $length) {
        $string .= $characters[random_int(0, strlen($characters) - 1)];
    }
    return $string;
}


Discussion

PHP has strong, cryptographically secure pseudorandom generator functions for both integers and bytes. It does not have a native function that generates random human-readable text, but the underlying functions can be used to create one as shown in the Solution example.


A cryptographically secure pseudorandom number generator is a function that returns numbers with no distinguishable or predictable pattern. Even forensic analysis cannot distinguish between random noise and the output of a cryptographically secure generator.

A valid and potentially simpler method for producing random strings is to leverage PHP’s random_bytes() function and encode the binary output as ASCII text. Example 4-15 illustrates two possible methods of using random bytes as a string.

Example 4-15. Creating a string of random bytes.
$string = random_bytes(16); // (1)

$hex = bin2hex($string); // (2)
$base64 = base64_encode($string); // (3)

(1) As the string of binary bytes will be encoded in a different format, keep in mind that the number of bytes produced will not match the length of the final string.

(2) Encode our random string in hexadecimal output. Note that this format will double the length of the string: 16 bytes is equivalent to 32 hexadecimal characters.

(3) Leverage Base64 encoding to convert the raw bytes into readable characters. Note that this format increases the length of the string by 33-36%.
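The length changes described above can be checked directly. This is a small sketch for the 16-byte case only; the variable names are illustrative.

```php
<?php
$bytes = random_bytes(16);

$hex    = bin2hex($bytes);        // two hexadecimal characters per byte
$base64 = base64_encode($bytes);  // four characters per three bytes, padded

print strlen($bytes) . PHP_EOL;  // 16
print strlen($hex) . PHP_EOL;    // 32
print strlen($base64) . PHP_EOL; // 24
```

The contents differ on every run, but the lengths depend only on the number of input bytes.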

See Also

PHP documentation on random_int() and on random_bytes(). Also Recipe 5.4 on generating random numbers.

4.6 Interpolating variables within a string


Problem

You want to include dynamic content in an otherwise static string.


Solution

Use double quotes to wrap the string and insert a variable, array element, or object property directly in the string itself. PHP does not interpolate function calls, so assign the result of a call to a variable first.

echo "There are {$_POST['cats']} cats and {$_POST['dogs']} dogs outside.";
$length = strlen($username);
echo "Your username is {$length} characters long.";
echo "The car is painted {$car->color}.";


Discussion

Unlike single-quoted strings, double-quoted strings allow for complex, dynamic values as literals. Any word starting with a $ character is interpreted as a variable name, unless that leading character is properly escaped.2

While the Solution example wraps dynamic content in curly braces, this is not a requirement in PHP. Simple variables can easily be written as-is within a double-quoted string and will be interpolated properly. However, more complex sequences become difficult to read without the braces. It’s a highly recommended best practice to always enclose any value you want interpolated in curly braces to make the string more readable.

Unfortunately, string interpolation has its limits. The Solution example illustrates pulling data out of the superglobal $_POST array and inserting it directly into a string. This is potentially dangerous, as that content is generated directly by the user, and the string could be leveraged in a sensitive way. In fact, string interpolation like this is one of the largest vectors for injection attacks against applications.


An injection attack is where a third party can pass (or inject) executable or otherwise malicious input into your application and cause it to misbehave. We will look at more sophisticated ways to protect against this family of attacks in Chapter 9.

To protect your string use against potentially malicious user-generated input, it’s a good idea to instead use a format string via PHP’s sprintf() function to filter the content. Example 4-16 rewrites part of our Solution example to protect against malicious $_POST data:

Example 4-16. Using format strings to produce an interpolated string
echo sprintf('There are %d cats and %d dogs.', $_POST['cats'], $_POST['dogs']);

Format strings are a very basic form of input sanitization in PHP. In Example 4-16 we are explicitly assuming that the supplied $_POST data is numeric. The %d tokens within the format string will be replaced by the user-supplied data, but PHP will explicitly cast this data as integers during the replacement.
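To illustrate the integer cast, the sketch below runs the same format string over a hypothetical array standing in for $_POST, with one value deliberately carrying extra text. The array contents are invented for this example.

```php
<?php
// Hypothetical request data standing in for $_POST.
$input = ['cats' => '3', 'dogs' => '2; DROP TABLE pets'];

// %d converts each argument to an integer, so only the leading digits survive.
$line = sprintf('There are %d cats and %d dogs.', $input['cats'], $input['dogs']);

print $line . PHP_EOL; // There are 3 cats and 2 dogs.
```

Note that this only neutralizes values that are supposed to be integers. A %s token would pass the text through unchanged.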

If, for example, this string were being inserted into a database the formatting would protect against the potential of injection attacks against SQL interfaces. We will discuss more complete methods of filtering and sanitizing user input in Chapter 9.

See Also

PHP documentation on variable parsing in double quotes and heredoc as well as documentation on the sprintf() function.

4.7 Concatenating multiple strings together


Problem

You need to create a new string from two smaller strings.


Solution

Use PHP’s string concatenation operator:

$first = 'To be or not to be';
$second = 'That is the question';

$line = $first . ' ' . $second;


Discussion

PHP uses a single . character to join two strings together. This operator will also leverage type coercion to ensure both values in the operation are strings before they’re concatenated, as shown in Example 4-17.

Example 4-17. Examples of string concatenation
print 'String ' . 2; // (1)
print 2 . ' number'; // (2)
print 'Boolean ' . true; // (3)
print 2 . 3; // (4)

(1) Prints “String 2”.

(2) Prints “2 number”.

(3) Prints “Boolean 1” because the Boolean true is converted to the string “1”.

(4) Prints “23”.

The string concatenation operator is a quick way to combine simple strings, but can become somewhat verbose if you use it to combine multiple strings with white space. Consider Example 4-18, where we try to combine a list of words together into a string, each separated by a space:

Example 4-18. Verbosity in concatenating large groups of strings
$words = [
    // ...
];

$option1 = $words[0] . ' ' . $words[1] . ' ' . $words[2] . ' ' . $words[3] .
         ' ' . $words[4] . ' ' . $words[5] . ' ' . $words[6] .
         ' ' . $words[7]; // (1)

$option2 = '';
foreach ($words as $word) {
    $option2 .= ' ' . $word; // (2)
}
$option2 = ltrim($option2); // (3)

(1) One option is to concatenate each word in our collection separately. As our word list grows, this quickly becomes unwieldy.

(2) We can, instead, loop over our collection and concatenate each word in turn.

(3) When using a loop, we might end up with unnecessary whitespace. We need to remember to trim extraneous spaces from the start of the string.

Large, repetitive concatenation routines can be replaced by native PHP functions like implode(). This function in particular accepts an array of data to be joined and a definition of the character (or characters) to be used between data elements. It returns the final, concatenated string.


Some developers prefer to use join() instead of implode() as it’s seen to be a more descriptive name for the operation. The fact is, join() is an alias of implode() and the PHP compiler doesn’t care which you use.

If we rewrite Example 4-18 to use implode(), the entire operation becomes much simpler as follows:

$words = [
    // ...
];

$string = implode(' ', $words);

Take care to remember the parameter order for implode(). The string separator comes first, followed by the array over which you want to iterate. Earlier versions of PHP (prior to version 8.0) allowed the parameters to be specified in the opposite order. This behavior (specifying the array first and the separator second) was deprecated in PHP 7.4. As of PHP 8.0 this will throw a TypeError.

If you’re using a library that was written prior to PHP 8.0, be sure you test that it’s not misusing either implode() or join() before you ship your project to production.
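As a quick sketch of the supported parameter order, the snippet below joins an illustrative word list with both implode() and its alias. The word list is made up for this example.

```php
<?php
$words = ['To', 'be', 'or', 'not', 'to', 'be']; // illustrative word list

$viaImplode = implode(' ', $words); // separator first, then the array
$viaJoin    = join(' ', $words);    // join() is an alias of implode()

print $viaImplode . PHP_EOL;        // To be or not to be
var_dump($viaImplode === $viaJoin); // bool(true)
```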

See Also

PHP documentation on implode().

4.8 Managing binary data stored in strings


Problem

You want to encode data directly as binary rather than an ASCII-formatted representation or read data into your application that was explicitly encoded as binary data.


Solution

Use unpack() to extract binary data from a string:

$unpacked = unpack('S1', 'Hi'); // [26952]

Use pack() to write binary data to a string:

$packed = pack('C13', 72, 101, 108, 108, 111, 44, 32, 119, 111,
               114, 108, 100, 33); // 'Hello, world!'


Discussion

Both pack() and unpack() empower us to operate on raw, binary strings assuming we know the format of the binary string we’re working with. The first parameter of each function is a format specification. This specification is determined by specific format codes as defined in Table 4-2.

Table 4-2. Binary format string codes
Code  Description
a     NUL-padded string
A     SPACE-padded string
h     Hex string, low nibble first
H     Hex string, high nibble first
c     signed char
C     unsigned char
s     signed short (always 16 bit, machine byte order)
S     unsigned short (always 16 bit, machine byte order)
n     unsigned short (always 16 bit, big endian byte order)
v     unsigned short (always 16 bit, little endian byte order)
i     signed integer (machine dependent size and byte order)
I     unsigned integer (machine dependent size and byte order)
l     signed long (always 32 bit, machine byte order)
L     unsigned long (always 32 bit, machine byte order)
N     unsigned long (always 32 bit, big endian byte order)
V     unsigned long (always 32 bit, little endian byte order)
q     signed long long (always 64 bit, machine byte order)
Q     unsigned long long (always 64 bit, machine byte order)
J     unsigned long long (always 64 bit, big endian byte order)
P     unsigned long long (always 64 bit, little endian byte order)
f     float (machine dependent size and representation)
g     float (machine dependent size, little endian byte order)
G     float (machine dependent size, big endian byte order)
d     double (machine dependent size and representation)
e     double (machine dependent size, little endian byte order)
E     double (machine dependent size, big endian byte order)
x     NUL byte
X     Back up one byte
Z     NUL-padded string
@     NUL-fill to absolute position

When defining a format string, you can specify each byte type individually or leverage an optional repeat count. In our Solution examples, we specify the number of values explicitly with an integer. We could just as easily use an asterisk (*) to specify that a type repeats through the end of our data as follows:

$unpacked = unpack('S*', 'Hi'); // [26952]
$packed = pack('C*', 72, 101, 108, 108, 111, 44, 32, 119, 111,
               114, 108, 100, 33); // 'Hello, world!'
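The following sketch shows a round trip with the repeat specifier and, for contrast, reads the same two bytes with a fixed byte order. It uses the n code (big endian) rather than S so that the result does not depend on the machine running it. The variable names are illustrative.

```php
<?php
$text = 'Hi';

// 'C*' unpacks every byte as an unsigned char; the resulting keys start at 1.
$bytes = unpack('C*', $text); // [1 => 72, 2 => 105]

// Spreading the values back into pack() with the same format restores the string.
$restored = pack('C*', ...$bytes);

// 'n' reads the same two bytes as a single big-endian 16-bit integer.
$short = unpack('n', $text)[1]; // 0x4869

print $restored . PHP_EOL; // Hi
print $short . PHP_EOL;    // 18537
```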

PHP’s ability to convert between byte encoding types via unpack() also avails a simple method of converting ASCII characters to and from their binary equivalent. The ord() function will return the value of a specific character, but requires looping over each character in a string if you want to unpack each in turn, as demonstrated in Example 4-19.

Example 4-19. Retrieving character values with ord()
$ascii = 'PHP Cookbook';

$chars = [];
for ($i = 0; $i < strlen($ascii); $i++) {
    $chars[] = ord($ascii[$i]);
}


Thanks to unpack(), we don’t need to explicitly iterate over the characters in a string. The c format character references a signed character, and C an unsigned one. Rather than build a loop, we can leverage unpack() directly as follows to get an equivalent result:

$ascii = 'PHP Cookbook';
$chars = unpack('C*', $ascii);


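Both the preceding unpack() example and the original loop implementation in Example 4-19 produce the same sequence of byte values. The only difference is in the keys: unpack() numbers its results starting at 1, whereas the loop builds a 0-indexed array. The unpack() result looks like this:

array(12) {
  [1]=>
  int(80)
  [2]=>
  int(72)
  [3]=>
  int(80)
  [4]=>
  int(32)
  [5]=>
  int(67)
  [6]=>
  int(111)
  [7]=>
  int(111)
  [8]=>
  int(107)
  [9]=>
  int(98)
  [10]=>
  int(111)
  [11]=>
  int(111)
  [12]=>
  int(107)
}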

See Also

PHP documentation on pack() and unpack().

1 This string is a byte representation, formatted in octal notation, of “Hello World!”

2 Review Table 4-1 for more on double-character escape sequences
