Optimizing the interpreter by adding a compiler

Our parser now works as it should, and you could use it in any kind of application to offer very flexible customization options to the end user. However, the parser does not work very efficiently. In general, parsing expressions is computationally expensive, and in most use cases, it is reasonable to assume that the expressions you're working with do not change with every request (or at least, are evaluated more often than they are changed).

Because of this, we can optimize the parser's performance by adding a caching layer to our interpreter. Of course, we cannot cache the actual evaluation results of an expression; after all, these could change when they are interpreted with different variables.

What we're going to do in this section is add a compiler feature to our parser. For each parsed expression, our parser generates an AST that represents the structure of this expression. You can now use this syntax tree to translate the expression into any other programming language, for example, to PHP.

Consider the 2 + 3 * a expression. This expression generates the following syntax tree:

[Figure: syntax tree for the 2 + 3 * a expression]

In our AST model, this corresponds to an instance of the Packt\Chp8\DSL\AST\Addition class, holding references to an instance of the Packt\Chp8\DSL\AST\Number class and an instance of the Packt\Chp8\DSL\AST\Product class (and so forth).

We can now implement a compiler feature that translates this expression back into PHP code (after all, PHP supports simple arithmetic operations, too), which might look like this:

use Packt\Chp8\DSL\AST\Expression; 
 
$cachedExpr = new class implements Expression 
{ 
    public function evaluate(array $variables = []) 
    { 
        return 2 + (3 * $variables['a']); 
    } 
}; 

The PHP code that is generated in this way could then be saved in files for later lookup. If the parser gets passed an expression that is already cached, it could simply load the saved PHP files in order to not actually parse the expression again.

To implement this feature, we need to be able to convert each node in a syntax tree into a corresponding PHP expression. For this, let's start by extending our Packt\Chp8\DSL\AST\Expression interface with a new method:

namespace Packt\Chp8\DSL\AST; 
 
interface Expression 
{ 
    public function evaluate(array $variables = []); 
 
    public function compile(): string; 
} 

The downside of this approach is that you'll now need to implement this method in every class that implements this interface. Let's start with something simple: the Packt\Chp8\DSL\AST\Number class. As each Number implementation will always evaluate to the same number (3 will always evaluate to 3 and never to 4), we can simply return the numeric value:

namespace Packt\Chp8\DSL\AST; 
 
abstract class Number implements Expression 
{ 
    public function compile(): string
    {
        return var_export($this->evaluate(), true);
    } 
} 

As for the remaining node types, we'll need methods that return an implementation of each expression type in PHP. For example, for the Packt\Chp8\DSL\AST\Addition class, we could add the following compile() method:

namespace Packt\Chp8\DSL\AST; 
 
class Addition extends BinaryOperation 
{ 
    // ... 
 
    public function compile(): string
    {
        return '(' . $this->left->compile() . ') + (' . $this->right->compile() . ')';
    } 
} 

Proceed similarly for the remaining arithmetic operations: Subtraction, Multiplication, and Division, and also the logical operations such as Equals, NotEquals, And, and Or.
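As a rough sketch of what "proceeding similarly" could look like, the following stand-alone example implements compile() for Multiplication and Equals. It uses simplified, non-namespaced stand-ins for the chapter's classes, and the IntegerNode class is a hypothetical concrete number node introduced only for this illustration:

```php
<?php
// Simplified, non-namespaced stand-ins for the chapter's AST classes.
interface Expression
{
    public function evaluate(array $variables = []);
    public function compile(): string;
}

// Hypothetical concrete number node (the chapter's Number class is abstract).
class IntegerNode implements Expression
{
    private $value;

    public function __construct(int $value)
    {
        $this->value = $value;
    }

    public function evaluate(array $variables = [])
    {
        return $this->value;
    }

    public function compile(): string
    {
        return var_export($this->evaluate(), true);
    }
}

abstract class BinaryOperation implements Expression
{
    protected $left;
    protected $right;

    public function __construct(Expression $left, Expression $right)
    {
        $this->left = $left;
        $this->right = $right;
    }
}

class Multiplication extends BinaryOperation
{
    public function evaluate(array $variables = [])
    {
        return $this->left->evaluate($variables) * $this->right->evaluate($variables);
    }

    public function compile(): string
    {
        // Parenthesize both operands so operator precedence never matters.
        return '(' . $this->left->compile() . ') * (' . $this->right->compile() . ')';
    }
}

class Equals extends BinaryOperation
{
    public function evaluate(array $variables = [])
    {
        return $this->left->evaluate($variables) == $this->right->evaluate($variables);
    }

    public function compile(): string
    {
        return '(' . $this->left->compile() . ') == (' . $this->right->compile() . ')';
    }
}

$product = new Multiplication(new IntegerNode(3), new IntegerNode(4));
$check = new Equals($product, new IntegerNode(12));

echo $product->compile(), PHP_EOL; // (3) * (4)
echo $check->compile(), PHP_EOL;   // ((3) * (4)) == (12)
```

Wrapping every operand in parentheses produces more brackets than strictly necessary, but it means no node has to know anything about the precedence of its children.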

For the Condition class, you can use PHP's ternary operator:

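namespace Packt\Chp8\DSL\AST; 
 
class Condition implements Expression 
{ 
    // ... 
 
    public function compile(): string
    {
        // The condition is parenthesized as well, so that a nested
        // Condition never yields an unparenthesized nested ternary.
        return sprintf('(%s) ? (%s) : (%s)',
            $this->when->compile(),
            $this->then->compile(),
            $this->else->compile()
        );
    } 
} 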
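To see how such a compiled ternary behaves when conditions are nested, consider the following minimal sketch. It assumes simplified, non-namespaced stand-ins for the chapter's classes; the Literal class is hypothetical and exists only to keep the example short. The sketch wraps the condition in parentheses so that nested conditions stay unambiguous:

```php
<?php
// Simplified, non-namespaced stand-in for the chapter's Expression interface.
interface Expression
{
    public function evaluate(array $variables = []);
    public function compile(): string;
}

// Hypothetical literal node that compiles to its PHP representation.
class Literal implements Expression
{
    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function evaluate(array $variables = [])
    {
        return $this->value;
    }

    public function compile(): string
    {
        return var_export($this->value, true);
    }
}

class Condition implements Expression
{
    private $when;
    private $then;
    private $else;

    public function __construct(Expression $when, Expression $then, Expression $else)
    {
        $this->when = $when;
        $this->then = $then;
        $this->else = $else;
    }

    public function evaluate(array $variables = [])
    {
        return $this->when->evaluate($variables)
            ? $this->then->evaluate($variables)
            : $this->else->evaluate($variables);
    }

    public function compile(): string
    {
        return sprintf('(%s) ? (%s) : (%s)',
            $this->when->compile(),
            $this->then->compile(),
            $this->else->compile()
        );
    }
}

// A condition whose "when" part is itself a condition.
$inner = new Condition(new Literal(true), new Literal(1), new Literal(0));
$outer = new Condition($inner, new Literal('yes'), new Literal('no'));

$code = $outer->compile();
echo $code, PHP_EOL;                         // ((true) ? (1) : (0)) ? ('yes') : ('no')
echo eval('return ' . $code . ';'), PHP_EOL; // yes
```

The eval() call is used here only to demonstrate that the generated string is valid PHP; the caching mechanism in this section writes the code to a file instead.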

The NamedVariable class is more difficult to adjust: the class's evaluate() method currently throws an UnknownVariableException when a non-existing variable is referenced. However, our compile() method needs to return a single PHP expression, and looking up a value while also throwing an exception cannot be done in a single expression. Luckily, you can instantiate classes and call methods on them within a single expression:

namespace Packt\Chp8\DSL\AST; 
 
use Packt\Chp8\DSL\Exception\UnknownVariableException; 
 
class NamedVariable implements Variable 
{ 
    // ... 
 
    public function evaluate(array $variables = []) 
    { 
        if (isset($variables[$this->name])) { 
            return $variables[$this->name]; 
        } 
        throw new UnknownVariableException(); 
    } 
 
    public function compile(): string 
    { 
        return sprintf('(new %s(%s))->evaluate($variables)', 
            __CLASS__, 
            var_export($this->name, true) 
        ); 
    } 
} 

Using this workaround, the a * 3 expression would be compiled to the following PHP code:

((new Packt\Chp8\DSL\AST\NamedVariable('a'))->evaluate($variables)) * (3) 
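The following sketch illustrates why this workaround is sufficient: the compiled string is an ordinary PHP expression that only needs a $variables array in scope. The classes are simplified, non-namespaced stand-ins for the chapter's classes, and runCompiled() is a hypothetical helper that uses eval() purely for demonstration:

```php
<?php
// Simplified stand-in for the chapter's exception class.
class UnknownVariableException extends Exception
{
}

// Simplified, non-namespaced stand-in for the chapter's NamedVariable class.
class NamedVariable
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function evaluate(array $variables = [])
    {
        if (isset($variables[$this->name])) {
            return $variables[$this->name];
        }
        throw new UnknownVariableException();
    }

    public function compile(): string
    {
        return sprintf('(new %s(%s))->evaluate($variables)',
            __CLASS__,
            var_export($this->name, true)
        );
    }
}

// Hypothetical helper: evaluates a compiled expression with $variables in scope.
function runCompiled(string $code, array $variables)
{
    return eval('return ' . $code . ';');
}

$compiled = (new NamedVariable('a'))->compile();
echo $compiled, PHP_EOL; // (new NamedVariable('a'))->evaluate($variables)

echo runCompiled($compiled . ' * 3', ['a' => 14]), PHP_EOL; // 42

try {
    runCompiled($compiled, []);
} catch (UnknownVariableException $e) {
    // The compiled code fails the same way as the interpreted node.
    echo 'unknown variable', PHP_EOL;
}
```

Because the lookup is delegated back to evaluate(), the compiled code and the interpreted node cannot drift apart in how they treat missing variables.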

This just leaves the PropertyFetch class. You might remember that this class was a bit more complex than the other node types, as it implemented quite a few different contingencies on how to look up properties from objects. In theory, this logic could be implemented in a single expression using ternary operators. This would result in the foo.bar expression being compiled to the following monstrosity:

is_object((new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables)) ? (is_callable([(new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables), 'getBar']) ? (new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables)->getBar() : (is_callable([(new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables), 'isBar']) ? (new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables)->isBar() : ((new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables)->bar ?? null))) : ((new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables)['bar'] ?? null) 

To prevent the compiled code from getting overly complicated, it's easier to refactor the PropertyFetch class a little. You can extract the actual property lookup logic into a static method that can be called from both the evaluate() method and the compiled code:

<?php 
namespace Packt\Chp8\DSL\AST; 
 
class PropertyFetch implements Variable 
{ 
    private $left; 
    private $property; 
 
    public function __construct(Variable $left, string $property) 
    { 
        $this->left = $left; 
        $this->property = $property; 
    } 
 
    public function evaluate(array $variables = []) 
    { 
        $var = $this->left->evaluate($variables); 
        return static::evaluateStatic($var, $this->property); 
    } 
 
    public static function evaluateStatic($var, string $property)
    {
        if (is_object($var)) {
            // Prefer a getFoo() getter, then an isFoo() method,
            // and fall back to a public property.
            $getterMethodName = 'get' . ucfirst($property);
            if (is_callable([$var, $getterMethodName])) {
                return $var->{$getterMethodName}();
            }

            $isMethodName = 'is' . ucfirst($property);
            if (is_callable([$var, $isMethodName])) {
                return $var->{$isMethodName}();
            }

            return $var->{$property} ?? null;
        }

        // Anything that is not an object is treated as an array.
        return $var[$property] ?? null;
    }

    public function compile(): string
    {
        return __CLASS__ . '::evaluateStatic(' . $this->left->compile() . ', ' . var_export($this->property, true) . ')';
    } 
} 

This way, the foo.bar expression will simply compile to this:

Packt\Chp8\DSL\AST\PropertyFetch::evaluateStatic( 
    (new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables), 
    'bar' 
) 
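To make the lookup order of such a static helper more tangible, the following sketch repeats the same rules in a free-standing function and runs them against a few inputs. The function name lookupProperty() and the Customer class are hypothetical and serve only as illustration:

```php
<?php
// Free-standing copy of the lookup rules (hypothetical function name).
function lookupProperty($var, string $property)
{
    if (is_object($var)) {
        $getterMethodName = 'get' . ucfirst($property);
        if (is_callable([$var, $getterMethodName])) {
            return $var->{$getterMethodName}();
        }

        $isMethodName = 'is' . ucfirst($property);
        if (is_callable([$var, $isMethodName])) {
            return $var->{$isMethodName}();
        }

        return $var->{$property} ?? null;
    }

    return $var[$property] ?? null;
}

// Hypothetical example class covering all three object cases.
class Customer
{
    private $name = 'Alice';
    public $city = 'Berlin';

    public function getName(): string
    {
        return $this->name;
    }

    public function isActive(): bool
    {
        return true;
    }
}

$customer = new Customer();

var_dump(lookupProperty($customer, 'name'));          // resolved via getName()
var_dump(lookupProperty($customer, 'active'));        // resolved via isActive()
var_dump(lookupProperty($customer, 'city'));          // resolved via the public property
var_dump(lookupProperty(['name' => 'Bob'], 'name'));  // resolved via array access
var_dump(lookupProperty(['name' => 'Bob'], 'email')); // missing key yields null
```

Note that an unknown property yields null instead of raising an exception, which differs from how NamedVariable treats unknown variables.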

In the next step, we can add an alternative to the previously introduced ExpressionBuilder class that transparently compiles expressions, saves them in a cache, and reuses the compiled versions when necessary.

We'll call this class Packt\Chp8\DSL\CompilingExpressionBuilder:

<?php 
namespace Packt\Chp8\DSL; 
 
use Packt\Chp8\DSL\AST\Expression; 
 
class CompilingExpressionBuilder 
{ 
    /** @var string */ 
    private $cacheDir; 
    /** 
     * @var ExpressionBuilder 
     */ 
    private $inner; 
 
    public function __construct(ExpressionBuilder $inner, string $cacheDir) 
    { 
        $this->cacheDir = $cacheDir; 
        $this->inner = $inner; 
    } 
} 

As we don't want to re-implement the ExpressionBuilder's parsing logic, this class takes an instance of ExpressionBuilder as a dependency. When a new expression is parsed that is not yet present in the cache, this inner expression builder will be used to actually parse this expression.

Let's continue by adding a parseExpression method to this class:

public function parseExpression(string $expr): Expression 
{ 
    $cacheKey = sha1($expr); 
    $cacheFile = $this->cacheDir . '/' . $cacheKey . '.php'; 
    if (file_exists($cacheFile)) { 
        return include($cacheFile); 
    } 
 
    $expr = $this->inner->parseExpression($expr); 
 
    if (!is_dir($this->cacheDir)) { 
        mkdir($this->cacheDir, 0755, true); 
    } 
 
    file_put_contents($cacheFile, '<?php return new class implements '.Expression::class.' { 
        public function evaluate(array $variables=[]) { 
            return ' . $expr->compile() . '; 
        } 
             
        public function compile(): string { 
            return ' . var_export($expr->compile(), true) . '; 
        } 
    };'); 
    return $expr; 
} 

Let's have a look at what happens in this method: first, the input string is used to calculate a hash value that identifies this expression. If a file with this name exists in the cache directory, it is included as a PHP file, and the file's return value is returned as the method's return value:

$cacheKey = sha1($expr); 
$cacheFile = $this->cacheDir . '/' . $cacheKey . '.php'; 
if (file_exists($cacheFile)) { 
    return include($cacheFile); 
} 

As the method's return type declaration specifies that the method needs to return an instance of the Packt\Chp8\DSL\AST\Expression interface, the generated cache files also need to return an instance of this interface.

If no compiled version of the expression can be found, the expression is parsed as usual by the inner expression builder. This expression is then compiled to a PHP expression using the compile() method. This PHP code snippet is then used to write the actual cache file. In this file, we're creating a new anonymous class that implements the Expression interface and whose evaluate() method contains the compiled expression.
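The write-then-include round trip can be tried in isolation with the following minimal sketch. It assumes a non-namespaced stand-in for the Expression interface, a temporary directory as the cache location, and a hand-written string in place of the output of compile(); none of these are part of the chapter's actual classes:

```php
<?php
// Simplified, non-namespaced stand-in for the chapter's Expression interface.
interface Expression
{
    public function evaluate(array $variables = []);
    public function compile(): string;
}

// Assumption: a throwaway cache directory below the system temp directory.
$cacheDir = sys_get_temp_dir() . '/expr-cache-demo-' . uniqid();
mkdir($cacheDir, 0755, true);

$source = '2 + 3 * a';
// Hand-written stand-in for what $expr->compile() might return.
$compiled = '(2) + ((3) * ($variables[\'a\']))';

$cacheFile = $cacheDir . '/' . sha1($source) . '.php';

// Write a PHP file that returns an anonymous class instance.
file_put_contents($cacheFile, '<?php return new class implements Expression {
    public function evaluate(array $variables = []) {
        return ' . $compiled . ';
    }

    public function compile(): string {
        return ' . var_export($compiled, true) . ';
    }
};');

// On a later lookup, including the file yields a ready-to-use expression.
$expr = include $cacheFile;

var_dump($expr instanceof Expression);     // bool(true)
echo $expr->evaluate(['a' => 5]), PHP_EOL; // 17

// Clean up the demo cache.
unlink($cacheFile);
rmdir($cacheDir);
```

Using var_export() for the compile() method's return value takes care of escaping quotes and backslashes inside the generated string literal.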

Tip

Anonymous classes are a feature added in PHP 7. This feature allows you to create objects that implement an interface or extend an existing class without needing to explicitly define a named class for this. Syntactically, this feature can be used as follows:

$a = new class implements SomeInterface {
    public function test() {
        echo 'Hello';
    }
};
$a->test();

This means that the foo.bar * 3 expression would create a cache file with the following PHP code as its contents:

<?php 
return new class implements Packt\Chp8\DSL\AST\Expression 
{ 
    public function evaluate(array $variables = []) 
    { 
        return (Packt\Chp8\DSL\AST\PropertyFetch::evaluateStatic( 
            (new Packt\Chp8\DSL\AST\NamedVariable('foo'))->evaluate($variables), 
            'bar' 
        )) * (3); 
    } 
 
    public function compile(): string 
    { 
        return '(Packt\\Chp8\\DSL\\AST\\PropertyFetch::evaluateStatic((new Packt\\Chp8\\DSL\\AST\\NamedVariable(\'foo\'))->evaluate($variables), \'bar\')) * (3)'; 
    } 
}; 

Interestingly, the PHP interpreter itself works much the same way. Before actually executing PHP code, the PHP interpreter compiles the code into an intermediate representation, or bytecode, which is then interpreted by the actual interpreter. In order to not parse the PHP source code over and over again, the compiled bytecode is cached; this is how PHP's opcode cache works.

As we're saving our compiled expressions as PHP code, these will also be compiled into PHP bytecode and cached in the opcode cache, which improves performance even further. For example, the previous cached expression's evaluate() method compiles to the following PHP bytecode:

[Figure: The PHP bytecode generated by the PHP interpreter]
