Handling user-defined operators – binary operators

User-defined operators are similar to the C++ concept of operator overloading, where the default definition of an operator is altered so that it can operate on a wide variety of objects. Operators are typically either unary or binary. Binary operator overloading is easier to implement with the existing infrastructure, whereas unary operators need some additional code. Binary operator overloading is defined first, and unary operator overloading is looked into after that.

Getting ready

The first part is to define a binary operator for overloading. The logical OR operator (|) is a good example to start with. The | operator in our TOY language can be used as follows:

    def binary | (LHS RHS)
      if LHS then
        1
      else if RHS then
        1
      else
        0;

As seen in the preceding code, if either the LHS or the RHS is not equal to 0, then we return 1. If both the LHS and the RHS are 0, then we return 0.
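For comparison with the C++ operator overloading mentioned earlier, the following is a rough C++ analogue of the TOY definition above. It is only a sketch: the ToyInt wrapper type is a hypothetical name introduced here for illustration and is not part of the TOY compiler.

```cpp
// Hypothetical wrapper type standing in for a TOY integer value.
struct ToyInt {
  int value;
};

// C++ analogue of the TOY definition `def binary | (LHS RHS)`:
// the result is 1 if either operand is nonzero, and 0 otherwise.
ToyInt operator|(ToyInt lhs, ToyInt rhs) {
  if (lhs.value != 0)
    return ToyInt{1};
  else if (rhs.value != 0)
    return ToyInt{1};
  else
    return ToyInt{0};
}
```

With this definition, an expression such as ToyInt{3} | ToyInt{0} evaluates to a ToyInt holding 1, mirroring what the TOY operator is meant to return.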

How to do it...

Do the following steps:

  1. The first step, as usual, is to add an enum value for the binary operator to the token types and to return it when the lexer encounters the binary keyword:
    enum Token_Type {
      …
      …
      BINARY_TOKEN
    };
    static int get_token() {
      …
      …
      if (Identifier_string == "in") return IN_TOKEN;
      if (Identifier_string == "binary") return BINARY_TOKEN;
      …
      …
    }
  2. The next step is to represent the operator in the AST. Note that no new AST class needs to be defined; the function declaration AST can handle it. We just need to modify it by adding a flag that records whether the declaration is an operator and, for a binary operator, a field that stores its precedence:
    class FunctionDeclAST {
      std::string Func_Name;
      std::vector<std::string> Arguments;
      bool isOperator;
      unsigned Precedence;
    public:
      FunctionDeclAST(const std::string &name, const std::vector<std::string> &args,
                   bool isoperator = false, unsigned prec = 0)
          : Func_Name(name), Arguments(args), isOperator(isoperator), Precedence(prec) {}
    
      bool isUnaryOp() const { return isOperator && Arguments.size() == 1; }
      bool isBinaryOp() const { return isOperator && Arguments.size() == 2; }
    
      char getOperatorName() const {
        assert(isUnaryOp() || isBinaryOp());
        return Func_Name[Func_Name.size() - 1];
      }
    
      unsigned getBinaryPrecedence() const { return Precedence; }
    
      Function *Codegen();
    };
  3. Once the modified AST is ready, the next step is to modify the parser of the function declaration:
    static FunctionDeclAST *func_decl_parser() {
      std::string FnName;
    
      unsigned Kind = 0;
      unsigned BinaryPrecedence = 30;
    
      switch (Current_token) {
      default:
        return 0;
      case IDENTIFIER_TOKEN:
        FnName = Identifier_string;
        Kind = 0;
        next_token();
        break;
      case UNARY_TOKEN:
        next_token();
        if (!isascii(Current_token))
          return 0;
        FnName = "unary";
        FnName += (char)Current_token;
        Kind = 1;
        next_token();
        break;
      case BINARY_TOKEN:
        next_token();
        if (!isascii(Current_token))
          return 0;
        FnName = "binary";
        FnName += (char)Current_token;
        Kind = 2;
        next_token();
    
        if (Current_token == NUMERIC_TOKEN) {
          if (Numeric_Val < 1 || Numeric_Val > 100)
            return 0;
          BinaryPrecedence = (unsigned)Numeric_Val;
          next_token();
        }
        break;
      }
    
      if (Current_token != '(')
        return 0;
    
      std::vector<std::string> Function_Argument_Names;
      while (next_token() == IDENTIFIER_TOKEN)
        Function_Argument_Names.push_back(Identifier_string);
      if (Current_token != ')')
        return 0;
    
      next_token();
    
      if (Kind && Function_Argument_Names.size() != Kind)
        return 0;
    
      return new FunctionDeclAST(FnName, Function_Argument_Names, Kind != 0, BinaryPrecedence);
    }
  4. Then we modify the Codegen() function for the binary AST:
    Value *BinaryAST::Codegen() {
      Value *L = LHS->Codegen();
      Value *R = RHS->Codegen();
      if (!L || !R)
        return 0;

      switch (Bin_Operator) {
      case '+': return Builder.CreateAdd(L, R, "addtmp");
      case '-': return Builder.CreateSub(L, R, "subtmp");
      case '*': return Builder.CreateMul(L, R, "multmp");
      case '/': return Builder.CreateUDiv(L, R, "divtmp");
      case '<':
        L = Builder.CreateICmpULT(L, R, "cmptmp");
        // Widen the i1 comparison result to the i32 type used for TOY values.
        return Builder.CreateZExt(L, Type::getInt32Ty(getGlobalContext()), "booltmp");
      default:
        break;
      }

      // Not a built-in operator: call the user-defined function "binary<op>".
      Function *F = TheModule->getFunction(std::string("binary") + Bin_Operator);
      if (!F)
        return 0;
      Value *Ops[2] = { L, R };
      return Builder.CreateCall(F, Ops, "binop");
    }
  5. Next, we modify the code generation for the function definition so that, when the function is a binary operator, its precedence is registered in the precedence table:
    Function *FunctionDefnAST::Codegen() {
      Named_Values.clear();
      Function *TheFunction = Func_Decl->Codegen();
      if (!TheFunction)
        return 0;
      if (Func_Decl->isBinaryOp())
        Operator_Precedence[Func_Decl->getOperatorName()] = Func_Decl->getBinaryPrecedence();
      BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
      Builder.SetInsertPoint(BB);
      if (Value *Return_Value = Body->Codegen()) {
        Builder.CreateRet(Return_Value);
    …
    …
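The interplay of the preceding steps can be sketched without LLVM. The following standalone C++ model is only an illustration under stated assumptions: Function_Table, binary_function_name(), define_binary_operator(), and apply_binary() are hypothetical names introduced here, and plain unsigned arithmetic stands in for the generated IR. It shows the three ideas the steps rely on: the function name is formed as "binary" plus the operator character, the definition installs the operator's precedence, and any operator that is not built in falls back to a lookup of the user-defined function.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the recipe's global state (these are not LLVM objects).
static std::map<char, int> Operator_Precedence;
static std::map<std::string, std::function<unsigned(unsigned, unsigned)>> Function_Table;

// Mirrors the naming in func_decl_parser(): "binary" followed by the operator character.
std::string binary_function_name(char op) {
  return std::string("binary") + op;
}

// Mirrors the effect of FunctionDefnAST::Codegen() for a binary operator:
// the body becomes callable by name and the precedence is registered.
void define_binary_operator(char op, int precedence,
                            std::function<unsigned(unsigned, unsigned)> body) {
  Function_Table[binary_function_name(op)] = std::move(body);
  Operator_Precedence[op] = precedence;
}

// Mirrors the dispatch in BinaryAST::Codegen(): built-in operators first,
// then a lookup of the user-defined function. Returns false on failure.
bool apply_binary(char op, unsigned lhs, unsigned rhs, unsigned &result) {
  switch (op) {
  case '+': result = lhs + rhs; return true;
  case '-': result = lhs - rhs; return true;
  case '*': result = lhs * rhs; return true;
  case '/':
    if (rhs == 0)
      return false; // this model refuses to divide by zero
    result = lhs / rhs;
    return true;
  case '<': result = lhs < rhs ? 1u : 0u; return true;
  default:
    break;
  }
  auto it = Function_Table.find(binary_function_name(op));
  if (it == Function_Table.end())
    return false; // unknown operator: nothing named "binary<op>" was defined
  result = it->second(lhs, rhs);
  return true;
}
```

For example, after calling define_binary_operator('|', 5, ...) with a body that returns 1 when either argument is nonzero, apply_binary('|', 0, 7, out) succeeds through the fallback path, while an operator that was never defined, such as '&', makes apply_binary() return false.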

How it works...

Do the following steps:

  1. Compile the toy.cpp file:
    $ g++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core ` -O3 -o toy
    
  2. Open an example file:
    $ vi example
    
  3. Write the following binary operator overloading code in the example file:
    def binary| 5 (LHS RHS)
      if LHS then
        1
      else if RHS then
        1
      else
        0;
  4. Compile the example file with the TOY compiler:
    $ ./toy example
    
    Output:
    
    ; ModuleID = 'my compiler'
    target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
    
    define i32 @"binary|"(i32 %LHS, i32 %RHS) {
    entry:
      %ifcond = icmp eq i32 %LHS, 0
      %ifcond1 = icmp eq i32 %RHS, 0
      %. = select i1 %ifcond1, i32 0, i32 1
      %iftmp5 = select i1 %ifcond, i32 %., i32 1
      ret i32 %iftmp5
    }
    

The parser recognizes the declaration of the binary operator we just defined, together with its body, and the definition is compiled into an ordinary function named binary|, as the preceding output shows. Whenever the | binary operator is encountered in an expression, a call to this function is generated with the LHS and RHS as arguments, and the body of the definition produces the result. In the preceding example, if either the LHS or the RHS is nonzero, then the result is 1. If both the LHS and the RHS are zero, then the result is 0.
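The precedence value given in the definition (5 in the example) matters once the operator appears next to other operators. The expression parser is not shown in this recipe, so the following is only a minimal sketch, assuming an operator-precedence parser driven by a precedence table. All names here (Precedence, precedence_of(), apply(), parse_primary(), parse_binary_rhs(), evaluate()) are hypothetical, operands are restricted to single digits, and the precedence values for + and * are illustrative rather than taken from the TOY compiler.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Assumed precedence table; the values for '+' and '*' are illustrative only.
static std::map<char, int> Precedence = {{'+', 20}, {'*', 40}};

// Returns the precedence of op, or -1 if op is not a known binary operator.
int precedence_of(char op) {
  auto it = Precedence.find(op);
  return it == Precedence.end() ? -1 : it->second;
}

// Applies one operator; '|' models the user-defined binary| from the recipe.
int apply(char op, int lhs, int rhs) {
  switch (op) {
  case '+': return lhs + rhs;
  case '*': return lhs * rhs;
  case '|': return (lhs != 0 || rhs != 0) ? 1 : 0;
  default:  return 0;
  }
}

// Parses a single decimal digit as a primary expression (0 if none is present).
int parse_primary(const std::string &src, std::size_t &pos) {
  int value = 0;
  if (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
    value = src[pos++] - '0';
  return value;
}

// Folds every operator whose precedence is at least min_prec into lhs.
int parse_binary_rhs(const std::string &src, std::size_t &pos, int min_prec, int lhs) {
  while (pos < src.size()) {
    char op = src[pos];
    int prec = precedence_of(op);
    if (prec < min_prec)
      return lhs; // also stops at unknown operators, whose precedence is -1
    ++pos;
    int rhs = parse_primary(src, pos);
    // If the next operator binds more tightly, it takes rhs as its left operand first.
    if (pos < src.size() && precedence_of(src[pos]) > prec)
      rhs = parse_binary_rhs(src, pos, prec + 1, rhs);
    lhs = apply(op, lhs, rhs);
  }
  return lhs;
}

// Evaluates an expression such as "1|1+1" (single digits, no spaces).
int evaluate(const std::string &src) {
  std::size_t pos = 0;
  int lhs = parse_primary(src, pos);
  return parse_binary_rhs(src, pos, 0, lhs);
}
```

In this model, registering the operator with Precedence['|'] = 5 makes | bind more loosely than +, so "1|1+1" groups as 1 | (1 + 1). Registering it with a value above that of + would instead group the same text as (1 | 1) + 1. Before the operator is registered at all, parsing simply stops at the | character.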
