User-defined operators are similar to the C++ concept of operator overloading, where the default definition of an operator is altered so that it can operate on a wide variety of objects. Operators are typically unary or binary. Binary operator overloading is easier to implement with the existing infrastructure, whereas unary operators need some additional handling code. Binary operator overloading is defined first, and unary operator overloading is looked into afterward.
The first part is to define a binary operator for overloading. The logical OR operator (|) is a good example to start with. The | operator in our TOY language can be used as follows:
def binary | (LHS RHS) if LHS then 1 else if RHS then 1 else 0;
As seen in the preceding code, if either the LHS or the RHS is not equal to 0, then we return 1. If both the LHS and the RHS are 0, then we return 0.
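As a point of comparison, the same decision logic can be written as an ordinary C++ function. This is only an illustrative sketch: the name toy_or is hypothetical and does not appear in toy.cpp.

```cpp
// Hypothetical C++ counterpart of the TOY definition
//   def binary | (LHS RHS) if LHS then 1 else if RHS then 1 else 0;
// toy_or is an illustrative name, not part of toy.cpp.
int toy_or(int lhs, int rhs) {
    if (lhs)        // LHS is nonzero
        return 1;
    else if (rhs)   // LHS is zero, RHS is nonzero
        return 1;
    else            // both operands are zero
        return 0;
}
```

Any nonzero operand, including a negative one, counts as true here, which matches the "not equal to 0" wording above.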
Do the following steps:
Define an enum state for the binary operator and return it on encountering the binary keyword:
    enum Token_Type {
      …
      …
      BINARY_TOKEN
    };
    static int get_token() {
      …
      …
      if (Identifier_string == "in") return IN_TOKEN;
      if (Identifier_string == "binary") return BINARY_TOKEN;
      …
      …
    }
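The keyword check above can be tried in isolation with a small standalone lexer. The following is a minimal sketch under stated assumptions: lex_one, the numeric token values, and the string-based input are all hypothetical and only mimic the shape of get_token(); the real lexer in toy.cpp may read its input differently.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Token values are assumptions chosen for this sketch only.
enum Token_Type {
    EOF_TOKEN = -1,
    IDENTIFIER_TOKEN = -2,
    IN_TOKEN = -3,
    BINARY_TOKEN = -4
};

// Hypothetical helper: reads one token from src starting at pos.
// Keywords get their own token; other names are plain identifiers;
// any other character is returned as its ASCII value.
int lex_one(const std::string &src, std::size_t &pos, std::string &identifier) {
    while (pos < src.size() &&
           std::isspace(static_cast<unsigned char>(src[pos])))
        ++pos;
    if (pos >= src.size())
        return EOF_TOKEN;

    unsigned char c = static_cast<unsigned char>(src[pos]);
    if (std::isalpha(c)) {
        identifier.clear();
        while (pos < src.size() &&
               std::isalnum(static_cast<unsigned char>(src[pos])))
            identifier += src[pos++];
        if (identifier == "in") return IN_TOKEN;
        if (identifier == "binary") return BINARY_TOKEN;
        return IDENTIFIER_TOKEN;
    }
    ++pos;
    return c;  // for example '|' or '('
}
```

Because the operator character is returned as its own ASCII value, both "binary|" and "binary |" lex as BINARY_TOKEN followed by '|'.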
Extend the function declaration AST with a flag that marks operator definitions and a field that holds the precedence:
    class FunctionDeclAST {
      std::string Func_Name;
      std::vector<std::string> Arguments;
      bool isOperator;
      unsigned Precedence;
    public:
      FunctionDeclAST(const std::string &name, const std::vector<std::string> &args,
                      bool isoperator = false, unsigned prec = 0)
          : Func_Name(name), Arguments(args), isOperator(isoperator), Precedence(prec) {}
      bool isUnaryOp() const { return isOperator && Arguments.size() == 1; }
      bool isBinaryOp() const { return isOperator && Arguments.size() == 2; }
      // The operator character is the last character of the name, for example "binary|".
      char getOperatorName() const {
        assert(isUnaryOp() || isBinaryOp());
        return Func_Name[Func_Name.size() - 1];
      }
      unsigned getBinaryPrecedence() const { return Precedence; }
      Function *Codegen();
    };
Modify the function declaration parser so that it accepts operator definitions and an optional precedence:
    static FunctionDeclAST *func_decl_parser() {
      std::string FnName;
      unsigned Kind = 0;               // 0 = identifier, 1 = unary, 2 = binary
      unsigned BinaryPrecedence = 30;  // default when no precedence is given
      switch (Current_token) {
        default:
          return 0;
        case IDENTIFIER_TOKEN:
          FnName = Identifier_string;
          Kind = 0;
          next_token();
          break;
        case UNARY_TOKEN:
          next_token();
          if (!isascii(Current_token)) return 0;
          FnName = "unary";
          FnName += (char)Current_token;
          Kind = 1;
          next_token();
          break;
        case BINARY_TOKEN:
          next_token();
          if (!isascii(Current_token)) return 0;
          FnName = "binary";
          FnName += (char)Current_token;
          Kind = 2;
          next_token();
          // An optional precedence in the range 1 to 100 may follow the operator.
          if (Current_token == NUMERIC_TOKEN) {
            if (Numeric_Val < 1 || Numeric_Val > 100) return 0;
            BinaryPrecedence = (unsigned)Numeric_Val;
            next_token();
          }
          break;
      }
      if (Current_token != '(') return 0;
      std::vector<std::string> Function_Argument_Names;
      while (next_token() == IDENTIFIER_TOKEN)
        Function_Argument_Names.push_back(Identifier_string);
      if (Current_token != ')') return 0;
      next_token();
      // An operator must take exactly as many arguments as its kind requires.
      if (Kind && Function_Argument_Names.size() != Kind) return 0;
      return new FunctionDeclAST(FnName, Function_Argument_Names, Kind != 0, BinaryPrecedence);
    }
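The binary branch of the parser above can be approximated by a standalone function that works directly on a string. This is a rough sketch, not the recipe's parser: Prototype and parse_binary_prototype are hypothetical names, the input is assumed to start at the binary keyword (after def has been consumed), and the operator is assumed to be a punctuation character.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct Prototype {
    std::string name;               // for example "binary|"
    std::vector<std::string> args;  // argument names
    unsigned precedence;            // 1..100, default 30
};

static void skip_spaces(const std::string &s, std::size_t &i) {
    while (i < s.size() && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
}

// Parses text such as "binary| 5 (LHS RHS)". Returns false on any error.
bool parse_binary_prototype(const std::string &s, Prototype &out) {
    std::size_t i = 0;
    skip_spaces(s, i);
    if (s.compare(i, 6, "binary") != 0) return false;
    i += 6;
    skip_spaces(s, i);
    if (i >= s.size() || !std::ispunct(static_cast<unsigned char>(s[i])))
        return false;
    out.name = std::string("binary") + s[i++];
    out.args.clear();
    out.precedence = 30;  // same default as BinaryPrecedence above

    skip_spaces(s, i);
    if (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) {
        unsigned value = 0;
        while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) {
            value = value * 10 + static_cast<unsigned>(s[i++] - '0');
            if (value > 100) return false;  // above the allowed range
        }
        if (value < 1) return false;        // below the allowed range
        out.precedence = value;
        skip_spaces(s, i);
    }

    if (i >= s.size() || s[i] != '(') return false;
    ++i;
    for (;;) {
        skip_spaces(s, i);
        if (i >= s.size()) return false;    // missing ')'
        if (s[i] == ')') break;
        if (!std::isalpha(static_cast<unsigned char>(s[i]))) return false;
        std::string arg;
        while (i < s.size() && std::isalnum(static_cast<unsigned char>(s[i])))
            arg += s[i++];
        out.args.push_back(arg);
    }
    return out.args.size() == 2;  // a binary operator takes two arguments
}
```

As in func_decl_parser(), the precedence is optional, so "binary | (LHS RHS)" is accepted with the default of 30, while a one-argument or out-of-range definition is rejected.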
Modify the Codegen() function for the binary AST so that any operator that is not built in is dispatched to the user-defined function:
    Value* BinaryAST::Codegen() {
      Value* L = LHS->Codegen();
      Value* R = RHS->Codegen();
      switch(Bin_Operator) {
        case '+' : return Builder.CreateAdd(L, R, "addtmp");
        case '-' : return Builder.CreateSub(L, R, "subtmp");
        case '*': return Builder.CreateMul(L, R, "multmp");
        case '/': return Builder.CreateUDiv(L, R, "divtmp");
        case '<' :
          L = Builder.CreateICmpULT(L, R, "cmptmp");
          // Widen the i1 comparison result to the language's i32 type.
          return Builder.CreateZExt(L, Type::getInt32Ty(getGlobalContext()), "booltmp");
        default : break;
      }
      // Not a built-in operator: look up the user-defined function named "binary<op>".
      Function *F = TheModule->getFunction(std::string("binary") + (char)Bin_Operator);
      if (!F) return 0;  // no such operator has been defined
      Value *Ops[2] = { L, R };
      return Builder.CreateCall(F, Ops, "binop");
    }
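The fallback at the end of BinaryAST::Codegen() follows a simple pattern: handle the built-in operators directly, and look everything else up by the name "binary" plus the operator character. The sketch below imitates that pattern on plain integers instead of emitting IR. The user_operators table is a hypothetical stand-in for the module's function lookup, apply_binary is an illustrative name, and signed int arithmetic is a simplification of the unsigned operations used in the recipe.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

using BinaryFn = std::function<int(int, int)>;

// Hypothetical stand-in for the functions registered in the module.
std::map<std::string, BinaryFn> user_operators;

// Evaluates lhs <op> rhs, falling back to a user-defined operator.
int apply_binary(char op, int lhs, int rhs) {
    switch (op) {
    case '+': return lhs + rhs;
    case '-': return lhs - rhs;
    case '*': return lhs * rhs;
    case '<': return lhs < rhs ? 1 : 0;
    default: break;  // not built in
    }
    // Same naming scheme as the recipe: "binary" followed by the operator.
    auto it = user_operators.find(std::string("binary") + op);
    if (it == user_operators.end())
        throw std::runtime_error("unknown binary operator");
    return it->second(lhs, rhs);
}
```

After registering user_operators["binary|"] with a function that returns 1 when either argument is nonzero, apply_binary('|', 0, 7) takes the fallback path, while apply_binary('+', 2, 3) never consults the table.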
Modify Codegen() for the function definition so that a binary operator registers its precedence before its body is generated:
    Function* FunctionDefnAST::Codegen() {
      Named_Values.clear();
      Function *TheFunction = Func_Decl->Codegen();
      if (!TheFunction) return 0;
      // Make the new operator known to the expression parser.
      if (Func_Decl->isBinaryOp())
        Operator_Precedence[Func_Decl->getOperatorName()] = Func_Decl->getBinaryPrecedence();
      BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
      Builder.SetInsertPoint(BB);
      if (Value* Return_Value = Body->Codegen()) {
        Builder.CreateRet(Return_Value);
    … …
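Registering the precedence matters because the expression parser consults the precedence table to decide how operands are grouped. The following standalone sketch shows the effect with a small precedence-climbing evaluator over integers. Everything here is an assumption made for illustration: the evaluate function, the MiniParser type, the precedence values, and the input format (unsigned numbers and single-character operators with no spaces) are not taken from toy.cpp.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical evaluator; only '+', '-', '*' and '|' are supported.
struct MiniParser {
    const std::string &src;
    const std::map<char, int> &precedence;
    std::size_t pos;

    MiniParser(const std::string &s, const std::map<char, int> &p)
        : src(s), precedence(p), pos(0) {}

    // Precedence of the operator at pos, or -1 if there is none.
    int peek_precedence() const {
        if (pos >= src.size()) return -1;
        std::map<char, int>::const_iterator it = precedence.find(src[pos]);
        return it == precedence.end() ? -1 : it->second;
    }

    int parse_number() {
        int value = 0;
        while (pos < src.size() &&
               std::isdigit(static_cast<unsigned char>(src[pos])))
            value = value * 10 + (src[pos++] - '0');
        return value;
    }

    static int apply(char op, int l, int r) {
        switch (op) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        default:  return (l || r) ? 1 : 0;  // '|'
        }
    }

    // Consumes operators whose precedence is at least min_prec.
    int parse_rhs(int min_prec, int lhs) {
        for (;;) {
            int prec = peek_precedence();
            if (prec < min_prec) return lhs;
            char op = src[pos++];
            int rhs = parse_number();
            // A tighter-binding operator on the right takes rhs first.
            if (prec < peek_precedence())
                rhs = parse_rhs(prec + 1, rhs);
            lhs = apply(op, lhs, rhs);
        }
    }
};

int evaluate(const std::string &src, const std::map<char, int> &precedence) {
    MiniParser parser(src, precedence);
    int first = parser.parse_number();
    return parser.parse_rhs(0, first);
}
```

With '+' at precedence 20 and '|' registered at 5, the input "0|1+2" groups as 0|(1+2). Registering '|' at 50 instead groups the same input as (0|1)+2, so the value stored in the table changes the result.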
Do the following steps:
Compile the toy.cpp file:
    $ g++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core` -O3 -o toy
Open an example file and add the operator definition to it:
    $ vi example
    def binary| 5 (LHS RHS) if LHS then 1 else if RHS then 1 else 0;
Run the toy compiler on the example file:
    $ ./toy example
    output :
    ; ModuleID = 'my compiler'
    target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"

    define i32 @"binary|"(i32 %LHS, i32 %RHS) {
    entry:
      %ifcond = icmp eq i32 %LHS, 0
      %ifcond1 = icmp eq i32 %RHS, 0
      %. = select i1 %ifcond1, i32 0, i32 1
      %iftmp5 = select i1 %ifcond, i32 %., i32 1
      ret i32 %iftmp5
    }
The binary operator we just defined is parsed along with its definition. Whenever the | binary operator is encountered, the LHS and RHS are evaluated and the definition body is executed, giving the result specified by the definition. In the preceding example, if either the LHS or the RHS is nonzero, then the result is 1. If both the LHS and the RHS are zero, then the result is 0.
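To check that the generated IR matches this description, each instruction of the binary| function can be transcribed into C++. This is a hand translation for illustration only: binary_or_ir and the variable dot (standing for the IR value named %.) are hypothetical names.

```cpp
// Line-by-line C++ transcription of the IR emitted for "binary|".
int binary_or_ir(int LHS, int RHS) {
    bool ifcond  = (LHS == 0);         // %ifcond  = icmp eq i32 %LHS, 0
    bool ifcond1 = (RHS == 0);         // %ifcond1 = icmp eq i32 %RHS, 0
    int dot      = ifcond1 ? 0 : 1;    // %. = select i1 %ifcond1, i32 0, i32 1
    int iftmp5   = ifcond ? dot : 1;   // %iftmp5 = select i1 %ifcond, i32 %., i32 1
    return iftmp5;                     // ret i32 %iftmp5
}
```

The two select instructions replace the branches of the nested if/then/else: when LHS is zero the result depends only on RHS, and otherwise it is 1.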