Exploration of LLVM: CPP to LLVM to CPP LLVM API code
Jan 5, 2016

So today I met my advisor for my Bachelor thesis at TU Berlin. In the coming semester break I will start to work on the topic of Compiling Database Operators to efficient LLVM assembly code. The thesis is mainly based on Thomas Neumann's paper "Efficiently Compiling Efficient Query Plans for Modern Hardware". He claims that the code produced by compiling queries into machine code with the optimising LLVM compiler is faster than handwritten C++ code.

I'll be working on CoGaDB. It already generates custom C++ query execution code for SQL queries. My goal is to implement matching LLVM IR code blocks, together with a compiler for them, and to compare the result against the aforementioned C++ query compiler.

I would like to share my exploration of LLVM with you. So let's start with something simple:

I learned that you can create LLVM API C++ code from your existing C++ code. Imagine we have a small file called hello.cpp with the following content:

#include <string>

using namespace std;

string hello() {
    return "Hello World";
}

We can compile this to LLVM bitcode using clang:

clang -c -emit-llvm hello.cpp -o hello.bc

Now we can transform the LLVM bitcode back to LLVM API C++ code using llc:

llc -march=cpp hello.bc -o hello_llvm.cpp

Et voilà, there is our code that generates the LLVM IR for our hello.cpp: hello_llvm.cpp contains the LLVM C++ API calls that rebuild the same IR. This means that to write LLVM IR code I don't even have to understand it (at least not from the beginning).
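If the generated file looks larger than expected for such a tiny function, the std::string return type is a likely reason: the function has to construct a string object, so the IR also contains a constructor call and the declarations that go with it. As a rough sketch for comparison, here is the original function next to a C-style variant that only returns a pointer to the string literal. The name hello_c and the extern "C" linkage are my assumptions for this illustration and are not part of the original hello.cpp.

```cpp
#include <string>

// The original function from hello.cpp: returning std::string means the
// compiler has to emit code that constructs a string object.
std::string hello() {
    return "Hello World";
}

// Hypothetical C-style variant (hello_c is an assumed name for this sketch).
// extern "C" switches off C++ name mangling, so the symbol appears as plain
// "hello_c" in the emitted IR.
extern "C" const char* hello_c() {
    // A string literal has static storage duration, so handing out a
    // pointer to it is safe.
    return "Hello World";
}
```

Running this file through the same clang and llc commands as above should show a noticeably shorter body for hello_c than for hello, which makes it an easier starting point for reading the generated API code.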

One more thing: I discovered that the LLVM code as well as the LLVM API C++ code for my little function is about half the size if I add -O2 to the clang command. In one of the following notes we will find out why this is the case and what is happening there.