Today I met my advisor for my Bachelor's thesis at TU Berlin. During the coming semester break I will start working on the topic of Compiling Database Operators to Efficient LLVM Assembly Code. The thesis is mainly based on Thomas Neumann's paper Efficiently Compiling Efficient Query Plans for Modern Hardware. He reports that the machine code obtained by compiling queries with the optimising LLVM compiler frequently rivals the performance of handwritten C++ code.
I'll be working on CoGaDB. It already generates custom C++ query execution code for SQL queries. My goal is to implement matching LLVM IR code blocks and the corresponding compiler, and to compare the result against the aforementioned C++ query compiler.
I would like to share my exploration of LLVM with you. So let's start with something simple:
I learned that you can generate LLVM API C++ code from your existing C++ code. Imagine we have a small file called hello.cpp. It is defined in the following code:
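A minimal stand-in for such a file, assuming it contains nothing but one small function, could look like this. The function name `sum_up_to` and its body are purely illustrative assumptions; any short function serves the same purpose.

```cpp
// hello.cpp (hypothetical stand-in; function name and body are assumptions)

// A deliberately tiny function: one loop and one accumulator, so the
// LLVM IR generated from it stays short enough to read in full.
int sum_up_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;  // add each integer from 1 to n
    }
    return total;
}
```

The sketch has no `main` on purpose: the file is only meant to be compiled to bitcode and inspected, not linked into a program.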
We can compile this to LLVM bitcode using clang with its -emit-llvm flag:
Now we can transform the LLVM bitcode back into LLVM API C++ code, using the C++ backend of llc (-march=cpp, shipped with older LLVM releases):
Et voilà, there is the C++ code that generates the LLVM IR for our hello.cpp. This means that to write LLVM IR code I don't even have to understand it (at least not at the beginning).
One more thing: I discovered that the LLVM IR as well as the LLVM API C++ code for my little function is about half the size if I add -O2 to the clang command. We will find out in one of the following notes why this is the case and what is happening there.