Performance Benefits of Dynamic Compilation
Dynamic compilation is the process of compiling code at runtime. The input may be intermediate language code, or even binary code that is re-optimized at runtime. Dynamic compilation of intermediate code is the technique used to run Java and .NET code.
In this article I will discuss the performance benefits of dynamic compilation in general, with some focus on dynamic compilation on the .NET platform.
Optimization Based on User Input or Environment Variables
First let’s start with a short introduction to function inlining.
When a function is inlined, the compiler will insert the complete body of the function in every place in the code where that function is used. Take the following code as an example:
public void Caller()
{
    double firstNumber = 5.6;
    double secondNumber = 9.4;
    double result = Multiply(firstNumber, secondNumber);
}

public double Multiply(double x, double y)
{
    return x * y;
}
If the function Multiply is inlined, the compiler output will be the same as if it had compiled this function:
public void Caller()
{
    double firstNumber = 5.6;
    double secondNumber = 9.4;
    double result = firstNumber * secondNumber;
}
Inlining eliminates the time overhead of a function call, but it can increase the size of the executable. Either the developer or the compiler therefore has to choose what to inline; you cannot inline every function call in your application.
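On the developer's side, this choice can be expressed in C# as a hint to the JIT compiler. The following is a minimal sketch, assuming a runtime that supports MethodImplOptions.AggressiveInlining (.NET Framework 4.5 or later); the class name InliningHints and the method MultiplyNotInlined are hypothetical. AggressiveInlining is only a request, and the JIT may still decide not to inline.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical helper class, used only to illustrate the attributes.
public static class InliningHints
{
    // Ask the JIT to inline this small method at its call sites if it can.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static double Multiply(double x, double y)
    {
        return x * y;
    }

    // Ask the JIT never to inline this method, for example to keep callers small.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static double MultiplyNotInlined(double x, double y)
    {
        return x * y;
    }
}
```

Both methods return the same result; the attributes only influence the code the JIT generates for their callers.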
Now let’s look at the following piece of code:
public void A()
{
    int x = GetInputFromUser();
    if (x < 10)
        B();
    else
        C();
}

public void B()
{
}

public void C()
{
}
A is a function that takes an integer from the user and, based on its value, calls either function B or function C. The question here is which function we should inline, B or C. We should inline the function that is more likely to be called, but here that depends on user input, which is not known at compile time.
Inlining is just one example of an optimization decision that could be based on user input or the execution environment. Many other optimization decisions are similar (loop unwinding, for example).
Profile Guided Optimization (PGO)
A solution to such problems is profile guided optimization. In this technique, a special version of the executable or DLL is built that contains probes. The user is asked to use this version for some time in the real, day-to-day scenarios expected for it. The probes collect data about how the application is used and build a profile (hence the name Profile Guided Optimization). This profile contains data about frequently used code paths, frequent values of variables, etc.
The profile data is then fed back to the compiler, which uses it to optimize the code and build the final optimized version of the executable. This allows the compiler to make optimization decisions based on user input or environment variables.
The compiler can make frequently used parts of the application run faster while keeping rarely run parts smaller in size, addressing the old tradeoff between size and speed.
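To make the idea of probes concrete, here is a hand-written sketch of what an instrumented build of function A might record. It is an illustration only: real PGO probes are inserted by the compiler, and the names BranchProfile, Hit and HottestPath are hypothetical. The input is passed as a parameter instead of being read from the user so that the sketch stays self-contained.

```csharp
using System.Collections.Generic;

// Hypothetical sketch of the data an instrumented build could collect.
public static class BranchProfile
{
    private static readonly Dictionary<string, long> Counts =
        new Dictionary<string, long>();

    // Probe: record one execution of the named code path.
    public static void Hit(string path)
    {
        long current;
        Counts.TryGetValue(path, out current); // current stays 0 if the path is new
        Counts[path] = current + 1;
    }

    // Return the most frequently executed path, or null if nothing ran yet.
    public static string HottestPath()
    {
        string best = null;
        long bestCount = 0;
        foreach (KeyValuePair<string, long> entry in Counts)
        {
            if (entry.Value > bestCount)
            {
                best = entry.Key;
                bestCount = entry.Value;
            }
        }
        return best;
    }

    // Instrumented version of A from the example above.
    public static void A(int x)
    {
        if (x < 10)
            Hit("B");   // stands in for the call to B()
        else
            Hit("C");   // stands in for the call to C()
    }
}
```

After a representative run, HottestPath() tells the optimizer which of B or C is the better inlining candidate for this usage pattern.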
PGO is supported in Visual C++ starting from Visual C++ 2005, and in GCC and other compilers. PGO was used to optimize SQL Server 2005.
But now another problem arises. Using PGO means that you have to build a profile for every user and execution environment targeted by your executable or DLL. When PGO was used to optimize SQL Server, it was optimized for the most common usage scenarios. A user with a different usage scenario will not benefit from this optimization. In some cases the code paths that were not optimized could even perform worse than plain code built without PGO (this depends on the compiler in the first place).
Dynamic Profile Guided Optimization
The solution to this problem is dynamic profile guided optimization, an optimization technique made possible by dynamic compilation. Here, profiling is done on the fly while the user is running the application. Hot code paths and frequent variable values are identified, and the code is dynamically recompiled into a more optimized version while the application keeps running.
One problem with dynamic profile guided optimization is that collecting profile data while the application runs could decrease performance, so this part of a dynamic compiler needs careful design. Multi core processors are common today and could allow the profiler to run on a separate core so that it doesn’t affect the performance of the application.
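The profile-then-recompile loop can be sketched at the level of delegates. This is only an analogy under stated assumptions: a real dynamic compiler works on machine code and collects much richer data, and the class TieredFunction and its call-count threshold are hypothetical. The sketch is also not thread safe.

```csharp
using System;

// Hypothetical sketch: count calls and switch to a specialized version
// once the function becomes "hot".
public sealed class TieredFunction
{
    private readonly Func<int, int> optimized;
    private readonly int threshold;
    private Func<int, int> current;
    private int calls;

    public TieredFunction(Func<int, int> baseline, Func<int, int> optimized, int threshold)
    {
        this.current = baseline;
        this.optimized = optimized;
        this.threshold = threshold;
    }

    // True once the optimized version has been swapped in.
    public bool IsOptimized { get; private set; }

    public int Invoke(int x)
    {
        // Profiling step: a call counter is the only data collected here.
        if (!IsOptimized && ++calls >= threshold)
        {
            current = optimized;   // stands in for recompiling the hot code
            IsOptimized = true;
        }
        return current(x);
    }
}
```

For example, new TieredFunction(x => x * 2, x => x << 1, 3) runs the baseline for the first two calls and the second delegate from the third call onward. Note that the counter itself adds work to every call until the switch happens, which is the profiling overhead discussed above.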
Machine Specific Compilation
Instruction sets supported by processors today can be divided into two main categories:
1-Standard or common instruction sets (x86 and x64), which are supported by most processors.
2-Processor specific instruction sets, which are supported by some processors or processor lines but not all (AMD’s 3DNow!, Intel’s SSE, etc).
Processor specific instruction sets could offer better performance, but a compiler that targets a wide range of users and execution environments cannot build executables that use them, because they are not supported by all processors.
Dynamic compilation allows the use of processor specific instructions without sacrificing the wide range of users. When compiling the application at runtime, the compiler checks the execution environment and generates code that uses the processor specific instructions supported by the processor running it.
For example, the .NET Just In Time (JIT) compiler uses SSE2 instructions (on machines that support the SSE2 instruction set) when casting from double to integer, which gave a 40x performance gain over using the ordinary x86 instruction set.
One problem with supporting processor specific instructions is choosing which instruction set to use. For a processor that supports multiple instruction sets, one instruction set could be good for one operation and bad for another. This is why the .NET JIT compiler uses machine specific instructions only in a small scope: an optimization technique called vectorizing optimization is needed to decide which instruction sets to use for a given operation, and vectorizing optimization is not supported in the .NET JIT compiler.
Compiling to processor specific instructions is an example of an optimization technique that takes hardware variations into consideration at compilation time. There are other similar machine based optimization techniques that dynamic compilation makes possible.
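The idea of picking instructions by inspecting the hardware at runtime can be sketched with the System.Numerics.Vector types. This assumes a newer runtime than the JIT versions discussed in this article, and it is explicit, hand-written SIMD code, not the automatic vectorizing optimization described above. The class name SimdSum is hypothetical.

```csharp
using System.Numerics;

// Hypothetical sketch: one code path, specialized by the JIT for the
// SIMD width of the machine it is running on.
public static class SimdSum
{
    public static double Sum(double[] values)
    {
        int i = 0;
        double total = 0.0;

        if (Vector.IsHardwareAccelerated && values.Length >= Vector<double>.Count)
        {
            Vector<double> acc = Vector<double>.Zero;
            int width = Vector<double>.Count;   // lanes per register on this machine
            for (; i <= values.Length - width; i += width)
            {
                acc += new Vector<double>(values, i);
            }
            // Add the lanes together to get one scalar.
            total = Vector.Dot(acc, Vector<double>.One);
        }

        // Scalar path: the whole array on unsupported hardware, or the leftover tail.
        for (; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }
}
```

The same compiled assembly can be shipped to every user; whether the vector path runs, and how wide it is, is decided on the machine that executes it.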
Other Performance Benefits
There are other benefits of dynamic compilation. For example, it allows optimization across dynamic binding. When binding to a DLL at runtime, the compiler has a bird’s eye view of the whole application and the DLLs it uses, and can optimize the whole application based on this view.
Dynamic compilation also means that you don’t have to build a new version of the executable to benefit from new optimization techniques in a new version of the compiler. When the runtime on the user’s machine is updated, all executables benefit from the features of the new compiler automatically.
Dynamic Compilation Performance Pitfall
The compilation process itself could represent a large performance overhead that could cancel out all the performance gains mentioned above. Some techniques can decrease this overhead, such as caching compiled code to avoid recompilation when the same code is executed multiple times. Dynamic optimization can also work on already compiled code, or even on code optimized in previous optimization iterations, to reduce the cost of compilation and optimization. Running the compiler and optimizer on a separate core (on multi core processors) is another technique to decrease the performance cost of dynamic compilation.
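Caching compiled code can be illustrated at the application level with expression trees, which compile code at runtime through Expression.Lambda(...).Compile(). This sketch is not how the JIT's own cache works; the class ScaleCompiler and its Compilations counter are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Threading;

// Hypothetical sketch: compile a small function at runtime once per
// distinct input and reuse the compiled delegate afterwards.
public static class ScaleCompiler
{
    private static readonly ConcurrentDictionary<double, Func<double, double>> Cache =
        new ConcurrentDictionary<double, Func<double, double>>();

    // Number of times code was actually generated (for illustration only).
    public static int Compilations;

    // Returns a delegate computing x * factor, compiling it on first request.
    public static Func<double, double> GetScaler(double factor)
    {
        return Cache.GetOrAdd(factor, f => Compile(f));
    }

    private static Func<double, double> Compile(double factor)
    {
        Interlocked.Increment(ref Compilations);
        ParameterExpression x = Expression.Parameter(typeof(double), "x");
        // The factor is baked into the generated code as a constant.
        Expression body = Expression.Multiply(x, Expression.Constant(factor));
        return Expression.Lambda<Func<double, double>>(body, x).Compile();
    }
}
```

Calling GetScaler(2.0) twice from a single thread compiles once and returns the same delegate both times, so the compilation cost is paid only on the first use.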
A Word about .NET JIT
The .NET JIT started as a promising JIT compiler. Its first version had some great features, like compiled code caching and the use of processor specific instructions in a limited scope. I expected later versions to keep adding other optimization techniques to the compiler, but this didn’t take place.
At the Microsoft Middle East Developer’s Conference (MDC) in 2006, Ayman Shokry, the program manager of Visual C++, gave a session in which he introduced profile guided optimization in Visual C++ 2005 and mentioned that implementing dynamic profile guided optimization in the .NET JIT compiler was planned. So far I have heard nothing about it being implemented.
Also, one of the developers at Microsoft mentioned on a development forum that time constraints prevented them from implementing vectorizing optimization, which would support processor specific instructions on a bigger scope, in version 1.1 of the runtime. I expected Microsoft to implement this in later versions, but this didn’t happen.
Maybe this is because of Microsoft’s focus on shipping multiple .NET based technologies in a narrow time frame (Entity Framework, LINQ, WWF, etc), technologies which I think lack a unified vision and contain major design flaws, but that is a topic for a separate article.
Comments
3 Responses to “Performance Benefits of Dynamic Compilation”
If you are interested in this topic, you might want to investigate the LoseThos operating system. http://www.losethos.com It compiles as much as possible JIT.
Great article.
May God reward you with good.
@Terry
Thanks for the link, but I can’t find details on the website, for example which language the compiler supports, what its features are, etc.
@EraMax
May He reward us and you.