Compilers and Virtual Machines (VM)

Development

Our company has been working on compilers and virtual machines since its founding more than a quarter of a century ago. Leading IT corporations rely on our expertise in this field.

We develop optimizing compilers, runtime environments, and backends for a range of general-purpose and specialized architectures. For our clients, we have completed several static compiler projects for C, C++, Fortran, and specialized language dialects. We are also developing Alhambra, our own open-source Java VM based on LLVM.

We have extensive experience in developing highly efficient virtual machines for modern dynamic languages, including Java, C#, and JavaScript, and in porting them to new architectures. We also analyze our clients' performance-critical scenarios and propose solutions that remove bottlenecks.

Examples of our work that has been successfully deployed in production:

Static Compiler and SDK for NPU ASIC (since 2017), proprietary VLIW architecture

We have developed a complete SDK toolset (an optimizing compiler for a C dialect, an assembler, a disassembler, and a debugger) based on the LLVM/Clang framework. Development started on LLVM 4, and the project has since been updated to LLVM 12.

The compiler's code generator supports five generations of the architecture with significantly different instruction encodings. The necessary set of machine-dependent optimizations (machine loop transformations, loop unrolling and software pipelining, if-conversion, instruction scheduling, etc.) has been implemented, yielding high code density and good performance. A large library of vector intrinsics (more than 3,000) is supported both at the source level and in LLVM IR, and several high-level optimizations over these intrinsics have been developed.
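Two of the optimizations named above, loop unrolling and if-conversion, can be illustrated at source level. The sketch below is a hand-written Python illustration, not code from the SDK: the function names are hypothetical, the unroll factor of 4 is an arbitrary assumption, and a real code generator applies these transformations to machine instructions rather than to source code. It assumes integer inputs.

```python
def positive_sum_branchy(values):
    """Reference loop: one conditional branch per element."""
    total = 0
    for v in values:
        if v > 0:
            total += v
    return total


def positive_sum_transformed(values):
    """The same loop after 4x unrolling and if-conversion, written out by hand."""
    total = 0
    n = len(values)
    main_end = n - n % 4  # largest multiple of 4 not exceeding n
    for i in range(0, main_end, 4):
        a, b, c, d = values[i], values[i + 1], values[i + 2], values[i + 3]
        # If-conversion: each branch becomes a 0/1 predicate that gates the value,
        # so the loop body is straight-line code with no control flow.
        total += (a > 0) * a
        total += (b > 0) * b
        total += (c > 0) * c
        total += (d > 0) * d
    # Remainder loop handles the last n % 4 elements.
    for i in range(main_end, n):
        total += (values[i] > 0) * values[i]
    return total


data = [3, -1, 4, -1, 5, -9, 2]
print(positive_sum_branchy(data), positive_sum_transformed(data))
```

On a VLIW target the benefit of such a branch-free, unrolled body is that the scheduler has more independent operations to pack into each instruction word.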

Thanks to LLVM's modular architecture, our code generator has been reused in compilers for other languages, including ones built on the MLIR project. Work has begun on a universal mechanism for recompiling code at run time for future generations of the architecture.

Dynamic Languages for "Elbrus"

As a result of our long-term collaboration with MCST, the Russian "Elbrus" processors now support the major modern programming languages in use around the world. Our team builds reliable system products from popular software solutions combined with original components of our own design. For example, code generation was factored out into the Tango library, which is used to implement several language platforms.

Below are descriptions of projects related to development on the "Elbrus" platform.

Java Virtual Machine for the "Elbrus" Platform (since 2011)

The Russian Virtual Machine (RVM), a Java VM, is based on the open-source OpenJDK virtual machine. In more than 10 years of work on the project, we have delivered full support for OpenJDK versions 6, 8, and 11 on the "Elbrus" platform. RVM is fully compatible with Oracle's Java SE specifications.

The just-in-time compiler developed for the "Elbrus" platform has been optimized to deliver performance comparable to that achieved on Intel processors of a similar class.

Code generation was factored out into the Tango library, which can be reused to implement other language platforms.

Work continues on maintaining versions 8 and 11 and on further performance improvements.

JavaScript Language Support for the "Elbrus" Platform (since 2016)

Two JavaScript runtime environments have been ported to the "Elbrus" platform.

The SpiderMonkey script engine has been integrated into the Mozilla Firefox browser, which is part of the "Elbrus" operating system.

The V8 script engine has been successfully integrated into the Node.js platform, which is used primarily for running web servers.

Work is ongoing to enhance the JIT compilers of both SpiderMonkey and V8 and to improve their performance.

C# Language Support for the "Elbrus" Platform (since 2016)

The first implementation of the C# language on the "Elbrus" platform was based on the Mono project, which has a toolkit compatible with the .NET SDK. The result was a release of Mono with support for version 3.2, followed by releases for versions 4.5 and now 6.12.

Subsequently, the .NET SDK platform was ported. In early 2022, the first release of the .NET Core 3.1 platform on "Elbrus" was launched. Work has begun on .NET 6.

Both implementations have been provided to corporate users for testing their applications. The "UNIPRO" team provides technical support to these users and continues to improve stability and performance.

Java Platform Implementation for Cloud Virtualization by Waratek Inc (2011-2016), x86_64 architecture

DRLVM, the virtual machine from the Apache Harmony project, formed the basis of the implementation. A universal interface was developed to bridge the VM with class libraries, allowing multiple library implementations (Apache Harmony, OpenJDK) to be used. Core class packages were modified, and full JCK certification for Java SE versions 6 and 7 was achieved.

The virtual machine was significantly improved: numerous bugs (especially in multithreading) were fixed, and the stability and performance of all runtime components, including class loading, verification, garbage collection, profiling, and JIT-compiled code versioning, were improved. The G1 garbage collector was ported from the OpenJDK HotSpot VM. Support for lightweight containerization of independent Java applications within a single VM process was implemented. The optimizing JIT compiler underwent extensive improvements, including new high-level optimizations (escape analysis, class hierarchy analysis, string optimizations, intrinsics for cryptography, math, XML processing, etc.) and enhancements to the x86_64 code generator. These efforts allowed us to catch up with and even surpass competitors (OpenJDK HotSpot, IBM J9) on certain benchmarks.

Sun Studio Fortran 95 Compiler

During our collaboration with Sun Microsystems, "UNIPRO" implemented support for an interval data type in the Fortran compiler. This implementation was integrated into the standard compiler shipped by Sun Microsystems and fully supports the new INTERVAL type with its corresponding operations, built-in functions, and intrinsics. Several library functions that use this type were also developed, including routines for the guaranteed solution of nonlinear equations and for the global optimization of nonlinear functions.
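To show what "guaranteed" means here, the following is a minimal Python sketch of interval arithmetic with outward rounding and of interval bisection for root enclosure. It is an illustration only and does not reproduce the Sun Studio implementation: the `Interval` class, the helper names, and the tolerance are assumptions made for this example, and it needs Python 3.9 or later for `math.nextafter`.

```python
import math
from dataclasses import dataclass


def _down(x):
    """Round a computed lower bound outward (toward -infinity) by one float."""
    return math.nextafter(x, -math.inf)


def _up(x):
    """Round a computed upper bound outward (toward +infinity) by one float."""
    return math.nextafter(x, math.inf)


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __sub__(self, other):
        return Interval(_down(self.lo - other.hi), _up(self.hi - other.lo))

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(_down(min(p)), _up(max(p)))

    def contains_zero(self):
        return self.lo <= 0.0 <= self.hi


def enclose_roots(f, box, tol=1e-12):
    """Return small intervals that together contain every root of f in box."""
    pending, result = [box], []
    while pending:
        x = pending.pop()
        if not f(x).contains_zero():
            continue  # the enclosure of f over x excludes 0, so x has no root
        mid = 0.5 * (x.lo + x.hi)
        if x.hi - x.lo <= tol or not (x.lo < mid < x.hi):
            result.append(x)  # narrow enough; keep as a candidate enclosure
        else:
            pending.append(Interval(x.lo, mid))
            pending.append(Interval(mid, x.hi))
    return result


def f(x):
    """f(x) = x*x - 2, evaluated in interval arithmetic."""
    return x * x - Interval(2.0, 2.0)


roots = enclose_roots(f, Interval(0.0, 2.0))
print(roots)
```

Because every bound is rounded outward, a subinterval is discarded only when the function provably has no zero in it, so the square root of 2 is certain to lie inside one of the returned intervals.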