Overview

Here is a quick overview of the execution time of Koffi calls on three benchmarks, where it is compared to a theoretical ideal FFI implementation (approximated with pre-compiled static N-API glue code):

  • The first benchmark is based on rand() calls
  • The second benchmark is based on atoi() calls
  • The third benchmark is based on Raylib

[Figures: Linux x86_64 performance; Windows x86_64 performance]

These results are detailed and explained below, and compared to node-ffi/node-ffi-napi.

Linux x86_64

The results presented below were measured on my x86_64 Linux machine (AMD Ryzen™ 5 2600).

rand results

This test is based around repeated calls to a simple standard C function rand, and has three implementations:

  • the first one is the reference: it calls rand through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js > C FFI implementation (pre-compiled static glue code)
  • the second one calls rand through Koffi
  • the third one uses node-ffi-napi, a widely used third-party FFI package for Node.js

Benchmark        Iteration time   Relative performance   Overhead
---------------  ---------------  ---------------------  ---------
rand_napi        569 ns           x1.00                  (ref)
rand_koffi       855 ns           x0.67                  +50%
rand_node_ffi    58730 ns         x0.010                 +10228%

Because rand is a pretty small function, the FFI overhead is clearly visible.
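
The two derived columns can be recomputed from the iteration times. The sketch below assumes that relative performance is the reference time divided by the measured time, and that overhead is the measured time divided by the reference time, minus one. These formulas are not stated on this page, but they reproduce the rounded Koffi row of the rand table above. `deriveColumns` is a hypothetical helper name.

```javascript
// Assumed formulas (not stated on this page, but consistent with the tables):
//   relative performance = reference time / measured time
//   overhead             = measured time / reference time - 1
function deriveColumns(refNs, measuredNs) {
    const relative = refNs / measuredNs;
    const overhead = measuredNs / refNs - 1;
    return { relative, overhead };
}

// Iteration times of rand_napi (reference) and rand_koffi on Linux
const randKoffi = deriveColumns(569, 855);

console.log(`x${randKoffi.relative.toFixed(2)}`, `+${Math.round(randKoffi.overhead * 100)}%`);
// prints "x0.67 +50%"
```

Rows with very large ratios can differ by a few units from the published percentages, presumably because the published figures were derived from unrounded timings.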

atoi results

This test is similar to the rand one, but it is based on atoi, which takes a string parameter. JavaScript (V8) to C string conversion is relatively slow and heavy.

Benchmark        Iteration time   Relative performance   Overhead
---------------  ---------------  ---------------------  ---------
atoi_napi        1039 ns          x1.00                  (ref)
atoi_koffi       1642 ns          x0.63                  +58%
atoi_node_ffi    164790 ns        x0.006                 +15767%

Because atoi is a pretty small function, the FFI overhead is clearly visible.
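
To give an idea of why the string parameter is costly: V8 stores strings internally as Latin-1 or UTF-16, while a C function such as atoi expects a NUL-terminated byte string. The sketch below shows, in plain Node.js, roughly what such a conversion involves. It is an illustration only, not the code used by Koffi or N-API, and `toCString` is a hypothetical helper name.

```javascript
// Illustration only: the kind of work needed to hand a JS string to a C
// function expecting a NUL-terminated char *. Not Koffi's actual code.
function toCString(str) {
    // First pass: compute the size of the UTF-8 encoding
    const byteLength = Buffer.byteLength(str, 'utf8');

    // Allocate room for the encoded bytes plus the terminator
    const buf = Buffer.allocUnsafe(byteLength + 1);

    // Second pass: transcode the string into UTF-8
    buf.write(str, 0, byteLength, 'utf8');

    // C strings end with a NUL byte
    buf[byteLength] = 0;

    return buf;
}

const arg = toCString('1234');
console.log(arg.length); // 5: four digits plus the terminator
```

Every call pays for the size computation, the allocation and the copy, which is why the atoi iteration times are higher than the rand ones for all implementations.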

Raylib results

This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the previous benchmarks, so the relative FFI overhead is smaller. In this implementation, Koffi is compared to:

  • node-raylib (baseline): a native wrapper implemented with N-API
  • raylib_cc: a C++ implementation of the benchmark, without any JavaScript

Benchmark            Iteration time   Relative performance   Overhead
-------------------  ---------------  ---------------------  ---------
raylib_cc            17.5 µs          x1.34                  -25%
raylib_node_raylib   23.4 µs          x1.00                  (ref)
raylib_koffi         28.8 µs          x0.81                  +23%
raylib_node_ffi      103.9 µs         x0.23                  +344%
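
Comparing the three Linux benchmarks side by side shows what heavier calls change. The sketch below uses only the iteration times published in the Linux tables on this page. The overhead formula (measured time divided by reference time, minus one) is an assumption that matches the published percentages.

```javascript
// Koffi vs. reference iteration times, copied from the Linux tables (in ns)
const linuxRows = [
    { name: 'rand',   refNs: 569,   koffiNs: 855 },
    { name: 'atoi',   refNs: 1039,  koffiNs: 1642 },
    { name: 'raylib', refNs: 23400, koffiNs: 28800 }
];

// Extra time per iteration in absolute terms, and relative to the reference
const comparison = linuxRows.map(({ name, refNs, koffiNs }) => ({
    name,
    extraNs: koffiNs - refNs,
    overheadPct: Math.round((koffiNs / refNs - 1) * 100)
}));

console.table(comparison);
```

The extra time per iteration is larger in absolute terms for Raylib (5400 ns, against 286 ns for rand), but it is a smaller share of the total (23%, against 50%) because each iteration does much more work.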

Windows x86_64

The results presented below were measured on my x86_64 Windows machine (Intel® Core™ i5-4460).

rand results

This test is based around repeated calls to a simple standard C function rand, and has three implementations:

  • the first one is the reference: it calls rand through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js > C FFI implementation (pre-compiled static glue code)
  • the second one calls rand through Koffi
  • the third one uses node-ffi-napi, a widely used third-party FFI package for Node.js

Benchmark        Iteration time   Relative performance   Overhead
---------------  ---------------  ---------------------  ---------
rand_napi        859 ns           x1.00                  (ref)
rand_koffi       1352 ns          x0.64                  +57%
rand_node_ffi    35640 ns         x0.02                  +4048%

Because rand is a pretty small function, the FFI overhead is clearly visible.

atoi results

This test is similar to the rand one, but it is based on atoi, which takes a string parameter. JavaScript (V8) to C string conversion is relatively slow and heavy.

Benchmark        Iteration time   Relative performance   Overhead
---------------  ---------------  ---------------------  ---------
atoi_napi        1336 ns          x1.00                  (ref)
atoi_koffi       2440 ns          x0.55                  +83%
atoi_node_ffi    136890 ns        x0.010                 +10144%

Because atoi is a pretty small function, the FFI overhead is clearly visible.

Raylib results

This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the previous benchmarks, so the relative FFI overhead is smaller. In this implementation, Koffi is compared to:

  • node-raylib (baseline): a native wrapper implemented with N-API
  • raylib_cc: a C++ implementation of the benchmark, without any JavaScript

Benchmark            Iteration time   Relative performance   Overhead
-------------------  ---------------  ---------------------  ---------
raylib_cc            18.2 µs          x1.50                  -33%
raylib_node_raylib   27.3 µs          x1.00                  (ref)
raylib_koffi         29.8 µs          x0.92                  +9%
raylib_node_ffi      96.3 µs          x0.28                  +253%

Running benchmarks

Please note that all benchmark results on this page were obtained with Clang-built binaries. To build Koffi and then the benchmark code with Clang, run:

cd koffi
node ../../cnoke/cnoke.js --prefer-clang

cd koffi/benchmark
node ../../cnoke/cnoke.js --prefer-clang

Once everything is built and ready, run:

node benchmark.js
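
The benchmark script itself is not reproduced on this page. As a rough illustration of what an "iteration time" measures, here is a minimal timing loop in plain Node.js. It is a sketch under stated assumptions, not Koffi's benchmark.js: `measureIterationNs` is a hypothetical helper, and `Math.random` stands in for the native call being timed.

```javascript
// Hypothetical helper, not taken from Koffi's benchmark code
function measureIterationNs(fn, iterations) {
    // Warm-up, so that the JIT has a chance to optimize the call path
    for (let i = 0; i < 1000; i++)
        fn();

    const start = process.hrtime.bigint();
    for (let i = 0; i < iterations; i++)
        fn();
    const elapsed = process.hrtime.bigint() - start;

    // Average time per call, in nanoseconds
    return Number(elapsed) / iterations;
}

// Math.random is only a stand-in for the function under test
const perCallNs = measureIterationNs(() => Math.random(), 100000);

console.log(`${perCallNs.toFixed(1)} ns per iteration`);
```

Averaging over many iterations matters here because a single call to a function as small as rand takes less time than the timer can reliably resolve.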