suro/aza

suro a0073b4fb1 update

2026-03-25 14:14:07 +01:00

5.8 KiB

Overview

Here is a quick overview of the execution time of Koffi calls on three benchmarks, where Koffi is compared to a theoretical ideal FFI implementation (approximated with pre-compiled static N-API glue code):

The first benchmark is based on rand() calls
The second benchmark is based on atoi() calls
The third benchmark is based on Raylib

These results are detailed and explained below, and compared to node-ffi/node-ffi-napi.

Linux x86_64

The results presented below were measured on my x86_64 Linux machine (AMD Ryzen™ 5 2600).

rand results

This test is based on repeated calls to the simple standard C function rand, and has three implementations:

the first one is the reference: it calls rand through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js-to-C FFI implementation (pre-compiled static glue code)
the second one calls rand through Koffi
the third one calls rand through node-ffi-napi, a third-party FFI package for Node.js

Benchmark	Iteration time	Relative performance	Overhead
rand_napi	569 ns	x1.00	(ref)
rand_koffi	855 ns	x0.67	+50%
rand_node_ffi	58730 ns	x0.010	+10228%

Because rand is a pretty small function, the FFI overhead is clearly visible.
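The "Relative performance" and "Overhead" columns can be recomputed from the iteration times. The sketch below assumes both are plain ratios against the reference time; the compare helper is hypothetical and not part of the benchmark code. The published figures were presumably derived from unrounded timings, so recomputed values can differ slightly in the last digits for the largest overheads.

```javascript
// Hypothetical helper: derive the two ratio columns from iteration times.
// Assumption: both columns are simple ratios against the reference time.
function compare(refTime, time) {
    const relative = refTime / time;             // x1.00 means "as fast as the reference"
    const overhead = (time / refTime - 1) * 100; // extra time, in percent of the reference

    return {
        relative: 'x' + relative.toFixed(2),
        overhead: (overhead >= 0 ? '+' : '') + Math.round(overhead) + '%'
    };
}

// Linux rand figures from the table above: rand_napi = 569 ns, rand_koffi = 855 ns
console.log(compare(569, 855)); // { relative: 'x0.67', overhead: '+50%' }
```

The same helper works for rows that are faster than the reference, in which case the overhead is negative.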

atoi results

This test is similar to the rand one, but it is based on atoi, which takes a string parameter. JavaScript (V8) to C string conversion is relatively slow and heavy.

Benchmark	Iteration time	Relative performance	Overhead
atoi_napi	1039 ns	x1.00	(ref)
atoi_koffi	1642 ns	x0.63	+58%
atoi_node_ffi	164790 ns	x0.006	+15767%

Because atoi is a pretty small function, the FFI overhead is clearly visible.
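To give an idea of what the string conversion mentioned above involves, here is a minimal sketch that turns a JavaScript string into a NUL-terminated UTF-8 buffer, which is the form a C function such as atoi expects. The toCString helper is hypothetical and only illustrates the work involved (measuring, allocating, encoding); it is not how Koffi or N-API perform the conversion internally.

```javascript
// Illustration only: what "converting a JS string to a C string" involves.
// Hypothetical helper, not taken from Koffi or N-API.
function toCString(str) {
    const len = Buffer.byteLength(str, 'utf8'); // encoded size, which may differ from str.length
    const buf = Buffer.alloc(len + 1);          // zero-filled, so the last byte is the NUL terminator
    buf.write(str, 0, len, 'utf8');             // encode the string into the buffer
    return buf;
}

const buf = toCString('1234');
console.log(buf); // <Buffer 31 32 33 34 00>
```

Every call pays for this kind of work before the C function even starts, which is why a string parameter costs more than a call without arguments.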

Raylib results

This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the previous benchmarks, so the relative FFI overhead is smaller. In this implementation, Koffi is compared to:

node-raylib (baseline): This is a native wrapper implemented with N-API
raylib_cc: Full C++ version of the code (no JavaScript)

Benchmark	Iteration time	Relative performance	Overhead
raylib_cc	17.5 µs	x1.34	-25%
raylib_node_raylib	23.4 µs	x1.00	(ref)
raylib_koffi	28.8 µs	x0.81	+23%
raylib_node_ffi	103.9 µs	x0.23	+344%
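The reason heavier calls show a smaller relative overhead can be sketched with a simple model in which each FFI call adds a fixed cost on top of the native work. The model and the 300 ns figure below are made up for illustration; they are not measurements. The Raylib benchmark also makes several calls per iteration, so this does not predict the figures in the table.

```javascript
// Hypothetical model: each FFI call adds a fixed cost on top of the native work.
function relativeOverheadPercent(fixedCostNs, nativeWorkNs) {
    return (fixedCostNs * 100) / nativeWorkNs;
}

const fixedCostNs = 300; // made-up per-call cost, for illustration only

console.log(relativeOverheadPercent(fixedCostNs, 600));   // 50
console.log(relativeOverheadPercent(fixedCostNs, 30000)); // 1
```

With the same fixed cost, a call that does 50 times more native work shows 50 times less relative overhead.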

Windows x86_64

The results presented below were measured on my x86_64 Windows machine (Intel® Core™ i5-4460).

rand results

This test is based on repeated calls to the simple standard C function rand, and has three implementations:

the first one is the reference: it calls rand through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js-to-C FFI implementation (pre-compiled static glue code)
the second one calls rand through Koffi
the third one calls rand through node-ffi-napi, a third-party FFI package for Node.js

Benchmark	Iteration time	Relative performance	Overhead
rand_napi	859 ns	x1.00	(ref)
rand_koffi	1352 ns	x0.64	+57%
rand_node_ffi	35640 ns	x0.02	+4048%

Because rand is a pretty small function, the FFI overhead is clearly visible.

atoi results

This test is similar to the rand one, but it is based on atoi, which takes a string parameter. JavaScript (V8) to C string conversion is relatively slow and heavy.

Benchmark	Iteration time	Relative performance	Overhead
atoi_napi	1336 ns	x1.00	(ref)
atoi_koffi	2440 ns	x0.55	+83%
atoi_node_ffi	136890 ns	x0.010	+10144%

Because atoi is a pretty small function, the FFI overhead is clearly visible.

Raylib results

This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the atoi benchmark, thus the FFI overhead is reduced. In this implementation, Koffi is compared to:

node-raylib (baseline): This is a native wrapper implemented with N-API
raylib_cc: C++ implementation of the benchmark, without any JavaScript

Benchmark	Iteration time	Relative performance	Overhead
raylib_cc	18.2 µs	x1.50	-33%
raylib_node_raylib	27.3 µs	x1.00	(ref)
raylib_koffi	29.8 µs	x0.92	+9%
raylib_node_ffi	96.3 µs	x0.28	+253%

Running benchmarks

Please note that all benchmark results on this page were obtained with Clang-built binaries.

cd koffi
node ../../cnoke/cnoke.js --prefer-clang

cd koffi/benchmark
node ../../cnoke/cnoke.js --prefer-clang

Once everything is built and ready, run:

node benchmark.js
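For readers who want to time their own calls in a similar way, here is a minimal sketch of a per-iteration timing loop. The measure helper is hypothetical and is not taken from benchmark.js; Math.random stands in for the function under test, and the warm-up count is an arbitrary choice.

```javascript
// Hypothetical helper, not taken from benchmark.js
function measure(fn, iterations) {
    // Warm-up, so that V8 has a chance to optimize the call path
    for (let i = 0; i < 1000; i++)
        fn();

    const start = process.hrtime.bigint();
    for (let i = 0; i < iterations; i++)
        fn();
    const end = process.hrtime.bigint();

    return Number(end - start) / iterations; // nanoseconds per iteration
}

// Math.random is only a stand-in for the call being measured
const ns = measure(() => Math.random(), 1000000);
console.log(`Math.random: ${ns.toFixed(1)} ns per iteration`);
```

Using a large iteration count keeps the cost of reading the clock negligible compared to the measured calls.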
