# Overview

This page gives a quick overview of the execution time of Koffi calls on three benchmarks, compared to a theoretical ideal FFI implementation (approximated with pre-compiled static N-API glue code):

- The first benchmark is based on `rand()` calls
- The second benchmark is based on `atoi()` calls
- The third benchmark is based on [Raylib](https://www.raylib.com/)

<p style="text-align: center;">
    <a href="{{ ASSET static/perf_linux.png }}" target="_blank"><img src="{{ ASSET static/perf_linux.png }}" alt="Linux x86_64 performance" style="width: 350px;"/></a>
    <a href="{{ ASSET static/perf_windows.png }}" target="_blank"><img src="{{ ASSET static/perf_windows.png }}" alt="Windows x86_64 performance" style="width: 350px;"/></a>
</p>

These results are detailed and explained below, and compared to node-ffi/node-ffi-napi.

# Linux x86_64

The results presented below were measured on my x86_64 Linux machine (AMD Ryzen™ 5 2600).

## rand results

This test is based around repeated calls to a simple standard C function `rand`, and has three implementations:

- the first one is the reference: it calls `rand` through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js to C FFI implementation (pre-compiled static glue code)
- the second one calls `rand` through Koffi
- the third one uses node-ffi-napi, a popular third-party FFI package for Node.js
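
As a rough illustration of how an iteration time can be measured, here is a minimal timing sketch. It is not the code from `benchmark.js`: the helper name `timePerIteration`, the warm-up count, and the use of `Math.random` as a stand-in target are all assumptions made for this example.

```javascript
// Hypothetical helper (not part of Koffi): average time per iteration, in nanoseconds
function timePerIteration(fn, iterations) {
    // Warm up first so the JIT has a chance to optimize the call path
    for (let i = 0; i < 1000; i++)
        fn();

    const start = process.hrtime.bigint();
    for (let i = 0; i < iterations; i++)
        fn();
    const elapsed = process.hrtime.bigint() - start;

    return Number(elapsed) / iterations;
}

// Stand-in target: a real run would pass the N-API, Koffi or node-ffi-napi binding of rand
const ns = timePerIteration(Math.random, 1000000);
console.log(`${ns.toFixed(1)} ns per iteration`);
```

Running the same loop against each binding, on the same machine, gives numbers that can be compared against the reference.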

Benchmark     | Iteration time | Relative performance | Overhead
------------- | -------------- | -------------------- | --------
rand_napi     | 569 ns         | x1.00                | (ref)
rand_koffi    | 855 ns         | x0.67                | +50%
rand_node_ffi | 58730 ns       | x0.010               | +10228%

Because `rand` does very little work, the FFI overhead is clearly visible.
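
The last two columns can be recomputed from the iteration times. The formulas below are inferred from the table values rather than taken from the benchmark source, and the function names are made up for this sketch.

```javascript
// Iteration times (ns) copied from the Linux rand table
const times = { rand_napi: 569, rand_koffi: 855 };

// Relative performance: reference time divided by measured time (higher is better)
function relativePerformance(refNs, measuredNs) {
    return refNs / measuredNs;
}

// Overhead: extra time compared to the reference, in percent
function overheadPercent(refNs, measuredNs) {
    return (measuredNs / refNs - 1) * 100;
}

const perf = relativePerformance(times.rand_napi, times.rand_koffi);
const overhead = overheadPercent(times.rand_napi, times.rand_koffi);

console.log(`x${perf.toFixed(2)}`);      // x0.67
console.log(`+${overhead.toFixed(0)}%`); // +50%
```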

## atoi results

This test is similar to the rand one, but it is based on `atoi`, which takes a string parameter. Converting a JavaScript (V8) string to a C string is relatively slow and expensive.
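
To give an idea of the work involved, the sketch below encodes a JavaScript string as the NUL-terminated byte sequence that a C function such as `atoi` expects. This is only an illustration using Node.js `Buffer`, and the helper name `toCString` is hypothetical. It does not show how Koffi or N-API perform the conversion internally.

```javascript
// Illustrative only: encode a JavaScript string as NUL-terminated UTF-8 bytes
function toCString(str) {
    // Buffer.from allocates memory and transcodes to UTF-8; the trailing '\0' ends the C string
    return Buffer.from(str + '\0', 'utf8');
}

const buf = toCString('1234');
console.log(buf.length); // 5: four digits plus the terminator
console.log(buf[4]);     // 0
```

An allocation and a transcoding step of this kind happen on every call, which is why a string parameter costs more than a call without arguments.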

Benchmark     | Iteration time | Relative performance | Overhead
------------- | -------------- | -------------------- | --------
atoi_napi     | 1039 ns        | x1.00                | (ref)
atoi_koffi    | 1642 ns        | x0.63                | +58%
atoi_node_ffi | 164790 ns      | x0.006               | +15767%

Because `atoi` does very little work, the FFI overhead is clearly visible.

## Raylib results

This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the previous benchmarks, so the relative FFI overhead is smaller. In this implementation, Koffi is compared to:

- [node-raylib](https://github.com/RobLoach/node-raylib) (baseline): This is a native wrapper implemented with N-API
- raylib_cc: C++ implementation of the benchmark, without any JavaScript

Benchmark          | Iteration time | Relative performance | Overhead
------------------ | -------------- | -------------------- | --------
raylib_cc          | 17.5 µs        | x1.34                | -25%
raylib_node_raylib | 23.4 µs        | x1.00                | (ref)
raylib_koffi       | 28.8 µs        | x0.81                | +23%
raylib_node_ffi    | 103.9 µs       | x0.23                | +344%
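
A simple way to see why heavier calls reduce the relative overhead is to assume that the FFI layer adds a roughly fixed cost per call. This is a simplifying assumption, not a claim about how Koffi behaves. The fixed cost below is the Koffi minus N-API difference from the Linux rand table, and the native work durations are hypothetical values, not measurements.

```javascript
// Assumption: the FFI layer adds a fixed cost per call, independent of the native work
// 855 - 569 = 286 ns, from the Linux rand table (rand_koffi minus rand_napi)
const fixedCostNs = 855 - 569;

// Relative overhead (in percent) for a call that does nativeWorkNs of actual work
function overheadPercent(nativeWorkNs) {
    return (fixedCostNs / nativeWorkNs) * 100;
}

// Hypothetical native work durations, not measurements
for (const workNs of [500, 5000, 50000]) {
    console.log(`${workNs} ns of work: +${overheadPercent(workNs).toFixed(1)}%`);
}
```

Under this model the same absolute cost shrinks from a large fraction of a tiny call to a small fraction of a heavy one.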

# Windows x86_64

The results presented below were measured on my x86_64 Windows machine (Intel® Core™ i5-4460).

## rand results

This test is based around repeated calls to a simple standard C function `rand`, and has three implementations:

- the first one is the reference: it calls `rand` through an N-API module, and is close to the theoretical limit of a perfect (no overhead) Node.js to C FFI implementation (pre-compiled static glue code)
- the second one calls `rand` through Koffi
- the third one uses node-ffi-napi, a popular third-party FFI package for Node.js

Benchmark     | Iteration time | Relative performance | Overhead
------------- | -------------- | -------------------- | --------
rand_napi     | 859 ns         | x1.00                | (ref)
rand_koffi    | 1352 ns        | x0.64                | +57%
rand_node_ffi | 35640 ns       | x0.02                | +4048%

Because `rand` does very little work, the FFI overhead is clearly visible.

## atoi results

This test is similar to the rand one, but it is based on `atoi`, which takes a string parameter. Converting a JavaScript (V8) string to a C string is relatively slow and expensive.

Benchmark     | Iteration time | Relative performance | Overhead
------------- | -------------- | -------------------- | --------
atoi_napi     | 1336 ns        | x1.00                | (ref)
atoi_koffi    | 2440 ns        | x0.55                | +83%
atoi_node_ffi | 136890 ns      | x0.010               | +10144%

Because `atoi` does very little work, the FFI overhead is clearly visible.

## Raylib results

This benchmark uses the CPU-based image drawing functions in Raylib. The calls are much heavier than in the previous benchmarks, so the relative FFI overhead is smaller. In this implementation, Koffi is compared to:

- [node-raylib](https://github.com/RobLoach/node-raylib) (baseline): This is a native wrapper implemented with N-API
- raylib_cc: C++ implementation of the benchmark, without any JavaScript

Benchmark          | Iteration time | Relative performance | Overhead
------------------ | -------------- | -------------------- | --------
raylib_cc          | 18.2 µs        | x1.50                | -33%
raylib_node_raylib | 27.3 µs        | x1.00                | (ref)
raylib_koffi       | 29.8 µs        | x0.92                | +9%
raylib_node_ffi    | 96.3 µs        | x0.28                | +253%

# Running benchmarks

Please note that all benchmark results on this page were measured with Clang-built binaries. Build Koffi first, then the benchmark code:

```sh
cd koffi
node ../../cnoke/cnoke.js --prefer-clang

cd koffi/benchmark
node ../../cnoke/cnoke.js --prefer-clang
```

Once everything is built and ready, run:

```sh
node benchmark.js
```