Benchmarking on ./Code

How REX Builds Native LLVM And Generated Variants For Side-By-Side GPU Benchmarking

Mon, 20 Apr 2026 00:00:00 +0000

The previous post explained why the GPU benchmark layer in REX should be treated as an investigation surface rather than a scoreboard.

This post steps back one stage within that same top-layer contract.

Before you can compare correctness or performance meaningfully, you need to answer a much more basic question:

what exactly are the two binaries being compared?

That sounds like bookkeeping.

It is not.

If the native LLVM side and the REX side are not built under a disciplined contract, then the benchmark result stops meaning what it claims to mean.

Why REX Treats GPU Benchmark Results As An Investigation Surface, Not A Scoreboard

Sun, 19 Apr 2026 00:00:00 +0000

The previous post argued that the GPU benchmark layer in REX must stay narrow instead of becoming a catch-all test suite.

This post narrows the same idea one step further.

It is about the style of interpretation at the top layer.

The benchmark layer is not most useful when it says:

REX faster
LLVM faster
pass
fail

It is most useful when it says:

Why REX's GPU Benchmark Layer Must Not Become A Catch-All Test Suite

Sat, 18 Apr 2026 00:00:00 +0000

The previous post argued that real GPU benchmarks still matter in REX because they are the only place where the full offloading stack meets a real application.

That does not mean benchmarks should become the default place to detect every kind of bug.

This post is about that boundary.

The benchmark layer is indispensable, but only if it stays disciplined.

If it tries to answer every testing question at once, it becomes:

What Only Real GPU Benchmarks Still Catch In REX

Fri, 17 Apr 2026 00:00:00 +0000

The previous two posts in this series narrowed the benchmark layer into two specific contracts:

fairness in performance comparison,
and correctness comparison that does not trust naive raw diffs.

This post steps back one level.

It asks a simpler question:

after REX already has parser tests, Frontend AST tests, semantic checks, lowering invariant tests, and CPU equivalence tests, why are real GPU benchmarks still necessary at all?

The short answer is that those earlier layers prove narrower things.

How REX Validates Benchmark Correctness Without Trusting Naive Diffs

Thu, 16 Apr 2026 00:00:00 +0000

The previous post in this series focused on fairness in performance comparison: same runtime stack, same user intent, and the right meaning of time.

This post covers the correctness half of the same problem.

The question is:

when a benchmark is supposed to prove that native LLVM and REX still compute the same thing, what exactly counts as “the same thing”?

The answer turned out to be more careful than a raw diff.

How REX Makes Fair GPU Offloading Comparisons Against Native LLVM

Wed, 15 Apr 2026 00:00:00 +0000

The previous post in this series explained how REX emits omp_offloading_entries and keeps host and device kernel identity aligned.

This one moves back up to the benchmark layer, but it stays much narrower than the general benchmark-validation post.

The question here is not:

how do we run a benchmark suite at all?

The question is:

when REX and native LLVM are close, noisy, or trading wins across different measurement modes, what makes a comparison fair enough to trust?