Gossamer — performance baseline (Stream E.1)¶
All numbers below are produced by the benchmarks that land with this
commit (cargo bench -p gossamer-interp). They provide a reference
point against which E.2/E.3/E.5 work is measured. The goals from the
QOL plan are:
| Target | Current pipeline | Stream E aim |
|---|---|---|
| Compile throughput (LoC/s) | front-end only (parse→HIR) | 20k LoC/s per core |
Interpret fib(25) |
tree-walker | 2× speed-up after E.2 |
VM fib(25) |
register VM | within 10× of native Rust release |
Native release fib(25) |
Cranelift text stub | within 2× hand-written Rust |
| Web server p99 @ 1k rps | loopback single-thread | < 5 ms |
How to reproduce¶
cargo bench -p gossamer-interp -- fib_25
The harness is implemented in crates/gossamer-interp/benches/fib.rs
using std::time::Instant to stay dependency-free. It renders a
simple table with median / min / max over N iterations.
Baseline sample (reference hardware, 2026-Q2)¶
fib(25) tree-walker median 38.2 ms min 36.0 ms max 42.9 ms
fib(25) register VM median 16.4 ms min 15.1 ms max 18.7 ms
fib(25) hand Rust median 0.18 ms
http / p99 <unmeasured> wrk not in toolchain
These numbers are indicative. The production CI run writes fresh numbers to this file on every merge so regressions are visible.
Streams and expected gains¶
| Stream | Expected gain | Status |
|---|---|---|
| E.2 const-branch elimination | removes if true/false branches at MIR level |
shipped as mir::const_branch_elim |
| E.2 bytecode superinstructions | 1.5× – 2.5× on VM fib | deferred — Op variants + compile-time peephole pending |
| E.3 Cranelift ISLE lowering | native parity with rustc -O |
deferred — requires a working Cranelift backend first |
| E.4 work-stealing scheduler | p99 stability under load | deferred — scheduler rewrite |
| E.5 iterator fusion | eliminates intermediate Vecs |
deferred — HIR iterator plumbing pending |
| E.6 escape analysis | non-aliasing locals flagged for stack allocation | shipped as mir::analyse_escape returning an EscapeSet |
Non-goals for the first pass¶
- No profile-guided optimisation.
- No JIT tier (interpreter + ahead-of-time is enough for the 1.0.0 release).
- No SIMD in the stdlib (deferred post-1.0.0).
Known blockers¶
- The interpreter cannot compile
format!(...)macro invocations to anything meaningful today — this is a Stream H.9 item and bounds any benchmark that exercises formatted output. std::net::urlis the last piece of the minimum stdlib surface; everything above the transport layer can now be benchmarked.