Benchmarks
Representative single-machine numbers for v0.12.0. They are committed by hand rather than generated in CI, because shared CI runners aren't hardware-representative.
Two builds are shown so you can see the from-source upside:
- portable: -O3 -march=x86-64-v2 -ffast-math (the shipped PyPI wheel).
- native: -O3 -march=native -mprefer-vector-width=256 -ffast-math (-DDOPPLER_NATIVE=ON, built from source for this CPU).
Environment — measured 2026-06-12 01:43:17 UTC, doppler 37ed11c, cc (GCC) 16.1.1 20260430.
- CPU: AMD Ryzen AI 9 465 w/ Radeon 880M, 20 threads, governor performance, boost on
- OS: 7.0.11-1-cachyos, glibc 2.43
- Libs: Python 3.13.13, NumPy 2.4.5, CMake 4.3.3
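Throughput is in samples per second, shown as MSa/s or GSa/s (higher is better); latency ops are mean time per call (lower is better). from src is the native build's change relative to portable; in both kinds of table a positive value means native is faster.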
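As a rough illustration of how the from src column can be read, the sketch below recomputes it from the two measured columns. The formulas are inferred from the published values, not taken from the benchmark tooling, and the function names are hypothetical. Because the tables show rounded values, a recomputed percentage can differ from the published one by a point.

```python
def from_src_throughput(portable: float, native: float) -> int:
    """Percent change in throughput, native vs portable (same units for both).

    Assumed formula: native / portable - 1. Positive means native is faster.
    """
    return round((native / portable - 1.0) * 100.0)


def from_src_latency(portable_time: float, native_time: float) -> int:
    """Percent speedup in time per call, native vs portable (same units for both).

    Assumed formula: portable / native - 1, so a shorter native time
    still shows up as a positive number.
    """
    return round((portable_time / native_time - 1.0) * 100.0)


if __name__ == "__main__":
    # corr2d::execute_64k, GSa/s: native is slower than portable
    print(from_src_throughput(86.30, 64.84))  # -25
    # acc_f32::steps[20480], GSa/s: native is much faster
    print(from_src_throughput(6.52, 23.03))   # 253
    # detector::push_1k, microseconds per call: native is faster
    print(from_src_latency(17.76, 12.22))     # 45
```

Convert both inputs to the same unit first when a row mixes GSa/s and MSa/s.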
The two builds are measured interleaved (alternating runs, keeping the per-benchmark best), so from src reflects the real build difference. The big gains are vectorizable kernels under AVX-512; overhead-bound benchmarks (tiny per-call work) sit near 0% because the build can't help where Python-call overhead dominates.
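The interleaving described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual harness: `interleaved_best` and `run_once` are hypothetical names, and `run_once(build)` stands in for whatever runs one benchmark against one build and returns its time in seconds.

```python
from typing import Callable, Dict, Sequence


def interleaved_best(
    run_once: Callable[[str], float],
    builds: Sequence[str] = ("portable", "native"),
    rounds: int = 5,
) -> Dict[str, float]:
    """Alternate between builds each round and keep the best (lowest) time per build.

    Alternating means slow drift in machine state (thermal, frequency,
    background load) affects both builds roughly equally, instead of
    penalising whichever build happened to run last.
    """
    best = {build: float("inf") for build in builds}
    for _ in range(rounds):
        for build in builds:  # portable, native, portable, native, ...
            best[build] = min(best[build], run_once(build))
    return best


if __name__ == "__main__":
    # Scripted timings stand in for real measurements.
    scripted = iter([3.0, 2.0, 2.5, 2.2, 2.8, 1.9])
    result = interleaved_best(lambda build: next(scripted), rounds=3)
    print(result)  # {'portable': 2.5, 'native': 1.9}
```

Taking the minimum rather than the mean is a common choice for microbenchmarks, since noise from the rest of the system can only make a run slower.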
Python (pytest-benchmark)
Throughput
| benchmark | portable throughput | native throughput | from src |
|---|---|---|---|
| corr2d::execute_64k | 86.30 GSa/s | 64.84 GSa/s | -25% |
| corr::execute_64k | 47.73 GSa/s | 70.31 GSa/s | +47% |
| i32_to_f32::steps_64k | 18.91 GSa/s | 34.04 GSa/s | +80% |
| i16_to_f32::steps_64k | 17.97 GSa/s | 33.60 GSa/s | +87% |
| wfm_synth::steps_64k | 15.92 GSa/s | 22.20 GSa/s | +39% |
| i16u32_to_f32::steps_64k | 13.15 GSa/s | 26.23 GSa/s | +99% |
| i8_to_f32::steps_64k | 13.06 GSa/s | 27.12 GSa/s | +108% |
| uq15_to_f32::steps_64k | 12.87 GSa/s | 31.68 GSa/s | +146% |
| nco::steps_u32_64k | 9.99 GSa/s | 23.87 GSa/s | +139% |
| i16u64_to_f32::steps_64k | 9.20 GSa/s | 17.97 GSa/s | +95% |
| acc_f32::steps[20480] | 6.52 GSa/s | 23.03 GSa/s | +253% |
| acc_f32::steps[409600] | 6.30 GSa/s | 23.97 GSa/s | +281% |
| acc_f32::steps[819200] | 6.05 GSa/s | 22.58 GSa/s | +273% |
| delay::push_ptr_64k | 5.79 GSa/s | 6.04 GSa/s | +4% |
| nco::steps_u32_1k | 5.65 GSa/s | 8.46 GSa/s | +50% |
| delay::push_ptr_1k | 5.43 GSa/s | 5.34 GSa/s | -2% |
| i16_to_f32::steps_1k | 5.42 GSa/s | 5.74 GSa/s | +6% |
| i16u32_to_f32::steps_1k | 5.16 GSa/s | 5.89 GSa/s | +14% |
| acc_f32::steps[1024] | 5.00 GSa/s | 12.87 GSa/s | +157% |
| i8_to_f32::steps_1k | 4.78 GSa/s | 5.83 GSa/s | +22% |
| uq15_to_f32::steps_1k | 4.76 GSa/s | 5.86 GSa/s | +23% |
| acc_q15::steps_64k | 4.59 GSa/s | 8.46 GSa/s | +84% |
| i32_to_f32::steps_1k | 4.58 GSa/s | 6.14 GSa/s | +34% |
| nco::steps_u32_ovf_64k | 4.25 GSa/s | 4.30 GSa/s | +1% |
| buffer::f32_write | 4.24 GSa/s | 4.02 GSa/s | -5% |
| acc_q8::steps_64k | 4.24 GSa/s | 4.23 GSa/s | -0% |
| acc_q15::steps_1k | 3.95 GSa/s | 6.59 GSa/s | +67% |
| i16u64_to_f32::steps_1k | 3.69 GSa/s | 5.48 GSa/s | +49% |
| lo::steps_64k | 3.61 GSa/s | 2.10 GSa/s | -42% |
| buffer::f64_write | 3.55 GSa/s | 3.54 GSa/s | -0% |
| acc_q8::steps_1k | 3.30 GSa/s | 2.93 GSa/s | -11% |
| wfm_synth::steps_1k | 2.84 GSa/s | 4.08 GSa/s | +44% |
| lo::steps_1k | 2.62 GSa/s | 1.99 GSa/s | -24% |
| nco::steps_u32_ovf_1k | 2.28 GSa/s | 2.33 GSa/s | +2% |
| compose::reader[raw-cf32] | 1.70 GSa/s | 1.44 GSa/s | -15% |
| acc_cf64::steps[409600] | 1.61 GSa/s | 3.11 GSa/s | +93% |
| acc_cf64::steps[20480] | 1.50 GSa/s | 3.30 GSa/s | +120% |
| acc_cf64::steps[819200] | 1.50 GSa/s | 3.12 GSa/s | +109% |
| compose::reader[blue-cf32] | 1.48 GSa/s | 1.43 GSa/s | -4% |
| acc_cf64::steps[1024] | 1.46 GSa/s | 2.58 GSa/s | +77% |
| corr2d::execute_1k | 1.36 GSa/s | 1.01 GSa/s | -26% |
| lo::steps_ctrl_64k | 1.20 GSa/s | 901.3 MSa/s | -25% |
| adc::steps_64k | 1.11 GSa/s | 2.55 GSa/s | +130% |
| lo::steps_ctrl_1k | 1.01 GSa/s | 785.7 MSa/s | -22% |
| compose::reader[raw-ci16] | 922.5 MSa/s | 974.5 MSa/s | +6% |
| adc::steps_1k | 907.0 MSa/s | 1.65 GSa/s | +82% |
| corr::execute_1k | 810.2 MSa/s | 1.07 GSa/s | +32% |
| cic::decimate_R256 | 728.9 MSa/s | 701.5 MSa/s | -4% |
| awgn::generate_64k | 718.6 MSa/s | 698.8 MSa/s | -3% |
| cic::decimate_R32 | 709.3 MSa/s | 649.3 MSa/s | -8% |
| compose::writer[raw-cf32] | 695.7 MSa/s | 623.8 MSa/s | -10% |
| cic::decimate_R64 | 686.0 MSa/s | 683.9 MSa/s | -0% |
| awgn::generate_1k | 683.2 MSa/s | 654.0 MSa/s | -4% |
| RateConverter::cic_64k | 665.0 MSa/s | 612.1 MSa/s | -8% |
| cic::decimate_R8 | 663.8 MSa/s | 597.6 MSa/s | -10% |
| cic::decimate_1k | 651.3 MSa/s | 661.0 MSa/s | +1% |
| cic::decimate_64k | 624.5 MSa/s | 691.5 MSa/s | +11% |
| cic::decimate_R4 | 597.7 MSa/s | 556.1 MSa/s | -7% |
| compose::writer[blue-cf32] | 541.5 MSa/s | 517.9 MSa/s | -4% |
| compose::writer[raw-ci16] | 531.8 MSa/s | 494.9 MSa/s | -7% |
| fft2d::execute_cf32 | 524.0 MSa/s | 635.5 MSa/s | +21% |
| f32_to_i16u32::steps_64k | 457.3 MSa/s | 500.1 MSa/s | +9% |
| f32_to_i16u64::steps_64k | 456.5 MSa/s | 433.2 MSa/s | -5% |
| f32_to_uq15::steps_64k | 442.2 MSa/s | 429.3 MSa/s | -3% |
| f32_to_i16::steps_64k | 433.1 MSa/s | 425.1 MSa/s | -2% |
| f32_to_i16u32::steps_1k | 417.9 MSa/s | 454.5 MSa/s | +9% |
| f32_to_uq15::steps_1k | 400.1 MSa/s | 417.0 MSa/s | +4% |
| f32_to_i16u64::steps_1k | 393.9 MSa/s | 417.0 MSa/s | +6% |
| f32_to_i16::steps_1k | 388.3 MSa/s | 419.1 MSa/s | +8% |
| HalfbandDecimator::execute_64k | 381.6 MSa/s | 333.8 MSa/s | -13% |
| HalfbandDecimator::execute_1k | 381.6 MSa/s | 336.4 MSa/s | -12% |
| RateConverter::hb_64k | 365.6 MSa/s | 346.9 MSa/s | -5% |
| resample::hbdecim | 359.4 MSa/s | 315.0 MSa/s | -12% |
| RateConverter::cic_resamp_64k | 356.6 MSa/s | 342.9 MSa/s | -4% |
| hbdecim_q15::execute_1k | 318.5 MSa/s | 325.7 MSa/s | +2% |
| hbdecim_q15::execute_64k | 293.7 MSa/s | 347.9 MSa/s | +18% |
| RateConverter::hb2_64k | 276.9 MSa/s | 242.1 MSa/s | -13% |
| fft2d::execute_cf64 | 267.7 MSa/s | 498.1 MSa/s | +86% |
| ddcr_fn::oo_64k | 252.0 MSa/s | 250.1 MSa/s | -1% |
| ddcr::execute_64k | 247.4 MSa/s | 247.3 MSa/s | -0% |
| ddcr::execute_1k | 245.4 MSa/s | 228.3 MSa/s | -7% |
| ddcr_fn::oo_1k | 234.4 MSa/s | 225.2 MSa/s | -4% |
| ddcr_fn::fn_1k | 229.3 MSa/s | 231.7 MSa/s | +1% |
| ddcr_fn::fn_64k | 229.2 MSa/s | 227.8 MSa/s | -1% |
| ddc::execute_64k | 228.5 MSa/s | 209.7 MSa/s | -8% |
| ddc::execute_1k | 223.7 MSa/s | 177.9 MSa/s | -20% |
| fft::execute_cf64_1k | 193.8 MSa/s | 274.0 MSa/s | +41% |
| fft::execute_cf64_8k | 179.0 MSa/s | 243.8 MSa/s | +36% |
| RateConverter::resamp_64k | 160.7 MSa/s | 159.6 MSa/s | -1% |
| agc::steps_1k | 137.2 MSa/s | 131.4 MSa/s | -4% |
| agc::steps_64k | 136.1 MSa/s | 141.7 MSa/s | +4% |
| Resampler::execute_decim_64k | 135.8 MSa/s | 123.2 MSa/s | -9% |
| resample::resample_down | 135.1 MSa/s | 140.4 MSa/s | +4% |
| Resampler::execute_decim_1k | 124.2 MSa/s | 116.3 MSa/s | -6% |
| fft::execute_cf32_1k | 89.6 MSa/s | 222.9 MSa/s | +149% |
| fir::execute[20480] | 85.0 MSa/s | 245.3 MSa/s | +188% |
| fir::execute[1024] | 82.6 MSa/s | 252.9 MSa/s | +206% |
| fir::execute[409600] | 80.8 MSa/s | 236.3 MSa/s | +193% |
| resample::resample_up | 79.6 MSa/s | 88.8 MSa/s | +11% |
| Resampler::execute_interp_1k | 78.7 MSa/s | 92.4 MSa/s | +17% |
| fft::execute_cf32_8k | 77.9 MSa/s | 175.2 MSa/s | +125% |
| RateConverter::interp_1k | 77.2 MSa/s | 92.1 MSa/s | +19% |
| fir::execute[819200] | 73.3 MSa/s | 218.4 MSa/s | +198% |
| delay::push | 39.0 MSa/s | 41.4 MSa/s | +6% |
| compose::writer[csv-cf32] | 6.1 MSa/s | 5.7 MSa/s | -7% |
Latency
| benchmark | portable time/call | native time/call | from src |
|---|---|---|---|
| i16u32_to_f32::step | 26.29 ns | 26.98 ns | -3% |
| i16_to_f32::step | 26.31 ns | 28.52 ns | -8% |
| acc_q8::step | 26.32 ns | 25.52 ns | +3% |
| i32_to_f32::step | 26.62 ns | 28.50 ns | -7% |
| i8_to_f32::step | 27.27 ns | 27.07 ns | +1% |
| acc_q15::step | 27.96 ns | 28.93 ns | -3% |
| adc::step | 28.18 ns | 30.69 ns | -8% |
| uq15_to_f32::step | 28.43 ns | 27.26 ns | +4% |
| detection::det_threshold_power | 28.50 ns | 26.83 ns | +6% |
| i16u64_to_f32::step | 28.59 ns | 25.58 ns | +12% |
| detection::det_threshold | 30.37 ns | 33.47 ns | -9% |
| f32_to_i16u32::step | 31.33 ns | 31.26 ns | +0% |
| f32_to_uq15::step | 31.59 ns | 34.05 ns | -7% |
| f32_to_i16::step | 32.06 ns | 31.73 ns | +1% |
| f32_to_i16u64::step | 32.83 ns | 32.09 ns | +2% |
| detection::marcum_q_large_a | 36.67 ns | 36.71 ns | -0% |
| agc::step | 55.12 ns | 46.39 ns | +19% |
| timing::stamp | 62.07 ns | 64.95 ns | -4% |
| detection::det_pd_m1 | 70.30 ns | 74.03 ns | -5% |
| detection::marcum_q_m4 | 79.47 ns | 81.78 ns | -3% |
| detection::marcum_q_m1 | 87.36 ns | 90.82 ns | -4% |
| detection::det_pd_m4 | 93.45 ns | 104.34 ns | -10% |
| detection::det_pd_power_m4 | 96.28 ns | 103.56 ns | -7% |
| wfm_synth::step | 103.52 ns | 102.44 ns | +1% |
| timing::pace_nowait | 150.85 ns | 129.37 ns | +17% |
| detection::det_pd_m16 | 156.71 ns | 152.49 ns | +3% |
| detection::det_dwell | 8.17 µs | 7.65 µs | +7% |
| detection::det_dwell_power | 8.38 µs | 8.30 µs | +1% |
| detection::det_snr_m4 | 9.29 µs | 9.59 µs | -3% |
| detection::det_snr_power_m4 | 9.50 µs | 9.42 µs | +1% |
| detection::det_snr_m16 | 9.95 µs | 9.77 µs | +2% |
| detector2d::push_1k | 10.60 µs | 14.04 µs | -24% |
| detector::push_1k | 17.76 µs | 12.22 µs | +45% |
| detector2d::push_64k | 159.77 µs | 220.69 µs | -28% |
| detector::push_64k | 297.33 µs | 214.16 µs | +39% |
C (jm_bench)
Throughput
| benchmark | portable throughput | native throughput | from src |
|---|---|---|---|
| i16_to_f32::steps | 17.68 GSa/s | 32.58 GSa/s | +84% |
| i8_to_f32::steps | 14.21 GSa/s | 26.13 GSa/s | +84% |
| i32_to_f32::steps | 12.42 GSa/s | 19.42 GSa/s | +56% |
| i16u32_to_f32::steps | 11.39 GSa/s | 25.75 GSa/s | +126% |
| uq15_to_f32::steps | 8.47 GSa/s | 21.53 GSa/s | +154% |
| i16u64_to_f32::steps | 6.73 GSa/s | 10.86 GSa/s | +61% |
| acc_f32::steps | 6.67 GSa/s | 26.62 GSa/s | +299% |
| i8_to_f32::step | 3.78 GSa/s | 3.92 GSa/s | +4% |
| acc_q15::step | 3.31 GSa/s | 6.39 GSa/s | +93% |
| acc_q15::steps | 3.30 GSa/s | 6.39 GSa/s | +94% |
| i32_to_f32::step | 3.28 GSa/s | 3.22 GSa/s | -2% |
| i16u64_to_f32::step | 3.27 GSa/s | 3.25 GSa/s | -0% |
| i16_to_f32::step | 3.22 GSa/s | 3.28 GSa/s | +2% |
| i16u32_to_f32::step | 3.21 GSa/s | 2.97 GSa/s | -7% |
| acc_q8::steps | 2.82 GSa/s | 3.78 GSa/s | +34% |
| acc_q8::step | 2.80 GSa/s | 3.01 GSa/s | +8% |
| uq15_to_f32::step | 2.56 GSa/s | 3.25 GSa/s | +27% |
| acc_cf64::steps | 1.54 GSa/s | 3.34 GSa/s | +117% |
| acc_cf64::madd2d | 1.26 GSa/s | 1.26 GSa/s | -0% |
| acc_f32::madd | 1.26 GSa/s | 1.26 GSa/s | -0% |
| acc_cf64::madd | 1.26 GSa/s | 1.26 GSa/s | -0% |
| acc_f32::get | 1.26 GSa/s | 1.26 GSa/s | -0% |
| acc_f32::dump | 1.26 GSa/s | 1.26 GSa/s | +0% |
| acc_cf64::dump | 1.26 GSa/s | 1.26 GSa/s | -0% |
| acc_cf64::get | 1.26 GSa/s | 1.26 GSa/s | +0% |
| acc_f32::madd2d | 1.26 GSa/s | 1.26 GSa/s | +0% |
| acc_cf64::add2d | 1.25 GSa/s | 1.26 GSa/s | +1% |
| acc_f32::add2d | 1.25 GSa/s | 1.26 GSa/s | +1% |
| acc_f32::step | 1.14 GSa/s | 1.10 GSa/s | -3% |
| acc_cf64::step | 1.13 GSa/s | 1.18 GSa/s | +5% |
| adc::steps | 1.11 GSa/s | 2.34 GSa/s | +111% |
| adc::step | 875.4 MSa/s | 1.43 GSa/s | +64% |
| cic::decimate | 591.0 MSa/s | 663.8 MSa/s | +12% |
| delay::write | 559.3 MSa/s | 859.4 MSa/s | +54% |
| RateConverter::CIC(0.125) | 505.4 MSa/s | 504.6 MSa/s | -0% |
| f32_to_i16::steps | 457.4 MSa/s | 457.2 MSa/s | -0% |
| f32_to_uq15::steps | 456.4 MSa/s | 449.8 MSa/s | -1% |
| f32_to_i16u32::steps | 456.3 MSa/s | 443.1 MSa/s | -3% |
| delay::push | 444.7 MSa/s | 931.4 MSa/s | +109% |
| f32_to_i16::step | 410.0 MSa/s | 369.3 MSa/s | -10% |
| f32_to_uq15::step | 403.6 MSa/s | 390.0 MSa/s | -3% |
| f32_to_i16u32::step | 380.3 MSa/s | 357.1 MSa/s | -6% |
| f32_to_i16u64::steps | 375.8 MSa/s | 448.7 MSa/s | +19% |
| f32_to_i16u64::step | 374.3 MSa/s | 369.5 MSa/s | -1% |
| RateConverter::HB(0.5) | 321.7 MSa/s | 366.3 MSa/s | +14% |
| RateConverter::CIC+Rs(0.1) | 321.0 MSa/s | 295.1 MSa/s | -8% |
| RateConverter::HB2(0.25) | 272.3 MSa/s | 244.9 MSa/s | -10% |
| Resampler::reset | 229.3 MSa/s | 240.8 MSa/s | +5% |
| RateConverter::Resamp(1/3) | 165.2 MSa/s | 160.9 MSa/s | -3% |
| agc::steps | 139.7 MSa/s | 140.6 MSa/s | +1% |
| RateConverter::Interp(2.0) | 82.7 MSa/s | 102.2 MSa/s | +24% |
| agc::step | 34.8 MSa/s | 44.4 MSa/s | +28% |
Release history (portable build)
Portable-build throughput across releases, i.e. the numbers for the shipped wheel. Values are comparable only across releases measured on the same machine.
Python (pytest-benchmark)
| benchmark | v0.10.1 | v0.12.0 |
|---|---|---|
| corr2d::execute_64k | 79.11 GSa/s | 86.30 GSa/s |
| corr::execute_64k | 52.00 GSa/s | 47.73 GSa/s |
| i32_to_f32::steps_64k | 18.89 GSa/s | 18.91 GSa/s |
| i16_to_f32::steps_64k | 17.96 GSa/s | 17.97 GSa/s |
| wfm_synth::steps_64k | n/a | 15.92 GSa/s |
| i16u32_to_f32::steps_64k | 13.08 GSa/s | 13.15 GSa/s |
| i8_to_f32::steps_64k | 12.13 GSa/s | 13.06 GSa/s |
| uq15_to_f32::steps_64k | 11.87 GSa/s | 12.87 GSa/s |
| nco::steps_u32_64k | 9.10 GSa/s | 9.99 GSa/s |
| i16u64_to_f32::steps_64k | 9.47 GSa/s | 9.20 GSa/s |
| acc_f32::steps[20480] | 6.29 GSa/s | 6.52 GSa/s |
| acc_f32::steps[409600] | 6.25 GSa/s | 6.30 GSa/s |
| acc_f32::steps[819200] | 6.11 GSa/s | 6.05 GSa/s |
| delay::push_ptr_64k | 5.72 GSa/s | 5.79 GSa/s |
| nco::steps_u32_1k | 6.45 GSa/s | 5.65 GSa/s |
| delay::push_ptr_1k | 5.69 GSa/s | 5.43 GSa/s |
| i16_to_f32::steps_1k | 6.26 GSa/s | 5.42 GSa/s |
| i16u32_to_f32::steps_1k | 5.07 GSa/s | 5.16 GSa/s |
| acc_f32::steps[1024] | 4.68 GSa/s | 5.00 GSa/s |
| i8_to_f32::steps_1k | 5.86 GSa/s | 4.78 GSa/s |
| uq15_to_f32::steps_1k | 5.48 GSa/s | 4.76 GSa/s |
| acc_q15::steps_64k | 4.20 GSa/s | 4.59 GSa/s |
| i32_to_f32::steps_1k | 5.86 GSa/s | 4.58 GSa/s |
| nco::steps_u32_ovf_64k | 4.27 GSa/s | 4.25 GSa/s |
| buffer::f32_write | 3.92 GSa/s | 4.24 GSa/s |
| acc_q8::steps_64k | 4.05 GSa/s | 4.24 GSa/s |
| acc_q15::steps_1k | 4.00 GSa/s | 3.95 GSa/s |
| i16u64_to_f32::steps_1k | 4.29 GSa/s | 3.69 GSa/s |
| lo::steps_64k | 3.31 GSa/s | 3.61 GSa/s |
| buffer::f64_write | 2.99 GSa/s | 3.55 GSa/s |
| acc_q8::steps_1k | 3.08 GSa/s | 3.30 GSa/s |
| wfm_synth::steps_1k | n/a | 2.84 GSa/s |
| lo::steps_1k | 2.80 GSa/s | 2.62 GSa/s |
| nco::steps_u32_ovf_1k | 2.81 GSa/s | 2.28 GSa/s |
| compose::reader[raw-cf32] | n/a | 1.70 GSa/s |
| acc_cf64::steps[409600] | 1.36 GSa/s | 1.61 GSa/s |
| acc_cf64::steps[20480] | 1.53 GSa/s | 1.50 GSa/s |
| acc_cf64::steps[819200] | 1.40 GSa/s | 1.50 GSa/s |
| compose::reader[blue-cf32] | n/a | 1.48 GSa/s |
| acc_cf64::steps[1024] | 1.47 GSa/s | 1.46 GSa/s |
| corr2d::execute_1k | 1.32 GSa/s | 1.36 GSa/s |
| lo::steps_ctrl_64k | 1.20 GSa/s | 1.20 GSa/s |
| adc::steps_64k | 963.9 MSa/s | 1.11 GSa/s |
| lo::steps_ctrl_1k | 1.07 GSa/s | 1.01 GSa/s |
| compose::reader[raw-ci16] | n/a | 922.5 MSa/s |
| adc::steps_1k | 891.7 MSa/s | 907.0 MSa/s |
| corr::execute_1k | 803.4 MSa/s | 810.2 MSa/s |
| cic::decimate_R256 | 668.0 MSa/s | 728.9 MSa/s |
| awgn::generate_64k | 694.1 MSa/s | 718.6 MSa/s |
| cic::decimate_R32 | 691.9 MSa/s | 709.3 MSa/s |
| compose::writer[raw-cf32] | n/a | 695.7 MSa/s |
| cic::decimate_R64 | 665.4 MSa/s | 686.0 MSa/s |
| awgn::generate_1k | 708.5 MSa/s | 683.2 MSa/s |
| RateConverter::cic_64k | 601.3 MSa/s | 665.0 MSa/s |
| cic::decimate_R8 | 609.6 MSa/s | 663.8 MSa/s |
| cic::decimate_1k | 638.7 MSa/s | 651.3 MSa/s |
| cic::decimate_64k | 643.3 MSa/s | 624.5 MSa/s |
| cic::decimate_R4 | 570.0 MSa/s | 597.7 MSa/s |
| compose::writer[blue-cf32] | n/a | 541.5 MSa/s |
| compose::writer[raw-ci16] | n/a | 531.8 MSa/s |
| fft2d::execute_cf32 | 514.7 MSa/s | 524.0 MSa/s |
| f32_to_i16u32::steps_64k | 440.9 MSa/s | 457.3 MSa/s |
| f32_to_i16u64::steps_64k | 414.3 MSa/s | 456.5 MSa/s |
| f32_to_uq15::steps_64k | 445.3 MSa/s | 442.2 MSa/s |
| f32_to_i16::steps_64k | 419.9 MSa/s | 433.1 MSa/s |
| f32_to_i16u32::steps_1k | 417.1 MSa/s | 417.9 MSa/s |
| f32_to_uq15::steps_1k | 386.0 MSa/s | 400.1 MSa/s |
| f32_to_i16u64::steps_1k | 384.1 MSa/s | 393.9 MSa/s |
| f32_to_i16::steps_1k | 385.5 MSa/s | 388.3 MSa/s |
| HalfbandDecimator::execute_64k | 389.9 MSa/s | 381.6 MSa/s |
| HalfbandDecimator::execute_1k | 368.5 MSa/s | 381.6 MSa/s |
| RateConverter::hb_64k | 410.4 MSa/s | 365.6 MSa/s |
| resample::hbdecim | 344.5 MSa/s | 359.4 MSa/s |
| RateConverter::cic_resamp_64k | 352.5 MSa/s | 356.6 MSa/s |
| hbdecim_q15::execute_1k | 306.8 MSa/s | 318.5 MSa/s |
| hbdecim_q15::execute_64k | 331.2 MSa/s | 293.7 MSa/s |
| RateConverter::hb2_64k | 257.8 MSa/s | 276.9 MSa/s |
| fft2d::execute_cf64 | 286.4 MSa/s | 267.7 MSa/s |
| ddcr_fn::oo_64k | 252.7 MSa/s | 252.0 MSa/s |
| ddcr::execute_64k | 251.6 MSa/s | 247.4 MSa/s |
| ddcr::execute_1k | 233.6 MSa/s | 245.4 MSa/s |
| ddcr_fn::oo_1k | 233.7 MSa/s | 234.4 MSa/s |
| ddcr_fn::fn_1k | 234.2 MSa/s | 229.3 MSa/s |
| ddcr_fn::fn_64k | 231.9 MSa/s | 229.2 MSa/s |
| ddc::execute_64k | 241.3 MSa/s | 228.5 MSa/s |
| ddc::execute_1k | 228.7 MSa/s | 223.7 MSa/s |
| fft::execute_cf64_1k | 199.1 MSa/s | 193.8 MSa/s |
| fft::execute_cf64_8k | 166.7 MSa/s | 179.0 MSa/s |
| RateConverter::resamp_64k | 145.6 MSa/s | 160.7 MSa/s |
| agc::steps_1k | 128.3 MSa/s | 137.2 MSa/s |
| agc::steps_64k | 137.6 MSa/s | 136.1 MSa/s |
| Resampler::execute_decim_64k | 129.9 MSa/s | 135.8 MSa/s |
| resample::resample_down | 133.1 MSa/s | 135.1 MSa/s |
| Resampler::execute_decim_1k | 125.6 MSa/s | 124.2 MSa/s |
| fft::execute_cf32_1k | 177.1 MSa/s | 89.6 MSa/s |
| fir::execute[20480] | 80.6 MSa/s | 85.0 MSa/s |
| fir::execute[1024] | 75.6 MSa/s | 82.6 MSa/s |
| fir::execute[409600] | 82.0 MSa/s | 80.8 MSa/s |
| resample::resample_up | 74.5 MSa/s | 79.6 MSa/s |
| Resampler::execute_interp_1k | 77.0 MSa/s | 78.7 MSa/s |
| fft::execute_cf32_8k | 159.4 MSa/s | 77.9 MSa/s |
| RateConverter::interp_1k | 75.6 MSa/s | 77.2 MSa/s |
| fir::execute[819200] | 71.7 MSa/s | 73.3 MSa/s |
| delay::push | 39.7 MSa/s | 39.0 MSa/s |
| compose::writer[csv-cf32] | n/a | 6.1 MSa/s |
C (jm_bench)
| benchmark | v0.10.1 | v0.12.0 |
|---|---|---|
| i16_to_f32::steps | 12.71 GSa/s | 17.68 GSa/s |
| i8_to_f32::steps | 7.25 GSa/s | 14.21 GSa/s |
| i32_to_f32::steps | 8.95 GSa/s | 12.42 GSa/s |
| i16u32_to_f32::steps | 9.29 GSa/s | 11.39 GSa/s |
| uq15_to_f32::steps | 12.56 GSa/s | 8.47 GSa/s |
| i16u64_to_f32::steps | 6.91 GSa/s | 6.73 GSa/s |
| acc_f32::steps | 6.70 GSa/s | 6.67 GSa/s |
| i8_to_f32::step | 2.45 GSa/s | 3.78 GSa/s |
| acc_q15::step | 3.30 GSa/s | 3.31 GSa/s |
| acc_q15::steps | 3.31 GSa/s | 3.30 GSa/s |
| i32_to_f32::step | 2.36 GSa/s | 3.28 GSa/s |
| i16u64_to_f32::step | 2.38 GSa/s | 3.27 GSa/s |
| i16_to_f32::step | 3.15 GSa/s | 3.22 GSa/s |
| i16u32_to_f32::step | 3.06 GSa/s | 3.21 GSa/s |
| acc_q8::steps | 3.12 GSa/s | 2.82 GSa/s |
| acc_q8::step | 2.81 GSa/s | 2.80 GSa/s |
| uq15_to_f32::step | 2.71 GSa/s | 2.56 GSa/s |
| acc_cf64::steps | 1.57 GSa/s | 1.54 GSa/s |
| acc_cf64::madd2d | 1.26 GSa/s | 1.26 GSa/s |
| acc_f32::madd | 1.26 GSa/s | 1.26 GSa/s |
| acc_cf64::madd | 1.26 GSa/s | 1.26 GSa/s |
| acc_f32::get | 1.23 GSa/s | 1.26 GSa/s |
| acc_f32::dump | 1.26 GSa/s | 1.26 GSa/s |
| acc_cf64::dump | 1.26 GSa/s | 1.26 GSa/s |
| acc_cf64::get | 1.26 GSa/s | 1.26 GSa/s |
| acc_f32::madd2d | 1.26 GSa/s | 1.26 GSa/s |
| acc_cf64::add2d | 1.25 GSa/s | 1.25 GSa/s |
| acc_f32::add2d | 1.25 GSa/s | 1.25 GSa/s |
| acc_f32::step | 1.14 GSa/s | 1.14 GSa/s |
| acc_cf64::step | 1.38 GSa/s | 1.13 GSa/s |
| adc::steps | 1.06 GSa/s | 1.11 GSa/s |
| adc::step | 801.2 MSa/s | 875.4 MSa/s |
| cic::decimate | 686.4 MSa/s | 591.0 MSa/s |
| delay::write | 392.2 MSa/s | 559.3 MSa/s |
| RateConverter::CIC(0.125) | 505.3 MSa/s | 505.4 MSa/s |
| f32_to_i16::steps | 320.2 MSa/s | 457.4 MSa/s |
| f32_to_uq15::steps | 365.4 MSa/s | 456.4 MSa/s |
| f32_to_i16u32::steps | 320.1 MSa/s | 456.3 MSa/s |
| delay::push | 354.3 MSa/s | 444.7 MSa/s |
| f32_to_i16::step | 300.0 MSa/s | 410.0 MSa/s |
| f32_to_uq15::step | 301.3 MSa/s | 403.6 MSa/s |
| f32_to_i16u32::step | 288.9 MSa/s | 380.3 MSa/s |
| f32_to_i16u64::steps | 319.7 MSa/s | 375.8 MSa/s |
| f32_to_i16u64::step | 285.9 MSa/s | 374.3 MSa/s |
| RateConverter::HB(0.5) | 381.8 MSa/s | 321.7 MSa/s |
| RateConverter::CIC+Rs(0.1) | 320.6 MSa/s | 321.0 MSa/s |
| RateConverter::HB2(0.25) | 269.2 MSa/s | 272.3 MSa/s |
| Resampler::reset | 202.2 MSa/s | 229.3 MSa/s |
| RateConverter::Resamp(1/3) | 167.9 MSa/s | 165.2 MSa/s |
| agc::steps | 123.2 MSa/s | 139.7 MSa/s |
| RateConverter::Interp(2.0) | 82.6 MSa/s | 82.7 MSa/s |
| agc::step | 36.0 MSa/s | 34.8 MSa/s |