File jm_perf.h
just-makeit performance annotation macros.
#include "jm_simd.h"
Macros
| Type | Name |
|---|---|
| define | JM_ALIGNED (n) [**\_JM\_ALIGNED\_**](jm__perf_8h.md#define-_jm_aligned_) (n) |
| define | JM_ASSUME_ALIGNED (ptr, n) [**\_JM\_ASSUME\_ALIGNED\_**](jm__perf_8h.md#define-_jm_assume_aligned_) (ptr, n)<br>Inform the compiler that ptr is aligned to n bytes. |
| define | JM_FORCEINLINE [**\_JM\_FORCEINLINE\_**](jm__perf_8h.md#define-_jm_forceinline_) |
| define | JM_HOT [**\_JM\_HOT\_**](jm__perf_8h.md#define-_jm_hot_) |
| define | JM_LIKELY (x) [**\_JM\_LIKELY\_**](jm__perf_8h.md#define-_jm_likely_) (x) |
| define | JM_PREFETCH (ptr, rw, loc) [**\_JM\_PREFETCH\_**](jm__perf_8h.md#define-_jm_prefetch_) (ptr, rw, loc)<br>Issue a non-blocking prefetch hint to the CPU. |
| define | JM_RESTRICT [**\_JM\_RESTRICT\_**](jm__perf_8h.md#define-_jm_restrict_) |
| define | JM_UNLIKELY (x) [**\_JM\_UNLIKELY\_**](jm__perf_8h.md#define-_jm_unlikely_) (x) |
| define | JM_UNROLL (n) [**\_JM\_UNROLL\_**](jm__perf_8h.md#define-_jm_unroll_) (n)<br>Unroll the immediately following for-loop exactly n times. |
| define | _JM_ALIGNED_ (n) |
| define | _JM_ASSUME_ALIGNED_ (p, n) (p) |
| define | _JM_FORCEINLINE_ inline |
| define | _JM_HOT_ |
| define | _JM_LIKELY_ (x) (x) |
| define | _JM_PREFETCH_ (p, rw, loc) |
| define | _JM_RESTRICT_ restrict |
| define | _JM_UNLIKELY_ (x) (x) |
| define | _JM_UNROLL_ (n) |
Detailed Description
Portable hints for the compiler and CPU: hot functions, forced inlining, restrict aliasing, branch prediction, alignment, prefetch, and loop unrolling. All macros degrade gracefully to safe no-ops on unknown compilers.
Include order: jm_perf.h includes jm_simd.h automatically. Either header may be included standalone — jm_simd.h guards against redefining JM_RESTRICT if jm_perf.h was already included.
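One common way to get this kind of graceful degradation is a two-level macro: a public name forwards to an internal one, and the internal one is selected per compiler. The sketch below illustrates that general pattern only. It is an assumption, not the header's actual source, and the `EX_` names and `ex_count_non_negative` are hypothetical, chosen so they cannot clash with the real `JM_` macros.

```c
#include <stddef.h>

/* Hypothetical EX_ macros illustrating the fallback pattern; these are
   not the definitions used by jm_perf.h. */
#if defined(__GNUC__) || defined(__clang__)
#  define EX_LIKELY_IMPL(x) __builtin_expect(!!(x), 1)
#  define EX_HOT_IMPL       __attribute__((hot))
#else
/* Unknown compiler: keep the expression, drop the hint. */
#  define EX_LIKELY_IMPL(x) (x)
#  define EX_HOT_IMPL
#endif

/* Public names forward to whichever implementation was selected. */
#define EX_LIKELY(x) EX_LIKELY_IMPL(x)
#define EX_HOT       EX_HOT_IMPL

/* Counts non-negative values; the result is the same with or without hints. */
EX_HOT size_t ex_count_non_negative(const int *v, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (EX_LIKELY(v[i] >= 0))
            count++;
    }
    return count;
}
```

Because the fallback branch expands to the bare expression (or to nothing), code written against the public names compiles unchanged on a compiler the header does not recognise.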
Usage:

    #include <stddef.h>   /* size_t */
    #include "jm_perf.h"

    JM_HOT static void process(const float * JM_RESTRICT in,
                               float * JM_RESTRICT out, size_t n)
    {
        JM_PREFETCH(in + 16, 0, 1);
        JM_UNROLL(4)
        for (size_t i = 0; i < n; i++)
            out[i] = in[i] * 2.0f;
    }
Macro Definition Documentation
define JM_ALIGNED
Align a variable or struct member to n bytes.
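A minimal sketch of both placements, on a struct member and on a variable. The stand-in definition uses C11 `_Alignas`, which is an assumption made so the example compiles without the header; the real expansion may differ per compiler. `ex_vec4`, `ex_scratch`, and `ex_is_aligned` are hypothetical names.

```c
#include <stdint.h>

/* In a real project, include "jm_perf.h" instead. The stand-in below is an
   assumption (C11 _Alignas) so that the sketch compiles on its own. */
#ifndef JM_ALIGNED
#  define JM_ALIGNED(n) _Alignas(n)
#endif

/* Hypothetical struct: the member, and therefore the struct, is 16-byte aligned. */
struct ex_vec4 {
    JM_ALIGNED(16) float v[4];
};

/* A 32-byte aligned scratch buffer, e.g. for 256-bit SIMD loads. */
static JM_ALIGNED(32) float ex_scratch[8];

/* Returns 1 when p is a multiple of n bytes, 0 otherwise. */
int ex_is_aligned(const void *p, uintptr_t n)
{
    return ((uintptr_t)p % n) == 0;
}
```

Placing the macro before the type keeps the declaration valid whether the macro expands to an alignment specifier, a compiler attribute, or nothing at all.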
define JM_ASSUME_ALIGNED
Inform the compiler that ptr is aligned to n bytes.
Enables aligned SIMD loads/stores on ISAs that penalise unaligned access. Returns ptr so it can be used in expressions. Falls back to ptr unchanged on unknown compilers.
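Since the macro returns the pointer, a typical pattern is to rebind the arguments once at the top of a function. The sketch below assumes the caller really does supply 32-byte aligned buffers; the stand-in mirrors the documented fallback, and `ex_scale` is a hypothetical function.

```c
#include <stddef.h>

/* In a real project, include "jm_perf.h" instead. This stand-in mirrors the
   documented fallback, which returns the pointer unchanged. */
#ifndef JM_ASSUME_ALIGNED
#  define JM_ASSUME_ALIGNED(ptr, n) (ptr)
#endif

/* Scales n floats by k. The caller must guarantee that both buffers are
   32-byte aligned; a misaligned pointer breaks the promise made here. */
void ex_scale(float *dst, const float *src, size_t n, float k)
{
    float *d = JM_ASSUME_ALIGNED(dst, 32);
    const float *s = JM_ASSUME_ALIGNED(src, 32);

    for (size_t i = 0; i < n; i++)
        d[i] = s[i] * k;
}
```

The macro is a promise rather than a check: it does not align anything, so the buffers still need to be allocated or declared with the required alignment.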
define JM_FORCEINLINE
Force inlining regardless of the compiler's cost model.
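A short sketch of where the macro would sit in a declaration, assuming it is combined with `static` on a small helper. The stand-in mirrors the documented fallback (`inline`); `ex_clampf` and `ex_clamp_buffer` are hypothetical.

```c
#include <stddef.h>

/* In a real project, include "jm_perf.h" instead. This stand-in mirrors the
   documented fallback (plain inline). */
#ifndef JM_FORCEINLINE
#  define JM_FORCEINLINE inline
#endif

/* Tiny helper called once per sample: a typical candidate for forced inlining. */
static JM_FORCEINLINE float ex_clampf(float x, float lo, float hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Clamps every element of buf to [lo, hi] in place. */
void ex_clamp_buffer(float *buf, size_t n, float lo, float hi)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = ex_clampf(buf[i], lo, hi);
}
```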
define JM_HOT
Mark a function as performance-critical (hot section).
define JM_LIKELY
Hint that x is almost always true.
define JM_PREFETCH
Issue a non-blocking prefetch hint to the CPU.
Parameters:
* `ptr` Address to prefetch.
* `rw` 0 = read (PLD), 1 = write (PSTL).
* `loc` Cache level: 3 = L1 (hottest), 0 = NTA (streaming).
Prefetching 1–2 cache lines ahead of the load keeps the data pipeline full on high-latency memory. No-op on unknown compilers.
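The following sketch prefetches one cache line ahead inside a streaming loop. It assumes 64-byte cache lines (16 floats), which is not stated by the header and should be checked for the target CPU. The stand-in mirrors the documented no-op fallback; `EX_PREFETCH_AHEAD` and `ex_sum` are hypothetical.

```c
#include <stddef.h>

/* In a real project, include "jm_perf.h" instead. This stand-in mirrors the
   documented fallback (expands to nothing). */
#ifndef JM_PREFETCH
#  define JM_PREFETCH(ptr, rw, loc)
#endif

/* Assumption: 64-byte cache lines, i.e. 16 floats per line. */
#define EX_PREFETCH_AHEAD 16

/* Sums n floats, prefetching one cache line ahead for reading (rw = 0)
   into the closest cache level (loc = 3). */
float ex_sum(const float *in, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* Stay inside the array so no out-of-range pointer is formed. */
        if (i + EX_PREFETCH_AHEAD < n) {
            JM_PREFETCH(&in[i + EX_PREFETCH_AHEAD], 0, 3);
        }
        acc += in[i];
    }
    return acc;
}
```

This version issues a hint on every iteration for simplicity. A tuned loop would more likely issue one hint per cache line, and the right distance is something to measure rather than assume.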
define JM_RESTRICT
Assert a pointer does not alias any other; enables vectorisation.
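A minimal sketch of the qualifier on function parameters. The stand-in mirrors the documented fallback (C99 `restrict`); `ex_add` is a hypothetical function.

```c
#include <stddef.h>

/* In a real project, include "jm_perf.h" instead. This stand-in mirrors the
   documented fallback (C99 restrict). */
#ifndef JM_RESTRICT
#  define JM_RESTRICT restrict
#endif

/* Adds two arrays element-wise. The qualifiers promise that out does not
   overlap a or b for the duration of the call. */
void ex_add(float * JM_RESTRICT out, const float * JM_RESTRICT a,
            const float * JM_RESTRICT b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

The promise is the caller's responsibility: a call such as `ex_add(x, x, y, n)` writes through `out` to memory also read through `a`, which violates the qualifier and is undefined behaviour.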
define JM_UNLIKELY
Hint that x is almost never true.
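The two branch hints, JM_LIKELY and JM_UNLIKELY, are typically used together: one marks a rare error path, the other the common case inside a loop. The sketch below assumes mostly-ASCII input purely for illustration. The stand-ins mirror the documented fallbacks; `ex_count_non_ascii` is hypothetical.

```c
#include <stddef.h>

/* In a real project, include "jm_perf.h" instead. These stand-ins mirror the
   documented fallbacks, which evaluate to the bare expression. */
#ifndef JM_LIKELY
#  define JM_LIKELY(x) (x)
#endif
#ifndef JM_UNLIKELY
#  define JM_UNLIKELY(x) (x)
#endif

/* Counts bytes with the high bit set; returns -1 on a NULL buffer. */
long ex_count_non_ascii(const unsigned char *buf, size_t n)
{
    /* Error path: expected to be rare. */
    if (JM_UNLIKELY(buf == NULL))
        return -1;

    long count = 0;
    for (size_t i = 0; i < n; i++) {
        /* Assumption for this sketch: the input is mostly ASCII. */
        if (JM_LIKELY(buf[i] < 0x80))
            continue;
        count++;
    }
    return count;
}
```

The hints never change the result, only the code layout the compiler may choose, so a wrong hint costs performance rather than correctness.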
define JM_UNROLL
Unroll the immediately following for-loop exactly n times.
When applied before a for loop, this instructs GCC/Clang to unroll it unconditionally by the given factor, regardless of the compiler's own cost model. Use it only on tight, well-measured inner loops with a known iteration count: a large n on a non-trivial body will bloat code size and increase instruction-cache pressure.
No-op on compilers that do not support the pragma.
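A sketch of the intended kind of loop: small body, fixed trip count, unroll factor that divides it evenly. The stand-in mirrors the documented no-op fallback; `EX_TAPS` and `ex_dot8` are hypothetical.

```c
#include <stddef.h>

/* In a real project, include "jm_perf.h" instead. This stand-in mirrors the
   documented fallback (expands to nothing). */
#ifndef JM_UNROLL
#  define JM_UNROLL(n)
#endif

#define EX_TAPS 8  /* hypothetical fixed trip count */

/* Dot product over a fixed, small number of taps. */
float ex_dot8(const float *a, const float *b)
{
    float acc = 0.0f;
    /* The macro must sit directly before the for statement, with no
       semicolon after it. */
    JM_UNROLL(4)
    for (size_t i = 0; i < EX_TAPS; i++)
        acc += a[i] * b[i];
    return acc;
}
```

Whether a factor of 4 helps here is something to confirm with a benchmark; the value is chosen only to match the usage example in this file.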
define _JM_ALIGNED_
define _JM_ASSUME_ALIGNED_
define _JM_FORCEINLINE_
define _JM_HOT_
define _JM_LIKELY_
define _JM_PREFETCH_
define _JM_RESTRICT_
define _JM_UNLIKELY_
define _JM_UNROLL_
The documentation for this file was generated from `native/inc/jm_perf.h`.