Performance benchmarks for mkl_fft using Airspeed Velocity (ASV).
Set MKL_NUM_THREADS in the environment before running ASV to control the
thread count used by MKL:

    MKL_NUM_THREADS=8 asv run --python=same --quick HEAD^!

If MKL_NUM_THREADS is not set, __init__.py applies a default: 4 threads
when the machine has 4 or more physical cores, or 1 (single-threaded)
otherwise. This keeps results comparable across CI machines in the shared pool
regardless of their total core count. Physical cores are detected via
psutil.cpu_count(logical=False); hyperthreads are excluded, following MKL's
recommendation.
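The default rule described above can be sketched roughly as follows. This is an illustration only, not the actual contents of __init__.py: the helper names (default_thread_count, detect_physical_cores, apply_default_threads) are hypothetical, and the sketch assumes that an undetectable core count (psutil missing, or cpu_count returning None) falls back to a single thread.

```python
import os


def default_thread_count(physical_cores):
    """Return 4 when at least 4 physical cores are known, else 1."""
    if physical_cores is not None and physical_cores >= 4:
        return 4
    return 1


def detect_physical_cores():
    """Return the physical core count, or None if it cannot be determined."""
    try:
        import psutil  # third-party; imported lazily so the sketch runs without it
    except ImportError:
        return None
    # logical=False excludes hyperthreads; psutil may return None on some systems.
    return psutil.cpu_count(logical=False)


def apply_default_threads(environ=os.environ, physical_cores=None):
    """Set MKL_NUM_THREADS only if the caller has not already set it."""
    if "MKL_NUM_THREADS" not in environ:
        if physical_cores is None:
            physical_cores = detect_physical_cores()
        environ["MKL_NUM_THREADS"] = str(default_thread_count(physical_cores))
    return environ["MKL_NUM_THREADS"]
```

Because an explicit MKL_NUM_THREADS always wins, running with MKL_NUM_THREADS=8 in the environment bypasses the default entirely. In this sketch the variable would need to be set before MKL is first loaded for it to take effect.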
MKL creates a DFTI descriptor on the first FFT call for a given (size, dtype,
strides) combination and reuses it on subsequent calls. To avoid charging
that one-time cost to the first measured iteration, each benchmark's setup
performs an explicit warmup call after preparing the input array. ASV's
default warmup_time (0.1s) already amortizes this for sub-millisecond
transforms, but the explicit warmup makes the intent visible.
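A minimal sketch of the warmup pattern in an ASV timing benchmark is shown here. The class name, parameter values, and input data are hypothetical and not taken from the actual benchmark suite; the fallback to numpy.fft is only there so the sketch runs where mkl_fft is not installed.

```python
import numpy as np

try:
    import mkl_fft
    _fft = mkl_fft.fft
except ImportError:
    # Stand-in so the sketch is runnable without mkl_fft.
    _fft = np.fft.fft


class TimeFFT1D:
    """Hypothetical ASV benchmark: 1-D complex FFT at a few sizes."""

    params = [64, 1024]
    param_names = ["n"]

    def setup(self, n):
        # Prepare the input array first, so the warmup sees the same
        # size, dtype, and strides as the timed call.
        rng = np.random.default_rng(0)
        self.x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
        # Explicit warmup: pays the one-time descriptor cost outside
        # the timed region.
        _fft(self.x)

    def time_fft(self, n):
        # Timed call; with mkl_fft this is expected to reuse the
        # descriptor created during setup.
        _fft(self.x)
```

ASV calls setup before timing each parameter combination, so the warmup call is never included in the reported time.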
Prerequisites:

    pip install ".[benchmark]"

Run benchmarks against the current environment:

    asv run --python=same --quick HEAD^!

Compare two commits:

    asv continuous --python=same HEAD~1 HEAD

View results in a browser:

    asv publish
    asv preview