tilelang.profiler.bench¶
Profiler and benchmarking utilities for PyTorch functions.
Attributes¶
Classes¶
- suppress_stdout_stderr: Context manager to suppress stdout and stderr output.
Functions¶
- do_bench: Benchmark the runtime of a PyTorch function with L2 cache management.
Module Contents¶
- class tilelang.profiler.bench.suppress_stdout_stderr¶
Context manager to suppress stdout and stderr output.
Source: https://github.com/deepseek-ai/DeepGEMM/blob/main/deep_gemm/testing/bench.py
- __enter__()¶
- __exit__(*_)¶
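The upstream implementation (see the DeepGEMM link above) silences output at the OS file-descriptor level so that prints from C/CUDA code are suppressed as well. The sketch below is a simplified, Python-level stand-in that shows the same interface and restore-on-exit behavior; it is illustrative, not tilelang's actual code.

```python
import os
import sys

class suppress_stdout_stderr:
    """Silence stdout and stderr inside a with-block (simplified sketch).

    Note: redirecting sys.stdout/sys.stderr only silences Python-level
    output; the real class redirects the underlying file descriptors.
    """

    def __enter__(self):
        self._devnull = open(os.devnull, "w")
        self._old_out, self._old_err = sys.stdout, sys.stderr
        sys.stdout, sys.stderr = self._devnull, self._devnull
        return self

    def __exit__(self, *_):
        # Restore the original streams, then release the devnull handle.
        sys.stdout, sys.stderr = self._old_out, self._old_err
        self._devnull.close()
```

Typical use is wrapping a noisy warmup or compilation step, e.g. `with suppress_stdout_stderr(): fn()`.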
- tilelang.profiler.bench.IS_CUDA¶
- tilelang.profiler.bench.device = 'cuda:0'¶
- tilelang.profiler.bench.Event¶
- tilelang.profiler.bench.do_bench(fn, warmup=25, rep=100, _n_warmup=0, _n_repeat=0, quantiles=None, fast_flush=True, backend='event', return_mode='mean')¶
Benchmark the runtime of a PyTorch function with L2 cache management.
This function provides accurate GPU kernel timing by:
- Clearing L2 cache between runs for consistent measurements
- Auto-calculating warmup and repeat counts based on kernel runtime
- Supporting multiple profiling backends (CUDA events or CUPTI)
- Offering flexible result aggregation (mean/median/min/max/quantiles)
- Parameters:
fn (Callable) – Function to benchmark
warmup (float) – Target warmup time in milliseconds (default: 25)
rep (float) – Target total benchmark time in milliseconds (default: 100)
_n_warmup (int) – Manual override for warmup iterations (default: 0 = auto)
_n_repeat (int) – Manual override for benchmark iterations (default: 0 = auto)
quantiles (list[float] | None) – Performance percentiles to compute (e.g., [0.5, 0.95])
fast_flush (bool) – Flush the L2 cache with an int32 buffer instead of int8, which is faster (default: True)
backend (Literal['event', 'cupti']) – Profiler backend - “event” (CUDA events) or “cupti” (default: “event”)
return_mode (Literal['min', 'max', 'mean', 'median']) – Result aggregation method - “mean”, “median”, “min”, or “max”
- Returns:
Runtime in milliseconds (float) or list of quantile values if quantiles specified
- Return type:
float | list[float]
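The `warmup` and `rep` parameters are time budgets, not iteration counts: the function first estimates the kernel's runtime, then sizes the warmup and measurement loops to fill those budgets (unless `_n_warmup` / `_n_repeat` override them). A minimal sketch of that calibration logic, assuming a single timed estimate in milliseconds; the name `calibrate` and the exact rounding are illustrative, not tilelang's internals:

```python
def calibrate(estimate_ms: float, warmup: float = 25, rep: float = 100,
              _n_warmup: int = 0, _n_repeat: int = 0) -> tuple[int, int]:
    """Turn time budgets (ms) into iteration counts, given one runtime estimate."""
    # Manual overrides win; otherwise fit as many iterations as the budget allows,
    # with a floor of 1 so very slow kernels are still measured at least once.
    n_warmup = _n_warmup if _n_warmup > 0 else max(1, int(warmup / estimate_ms))
    n_repeat = _n_repeat if _n_repeat > 0 else max(1, int(rep / estimate_ms))
    return n_warmup, n_repeat
```

A typical call site then looks like `do_bench(lambda: kernel(a, b), warmup=25, rep=100)`, with a fast kernel yielding many iterations and a slow one only a few.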