tilelang.profiler package#

Submodules#

Module contents#

Profiler and torch-conversion utilities.

class tilelang.profiler.Profiler(params: List[KernelParam], result_idx: List[int], supply_type: TensorSupplyType, adapter: Optional[BaseKernelAdapter] = None)#

Bases: object

A profiler class for benchmarking and validating kernel implementations.

params#

List of kernel parameters defining the input/output specifications

Type:

List[tilelang.engine.param.KernelParam]

result_idx#

Indices indicating which parameters are output tensors

Type:

List[int]

supply_type#

Type of tensor supply to use (e.g., random or zeros)

Type:

tilelang.utils.tensor.TensorSupplyType

adapter#

Optional kernel adapter for interfacing with different backends

Type:

Optional[tilelang.jit.adapter.base.BaseKernelAdapter]
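The attributes above drive a simple contract: tensors are generated for every parameter not listed in `result_idx`, while the indices in `result_idx` mark which parameters the kernel writes its outputs into. The following is a pure-Python sketch of that split (a hypothetical `split_params` helper, not part of tilelang):

```python
# Illustrative stand-in for how params and result_idx interact: parameters
# whose index appears in result_idx are outputs, the rest are inputs that
# the tensor supply must generate. This is NOT the real tilelang code.
from typing import List, Tuple


def split_params(params: List[str], result_idx: List[int]) -> Tuple[List[str], List[str]]:
    """Separate input parameter names from output parameter names by index."""
    inputs = [p for i, p in enumerate(params) if i not in result_idx]
    outputs = [p for i, p in enumerate(params) if i in result_idx]
    return inputs, outputs


# For a kernel C = A @ B with params (A, B, C), result_idx=[2] marks C as output.
inputs, outputs = split_params(["A", "B", "C"], result_idx=[2])
```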

adapter: Optional[BaseKernelAdapter] = None#
assert_allclose(reference_program: Callable, input_tensors: Optional[List[torch.Tensor]] = None, atol: float = 0.01, rtol: float = 0.01, max_mismatched_ratio=0.01)#

Validates kernel output against a reference implementation.

Parameters:
  • reference_program – Reference implementation to compare against

  • input_tensors – Optional pre-generated input tensors

  • atol – Absolute tolerance for comparison

  • rtol – Relative tolerance for comparison

  • max_mismatched_ratio – Maximum allowed ratio of mismatched elements
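The tolerance semantics documented above can be sketched as follows: an element counts as a mismatch when it fails the combined absolute/relative bound, and validation fails only when the mismatch ratio exceeds `max_mismatched_ratio`. This is a pure-Python stand-in with the same parameter names; the real method operates on torch tensors:

```python
# Sketch of the atol/rtol/max_mismatched_ratio check, assuming the usual
# elementwise bound |a - e| <= atol + rtol * |e|. Hypothetical helper,
# not the actual tilelang implementation.
from typing import Sequence


def check_allclose(actual: Sequence[float], expected: Sequence[float],
                   atol: float = 0.01, rtol: float = 0.01,
                   max_mismatched_ratio: float = 0.01) -> None:
    mismatched = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    ratio = mismatched / max(len(expected), 1)
    if ratio > max_mismatched_ratio:
        raise AssertionError(
            f"{mismatched}/{len(expected)} elements mismatched "
            f"(ratio {ratio:.4f} > {max_mismatched_ratio})"
        )


check_allclose([1.0, 2.0, 3.0], [1.0, 2.001, 3.0])  # within tolerance
```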

assert_consistent(repeat=10)#

Checks for kernel consistency across multiple runs.

Parameters:

repeat – Number of times to repeat the consistency check
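A consistency check of this kind runs the kernel repeatedly on the same inputs and verifies every run matches the first. A minimal sketch, assuming a hypothetical deterministic `kernel` callable rather than a tilelang kernel:

```python
# Sketch of a repeat-and-compare consistency check: the first run is the
# reference, and any later run that differs fails the check. Pure-Python
# stand-in, not the real tilelang implementation.
def check_consistent(kernel, inputs, repeat: int = 10) -> None:
    reference = kernel(*inputs)
    for _ in range(repeat - 1):
        if kernel(*inputs) != reference:
            raise AssertionError("kernel output varies across runs")
```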

determine_profiler(func: Optional[Callable] = None)#

Determines which profiler backend to use based on function type.

Parameters:
  • func – Function to be profiled

  • profiler – Explicitly specified profiler type or “auto” for automatic detection

Returns:

The determined profiler type (“torch” or “tvm”)

Return type:

str

do_bench(func: Optional[Callable] = None, warmup: int = 25, rep: int = 100, n_warmup: int = 1, n_repeat: int = 1, input_tensors: Optional[List[torch.Tensor]] = None) → float#

Benchmarks the execution time of a given function.

Parameters:
  • func – Function to benchmark (uses adapter if None)

  • warmup – Warmup time in milliseconds

  • rep – Number of repetitions for timing

  • n_warmup – Number of warmup iterations

  • n_repeat – Number of timing iterations

  • profiler – Which profiling backend to use

  • input_tensors – Optional pre-generated input tensors

Returns:

Average execution time in milliseconds

Return type:

float
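The contract above (warm up for roughly `warmup` milliseconds, then average `rep` timed invocations) can be sketched with a pure-Python timing loop. The real benchmark uses proper device synchronization; this stand-in only shows the shape of the measurement:

```python
# Minimal timing loop mirroring do_bench's documented contract: a warmup
# phase bounded by `warmup` milliseconds, then `rep` timed calls whose
# average latency is returned in milliseconds. Hypothetical sketch, not
# the tilelang implementation.
import time


def do_bench_sketch(func, warmup: float = 25.0, rep: int = 100) -> float:
    deadline = time.perf_counter() + warmup / 1e3
    while time.perf_counter() < deadline:  # warmup phase
        func()
    start = time.perf_counter()
    for _ in range(rep):  # timed repetitions
        func()
    return (time.perf_counter() - start) / rep * 1e3  # ms per call
```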

property func#
params: List[KernelParam]#
result_idx: List[int]#
run_once(func: Optional[Callable] = None)#
supply_type: TensorSupplyType#
with_default_adapter(adapter: BaseKernelAdapter) → Profiler#