tilelang.profiler package#

Submodules#

Module contents#

Profiler and torch-conversion utilities.

class tilelang.profiler.Profiler(params: List[KernelParam], result_idx: List[int], supply_type: TensorSupplyType, adapter: Optional[BaseKernelAdapter] = None)#

Bases: object

A profiler class for benchmarking and validating kernel implementations.

params#

List of kernel parameters defining the input/output specifications

Type:

List[tilelang.engine.param.KernelParam]

result_idx#

Indices indicating which parameters are output tensors

Type:

List[int]

supply_type#

Type of tensor supply to use (e.g., random or zeros)

Type:

tilelang.utils.tensor.TensorSupplyType

adapter#

Optional kernel adapter for interfacing with different backends

Type:

Optional[tilelang.jit.adapter.base.BaseKernelAdapter]
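The attributes above drive a simple contract: tensors are generated for every parameter not listed in `result_idx`, while the indices in `result_idx` mark which parameters the kernel writes its outputs into. The following is a pure-Python sketch of that split (a hypothetical `split_params` helper, not part of tilelang):

```python
# Illustrative stand-in for how params and result_idx interact: parameters
# whose index appears in result_idx are outputs, the rest are inputs that
# the tensor supply must generate. This is NOT the real tilelang code.
from typing import List, Tuple


def split_params(params: List[str], result_idx: List[int]) -> Tuple[List[str], List[str]]:
    """Separate input parameter names from output parameter names by index."""
    inputs = [p for i, p in enumerate(params) if i not in result_idx]
    outputs = [p for i, p in enumerate(params) if i in result_idx]
    return inputs, outputs


# For a kernel C = A @ B with params (A, B, C), result_idx=[2] marks C as output.
inputs, outputs = split_params(["A", "B", "C"], result_idx=[2])
```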

adapter: Optional[BaseKernelAdapter] = None#
assert_allclose(reference_program: Callable, input_tensors: Optional[List[torch.Tensor]] = None, atol: float = 0.01, rtol: float = 0.01, max_mismatched_ratio=0.01)#

Validates kernel output against a reference implementation.

Parameters:
  • reference_program – Reference implementation to compare against

  • input_tensors – Optional pre-generated input tensors

  • atol – Absolute tolerance for comparison

  • rtol – Relative tolerance for comparison

  • max_mismatched_ratio – Maximum allowed ratio of mismatched elements
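The tolerance semantics documented above can be sketched as follows: an element counts as a mismatch when it fails the combined absolute/relative bound, and validation fails only when the mismatch ratio exceeds `max_mismatched_ratio`. This is a pure-Python stand-in with the same parameter names; the real method operates on torch tensors:

```python
# Sketch of the atol/rtol/max_mismatched_ratio check, assuming the usual
# elementwise bound |a - e| <= atol + rtol * |e|. Hypothetical helper,
# not the actual tilelang implementation.
from typing import Sequence


def check_allclose(actual: Sequence[float], expected: Sequence[float],
                   atol: float = 0.01, rtol: float = 0.01,
                   max_mismatched_ratio: float = 0.01) -> None:
    mismatched = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    ratio = mismatched / max(len(expected), 1)
    if ratio > max_mismatched_ratio:
        raise AssertionError(
            f"{mismatched}/{len(expected)} elements mismatched "
            f"(ratio {ratio:.4f} > {max_mismatched_ratio})"
        )


check_allclose([1.0, 2.0, 3.0], [1.0, 2.001, 3.0])  # within tolerance
```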

assert_consistent(repeat=10)#

Checks for kernel consistency across multiple runs.

Parameters:

repeat – Number of times to repeat the consistency check
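A consistency check of this kind runs the kernel repeatedly on the same inputs and verifies every run matches the first. A minimal sketch, assuming a hypothetical deterministic `kernel` callable rather than a tilelang kernel:

```python
# Sketch of a repeat-and-compare consistency check: the first run is the
# reference, and any later run that differs fails the check. Pure-Python
# stand-in, not the real tilelang implementation.
def check_consistent(kernel, inputs, repeat: int = 10) -> None:
    reference = kernel(*inputs)
    for _ in range(repeat - 1):
        if kernel(*inputs) != reference:
            raise AssertionError("kernel output varies across runs")
```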

determine_profiler(func: Optional[Callable] = None)#

Determines which profiler backend to use based on function type.

Parameters:
  • func – Function to be profiled

  • profiler – Explicitly specified profiler type or “auto” for automatic detection

Returns:

The determined profiler type (“torch” or “tvm”)

Return type:

str

do_bench(func: Optional[Callable] = None, warmup: int = 25, rep: int = 100, n_warmup: int = 1, n_repeat: int = 1, input_tensors: Optional[List[torch.Tensor]] = None) → float#

Benchmarks the execution time of a given function.

Parameters:
  • func – Function to benchmark (uses adapter if None)

  • warmup – Warmup time in milliseconds

  • rep – Number of repetitions for timing

  • n_warmup – Number of warmup iterations

  • n_repeat – Number of timing iterations

  • profiler – Which profiling backend to use

  • input_tensors – Optional pre-generated input tensors

Returns:

Average execution time in milliseconds

Return type:

float
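The contract above (warm up for roughly `warmup` milliseconds, then average `rep` timed invocations) can be sketched with a pure-Python timing loop. The real benchmark uses proper device synchronization; this stand-in only shows the shape of the measurement:

```python
# Minimal timing loop mirroring do_bench's documented contract: a warmup
# phase bounded by `warmup` milliseconds, then `rep` timed calls whose
# average latency is returned in milliseconds. Hypothetical sketch, not
# the tilelang implementation.
import time


def do_bench_sketch(func, warmup: float = 25.0, rep: int = 100) -> float:
    deadline = time.perf_counter() + warmup / 1e3
    while time.perf_counter() < deadline:  # warmup phase
        func()
    start = time.perf_counter()
    for _ in range(rep):  # timed repetitions
        func()
    return (time.perf_counter() - start) / rep * 1e3  # ms per call
```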

property func#
params: List[KernelParam]#
result_idx: List[int]#
run_once(func: Optional[Callable] = None)#
supply_type: TensorSupplyType#
with_default_adapter(adapter: BaseKernelAdapter) → Profiler#