tilelang.profiler package
Submodules
Module contents
The profiler and utilities for converting to torch.
- class tilelang.profiler.Profiler(params: List[KernelParam], result_idx: List[int], supply_type: TensorSupplyType, adapter: Optional[BaseKernelAdapter] = None)
Bases:
object
A profiler class for benchmarking and validating kernel implementations.
- params
List of kernel parameters defining the input/output specifications
- Type:
List[KernelParam]
- result_idx
Indices indicating which parameters are output tensors
- Type:
List[int]
- supply_type
Type of tensor supply to use (e.g., random or zeros)
- adapter
Optional kernel adapter for interfacing with different backends
- Type:
Optional[BaseKernelAdapter]
- adapter: Optional[BaseKernelAdapter] = None
- assert_allclose(reference_program: Callable, input_tensors: Optional[List[torch.Tensor]] = None, atol: float = 0.01, rtol: float = 0.01, max_mismatched_ratio=0.01)
Validates kernel output against a reference implementation.
- Parameters:
reference_program – Reference implementation to compare against
input_tensors – Optional pre-generated input tensors
atol – Absolute tolerance for comparison
rtol – Relative tolerance for comparison
max_mismatched_ratio – Maximum allowed ratio of mismatched elements
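To make the interplay of atol, rtol, and max_mismatched_ratio concrete, here is a minimal pure-Python sketch. It assumes the common element-wise rule |actual - expected| <= atol + rtol * |expected|, and that the check passes as long as the fraction of violating elements does not exceed max_mismatched_ratio. The helper name `mismatch_ratio_ok` is hypothetical and not part of tilelang; the library's actual comparison may differ.

```python
from typing import Sequence


def mismatch_ratio_ok(
    actual: Sequence[float],
    expected: Sequence[float],
    atol: float = 0.01,
    rtol: float = 0.01,
    max_mismatched_ratio: float = 0.01,
) -> bool:
    """Hypothetical helper: True if few enough elements violate the tolerance."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    if not expected:
        return True  # nothing to compare
    # Assumed rule: an element mismatches when |a - e| > atol + rtol * |e|.
    mismatched = sum(
        1 for a, e in zip(actual, expected) if abs(a - e) > atol + rtol * abs(e)
    )
    # Pass when the mismatched fraction stays within the allowed ratio.
    return mismatched / len(expected) <= max_mismatched_ratio
```

Under these assumptions, a nonzero max_mismatched_ratio lets a small number of outlier elements through without failing the whole comparison, which can be useful for low-precision kernels.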
- assert_consistent(repeat=10)
Checks for kernel consistency across multiple runs.
- Parameters:
repeat – Number of times to repeat the consistency check
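As an illustration of what a consistency check across runs can look like, the sketch below calls a function repeatedly and compares each result with the first. The helper `is_consistent` is hypothetical and uses exact equality on plain Python lists; whether tilelang compares exactly or within a tolerance is not stated in the source.

```python
from typing import Callable, List


def is_consistent(kernel: Callable[[], List[float]], repeat: int = 10) -> bool:
    """Hypothetical helper: True if every repeated run matches the first run."""
    baseline = kernel()  # reference output from the first run
    for _ in range(repeat):
        # Assumption: outputs must be exactly equal to count as consistent.
        if kernel() != baseline:
            return False
    return True
```

A check of this kind is typically aimed at nondeterminism such as data races or uninitialized memory, where the same inputs yield different outputs from run to run.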
- determine_profiler(func: Optional[Callable] = None)
Determines which profiler backend to use based on function type.
- Parameters:
func – Function to be profiled
profiler – Explicitly specified profiler type or “auto” for automatic detection
- Returns:
The determined profiler type (“torch” or “tvm”)
- Return type:
str
- do_bench(func: Optional[Callable] = None, warmup: int = 25, rep: int = 100, n_warmup: int = 1, n_repeat: int = 1, input_tensors: Optional[List[torch.Tensor]] = None) → float
Benchmarks the execution time of a given function.
- Parameters:
func – Function to benchmark (uses adapter if None)
warmup – Warmup time in milliseconds
rep – Number of repetitions for timing
n_warmup – Number of warmup iterations
n_repeat – Number of timing iterations
profiler – Which profiling backend to use
input_tensors – Optional pre-generated input tensors
- Returns:
Average execution time in milliseconds
- Return type:
float
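The general warmup-then-measure pattern behind a benchmark like this can be sketched in plain Python. The helper `simple_bench` below is hypothetical: it reuses only the `n_warmup` and `n_repeat` names from the signature above, measures CPU wall-clock time with `time.perf_counter`, and ignores the device synchronization that timing a GPU kernel would require. It is not tilelang's implementation.

```python
import time
from typing import Callable


def simple_bench(
    func: Callable[[], object], n_warmup: int = 1, n_repeat: int = 1
) -> float:
    """Hypothetical helper: average wall-clock time of func in milliseconds."""
    if n_repeat < 1:
        raise ValueError("n_repeat must be at least 1")
    # Warmup runs are executed but not timed (caches, lazy initialization).
    for _ in range(n_warmup):
        func()
    start = time.perf_counter()
    for _ in range(n_repeat):
        func()
    elapsed_s = time.perf_counter() - start
    # Convert seconds to milliseconds and average over the timed runs.
    return elapsed_s * 1000.0 / n_repeat
```

Excluding the warmup runs from the measurement keeps one-time costs out of the reported average.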
- property func
- params: List[KernelParam]
- result_idx: List[int]
- run_once(func: Optional[Callable] = None)
- supply_type: TensorSupplyType
- with_default_adapter(adapter: BaseKernelAdapter) → Profiler