tilelang.profiler
The profiler and convert-to-torch utilities.
Classes
Profiler | A profiler class for benchmarking and validating kernel implementations.
Package Contents
- class tilelang.profiler.Profiler
A profiler class for benchmarking and validating kernel implementations.
- params
List of kernel parameters defining the input/output specifications
- result_idx
Indices indicating which parameters are output tensors
- supply_type
Type of tensor supply to use (e.g., random, zeros, etc.)
- adapter
Optional kernel adapter for interfacing with different backends
- params: List[tilelang.engine.param.KernelParam]
- result_idx: List[int]
- supply_type: tilelang.utils.tensor.TensorSupplyType
- adapter: tilelang.jit.adapter.BaseKernelAdapter | None = None
- __post_init__()
Initialize tensor supply after dataclass initialization
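The attribute layout above, with a supply derived in `__post_init__`, can be sketched in plain Python. This is a hypothetical miniature, not tilelang's implementation: `MiniProfiler`, `SupplyKind`, `supply`, and `make_inputs` are invented names, parameters are reduced to element counts, and it assumes that only parameters not listed in `result_idx` need generated input data.

```python
import random
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class SupplyKind(Enum):
    """Hypothetical stand-in for a tensor supply type."""
    ZEROS = "zeros"
    RANDOM = "random"


@dataclass
class MiniProfiler:
    """Hypothetical miniature of the attribute layout described above."""
    params: List[int]        # here: just the element count of each parameter
    result_idx: List[int]    # which entries of `params` are outputs
    supply_kind: SupplyKind = SupplyKind.RANDOM
    supply: Callable[[int], List[float]] = field(init=False)

    def __post_init__(self) -> None:
        # Derive the input generator once, after the declared fields are set.
        if self.supply_kind is SupplyKind.ZEROS:
            self.supply = lambda n: [0.0] * n
        else:
            rng = random.Random(0)  # fixed seed keeps the sketch reproducible
            self.supply = lambda n: [rng.random() for _ in range(n)]

    def make_inputs(self) -> List[List[float]]:
        # Assumption: output parameters are skipped when generating inputs.
        return [self.supply(n) for i, n in enumerate(self.params)
                if i not in self.result_idx]


p = MiniProfiler(params=[4, 4, 2], result_idx=[2], supply_kind=SupplyKind.ZEROS)
inputs = p.make_inputs()
```

Building the generator in `__post_init__` keeps the constructor signature limited to the declared fields while still preparing derived state before first use.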
- with_default_adapter(adapter)
- Parameters:
adapter (tilelang.jit.adapter.BaseKernelAdapter)
- Return type:
- assert_allclose(reference_program, input_tensors=None, atol=0.01, rtol=0.01, max_mismatched_ratio=0.01)
Validates kernel output against a reference implementation.
- Parameters:
reference_program (Callable) – Reference implementation to compare against
input_tensors (Optional[List[torch.Tensor]]) – Optional pre-generated input tensors
atol (float) – Absolute tolerance for comparison
rtol (float) – Relative tolerance for comparison
max_mismatched_ratio – Maximum allowed ratio of mismatched elements
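To illustrate how `atol`, `rtol`, and `max_mismatched_ratio` could interact, here is a minimal pure-Python sketch. It assumes the common elementwise criterion `|actual - expected| <= atol + rtol * |expected|` and treats the check as failed only when the fraction of violating elements exceeds `max_mismatched_ratio`. The helpers `mismatch_ratio` and `check_close` are hypothetical; the exact rule tilelang applies may differ.

```python
from typing import Sequence


def mismatch_ratio(actual: Sequence[float], expected: Sequence[float],
                   atol: float = 1e-2, rtol: float = 1e-2) -> float:
    """Fraction of elements with |a - e| > atol + rtol * |e|."""
    if len(actual) != len(expected):
        raise ValueError("length mismatch")
    if not actual:
        return 0.0
    bad = sum(1 for a, e in zip(actual, expected)
              if abs(a - e) > atol + rtol * abs(e))
    return bad / len(actual)


def check_close(actual: Sequence[float], expected: Sequence[float],
                atol: float = 1e-2, rtol: float = 1e-2,
                max_mismatched_ratio: float = 0.01) -> None:
    """Raise AssertionError if too many elements fall outside tolerance."""
    ratio = mismatch_ratio(actual, expected, atol, rtol)
    if ratio > max_mismatched_ratio:
        raise AssertionError(
            f"{ratio:.2%} of elements mismatched "
            f"(allowed {max_mismatched_ratio:.2%})")


expected = [1.0] * 200
mostly_ok = [1.005] * 199 + [2.0]       # 1 of 200 elements out of tolerance
check_close(mostly_ok, expected)        # 0.5% mismatched, within the 1% budget

too_many = [1.005] * 196 + [2.0] * 4    # 4 of 200 elements out of tolerance
try:
    check_close(too_many, expected)
    too_many_passed = True
except AssertionError:
    too_many_passed = False
```

A mismatch budget of this kind tolerates a few outliers, which is useful for low-precision kernels where isolated elements can round differently from the reference.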
- manual_assert_close(reference_program, input_tensors=None, manual_check_prog=None)
Validates kernel output against a reference implementation using a caller-supplied check.
- Parameters:
reference_program (Callable) – Reference implementation to compare against
input_tensors (Optional[List[torch.Tensor]]) – Optional pre-generated input tensors
manual_check_prog (Callable) – Caller-supplied function that performs the comparison
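A caller-supplied check is useful when elementwise closeness is the wrong criterion. The sketch below is pure Python and makes one loud assumption: that a function such as `manual_check_prog` receives the kernel outputs and the reference outputs, in that order. The page does not state the calling convention, and `manual_check`, `same_argmax`, `scaled_kernel`, and `reference` are hypothetical names.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def manual_check(kernel: Callable[..., Vector],
                 reference: Callable[..., Vector],
                 inputs: Sequence[Vector],
                 check: Callable[[Vector, Vector], None]) -> None:
    """Run both programs on the same inputs and delegate the comparison."""
    # Assumption: the check receives (kernel output, reference output).
    check(kernel(*inputs), reference(*inputs))


def same_argmax(out: Vector, ref: Vector) -> None:
    """Custom criterion: only the position of the maximum must agree."""
    if out.index(max(out)) != ref.index(max(ref)):
        raise AssertionError("argmax differs from reference")


def scaled_kernel(x: Vector) -> Vector:
    return [2.0 * v for v in x]   # values differ, ranking is preserved


def reference(x: Vector) -> Vector:
    return list(x)


# Passes: the outputs are not close in value, but the argmax matches.
manual_check(scaled_kernel, reference, [[0.1, 0.9, 0.3]], same_argmax)
```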
- assert_consistent(repeat=10)
Checks for kernel consistency across multiple runs.
- Parameters:
repeat – Number of times to repeat the consistency check
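The idea of a consistency check can be sketched without any GPU code: run the same function on the same inputs several times and require every result to equal the first. This is a hypothetical pure-Python illustration (`check_consistent`, `stable`, and `flaky` are invented names), and it assumes `repeat` counts the comparison runs made after an initial baseline run.

```python
import itertools
from typing import Callable, List, Sequence

Vector = List[float]


def check_consistent(fn: Callable[..., Vector],
                     inputs: Sequence[Vector],
                     repeat: int = 10) -> None:
    """Raise AssertionError if any repeated run differs from the first."""
    baseline = fn(*inputs)
    for run in range(1, repeat + 1):
        if fn(*inputs) != baseline:
            raise AssertionError(f"run {run} differs from the first run")


def stable(x: Vector) -> Vector:
    return [v * v for v in x]


_calls = itertools.count()


def flaky(x: Vector) -> Vector:
    # Deliberately nondeterministic: the output depends on the call count.
    n = next(_calls)
    return [v + n for v in x]


check_consistent(stable, [[1.0, 2.0]])      # deterministic, passes
try:
    check_consistent(flaky, [[1.0, 2.0]])
    flaky_passed = True
except AssertionError:
    flaky_passed = False
```

Checks like this typically surface data races or uninitialized memory, which tend to produce results that vary from run to run.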
- run_once(func=None)
- Parameters:
func (Optional[Callable])
- determine_profiler(func=None)
Determines which profiler backend to use based on function type.
- Parameters:
func (Optional[Callable]) – Function to be profiled
- Returns:
The determined profiler type (“torch” or “tvm”)
- Return type:
str
- do_bench(func=None, warmup=25, rep=100, n_warmup=1, n_repeat=1, input_tensors=None)
Benchmarks the execution time of a given function.
- Parameters:
func (Optional[Callable]) – Function to benchmark (uses adapter if None)
warmup (int) – Warmup time in milliseconds
rep (int) – Number of repetitions for timing
n_warmup (int) – Number of warmup iterations
n_repeat (int) – Number of timing iterations
input_tensors (List[torch.Tensor]) – Optional pre-generated input tensors
- Returns:
Average execution time in milliseconds
- Return type:
float
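The warmup-then-measure pattern behind a benchmark like this can be sketched with the standard library alone. The helper `bench_ms` below is hypothetical and uses only iteration counts (the analogue of `n_warmup` and `n_repeat`); it ignores the time-based `warmup` and `rep` parameters. It also measures host wall-clock time only. Timing real GPU kernels additionally requires device synchronization, because kernel launches are asynchronous.

```python
import time
from typing import Callable, List


def bench_ms(fn: Callable[[], object], n_warmup: int = 1, n_repeat: int = 1) -> float:
    """Return the mean wall-clock time of fn() in milliseconds."""
    if n_repeat < 1:
        raise ValueError("n_repeat must be at least 1")
    for _ in range(n_warmup):
        fn()  # untimed runs: let caches and lazy initialization settle
    start = time.perf_counter()
    for _ in range(n_repeat):
        fn()
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1e3 / n_repeat


calls: List[int] = []
avg_ms = bench_ms(lambda: calls.append(sum(range(1000))), n_warmup=3, n_repeat=20)
print(f"{avg_ms:.4f} ms per call over 20 timed runs")
```

Timing the whole loop and dividing by the iteration count amortizes timer overhead, which matters when a single call takes only microseconds.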
- property func
- __call__(*args, **kwds)
- Parameters:
args (Any)
kwds (Any)
- Return type:
Any