tilelang.autotuner.tuner¶
The auto-tune module for tilelang programs.
This module provides functionality for auto-tuning tilelang programs, including JIT compilation and performance optimization through configuration search.
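Attributes¶
logger |
Exceptions¶
TimeoutException | Common base class for all non-exit exceptions. |
Classes¶
AutoTuner | Auto-tuner for tilelang programs. |
Functions¶
timeout_handler(signum, frame) |
run_with_timeout(func, timeout, *args, **kwargs) |
get_available_cpu_count() | Gets the number of CPU cores available to the current process. |
autotune([func, configs, warmup, rep, timeout, ...]) | Auto-tuning decorator for TileLang functions. |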
Module Contents¶
- exception tilelang.autotuner.tuner.TimeoutException¶
Bases:
Exception
Common base class for all non-exit exceptions.
- tilelang.autotuner.tuner.timeout_handler(signum, frame)¶
- tilelang.autotuner.tuner.run_with_timeout(func, timeout, *args, **kwargs)¶
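Neither timeout_handler nor run_with_timeout is documented here. The (signum, frame) signature matches what Python's signal module passes to a handler, so a signal-based timeout is one plausible reading. The following is a minimal standalone sketch of that pattern, not the library's implementation; every name prefixed with sketch_ or Sketch is hypothetical, and the approach only works on Unix in the main thread.

```python
import signal
import time


class SketchTimeout(Exception):
    """Raised when the wrapped call exceeds its time budget (hypothetical name)."""


def sketch_timeout_handler(signum, frame):
    # Signal handlers receive the signal number and the current stack frame.
    raise SketchTimeout("call timed out")


def sketch_run_with_timeout(func, timeout, *args, **kwargs):
    """Run func(*args, **kwargs); raise SketchTimeout after `timeout` seconds."""
    previous = signal.signal(signal.SIGALRM, sketch_timeout_handler)
    signal.setitimer(signal.ITIMER_REAL, timeout)  # accepts fractional seconds
    try:
        return func(*args, **kwargs)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        # Restore the earlier handler (None means it was not set from Python).
        signal.signal(signal.SIGALRM,
                      signal.SIG_DFL if previous is None else previous)


if __name__ == "__main__":
    print(sketch_run_with_timeout(lambda x: x + 1, 1.0, 41))
    try:
        sketch_run_with_timeout(time.sleep, 0.05, 1.0)
    except SketchTimeout as exc:
        print("timed out:", exc)
```

Cancelling the timer and restoring the previous handler in a finally block keeps one slow configuration from leaving a stray alarm armed for the next one.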
- tilelang.autotuner.tuner.logger¶
- tilelang.autotuner.tuner.get_available_cpu_count()¶
Gets the number of CPU cores available to the current process.
- Return type:
int
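"Available to the current process" can differ from the machine's total core count when a CPU affinity mask is in effect. The sketch below shows one common way to compute such a value with the standard library; it is an illustration under that assumption, not necessarily how get_available_cpu_count is implemented, and the function name is hypothetical.

```python
import os


def sketch_available_cpu_count() -> int:
    """Return the number of CPU cores the current process may run on."""
    try:
        # Linux: respects CPU affinity masks (for example, those set by taskset).
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity fall back to the total count.
        return os.cpu_count() or 1


if __name__ == "__main__":
    print(sketch_available_cpu_count())
```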
- class tilelang.autotuner.tuner.AutoTuner(fn, configs)¶
Auto-tuner for tilelang programs.
This class handles the auto-tuning process by testing different configurations and finding the optimal parameters for program execution.
- Parameters:
fn (Callable) – The function to be auto-tuned.
configs – List of configurations to try during auto-tuning.
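As an illustration of what a list of configurations might look like, the sketch below expands a small search space into one dictionary per combination. It assumes each configuration maps tunable parameter names to values; the helper name and the parameter names (block_M, block_N, num_stages) are hypothetical and depend on the kernel being tuned.

```python
import itertools


def sketch_make_configs(search_space):
    """Expand {name: [candidates]} into a list of {name: value} dicts."""
    names = list(search_space)
    return [
        dict(zip(names, values))
        for values in itertools.product(*(search_space[n] for n in names))
    ]


# Hypothetical tunable parameters; real names depend on the kernel.
configs = sketch_make_configs({
    "block_M": [64, 128],
    "block_N": [64, 128],
    "num_stages": [2, 3],
})
print(len(configs))  # 8
print(configs[0])    # {'block_M': 64, 'block_N': 64, 'num_stages': 2}
```

A list built this way grows multiplicatively with each added parameter, so the per-configuration timeout bounds the cost of each entry rather than the total tuning time.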
- compile_args¶
- profile_args¶
- cache_dir: pathlib.Path¶
- fn¶
- configs¶
- ref_latency_cache = None¶
- jit_input_tensors = None¶
- ref_input_tensors = None¶
- jit_compile = None¶
- classmethod from_kernel(kernel, configs)¶
Create an AutoTuner instance from a kernel function.
- Parameters:
kernel (Callable) – The kernel function to auto-tune.
configs – List of configurations to try.
- Returns:
A new AutoTuner instance.
- Return type:
AutoTuner
- set_compile_args(out_idx=None, target='auto', execution_backend='cython', target_host=None, verbose=False, pass_configs=None)¶
Set compilation arguments for the auto-tuner.
- Parameters:
out_idx (Union[List[int], int, None]) – Index, or list of indices, of the output tensors.
target (Literal['auto', 'cuda', 'hip']) – Target platform.
execution_backend (Literal['dlpack', 'ctypes', 'cython']) – Execution backend to use for kernel execution.
target_host (Union[str, tvm.target.Target]) – Target host for cross-compilation.
verbose (bool) – Whether to enable verbose output.
pass_configs (Optional[Dict[str, Any]]) – Additional keyword arguments to pass to the Compiler PassContext.
- Returns:
Self for method chaining.
- Return type:
AutoTuner
- set_profile_args(warmup=25, rep=100, timeout=30, supply_type=tilelang.TensorSupplyType.Auto, ref_prog=None, supply_prog=None, rtol=0.01, atol=0.01, max_mismatched_ratio=0.01, skip_check=False, manual_check_prog=None, cache_input_tensors=False)¶
Set profiling arguments for the auto-tuner.
- Parameters:
supply_type (tilelang.TensorSupplyType) – Type of tensor supply mechanism. Ignored if supply_prog is provided.
ref_prog (Callable) – Reference program for validation.
supply_prog (Callable) – Supply program for input tensors.
rtol (float) – Relative tolerance for validation.
atol (float) – Absolute tolerance for validation.
max_mismatched_ratio (float) – Maximum allowed mismatch ratio.
skip_check (bool) – Whether to skip validation.
manual_check_prog (Callable) – Manual check program for validation.
cache_input_tensors (bool) – Whether to cache input tensors.
warmup (int) – Number of warmup iterations.
rep (int) – Number of repetitions for timing.
timeout (int) – Maximum time per configuration.
- Returns:
Self for method chaining.
- Return type:
AutoTuner
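The interaction between rtol, atol, and max_mismatched_ratio is not spelled out above. One plausible reading, sketched here in plain Python, is that an element counts as mismatched when it falls outside an atol + rtol * |expected| band, and validation passes as long as the mismatched fraction stays at or below max_mismatched_ratio. This is an assumption about the semantics, not the library's actual check, and the function name is hypothetical.

```python
def sketch_outputs_match(actual, expected, rtol=0.01, atol=0.01,
                         max_mismatched_ratio=0.01):
    """Return True if few enough elements fall outside the tolerance band."""
    if len(actual) != len(expected):
        raise ValueError("length mismatch")
    if not actual:
        return True
    mismatched = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    return mismatched / len(actual) <= max_mismatched_ratio


if __name__ == "__main__":
    expected = [1.0] * 100
    actual = list(expected)
    actual[0] = 2.0  # one outlier in 100 elements
    print(sketch_outputs_match(actual, expected))
```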
- set_kernel_parameters(parameters)¶
- Parameters:
parameters (Tuple[str, Ellipsis])
- generate_cache_key(parameters)¶
Generate a cache key for the auto-tuning process.
- Parameters:
parameters (Dict[str, Any])
- Return type:
Optional[tilelang.autotuner.param.AutotuneResult]
- run(warmup=25, rep=100, timeout=30)¶
Run the auto-tuning process.
- Parameters:
warmup (int) – Number of warmup iterations.
rep (int) – Number of repetitions for timing.
timeout (int) – Maximum time per configuration.
- Returns:
Results of the auto-tuning process.
- Return type:
tilelang.autotuner.param.AutotuneResult
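To make the roles of warmup and rep concrete, the sketch below times a callable with untimed warmup iterations followed by timed repetitions. It uses host wall-clock time purely for illustration; a GPU profiler would typically rely on device-side timing instead, and the function name is hypothetical.

```python
import time


def sketch_benchmark(fn, warmup=25, rep=100):
    """Return the mean wall-clock latency of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # untimed: lets caches and lazy initialization settle
    start = time.perf_counter()
    for _ in range(rep):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / rep * 1e3


if __name__ == "__main__":
    print(sketch_benchmark(lambda: sum(range(1000)), warmup=5, rep=20))
```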
- __call__()¶
Make the AutoTuner callable, running the auto-tuning process.
- Returns:
Results of the auto-tuning process.
- Return type:
tilelang.autotuner.param.AutotuneResult
- tilelang.autotuner.tuner.autotune(func=None, *, configs, warmup=25, rep=100, timeout=100, supply_type=tilelang.TensorSupplyType.Auto, ref_prog=None, supply_prog=None, rtol=0.01, atol=0.01, max_mismatched_ratio=0.01, skip_check=False, manual_check_prog=None, cache_input_tensors=False)¶
Auto-tuning decorator for TileLang functions.
- This decorator is applied with arguments (e.g., @autotune(configs=...)):
configs is a required keyword-only argument; the remaining parameters fall back to their defaults.
- Tips:
- If you want to skip the auto-tuning process, you can override the tunable parameters in the function signature.
- Parameters:
func (Union[Callable[tilelang.jit.param._P, tilelang.jit.param._RProg], tvm.tir.PrimFunc, None]) – The function to be auto-tuned. Defaults to None, in which case a configured decorator is returned.
configs (Dict or Callable) – Configuration space to explore during auto-tuning.
warmup (int, optional) – Number of warmup iterations before timing. Defaults to 25.
rep (int, optional) – Number of repetitions for timing measurements. Defaults to 100.
timeout (int, optional) – Maximum time per configuration. Defaults to 100.
supply_type (tilelang.TensorSupplyType) – Type of tensor supply mechanism. Ignored if supply_prog is provided.
ref_prog (Callable) – Reference program for validation.
supply_prog (Callable) – Supply program for input tensors.
rtol (float) – Relative tolerance for validation.
atol (float) – Absolute tolerance for validation.
max_mismatched_ratio (float) – Maximum allowed mismatch ratio.
skip_check (bool) – Whether to skip validation.
manual_check_prog (Callable) – Manual check program for validation.
cache_input_tensors (bool) – Whether to cache input tensors.
- Returns:
Either an auto-tuned wrapper around the input function, or a configured decorator instance that can then be applied to a function.
- Return type:
Callable
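The func=None plus keyword-only configs signature follows a common Python pattern for decorators that take options: called with a function, it wraps that function immediately; called with only keyword arguments, it returns a decorator. The toy below illustrates the pattern with a CPU-only stand-in that times each configuration and remembers the fastest. It is a sketch, not tilelang's implementation, and sketch_autotune, chunked_sum, and chunk are hypothetical names.

```python
import functools
import time


def sketch_autotune(func=None, *, configs, rep=3):
    """Toy decorator: time func under each config and remember the fastest."""

    def decorator(fn):
        best = {}

        @functools.wraps(fn)
        def wrapper(*args):
            if "config" not in best:
                timings = []
                for config in configs:
                    start = time.perf_counter()
                    for _ in range(rep):
                        fn(*args, **config)
                    timings.append((time.perf_counter() - start, config))
                best["config"] = min(timings, key=lambda t: t[0])[1]
            return fn(*args, **best["config"])

        wrapper.best = best  # exposed so callers can inspect the chosen config
        return wrapper

    # Called as sketch_autotune(fn, configs=...): wrap immediately.
    # Called as @sketch_autotune(configs=...): return the decorator.
    return decorator if func is None else decorator(func)


@sketch_autotune(configs=[{"chunk": 1}, {"chunk": 64}])
def chunked_sum(values, chunk):
    return sum(sum(values[i:i + chunk]) for i in range(0, len(values), chunk))


print(chunked_sum(list(range(1000))))  # 499500
```

The first call pays the tuning cost; later calls reuse the stored configuration.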