tilelang.autotuner package#
Submodules#
- tilelang.autotuner.param module
AutotuneResult
AutotuneResult.latency
AutotuneResult.config
AutotuneResult.ref_latency
AutotuneResult.libcode
AutotuneResult.func
AutotuneResult.kernel
AutotuneResult.load_from_disk()
AutotuneResult.save_to_disk()
CompileArgs
ProfileArgs
ProfileArgs.warmup
ProfileArgs.rep
ProfileArgs.timeout
ProfileArgs.supply_type
ProfileArgs.ref_prog
ProfileArgs.supply_prog
ProfileArgs.out_idx
ProfileArgs.rtol
ProfileArgs.atol
ProfileArgs.max_mismatched_ratio
ProfileArgs.skip_check
ProfileArgs.manual_check_prog
ProfileArgs.cache_input_tensors
Module contents#
The auto-tune module for tilelang programs.
This module provides functionality for auto-tuning tilelang programs, including JIT compilation and performance optimization through configuration search.
- class tilelang.autotuner.AutoTuner(fn: Callable, configs)#
Bases:
object
Auto-tuner for tilelang programs.
This class handles the auto-tuning process by testing different configurations and finding the optimal parameters for program execution.
- Parameters:
fn – The function to be auto-tuned.
configs – List of configurations to try during auto-tuning.
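The search described above (try each configuration, time it, keep the fastest) can be sketched in plain Python. This is a minimal illustration and not tilelang's implementation: `sketch_tune`, `make_chunked_sum`, and the `chunk` parameter are hypothetical names, and the "kernel" is an ordinary Python closure rather than a compiled program.

```python
import time


def sketch_tune(make_kernel, configs, warmup=2, rep=5):
    """Time every config and return (best_config, best_latency, all_results)."""
    results = []
    for config in configs:
        kernel = make_kernel(**config)  # build one candidate from this config
        for _ in range(warmup):  # warm-up runs are executed but not timed
            kernel()
        start = time.perf_counter()
        for _ in range(rep):
            kernel()
        latency = (time.perf_counter() - start) / rep  # mean seconds per run
        results.append((config, latency))
    best_config, best_latency = min(results, key=lambda item: item[1])
    return best_config, best_latency, results


def make_chunked_sum(chunk):
    """Hypothetical tunable 'kernel': sums a list in slices of size `chunk`."""
    data = list(range(20_000))

    def kernel():
        return sum(sum(data[i:i + chunk]) for i in range(0, len(data), chunk))

    return kernel


configs = [{"chunk": 16}, {"chunk": 256}, {"chunk": 4096}]
best_config, best_latency, results = sketch_tune(make_chunked_sum, configs)
```

Which configuration wins depends on the machine, so the sketch only guarantees that the returned latency is the smallest one measured.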
- cache_dir: Path = PosixPath('/home/t-leiwang/.tilelang/cache')#
- compile_args = CompileArgs(out_idx=None, execution_backend='cython', target='auto', target_host=None, verbose=False, pass_configs=None)#
- classmethod from_kernel(kernel: Callable, configs)#
Create an AutoTuner instance from a kernel function.
- Parameters:
kernel – The kernel function to auto-tune.
configs – List of configurations to try.
- Returns:
A new AutoTuner instance.
- Return type:
AutoTuner
- generate_cache_key(parameters: Dict[str, Any]) → Optional[AutotuneResult]#
Generate a cache key for the auto-tuning process.
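One common way to derive a stable key from a parameter dictionary is to serialize it deterministically and hash the result. The sketch below assumes that approach; how tilelang actually builds its key is not stated here, and `sketch_cache_key`, `block_M`, and `block_N` are hypothetical names.

```python
import hashlib
import json


def sketch_cache_key(parameters):
    """Return a hex digest that depends only on the dict's contents."""
    # sort_keys makes the serialization independent of insertion order;
    # default=str is a fallback for values JSON cannot encode natively.
    payload = json.dumps(parameters, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = sketch_cache_key({"block_M": 128, "block_N": 64})
key_b = sketch_cache_key({"block_N": 64, "block_M": 128})  # same content, other order
key_c = sketch_cache_key({"block_M": 64, "block_N": 64})   # different content
```

Here `key_a` and `key_b` are equal, while `key_c` differs, which is the property a cache lookup needs.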
- profile_args = ProfileArgs(warmup=25, rep=100, timeout=30, supply_type=<TensorSupplyType.Auto: 7>, ref_prog=None, supply_prog=None, rtol=0.01, atol=0.01, max_mismatched_ratio=0.01, skip_check=False, manual_check_prog=None, cache_input_tensors=True)#
- run(warmup: int = 25, rep: int = 100, timeout: int = 30)#
Run the auto-tuning process.
- Parameters:
warmup – Number of warmup iterations.
rep – Number of repetitions for timing.
timeout – Maximum time per configuration.
- Returns:
Results of the auto-tuning process.
- Return type:
AutotuneResult
- set_compile_args(out_idx: Optional[Union[List[int], int]] = None, target: Literal['auto', 'cuda', 'hip'] = 'auto', execution_backend: Literal['dlpack', 'ctypes', 'cython'] = 'cython', target_host: Optional[Union[str, Target]] = None, verbose: bool = False, pass_configs: Optional[Dict[str, Any]] = None)#
Set compilation arguments for the auto-tuner.
- Parameters:
out_idx – List of output tensor indices.
target – Target platform.
execution_backend – Execution backend to use for kernel execution.
target_host – Target host for cross-compilation.
verbose – Whether to enable verbose output.
pass_configs – Additional keyword arguments to pass to the Compiler PassContext.
- Returns:
Self for method chaining.
- Return type:
AutoTuner
- set_kernel_parameters(parameters: Tuple[str, ...])#
- set_profile_args(warmup: int = 25, rep: int = 100, timeout: int = 30, supply_type: TensorSupplyType = TensorSupplyType.Auto, ref_prog: Optional[Callable] = None, supply_prog: Optional[Callable] = None, rtol: float = 0.01, atol: float = 0.01, max_mismatched_ratio: float = 0.01, skip_check: bool = False, manual_check_prog: Optional[Callable] = None, cache_input_tensors: bool = False)#
Set profiling arguments for the auto-tuner.
- Parameters:
supply_type – Type of tensor supply mechanism. Ignored if supply_prog is provided.
ref_prog – Reference program for validation.
supply_prog – Supply program for input tensors.
rtol – Relative tolerance for validation.
atol – Absolute tolerance for validation.
max_mismatched_ratio – Maximum allowed mismatch ratio.
skip_check – Whether to skip validation.
manual_check_prog – Manual check program for validation.
cache_input_tensors – Whether to cache input tensors.
warmup – Number of warmup iterations.
rep – Number of repetitions for timing.
timeout – Maximum time per configuration.
- Returns:
Self for method chaining.
- Return type:
AutoTuner
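The interaction of rtol, atol, and max_mismatched_ratio can be illustrated with a small pure-Python check. The element-wise rule used here, `|actual - expected| <= atol + rtol * |expected|`, is the convention of NumPy's `allclose` and is an assumption; the exact rule tilelang applies is not stated in this page. `sketch_outputs_match` is a hypothetical helper.

```python
def sketch_outputs_match(actual, expected, rtol=0.01, atol=0.01,
                         max_mismatched_ratio=0.01):
    """Return True if the share of out-of-tolerance elements is small enough."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    # Count elements whose error exceeds the combined tolerance (assumed rule).
    mismatched = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    return mismatched <= max_mismatched_ratio * len(expected)


expected = [1.0] * 200
close_enough = [1.005] * 200            # every element within tolerance
one_bad = [2.0] + [1.0] * 199           # 0.5% of elements mismatched
ten_bad = [2.0] * 10 + [1.0] * 190      # 5% of elements mismatched

ok_close = sketch_outputs_match(close_enough, expected)
ok_one_bad = sketch_outputs_match(one_bad, expected)
ok_ten_bad = sketch_outputs_match(ten_bad, expected)
```

With the default ratio of 0.01, a single outlier in 200 elements is tolerated while ten outliers are not.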
- exception tilelang.autotuner.TimeoutException#
Bases:
Exception
- tilelang.autotuner.autotune(func: Optional[Union[Callable[[_P], _RProg], PrimFunc]] = None, *, configs: Any, warmup: int = 25, rep: int = 100, timeout: int = 100, supply_type: TensorSupplyType = TensorSupplyType.Auto, ref_prog: Optional[Callable] = None, supply_prog: Optional[Callable] = None, rtol: float = 0.01, atol: float = 0.01, max_mismatched_ratio: float = 0.01, skip_check: bool = False, manual_check_prog: Optional[Callable] = None, cache_input_tensors: bool = False)#
Auto-tuning decorator for TileLang functions.
- This decorator is configured through keyword arguments (e.g., @tilelang.autotuner.autotune(configs=...)):
Tries the given configurations and selects the best-performing one.
- Parameters:
func (Union[Callable, PrimFunc], optional) – The function to be auto-tuned. When the decorator is called with keyword arguments only, this is None and the function is supplied when the returned decorator is applied.
configs (Any) – Configurations to try during auto-tuning. Required, keyword-only.
warmup (int, optional) – Number of warmup iterations. Defaults to 25.
rep (int, optional) – Number of repetitions for timing. Defaults to 100.
timeout (int, optional) – Maximum time per configuration. Defaults to 100.
supply_type (TensorSupplyType, optional) – Type of tensor supply mechanism. Ignored if supply_prog is provided. Defaults to TensorSupplyType.Auto.
ref_prog (Optional[Callable], optional) – Reference program for validation. Defaults to None.
supply_prog (Optional[Callable], optional) – Supply program for input tensors. Defaults to None.
rtol (float, optional) – Relative tolerance for validation. Defaults to 0.01.
atol (float, optional) – Absolute tolerance for validation. Defaults to 0.01.
max_mismatched_ratio (float, optional) – Maximum allowed mismatch ratio. Defaults to 0.01.
skip_check (bool, optional) – Whether to skip validation. Defaults to False.
manual_check_prog (Optional[Callable], optional) – Manual check program for validation. Defaults to None.
cache_input_tensors (bool, optional) – Whether to cache input tensors. Defaults to False.
- Returns:
Either an auto-tuned wrapper around the input function, or a configured decorator instance that can then be applied to a function.
- Return type:
Callable
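The signature above, with an optional positional func and a keyword-only configs, follows a common Python pattern for decorators that work both with arguments and when called directly on a function. The sketch below shows that pattern only; `sketch_autotune`, `make_strided_sum`, `make_plain`, and `step` are hypothetical, and the timing loop is a toy stand-in for real profiling.

```python
import functools
import time


def sketch_autotune(func=None, *, configs, rep=3):
    """Toy decorator: time each config's candidate and return the fastest."""

    def decorate(factory):
        @functools.wraps(factory)
        def tuned():
            timings = []
            for config in configs:
                candidate = factory(**config)  # build a callable for this config
                start = time.perf_counter()
                for _ in range(rep):
                    candidate()
                mean = (time.perf_counter() - start) / rep
                timings.append((mean, config, candidate))
            _, best_config, best_candidate = min(timings, key=lambda t: t[0])
            return best_config, best_candidate

        return tuned

    # Called with keyword arguments only: return the decorator.
    # Called with a function: decorate it immediately.
    return decorate if func is None else decorate(func)


@sketch_autotune(configs=[{"step": 1}, {"step": 64}])
def make_strided_sum(step):
    data = list(range(10_000))
    return lambda: sum(data[::step])


def make_plain(step):
    data = list(range(100))
    return lambda: sum(data[::step])


best_config, best_kernel = make_strided_sum()
tuned_direct = sketch_autotune(make_plain, configs=[{"step": 8}])
direct_config, direct_kernel = tuned_direct()
```

Making configs keyword-only keeps the two call forms unambiguous: a positional argument can only be the function being decorated.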
- tilelang.autotuner.get_available_cpu_count() → int#
Gets the number of CPU cores available to the current process.
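The number of cores available to a process can be smaller than the machine's total, for example under CPU affinity masks or container limits. A minimal standard-library sketch of that distinction follows; `sketch_available_cpu_count` is a hypothetical helper, and whether tilelang computes the value this way is an assumption.

```python
import os


def sketch_available_cpu_count() -> int:
    """Prefer the process's affinity mask; fall back to the total core count."""
    try:
        # Available on Linux: the set of CPUs this process may run on.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity (e.g. macOS, Windows).
        return os.cpu_count() or 1


available = sketch_available_cpu_count()
```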
- tilelang.autotuner.run_with_timeout(func, timeout, *args, **kwargs)#
- tilelang.autotuner.timeout_handler(signum, frame)#
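The `(signum, frame)` signature of timeout_handler matches what Python's `signal` module passes to signal handlers, which suggests a signal-based timeout. The sketch below shows how such a trio (exception, handler, wrapper) could fit together using `SIGALRM`. It is an assumption about the mechanism, not tilelang's code; the `Sketch`-prefixed names are hypothetical, and the approach works only on Unix and in the main thread.

```python
import signal
import time


class SketchTimeout(Exception):
    """Local stand-in for a timeout exception (not tilelang's class)."""


def sketch_timeout_handler(signum, frame):
    # Signal handlers receive the signal number and the current stack frame.
    raise SketchTimeout(f"call interrupted by signal {signum}")


def sketch_run_with_timeout(func, timeout, *args, **kwargs):
    """Run func(*args, **kwargs), raising SketchTimeout after `timeout` seconds."""
    previous = signal.signal(signal.SIGALRM, sketch_timeout_handler)
    signal.setitimer(signal.ITIMER_REAL, timeout)  # arm the timer
    try:
        return func(*args, **kwargs)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)      # always disarm the timer
        signal.signal(signal.SIGALRM, previous)      # restore the old handler


def slow(seconds):
    time.sleep(seconds)
    return "finished"


fast_result = sketch_run_with_timeout(slow, 1.0, 0.01)
try:
    sketch_run_with_timeout(slow, 0.1, 2.0)
    timed_out = False
except SketchTimeout:
    timed_out = True
```

Disarming the timer in a `finally` block matters: without it, a call that finishes early would still be hit by the pending alarm later.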