tilelang.autotuner package

Submodules

Module contents

The auto-tune module for tilelang programs.

This module provides functionality for auto-tuning tilelang programs, including JIT compilation and performance optimization through configuration search.

class tilelang.autotuner.AutoTuner(fn: Callable, configs)

Bases: object

Auto-tuner for tilelang programs.

This class runs the auto-tuning process: it tests the supplied configurations and selects the best-performing parameters for program execution.

Parameters:
  • fn – The function to be auto-tuned.

  • configs – List of configurations to try during auto-tuning.
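The search described above can be pictured as a loop over candidate configurations that keeps the fastest one. The following is a minimal sketch of that idea in plain Python, not tilelang's implementation; `pick_best_config`, `toy_kernel`, and the `block` parameter are hypothetical names introduced only for illustration.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def pick_best_config(
    fn: Callable[..., Any], configs: List[Dict[str, Any]]
) -> Tuple[Dict[str, Any], float]:
    """Call fn once per config and return the fastest config with its latency in seconds."""
    if not configs:
        raise ValueError("configs must not be empty")
    best_config, best_latency = configs[0], float("inf")
    for config in configs:
        start = time.perf_counter()
        fn(**config)  # each config is passed as keyword arguments
        latency = time.perf_counter() - start
        if latency < best_latency:
            best_config, best_latency = config, latency
    return best_config, best_latency


def toy_kernel(block: int) -> int:
    """Hypothetical stand-in for a kernel: larger blocks mean fewer loop steps."""
    total = 0
    for start in range(0, 200_000, block):
        total += start
    return total


candidates = [{"block": 1}, {"block": 64}, {"block": 1024}]
best, latency = pick_best_config(toy_kernel, candidates)
print(best, latency)
```

A single timed call per configuration is noisy; a real tuner would normally repeat each measurement, which is what the warmup and rep arguments documented below are for.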

cache_dir: Path = PosixPath('/home/t-leiwang/.tilelang/cache')

compile_args = CompileArgs(out_idx=None, execution_backend='cython', target='auto', target_host=None, verbose=False, pass_configs=None)

classmethod from_kernel(kernel: Callable, configs)

Create an AutoTuner instance from a kernel function.

Parameters:
  • kernel – The kernel function to auto-tune.

  • configs – List of configurations to try.

Returns:

A new AutoTuner instance.

Return type:

AutoTuner

generate_cache_key(parameters: Dict[str, Any]) → Optional[AutotuneResult]

Generate a cache key for the auto-tuning process.
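As an illustration of what a cache key for a parameter dictionary can look like, the sketch below hashes a canonical JSON encoding of the parameters. This is an assumption about the general technique, not a description of what `generate_cache_key` does internally, and it does not mirror the documented return annotation; `make_cache_key` and the `block_M`/`block_N` names are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict


def make_cache_key(parameters: Dict[str, Any]) -> str:
    """Return a stable hex digest for a dictionary of tuning parameters."""
    # sort_keys makes the key independent of dictionary insertion order;
    # default=str is a fallback for values that json cannot serialize directly.
    payload = json.dumps(parameters, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = make_cache_key({"block_M": 128, "block_N": 64})
key_b = make_cache_key({"block_N": 64, "block_M": 128})
print(key_a == key_b)
```

Because the encoding is canonical, two dictionaries with the same contents map to the same key regardless of key order, which is the property a cache lookup needs.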

profile_args = ProfileArgs(warmup=25, rep=100, timeout=30, supply_type=<TensorSupplyType.Auto: 7>, ref_prog=None, supply_prog=None, rtol=0.01, atol=0.01, max_mismatched_ratio=0.01, skip_check=False, manual_check_prog=None, cache_input_tensors=True)

run(warmup: int = 25, rep: int = 100, timeout: int = 30)

Run the auto-tuning process.

Parameters:
  • warmup – Number of warmup iterations.

  • rep – Number of repetitions for timing.

  • timeout – Maximum time per configuration.

Returns:

Results of the auto-tuning process.

Return type:

AutotuneResult
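To make the roles of warmup and rep concrete, here is a small sketch of a timing loop that discards the warmup calls and averages over the timed repetitions. It follows the parameter descriptions above (iteration counts) and is not taken from tilelang; `measure_latency_ms` is a hypothetical helper.

```python
import time
from typing import Callable, List


def measure_latency_ms(fn: Callable[[], object], warmup: int = 25, rep: int = 100) -> float:
    """Return the mean latency of fn in milliseconds over rep timed calls."""
    if rep <= 0:
        raise ValueError("rep must be positive")
    for _ in range(warmup):
        fn()  # untimed calls, so one-time setup costs do not skew the result
    start = time.perf_counter()
    for _ in range(rep):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / rep


calls: List[int] = []
latency_ms = measure_latency_ms(lambda: calls.append(1), warmup=3, rep=10)
print(len(calls), latency_ms)
```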

set_compile_args(out_idx: Optional[Union[List[int], int]] = None, target: Literal['auto', 'cuda', 'hip'] = 'auto', execution_backend: Literal['dlpack', 'ctypes', 'cython'] = 'cython', target_host: Optional[Union[str, Target]] = None, verbose: bool = False, pass_configs: Optional[Dict[str, Any]] = None)

Set compilation arguments for the auto-tuner.

Parameters:
  • out_idx – Index, or list of indices, of the output tensors.

  • target – Target platform: 'auto', 'cuda', or 'hip'.

  • execution_backend – Execution backend to use for kernel execution.

  • target_host – Target host for cross-compilation.

  • verbose – Whether to enable verbose output.

  • pass_configs – Additional keyword arguments to pass to the Compiler PassContext.

Returns:

Self for method chaining.

Return type:

AutoTuner
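Because the setter returns the tuner itself, configuration can be written as one chained expression. The sketch below demonstrates the pattern with a hypothetical stand-in class, `StandInTuner`, that only mirrors the documented method names and records the arguments it receives; it does not compile or profile anything, and `configure` is likewise an illustrative helper.

```python
from typing import Any, Callable, Dict, List


class StandInTuner:
    """Hypothetical stand-in exposing the documented method names; it only records settings."""

    def __init__(self, fn: Callable, configs: List[Dict[str, Any]]):
        self.fn = fn
        self.configs = configs
        self.compile_args: Dict[str, Any] = {}
        self.profile_args: Dict[str, Any] = {}

    @classmethod
    def from_kernel(cls, kernel: Callable, configs: List[Dict[str, Any]]) -> "StandInTuner":
        return cls(kernel, configs)

    def set_compile_args(self, **kwargs: Any) -> "StandInTuner":
        self.compile_args.update(kwargs)
        return self  # returning self is what makes chaining possible

    def set_profile_args(self, **kwargs: Any) -> "StandInTuner":
        self.profile_args.update(kwargs)
        return self


def configure(tuner_cls: Any, kernel: Callable, configs: List[Dict[str, Any]]) -> Any:
    """Build and configure a tuner in a single chained expression."""
    return (
        tuner_cls.from_kernel(kernel, configs)
        .set_compile_args(target="auto", execution_backend="cython")
        .set_profile_args(warmup=25, rep=100, skip_check=True)
    )


tuner = configure(StandInTuner, lambda: None, [{"block_M": 64}, {"block_M": 128}])
print(tuner.compile_args, tuner.profile_args)
```

With tilelang installed, the same chain would presumably be written against tilelang.autotuner.AutoTuner and finished with a call to run(); that has not been verified here.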

set_kernel_parameters(parameters: Tuple[str, ...])

set_profile_args(warmup: int = 25, rep: int = 100, timeout: int = 30, supply_type: TensorSupplyType = TensorSupplyType.Auto, ref_prog: Optional[Callable] = None, supply_prog: Optional[Callable] = None, rtol: float = 0.01, atol: float = 0.01, max_mismatched_ratio: float = 0.01, skip_check: bool = False, manual_check_prog: Optional[Callable] = None, cache_input_tensors: bool = False)

Set profiling arguments for the auto-tuner.

Parameters:
  • supply_type – Type of tensor supply mechanism. Ignored if supply_prog is provided.

  • ref_prog – Reference program for validation.

  • supply_prog – Supply program for input tensors.

  • rtol – Relative tolerance for validation.

  • atol – Absolute tolerance for validation.

  • max_mismatched_ratio – Maximum allowed mismatch ratio.

  • skip_check – Whether to skip validation.

  • manual_check_prog – Manual check program for validation.

  • cache_input_tensors – Whether to cache input tensors.

  • warmup – Number of warmup iterations.

  • rep – Number of repetitions for timing.

  • timeout – Maximum time per configuration.

Returns:

Self for method chaining.

Return type:

AutoTuner
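The three validation tolerances can be read together as follows: an element counts as mismatched when it falls outside a combined absolute and relative bound, and the check passes as long as the fraction of mismatched elements stays within max_mismatched_ratio. The sketch below assumes the common `|actual - expected| <= atol + rtol * |expected|` rule; whether tilelang applies exactly this formula is an assumption, and `within_tolerance` is a hypothetical helper operating on plain Python sequences.

```python
from typing import Sequence


def within_tolerance(
    actual: Sequence[float],
    expected: Sequence[float],
    rtol: float = 0.01,
    atol: float = 0.01,
    max_mismatched_ratio: float = 0.01,
) -> bool:
    """Return True if the fraction of out-of-tolerance elements is small enough."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    if not expected:
        return True  # nothing to compare
    mismatched = sum(
        1 for a, e in zip(actual, expected) if abs(a - e) > atol + rtol * abs(e)
    )
    return mismatched / len(expected) <= max_mismatched_ratio


reference = [1.0] * 100
one_bad = [1.5] + [1.0] * 99          # 1% of elements mismatched
two_bad = [1.5, 1.5] + [1.0] * 98     # 2% of elements mismatched
print(within_tolerance(one_bad, reference), within_tolerance(two_bad, reference))
```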

exception tilelang.autotuner.TimeoutException

Bases: Exception

tilelang.autotuner.autotune(func: Optional[Union[Callable[[_P], _RProg], PrimFunc]] = None, *, configs: Any, warmup: int = 25, rep: int = 100, timeout: int = 100, supply_type: TensorSupplyType = TensorSupplyType.Auto, ref_prog: Optional[Callable] = None, supply_prog: Optional[Callable] = None, rtol: float = 0.01, atol: float = 0.01, max_mismatched_ratio: float = 0.01, skip_check: bool = False, manual_check_prog: Optional[Callable] = None, cache_input_tensors: bool = False)

Auto-tuning decorator for TileLang functions.

This decorator can be configured with keyword arguments (e.g., @tilelang.autotuner.autotune(configs=...)) or called directly on a function. The profiling arguments have the same meaning as in AutoTuner.set_profile_args.

Parameters:
  • func (Union[Callable, PrimFunc], optional) – The function to be auto-tuned. When only keyword arguments are given, this is None and the function is supplied when the returned decorator is applied.

  • configs (Any) – Configurations to try during auto-tuning. Keyword-only and required.

  • warmup (int, optional) – Number of warmup iterations. Defaults to 25.

  • rep (int, optional) – Number of repetitions for timing. Defaults to 100.

  • timeout (int, optional) – Maximum time per configuration. Defaults to 100.

  • supply_type (TensorSupplyType, optional) – Type of tensor supply mechanism. Ignored if supply_prog is provided. Defaults to TensorSupplyType.Auto.

  • ref_prog (Optional[Callable], optional) – Reference program for validation. Defaults to None.

  • supply_prog (Optional[Callable], optional) – Supply program for input tensors. Defaults to None.

  • rtol (float, optional) – Relative tolerance for validation. Defaults to 0.01.

  • atol (float, optional) – Absolute tolerance for validation. Defaults to 0.01.

  • max_mismatched_ratio (float, optional) – Maximum allowed mismatch ratio. Defaults to 0.01.

  • skip_check (bool, optional) – Whether to skip validation. Defaults to False.

  • manual_check_prog (Optional[Callable], optional) – Manual check program for validation. Defaults to None.

  • cache_input_tensors (bool, optional) – Whether to cache input tensors. Defaults to False.

Returns:

Either an auto-tuned wrapper around the input function, or a configured decorator instance that can then be applied to a function.

Return type:

Callable
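The signature above takes an optional positional func followed by keyword-only arguments, a common shape for decorators that can be either called directly on a function or configured first and applied later. The following is a generic sketch of that pattern; `tuned` is a hypothetical decorator that merely attaches the configurations to the wrapped function and performs no tuning.

```python
import functools
from typing import Any, Callable, Dict, List, Optional


def tuned(func: Optional[Callable] = None, *, configs: List[Dict[str, Any]]):
    """Hypothetical decorator usable as tuned(f, configs=...) or @tuned(configs=...)."""

    def decorate(f: Callable) -> Callable:
        @functools.wraps(f)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            return f(*args, **kwargs)

        wrapper.configs = list(configs)  # stored for a later search step
        return wrapper

    # Direct call: tuned(f, configs=...) returns the wrapped function.
    # Factory call: tuned(configs=...) returns a decorator awaiting a function.
    return decorate(func) if func is not None else decorate


@tuned(configs=[{"block": 32}, {"block": 64}])
def scale(x: int, block: int = 32) -> int:
    return x * block


direct = tuned(lambda x: x + 1, configs=[{"block": 16}])
print(scale(2), scale.configs, direct(1))
```

Making configs keyword-only keeps the two call forms unambiguous: the only positional argument the decorator ever receives is the function itself.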

tilelang.autotuner.get_available_cpu_count() → int

Gets the number of CPU cores available to the current process.
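The number of cores available to a process can differ from the machine's total when a CPU affinity mask or container limit is in effect. A minimal standard-library sketch of such a query is shown below; it is one plausible approach, not necessarily the one this function uses, and `available_cpu_count` is a hypothetical name.

```python
import os


def available_cpu_count() -> int:
    """Return the number of CPUs the current process may run on."""
    if hasattr(os, "sched_getaffinity"):
        # Respects the process's CPU affinity mask (Linux and some other Unix systems).
        return len(os.sched_getaffinity(0))
    # Fallback: total logical CPUs; os.cpu_count() can return None.
    return os.cpu_count() or 1


print(available_cpu_count())
```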

tilelang.autotuner.run_with_timeout(func, timeout, *args, **kwargs)

tilelang.autotuner.timeout_handler(signum, frame)
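The (signum, frame) signature matches that of a Python signal handler, which suggests a signal-based timeout. Assuming that reading, the sketch below shows how such a mechanism can be built with SIGALRM from the standard library. The names `CallTimedOut`, `_on_alarm`, and `call_with_timeout` are hypothetical, and the approach works only on Unix and only in the main thread.

```python
import signal
import time


class CallTimedOut(Exception):
    """Hypothetical exception raised when the wrapped call exceeds its time limit."""


def _on_alarm(signum, frame):
    # Invoked by the interpreter when SIGALRM is delivered.
    raise CallTimedOut("call exceeded its time limit")


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func(*args, **kwargs); raise CallTimedOut after `timeout` whole seconds."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout)  # schedule SIGALRM
    try:
        return func(*args, **kwargs)
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the previous handler


result = call_with_timeout(sum, 5, [1, 2, 3])
print(result)
```

A tuner can catch the timeout exception per configuration and move on to the next candidate, so a single slow or hanging configuration does not stall the whole search.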