tilelang.autotuner.param¶
The auto-tune parameters.
Attributes¶
Classes¶
CompileArgs | Compile arguments for the auto-tuner. A detailed description can be found in tilelang.jit.compile.
ProfileArgs | Profile arguments for the auto-tuner.
AutotuneResult | Results from the auto-tuning process.
Module Contents¶
- tilelang.autotuner.param.BEST_CONFIG_PATH = 'best_config.json'¶
- tilelang.autotuner.param.FUNCTION_PATH = 'function.pkl'¶
- tilelang.autotuner.param.LATENCY_PATH = 'latency.json'¶
- tilelang.autotuner.param.KERNEL_PATH = 'kernel.cu'¶
- tilelang.autotuner.param.WRAPPED_KERNEL_PATH = 'wrapped_kernel.cu'¶
- tilelang.autotuner.param.KERNEL_LIB_PATH = 'kernel_lib.so'¶
- tilelang.autotuner.param.PARAMS_PATH = 'params.pkl'¶
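The constants above are plain file names, so they are presumably joined onto a cache or output directory. The following is a minimal sketch of that idea using only the standard library. The helpers `artifact_paths` and `round_trip`, the JSON layout (`{"latency": ...}`), and the config keys `block_M` and `block_N` are hypothetical illustrations, not part of tilelang.

```python
import json
import tempfile
from pathlib import Path

# File names copied from the constants documented above.
BEST_CONFIG_PATH = "best_config.json"
LATENCY_PATH = "latency.json"
KERNEL_PATH = "kernel.cu"


def artifact_paths(cache_dir: Path) -> dict:
    """Map each artifact name to its location under cache_dir (hypothetical helper)."""
    return {
        "config": cache_dir / BEST_CONFIG_PATH,
        "latency": cache_dir / LATENCY_PATH,
        "kernel": cache_dir / KERNEL_PATH,
    }


def round_trip(cache_dir: Path, config: dict, latency: float) -> tuple:
    """Write config and latency as JSON, then read them back (assumed layout)."""
    paths = artifact_paths(cache_dir)
    paths["config"].write_text(json.dumps(config))
    paths["latency"].write_text(json.dumps({"latency": latency}))
    loaded_config = json.loads(paths["config"].read_text())
    loaded_latency = json.loads(paths["latency"].read_text())["latency"]
    return loaded_config, loaded_latency


with tempfile.TemporaryDirectory() as tmp:
    # The config keys are made up for illustration.
    cfg, lat = round_trip(Path(tmp), {"block_M": 128, "block_N": 64}, 0.042)
    print(cfg, lat)
```

Keeping the file names in module-level constants means a writer and a reader only need to agree on the directory.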
- class tilelang.autotuner.param.CompileArgs¶
Compile arguments for the auto-tuner. A detailed description can be found in tilelang.jit.compile.
- out_idx¶
List of output tensor indices.
- execution_backend¶
Execution backend to use for kernel execution (default: “cython”).
- target¶
Compilation target, either as a string or a TVM Target object (default: “auto”).
- target_host¶
Target host for cross-compilation (default: None).
- verbose¶
Whether to enable verbose output (default: False).
- pass_configs¶
Additional keyword arguments to pass to the Compiler PassContext.
- Available options
“tir.disable_vectorize”: bool, default: False
“tl.disable_tma_lower”: bool, default: False
“tl.disable_warp_specialized”: bool, default: False
“tl.config_index_bitwidth”: int, default: None
“tl.disable_dynamic_tail_split”: bool, default: False
“tl.dynamic_vectorize_size_bits”: int, default: 128
“tl.disable_safe_memory_legalize”: bool, default: False
- out_idx: List[int] | int | None = None¶
- execution_backend: Literal['dlpack', 'ctypes', 'cython'] = 'cython'¶
- target: Literal['auto', 'cuda', 'hip'] = 'auto'¶
- target_host: str | tvm.target.Target = None¶
- verbose: bool = False¶
- pass_configs: Dict[str, Any] | None = None¶
- compile_program(program)¶
- Parameters:
program (tvm.tir.PrimFunc)
- __hash__()¶
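Because `CompileArgs` carries list and dict fields yet defines `__hash__`, it is presumably usable as a cache key. The sketch below shows one way such a content-based hash could work. `CompileArgsSketch` is a stand-in that only mirrors the documented field names and defaults. It is not the tilelang class, and the digest scheme is an assumption.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass(frozen=True)
class CompileArgsSketch:
    """Stand-in mirroring the documented fields; not the tilelang class."""

    out_idx: Union[List[int], int, None] = None
    execution_backend: str = "cython"
    target: str = "auto"
    target_host: Optional[str] = None
    verbose: bool = False
    pass_configs: Optional[Dict[str, Any]] = None

    def digest(self) -> str:
        # Serialize with sorted keys so dict ordering does not change the digest.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def __hash__(self) -> int:
        # Derive an integer hash from the content digest (assumed scheme).
        return int(self.digest(), 16)


# Same options, different key order: both should hash identically.
args_a = CompileArgsSketch(
    out_idx=[-1],
    pass_configs={"tl.disable_tma_lower": True, "tir.disable_vectorize": False},
)
args_b = CompileArgsSketch(
    out_idx=[-1],
    pass_configs={"tir.disable_vectorize": False, "tl.disable_tma_lower": True},
)
print(hash(args_a) == hash(args_b))
```

An explicit `__hash__` is needed here because the auto-generated dataclass hash would fail on the unhashable list and dict fields.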
- class tilelang.autotuner.param.ProfileArgs¶
Profile arguments for the auto-tuner.
- warmup¶
Number of warmup iterations.
- rep¶
Number of repetitions for timing.
- timeout¶
Maximum time per configuration.
- supply_type¶
Type of tensor supply mechanism (tilelang.TensorSupplyType, default: tilelang.TensorSupplyType.Auto).
- ref_prog¶
Reference program for correctness validation (Callable, default: None).
- supply_prog¶
Supply program for input tensors (Callable, default: None).
- out_idx¶
Union[List[int], int], default: -1.
- rtol¶
float, default: 1e-2.
- atol¶
float, default: 1e-2.
- max_mismatched_ratio¶
float, default: 0.01.
- skip_check¶
bool, default: False.
- manual_check_prog¶
Callable, default: None.
- cache_input_tensors¶
bool, default: True.
- warmup: int = 25¶
- rep: int = 100¶
- timeout: int = 30¶
- supply_type: tilelang.TensorSupplyType¶
- ref_prog: Callable = None¶
- supply_prog: Callable = None¶
- rtol: float = 0.01¶
- atol: float = 0.01¶
- max_mismatched_ratio: float = 0.01¶
- skip_check: bool = False¶
- manual_check_prog: Callable = None¶
- cache_input_tensors: bool = True¶
- __hash__()¶
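The page does not spell out how `rtol`, `atol`, and `max_mismatched_ratio` interact. A plausible reading, assumed here, follows the common `allclose` convention: an element mismatches when `|actual - expected| > atol + rtol * |expected|`, and the check passes if the fraction of mismatched elements stays at or below `max_mismatched_ratio`. The function `is_close_enough` below is a hypothetical pure-Python sketch of that rule, not tilelang's validation code.

```python
from typing import Sequence


def is_close_enough(
    actual: Sequence[float],
    expected: Sequence[float],
    rtol: float = 1e-2,
    atol: float = 1e-2,
    max_mismatched_ratio: float = 0.01,
) -> bool:
    """Return True if the share of out-of-tolerance elements is small enough."""
    if len(actual) != len(expected):
        raise ValueError("length mismatch")
    if not actual:
        return True
    # Count elements outside the combined absolute + relative tolerance.
    mismatched = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    return mismatched / len(actual) <= max_mismatched_ratio


expected = [1.0] * 200
one_bad = list(expected)
one_bad[0] = 2.0            # 1 of 200 elements is off: ratio 0.005
three_bad = list(one_bad)
three_bad[1] = 2.0
three_bad[2] = 2.0          # 3 of 200 elements are off: ratio 0.015

print(is_close_enough(one_bad, expected))
print(is_close_enough(three_bad, expected))
```

With the documented defaults, the first comparison passes and the second fails, since only the first stays under the 1% mismatch budget.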
- class tilelang.autotuner.param.AutotuneResult¶
Results from the auto-tuning process.
- latency¶
Best achieved execution latency.
- config¶
Configuration that produced the best result.
- ref_latency¶
Reference implementation latency.
- libcode¶
Generated library code.
- func¶
Optimized function.
- kernel¶
Compiled kernel function.
- latency: float | None = None¶
- config: dict | None = None¶
- ref_latency: float | None = None¶
- libcode: str | None = None¶
- func: Callable | None = None¶
- kernel: Callable | None = None¶
- save_to_disk(path, verbose=False)¶
- Parameters:
path (pathlib.Path)
verbose (bool)
- classmethod load_from_disk(path, compile_args)¶
- Parameters:
path (pathlib.Path)
compile_args (CompileArgs)
- Return type: