tilelang.tileop.gemm.gemm_wgmma¶
Classes¶
Base class for GEMM tile operators. |
Module Contents¶
- class tilelang.tileop.gemm.gemm_wgmma.GemmWGMMA¶
Bases:
tilelang.tileop.gemm.gemm_base.GemmBaseBase class for GEMM tile operators.
Classifies the GEMM variant by the memory scopes of operands A and B (SS, SR, RS, TS, RR) and provides common property accessors for the underlying
gemm_nodeIR node.Infer the swizzle layout for shared memory based on continuity.
WGMMA can directly use shared memory as input, so the swizzle layout must match the tensor core’s access pattern. The swizzle granularity is determined by the continuous dimension size:
128B swizzle (Full): continuity % (vectorized_size * 8) == 0
64B swizzle (Half): continuity % (vectorized_size * 4) == 0
32B swizzle (Quarter): continuity % (vectorized_size * 2) == 0
Linear (no swizzle): otherwise
See: https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TENSOR__MEMORY.html
- Parameters:
continuity (int)
- Return type:
Callable[[tvm.tir.Buffer], tilelang.layout.Layout]
- infer_layout(target, thread_nums)¶
- Parameters:
target (tvm.target.Target)
thread_nums (int)
- lower(layout_map, target, thread_bounds, thread_var)¶
- Parameters:
layout_map (dict)
target (tvm.target.Target)
thread_bounds (tvm.ir.Range)
thread_var (tvm.tir.Var)
- is_gemm_sr()¶
Return True if A is in shared memory and B is in registers (SR variant).
- Return type:
- is_gemm_rs()¶
Return True if A is in registers and B is in shared memory (RS variant).
- Return type: