tilelang.language.experimental.gemm_sp_op
Sparse GEMM operators exposed on the TileLang language surface.
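Functions
| gemm_sp(A_sparse, E, B, C, ...) | TileLang sparse GEMM operator. |
| wgmma_gemm_sp(A_sparse, E, B, C, ...) | Explicit Hopper WGMMA sparse GEMM without an implicit wait. |
| tcgen05_gemm_sp(A_sparse, E, B, C, ...) | Explicit Blackwell TCGEN05 sparse GEMM without an implicit wait. |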
Module Contents
- tilelang.language.experimental.gemm_sp_op.gemm_sp(A_sparse, E, B, C, transpose_A=False, transpose_E=False, transpose_B=False, policy=GemmWarpPolicy.Square, clear_accum=False, k_pack=1, wg_wait=0)
TileLang sparse GEMM operator.
This is the default synchronous sparse GEMM interface. On Hopper, if the compiler selects WGMMA SP lowering, TileLang inserts the corresponding wait implicitly.
For manual asynchronous scheduling, use T.wgmma_gemm_sp(...) with T.wait_wgmma(...) on Hopper, or T.tcgen05_gemm_sp(...) on Blackwell.
- Parameters:
A_sparse (tilelang._typing.BufferLikeType | tvm.tir.Var) – Compressed sparse matrix containing only non-zero elements.
E (tilelang._typing.BufferLikeType | tvm.tir.Var) – Metadata tensor encoding the sparsity pattern of A.
B (tilelang._typing.BufferLikeType | tvm.tir.Var) – Dense input matrix.
C (tilelang._typing.BufferLikeType | tvm.tir.Var) – Output accumulator matrix.
transpose_A (bool) – Whether to transpose A. Defaults to False.
transpose_E (bool) – Whether to transpose E. Defaults to False.
transpose_B (bool) – Whether to transpose B. Defaults to False.
policy (tilelang.tileop.base.GemmWarpPolicy) – Warp partition policy. Defaults to GemmWarpPolicy.Square.
clear_accum (bool) – Whether to zero the accumulator before computation. Defaults to False.
k_pack (int) – Number of K dimensions packed per warp. Defaults to 1.
wg_wait (int) – Warp group wait count. Defaults to 0.
- Returns:
A handle to the sparse GEMM operation.
- Return type:
tir.Call
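To make the roles of A_sparse, E, and clear_accum concrete, here is a minimal pure-Python reference sketch of the arithmetic. It is not TileLang code and does not reflect the hardware layout. It assumes 2:4 structured sparsity (at most two non-zeros in every group of four K elements) and stores the metadata as plain in-group positions rather than packed bits. The helper names compress_2_4 and gemm_sp_reference are hypothetical.

```python
def compress_2_4(row):
    """Split one row with 2:4 sparsity into kept values and their in-group positions.

    Assumption: at most two non-zeros per group of four; len(row) is a multiple of 4.
    """
    values, meta = [], []
    for g in range(0, len(row), 4):
        group = row[g:g + 4]
        keep = [i for i, v in enumerate(group) if v != 0]
        if len(keep) > 2:
            raise ValueError("group has more than two non-zeros")
        # Pad with zero-valued positions so every group keeps exactly two entries.
        for i in range(4):
            if len(keep) == 2:
                break
            if i not in keep:
                keep.append(i)
        keep.sort()
        values.extend(group[i] for i in keep)
        meta.extend(keep)
    return values, meta


def gemm_sp_reference(a_sparse, e, b, c, clear_accum=False):
    """Compute C = A @ B (clear_accum=True) or C += A @ B (clear_accum=False).

    a_sparse holds the compressed rows of A; e holds the matching positions.
    """
    for i in range(len(c)):
        for j in range(len(c[0])):
            acc = 0 if clear_accum else c[i][j]
            for p, (val, pos) in enumerate(zip(a_sparse[i], e[i])):
                k = (p // 2) * 4 + pos  # original K index of this kept value
                acc += val * b[k][j]
            c[i][j] = acc
    return c


# One row of A with K = 8, and a dense 8 x 2 matrix B.
a_dense = [[1, 0, 0, 2, 0, 3, 0, 0]]
b = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]

compressed = [compress_2_4(row) for row in a_dense]
a_sparse = [vals for vals, _ in compressed]
e = [meta for _, meta in compressed]

accumulated = gemm_sp_reference(a_sparse, e, b, [[100, 200]], clear_accum=False)
overwritten = gemm_sp_reference(a_sparse, e, b, [[100, 200]], clear_accum=True)
```

The compressed operand has half the K extent of the dense one, which is why the metadata is needed to recover each value's original column.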
- tilelang.language.experimental.gemm_sp_op.wgmma_gemm_sp(A_sparse, E, B, C, transpose_A=False, transpose_E=False, transpose_B=False, policy=GemmWarpPolicy.Square, clear_accum=False)
Explicit Hopper WGMMA sparse GEMM without an implicit wait.
This is the explicit asynchronous Hopper WGMMA counterpart to the default synchronous T.gemm_sp(...) interface, with two stricter guarantees:
- it always requests the WGMMA SP lowering path
- it never auto-emits an inlined warpgroup_wait
If the current target or operand pattern cannot use Hopper WGMMA SP, compilation fails instead of silently falling back to MMA SP.
- Parameters:
A_sparse (tilelang._typing.BufferLikeType | tvm.tir.Var) – Compressed sparse matrix containing only non-zero elements.
E (tilelang._typing.BufferLikeType | tvm.tir.Var) – Metadata tensor encoding the sparsity pattern of A.
B (tilelang._typing.BufferLikeType | tvm.tir.Var) – Dense input matrix.
C (tilelang._typing.BufferLikeType | tvm.tir.Var) – Output accumulator matrix.
transpose_A (bool) – Whether to transpose A. Defaults to False.
transpose_E (bool) – Whether to transpose E. Defaults to False.
transpose_B (bool) – Whether to transpose B. Defaults to False.
policy (tilelang.tileop.base.GemmWarpPolicy) – Warp partition policy. Defaults to GemmWarpPolicy.Square.
clear_accum (bool) – Whether to zero the accumulator before computation. Defaults to False.
- Returns:
A handle to the sparse GEMM operation.
- Return type:
tir.Call
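Because no wait is emitted implicitly, the accumulator must not be read until an explicit wait has been issued. The toy model below sketches that issue-now, wait-later discipline in plain Python. It is a conceptual illustration, not GPU or TileLang code. It assumes the wait count follows the PTX wgmma.wait_group convention of "block until at most N operations remain in flight"; whether T.wait_wgmma(...) takes exactly that argument is not stated here. The class ToyWarpGroup is hypothetical.

```python
class ToyWarpGroup:
    """Toy model of asynchronous issue followed by an explicit wait."""

    def __init__(self):
        # Issued but not yet completed operations, oldest first.
        self.pending = []

    def issue(self, accum, delta):
        # Record the update; the accumulator is untouched until completion.
        self.pending.append((accum, delta))

    def wait(self, max_pending=0):
        # Complete the oldest operations until at most `max_pending` remain.
        while len(self.pending) > max_pending:
            accum, delta = self.pending.pop(0)
            accum[0] += delta


wg = ToyWarpGroup()
c = [0]            # single-element stand-in for the accumulator C
wg.issue(c, 5)     # first asynchronous GEMM step
wg.issue(c, 7)     # second step, issued before the first is awaited

stale = c[0]       # read before any wait: neither update is visible yet
wg.wait(1)         # allow one operation to stay in flight
partial = c[0]
wg.wait(0)         # drain everything before consuming C
final = c[0]
```

In this model, a wait count of 0 corresponds to the fully synchronous behaviour of the default interface, while a larger count leaves room to overlap later work with operations still in flight.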
- tilelang.language.experimental.gemm_sp_op.tcgen05_gemm_sp(A_sparse, E, B, C, transpose_A=False, transpose_E=False, transpose_B=False, policy=GemmWarpPolicy.Square, clear_accum=False)
Explicit Blackwell TCGEN05 sparse GEMM without an implicit wait.
This is the explicit asynchronous Blackwell TCGEN05 counterpart to the default synchronous T.gemm_sp(...) interface, with two stricter guarantees:
- it always requests the TCGEN05 SP lowering path
- it never auto-emits an inlined mbarrier_wait_parity
If the current target or operand pattern cannot use Blackwell TCGEN05 SP, compilation fails instead of silently falling back to another sparse GEMM path.
- Parameters:
A_sparse (tilelang._typing.BufferLikeType | tvm.tir.Var) – Compressed sparse matrix containing only non-zero elements.
E (tilelang._typing.BufferLikeType | tvm.tir.Var) – Metadata tensor encoding the sparsity pattern of A.
B (tilelang._typing.BufferLikeType | tvm.tir.Var) – Dense input matrix.
C (tilelang._typing.BufferLikeType | tvm.tir.Var) – Output accumulator matrix.
transpose_A (bool) – Whether to transpose A. Defaults to False.
transpose_E (bool) – Whether to transpose E. Defaults to False.
transpose_B (bool) – Whether to transpose B. Defaults to False.
policy (tilelang.tileop.base.GemmWarpPolicy) – Warp partition policy. Defaults to GemmWarpPolicy.Square.
clear_accum (bool) – Whether to zero the accumulator before computation. Defaults to False.
- Returns:
A handle to the sparse GEMM operation.
- Return type:
tir.Call