tilelang.intrinsics.utils

Functions

get_ldmatrix_offset(matrix, row_idx, col_idx, stride)

shared_16x16_to_mma_32x8_layout(i, j)

shared_16x32_to_mma_32x16_layout(i, j)

shared_32x16_to_mma_32x16_layout(i, j)

mma_store_index_map(thread_id, local_id)

mfma_store_index_map(thread_id, local_id)

get_mma_micro_size(dtype)

Return the MMA (Tensor Core) micro-tile dimensions for a given data type.

Module Contents

tilelang.intrinsics.utils.get_ldmatrix_offset(matrix, row_idx, col_idx, stride, dtype='float16', transposed=False)
Parameters:
  • matrix (Literal['A', 'B'])

  • dtype (Literal['float16', 'int8'])

  • transposed (bool)

tilelang.intrinsics.utils.shared_16x16_to_mma_32x8_layout(i, j)
tilelang.intrinsics.utils.shared_16x32_to_mma_32x16_layout(i, j)
tilelang.intrinsics.utils.shared_32x16_to_mma_32x16_layout(i, j)
tilelang.intrinsics.utils.mma_store_index_map(thread_id, local_id)
tilelang.intrinsics.utils.mfma_store_index_map(thread_id, local_id)
tilelang.intrinsics.utils.get_mma_micro_size(dtype)

Return the MMA (Tensor Core) micro-tile dimensions for a given data type.

This function returns the micro tile sizes (x, y, k) used by MMA/Tensor Core operations:

  • x: tile width in the output/result dimension

  • y: tile height in the output/result dimension

  • k: tile depth in the reduction/K dimension

Accepted dtype strings include "float16", "int8", and the FP8 identifiers "float8_e4m3" and "float8_e5m2". For the FP8 and int8 types the reduction depth (k) is 32; for float16 it is 16.

Returns:

(micro_size_x, micro_size_y, micro_size_k)

Return type:

tuple[int, int, int]

Parameters:

dtype (Literal['float16', 'int8'])
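The dtype-to-k rule described for get_mma_micro_size can be sketched in plain Python, without importing tilelang. The names `expected_micro_size_k` and `k_steps` below are hypothetical helpers introduced only for illustration; they are not part of tilelang.intrinsics.utils. They model only the reduction depth k stated above, since x and y are not specified on this page.

```python
# Hypothetical illustration, not tilelang code: models only the documented
# reduction depth (k) per dtype. The x and y sizes are not stated on this
# page, so they are deliberately left out.
_K_BY_DTYPE = {
    "float16": 16,
    "int8": 32,
    "float8_e4m3": 32,
    "float8_e5m2": 32,
}


def expected_micro_size_k(dtype: str) -> int:
    """Return the documented micro-tile depth k for a dtype string."""
    try:
        return _K_BY_DTYPE[dtype]
    except KeyError:
        raise ValueError(f"dtype not covered by this sketch: {dtype!r}") from None


def k_steps(block_k: int, dtype: str) -> int:
    """Number of micro-tile steps needed to cover a K extent of block_k."""
    micro_k = expected_micro_size_k(dtype)
    if block_k % micro_k != 0:
        raise ValueError(f"block_k={block_k} is not a multiple of k={micro_k}")
    return block_k // micro_k


if __name__ == "__main__":
    for dt in ("float16", "int8", "float8_e4m3"):
        print(dt, expected_micro_size_k(dt), k_steps(64, dt))
```

With a K extent of 64, the sketch gives 4 steps for float16 and 2 steps for int8 or FP8, which is the practical consequence of the larger k for the 8-bit types. In real code the tuple returned by get_mma_micro_size(dtype) would presumably be used in place of the lookup table.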