tilelang.language¶
The language interface for tl programs.
Submodules¶
- tilelang.language.allocate
- tilelang.language.builtin
- tilelang.language.copy
- tilelang.language.customize
- tilelang.language.experimental
- tilelang.language.fill
- tilelang.language.frame
- tilelang.language.gemm
- tilelang.language.kernel
- tilelang.language.logical
- tilelang.language.memscope
- tilelang.language.parallel
- tilelang.language.persistent
- tilelang.language.pipeline
- tilelang.language.print
- tilelang.language.proxy
- tilelang.language.reduce
- tilelang.language.tir
- tilelang.language.warpgroup
Functions¶
- symbolic(name, dtype='int32')
- use_swizzle(panel_size, order='row', enable=True)
- annotate_layout(layout_map) – Annotate the layout of the buffer
- annotate_padding(padding_map) – Annotate the padding of the buffer
- annotate_l2_hit_ratio(l2_hit_ratio_map) – Annotate the L2 hit ratio of the buffer; see the CUDA L2 persisting-access policy for details
- import_source(source=None)
Package Contents¶
- tilelang.language.symbolic(name, dtype='int32')¶
- Parameters:
name (str)
dtype (str)
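A minimal usage sketch (an assumption based on the signature: `symbolic` returns a TIR variable usable as a dynamic shape dimension):

    import tilelang.language as T

    # Assumed behavior: create symbolic (dynamic) dimensions resolved at launch time
    M = T.symbolic("m")                # dtype defaults to 'int32'
    K = T.symbolic("k", dtype="int64")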
- tilelang.language.use_swizzle(panel_size, order='row', enable=True)¶
- Parameters:
panel_size (int)
order (str)
enable (bool)
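A hedged usage sketch (assumes `use_swizzle` reorders the block rasterization into panels of `panel_size` to improve L2 locality; `M`, `N`, `block_M`, and `block_N` are hypothetical problem and tile sizes):

    with T.Kernel(T.ceildiv(N, block_N), T.ceildiv(M, block_M), threads=128) as (bx, by):
        # Traverse blocks in row-ordered panels of 4; pass enable=False to turn it off
        T.use_swizzle(panel_size=4, order="row")
        ...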
- tilelang.language.annotate_layout(layout_map)¶
Annotate the layout of the buffer
- Parameters:
layout_map (Dict) – a dictionary of buffer to layout
- Returns:
a block attribute
- Return type:
block_attr
Example
    @T.prim_func
    def main(
        A: T.Tensor((M, N), dtype),
        B: T.Tensor((M, N), dtype),
    ):
        # Initialize Kernel Context
        with T.Kernel(T.ceildiv(N, block_N), T.ceildiv(M, block_M), threads=128) as (bx, by):
            A_shared = T.alloc_shared((block_M, block_N), dtype)
            T.annotate_layout({A_shared: layout})
            for i, j in T.Parallel(block_M, block_N):
                A_shared[i, j] = A[by * block_M + i, bx * block_N + j]
            for i, j in T.Parallel(block_M, block_N):
                B[by * block_M + i, bx * block_N + j] = A_shared[i, j]

    return main
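In this example, `layout` is a caller-supplied layout object; per the parameter description, the annotation maps the shared buffer `A_shared` to that layout so the compiler arranges its storage accordingly.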
- tilelang.language.annotate_padding(padding_map)¶
Annotate the padding of the buffer
- Parameters:
padding_map (dict) – a dictionary of buffer to padding value
- Returns:
a block attribute
- Return type:
block_attr
Example
    @T.prim_func
    def main(
        A: T.Tensor((M, N), dtype),
        B: T.Tensor((M, N), dtype),
    ):
        # Initialize Kernel Context
        with T.Kernel(T.ceildiv(N, block_N), T.ceildiv(M, block_M), threads=128) as (bx, by):
            A_shared = T.alloc_shared((block_M, block_N), dtype)
            T.annotate_padding({A_shared: pad_value})
            for i, j in T.Parallel(block_M, block_N):
                A_shared[i, j] = A[by * block_M + i - 10, bx * block_N + j]
            for i, j in T.Parallel(block_M, block_N):
                B[by * block_M + i, bx * block_N + j] = A_shared[i, j]

    return main
- tilelang.language.annotate_l2_hit_ratio(l2_hit_ratio_map)¶
Annotate the L2 hit ratio of the buffer. For a detailed explanation, refer to: https://docs.nvidia.com/cuda/cuda-c-programming-guide/#l2-policy-for-persisting-accesses
- Parameters:
l2_hit_ratio_map (dict) – a dictionary of buffer to L2 hit ratio value
Example
    # 0.5 is the hit ratio
    T.annotate_l2_hit_ratio({A: 0.5})
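A slightly fuller sketch placing the annotation inside a kernel (hypothetical shapes; per the linked CUDA documentation, the hit ratio is a fraction in [0, 1] of the buffer's accesses to treat as persisting in L2):

    @T.prim_func
    def main(A: T.Tensor((M, N), dtype)):
        with T.Kernel(T.ceildiv(N, block_N), T.ceildiv(M, block_M), threads=128) as (bx, by):
            # Request that ~50% of accesses to the global buffer A persist in L2
            T.annotate_l2_hit_ratio({A: 0.5})
            ...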
- tilelang.language.import_source(source=None)¶
- Parameters:
source (Optional[str])
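A hedged sketch (assumes `import_source` injects an external source string into the generated module so device code can reference it; the `my_scale` function below is hypothetical):

    # Hypothetical external CUDA snippet made visible to the generated kernel
    T.import_source('extern "C" __device__ float my_scale(float x) { return 2.0f * x; }')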