tilelang.transform¶
Python wrappers for TileLang transformation passes.
Functions¶
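get_pass_context() | Get the current pass context
ClusterPlanning() | ClusterPlanning
PipelinePlanning() | Infer the fragment/shared memory layout
LayoutInference() | LayoutInference
LowerTileOp() | LowerTileOp
InjectSoftwarePipeline() | InjectSoftwarePipeline
LegalizeNegativeIndex() | Legalize negative indices in buffer loads.
InjectAssumes() | Inject assumes for natural shape boundary conditions, and convert assumes in Evaluate(Call(...)) form (the TVM builtin assume call) to AttrNode form.
VerifyParallelLoop() | VerifyParallelLoop
ThreadSync(storage_scope) | Insert sync between parallel read/write of shared buffers.
IfStmtBinding() | IfStmtBinding
MergeIfStmt() | MergeIfStmt
LoopUnswitching() | LoopUnswitching: Hoist loop-invariant if statements out of loops.
LegalizeVectorizedLoop() | LegalizeLoopVectorize
LegalizeSafeMemoryAccess() | LegalizeSafeMemoryAccess
LowerAccessPtr() | Lower TileLang frontend tl.access_ptr to tir.builtin.tvm_access_ptr.
MakePackedAPI() | MakePackedAPI
MaterializeKernelLaunch(lower_thread_binding=True) | Materialize the target-neutral kernel launch nest (thread_binding For loops emitted by T.Kernel) into a backend-specific form.
AnnotateDeviceRegions() | AnnotateDeviceRegions
SplitHostDevice() | Split host/device functions even for empty kernels.
AnnotateReadOnlyParams() | Annotate read-only handle parameters for PrimFuncs.
VectorizeLoop(enable_vectorize=True) | VectorizeLoop
ConfigIndexBitwidth() | Config index bitwidth.
FlattenBuffer() | FlattenBuffer
MergeSharedMemoryAllocations(…) | MergeSharedMemoryAllocations
PlanAndUpdateBufferAllocationLocation() | Plan and update buffer allocation locations within PrimFuncs.
HoistGlobalBufferAllocations() | Hoist global buffer allocations to the top of the block (host side).
HoistNonRestrictParams() |
StorageRewrite() | StorageRewrite
LowerOpaqueBlock() | LowerOpaqueBlock
LowerThreadAllreduce() | LowerThreadAllreduce
LowerIntrin() | LowerIntrin
LowerDeviceKernelLaunch() | Create and return a transform pass that lowers device kernel launch constructs to target-specific IR.
LayoutReducer() | Return a TVM transform pass that performs layout reduction/normalization.
UnrollLoop() | Unroll loops as in Halide pipeline.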
Package Contents¶
- tilelang.transform.get_pass_context()¶
Get the current pass context
- tilelang.transform.ClusterPlanning()¶
ClusterPlanning
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
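Each function in this module returns a tvm.transform.Pass, and TVM applies a pass by calling it on an IRModule. The sketch below shows only that calling pattern, using plain Python stand-ins so that it runs without TVM installed. `apply_passes`, `drop_noops`, and `uppercase` are hypothetical helpers for illustration, not part of tilelang.

```python
from functools import reduce
from typing import Callable, Iterable, List

# Stand-in for an IRModule: just a list of statement strings.
Module = List[str]
# Stand-in for tvm.transform.Pass: any callable mapping module -> module.
Pass = Callable[[Module], Module]


def apply_passes(mod: Module, passes: Iterable[Pass]) -> Module:
    """Apply passes left to right, i.e. `mod = p(mod)` for each pass p."""
    return reduce(lambda m, p: p(m), passes, mod)


# Two toy passes (hypothetical, for illustration only).
def drop_noops(mod: Module) -> Module:
    return [s for s in mod if s != "noop"]


def uppercase(mod: Module) -> Module:
    return [s.upper() for s in mod]


result = apply_passes(["load", "noop", "store"], [drop_noops, uppercase])
print(result)  # ['LOAD', 'STORE']
```

With TVM and tilelang available, the same shape would apply to real passes, for example `mod = tilelang.transform.LayoutInference()(mod)`, assuming `mod` is an IRModule prepared as that pass expects.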
- tilelang.transform.PipelinePlanning()¶
Infer the fragment/shared memory layout.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LayoutInference()¶
LayoutInference
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LowerTileOp()¶
LowerTileOp
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.InjectSoftwarePipeline()¶
InjectSoftwarePipeline
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LegalizeNegativeIndex()¶
Legalize negative indices in buffer loads.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
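As an illustration of what legalizing a negative index can mean, the sketch below assumes Python-style wraparound: a constant index i < 0 on an axis of extent n is rewritten to i + n. That semantics is an assumption, since the docstring above does not spell it out, and `legalize_index` is a hypothetical helper rather than a tilelang function.

```python
def legalize_index(index: int, extent: int) -> int:
    """Map a possibly negative constant index into the range [0, extent)."""
    if not -extent <= index < extent:
        raise IndexError(f"index {index} out of range for extent {extent}")
    # Assumed rewrite: negative indices count from the end of the axis.
    return index + extent if index < 0 else index


buf = [10, 20, 30, 40]
# The legalized index addresses the same element as Python's own negative index.
assert buf[legalize_index(-1, len(buf))] == buf[-1]
print([legalize_index(i, 4) for i in (-4, -1, 0, 3)])  # [0, 3, 0, 3]
```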
- tilelang.transform.InjectAssumes()¶
Inject assumes for natural shape boundary conditions, and convert assumes in Evaluate(Call(…)) form (the TVM builtin assume call) to AttrNode form.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.VerifyParallelLoop()¶
VerifyParallelLoop
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.ThreadSync(storage_scope)¶
Insert sync between parallel read/write of shared buffers.
- Parameters:
storage_scope (str) – The target storage scope.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
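To make the idea of inserting a sync between parallel reads and writes concrete, here is a deliberately simplified model in plain Python. It walks a linear list of accesses to shared buffers and emits a barrier before any read-after-write or write-after-read on the same buffer. The real pass works on TIR and handles control flow and thread ranges; `insert_syncs` and the tuple encoding are hypothetical.

```python
from typing import List, Tuple

Access = Tuple[str, str]  # (kind, buffer) where kind is "read" or "write"


def insert_syncs(accesses: List[Access], scope: str = "shared") -> List[Access]:
    """Insert ("sync", scope) before each access that conflicts with an
    earlier access made since the previous barrier (toy model)."""
    out: List[Access] = []
    pending_reads, pending_writes = set(), set()
    for kind, buf in accesses:
        hazard = (kind == "read" and buf in pending_writes) or (
            kind == "write" and buf in pending_reads
        )
        if hazard:
            out.append(("sync", scope))
            # A barrier orders everything issued before it.
            pending_reads.clear()
            pending_writes.clear()
        (pending_reads if kind == "read" else pending_writes).add(buf)
        out.append((kind, buf))
    return out


trace = [("write", "A"), ("read", "A"), ("write", "A")]
for step in insert_syncs(trace):
    print(step)
```

In this trace a barrier lands before the read (read-after-write) and again before the second write (write-after-read).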
- tilelang.transform.IfStmtBinding()¶
IfStmtBinding
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.MergeIfStmt()¶
MergeIfStmt
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
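The docstring gives only the pass name. Assuming that it merges consecutive if statements that test the same condition, the effect can be sketched on a toy statement list. The tuple-based IR and `merge_if_stmts` are hypothetical and unrelated to the actual TIR implementation.

```python
from typing import List, Tuple

# Toy IR: ("if", condition, [body statements]) or ("stmt", text).
Stmt = Tuple


def merge_if_stmts(stmts: List[Stmt]) -> List[Stmt]:
    """Merge adjacent ("if", cond, body) nodes that share the same condition.

    This is only safe when the first body does not change the condition,
    which the toy model simply assumes.
    """
    merged: List[Stmt] = []
    for stmt in stmts:
        prev = merged[-1] if merged else None
        if prev and prev[0] == "if" and stmt[0] == "if" and prev[1] == stmt[1]:
            merged[-1] = ("if", prev[1], prev[2] + stmt[2])
        else:
            merged.append(stmt)
    return merged


program = [
    ("if", "i < n", ["A[i] = 0"]),
    ("if", "i < n", ["B[i] = 1"]),
    ("stmt", "sync()"),
    ("if", "i < n", ["C[i] = 2"]),
]
for stmt in merge_if_stmts(program):
    print(stmt)
```

The first two branches collapse into one, while the third stays separate because another statement sits between them.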
- tilelang.transform.LoopUnswitching()¶
LoopUnswitching: Hoist loop-invariant if statements out of loops.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
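Hoisting a loop-invariant if out of a loop can be shown in ordinary Python. The two functions below are hypothetical before/after forms of the same computation: the branch on `use_double` does not depend on the loop variable, so it can be tested once outside the loop, at the cost of duplicating the loop body.

```python
from typing import List


def scale_original(xs: List[int], use_double: bool) -> List[int]:
    out = []
    for x in xs:
        if use_double:  # loop-invariant: does not depend on x
            out.append(2 * x)
        else:
            out.append(x)
    return out


def scale_unswitched(xs: List[int], use_double: bool) -> List[int]:
    out = []
    if use_double:  # condition hoisted; each branch gets its own loop
        for x in xs:
            out.append(2 * x)
    else:
        for x in xs:
            out.append(x)
    return out


data = [1, 2, 3]
for flag in (True, False):
    assert scale_original(data, flag) == scale_unswitched(data, flag)
print(scale_unswitched(data, True))  # [2, 4, 6]
```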
- tilelang.transform.LegalizeVectorizedLoop()¶
LegalizeLoopVectorize
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LegalizeSafeMemoryAccess()¶
LegalizeSafeMemoryAccess
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LowerAccessPtr()¶
Lower TileLang frontend tl.access_ptr to tir.builtin.tvm_access_ptr.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.MakePackedAPI()¶
MakePackedAPI
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.MaterializeKernelLaunch(lower_thread_binding=True)¶
Materialize the target-neutral kernel launch nest (thread_binding For loops emitted by T.Kernel) into a backend-specific form. Each backend pipeline chooses the mode for itself through lower_thread_binding.
- Parameters:
lower_thread_binding (bool) – If True (SIMT backends, e.g. CUDA/ROCm/Metal), lower the blockIdx.*/threadIdx.* loops into thread_extent AttrStmts. If False (backends without SIMT, e.g. CPU), lower blockIdx.* loops into plain serial For loops and ignore threadIdx.* loops (their extents are dropped; the loop vars are pinned to 0).
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
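The lower_thread_binding=False mode described above can be mimicked in plain Python: the blockIdx.* loops become ordinary serial loops, and the thread index is pinned to 0 instead of being iterated. `launch_serial` is a hypothetical helper and the kernel is an ordinary Python callable; nothing here touches TIR.

```python
from typing import Callable, List, Tuple


def launch_serial(grid_x: int, grid_y: int,
                  kernel: Callable[[int, int, int], None]) -> None:
    """Run a kernel body the way a non-SIMT backend would see it."""
    thread_idx = 0  # threadIdx.* extent is dropped; the loop var is pinned to 0
    for by in range(grid_y):      # blockIdx.y as a plain serial loop
        for bx in range(grid_x):  # blockIdx.x as a plain serial loop
            kernel(bx, by, thread_idx)


visited: List[Tuple[int, int, int]] = []
launch_serial(2, 3, lambda bx, by, tx: visited.append((bx, by, tx)))
print(len(visited))  # 6
print(visited[:3])   # [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
```

With lower_thread_binding=True the loops are not serialized this way; they are turned into thread_extent annotations and the hardware supplies the indices.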
- tilelang.transform.AnnotateDeviceRegions()¶
AnnotateDeviceRegions
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.SplitHostDevice()¶
Split host/device functions even for empty kernels.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.AnnotateReadOnlyParams()¶
Annotate read-only handle parameters for PrimFuncs.
Adds the attribute tl.readonly_param_indices, which lists the indices of parameters that are never written. This lets CUDA codegen emit const qualifiers, which in turn unlocks read-only cache loads.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
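The analysis behind tl.readonly_param_indices can be pictured as a simple set computation: collect every parameter that is written anywhere in the function, then report the positions of the rest. The sketch below does this over a flat access list. `readonly_param_indices` and the access encoding are hypothetical and much simpler than a real TIR visitor.

```python
from typing import List, Tuple


def readonly_param_indices(params: List[str],
                           accesses: List[Tuple[str, str]]) -> List[int]:
    """Return the indices of parameters that never appear in a write access."""
    written = {name for kind, name in accesses if kind == "write"}
    return [i for i, name in enumerate(params) if name not in written]


params = ["A", "B", "C"]
accesses = [("read", "A"), ("read", "B"), ("write", "C"), ("read", "C")]
print(readonly_param_indices(params, accesses))  # [0, 1]
```

Here A and B are only read, so they are the candidates for a const qualifier; C is written and is excluded.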
- tilelang.transform.VectorizeLoop(enable_vectorize=True)¶
VectorizeLoop
- Parameters:
enable_vectorize (bool)
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.ConfigIndexBitwidth()¶
Config index bitwidth.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.FlattenBuffer()¶
FlattenBuffer
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.MergeSharedMemoryAllocations(…)¶
MergeSharedMemoryAllocations
- tilelang.transform.PlanAndUpdateBufferAllocationLocation()¶
Plan and update buffer allocation locations within PrimFuncs.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.HoistGlobalBufferAllocations()¶
Hoist global buffer allocations to the top of the block (host side).
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.HoistNonRestrictParams()¶
- tilelang.transform.StorageRewrite()¶
StorageRewrite
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LowerOpaqueBlock()¶
LowerOpaqueBlock
- tilelang.transform.LowerThreadAllreduce()¶
LowerThreadAllreduce
- tilelang.transform.LowerIntrin()¶
LowerIntrin
- tilelang.transform.LowerDeviceKernelLaunch()¶
Create and return a transform pass that lowers device kernel launch constructs to target-specific IR.
This pass transforms high-level device kernel launch and related intrinsics into lower-level IR suitable for backend code generation and device-side lowering.
- Returns:
The transform pass that performs device kernel launch lowering.
- Return type:
tvm.transform.Pass
- tilelang.transform.LayoutReducer()¶
Return a TVM transform pass that performs layout reduction/normalization.
This wrapper delegates to the underlying FFI implementation and returns a pass object suitable for use in a PassContext or pass pipeline. The pass is intended to simplify or reduce tensor/layout-related representations during relay/tile transformations.
- Returns:
The transform pass object produced by the FFI backend.
- tilelang.transform.UnrollLoop()¶
Unroll loops as in Halide pipeline.
This pass unrolls loops based on configuration options including:
- auto_max_step: Threshold of number of steps to be automatically unrolled
- auto_max_depth: Maximum nested level of loops that can be automatically unrolled
- auto_max_extent: Maximum extent of loop that will be unrolled
- explicit_unroll: Whether to explicitly unroll instead of setting a pragma
- unroll_local_access: Whether to always unroll local access
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass