tilelang.transform¶
Wrapping transformations.
Functions¶
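get_pass_context() | Get the current pass context
ClusterPlanning() | ClusterPlanning
PipelinePlanning() | PipelinePlanning
LayoutInference() | Infer the fragment/shared memory layout
LowerTileOp() | LowerTileOp
InjectSoftwarePipeline() | InjectSoftwarePipeline
FrontendLegalize() | FrontendLegalize
LowerHopperIntrin() | LowerHopperIntrin
WarpSpecializedPipeline() | WarpSpecializedPipeline
RewriteWgmmaSync() | RewriteWgmmaSync
ThreadSync(storage_scope) | Insert sync between parallel read/write of shared buffers.
ThreadPartialSync(storage_scope) | Insert partial sync.
IfStmtBinding() | IfStmtBinding
MergeIfStmt() | MergeIfStmt
MultiVersionBuffer() | MultiVersionBuffer
WarpSpecialized() | WarpSpecialized
InjectTmaBarrier() | InjectTmaBarrier
InjectFenceProxy() | InjectFenceProxy
LegalizeVectorizedLoop() | LegalizeVectorizedLoop
LegalizeSafeMemoryAccess() | LegalizeSafeMemoryAccess
MakePackedAPI() | MakePackedAPI
AnnotateDeviceRegions() | AnnotateDeviceRegions
VectorizeLoop(enable_vectorize=True) | VectorizeLoop
InjectPTXAsyncCopy() | Rewrite global to shared memory copy on CUDA with asynchronous copy.
LowerDeviceStorageAccessInfo() | Lower attached storage access information on device.
LoopVectorizeDynamic() | Try to vectorize loops with dynamic shapes.
ConfigIndexBitwidth() | Configure the index bitwidth.
FlattenBuffer() | FlattenBuffer
EliminateStorageSyncForMBarrier() | EliminateStorageSyncForMBarrier
MergeSharedMemoryAllocations(enable_aggressive_merge, align_bytes) | MergeSharedMemoryAllocations
LowerL2Persistent() | LowerL2Persistent
PersistThreadblock() | PersistThreadblock
AlignDynamicSharedMemoryAllocations(align_bytes) | AlignDynamicSharedMemoryAllocations
LowerSharedBarrier() | LowerSharedBarrier
StorageRewrite() | StorageRewrite
LowerOpaqueBlock() | LowerOpaqueBlock
LowerThreadAllreduce() | LowerThreadAllreduce
LowerDeviceKernelLaunch() | LowerDeviceKernelLaunch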
Package Contents¶
- tilelang.transform.get_pass_context()¶
Get the current pass context
- tilelang.transform.ClusterPlanning()¶
ClusterPlanning
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
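Most functions on this page follow the same pattern as ClusterPlanning: calling the function builds a tvm.transform.Pass, and calling that pass on a module returns the transformed module. The sketch below illustrates the two-step call with a hypothetical helper, `apply_pass`, which is not part of TileLang. So that it runs without TVM installed, a stub object stands in for `tilelang.transform` and a plain list stands in for the IRModule.

```python
from types import SimpleNamespace


def apply_pass(mod, transform, name, *args, **kwargs):
    """Build the pass `transform.<name>(*args, **kwargs)` and run it on `mod`."""
    make_pass = getattr(transform, name)  # e.g. tilelang.transform.ClusterPlanning
    fpass = make_pass(*args, **kwargs)    # documented to return a tvm.transform.Pass
    return fpass(mod)                     # calling a pass on a module returns the new module


# Stand-in for tilelang.transform (an assumption made only for this sketch).
# The stub "pass" records that it ran; a real pass rewrites the IRModule.
stub = SimpleNamespace(ClusterPlanning=lambda: (lambda m: m + ["ClusterPlanning"]))

print(apply_pass([], stub, "ClusterPlanning"))  # ['ClusterPlanning']
```

With TileLang installed, the same helper would presumably be called as `apply_pass(mod, tilelang.transform, "ClusterPlanning")`, where `mod` is a `tvm.IRModule`.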
- tilelang.transform.PipelinePlanning()¶
PipelinePlanning
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LayoutInference()¶
Infer the fragment/shared memory layout.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LowerTileOp()¶
LowerTileOp
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.InjectSoftwarePipeline()¶
InjectSoftwarePipeline
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.FrontendLegalize()¶
FrontendLegalize
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LowerHopperIntrin()¶
LowerHopperIntrin
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.WarpSpecializedPipeline()¶
WarpSpecializedPipeline
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.RewriteWgmmaSync()¶
RewriteWgmmaSync
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.ThreadSync(storage_scope)¶
Insert sync between parallel read/write of shared buffers.
- Parameters:
storage_scope (str) – The target storage scope.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
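Because ThreadSync takes a single storage scope, synchronizing several scopes means building one pass per scope. The sketch below shows that loop with a hypothetical helper, `insert_thread_syncs`. The scope strings `"shared"` and `"shared.dyn"` are TVM storage scope names chosen as an assumption for illustration, and a stub stands in for `tilelang.transform` so the block runs without TVM.

```python
from types import SimpleNamespace


def insert_thread_syncs(mod, transform, scopes=("shared", "shared.dyn")):
    """Run one ThreadSync pass per storage scope, in the given order."""
    for scope in scopes:
        fpass = transform.ThreadSync(scope)  # storage_scope is a str
        mod = fpass(mod)
    return mod


# Stand-in for tilelang.transform (assumption for this sketch only):
# each stub pass appends the scope it was built for.
stub = SimpleNamespace(ThreadSync=lambda scope: (lambda m: m + ["sync:" + scope]))

print(insert_thread_syncs([], stub))  # ['sync:shared', 'sync:shared.dyn']
```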
- tilelang.transform.ThreadPartialSync(storage_scope)¶
Insert partial sync.
- Parameters:
storage_scope (str) – The target storage scope.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.IfStmtBinding()¶
IfStmtBinding
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.MergeIfStmt()¶
MergeIfStmt
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.MultiVersionBuffer()¶
MultiVersionBuffer
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.WarpSpecialized()¶
WarpSpecialized
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.InjectTmaBarrier()¶
InjectTmaBarrier
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.InjectFenceProxy()¶
InjectFenceProxy
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LegalizeVectorizedLoop()¶
LegalizeVectorizedLoop
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LegalizeSafeMemoryAccess()¶
LegalizeSafeMemoryAccess
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.MakePackedAPI()¶
MakePackedAPI
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.AnnotateDeviceRegions()¶
AnnotateDeviceRegions
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.VectorizeLoop(enable_vectorize=True)¶
VectorizeLoop
- Parameters:
enable_vectorize (bool)
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
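VectorizeLoop is one of the few passes here that takes a keyword argument. The sketch below only shows how the documented `enable_vectorize` flag is forwarded. What the pass does when the flag is `False` is not stated on this page, so the stub (an assumed stand-in for `tilelang.transform`) merely records the value it received, and the helper `vectorize` is hypothetical.

```python
from types import SimpleNamespace


def vectorize(mod, transform, enabled=True):
    """Run VectorizeLoop on `mod`, forwarding the documented keyword argument."""
    fpass = transform.VectorizeLoop(enable_vectorize=enabled)
    return fpass(mod)


# Stand-in for tilelang.transform (assumption for this sketch only).
stub = SimpleNamespace(
    VectorizeLoop=lambda enable_vectorize=True: (
        lambda m: m + [("VectorizeLoop", enable_vectorize)]
    )
)

print(vectorize([], stub))                 # [('VectorizeLoop', True)]
print(vectorize([], stub, enabled=False))  # [('VectorizeLoop', False)]
```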
- tilelang.transform.InjectPTXAsyncCopy()¶
Rewrite global to shared memory copy on CUDA with asynchronous copy.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LowerDeviceStorageAccessInfo()¶
Lower attached storage access information on device.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
Note
Run this pass after all storage access analyses have finished.
- tilelang.transform.LoopVectorizeDynamic()¶
Try to vectorize loops with dynamic shapes.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.ConfigIndexBitwidth()¶
Configure the index bitwidth.
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.FlattenBuffer()¶
FlattenBuffer
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.EliminateStorageSyncForMBarrier()¶
EliminateStorageSyncForMBarrier
- tilelang.transform.MergeSharedMemoryAllocations(enable_aggressive_merge, align_bytes)¶
MergeSharedMemoryAllocations
- Parameters:
enable_aggressive_merge (bool)
align_bytes (int)
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LowerL2Persistent()¶
LowerL2Persistent
- tilelang.transform.PersistThreadblock()¶
PersistThreadblock
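- tilelang.transform.AlignDynamicSharedMemoryAllocations(align_bytes)¶
AlignDynamicSharedMemoryAllocations
- Parameters:
align_bytes (int) – The alignment bytes.
- tilelang.transform.LowerSharedBarrier()¶
LowerSharedBarrier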
- tilelang.transform.StorageRewrite()¶
StorageRewrite
- Returns:
fpass – The result pass
- Return type:
tvm.transform.Pass
- tilelang.transform.LowerOpaqueBlock()¶
LowerOpaqueBlock
- tilelang.transform.LowerThreadAllreduce()¶
LowerThreadAllreduce
- tilelang.transform.LowerDeviceKernelLaunch()¶
LowerDeviceKernelLaunch
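Since each function above returns a pass object, several passes can be chained by feeding the output module of one into the next. The sketch below shows such a chain with a hypothetical helper, `run_pipeline`. The pass order is purely illustrative and is not claimed to be TileLang's actual lowering order, and a stub replaces `tilelang.transform` so the block runs without TVM.

```python
from types import SimpleNamespace


def run_pipeline(mod, transform, steps):
    """Apply passes in order; each step is (name, args) with args a tuple."""
    for name, args in steps:
        fpass = getattr(transform, name)(*args)  # build the pass
        mod = fpass(mod)                         # run it on the current module
    return mod


def _stub_pass(name):
    # Mimics a pass constructor: returns a callable that logs its name and args.
    return lambda *args: (lambda m: m + [(name, args)])


# Stand-in for tilelang.transform (assumption for this sketch only).
names = ("FrontendLegalize", "LayoutInference", "ThreadSync")
stub = SimpleNamespace(**{n: _stub_pass(n) for n in names})

steps = [
    ("FrontendLegalize", ()),
    ("LayoutInference", ()),
    ("ThreadSync", ("shared",)),
]
print(run_pipeline([], stub, steps))
# [('FrontendLegalize', ()), ('LayoutInference', ()), ('ThreadSync', ('shared',))]
```

With TVM installed, `tvm.transform.Sequential` offers a built-in way to group pass objects into one composite pass, which may be preferable to a hand-written loop.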