tilelang.transform ================== .. py:module:: tilelang.transform .. autoapi-nested-parse:: Wrapping transformations. Submodules ---------- .. toctree:: :maxdepth: 1 /autoapi/tilelang/transform/pass_config/index /autoapi/tilelang/transform/simplify/index Functions --------- .. autoapisummary:: tilelang.transform.get_pass_context tilelang.transform.ClusterPlanning tilelang.transform.PipelinePlanning tilelang.transform.LayoutInference tilelang.transform.LowerTileOp tilelang.transform.InjectSoftwarePipeline tilelang.transform.FrontendLegalize tilelang.transform.LowerHopperIntrin tilelang.transform.WarpSpecializedPipeline tilelang.transform.RewriteWgmmaSync tilelang.transform.ThreadSync tilelang.transform.ThreadPartialSync tilelang.transform.IfStmtBinding tilelang.transform.MergeIfStmt tilelang.transform.MultiVersionBuffer tilelang.transform.WarpSpecialized tilelang.transform.InjectTmaBarrier tilelang.transform.InjectFenceProxy tilelang.transform.LegalizeVectorizedLoop tilelang.transform.LegalizeSafeMemoryAccess tilelang.transform.MakePackedAPI tilelang.transform.AnnotateDeviceRegions tilelang.transform.VectorizeLoop tilelang.transform.InjectPTXAsyncCopy tilelang.transform.LowerDeviceStorageAccessInfo tilelang.transform.LoopVectorizeDynamic tilelang.transform.ConfigIndexBitwidth tilelang.transform.FlattenBuffer tilelang.transform.EliminateStorageSyncForMBarrier tilelang.transform.MergeSharedMemoryAllocations tilelang.transform.LowerL2Persistent tilelang.transform.PersistThreadblock tilelang.transform.AlignDynamicSharedMemoryAllocations tilelang.transform.LowerSharedBarrier tilelang.transform.StorageRewrite tilelang.transform.LowerOpaqueBlock tilelang.transform.LowerThreadAllreduce tilelang.transform.LowerDeviceKernelLaunch Package Contents ---------------- .. py:function:: get_pass_context() Get the current pass context .. py:function:: ClusterPlanning() ClusterPlanning :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: PipelinePlanning() infer the fragment/shared memory layout :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: LayoutInference() LayoutInference :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: LowerTileOp() LowerTileOp :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: InjectSoftwarePipeline() InjectSoftwarePipeline :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: FrontendLegalize() FrontendLegalize :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: LowerHopperIntrin() LowerHopperIntrin :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: WarpSpecializedPipeline() WarpSpecializedPipeline :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: RewriteWgmmaSync() RewriteWgmmaSync :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: ThreadSync(storage_scope) Insert sync between parallel read/write of shared buffers. :param storage_scope: The target storage scope. :type storage_scope: str :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: ThreadPartialSync(storage_scope) Insert partial sync. :param storage_scope: The target storage scope. :type storage_scope: str :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: IfStmtBinding() IfStmtBinding :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: MergeIfStmt() MergeIfStmt :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: MultiVersionBuffer() WarpSpecializedPipeline :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: WarpSpecialized() WarpSpecializedPipeline :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: InjectTmaBarrier() InjectTmaBarrier :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: InjectFenceProxy() InjectFenceProxy :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: LegalizeVectorizedLoop() LegalizeLoopVectorize :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: LegalizeSafeMemoryAccess() LegalizeLoopVectorize :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: MakePackedAPI() MakePackedAPI :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: AnnotateDeviceRegions() AnnotateDeviceRegions :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: VectorizeLoop(enable_vectorize = True) VectorizeLoop :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: InjectPTXAsyncCopy() Rewrite global to shared memory copy on CUDA with asynchronous copy. :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: LowerDeviceStorageAccessInfo() Lower attached storage access information on device. :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. note:: Run this pass after all storage access analysis finish. .. py:function:: LoopVectorizeDynamic() Try to vectorize loop with dynamic shape. :returns: * **fpass** (*tvm.transform.Pass*) -- The result pass * *----* .. py:function:: ConfigIndexBitwidth() Config index bitwidth. :returns: * **fpass** (*tvm.transform.Pass*) -- The result pass * *----* .. py:function:: FlattenBuffer() FlattenBuffer :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: EliminateStorageSyncForMBarrier() EliminateStorageSyncForMBarrier .. py:function:: MergeSharedMemoryAllocations(enable_aggressive_merge = False, align_bytes = 16) MergeSharedMemoryAllocations :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: LowerL2Persistent() LowerL2Persistent .. py:function:: PersistThreadblock() PersistThreadblock .. py:function:: AlignDynamicSharedMemoryAllocations(align_bytes = 16) AlignDynamicSharedMemoryAllocations :param align_bytes: The alignment bytes. :type align_bytes: int .. py:function:: LowerSharedBarrier() LowerSharedBarrier .. py:function:: StorageRewrite() StorageRewrite :returns: **fpass** -- The result pass :rtype: tvm.transform.Pass .. py:function:: LowerOpaqueBlock() LowerOpaqueBlock .. py:function:: LowerThreadAllreduce() LowerThreadAllreduce .. py:function:: LowerDeviceKernelLaunch() LowerDeviceKernelLaunch