tilelang.carver.roller.policy.default ===================================== .. py:module:: tilelang.carver.roller.policy.default .. autoapi-nested-parse:: Policy for cuda core schedule Classes ------- .. autoapisummary:: tilelang.carver.roller.policy.default.DefaultPolicy Module Contents --------------- .. py:class:: DefaultPolicy(arch, tags = None) Default Policy for fastdlight, a heuristic plan that tries to minimize memory traffic and maximize parallelism.for BitBLAS Schedule. .. py:attribute:: func :type: tvm.tir.PrimFunc .. py:attribute:: nodes :type: List[tilelang.carver.roller.node.PrimFuncNode] :value: [] .. py:attribute:: arch :type: tilelang.carver.arch.TileDevice .. py:attribute:: tags :type: Dict .. py:attribute:: rasterization .. py:method:: from_prim_func(func, arch, tags = None, name = 'PrimFuncNode') :classmethod: .. py:method:: from_output_nodes(nodes, arch, tags = None) :classmethod: .. py:method:: emit_config(topk) .. py:method:: dfs_smem_tile(init_tile, rstep_map) .. py:method:: get_base_tile() Gets the minimum tile configuration that satisfies no redundancy in computation. :returns: The base tile configuration, which is a list of 1s equal in length to the space dimensions of the primary function node. :rtype: List[int] .. py:method:: compute_workload_per_item(output_tile) .. py:method:: score_block_size(n) Scores a block size based on its efficiency and fit relative to the architecture's warp size and SM partition. :param n: The block size to score. :type n: int :returns: A tuple containing two scores representing efficiency and fit, respectively. :rtype: Tuple[float, float] .. py:method:: get_block_size(n) Determines the optimal block size for a given constraint, based on scoring various factors. :param n: The constraint size. :type n: int :returns: The optimal block size chosen from the factors of n, constrained by a maximum of 1024 and scored by the `score_block_size` method. :rtype: int .. py:method:: get_node_reduce_step_candidates(node) Calculates reduction step candidates for each reduction axis in a PrimFuncNode. General idea : use factor first, since it does not require extra boundary check. for large prime number, which is rare case, use power of 2. :param node: The node for which to calculate reduction step candidates. It contains reduction axes (raxis) with their domains (dom.extent). :type node: PrimFuncNode :returns: A dictionary mapping axis variable names to lists of step candidates. For each axis in the node, this function calculates possible step sizes. For axes with a large prime domain, it uses powers of 2 as step candidates; for others, it uses all factors of the domain. :rtype: Dict[str, List[int]] .. py:method:: infer_node_smem_usage(td, node) Infers the shared memory usage of a node given a TileDict configuration. :param td: The TileDict object containing the tile configuration. :type td: TileDict :param node: The node for which to infer the shared memory usage. :type node: PrimFuncNode :returns: The estimated amount of shared memory used by the node. :rtype: int .. py:method:: compute_node_stride_map(node, td) Computes the stride map for a given node based on the TileDict configuration. :param node: The node for which to compute the stride map. :type node: PrimFuncNode :param td: The TileDict object containing the tile configuration. :type td: TileDict :returns: A tuple of dictionaries containing the output strides and tensor strides. :rtype: Tuple[Dict, Dict] .. py:method:: compute_tile_dict(output_tile, rstep_map) Computes and returns a TileDict object for a given output tile configuration and reduction step map. :param output_tile: The output tile configuration. :type output_tile: List[int] :param rstep_map: The reduction step map. :type rstep_map: Dict :returns: A TileDict object containing the computed tile configuration, memory traffic, shared memory cost, grid size, and other related parameters. :rtype: TileDict .. py:method:: check_tile_shape_isvalid(td) Checks if the tile shapes in the TileDict are valid for the nodes in this context. Parameters: - td (TileDict): The TileDict object containing tile shapes and other configurations. Returns: - bool: True if all tile shapes are valid, False otherwise. .. py:method:: recommend_block_size(td) Recommends optimal block sizes based on the TileDict configuration. :param td: The TileDict object containing the tile configuration. :type td: TileDict :returns: A list of recommended block sizes sorted based on their score. :rtype: List[int] .. py:method:: assign_block_size(td, topk=1) Assigns block sizes to the TileDict based on the recommended block sizes. :param td: The TileDict object to assign block sizes to. :type td: TileDict :param topk: The number of top block sizes to consider. :type topk: int, optional :Yields: *Dict* -- The block size assignment for the primary function node. .. py:method:: plan_rasterization(td) Plans the rasterization for the given TileDict. This function is not implemented yet. :param td: The TileDict object to plan rasterization for. :type td: TileDict :raises RasterRationPlan: This function is not implemented yet.