tilelang.profiler
=================

.. py:module:: tilelang.profiler

.. autoapi-nested-parse::

   The profiler and utilities for converting to torch.


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/tilelang/profiler/bench/index


Classes
-------

.. autoapisummary::

   tilelang.profiler.Profiler


Package Contents
----------------

.. py:class:: Profiler

   A profiler class for benchmarking and validating kernel implementations.

   .. attribute:: params

      List of kernel parameters defining the input/output specifications

   .. attribute:: result_idx

      Indices indicating which parameters are output tensors

   .. attribute:: supply_type

      Type of tensor supply to use (e.g., random, zeros, etc.)

   .. attribute:: adapter

      Optional kernel adapter for interfacing with different backends


   .. py:attribute:: params
      :type:  List[tilelang.engine.param.KernelParam]


   .. py:attribute:: result_idx
      :type:  List[int]


   .. py:attribute:: supply_type
      :type:  tilelang.utils.tensor.TensorSupplyType


   .. py:attribute:: adapter
      :type:  Optional[tilelang.jit.adapter.BaseKernelAdapter]
      :value: None


   .. py:method:: __post_init__()

      Initialize tensor supply after dataclass initialization.


   .. py:method:: with_default_adapter(adapter)


   .. py:method:: assert_allclose(reference_program, input_tensors = None, atol = 0.01, rtol = 0.01, max_mismatched_ratio=0.01)

      Validates kernel output against a reference implementation.

      :param reference_program: Reference implementation to compare against
      :param input_tensors: Optional pre-generated input tensors
      :param atol: Absolute tolerance for comparison
      :param rtol: Relative tolerance for comparison
      :param max_mismatched_ratio: Maximum allowed ratio of mismatched elements


   .. py:method:: manual_assert_close(reference_program, input_tensors = None, manual_check_prog = None)

      Validates kernel output against a reference implementation using a user-supplied check.

      :param reference_program: Reference implementation to compare against
      :param input_tensors: Optional pre-generated input tensors
      :param manual_check_prog: User-supplied callable that performs the comparison in place of the built-in tolerance check


   .. py:method:: assert_consistent(repeat=10)

      Checks for kernel consistency across multiple runs.

      :param repeat: Number of times to repeat the consistency check


   .. py:method:: run_once(func = None)


   .. py:method:: determine_profiler(func = None)

      Determines which profiler backend to use based on function type.

      :param func: Function to be profiled

      :returns: The determined profiler type ("torch" or "tvm")
      :rtype: str


   .. py:method:: do_bench(func = None, warmup = 25, rep = 100, n_warmup = 1, n_repeat = 1, input_tensors = None)

      Benchmarks the execution time of a given function.

      :param func: Function to benchmark (uses adapter if None)
      :param warmup: Warmup time in milliseconds
      :param rep: Number of repetitions for timing
      :param n_warmup: Number of warmup iterations
      :param n_repeat: Number of timing iterations
      :param input_tensors: Optional pre-generated input tensors

      :returns: Average execution time in milliseconds
      :rtype: float


   .. py:property:: func


   .. py:method:: __call__(*args, **kwds)
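
The way ``atol``, ``rtol`` and ``max_mismatched_ratio`` interact in ``assert_allclose`` may be easier to see in a small standalone sketch. This page does not state the exact comparison rule, so the code below assumes the common elementwise test ``|actual - expected| <= atol + rtol * |expected|`` and treats ``max_mismatched_ratio`` as the fraction of elements allowed to fail it. The helper names ``mismatched_ratio`` and ``check_close`` are hypothetical and are not part of ``tilelang``; the sketch uses plain Python lists rather than tensors.

```python
from typing import Sequence


def mismatched_ratio(actual: Sequence[float], expected: Sequence[float],
                     atol: float = 0.01, rtol: float = 0.01) -> float:
    """Return the fraction of elements that fail the assumed tolerance test.

    Assumption: an element mismatches when
    |actual - expected| > atol + rtol * |expected|.
    """
    if len(actual) != len(expected):
        raise ValueError("inputs must have the same length")
    if not actual:
        return 0.0
    mismatched = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    return mismatched / len(actual)


def check_close(actual: Sequence[float], expected: Sequence[float],
                atol: float = 0.01, rtol: float = 0.01,
                max_mismatched_ratio: float = 0.01) -> bool:
    """Pass when the mismatched fraction stays within the allowed ratio."""
    ratio = mismatched_ratio(actual, expected, atol, rtol)
    return ratio <= max_mismatched_ratio


# Hypothetical data: 200 reference values with a single corrupted output.
expected = [1.0] * 200
actual = list(expected)
actual[0] = 1.5  # error 0.5 exceeds atol + rtol * |expected| = 0.02

ratio = mismatched_ratio(actual, expected)
passes = check_close(actual, expected)                         # default ratio 0.01
strict = check_close(actual, expected, max_mismatched_ratio=0.0)
```

Under these assumptions, one bad element out of 200 stays within the default 1% budget, while a budget of zero rejects the same output. A nonzero budget of this kind is typically useful for low-precision kernels, where a few outliers are expected.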