tilelang.contrib.nvcc

Utilities for invoking the nvcc compiler installed on the system

Functions

compile_cuda(code[, target_format, arch, options, ...])

Compile CUDA code with the NVCC found in the environment.

find_cuda_path()

Utility function to find the CUDA path

get_cuda_version([cuda_path])

Utility function to get the CUDA version

tilelang_callback_cuda_compile(code, target)

Use nvcc to generate fatbin code for better optimization

find_libdevice_path(arch)

Utility function to find libdevice

callback_libdevice_path(arch)

get_target_compute_version([target])

Utility function to get compute capability of compilation target.

parse_compute_version(compute_version)

Parse a compute capability string into its major and minor version numbers

have_fp16(compute_version)

Whether fp16 support is provided in the specified compute capability

have_int8(compute_version)

Whether int8 support is provided in the specified compute capability

have_tensorcore([compute_version, target])

Whether TensorCore support is provided in the specified compute capability

have_cudagraph()

Whether CUDA Graph support is provided

have_bf16(compute_version)

Whether bf16 support is provided in the specified compute capability

have_fp8(compute_version)

Whether fp8 support is provided in the specified compute capability

have_tma(target)

Whether TMA support is provided by the specified compilation target

get_nvcc_compiler()

Get the path to the nvcc compiler

Module Contents

tilelang.contrib.nvcc.compile_cuda(code, target_format='ptx', arch=None, options=None, path_target=None, verbose=False)

Compile CUDA code with the NVCC found in the environment.

Parameters:
  • code (str) – The CUDA source code.

  • target_format (str) – The output format requested from nvcc. Defaults to 'ptx'.

  • arch (str) – The CUDA architecture.

  • options (str or list of str) – Additional compiler options.

  • path_target (str, optional) – Path of the output file.

Returns:

cubin – The compiled output as a bytearray

Return type:

bytearray
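
The parameters above map naturally onto an nvcc command line. The following is a minimal sketch of that mapping, not tilelang's implementation: build_nvcc_command is a hypothetical helper, and the way it combines the nvcc flags --ptx, --cubin, --fatbin, -arch and -o is an assumption about how such a wrapper might call the compiler.

```python
def build_nvcc_command(source_path, output_path, target_format="ptx",
                       arch=None, options=None):
    """Hypothetical helper: assemble an nvcc argument list (illustration only)."""
    if target_format not in ("ptx", "cubin", "fatbin"):
        raise ValueError(f"unsupported target_format: {target_format!r}")

    cmd = ["nvcc", f"--{target_format}"]
    if arch is not None:
        cmd.append(f"-arch={arch}")  # for example "sm_80"

    # Mirror the documented type: a single option string or a list of strings.
    if isinstance(options, str):
        cmd.append(options)
    elif options is not None:
        cmd.extend(options)

    cmd += ["-o", output_path, source_path]
    return cmd


command = build_nvcc_command("kernel.cu", "kernel.ptx",
                             arch="sm_80", options=["-lineinfo"])
```

Returning the argument list rather than running it keeps the sketch free of side effects; a caller could pass the list to subprocess.run and then read the output file into a bytearray.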

tilelang.contrib.nvcc.find_cuda_path()

Utility function to find the CUDA path

Returns:

path – Path to the CUDA root.

Return type:

str

tilelang.contrib.nvcc.get_cuda_version(cuda_path=None)

Utility function to get the CUDA version

Parameters:

cuda_path (Optional[str]) – Path to the CUDA root. If None, the result of find_cuda_path() is used.

Returns:

version – The CUDA version

Return type:

float
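
Because the version is returned as a float, it helps to see what that representation can and cannot carry. The sketch below is an assumption about one way to obtain such a value: parse_nvcc_release is a hypothetical helper that extracts the "release X.Y" field from the text printed by nvcc --version. It is not taken from tilelang's source.

```python
import re


def parse_nvcc_release(version_text):
    """Hypothetical helper: pull 'release X.Y' out of `nvcc --version` output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_text)
    if match is None:
        raise RuntimeError("could not find a CUDA release number")
    return float(f"{match.group(1)}.{match.group(2)}")


# Sample line in the format nvcc prints; the numbers are illustrative.
sample = "Cuda compilation tools, release 12.4, V12.4.131"
version = parse_nvcc_release(sample)
```

A float cannot distinguish a minor version of 1 from a minor version of 10, so comparing parsed (major, minor) integers is the safer choice when the minor number matters.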

tilelang.contrib.nvcc.tilelang_callback_cuda_compile(code, target)

Use nvcc to generate fatbin code for better optimization

tilelang.contrib.nvcc.find_libdevice_path(arch)

Utility function to find libdevice

Parameters:

arch (int) – The compute architecture in int

Returns:

path – Path to libdevice.

Return type:

str

tilelang.contrib.nvcc.callback_libdevice_path(arch)

tilelang.contrib.nvcc.get_target_compute_version(target=None)

Utility function to get compute capability of compilation target.

Looks for the target arch in three places, in order: the target argument, the Target.current() scope, and finally the GPU device (if one exists).

Parameters:

target (tvm.target.Target, optional) – The compilation target

Returns:

compute_version – compute capability of a GPU (e.g. “8.6” or “9.0”)

Return type:

str
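
The three-step lookup described above can be sketched as a simple fallback chain. Everything here is a stand-in chosen for illustration: plain dicts replace tvm.target.Target objects, device_version replaces a real device query, and the "sm_86" arch format is an assumption. The names resolve_compute_version and arch_to_compute_version are hypothetical.

```python
def arch_to_compute_version(arch):
    """Convert an arch string such as 'sm_86' to '8.6' (assumed format)."""
    digits = "".join(ch for ch in arch if ch.isdigit())
    return f"{digits[:-1]}.{digits[-1]}"


def resolve_compute_version(target=None, current_target=None, device_version=None):
    """Hypothetical stand-in for the lookup order described above."""
    # 1. An explicitly passed target takes priority.
    if target is not None and "arch" in target:
        return arch_to_compute_version(target["arch"])
    # 2. Otherwise use the target from the enclosing scope.
    if current_target is not None and "arch" in current_target:
        return arch_to_compute_version(current_target["arch"])
    # 3. Finally, use what the GPU device reports, if one exists.
    if device_version is not None:
        return device_version
    raise ValueError("no compute version could be determined")


explicit = resolve_compute_version({"arch": "sm_86"}, {"arch": "sm_70"}, "9.0")
scoped = resolve_compute_version(None, {"arch": "sm_70"}, "9.0")
from_device = resolve_compute_version(None, None, "9.0")
```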

tilelang.contrib.nvcc.parse_compute_version(compute_version)

Parse a compute capability string into its major and minor version numbers

Parameters:

compute_version (str) – compute capability of a GPU (e.g. “6.0”)

Returns:

  • major (int) – major version number

  • minor (int) – minor version number
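
As a rough illustration of the parsing step, here is a minimal sketch. split_compute_version is a hypothetical equivalent written for this page, and its error handling is an assumption rather than the documented behavior.

```python
def split_compute_version(compute_version):
    """Hypothetical equivalent: split 'major.minor' into two integers."""
    parts = compute_version.split(".")
    try:
        return int(parts[0]), int(parts[1])
    except (IndexError, ValueError) as err:
        raise RuntimeError(
            f"cannot parse compute version {compute_version!r}"
        ) from err


major, minor = split_compute_version("8.6")
```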

tilelang.contrib.nvcc.have_fp16(compute_version)

Whether fp16 support is provided in the specified compute capability

Parameters:

compute_version (str) – compute capability of a GPU (e.g. “6.0”)
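
A capability check of this kind reduces to comparing (major, minor) against a minimum. The sketch below is illustrative only: supports_fp16 is a hypothetical name, and the 5.3 minimum is an assumption based on NVIDIA's statement that half-precision arithmetic requires compute capability 5.3 or higher, not a value confirmed from tilelang's source.

```python
def supports_fp16(compute_version, minimum=(5, 3)):
    """Hypothetical check: compare parsed (major, minor) against a minimum."""
    major, minor = (int(part) for part in compute_version.split("."))
    # Tuple comparison orders by major first, then minor.
    return (major, minor) >= minimum


newer = supports_fp16("6.0")
older = supports_fp16("5.2")
two_digit_major = supports_fp16("10.0")
```

Comparing integers rather than the raw strings matters once major versions reach two digits: as strings, "10.0" sorts before "5.3".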

tilelang.contrib.nvcc.have_int8(compute_version)

Whether int8 support is provided in the specified compute capability

Parameters:

compute_version (str) – compute capability of a GPU (e.g. “6.1”)

tilelang.contrib.nvcc.have_tensorcore(compute_version=None, target=None)

Whether TensorCore support is provided in the specified compute capability

Parameters:
  • compute_version (str, optional) – compute capability of a GPU (e.g. “7.0”).

  • target (tvm.target.Target, optional) – The compilation target. Used to determine the arch if compute_version is not specified.
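
The interplay between the two optional parameters can be sketched as follows. This is not tilelang's implementation: supports_tensorcore is a hypothetical name, a plain dict stands in for tvm.target.Target, the major version 7 minimum is an assumption consistent with the “7.0” example above, and returning False when neither argument is given is a choice made only for this sketch.

```python
def supports_tensorcore(compute_version=None, target=None):
    """Hypothetical check: prefer compute_version, else derive it from target."""
    if compute_version is None:
        if target is None or "arch" not in target:
            return False  # nothing to inspect in this sketch
        # Assumed arch format: "sm_80" becomes "8.0".
        digits = "".join(ch for ch in target["arch"] if ch.isdigit())
        compute_version = f"{digits[:-1]}.{digits[-1]}"
    major = int(compute_version.split(".")[0])
    return major >= 7


by_version = supports_tensorcore("7.0")
by_target = supports_tensorcore(target={"arch": "sm_80"})
too_old = supports_tensorcore("6.1")
```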

tilelang.contrib.nvcc.have_cudagraph()

Whether CUDA Graph support is provided

tilelang.contrib.nvcc.have_bf16(compute_version)

Whether bf16 support is provided in the specified compute capability

Parameters:

compute_version (str) – compute capability of a GPU (e.g. “8.0”)

tilelang.contrib.nvcc.have_fp8(compute_version)

Whether fp8 support is provided in the specified compute capability

Parameters:

compute_version (str) – compute capability of a GPU

tilelang.contrib.nvcc.have_tma(target)

Whether TMA support is provided by the specified compilation target

Parameters:

target (tvm.target.Target) – The compilation target

tilelang.contrib.nvcc.get_nvcc_compiler()

Get the path to the nvcc compiler

Return type:

str