tilelang.contrib.cutedsl.ieee_math
==================================

.. py:module:: tilelang.contrib.cutedsl.ieee_math

.. autoapi-nested-parse::

   IEEE-754 compliant floating-point operations with explicit rounding modes.

   These correspond to the CUDA intrinsics ``__fadd_rn``, ``__fsub_rz``, etc.
   They are implemented via inline PTX to ensure exact rounding mode compliance.

   Rounding modes:

   - ``rn``: round to nearest (ties to even)
   - ``rz``: round toward zero
   - ``rm``: round toward -inf
   - ``rp``: round toward +inf



Functions
---------

.. autoapisummary::

   tilelang.contrib.cutedsl.ieee_math.ieee_fadd
   tilelang.contrib.cutedsl.ieee_math.ieee_fsub
   tilelang.contrib.cutedsl.ieee_math.ieee_fmul
   tilelang.contrib.cutedsl.ieee_math.ieee_fmaf
   tilelang.contrib.cutedsl.ieee_math.ieee_frcp
   tilelang.contrib.cutedsl.ieee_math.ieee_fsqrt
   tilelang.contrib.cutedsl.ieee_math.ieee_fdiv


Module Contents
---------------

.. py:function:: ieee_fadd(a, b, rounding='rn')

   IEEE-754 add with explicit rounding mode.


.. py:function:: ieee_fsub(a, b, rounding='rn')

   IEEE-754 subtract with explicit rounding mode.


.. py:function:: ieee_fmul(a, b, rounding='rn')

   IEEE-754 multiply with explicit rounding mode.


.. py:function:: ieee_fmaf(a, b, c, rounding='rn')

   IEEE-754 fused multiply-add with explicit rounding mode.


.. py:function:: ieee_frcp(a, rounding='rn')

   IEEE-754 reciprocal with explicit rounding mode.


.. py:function:: ieee_fsqrt(a, rounding='rn')

   IEEE-754 square root with explicit rounding mode.


.. py:function:: ieee_fdiv(a, b, rounding='rn')

   IEEE-754 divide with explicit rounding mode.
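
The effect of the four rounding modes can be illustrated on the host, without a GPU. The following is a minimal pure-Python sketch that emulates float32 rounding with exact rational arithmetic. It is not how this module is implemented (the module uses inline PTX), and the names ``round_to_f32``, ``emu_fadd``, and ``emu_fmaf`` are hypothetical helpers introduced only for this illustration. The sketch assumes finite inputs and results within float32 range, and it does not model signed zeros, NaN, or overflow.

```python
import struct
from fractions import Fraction

MODES = ("rn", "rz", "rm", "rp")


def _to_f32(x: float) -> float:
    """Round a Python float to the nearest float32 value (returned as a float)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def _bits(x: float) -> int:
    """Raw 32-bit pattern of a float32 value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def _next_up(x: float) -> float:
    """Smallest float32 strictly greater than the finite float32 value x."""
    if x == 0.0:
        return struct.unpack("<f", struct.pack("<I", 1))[0]  # smallest subnormal
    bits = _bits(x) + (1 if x > 0 else -1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def round_to_f32(exact: Fraction, rounding: str = "rn") -> float:
    """Round an exact rational value to float32 under the given mode."""
    if rounding not in MODES:
        raise ValueError(f"unknown rounding mode: {rounding!r}")
    # 'near' is one of the two float32 values that bracket 'exact'.
    near = _to_f32(float(exact))
    lo = hi = near
    if Fraction(near) > exact:
        lo = -_next_up(-near)  # step one float32 value toward -inf
    elif Fraction(near) < exact:
        hi = _next_up(near)    # step one float32 value toward +inf
    if lo == hi:               # exactly representable: every mode agrees
        return near
    if rounding == "rm":
        return lo
    if rounding == "rp":
        return hi
    if rounding == "rz":
        return lo if exact > 0 else hi
    # rn: pick the closer neighbour, breaking ties toward an even mantissa.
    d_lo, d_hi = exact - Fraction(lo), Fraction(hi) - exact
    if d_lo != d_hi:
        return lo if d_lo < d_hi else hi
    return lo if _bits(lo) % 2 == 0 else hi


def emu_fadd(a: float, b: float, rounding: str = "rn") -> float:
    """Emulated float32 add: exact sum, then a single rounding."""
    return round_to_f32(Fraction(_to_f32(a)) + Fraction(_to_f32(b)), rounding)


def emu_fmaf(a: float, b: float, c: float, rounding: str = "rn") -> float:
    """Emulated float32 fused multiply-add: a * b + c with a single rounding."""
    exact = Fraction(_to_f32(a)) * Fraction(_to_f32(b)) + Fraction(_to_f32(c))
    return round_to_f32(exact, rounding)


if __name__ == "__main__":
    tiny = 2.0 ** -25  # well below half a float32 ulp of 1.0
    for mode in MODES:
        print(mode, emu_fadd(1.0, tiny, mode), emu_fadd(-1.0, -tiny, mode))
```

In this sketch, adding ``2**-25`` to ``1.0`` leaves the result at ``1.0`` for ``rn``, ``rz``, and ``rm``, while ``rp`` moves it up to the next float32 value, ``1 + 2**-23``. For the negated operands the roles of ``rm`` and ``rp`` swap. The ``emu_fmaf`` helper shows why a fused multiply-add is listed separately from multiply and add: it rounds once, so with ``a = 1 + 2**-12`` and ``c = -(1 + 2**-11)`` it returns ``2**-24``, whereas rounding the product first and then adding ``c`` yields ``0.0``.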