tilelang.contrib.cutedsl.ieee_math
IEEE-754 compliant floating-point operations with explicit rounding modes.
These functions correspond to the CUDA intrinsics __fadd_rn, __fsub_rz, and so on. They are implemented via inline PTX so that the requested rounding mode is applied exactly.
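Rounding modes: rn (to nearest, ties to even), rz (toward zero), rm (toward -inf), rp (toward +inf). These are the PTX suffixes; the CUDA intrinsic names use rd and ru where PTX uses rm and rp.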
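The effect of the four modes can be illustrated on the host with a small reference model. The sketch below is not the module's implementation (which uses inline PTX); it is a pure-Python model, assuming finite float32 inputs well inside the representable range and ignoring the sign of zero. The helper names `to_f32`, `next_up`, `round_f32`, and `ref_fadd` are hypothetical and exist only for this illustration.

```python
import struct
from fractions import Fraction


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to a nearby binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def next_up(x: float) -> float:
    """Smallest binary32 value strictly greater than the finite binary32 x."""
    if x == 0.0:
        return struct.unpack("<f", struct.pack("<I", 1))[0]  # smallest subnormal
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 1 if x > 0 else -1  # negative values move toward zero
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def round_f32(exact: Fraction, rounding: str = "rn") -> float:
    """Round an exact rational to binary32 using the given mode (hypothetical helper)."""
    # Bracket the exact value between two adjacent binary32 values lo <= exact <= hi.
    lo = to_f32(float(exact))
    while Fraction(lo) > exact:
        lo = -next_up(-lo)  # step down by one binary32 value
    while Fraction(next_up(lo)) <= exact:
        lo = next_up(lo)
    hi = lo if Fraction(lo) == exact else next_up(lo)

    if rounding == "rm":  # toward -inf
        return lo
    if rounding == "rp":  # toward +inf
        return hi
    if rounding == "rz":  # toward zero
        return lo if exact >= 0 else hi
    if rounding == "rn":  # to nearest, ties to even
        d_lo, d_hi = exact - Fraction(lo), Fraction(hi) - exact
        if d_lo != d_hi:
            return lo if d_lo < d_hi else hi
        lo_bits = struct.unpack("<I", struct.pack("<f", lo))[0]
        return lo if lo_bits % 2 == 0 else hi
    raise ValueError(f"unknown rounding mode: {rounding!r}")


def ref_fadd(a: float, b: float, rounding: str = "rn") -> float:
    """Reference model of a float32 add with an explicit rounding mode."""
    return round_f32(Fraction(to_f32(a)) + Fraction(to_f32(b)), rounding)


if __name__ == "__main__":
    for mode in ("rn", "rz", "rm", "rp"):
        print(mode, ref_fadd(1.0, 2.0 ** -30, mode), ref_fadd(-1.0, -(2.0 ** -30), mode))
```

In this model the sum 1 + 2^-30 is not representable in float32, so only rp moves it up to the next value above 1.0; for the negated inputs only rm moves away from -1.0.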
Functions
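| ieee_fadd(a, b, rounding='rn') | IEEE-754 add with explicit rounding mode. |
| ieee_fsub(a, b, rounding='rn') | IEEE-754 subtract with explicit rounding mode. |
| ieee_fmul(a, b, rounding='rn') | IEEE-754 multiply with explicit rounding mode. |
| ieee_fmaf(a, b, c, rounding='rn') | IEEE-754 fused multiply-add with explicit rounding mode. |
| ieee_frcp(a, rounding='rn') | IEEE-754 reciprocal with explicit rounding mode. |
| ieee_fsqrt(a, rounding='rn') | IEEE-754 square root with explicit rounding mode. |
| ieee_fdiv(a, b, rounding='rn') | IEEE-754 divide with explicit rounding mode. |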
Module Contents
- tilelang.contrib.cutedsl.ieee_math.ieee_fadd(a, b, rounding='rn')
IEEE-754 add with explicit rounding mode.
- tilelang.contrib.cutedsl.ieee_math.ieee_fsub(a, b, rounding='rn')
IEEE-754 subtract with explicit rounding mode.
- tilelang.contrib.cutedsl.ieee_math.ieee_fmul(a, b, rounding='rn')
IEEE-754 multiply with explicit rounding mode.
- tilelang.contrib.cutedsl.ieee_math.ieee_fmaf(a, b, c, rounding='rn')
IEEE-754 fused multiply-add with explicit rounding mode.
- tilelang.contrib.cutedsl.ieee_math.ieee_frcp(a, rounding='rn')
IEEE-754 reciprocal with explicit rounding mode.
- tilelang.contrib.cutedsl.ieee_math.ieee_fsqrt(a, rounding='rn')
IEEE-754 square root with explicit rounding mode.
- tilelang.contrib.cutedsl.ieee_math.ieee_fdiv(a, b, rounding='rn')
IEEE-754 divide with explicit rounding mode.
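A fused multiply-add rounds once, after computing a * b + c exactly, whereas a separate multiply followed by an add rounds twice. The following host-side sketch shows the difference for round-to-nearest. It is only a model, assuming that ieee_fmaf follows the usual single-rounding FMA semantics and that inputs are finite float32 values; the helpers `to_f32`, `next_up`, and `rn_f32` are hypothetical names introduced for this example and are not part of the module.

```python
import struct
from fractions import Fraction


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to a nearby binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_bits(x: float) -> int:
    """Raw binary32 bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def next_up(x: float) -> float:
    """Smallest binary32 value strictly greater than the finite binary32 x."""
    if x == 0.0:
        return struct.unpack("<f", struct.pack("<I", 1))[0]
    bits = f32_bits(x) + (1 if x > 0 else -1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def rn_f32(exact: Fraction) -> float:
    """Round an exact rational to binary32, to nearest with ties to even."""
    c = to_f32(float(exact))
    candidates = [-next_up(-c), c, next_up(c)]  # c and its two neighbours
    # Pick the smallest distance; on a tie prefer the even mantissa (LSB == 0).
    return min(candidates, key=lambda v: (abs(Fraction(v) - exact), f32_bits(v) & 1))


a = b = 1.0 + 2.0 ** -12          # exactly representable in binary32
c = -(1.0 + 2.0 ** -11)           # exactly representable in binary32

# Fused: one rounding of the exact a * b + c.
fused = rn_f32(Fraction(a) * Fraction(b) + Fraction(c))

# Unfused: round the product first, then round the sum.
product = rn_f32(Fraction(a) * Fraction(b))
unfused = rn_f32(Fraction(product) + Fraction(c))

print("fused  :", fused)
print("unfused:", unfused)
```

Here the exact product is 1 + 2^-11 + 2^-24, which lies halfway between two float32 values. Rounding it first discards the 2^-24 term, so the unfused result is 0.0, while the fused model keeps it and returns 2^-24.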