Inlay

On some GPUs (e.g. AMD GCN/RDNA), 24-bit integer multiplies and multiply-adds (muls/mads) are 4x faster than their 32-bit counterparts. The 24-bit instructions are not exposed directly in HLSL/GLSL, but if the value range allows it, you can encourage the compiler to emit them by zeroing the 8 most significant bits of the operands (for example, by masking with 0x00FFFFFF).
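The masking pattern can be sketched as follows. This is a CPU-side illustration in C rather than shader code, and the helper names `mul24` and `mad24` are hypothetical, chosen here only for the example. The sketch assumes unsigned operands that are known to fit in 24 bits.

```c
#include <stdint.h>

/* Hypothetical helpers, named for this example only. */

/* Multiply with the top 8 bits of each operand zeroed.
   The mask is what lets a shader compiler prove that both
   inputs fit in 24 bits. */
static uint32_t mul24(uint32_t a, uint32_t b)
{
    return (a & 0x00FFFFFFu) * (b & 0x00FFFFFFu);
}

/* Multiply-add variant: only the two multiplicands are masked,
   the addend c stays a full 32-bit value. */
static uint32_t mad24(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & 0x00FFFFFFu) * (b & 0x00FFFFFFu) + c;
}
```

The same expression should carry over to HLSL/GLSL with `uint` operands. Whether the compiler actually picks a 24-bit instruction is its decision, so it is worth checking the generated ISA. If an operand can exceed 24 bits, the mask discards its upper bits and the result differs from a plain 32-bit multiply.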