Subnormal number
== Disabling subnormal floats at the code level {{anchor|Disabling_denormal_floats_at_the_code_level}} ==

=== Intel SSE ===
Intel's C and Fortran compilers enable the {{code|DAZ}} (denormals-are-zero) and {{code|FTZ}} (flush-to-zero) flags for [[Streaming_SIMD_Extensions|SSE]] by default for optimization levels higher than {{code|-O0}}.<ref>{{cite web |url=http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/2011Update/fortran/win/fpops/common/fpops_reduce_denorm.htm |title=Intel® MPI Library – Documentation |publisher=Intel}}</ref> The effect of {{code|DAZ}} is to treat subnormal input arguments to floating-point operations as zero. The effect of {{code|FTZ}} is to return zero instead of a subnormal float for operations that would produce a subnormal result, even if the input arguments are not themselves subnormal. [[Clang]] and [[GNU Compiler Collection|GCC]] have varying default states depending on platform and optimization level.

A non-[[C99]]-compliant method of enabling the {{code|DAZ}} and {{code|FTZ}} flags on targets supporting SSE is given below, but it is not widely supported. It has been known to work on [[Mac OS X]] since at least 2006.<ref>{{cite web |url=https://lists.apple.com/archives/perfoptimization-dev/2006/May/msg00013.html |archive-url=https://web.archive.org/web/20160826010613/https://lists.apple.com/archives/perfoptimization-dev/2006/May/msg00013.html |url-status=dead |archive-date=2016-08-26 |title=Re: Macbook pro performance issue |publisher=Apple Inc.}}</ref>

<syntaxhighlight lang="c">
#include <fenv.h>
#pragma STDC FENV_ACCESS ON
// Sets DAZ and FTZ, clobbering other CSR settings.
// See https://opensource.apple.com/source/Libm/Libm-287.1/Source/Intel/, fenv.c and fenv.h.
fesetenv(FE_DFL_DISABLE_SSE_DENORMS_ENV);
// fesetenv(FE_DFL_ENV) // Disable both, clobbering other CSR settings.
</syntaxhighlight>

For other x86-SSE platforms where the C library does not provide this environment macro, the following may work:<ref>{{cite web |url=http://lists.apple.com/archives/perfoptimization-dev/2007/Jun/msg00025.html |title=Re: Changing floating point state (Was: double vs float performance) |publisher=Apple Inc. |access-date=2013-01-24 |archive-url=https://web.archive.org/web/20140115124313/http://lists.apple.com/archives/perfoptimization-dev/2007/Jun/msg00025.html |archive-date=2014-01-15 |url-status=dead }}</ref>

<syntaxhighlight lang="c">
#include <xmmintrin.h>
_mm_setcsr(_mm_getcsr() | 0x0040);  // Enable DAZ
_mm_setcsr(_mm_getcsr() | 0x8000);  // Enable FTZ
_mm_setcsr(_mm_getcsr() | 0x8040);  // Enable both
_mm_setcsr(_mm_getcsr() & ~0x8040); // Disable both
</syntaxhighlight>

The {{code|_MM_SET_DENORMALS_ZERO_MODE}} and {{code|_MM_SET_FLUSH_ZERO_MODE}} macros provide a more readable interface to the code above.<ref>{{cite web |url=https://software.intel.com/sites/default/files/ae/4f/6320 |title=C++ Compiler for Linux* Systems User's Guide |publisher=Intel}}</ref>

<syntaxhighlight lang="c">
// To enable DAZ
#include <pmmintrin.h>
_MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
// To enable FTZ
#include <xmmintrin.h>
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
</syntaxhighlight>

Most compilers already provide these macros. Where they are missing, the following definitions can be used (the definitions for FTZ are analogous):

<syntaxhighlight lang="c">
#define _MM_DENORMALS_ZERO_MASK   0x0040
#define _MM_DENORMALS_ZERO_ON     0x0040
#define _MM_DENORMALS_ZERO_OFF    0x0000

#define _MM_SET_DENORMALS_ZERO_MODE(mode) _mm_setcsr((_mm_getcsr() & ~_MM_DENORMALS_ZERO_MASK) | (mode))
#define _MM_GET_DENORMALS_ZERO_MODE()     (_mm_getcsr() & _MM_DENORMALS_ZERO_MASK)
</syntaxhighlight>

The default denormalization behavior is mandated by the [[Application_binary_interface|ABI]]. Well-behaved software should therefore save the denormalization mode before changing it, and restore it before returning to the caller or calling code in other libraries.

=== ARM ===
{{Unreferenced section|date=March 2023}}
The AArch32 NEON (SIMD) FPU always uses a flush-to-zero mode{{cn|date=October 2024}}, which is equivalent to {{code|FTZ + DAZ}}. For the scalar FPU and for AArch64 SIMD, the flush-to-zero behavior is optional and controlled by the {{code|FZ}} bit of the control register: FPSCR in AArch32 and FPCR in AArch64.<ref>{{cite web |url=https://developer.arm.com/documentation/ddi0595/2021-06/AArch64-Registers/FPCR--Floating-point-Control-Register?lang=en#fieldset_0-24_24 |title=Aarch64 Registers |publisher=Arm}}</ref>

One way to set this bit on AArch64 is:

<syntaxhighlight lang="c">
#include <stdint.h>

#if defined(__arm64__) || defined(__aarch64__)
    uint64_t fpcr;
    __asm__ volatile("mrs %0, fpcr" : "=r"(fpcr));                        // Read the FPCR register
    __asm__ volatile("msr fpcr, %0" :: "r"(fpcr | (UINT64_C(1) << 24))); // Set bit 24 (FZ) to 1
#endif
</syntaxhighlight>

Some ARM processors have hardware handling of subnormals.
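Independently of the architecture, the difference between flushing inputs ({{code|DAZ}}) and flushing results ({{code|FTZ}}) can be illustrated with a small software model. The sketch below is only an illustration and does not touch any hardware control register. The names <code>flush_subnormal</code>, <code>mul_model</code>, <code>model_examples</code> and <code>TINY_SUBNORMAL</code> are hypothetical and were chosen for this example. It assumes IEEE 754 single precision and that the program itself runs with subnormal support enabled (for example, not compiled with fast-math options).

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical constant for illustration: 2^-128, a subnormal float. */
static const float TINY_SUBNORMAL = FLT_MIN * 0.25f;

/* Hypothetical helper: replace a subnormal value by a zero of the same sign. */
static float flush_subnormal(float x)
{
    if (fpclassify(x) == FP_SUBNORMAL)
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}

/* Software model of one multiplication with DAZ and/or FTZ enabled.
   daz != 0: subnormal inputs are read as zero.
   ftz != 0: a subnormal result is replaced by zero. */
static float mul_model(float a, float b, int daz, int ftz)
{
    if (daz) {
        a = flush_subnormal(a);
        b = flush_subnormal(b);
    }
    volatile float product = a * b; /* volatile forces the product to be stored as a float */
    float result = product;
    if (ftz)
        result = flush_subnormal(result);
    return result;
}

/* A few cases showing where the two flags differ. */
static void model_examples(void)
{
    /* Normal inputs, subnormal result: only FTZ changes the outcome. */
    assert(fpclassify(mul_model(FLT_MIN, 0.5f, 0, 0)) == FP_SUBNORMAL);
    assert(fpclassify(mul_model(FLT_MIN, 0.5f, 1, 0)) == FP_SUBNORMAL);
    assert(mul_model(FLT_MIN, 0.5f, 0, 1) == 0.0f);

    /* Subnormal input, normal result: only DAZ changes the outcome. */
    assert(mul_model(TINY_SUBNORMAL, 1024.0f, 0, 1) == FLT_MIN * 256.0f);
    assert(mul_model(TINY_SUBNORMAL, 1024.0f, 1, 0) == 0.0f);
}
```

Calling <code>model_examples()</code> from a test program exercises both cases. In the model, as in the description above, {{code|FTZ}} alone does not affect an operation whose result is normal, and {{code|DAZ}} alone does not affect an operation whose inputs are normal.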