Description
Add options to enable/disable the OpenCL/RenderScript compiler optimizations.
RenderScript has three floating-point optimization levels, selected with a pragma:
#pragma rs_fp_full - no optimizations, full IEEE 754-2008 compliance
#pragma rs_fp_relaxed - denormals are flushed to zero and rounding is towards zero
#pragma rs_fp_imprecise - everything in rs_fp_relaxed; in addition, operations on NaN and +-inf are undefined and -0.0 may be returned as +0.0
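As a minimal sketch of how the option could map onto the three pragmas above, the helper below returns the pragma line to emit at the top of a generated RenderScript file. The FpLevel enum and the rsFpPragma function are hypothetical names chosen for illustration; they are not part of ParallelME.

```cpp
#include <string>

// Hypothetical enum for illustration only; not an existing ParallelME type.
enum class FpLevel { Full, Relaxed, Imprecise };

// Returns the RenderScript pragma line matching the requested level.
inline std::string rsFpPragma(FpLevel level) {
    switch (level) {
        case FpLevel::Full:      return "#pragma rs_fp_full";
        case FpLevel::Relaxed:   return "#pragma rs_fp_relaxed";
        case FpLevel::Imprecise: return "#pragma rs_fp_imprecise";
    }
    // Defensive default: fall back to the strictest level.
    return "#pragma rs_fp_full";
}
```

Falling back to rs_fp_full keeps the generated code IEEE-compliant if an unexpected value is ever passed in.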
OpenCL has build flags that can approximate the three RenderScript levels. They can be passed to the ParallelME::Program() constructor. Each level keeps the flags of the levels before it:
-cl-strict-aliasing - full IEEE 754-2008 compliance, assumes strict aliasing
-cl-single-precision-constant -cl-denorms-are-zero - denormals are flushed to zero and double precision constants are treated as single precision
-cl-fast-relaxed-math -cl-no-signed-zeros -cl-mad-enable - operations on NaN and +-inf are undefined, -0.0 may be returned as +0.0, and multiply-add (mad) operations may be computed with reduced accuracy
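The cumulative flag sets above can be sketched as a small helper that builds the OpenCL options string for a given level. ClOptLevel and clBuildOptions are hypothetical names used only for illustration, and how the string reaches ParallelME::Program() is an assumption, since the constructor signature is not shown here. In plain OpenCL the same string would be the options argument of clBuildProgram.

```cpp
#include <string>

// Hypothetical enum for illustration only; mirrors the three RenderScript levels.
enum class ClOptLevel { Strict, Relaxed, Imprecise };

// Builds the OpenCL build-options string for a level.
// Each level keeps the flags of the levels before it.
inline std::string clBuildOptions(ClOptLevel level) {
    // Level 1: always present.
    std::string opts = "-cl-strict-aliasing";

    // Level 2: added for Relaxed and Imprecise.
    if (level == ClOptLevel::Relaxed || level == ClOptLevel::Imprecise) {
        opts += " -cl-single-precision-constant -cl-denorms-are-zero";
    }

    // Level 3: added for Imprecise only.
    if (level == ClOptLevel::Imprecise) {
        opts += " -cl-fast-relaxed-math -cl-no-signed-zeros -cl-mad-enable";
    }
    return opts;
}
```

Keeping the flags in one place like this would also make it easy to drop a single flag if it turns out to crash a given vendor's compiler.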
Some of the OpenCL optimization flags made the compiler crash, so they need to be tested before being enabled.