The following examples demonstrate typical difficulties in statically analyzing control/command programs. For illustrative purposes, the code snippets have been deliberately constructed to be simple. One of the main challenges in implementing a sound analysis is, of course, to ensure that it scales up to large real-world applications. Astrée takes care of that for you.
Control/command programs, in particular synchronous ones, manipulate thousands of boolean variables. Analyzing which program runtime properties hold when each boolean variable is either true or false rapidly leads to a combinatorial explosion of the number of cases to be considered.
Sophisticated algorithms enable Astrée to handle thousands of boolean variables with just enough precision to avoid both false alarms and combinatorial explosion.
Here’s a simple example:
/* boolean.c */ typedef enum {FALSE = 0, TRUE = 1} BOOLEAN; BOOLEAN B; void main () { unsigned int X, Y; while (1) { /* ... */ B = (X == 0); /* ... */ if (!B) { Y = 1 / X; }; /* ... */ }; }
The analysis of the above code by Astrée yields no warnings, thanks to the
automatically determined relationship between B
and X
.
Integer divide-by-zero can never happen when executing this program.
Command programs controlling complex physical systems are derived from mathematical models designed with real numbers whereas computer programs perform floating-point computations. The two computation models are completely different, which can yield very surprising results.
Consider the following two programs:
/* float-error.c */ int main () { float x, y, z, r; x = 1.000000019e+38; y = x + 1.0e21; z = x - 1.0e21; r = y - z; printf("%f\n", r); } % gcc float-error.c % ./a.out 0.000000
/* double-error.c */ int main () { double x; float y, z, r; /* x = ldexp(1.,50)+ldexp(1.,26); */ x = 1125899973951488.0; y = x + 1; z = x - 1; r = y - z; printf("%f\n", r); } % gcc double-error.c % ./a.out 134217728.000000
One would expect these programs to print 2.0e21
and 2.0
,
respectively, based on the reasoning that (x+a)-(x-a) = 2a
.
However, this reasoning turns out to be wrong due to rounding errors. Astrée
recognizes that automatically.
Astrée is sound for floating-point computations and handles them precisely and safely. It takes all possible rounding errors into account. It will even consider possible cumulative effects in programs running for hours.
Here’s another example:
/* float.c */ void main () { float x,y,z; /* ... code to initialize x,y,z ... */ if ( (((*((unsigned*)&x) & 0x7f800000) >> 23) != 255 ) /* not NaN */ && (x >= -1.0e38) && (x <= 1.0e38) ) { while (1) { y = x+1.0e21; z = x-1.0e21; x = y-z; }} else return; }
Astrée proves that the above program is free of runtime errors when running on a machine with floats on 32 bits.
Likewise, Astrée handles possible positive and negative infinity
as well NaN
values for variables, and can track their effects
through arithmetic calculations and comparisons. This is used to precisely analyze
source code with special handling of these values.
Control/command programs perform lots of digital filtering, as shown by the following example:
/* filter.c */ typedef enum {FALSE = 0, TRUE = 1} BOOLEAN; BOOLEAN INIT; float P, X; void filter () { static float E[2], S[2]; if (INIT) { S[0] = X; P = X; E[0] = X; } else { P = (((((0.5*X)-(E[0]*0.7))+(E[1]*0.4))+(S[0]*1.5))-(S[1]*0.7)); } E[1] = E[0]; E[0] = X; S[1] = S[0]; S[0] = P; } void main () { X = 0.2 * X + 5; INIT = TRUE; while (1) { X = 0.9 * X + 35; filter (); INIT = FALSE; } }
The absence of overflow (and more precisely that P
is in
[-1327.05, 1327.05]
as found by Astrée)
is not obvious, in particular because of 32-bit floating-point computations.
The situation becomes even more inextricable in the presence of boolean control
or cascades of filters. Astrée knows enough about control theory to
analyze filters precisely.
The idea of signal flow analysis is to analyze the flow from input signals to output signals through complex software. Taint analysis is a good fit for this, using input signals as taint sources and output signals as taint sinks.
The following depicts a small signal-flow example:
__ASTREE_taint_add((input; "INPUT")); __ASTREE_taint_sink((output; all)); auto a = input; // (1) f(&b, a); // f: *b <- a (2) if (b) ... // (3) output = b + ...; // (4)
The input variable is tainted by the hue INPUT
, and the output variable
is declared as a sink, specifying that all possible signal influences are relevant
for the output. Alternatively, a list of a relevant subset of input signals can be specified.
The taint directives can be included as an externally-stored analysis configuration, without modifying the actual code.
The taint analysis now tracks signal propagation in simple direct cases like
(1)
, indirect flow via pointers as in (2)
, as well as
through complex computation exemplified by (4)
.
Whenever the output taint sink is reached by a taint, an alarm is reported by Astrée.
Optionally, the influence of signals on control flow decisions can be reported
such as in (3)
.
The taint analysis builds upon a full-fledged semantic analysis of the software, thus taking into account the influence of application configuration parameters on the signal flow. The analysis can be performed for a specific application configuration, avoiding false signal flows not present in that configuration. Alternatively, the influence of application parameters on the output signal can be determined for a generic configuration.
If Astrée does not report an influence of a tainted input signal on an output signal declared as taint sink, independence has been proven.