Arrays and loops, verified¶
Recall this specification for a simple function that reads from an array:
unsigned int read (unsigned int *p, unsigned int n, unsigned int i)
/*@ requires take A = each(u32 j; j < n)
{ RW<unsigned int>(array_shift<unsigned int>(p,j)) };
i < n;
ensures take A_post = each(u32 j; j < n)
{ RW<unsigned int>(array_shift<unsigned int>(p,j)) };
return == A[i];
@*/
{
return p[i];
}
This specification should ensure that the access p[i]
is safe. However, running cn verify
on the example produces an
error: CN is unable to find the required ownership for reading p[i]
.
[1/1]: read -- fail
solutions/array_load.test.c:10:10: error: `&p[(u64)i]` out of bounds
return p[i];
^~~~
(UB missing short message): UB_CERB004_unspecified__pointer_add
The reason is that, when searching for a required resource, such as
the RW
resource for p[i]
here, CN’s resource inference does not
consider iterated resources. Quantifiers, as used by iterated
resources, can make verification undecidable, so, in order to maintain
predictable type checking, CN delegates this aspect of the reasoning
to the user.
To make the RW
resource required for accessing p[i]
available to
CN’s resource inference, we have to explicitly "focus
" ownership for
index i
out of the iterated resource.
unsigned int read (unsigned int *p, unsigned int n, unsigned int i)
/*@ requires take A = each(u32 j; j < n)
{ RW<unsigned int>(array_shift<unsigned int>(p,j)) };
i < n;
ensures take A_post = each(u32 j; j < n)
{ RW<unsigned int>(array_shift<unsigned int>(p,j)) };
return == A[i];
@*/
{
/*@ focus RW<unsigned int>, i; @*/
return p[i];
}
The CN comment /*@ focus RW<unsigned int>, i; @*/
is a proof hint in the form of a "ghost statement" that instructs CN to instantiate any available iterated RW<unsigned int>
resource for index i
. In our example this operation splits this iterated resource
into two parts:
- the instantiation of the iterated resource at
i
- the remainder of the iterated resource, the ownership for all indices except
i
After this extraction step, CN can use the (former) extracted resource to justify the access p[i]
. Note that an focus
statement's second argument can be any arithmetic expression, not just a single identifier like in this example.
Following a focus
, CN remembers the extracted index and can automatically "reverse" the extraction when needed, merging the resources when ownership of the full array is required.
Exercises¶
Exercise: Add focus
statements to verify your specification of array_swap
from last chapter.
void array_swap (int *p, int n, int i, int j)
{
int tmp = p[i];
p[i] = p[j];
p[j] = tmp;
}
Exercise: Specify and verify the following function, array_read_two
, which takes the base pointer p
of an unsigned int
array, the array length n
, and two indices i
and j
. Assuming i
and j
are different, it returns the sum of the values at these two indices.
unsigned int array_read_two (unsigned int *p, int n, int i, int j)
{
unsigned int tmp1 = p[i];
unsigned int tmp2 = p[j];
return (tmp1 + tmp2);
}
Loops¶
If we have array code with loops, CN requires the user to supply loop invariants — CN specifications of the RW
resources and constraints required to verify each iteration of the loop.
Let's take a look at a simple first example. The following function, array_init
, takes the base pointer p
of a char
array and the array length n
and writes 0
to each array cell.
void array_init (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) {
RW<char>( array_shift<char>(p, i)) };
ensures take A_post = each(u32 i; i < n) {
RW<char>( array_shift<char>(p, i)) };
@*/
{
unsigned int j = 0;
while (j < n)
{
p[j] = 0;
j++;
}
}
If we focus for now just on proving safe execution of array_init
, a specification might look as above. This closely resembles specifications we have seen previously, but now with char
s.
To verify this, we have to supply a loop invariant that specifies all resource ownership and the necessary constraints that hold before and after each iteration of the loop. Loop invariants are specified using the keyword inv
, followed by CN specifications using the same syntax as in function pre- and postconditions. The variables in scope for loop invariants are all in-scope C variables, as well as CN variables introduced in the function precondition. In loop invariants, the name of a C variable refers to its current value (more on this shortly).
void array_init (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) {
RW<char>( array_shift<char>(p, i)) };
ensures take A_post = each(u32 i; i < n) {
RW<char>( array_shift<char>(p, i)) };
@*/
{
unsigned int j = 0;
while (j < n)
/*@ inv take Ai = each(u32 i; i < n) {
RW<char>( array_shift<char>(p, i)) };
{p} unchanged;
{n} unchanged;
@*/
{
/*@ focus RW<char>, j; @*/
p[j] = 0;
j++;
}
}
The main condition here is unsurprising: we specify ownership of an iterated resource for an array just like in the the pre- and postcondition.
The second thing we need to do, however, is less straightforward. Recall that, as discussed at the start of the tutorial, function arguments in C are mutable. Although, in this example, it is obvious that p
and n
do not change, CN currently requires the loop invariant to explicitly state this, using special notation {p} unchanged
(and similarly for n
).
Note. If we forget to specify unchanged
, this can lead to confusing errors. In this example, for instance, CN would verify the loop against the loop invariant, but would be unable to prove a function postcondition seemingly directly implied by the loop invariant (lacking the information that the postcondition's p
and n
are the same as the loop invariant's). Future CN versions may handle loop invariants differently and treat variables as immutable by default.
The final piece needed in the verification is an focus
statement, as used in the previous examples: to separate the individual RW<char>
resource for index j
out of the iterated RW
resource and make it available to the resource inference, we specify focus RW<char>, j;
.
With the inv
and focus
statements in place, CN accepts the function.
Exercises¶
Init array reverse. Verify the function array_init_rev
, which has the same specification as array_init2
, but initializes the array in decreasing index order (from right to left).
void array_init_rev (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) {
RW<char>( array_shift<char>(p, i)) };
ensures take A_post = each(u32 i; i < n) {
RW<char>( array_shift<char>(p, i)) };
@*/
{
unsigned int j = 0;
while (j < n)
{
p[n-(j+1)] = 0;
j++;
}
}