Arrays and Loops¶

Another common datatype in C is arrays. Reasoning about memory ownership for arrays is more difficult than for the datatypes we have seen so far, for two reasons: (1) C allows the programmer to access arrays using computed pointers, and (2) the size of an array does not need to be known as a constant at compile time.

To support reasoning about code manipulating arrays and computed pointers, CN has iterated resources. For instance, to specify ownership of an int array with 10 cells starting at pointer p, CN uses the following iterated resource:

each (i32 i; 0i32 <= i && i < 10i32)
{ Owned<int>(array_shift<int>(p,i)) }

In detail, this can be read as follows:

for each integer i of CN type i32, …
if i is between 0 and 10, …
assert ownership of a resource Owned<int> …
for cell i of the array with base-address p.

Here array_shift<int>(p,i) computes a pointer into the array at pointer p, appropriately offset for index i.

In general, iterated resource specifications take the form

each (BT Q; GUARD) { RESOURCE }

comprising three parts:

BT Q, for some CN type BT and name Q, introduces the quantifier Q of basetype BT, which is bound in GUARD and RESOURCE;
GUARD is a boolean-typed expression delimiting the instances of Q for which ownership is asserted; and
RESOURCE is any non-iterated CN resource.

First array example¶

Let’s see how this applies to a simple array-manipulating function. Function read takes three arguments: the base pointer p of an int array, the length n of the array, and an index i into the array; read then returns the value of the i-th array cell.

exercises/array_load.broken.c

int read (int *p, int n, int i)
/*@ requires take A = each(i32 j; 0i32 <= j && j < n) { 
                        Owned<int>(array_shift<int>(p,j)) };
             0i32 <= i && i < n; 
    ensures take A_post = each(i32 j; 0i32 <= j && j < n) { 
                        Owned<int>(array_shift<int>(p,j)) }; 
@*/
{
  return p[i];
}

The CN precondition requires

ownership of the array on entry — one Owned<int> resource for each array index between 0 and n — and
that i lies within the range of owned indices.

On exit the array ownership is returned again.

This specification, in principle, should ensure that the access p[i] is safe. However, running CN on the example produces an error: CN is unable to find the required ownership for reading p[i].

cn verify solutions/array_load.broken.c
[1/1]: read
build/solutions/array_load.broken.c:5:10: error: Missing resource for reading
return p[i];
^~~~
Resource needed: Owned<signed int>(array_shift<signed int>(p, (u64)i))

The reason is that, when searching for a required resource, such as the Owned resource for p[i] here, CN’s resource inference does not consider iterated resources. Quantifiers, as used by iterated resources, can make verification undecidable, so, in order to maintain predictable type checking, CN delegates this aspect of the reasoning to the user.

To make the Owned resource required for accessing p[i] available to CN’s resource inference we have to explicitly "extract" ownership for index i out of the iterated resource.

exercises/array_load.c

int read (int *p, int n, int i)
/*@ requires take A = each(i32 j; 0i32 <= j && j < n) { 
                        Owned<int>(array_shift<int>(p,j)) };
             0i32 <= i && i < n; 
    ensures take A_post = each(i32 j; 0i32 <= j && j < n) { 
                        Owned<int>(array_shift<int>(p,j)) }; 
@*/
{
  /*@ extract Owned<int>, i; @*/
  return p[i];
}

Here the CN comment /*@ extract Owned<int>, i; @*/ is a proof hint in the form of a "ghost statement" that instructs CN to instantiate any available iterated Owned<int> resource for index i. In our example this operation splits the iterated resource into two:

each(i32 j; 0i32 <= j && j < n) { Owned<int>(array_shift<int>(p,j)) }

is split into

the instantiation of the iterated resource at i

Owned<int>(array_shift<int>(p,i))

the remainder of the iterated resource, the ownership for all indices except i

  each(i32 j; 0i32 <= j && j < n && j != i)
  { Owned<int>(array_shift<int>(p,j)) }

After this extraction step, CN can use the (former) extracted resource to justify the access p[i]. Note that an extract statement's second argument can be any arithmetic expression, not just a single identifier like in this example.

Following an extract statement, CN remembers the extracted index and can automatically "reverse" the extraction when needed: after type checking the access p[i] CN must ensure the function’s postcondition holds, which needs the full array ownership again (including the extracted index i); remembering the index i, CN then automatically merges resources (1) and (2) again to obtain the required full array ownership, and completes the verification of the function.

So far the specification only guarantees safe execution but does not specify the behaviour of read. To address this, let’s return to the iterated resources in the function specification. When we specify take A = each ... here, what is A? In CN, the output of an iterated resource is a map from indices to resource outputs. In this example, where index j has CN type i32 and the iterated resource is Owned<int>, the output A is a map from i32 indices to i32 values — CN type map<i32,i32>. If the type of j was i64 and the resource Owned<char>, A would have type map<i64,u8>.

We can use this to refine our specification with information about the functional behaviour of read.

exercises/array_load2.c

int read (int *p, int n, int i)
/*@ requires take a1 = each(i32 j; 0i32 <= j && j < n) { Owned<int>(array_shift<int>(p,j)) };
             0i32 <= i && i < n;
    ensures take a2 = each(i32 j; 0i32 <= j && j < n) { Owned<int>(array_shift<int>(p,j)) };
            a1 == a2;
            return == a1[i];
@*/
{
  /*@ extract Owned<int>, i; @*/
  return p[i];
}

We specify that read does not change the array — the outputs of Owned, A and A_post, taken before and after running the function, are the same — and that the value returned is A[i].

Exercises¶

Array read two. Specify and verify the following function, array_read_two, which takes the base pointer p of an unsigned int array, the array length n, and two indices i and j. Assuming i and j are different, it returns the sum of the values at these two indices.

exercises/add_two_array.c

unsigned int array_read_two (unsigned int *p, int n, int i, int j)
{
  unsigned int tmp1 = p[i];
  unsigned int tmp2 = p[j];
  return (tmp1 + tmp2);
}

Swap array. Specify and verify swap_array, which swaps the values of two cells of an int array. Assume again that i and j are different, and describe the effect of swap_array on the array value using the CN map update expression a[i:v], which denotes the same map as a, except with index i updated to v.

exercises/swap_array.c

void swap_array (int *p, int n, int i, int j)
{
  int tmp = p[i];
  p[i] = p[j];
  p[j] = tmp;
}

Loops¶

The array examples covered so far manipulate one or two individual cells of an array. Another typical pattern in code working over arrays is to loop, uniformly accessing all cells of an array or a sub-range of it.

In order to verify code with loops, CN requires the user to supply loop invariants -- CN specifications of all owned resources and the constraints required to verify each iteration of the loop.

Let's take a look at a simple first example. The following function, init_array, takes the base pointer p of a char array and the array length n and writes 0 to each array cell.

exercises/init_array.c

void init_array (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) { 
                         Owned<char>( array_shift<char>(p, i)) };
    ensures  take A_post = each(u32 i; i < n) { 
                             Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  {
    p[j] = 0;
    j++;
  }
}

If, for the moment, we focus just on proving safe execution of init_array, ignoring its functional behaviour, a specification might look as above: on entry, init_array takes ownership of an iterated Owned<char> resource -- one Owned resource for each index i of type u32 (so necessarily greater or equal to 0) up to n; on exit init_array returns the ownership.

To verify this, we have to supply a loop invariant that specifies all resource ownership and the necessary constraints that hold before and after each iteration of the loop. Loop invariants are specified using the keyword inv, followed by CN specifications using the same syntax as in function pre- and postconditions. The variables in scope for loop invariants are all in-scope C variables, as well as CN variables introduced in the function precondition. In loop invariants, the name of a C variable refers to its current value (more on this shortly).

solutions/init_array.c

void init_array (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) { 
                         Owned<char>( array_shift<char>(p, i)) };
    ensures  take A_post = each(u32 i; i < n) { 
                             Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  /*@ inv take Ai = each(u32 i; i < n) { 
                      Owned<char>( array_shift<char>(p, i)) };
          {p} unchanged; 
          {n} unchanged;
  @*/
  {
    /*@ extract Owned<char>, j; @*/
    p[j] = 0;
    j++;
  }
}

The main condition here is unsurprising: we specify ownership of an iterated resource for an array just like in the the pre- and postcondition.

The second thing we need to do, however, is less straightforward. Recall that, as discussed at the start of the tutorial, function arguments in C are mutable. Although, in this example, it is obvious that p and n do not change, CN currently requires the loop invariant to explicitly state this, using special notation {p} unchanged (and similarly for n).

Note. If we forget to specify unchanged, this can lead to confusing errors. In this example, for instance, CN would verify the loop against the loop invariant, but would be unable to prove a function postcondition seemingly directly implied by the loop invariant (lacking the information that the postcondition's p and n are the same as the loop invariant's). Future CN versions may handle loop invariants differently and treat variables as immutable by default.

The final piece needed in the verification is an extract statement, as used in the previous examples: to separate the individual Owned<char> resource for index j out of the iterated Owned resource and make it available to the resource inference, we specify extract Owned<char>, j;.

With the inv and extract statements in place, CN accepts the function.

Second loop example¶

The specification of init_array is overly strong: it requires an iterated Owned resource for the array on entry. If, as the name suggests, the purpose of init_array is to initialise the array, then a precondition asserting only an iterated Block resource for the array should also be sufficient. The modified specification is then as follows.

exercises/init_array2.c

void init_array2 (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) { 
                        Block<char>( array_shift<char>(p, i)) };
    ensures  take A_post = each(u32 i; i < n) { 
                             Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  {
    p[j] = 0;
    j++;
  }
}

This specification should hold: assuming ownership of an uninitialised array on entry, each iteration of the loop initialises one cell of the array, moving it from Block to Owned "state", so that on function return the full array is initialised. (Recall that stores only require Block ownership of the written memory location, i.e., ownership of not-necessarily-initialised memory.)

To verify this modified example we again need a loop Invariant. But this time the loop invariant is more involved: since each iteration of the loop initialises one more array cell, the loop invariant has to do precise book-keeping of the initialisation status of the different sections of the array.

To do this, we partition the array ownership into two parts: for each index of the array the loop has already visited, we have an Owned resource, for all other array indices we have the (unchanged) Block ownership.

solutions/init_array2.c

void init_array2 (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) { 
                        Block<char>( array_shift<char>(p, i)) };
    ensures  take A_post = each(u32 i; i < n) { 
                             Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  /*@ inv take Al = each(u32 i; i < j) { 
                      Owned<char>( array_shift<char>(p, i)) };
          take Ar = each(u32 i; j <= i && i < n) { 
                      Block<char>( array_shift<char>(p, i)) };
          {p} unchanged; {n} unchanged;
          j <= n;
  @*/
  {
    /*@ extract Block<char>, j; @*/
    /*@ extract Owned<char>, j; @*/
    p[j] = 0;
    j++;
  }
}

Let's go through this line-by-line:

We assert ownership of an iterated Owned resource, one for each index i strictly smaller than loop variable j.
All remaining indices i, between j and n are still uninitialised, so part of the iterated Block resource.
As in the previous example, we assert p and n are unchanged.
Finally, unlike in the previous example, this loop invariant involves j. We therefore also need to know that j does not exceed the array length n. Otherwise CN would not be able to prove that, on completing the last loop iteration, j=n holds. This, in turn, is needed to show that, when the function returns, ownership of the iterated Owned resource --- as specified in the loop invariant --- is fully consumed by the function's post-condition and there is no left-over unused resource.

As before, we also have to instruct CN to extract ownership of individual array cells out of the iterated resources:

to allow CN to extract the individual Block to be written, we use extract Block<char>, j;;
the store returns a matching Owned<char> resource for index j;
finally, we add extract Owned<char>, j; to allow CN to "attach" this resource to the iterated Owned resource. CN issues a warning, because nothing is, in fact, extracted: we are using extract only for the "reverse" direction.

Exercises¶

Init array reverse. Verify the function init_array_rev, which has the same specification as init_array2, but initializes the array in decreasing index order (from right to left).

exercises/init_array_rev.c

void init_array_rev (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) { 
                        Block<char>( array_shift<char>(p, i)) };
    n > 0u32;
    ensures  take A_post = each(u32 i; i < n) { 
                             Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  {
    p[n-(j+1)] = 0;
    j++;
  }
}