Skip to content

Arrays and loops, verified

Recall this specification for a simple function that reads from an array:

exercises/array_load.test.c
unsigned int read (unsigned int *p, unsigned int n, unsigned int i)
/*@ requires take A = each(u32 j; j < n)
                      { RW<unsigned int>(array_shift<unsigned int>(p,j)) };
             i < n; 
    ensures take A_post = each(u32 j; j < n) 
                          { RW<unsigned int>(array_shift<unsigned int>(p,j)) };
            return == A[i];
@*/
{
  return p[i];
}

This specification should ensure that the access p[i] is safe. However, running cn verify on the example produces an error: CN is unable to find the required ownership for reading p[i].

[1/1]: read -- fail
solutions/array_load.test.c:10:10: error: `&p[(u64)i]` out of bounds
  return p[i];
         ^~~~
(UB missing short message): UB_CERB004_unspecified__pointer_add

The reason is that, when searching for a required resource, such as the RW resource for p[i] here, CN’s resource inference does not consider iterated resources. Quantifiers, as used by iterated resources, can make verification undecidable, so, in order to maintain predictable type checking, CN delegates this aspect of the reasoning to the user.

To make the RW resource required for accessing p[i] available to CN’s resource inference, we have to explicitly "focus" ownership for index i out of the iterated resource.

exercises/array_load.c
unsigned int read (unsigned int *p, unsigned int n, unsigned int i)
/*@ requires take A = each(u32 j; j < n)
                      { RW<unsigned int>(array_shift<unsigned int>(p,j)) };
             i < n; 
    ensures take A_post = each(u32 j; j < n) 
                          { RW<unsigned int>(array_shift<unsigned int>(p,j)) };
            return == A[i];
@*/
{
  /*@ focus RW<unsigned int>, i; @*/
  return p[i];
}

The CN comment /*@ focus RW<unsigned int>, i; @*/ is a proof hint in the form of a "ghost statement" that instructs CN to instantiate any available iterated RW<unsigned int> resource for index i. In our example this operation splits this iterated resource

each(u32 j; j < n) { RW<unsigned int>(array_shift<unsigned int>(p,j)) }

into two parts:

  1. the instantiation of the iterated resource at i
RW<unsigned int>(array_shift<unsigned int>(p,i))
  1. the remainder of the iterated resource, the ownership for all indices except i
  each(u32 j; j < n && j != i)
  { RW<unsigned int>(array_shift<unsigned int>(p,j)) }

After this extraction step, CN can use the (former) extracted resource to justify the access p[i]. Note that an focus statement's second argument can be any arithmetic expression, not just a single identifier like in this example.

Following a focus, CN remembers the extracted index and can automatically "reverse" the extraction when needed, merging the resources when ownership of the full array is required.

Exercises

Exercise: Add focus statements to verify your specification of array_swap from last chapter.

exercises/array_swap.c
void array_swap (int *p, int n, int i, int j)
{
  int tmp = p[i];
  p[i] = p[j];
  p[j] = tmp;
}

Exercise: Specify and verify the following function, array_read_two, which takes the base pointer p of an unsigned int array, the array length n, and two indices i and j. Assuming i and j are different, it returns the sum of the values at these two indices.

exercises/array_read_two.c
unsigned int array_read_two (unsigned int *p, int n, int i, int j)
{
  unsigned int tmp1 = p[i];
  unsigned int tmp2 = p[j];
  return (tmp1 + tmp2);
}

Loops

If we have array code with loops, CN requires the user to supply loop invariants — CN specifications of the RW resources and constraints required to verify each iteration of the loop.

Let's take a look at a simple first example. The following function, array_init, takes the base pointer p of a char array and the array length n and writes 0 to each array cell.

exercises/array_init.c
void array_init (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) { 
                         RW<char>( array_shift<char>(p, i)) };
    ensures  take A_post = each(u32 i; i < n) { 
                             RW<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  {
    p[j] = 0;
    j++;
  }
}

If we focus for now just on proving safe execution of array_init, a specification might look as above. This closely resembles specifications we have seen previously, but now with chars.

To verify this, we have to supply a loop invariant that specifies all resource ownership and the necessary constraints that hold before and after each iteration of the loop. Loop invariants are specified using the keyword inv, followed by CN specifications using the same syntax as in function pre- and postconditions. The variables in scope for loop invariants are all in-scope C variables, as well as CN variables introduced in the function precondition. In loop invariants, the name of a C variable refers to its current value (more on this shortly).

solutions/array_init.c
void array_init (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) { 
                         RW<char>( array_shift<char>(p, i)) };
    ensures  take A_post = each(u32 i; i < n) { 
                             RW<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  /*@ inv take Ai = each(u32 i; i < n) { 
                      RW<char>( array_shift<char>(p, i)) };
          {p} unchanged; 
          {n} unchanged;
  @*/
  {
    /*@ focus RW<char>, j; @*/
    p[j] = 0;
    j++;
  }
}

The main condition here is unsurprising: we specify ownership of an iterated resource for an array just like in the the pre- and postcondition.

The second thing we need to do, however, is less straightforward. Recall that, as discussed at the start of the tutorial, function arguments in C are mutable. Although, in this example, it is obvious that p and n do not change, CN currently requires the loop invariant to explicitly state this, using special notation {p} unchanged (and similarly for n).

Note. If we forget to specify unchanged, this can lead to confusing errors. In this example, for instance, CN would verify the loop against the loop invariant, but would be unable to prove a function postcondition seemingly directly implied by the loop invariant (lacking the information that the postcondition's p and n are the same as the loop invariant's). Future CN versions may handle loop invariants differently and treat variables as immutable by default.

The final piece needed in the verification is an focus statement, as used in the previous examples: to separate the individual RW<char> resource for index j out of the iterated RW resource and make it available to the resource inference, we specify focus RW<char>, j;.

With the inv and focus statements in place, CN accepts the function.

Exercises

Init array reverse. Verify the function array_init_rev, which has the same specification as array_init2, but initializes the array in decreasing index order (from right to left).

exercises/array_init_rev.c
void array_init_rev (char *p, unsigned int n)
/*@ requires take A = each(u32 i; i < n) { 
                        RW<char>( array_shift<char>(p, i)) };
    ensures  take A_post = each(u32 i; i < n) { 
                             RW<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  {
    p[n-(j+1)] = 0;
    j++;
  }
}