CN is a type system for verifying C code, focusing especially on low-level systems code. Compared to the normal C type system, CN checks not only that expressions and statements follow the correct typing discipline for C-types, but also that the C code executes safely — does not raise C undefined behaviour — and correctly — according to strong user-defined specifications. This tutorial introduces CN along a series of examples, starting with basic usage of CN on simple arithmetic functions and slowly moving towards more elaborate separation logic specifications of data structures.

CN was first described in CN: Verifying Systems C Code with Separation-Logic Refinement Types by Christopher Pulte, Dhruv C. Makwana, Thomas Sewell, Kayvan Memarian, Peter Sewell, Neel Krishnaswami. To accurately handle the complex semantics of C, CN builds on the Cerberus semantics for C. Some of the examples in this tutorial are adapted from Arthur Charguéraud’s excellent Separation Logic Foundations textbook, and one of the case studies is based on an extended exercise due to Bryan Parno.

This tutorial is a work in progress — your suggestions are greatly appreciated!

Acknowledgment of Support and Disclaimer. This material is based upon work supported by the Air Force Research Laboratory (AFRL) and Defense Advanced Research Projects Agencies (DARPA) under Contract No. FA8750-24-C-B044, a European Research Council (ERC) Advanced Grant “ELVER” under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 789108), and additional funding from Google. The opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Air Force Research Laboratory (AFRL).

Installing CN

To fetch and install CN, check the Cerberus repository at https://github.com/rems-project/cerberus and follow the instructions in backend/cn/README.md.

Once completed, type cn --help in your terminal to ensure CN is installed and found by your system. This should print the list of available options CN can be executed with.

To apply CN to a C file, run cn CFILE.

Source files for the exercises and examples

The source files can be downloaded from here.

Basic usage

First example

For a first example, let’s look at a simple arithmetic function: add, shown below, takes two int arguments, x and y, and returns their sum.

int add(int x, int y)
{
  return x+y;
}

Running CN on the example produces an error message:

cn exercises/add_0.c
[1/1]: add
exercises/add_0.c:3:10: error: Undefined behaviour
  return x+y;
         ~^~
an exceptional condition occurs during the evaluation of an expression (§6.5#5)
Consider the state in /var/folders/_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_393431.html

CN rejects the program because it has C undefined behaviour, meaning it is not safe to execute. CN points to the relevant source location, the addition x+y, and paragraph §6.5#5 of the C language standard that specifies the undefined behaviour. It also puts a link to an HTML file with more details on the error to help in diagnosing the problem.

Inspecting this HTML report (as we do in a moment) gives us possible example values for x and y that cause the undefined behaviour and hint at the problem: for very large values for x and y, such as 1073741825 and 1073741824, the sum of x and y can exceed the representable range of a C int value: 1073741825 + 1073741824 = 2^31+1, so their sum is larger than the maximal int value, 2^31-1.

Here x and y are signed integers, and in C, signed integer overflow is undefined behaviour (UB). Hence, add is only safe to execute for smaller values. Similarly, large negative values of x and y can cause signed integer underflow, also UB in C. We therefore need to rule out too large values for x and y, both positive and negative, which we do by writing a CN function specification.

First function specification

Shown below is our first function specification, for add, with a precondition that constrains x and y such that the sum of x and y lies between -2147483648 and 2147483647, so within the representable range of a C int value.

int add(int x, int y)
/*@ requires let sum = (i64) x + (i64) y;
             -2147483648i64 <= sum; sum <= 2147483647i64; @*/
{
  return x+y;
}

In detail:

  • Function specifications are given using special /*@ ... @*/ comments, placed in-between the function argument list and the function body.

  • The keyword requires starts the precondition, a list of one or more CN conditions separated by semicolons.

  • In function specifications, the names of the function arguments, here x and y, refer to their initial values. (Function arguments are mutable in C.)

  • let sum = (i64) x + (i64) y is a let-binding, which defines sum as the value (i64) x + (i64) y in the remainder of the function specification.

  • Instead of C syntax, CN uses Rust-like syntax for integer types, such as u32 for 32-bit unsigned integers and i64 for signed 64-bit integers to make their sizes unambiguous. Here, x and y, of C-type int, have CN type i32.

  • To define sum we cast x and y to the larger i64 type, using syntax (i64), which is large enough to hold the sum of any two i32 values.

  • Finally, we require this sum to be in-between the minimal and maximal int value. Integer constants, such as -2147483648i64, must specifiy their CN type (i64).

Running CN on the annotated program passes without errors. This means with our specified precondition, add is safe to execute.

We may, however, wish to be more precise. So far the specification gives no information to callers of add about its output. To also specify the return values we add a postcondition, using the ensures keyword.

int add(int x, int y)
/*@ requires let sum = (i64) x + (i64) y;
             -2147483648i64 <= sum; sum <= 2147483647i64;
    ensures return == (i32) sum;
@*/
{
  return x+y;
}

Here we use the keyword return, only available in function postconditions, to refer to the return value, and equate it to sum as defined in the preconditions, cast back to i32 type: add returns the sum of x and y.

Running CN confirms that this postcondition also holds.

One final refinement of this example. CN defines constant functions MINi32, MAXi64, etc. so that specifications do not need to be littered with unreadable numeric constants.

int add(int x, int y)
/*@ requires let sum = (i64) x + (i64) y;
             (i64)MINi32() <= sum; sum <= (i64)MAXi32();
    ensures return == (i32) sum;
@*/
{
  return x+y;
}

Two things to note: * These are constant functions, so they require a following (). * The type of MINi32() is i32, so if we want to use it as a 64-bit constant we need to add the explicit coercion (i64).

Error reports

In the original example CN reported a type error due to C undefined behaviour. While that example was perhaps simple enough to guess the problem and solution, this can become quite challenging as program and specification complexity increases. Diagnosing type errors is therefore an important part of using CN. CN tries to help with that by producing detailed error information, in the form of an HTML error report.

Let’s return to the type error from earlier (add without precondition) and take a closer look at this report. The report comprises two sections.

*CN error report*
Figure 1. CN error report

Path. The first section, “Path to error”, contains information about the control-flow path leading to the error.

When type checking a C function, CN checks each possible control-flow path through the program individually. If CN detects UB or a violation of user-defined specifications, CN reports the problematic control-flow path, as a nested structure of statements: paths are split into sections, which group together statements between high-level control-flow positions (e.g. function entry, the start of a loop, the invocation of a continue, break, or return statement, etc.); within each section, statements are listed by source code location; finally, per statement, CN lists the typechecked sub-expressions, and the memory accesses and function calls within these.

In our example, there is only one possible control-flow path: entering the function body (section “function body”) and executing the block from lines 2 to 4, followed by the return statement at line 3. The entry for the latter contains the sequence of sub-expressions in the return statement, including reads of the variables x and y.

In C, local variables in a function, including its arguments, are mutable and their address can be taken and passed as a value. CN therefore represents local variables as memory allocations that are manipulated using memory reads and writes. Here, type checking the return statement includes checking memory reads for x and y, at their locations &ARG0 and &ARG1. The path report lists these reads and their return values: the read at &ARG0 returns x (that is, the value of x originally passed to add); the read at &ARG1 returns y. Alongside this symbolic information, CN displays concrete values:

  • 1073741825i32 /* 0x40000001 */ for x (the first value is the decimal representation, the second, in /*...*/ comments, the hex equivalent) and

  • 1073741824i32 /* 0x40000000 */ for y.

For now, ignore the pointer values {@0; 4} for x and {@0; 0} for y.

These concrete values are part of a counterexample: a concrete valuation of variables and pointers in the program that that leads to the error. (The exact values may vary on your machine, depending on the version of Z3 installed on your system.)

Proof context. The second section, below the error trace, lists the proof context CN has reached along this control-flow path.

“Available resources” lists the owned resources, as discussed in later sections.

“Variables” lists counterexample values for program variables and pointers. In addition to x and y, assigned the same values as above, this includes values for their memory locations &ARG0 and &ARG1, function pointers in scope, and the __cn_alloc_history, all of which we ignore for now.

Finally, “Constraints” records all logical facts CN has learned along the path. This includes user-specified assumptions from preconditions or loop invariants, value ranges inferred from the C-types of variables, and facts learned during the type checking of the statements. Here (add without precondition) the only constraints are some contraints inferred from C-types in the code.

  • For instance, good<signed int>(x) says that the initial value of x is a “good” signed int value (i.e. in range). Here signed int is the same type as int, CN just makes the sign explicit. For integer types T, good<T> requires the value to be in range of type T; for pointer types T it also requires the pointer to be aligned. For structs and arrays this extends in the obvious way to struct members or array cells.

  • repr<T> requires representability (not alignment) at type T, so repr<signed int*>(&ARGO), for instance, records that the pointer to x is representable at C-type signed int*;

  • aligned(&ARGO, 4u64), moreover, states that it is 4-byte aligned.

Another arithmetic example

Let’s apply what we know so far to another simple arithmetic example.

The function doubled, shown below, takes an int n, defines a as n incremented, b as n decremented, and returns the sum of the two.

int doubled (int n)
{
  int a = n+1;
  int b = n-1;
  return a+b;
}

We would like to verify this is safe, and that doubled returns twice the value of n. Running CN on doubled leads to a type error: the increment of a has undefined behaviour.

As in the first example, we need to ensure that n+1 does not overflow and n-1 does not underflow. Similarly also a+b has to be representable at int type.

int doubled (int n)
/*@ requires let n_ = (i64) n;
             (i64)MINi32() <= n_ - 1i64; n_ + 1i64 <= (i64)MAXi32();
             (i64)MINi32() <= n_ + n_; n_ + n_ <= (i64)MAXi32();
    ensures return == n * 2i32;
@*/
{
  int a = n+1;
  int b = n-1;
  return a+b;
}

We can specify these using a similar style of precondition as in the first example. We first define n_ as n cast to type i64 — i.e. a type large enough to hold n+1, n-1 and a+b for any possible i32 value for n. Then we specify that decrementing n_ does not go below the minimal int value, that incrementing n_ does not go above the maximal value, and that n doubled is also in range. These preconditions together guarantee safe execution.

To capture the functional behaviour, the postcondition specifies that return is twice the value of n.

Exercise

Quadruple. Specify the precondition needed to ensure safety of the C function quadruple, and a postcondition that describes its return value.

int quadruple (int n)
{
  int m = n + n;
  return m + m;
}

Abs. Give a specification to the C function abs, which computes the absolute value of a given int value. To describe the return value, use CN’s ternary “_ ? _ : _” operator. Given a boolean b, and expressions e1 and e2 of the same basetype, b ? e1 : e2 returns e1 if b holds and e2 otherwise.

int abs (int x)
{
  if (x >= 0) {
    return x;
  }
  else {
    return -x;
  }
}

Pointers and simple ownership

So far we’ve only considered example functions manipulating integer values. Verification becomes more interesting and challenging when pointers are involved, because the safety of memory accesses via pointers has to be verified.

CN uses separation logic resource types and the concept of ownership to reason about memory accesses. A resource is the permission to access a region of memory. Unlike logical constraints, resource ownership is unique, meaning resources cannot be duplicated.

Let’s look at a simple example. The function read takes an int pointer p and returns the pointee value.

int read (int *p)
{
  return *p;
}

Running CN on this example produces the following error:

cn exercises/read.c
[1/1]: read
exercises/read.c:3:10: error: Missing resource for reading
  return *p;
         ^~
Resource needed: Owned<signed int>(p)
Consider the state in /var/folders/_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_403624.html

For the read *p to be safe, ownership of a resource is missing: a resource Owned<signed int>(p).

The Owned resource type

Given a C-type T and pointer p, the resource Owned<T>(p) asserts ownership of a memory cell at location p of the size of C-type T. It is CN’s equivalent of a points-to assertion in separation logic (indexed by C-types T).

In this example we can ensure the safe execution of read by adding a precondition that requires ownership of Owned<int>(p), as shown below. For now ignore the notation take ... = Owned<int>(p). Since read maintains this ownership, we also add a corresponding postcondition, whereby read returns ownership of p after it is finished executing, in the form of another Owned<int>(p) resource.

int read (int *p)
/*@ requires take v1 = Owned<int>(p);
    ensures take v2 = Owned<int>(p);
@*/
{
  return *p;
}

This specifications means that

  • any function calling read has to be able to provide a resource Owned<int>(p) to pass into read, and

  • the caller will receive back a resource Owned<int>(p) when read returns.

Resource outputs

However, a caller of read may also wish to know that read actually returns the correct value, the pointee of p, and also that it does not change memory at location p. To phrase both we need a way to refer to the pointee of p.

In CN resources have outputs. Each resource outputs the information that can be derived from ownership of the resource. What information is returned is specific to the type of resource. A resource Owned<T>(p) (for some C-type T) outputs the pointee value of p, since that can be derived from the resource ownership: assume you have a pointer p and the associated ownership, then this uniquely determines the pointee value of p.

CN uses the take-notation seen in the example above to refer to the output of a resource, introducing a new name binding for the output. The precondition take v1 = Owned<int>(p) from the precondition does two things: (1) it assert ownership of resource Owned<int>(p), and (2) it binds the name v1 to the resource output, here the pointee value of p at the start of the function. Similarly, the postcondition introduces the name v2 for the pointee value on function return.

That means we can use the resource outputs from the pre- and postcondition to strengthen the specification of read as planned. We add two new postconditions: we specify

  1. that read returns v1 (the initial pointee value of p), and

  2. that the pointee values v1 and v2 before and after execution of read (respectively) are the same.

int read (int *p)
/*@ requires take v1 = Owned<int>(p);
    ensures take v2 = Owned<int>(p);
            return == v1;
            v1 == v2;
@*/
{
  return *p;
}

Aside. In standard separation logic the equivalent specification for read could have been phrased as follows (where return binds the return value in the postcondition):

∀p.
∀v1. { p ↦ v1 }
     read(p)
     { return. ∃v2. (p ↦ v2) /\ return = v1 /\ v1 = v2 }

CN’s take notation is just alternative syntax for quantification over the values of resources, but a useful one: the take notation syntactically restricts how these quantifiers can be used to ensure CN can always infer them.

Exercises

Quadruple. Specify the function quadruple_mem, that is similar to the earlier quadruple function, except that the input is passed as an int pointer. Write a specification that takes ownership of this pointer on entry and returns this ownership on exit, leaving the pointee value unchanged.

int quadruple_mem (int *p)
{
  int m = *p + *p;
  return m + m;
}

Abs. Give a specification to the function abs_mem, which computes the absolute value of a number passed as an int pointer.

int abs_mem (int *p)
{
  int x = *p;
  if (x >= 0) {
    return x;
  }
  else {
    return -x;
  }
}

Linear resource ownership

In the specifications we have written so far, functions that receive resources as part of their precondition also return this ownership in their postcondition.

Let’s try the read example from earlier again, but with a postcondition that does not return the ownership:

int read (int *p)
/*@ requires take v1 = Owned<int>(p); @*/
{
  return *p;
}

CN rejects this program with the following message:

cn build/exercises/read.broken.c
[1/1]: read
build/exercises/read.broken.c:4:3: error: Left-over unused resource 'Owned<signed int>(p)(v1)'
  return *p;
  ^~~~~~~~~~
Consider the state in /var/folders/_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_17eb4a.html

CN has typechecked the function, verified that it is safe to execute under the precondition (given ownership Owned<int>(p)), and that the function (vacuously) satisfies its postcondition. But, following the check of the postcondition it finds that not all resources have been “used up”.

Given the above specification, read leaks memory: it takes ownership, does not return it, but also does not deallocate the owned memory or otherwise dispose of it. In CN this is a type error.

CN’s resource types are linear (as opposed to affine). This means not only that resources cannot be duplicated, but also that resources cannot simply be dropped or “forgotten”. Every resource passed into a function has to either be used up by it, by returning it or passing it to another function that consumes it, or destroyed, by deallocating the owned area of memory (as we shall see later).

CN’s motivation for linear tracking of resources is its focus on low-level systems software. CN checks C programs, in which, unlike higher-level garbage-collected languages, memory is managed manually, and memory leaks are typically very undesirable.

As a consequence, function specifications have to do precise “book-keeping” of their resource footprint, and, in particular, return any unused resources back to the caller.

The Block resource type

Aside from the Owned resource seen so far, CN has another built-in resource type: Block. Given a C-type T and pointer p, Block<T>(p) asserts the same ownership as Owned<T>(p) — so ownership of a memory cell at p the size of type T — but in contrast to Owned, Block memory is not necessarily initialised.

CN uses this distinction to prevent reads from uninitialised memory:

  • A read at C-type T and pointer p requires a resource Owned<T>(p), i.e., ownership of initialised memory at the right C-type. The load returns the Owned resource unchanged.

  • A write at C-type T and pointer p needs only a Block<T>(p) (so, unlike reads, writes to uninitialised memory are fine). The write consumes ownership of the Block resource (it destroys it) and returns a new resource Owned<T>(p) with the value written as the output. This means the resource returned from a write records the fact that this memory cell is now initialised and can be read from.

Since Owned carries the same ownership as Block, just with the additional information that the Owned memory is initalised, a resource Owned<T>(p) is “at least as good” as Block<T>(p) — an Owned<T>(p) resource can be used whenever Block<T>(p) is needed. For instance CN’s type checking of a write to p requires a Block<T>(p), but if an Owned<T>(p) resource is what is available, this can be used just the same. This allows an already-initialised memory cell to be over-written again.

Unlike Owned, whose output is the pointee value, Block has no meaningful output: its output is void/unit.

Write example

Let’s explore resources and their outputs in another example. The C function incr takes an int pointer p and increments the pointee value.

void incr (int *p)
/*@ requires take v1 = Owned<int>(p);
             ((i64) v1) + 1i64 <= (i64)MAXi32();
    ensures take v2 = Owned<int>(p);
            v2 == v1+1i32;
@*/
{
  *p = *p+1;
}

In the precondition we assert ownership of resource Owned<int>(p), binding its output/pointee value to v1, and use v1 to specify that p must point to a sufficiently small value at the start of the function not to overflow when incremented. The postcondition asserts ownership of p with output v2, as before, and uses this to express that the value p points to is incremented by incr: v2 == v1+1i32.

If we incorrectly tweaked this specification and used Block<int>(p) instead of Owned<int>(p) in the precondition, as below, then CN would reject the program.

void incr (int *p)
/*@ requires take u = Block<int>(p); 
    ensures take v = Owned<int>(p); 
@*/
{
  *p = *p+1;
}

CN reports:

build/solutions/slf0_basic_incr.signed.broken.c:6:11: error: Missing resource for reading
  int n = *p;
          ^~
Resource needed: Owned<signed int>(p)
Consider the state in /var/folders/_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_5da0f3.html

The Owned<int>(p) resource required for reading is missing, since, as per precondition, only Block<int>(p) is available. Checking the linked HTML file confirms this. Here the section “Available resources” lists all resource ownership at the point of the failure:

  • Block<signed int>(p)(u), so ownership of uninitialised memory at location p; the output is a void/unit value u (specified in the second pair of parentheses)

  • Owned<signed int*>(&ARG0)(p), the ownership of (initialised) memory at location &ARG0, so the memory location where the first function argument is stored; its output is the pointer p (not to be confused with the pointee of p); and finally

  • __CN_Alloc(&ARG0)(void) is a resource that records allocation information for location &ARG0; this is related to CN’s memory-object semantics, which we ignore for the moment.

Exercises

Zero. Write a specification for the function zero, which takes a pointer to uninitialised memory and initialises it to 0.

void zero (int *p)
{
  *p = 0;
}

In-place double. Give a specification for the function inplace_double, which takes an int pointer p and doubles the pointee value: specify the precondition needed to guarantee safe execution and a postcondition that captures the function’s behaviour.

void inplace_double (int *p)
{
  int n = *p;
  int m = n + n;
  *p = m;
}

Multiple owned pointers

When functions manipulate multiple pointers, we can assert their ownership just like before. However (as in standard separation logic) pointer ownership is unique, so simultaneous ownership of Owned or Block resources for two pointers requires these pointers to be disjoint.

The following example shows the use of two Owned resources for accessing two different pointers: function add reads two int values in memory, at locations p and q, and returns their sum.

unsigned int add (unsigned int *p, unsigned int *q)
/*@ requires take m = Owned<unsigned int>(p);
             take n = Owned<unsigned int>(q);
    ensures take m2 = Owned<unsigned int>(p);
            take n2 = Owned<unsigned int>(q);
            m == m2 && n == n2;
            return == m + n;
@*/
{
  unsigned int m = *p;
  unsigned int n = *q;
  return m+n;
}

This time we use C’s unsigned int type. In C, over- and underflow of unsigned integers is not undefined behaviour, so we do not need any special preconditions to rule this out. Instead, when an arithmetic operation at unsigned type goes outside the representable range, the value “wraps around”.

The CN variables m and n (resp. m2 and n2) for the pointee values of p and q before (resp. after) the execution of add have CN basetype u32, so unsigned 32-bit integers, matching the C unsigned int type. Like C’s unsigned integer arithmetic, CN unsigned int values wrap around when exceeding the value range of the type.

Hence, the postcondition return == m+n holds also when the sum of m and n is greater than the maximal unsigned int value.

In the following we will sometimes use unsigned integer types to focus on specifying memory ownership, rather than the conditions necessary to show absence of C arithmetic undefined behaviour.

Exercises

Swap. Specify the function swap, which takes two owned unsigned int pointers and swaps their values.

void swap (unsigned int *p, unsigned int *q)
{
  unsigned int m = *p;
  unsigned int n = *q;
  *p = n;
  *q = m;
}

Transfer. Write a specification for the function transfer, shown below.

void transfer (unsigned int *p, unsigned int *q)
{
  unsigned int n = *p;
  unsigned int m = *q;
  unsigned int s = n + m;
  *p = s;
  *q = 0;
}

Ownership of compound objects

So far all examples have worked with just integers and pointers, but larger programs typically also manipulate compound values, often represented using C struct types. Specifying functions manipulating structs works in much the same way as with basic types.

For instance, the following example uses a struct point to represent a point in two-dimensional space. The function transpose swaps a point’s x and y coordinates.

struct point { int x; int y; };

void transpose (struct point *p) 
/*@ requires take s = Owned<struct point>(p);
    ensures take s2 = Owned<struct point>(p);
            s2.x == s.y;
            s2.y == s.x;
@*/
{
  int temp_x = p->x;
  int temp_y = p->y;
  p->x = temp_y;
  p->y = temp_x;
}

Here the precondition asserts ownership for p, at type struct point; the output s is a value of CN type struct point, i.e. a record with members i32 x and i32 y. The postcondition similarly asserts ownership of p, with output s2, and asserts the coordinates have been swapped, by referring to the members of s and s2 individually.

Compound Owned and Block resources

While one might like to think of a struct as a single (compound) object that is manipulated as a whole, C permits more flexible struct manipulation: given a struct pointer, programmers can construct pointers to individual struct members and pass these as values, even to other functions.

CN therefore cannot treat resources for compound C types, such as structs, as primitive, indivisible units. Instead, Owned<T> and Block<T> are defined inductively in the structure of the C-type T.

For struct types T, the Owned<T> resource is defined as the collection of Owned resources for its members (as well as Block resources for any padding bytes in-between). The resource Block<T>, similarly, is made up of Block resources for all members (and padding bytes).

To handle code that manipulates pointers into parts of a struct object, CN can automatically decompose a struct resource into the member resources, and recompose it, as needed. The following example illustrates this.

Recall the function zero from our earlier exercise. It takes an int pointer to uninitialised memory, with Block<int> ownership, and initialises the value to zero, returning an Owned<int> resource with output 0.

Now consider the function init_point, shown below, which takes a pointer p to a struct point and zero-initialises its members by calling zero twice, once with a pointer to struct member x, and once with a pointer to y.

void zero (int *p) 
/*@ requires take u = Block<int>(p);
    ensures take v = Owned<int>(p);
            v == 0i32; @*/
{
  *p = 0;
}

struct point { int x; int y; };

void init_point(struct point *p) 
/*@ requires take s = Block<struct point>(p);
    ensures take s2 = Owned<struct point>(p);
            s2.x == 0i32;
            s2.y == 0i32;
@*/
{
  zero(&p->x);
  zero(&p->y);
}

As stated in its precondition, init_point receives ownership Block<struct point>(p). The zero function, however, works on int pointers and requires Block<int> ownership.

CN can prove the calls to zero with &p->x and &p->y are safe because it decomposes the Block<struct point>(p) into two Block<int>, one for member x, one for member y. Later, the reverse happens: following the two calls to zero, as per zero’s precondition, init_point has ownership of two adjacent Owned<int> resources – ownership for the two struct member pointers, with the member now initialised. Since the postcondition of init_point requires ownership Owned<struct point>(p), CN combines these back into a compound resource. The resulting Owned<point struct> resource has for an output the struct value s2 that is composed of the zeroed member values for x and y.

Resource inference

To handle the required resource inference, CN “eagerly” decomposes all struct resources into resources for the struct members, and “lazily” re-composes them as needed.

We can see this if, for instance, we experimentally change the transpose example from above to force a type error. Let’s insert an /*@ assert(false) @*/ CN assertion in the middle of the transpose function (more on CN assertions later), so we can inspect CN’s proof context shown in the error report.

struct point { int x; int y; };

void transpose (struct point *p) 
/*@ requires take s = Owned<struct point>(p); 
    ensures take s2 = Owned<struct point>(p); 
            s2.x == s.y;
            s2.y == s.x; 
@*/
{
  int temp_x = p->x;
  int temp_y = p->y;
  /*@ assert(false); @*/
  p->x = temp_y;
  p->y = temp_x;
}

The precondition of transpose asserts ownership of an Owned<struct point>(p) resource. The error report now instead lists under “Available resources” two resources:

  • Owned<signed int>(member_shift<point>(p, x)) with output s.x and

  • Owned<signed int>(member_shift<point>(p, y)) with output s.y

Here member_shift<s>(p,m) is the CN expression that constructs, from a struct s pointer p, the “shifted” pointer for its member m.

When the function returns the two member resources are recombined “on demand” to satisfy the postcondition Owned<struct point>(p).

Exercises

Init point. Insert CN assert(false) statements in different statement positions of init_point and check how the available resources evolve.

Transpose (again). Recreate the transpose function from before, now using the swap function verified earlier (for struct upoint, with unsigned member values).

void swap (unsigned int *p, unsigned int *q)
/*@ requires take v = Owned<unsigned int>(p);
             take w = Owned<unsigned int>(q);
    ensures  take v2 = Owned<unsigned int>(p);
             take w2 = Owned<unsigned int>(q);
             v2 == w && w2 == v;
@*/
{
  unsigned int m = *p;
  unsigned int n = *q;
  *p = n;
  *q = m;
}

struct upoint { unsigned int x; unsigned int y; };

void transpose2 (struct upoint *p)

Arrays and loops

Another common datatype in C is arrays. Reasoning about memory ownership for arrays is more difficult than for the datatypes we have seen so far: C allows the programmer to access arrays using computed pointers, and the size of an array does not need to be known as a constant at compile time.

To support reasoning about code manipulating arrays and computed pointers, CN has iterated resources. For instance, to specify ownership of an int array with 10 cells starting at pointer p, CN uses the iterated resource

each (i32 i; 0i32 <= i && i < 10i32)
     { Owned<int>(array_shift<int>(p,i)) }

In detail, this can be read as follows:

  • for each integer i of CN type i32, …

  • if i is between 0 and 10, …

  • assert ownership of a resource Owned<int>

  • for cell i of the array with base-address p.

Here array_shift<int>(p,i) computes a pointer into the array at pointer p, appropriately offset for index i.

In general, iterated resource specifications take the form

each (BT Q; GUARD) { RESOURCE }

comprising three parts:

  • BT Q, for some CN type BT and name Q, introduces the quantifier Q of basetype BT, which is bound in GUARD and RESOURCE;

  • GUARD is a boolean-typed expression delimiting the instances of Q for which ownership is asserted; and

  • RESOURCE is any non-iterated CN resource.

First array example

Let’s see how this applies to a first example of an array-manipulating function. Function read takes three arguments: the base pointer p of an int array, the length n of the array, and an index i into the array; read then returns the value of the i-th array cell.

int read (int *p, int n, int i)
/*@ requires take a1 = each(i32 j; 0i32 <= j && j < n) { Owned<int>(array_shift<int>(p,j)) };
             0i32 <= i && i < n; 
    ensures take a2 = each(i32 j; 0i32 <= j && j < n) { Owned<int>(array_shift<int>(p,j)) }; @*/
{
  return p[i];
}

The CN precondition requires

  • ownership of the array on entry — one Owned<int> resource for each array index between 0 and n — and

  • that i lies within the range of owned indices.

On exit the array ownership is returned again.

This specification, in principle, should ensure that the access p[i] is safe. However, running CN on the example produces an error: CN is unable to find the required ownership for reading p[i].

cn build/solutions/array_load.broken.c
[1/1]: read
build/solutions/array_load.broken.c:5:10: error: Missing resource for reading
  return p[i];
         ^~~~
Resource needed: Owned<signed int>(array_shift<signed int>(p, (u64)i))

The reason is that when searching for a required resource, such as the Owned resource for p[i] here, CN’s resource inference does not consider iterated resources. Quantifiers, as used by iterated resources, can make verification undecidable, so, in order to maintain predictable type checking, CN delegates this aspect of the reasoning to the user.

To make the Owned resource required for accessing p[i] available to CN’s resource inference we have to “extract” ownership for index i out of the iterated resource.

int read (int *p, int n, int i)
/*@ requires take a1 = each(i32 j; 0i32 <= j && j < n) { 
                          Owned<int>(array_shift<int>(p,j)) };
             0i32 <= i && i < n;
    ensures take a2 = each(i32 j; 0i32 <= j && j < n) { 
                            Owned<int>(array_shift<int>(p,j)) }; @*/
{
  /*@ extract Owned<int>, i; @*/
  return p[i];
}

Here the CN comment /*@ extract Owned<int>, i; @*/ is a CN “ghost statement”/proof hint that instructs CN to instantiate any available iterated Owned<int> resource for index i. In our example this operation splits the iterated resource into two:

each(i32 j; 0i32 <= j && j < n) { Owned<int>(array_shift<int>(p,j)) }

is split into

  1. the instantiation of the iterated resource at i

    Owned<int>(array_shift<int>(p,i))
  2. the remainder of the iterated resource, the ownership for all indices except i

    each(i32 j; 0i32 <= j && j < n && j != i)
        { Owned<int>(array_shift<int>(p,j)) }

After this extraction step, CN can use the (former) extracted resource to justify the access p[i].

Following an extract statement, CN moreover remembers the extracted index and can automatically “reverse” the extraction when needed: after type checking the access p[i] CN must ensure the function’s postcondition holds, which needs the full array ownership again (including the extracted index i); remembering the index i, CN then automatically merges resources (1) and (2) again to obtain the required full array ownership, and completes the verification of the function.

So far the specification only guarantees safe execution but does not specify the behaviour of read. To address this, let’s return to the iterated resources in the function specification. When we specify take a1 = each ... here, what is a1? In CN, the output of an iterated resource is a map from indices to resource outputs. In this example, where index j has CN type i32 and the iterated resource is Owned<int>, the output a1 is a map from i32 indices to i32 values — CN type map<i32,i32>. (If the type of j was i64 and the resource Owned<char>, a1 would have type map<i64,u8>.)

We can use this to refine our specification with information about the functional behaviour of read.

int read (int *p, int n, int i)
/*@ requires take a1 = each(i32 j; 0i32 <= j && j < n) { Owned<int>(array_shift<int>(p,j)) };
             0i32 <= i && i < n;
    ensures take a2 = each(i32 j; 0i32 <= j && j < n) { Owned<int>(array_shift<int>(p,j)) };
            a1 == a2;
            return == a1[i];
@*/
{
  /*@ extract Owned<int>, i; @*/
  return p[i];
}

We specify that read does not change the array — the outputs a1 and a2, taken before and after running the function, are the same — and that the value returned is a1[i], a1 at index i.

Exercises

Array read two. Specify and verify the following function, array_read_two, which takes the base pointer p of an unsigned int array, the array length n, and two indices i and j. Assuming i and j are different, it returns the sum of the values at these two indices.

unsigned int array_read_two (unsigned int *p, int n, int i, int j)
{
  unsigned int tmp1 = p[i];
  unsigned int tmp2 = p[j];
  return (tmp1 + tmp2);
}

Swap array. Specify and verify swap_array, which swaps the values of two cells of an int array. Assume again that i and j are different, and describe the effect of swap_array on the array value using the CN map update expression a[i:v], which denotes the same map as a, except with index i updated to v.

void swap_array (int *p, int n, int i, int j)
{
  int tmp = p[i];
  p[i] = p[j];
  p[j] = tmp;
}

Loops

The array examples covered so far manipulate one or two individual cells of an array. Another typical pattern in code working over arrays is to loop, uniformly accessing all cells of an array, or sub-ranges of it.

In order to verify code with loops, CN requires the user to supply loop invariants — CN specifications of all owned resources and the constraints required to verify each iteration of the loop.

Let’s take a look at a simple first example. The following function, init_array, takes the base pointer p of a char array and the array length n and writes 0 to each array cell. .exercises/init_array.c

void init_array (char *p, unsigned int n)
/*@ requires take a1 = each(u32 i; i < n) { Owned<char>( array_shift<char>(p, i)) };
    ensures  take a2 = each(u32 i; i < n) { Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  {
    p[j] = 0;
    j++;
  }
}

If, for the moment, we focus just on proving safe execution of init_array, ignoring its functional behaviour, a specification might look as above: on entry init_array takes ownership of an iterated Owned<char> resource — one Owned resource for each index i of type u32 (so necessarily greater or equal to 0) up to n; on exit init_array returns the ownership.

To verify this, we have to supply a loop invariant that specifies all resource ownership and the necessary constraints that hold before and after each iteration of the loop. Loop invariants are specified using the keyword inv, followed by CN specifications using the same syntax as in function pre- and postconditions. The variables in scope for loop invariants are all in-scope C variables, as well as CN variables introduced in the function precondition. In loop invariants, the name of a C variable refers to its current value (more on this shortly).

void init_array (char *p, unsigned int n)
/*@ requires take a1 = each(u32 i; i < n) { Owned<char>( array_shift<char>(p, i)) };
    ensures  take a2 = each(u32 i; i < n) { Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  /*@ inv take ai = each(u32 i; i < n) { Owned<char>( array_shift<char>(p, i)) };
          {p} unchanged; {n} unchanged;
  @*/
  {
    /*@ extract Owned<char>, j; @*/
    p[j] = 0;
    j++;
  }
}

The main condition here is unsurprising: we specify ownership of an iterated resource for an array just like in the the pre- and postcondition.

The second thing we need to do, however, is less straightforward. Recall that, as discussed at the start of the tutorial, function arguments in C are mutable, and so CN permits this as well.While in this example it is obvious that p and n do not change, CN currently requires the loop invariant to explicitly state this, using special notation {p} unchanged (and similarly for n).

Note. If we forget to specify unchanged, this can lead to confusing errors. In this example, for instance, CN would verify the loop against the loop invariant, but would be unable to prove a function postcondition seemingly directly implied by the loop invariant (lacking the information that the postcondition’s p and n are the same as the loop invariant’s). Future CN versions may handle loop invariants differently and treat variables as immutable by default.

The final piece needed in the verification is an extract statement, as used in the previous examples: to separate the individual Owned<char> resource for index j out of the iterated Owned resource and make it available to the resource inference, we specify extract Owned<char>, j;.

With the extract statements in place, CN accepts the function.

Second loop example

However, on closer look, the specification of init_array is overly strong: it requires an iterated Owned resource for the array on entry. If, as the name suggests, the purpose of init_array is to initialise the array, then a precondition asserting only an iterated Block resource for the array should also be sufficient. The modified specification is then as follows.

void init_array2 (char *p, unsigned int n)
/*@ requires take a1 = each(u32 i; i < n) { Block<char>( array_shift<char>(p, i)) };
    ensures  take a2 = each(u32 i; i < n) { Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  {
    p[j] = 0;
    j++;
  }
}

This specification should hold: assuming ownership of an uninitialised array on entry, each iteration of the loop initialises one cell of the array, moving it from Block to Owned “state”, so that on function return the full array is initialised. (Recall that stores only require Block ownership of the written memory location, so ownership of not-necessarily-initialised memory.)

To verify this modified example we again need a loop invariant. This time, the loop invariant is more involved, however: since each iteration of the loop initialises one more array cell, the loop invariant has to do precise book-keeping of the initialisation status of the array.

To do so, we partition the array ownership into two parts: for each index of the array the loop has already visited, we have an Owned resource, for all other array indices we have the (unchanged) Block ownership.

void init_array2 (char *p, unsigned int n)
/*@ requires take a1 = each(u32 i; i < n) { Block<char>( array_shift<char>(p, i)) };
    ensures  take a2 = each(u32 i; i < n) { Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  /*@ inv take al = each(u32 i; i < j) { Owned<char>( array_shift<char>(p, i)) };
          take ar = each(u32 i; j <= i && i < n) { Block<char>( array_shift<char>(p, i)) };
          {p} unchanged; {n} unchanged;
          j <= n;
  @*/
  {
    /*@ extract Block<char>, j; @*/
    /*@ extract Owned<char>, j; @*/
    p[j] = 0;
    j++;
  }
}

Let’s go through this line-by-line:

  • We assert ownership of an iterated Owned resource, one for each index i strictly smaller than loop variable j.

  • All remaining indices i, between j and n are still uninitialised, so part of the iterated Block resource.

  • As in the previous example, we assert p and n are unchanged.

  • Finally, unlike in the previous example, this loop invariant involves j. We therefore also need to know that j does not exceed the array length n. Otherwise CN would not be able to prove that, on completing the last loop iteration, j=n holds. This, in turn, is needed to show that when the function returns, ownership of the iterated Owned resource --- as specified in the loop invariant --- is fully consumed by the function’s post-condition and there is no left-over unused resource.

As before, we also have to instruct CN to extract ownership of individual array cells out of the iterated resources:

  • to allow CN to extract the individual Block to be written we use extract Block<char>, j;;

  • the store returns a matching Owned<char> resource for index j;

  • finally, we put extract Owned<char>, j; to allow CN to “attach” this resource to the iterated Owned resource. CN issues a warning, because nothing is, in fact, extracted: we are using extract only for the “reverse” direction.

Exercises

Init array reverse. Verify the function init_array_rev, which has the same specification as init_array2, but initializes the array in decreasing index order (from right to left).

void init_array2 (char *p, unsigned int n)
/*@ requires take a1 = each(u32 i; i < n) { Block<char>( array_shift<char>(p, i)) };
    n > 0u32;
    ensures  take a2 = each(u32 i; i < n) { Owned<char>( array_shift<char>(p, i)) };
@*/
{
  unsigned int j = 0;
  while (j < n)
  {
    p[n-(j+1)] = 0;
    j++;
  }
}

Defining Predicates

Suppose we want to write a function that takes two pointers to integers and increments the contents of both of them.

First, let’s deal with the "normal" case where the two arguments do not alias…​

void incr2a (unsigned int *p, unsigned int *q)
/*@ requires take pv = Owned<unsigned int>(p);
             take qv = Owned<unsigned int>(q);
    ensures take pv_ = Owned<unsigned int>(p);
            take qv_ = Owned<unsigned int>(q);
            pv_ == pv + 1u32;
            qv_ == qv + 1u32;
@*/
{
  unsigned int n = *p;
  unsigned int m = n+1;
  *p = m;
  n = *q;
  m = n+1;
  *q = m;
}

But what if they do alias? The clunky solution is to write a whole different version of incr2 with a different embedded specification…​

void incr2b (unsigned int *p, unsigned int *q)
/*@ requires take pv = Owned<unsigned int>(p);
             ptr_eq(q,p);
    ensures take pv_ = Owned<unsigned int>(p);
            ptr_eq(q,p);
            pv_ == pv + 2u32;
@*/
{
  unsigned int n = *p;
  unsigned int m = n+1;
  *p = m;
  n = *q;
  m = n+1;
  *q = m;
}

#include "slf_incr2_noalias.c"

void call_both (unsigned int *p, unsigned int *q)
/*@ requires take pv = Owned<unsigned int>(p);
             take qv = Owned<unsigned int>(q);
    ensures take pv_ = Owned<unsigned int>(p);
            take qv_ = Owned<unsigned int>(q);
            pv_ == pv + 3u32;
            qv_ == qv + 1u32;
@*/
{
  incr2a(p, q);
  incr2b(p, p);
}

This is horrible. Much better is to define a predicate to use in the pre- and postconditions that captures both cases together:

/*@
predicate { u32 pv, u32 qv } BothOwned (pointer p, pointer q)
{
  if (ptr_eq(p,q)) {
    take pv = Owned<unsigned int>(p);
    return {pv: pv, qv: pv};
  }
  else {
    take pv = Owned<unsigned int>(p);
    take qv = Owned<unsigned int>(q);
    return {pv: pv, qv: qv};
  }
}
@*/

void incr2 (unsigned int *p, unsigned int *q)
/*@ requires take vs = BothOwned(p,q);
    ensures take vs_ = BothOwned(p,q);
            vs_.pv == (!ptr_eq(p,q) ? (vs.pv + 1u32) : (vs.pv + 2u32));
            vs_.qv == (!ptr_eq(p,q) ? (vs.qv + 1u32) : vs_.pv);
@*/
{
  /*@ split_case ptr_eq(p,q); @*/
  unsigned int n = *p;
  unsigned int m = n+1;
  *p = m;
  n = *q;
  m = n+1;
  *q = m;
}

void call_both_better (unsigned int *p, unsigned int *q)
/*@ requires take pv = Owned<unsigned int>(p);
             take qv = Owned<unsigned int>(q);
             !ptr_eq(p,q);
    ensures take pv_ = Owned<unsigned int>(p);
            take qv_ = Owned<unsigned int>(q);
            pv_ == pv + 3u32;
            qv_ == qv + 1u32;
@*/
{
  incr2(p, q);
  incr2(p, p);
}

Allocating and Deallocating Memory

At the moment, CN does not understand the malloc and free functions. They are a bit tricky because they rely on a bit of polymorphism and a typecast between char* — the result type of malloc and argument type of free — and the actual type of the object being allocated or deallocated.

However, for any given type, we can define a type-specific function that allocates heap storage with exactly that type. The implementation of this function cannot be checked by CN, but we can give just the spec, together with a promise to link against an external C library providing the implementation:

extern int *mallocInt ();
/*@ spec mallocInt();
    requires true;
    ensures take v = Block<int>(return);
@*/


extern unsigned int *mallocUnsignedInt ();
/*@ spec mallocUnsignedInt();
    requires true;
    ensures take v = Block<unsigned int>(return);
@*/

(Alternatively we can include an implementation written in arbitrary C inside a CN file by marking it with the keyword trusted at the top of its CN specification.)

Similarly: .exercises/free.h

extern void freeInt (int *p);
/*@ spec freeInt(pointer p);
    requires take v = Block<int>(p);
    ensures true;
@*/


extern void freeUnsignedInt (unsigned int *p);
/*@ spec freeUnsignedInt(pointer p);
    requires take v = Block<unsigned int>(p);
    ensures true;
@*/

Now we can write code that allocates and frees memory: .exercises/slf17_get_and_free.c

#include "free.h"

unsigned int get_and_free (unsigned int *p)
/*@ requires take v1_ = Owned(p);
    ensures return == v1_; @*/
{
  unsigned int v = *p;
  freeUnsignedInt (p);
  return v;
}

We can also define a "safer", ML-style version of malloc that handles both allocation and initialization:

extern unsigned int *refUnsignedInt (unsigned int v);
/*@ spec refUnsignedInt(u32 v);
    requires true;
    ensures take vr = Owned(return);
            vr == v;
@*/


extern int *refInt (int v);
/*@ spec refInt(i32 v);
    requires true;
    ensures take vr = Owned(return);
            vr == v;
@*/
#include "free.h"
#include "ref.h"
#include "slf0_basic_incr.c"

unsigned int succ_using_incr (unsigned int n)
/*@ ensures return == n + 1u32; @*/
{
  unsigned int *p = refUnsignedInt(n);
  incr(p);
  unsigned int x = *p;
  freeUnsignedInt(p);
  return x;
}

Exercises

Prove a specification for the following program that reveals only that it returns a pointer to a number that is greater than the number pointed to by its argument.

#include "malloc.h"

unsigned int *ref_greater_abstract (unsigned int *p)
{
  unsigned int* q = mallocUnsignedInt();
  *q = *p + 1;
  return q;
}

Side Note

Here is another syntax for external / unknown functions, together with an example of a loose specification:

unsigned int val_rand (unsigned int n);
/*@ spec val_rand(u32 n);
    requires n > 0u32;
    ensures 0u32 <= return && return < n;
@*/

unsigned int two_dice ()
/*@ ensures 2u32 <= return && return <= 12u32; @*/
{
  unsigned int n1 = val_rand (6);
  unsigned int n2 = val_rand (6);
  unsigned int s = n1 + n2;
  return s + 2;
}

Lists

Now it’s time to look at some more interesting heap structures.

To begin with, here is a C definition for linked list cells, together with allocation and deallocation functions:

struct int_list {
  int head;
  struct int_list* tail;
};

extern struct int_list *mallocIntList();
/*@ spec mallocIntList();
    requires true;
    ensures take u = Block<struct int_list>(return);
@*/

extern void freeIntList (struct int_list *p);
/*@ spec freeIntList(pointer p);
    requires take u = Block<struct int_list>(p);
    ensures true;
@*/

To write specifications for C functions that manipulate lists, we need to define a CN "predicate" that describes mathematical list structures, as one would do in ML, Haskell, or Coq. (We call them "sequences" here to avoid overloading the word "list".)

Intuitively, the IntList predicate walks over a pointer structure in the C heap and extracts an Owned version of the mathematical list that it represents.

/*@
datatype seq {
  Seq_Nil {},
  Seq_Cons {i32 head, datatype seq tail}
}

predicate (datatype seq) IntList(pointer p) {
  if (is_null(p)) {
    return Seq_Nil{};
  } else {
    take H = Owned<struct int_list>(p);
    take tl = IntList(H.tail);
    return (Seq_Cons { head: H.head, tail: tl });
  }
}
@*/

We can also write specification-level "functions" by ordinary functional programming (in slightly strange, unholy-union-of-C-and-ML syntax):

/*@
function (i32) hd (datatype seq xs) {
  match xs {
    Seq_Nil {} => {
      0i32
    }
    Seq_Cons {head : h, tail : _} => {
      h
    }
  }
}

function (datatype seq) tl (datatype seq xs) {
  match xs {
    Seq_Nil {} => {
      Seq_Nil {}
    }
    Seq_Cons {head : _, tail : tail} => {
      tail
    }
  }
}
@*/

We use the IntList predicate to specify functions returning the empty list and the cons of a number and a list.

struct int_list* IntList_nil()
/*@ ensures take ret = IntList(return);
            ret == Seq_Nil{};
 @*/
{
  return 0;
}

struct int_list* IntList_cons(int h, struct int_list* t)
/*@ requires take l = IntList(t);
    ensures take ret = IntList(return);
            ret == Seq_Cons{ head: h, tail: l};
 @*/
{
  struct int_list *p = mallocIntList();
  p->head = h;
  p->tail = t;
  return p;
}

Finally, we can collect all this stuff into a single header file and add the usual C #ifndef gorp to avoid complaints from the compiler if it happens to get included twice from the same source file later.

#ifndef _LIST_H
#define _LIST_H

#include "list_c_types.h"
#include "list_cn_types.h"
#include "list_hdtl.h"
#include "list_constructors.h"

#endif //_LIST_H

Append

With this basic infrastructure in place, we can start specifying and verifying list-manipulating functions. First, append.

Here is its specification (in a separate file, because we’ll want to use it multiple times below.)

// append.h

/*@
function [rec] (datatype seq) append(datatype seq xs, datatype seq ys) {
  match xs {
    Seq_Nil {} => {
      ys
    }
    Seq_Cons {head : h, tail : zs}  => {
      Seq_Cons {head: h, tail: append(zs, ys)}
    }
  }
}
@*/

Here is a simple destructive append function. Note the two uses of the unfold annotation in the body, which are needed to help the CN typechecker.

#include "list.h"
#include "list_append.h"

struct int_list* IntList_append(struct int_list* xs, struct int_list* ys)
/*@ requires take L1 = IntList(xs); @*/
/*@ requires take L2 = IntList(ys); @*/
/*@ ensures take L3 = IntList(return); @*/
/*@ ensures L3 == append(L1, L2); @*/
{
  if (xs == 0) {
    /*@ unfold append(L1, L2); @*/
    return ys;
  } else {
    /*@ unfold append(L1, L2); @*/
    struct int_list *new_tail = IntList_append(xs->tail, ys);
    xs->tail = new_tail;
    return xs;
  }
}

List copy

Here is an allocating list copy function with a pleasantly light annotation burden.

#include "list.h"

struct int_list* IntList_copy (struct int_list *xs)
/*@ requires take L1 = IntList(xs);
    ensures take L1_ = IntList(xs);
            take L2 = IntList(return);
            L1 == L1_;
            L1 == L2;
@*/
{
  if (xs == 0) {
    return IntList_nil();
  } else {
    struct int_list *new_tail = IntList_copy(xs->tail);
    return IntList_cons(xs->head, new_tail);
  }
}

Merge sort

Finally, here is a slightly tricky in-place version of merge sort that avoids allocating any new list cells in the splitting step by taking alternate cells from the original list and linking them together into two new lists of roughly equal lengths.

#include "list.h"

/*@
function [rec] ({datatype seq fst, datatype seq snd}) split(datatype seq xs)
{
  match xs {
    Seq_Nil {} => {
      {fst: Seq_Nil{}, snd: Seq_Nil{}}
    }
    Seq_Cons {head: h1, tail: Seq_Nil{}} => {
      {fst: Seq_Nil{}, snd: xs}
    }
    Seq_Cons {head: h1, tail: Seq_Cons {head : h2, tail : tl2 }} => {
      let P = split(tl2);
      {fst: Seq_Cons { head: h1, tail: P.fst},
       snd: Seq_Cons { head: h2, tail: P.snd}}
    }
  }
}

function [rec] (datatype seq) merge(datatype seq xs, datatype seq ys) {
  match xs {
      Seq_Nil {} => { ys }
      Seq_Cons {head: x, tail: xs1} => {
        match ys {
          Seq_Nil {} => { xs }
          Seq_Cons{ head: y, tail: ys1} => {
            (x < y) ?
              (Seq_Cons{ head: x, tail: merge(xs1, ys) })
            : (Seq_Cons{ head: y, tail: merge(xs, ys1) })
          }
        }
      }
  }
}

function [rec] (datatype seq) mergesort(datatype seq xs) {
  match xs {
      Seq_Nil{} => { xs }
      Seq_Cons{head: x, tail: Seq_Nil{}} => { xs }
      Seq_Cons{head: x, tail: Seq_Cons{head: y, tail: zs}} => {
        let P = split(xs);
        let L1 = mergesort(P.fst);
        let L2 = mergesort(P.snd);
        merge(L1, L2)
      }
    }
}
@*/

struct int_list_pair {
  struct int_list* fst;
  struct int_list* snd;
};

struct int_list_pair split(struct int_list *xs)
/*@ requires take Xs = IntList(xs); @*/
/*@ ensures take Ys = IntList(return.fst); @*/
/*@ ensures take Zs = IntList(return.snd); @*/
/*@ ensures {fst: Ys, snd: Zs} == split(Xs); @*/
{
  if (xs == 0) {
    /*@ unfold split(Xs); @*/
    struct int_list_pair r = {.fst = 0, .snd = 0};
    return r;
  } else {
    struct int_list *cdr = xs -> tail;
    if (cdr == 0) {
      /*@ unfold split(Xs); @*/
      struct int_list_pair r = {.fst = 0, .snd = xs};
      return r;
    } else {
      /*@ unfold split(Xs); @*/
      struct int_list_pair p = split(cdr->tail);
      xs->tail = p.fst;
      cdr->tail = p.snd;
      struct int_list_pair r = {.fst = xs, .snd = cdr};
      return r;
    }
  }
}

struct int_list* merge(struct int_list *xs, struct int_list *ys)
/*@ requires take Xs = IntList(xs); @*/
/*@ requires take Ys = IntList(ys); @*/
/*@ ensures take Zs = IntList(return); @*/
/*@ ensures Zs == merge(Xs, Ys); @*/
{
  if (xs == 0) {
    /*@ unfold merge(Xs, Ys); @*/
    return ys;
  } else {
    /*@ unfold merge(Xs, Ys); @*/
    if (ys == 0) {
      /*@ unfold merge(Xs, Ys); @*/
      return xs;
    } else {
      /*@ unfold merge(Xs, Ys); @*/
      if (xs->head < ys->head) {
        struct int_list *zs = merge(xs->tail, ys);
        xs->tail = zs;
        return xs;
      } else {
        struct int_list *zs = merge(xs, ys->tail);
        ys->tail = zs;
        return ys;
      }
    }
  }
}

struct int_list* mergesort(struct int_list *xs)
/*@ requires take Xs = IntList(xs); @*/
/*@ ensures take Ys = IntList(return); @*/
/*@ ensures Ys == mergesort(Xs); @*/
{
  if (xs == 0) {
    /*@ unfold mergesort(Xs); @*/
    return xs;
  } else {
    struct int_list *tail = xs->tail;
    if (tail == 0) {
      /*@ unfold mergesort(Xs); @*/
      return xs;
    } else {
      /*@ unfold mergesort(Xs); @*/
      struct int_list_pair p = split(xs);
      p.fst = mergesort(p.fst);
      p.snd = mergesort(p.snd);
      return merge(p.fst, p.snd);
    }
  }
}

Exercises

Allocating append. Fill in the CN annotations on IntList_append2. (You will need some in the body as well as at the top.)

#include "list.h"
#include "list_append.h"

struct int_list* IntList_copy (struct int_list *xs)
/*@ requires take L1 = IntList(xs);
    ensures take L1_ = IntList(xs);
            take L2 = IntList(return);
            L1 == L1_;
            L1 == L2;
@*/
{
  if (xs == 0) {
    return IntList_nil();
  } else {
    struct int_list *new_tail = IntList_copy(xs->tail);
    return IntList_cons(xs->head, new_tail);
  }
}

struct int_list* IntList_append2 (struct int_list *xs, struct int_list *ys)
{
  if (xs == 0) {
    return IntList_copy(ys);
  } else {
    struct int_list *new_tail = IntList_append2(xs->tail, ys);
    return IntList_cons(xs->head, new_tail);
  }
}

Note that it would not make sense to do the usual functional-programming trick of copying xs but sharing ys. (Why?)

Length. Add annotations as appropriate:

/* list_length.c */

#include "list.h"

unsigned int IntList_length (struct int_list *xs)
{
  if (xs == 0) {
    return 0;
  } else {
    return 1 + IntList_length (xs->tail);
  }
}

List deallocation. Fill in the body of the following procedure and add annotations as appropriate:

#include "list.h"

void IntList_free_list(struct int_list* xs)
// You fill in the rest...

Length with an accumulator. Add annotations as appropriate:

#include "list.h"
#include "ref.h"
#include "free.h"

/*@
function [rec] (u32) length(datatype seq xs) {
  match xs {
    Seq_Nil {} => {
      0u32
    }
    Seq_Cons {head : h, tail : zs}  => {
      1u32 + length(zs)
    }
  }
}
@*/

void IntList_length_acc_aux (struct int_list *xs, unsigned int *p)
/*@ requires take L1 = IntList(xs);
             take n = Owned<unsigned int>(p);
    ensures take L1_ = IntList(xs);
            take n_ = Owned<unsigned int>(p);
            L1 == L1_;
            n_ == n + length(L1);
@*/
{
  /*@ unfold length(L1); @*/
  if (xs == 0) {
  } else {
    *p = *p + 1;
    IntList_length_acc_aux (xs->tail, p);
  }
}

unsigned int IntList_length_acc (struct int_list *xs)
/*@ requires take L1 = IntList(xs);
    ensures take L1_ = IntList(xs);
            L1 == L1_;
            return == length(L1);
@*/
{
  unsigned int *p = refUnsignedInt(0);
  IntList_length_acc_aux(xs, p);
  unsigned int x = *p;
  freeUnsignedInt(p);
  return x;
}

Working with External Lemmas

TODO: This section should also show what the proof of the lemmas looks like on the Coq side!

List reverse

The specification of list reversal in CN relies on the familiar recursive list reverse function, with a recursive helper.

/*@
function [rec] (datatype seq) snoc(datatype seq xs, i32 y) {
  match xs {
    Seq_Nil {} => {
      Seq_Cons {head: y, tail: Seq_Nil{}}
    }
    Seq_Cons {head: x, tail: zs}  => {
      Seq_Cons{head: x, tail: snoc (zs, y)}
    }
  }
}
@*/
#include "list_snoc.h"

/*@
function [rec] (datatype seq) rev(datatype seq xs) {
  match xs {
    Seq_Nil {} => {
      Seq_Nil {}
    }
    Seq_Cons {head : h, tail : zs}  => {
      snoc (rev(zs), h)
    }
  }
}
@*/

To reason about the C implementation of list reverse, we need to help the SMT solver by enriching its knowledge base with a couple of facts about lists. The proofs of these facts require induction, so in CN we simply state them as lemmas and defer the proofs to Coq.

/*@
lemma append_nil_r (datatype seq l1)
  requires true;
  ensures append(l1, Seq_Nil {}) == l1;

lemma append_cons_r (datatype seq l1, i32 x, datatype seq l2)
  requires true;
  ensures append(l1, Seq_Cons {head: x, tail: l2})
          == append(snoc(l1, x), l2);
@*/

Having stated these lemmas, we can now complete the specification and proof of IntList_rev. Note the two places where apply is used to tell the SMT solver where to pay attention to the lemmas.

#include "list.h"
#include "list_append.h"
#include "list_rev.h"
#include "list_rev_lemmas.h"

struct int_list* IntList_rev_aux(struct int_list* xs, struct int_list* ys)
/*@ requires take L1 = IntList(xs); @*/
/*@ requires take L2 = IntList(ys); @*/
/*@ ensures take R = IntList(return); @*/
/*@ ensures R == append(rev(L2), L1); @*/
{
  if (ys == 0) {
    /*@ unfold rev(L2); @*/
    /*@ unfold append(Seq_Nil {},L1); @*/
    return xs;
  } else {
    /*@ unfold rev(L2); @*/
    /*@ apply append_cons_r(rev(tl(L2)), hd(L2), L1); @*/
    struct int_list *tmp = ys->tail;
    ys->tail = xs;
    return IntList_rev_aux(ys, tmp);
  }
}

struct int_list* IntList_rev(struct int_list* xs)
/*@ requires take L1 = IntList(xs); @*/
/*@ ensures take L1_rev = IntList(return); @*/
/*@ ensures L1_rev == rev(L1); @*/
{
  /*@ apply append_nil_r(rev(L1)); @*/
  return IntList_rev_aux (0, xs);
}

For comparison, here is another way to write the program, using a while loop instead of recursion, with its specification and proof.

#include "list.h"
#include "list_append.h"
#include "list_rev.h"
#include "list_rev_lemmas.h"

struct int_list* IntList_rev_loop(struct int_list *xs)
/*@ requires take L = IntList(xs);
    ensures  take L_ = IntList(return);
             L_ == rev(L);
@*/
{
  struct int_list *last = 0;
  struct int_list *cur = xs;
  /*@ apply append_nil_r(rev(L)); @*/
  while(1)
  /*@ inv take L1 = IntList(last);
          take L2 = IntList(cur);
          append(rev(L2), L1) == rev(L);
  @*/
  {
    if (cur == 0) {
      /*@ unfold rev(Seq_Nil {}); @*/
      /*@ unfold append(Seq_Nil {}, L1); @*/
      return last;
    }
    struct int_list *tmp = cur->tail;
    cur->tail = last;
    last = cur;
    cur = tmp;
    /*@ unfold rev(L2); @*/
    /*@ apply append_cons_r (rev (tl(L2)), hd(L2), L1); @*/
  }
}

Exercises

Sized stacks: Fill in annotations where requested:

#include "list.h"
#include "list_length.c"

struct sized_stack {
  unsigned int size;
  struct int_list* data;
};

/*@
type_synonym sizeAndData = {u32 s, datatype seq d}

predicate (sizeAndData) SizedStack(pointer p) {
    take S = Owned<struct sized_stack>(p);
    let s = S.size;
    take d = IntList(S.data);
    assert(s == length(d));
    return { s: s, d: d };
}
@*/

extern struct sized_stack *malloc_sized_stack ();
/*@
spec malloc_sized_stack();
     requires true;
     ensures take u = Block<struct sized_stack>(return);
@*/

extern void *free_sized_stack (struct sized_stack *p);
/*@
spec free_sized_stack(pointer p);
     requires take u = Block<struct sized_stack>(p);
     ensures true;
@*/

struct sized_stack* create()
/*@ ensures take S = SizedStack(return);
            S.s == 0u32;
@*/
{
  struct sized_stack *p = malloc_sized_stack();
  p->size = 0;
  p->data = 0;
  /*@ unfold length(Seq_Nil {}); @*/
  return p;
}

unsigned int sizeOf (struct sized_stack *p)
/* FILL IN HERE */
{
  return p->size;
}

void push (struct sized_stack *p, int x)
/* FILL IN HERE */
{
  struct int_list *data = IntList_cons(x, p->data);
  p->size++;
  p->data = data;
}

int pop (struct sized_stack *p)
/* FILL IN HERE */
{
  struct int_list *data = p->data;
  if (data != 0) {
    int head = data->head;
    struct int_list *tail = data->tail;
    freeIntList(data);
    p->data = tail;
    p->size--;
    return head;
  }
  return 42;
}

int top (struct sized_stack *p)
/*@ requires take S = SizedStack(p);
             S.s > 0u32;
    ensures  take S_ = SizedStack(p);
             S_ == S;
             return == hd(S.d);
@*/
{
  /*@ unfold length(S.d); @*/ 
  // from S.s > 0u32 it follows that the 'else' branch is impossible
  if (p->data != 0) {
    return (p->data)->head;
  }
  else {
    // provably dead code
    return 42; 
  }
}

CN Style

This section gathers some advice on stylistic conventions and best practices in CN.

Constants

The syntax of the C language does not actually include constants. Instead, the convention is to use the macro preprocessor to replace symbolic names by their definitions before the C compiler ever sees them.

This raises a slight awkwardness in CN, because CN specifications and annotations are written in C comments, so they are not transformed by the preprocessor. However, we can approximate the effect of constant values by defining constant functions. We’ve been working with some of these already, e.g., MINi32(), but it is also possible to define our own constant functions. Here is the officially approved idiom:

#define CONST 1
    /*@ function (i32) CONST () @*/
    static int c_CONST() /*@ cn_function CONST; @*/ { return CONST; }

int foo (int x)
/*@
  requires true;
  ensures return == CONST();
@*/
{
  return CONST;
}

Here’s how it works:

  • We first define a C macro CONST in the usual way.

  • The next two lines "import" this constant into CN by defining a CN function CONST() whose body is the C function c_CONST(). The body of c_CONST returns the value of the macro CONST. Notice that the declaration of CONST() has no body.

  • The annotation /*@ cn_function CONST; @*/ links the two functions, CONST() and cn_CONST().

Of course, we could achieve the same effect by defining the CN function CONST() directly…​

#define CONST 1
/*@ function (i32) CONST () { 1i32 } @*/

…​but this version repeats the number 1 in two places — a potential source of nasty bugs!

Case Studies

To close out the tutorial, let’s look at some larger examples.

Imperative Queues

A queue is a linked list with O(1) operations for adding things to one end (the "back") and removing them from the other (the "front"). Here are the C type definitions:

struct int_queue {
  struct int_queueCell* front;  
  struct int_queueCell* back;
};

struct int_queueCell {
  int first;  
  struct int_queueCell* next;
};

A queue consists of a pair of pointers, one pointing to the front element, which is the first in a linked list of `int_queueCell`s, the other pointing directly to the last cell in this list. If the queue is empty, both pointers are NULL.

Abstractly, a queue just represents a list, so we can reuse the seq type from the list examples earlier in the tutorial.

/*@
predicate (datatype seq) IntQueuePtr (pointer q) {
  take Q = Owned<struct int_queue>(q);
  assert (   (is_null(Q.front)  && is_null(Q.back)) 
          || (!is_null(Q.front) && !is_null(Q.back)));
  take L = IntQueueFB(Q.front, Q.back);
  return L;
}
@*/

Given a pointer to an int_queue struct, this predicate grabs ownership of the struct, asserts that the front and back pointers must either both be NULL or both be non-NULL, and then hands off to an auxiliary predicate IntQueueFB. (Conceptually, IntQueueFB is part of IntQueuePTR, but CN currently allows conditional expressions only at the beginning of predicate definitions, not after a take.)

IntQueueFB is where the interesting part starts:

/*@
predicate (datatype seq) IntQueueFB (pointer front, pointer back) {
  if (is_null(front)) {
    return Seq_Nil{};
  } else {
    take B = Owned<struct int_queueCell>(back);
    assert (is_null(B.next));
    assert (ptr_eq(front, back) || !addr_eq(front, back));
    take L = IntQueueAux (front, back);
    return snoc(L, B.first);
  }
}
@*/

First, we case on whether the front of the queue is NULL. If so, then the queue is empty and we return the empty sequence.

If the queue is not empty, we need to walk down the linked list of elements and gather up all their values into a sequence. But we must treat the last element of the queue specially, for two reasons. First, because the push operation is going to follow the back pointer directly to the last list cell without traversing all the others, we need to take that element now rather than waiting to get to it at the end of the recursion starting from the front. Second, and relatedly, there will be two pointers to this final list cell — one from the back field and one from the next field of the second to last cell (or the front pointer, if there is only one cell in the list), so we need to be careful not to take this cell twice.

Accordingly, we begin by take`ing the tail cell and passing it separately to the `IntQueueAux predicate, which has the job of walking down the cells from the front and gathering all the rest of them into a sequence. We take the result from IntQueueAux and snoc on the very last element.

The assert (is_null(B.next)) here gives the CN verifier a crucial piece of information about an invariant of the representation: The back pointer always points to the very last cell in the list, so its next field will always be NULL.

Finally, the IntQueueAux predicate recurses down the list of cells.

/*@
predicate (datatype seq) IntQueueAux (pointer f, pointer b) {
  if (ptr_eq(f,b)) {
    return Seq_Nil{};
  } else {
    take F = Owned<struct int_queueCell>(f);
    assert (!is_null(F.next));
    assert (ptr_eq(F.next, b) || !addr_eq(F.next, b));
    take B = IntQueueAux(F.next, b);
    return Seq_Cons{head: F.first, tail: B};
  }
}
@*/

Its first argument (f) starts out at front and progresses through the list on recursive calls; its b argument is always a pointer to the very last cell.

When f and b are equal, we have reached the last cell and there is nothing to do. (We don’t even have to build a singleton list: that’s going to happen one level up, in IntQueueFB.)

Otherwise, we take the fields of the f, make a recurive call to IntQueueAux to process the rest of the cells, and cons the first field of this cell onto the resulting sequence before returning it. (Again, we need to help the CN verifier by explicitly informing it of the invariant that we know, that C.next cannot be null if f and b are different.)

Now we need a bit of boilerplate: just as with linked lists, we need to be able to allocate and deallocate queues and queue cells. There are no interesting novelties here.

extern struct int_queue *mallocIntQueue();
/*@ spec mallocIntQueue();
    requires true;
    ensures take u = Block<struct int_queue>(return);
@*/ 

extern void freeIntQueue (struct int_queue *p);
/*@ spec freeIntQueue(pointer p);
    requires take u = Block<struct int_queue>(p);
    ensures true;
@*/

extern struct int_queueCell *mallocIntQueueCell();
/*@ spec mallocIntQueueCell();
    requires true;
    ensures take u = Block<struct int_queueCell>(return);
@*/ 

extern void freeIntQueueCell (struct int_queueCell *p);
/*@ spec freeIntQueueCell(pointer p);
    requires take u = Block<struct int_queueCell>(p);
    ensures true;
@*/

Exercise: The function for creating an empty queue just needs to set both of its fields to NULL. See if you can fill in its specification.

#include "queue_headers.h"

struct int_queue* IntQueue_empty ()
{
  struct int_queue *p = mallocIntQueue();
  p->front = 0;
  p->back = 0;
  return p;
}

The push and pop operations are more involved. Let’s look at push first.

Here’s the unannotated C code — make sure you understand it.

#include "queue_headers.h" 

void IntQueue_push (int x, struct int_queue *q)
{
  struct int_queueCell *c = mallocIntQueueCell();
  c->first = x;
  c->next = 0;
  if (q->back == 0) {
    q->front = c;
    q->back = c;
    return;
  } else {
    struct int_queueCell *oldback = q->back;
    q->back->next = c;
    q->back = c;
    return;
  }
}

Exercise: Before reading on, see if you can write down a reasonable top-level specification for this operation.

(One thing you might find odd about this code is that there’s a return statement at the end of each branch of the conditional, rather than a single return at the bottom. The reason for this is that, when CN analyzes a function body containing a conditional, it effectively copies all the code after the conditional into each of the branches. Then, if verification encounters an error related to this code — e.g., "you didn’t establish the ensures conditions at the point of returning — the error message will be confusing because it will not be clear which branch of the conditional it is associated with.)

Now, here is the annotated version of the push operation.

#include "queue_headers.h" 
#include "queue_push_lemma.h" 

void IntQueue_push (int x, struct int_queue *q)
/*@ requires take before = IntQueuePtr(q);
    ensures take after = IntQueuePtr(q);
            after == snoc (before, x);
@*/
{
  struct int_queueCell *c = mallocIntQueueCell();
  c->first = x;
  c->next = 0;
  if (q->back == 0) {
    q->front = c;
    q->back = c;
    return;
  } else {
    struct int_queueCell *oldback = q->back;
    q->back->next = c;
    q->back = c;
    /*@ apply push_lemma(q->front, oldback); @*/
    return;
  }
}

The case where the queue starts out empty (q->back == 0) is easy. CN can work it out all by itself.

The case where the starting queue is nonempty is more interesting. The push operation messes with the end of the sequence of queue elements, so we should expect that validating push is going to require some reasoning about this sequence. Here, in fact, is the lemma we need.

/*@
lemma push_lemma (pointer front, pointer p)
  requires
      ptr_eq(front, p) || !addr_eq(front, p);
      take Q = IntQueueAux(front, p);
      take P = Owned<struct int_queueCell>(p);
  ensures
      ptr_eq(front, P.next) || !addr_eq(front, P.next);
      take NewQ = IntQueueAux(front, P.next);
      NewQ == snoc(Q, P.first);
@*/

This says, in effect, that we have two choices for how to read out the values in some chain of queue cells of length at least 2, starting with the cell front and terminating when we get to the next cell following some given cell p — call it c. We can either gather up all the cells from front to c, or we can gather up just the cells from front to p and then snoc on the single value from c.

When we apply this lemma, p will be the old back cell and c will be the new one. But to prove it (by induction, of course), we need to state it more generally, allowing p to be any internal cell in the list starting at front and c its successor.

The reason we need this lemma is that, to add a new cell at the end of the queue, we need to reassign ownership of the old back cell. In the precondition of push, we took ownership of this cell separately from the rest; in the postcondition, it needs to be treated as part of the rest (so that the new back cell can now be treated specially).

One interesting technicality is worth noting: After the assignment q->back = c we can no longer prove IntQueueFB(q->front, oldback), but we don’t care, since we want to prove IntQueueFB(q->front, q->back). However, crucially, IntQueueAux(q->front, oldback) is still true.

Now let’s look at the pop operation. Here is the un-annotated version:

#include "queue_headers.h"

int IntQueue_pop (struct int_queue *q)
{
  struct int_queueCell* h = q->front;
  if (h == q->back) {
    int x = h->first;
    freeIntQueueCell(h);
    q->front = 0;
    q->back = 0;
    return x;
  } else {
    int x = h->first;
    q->front = h->next;
    freeIntQueueCell(h);
    return x;
  }
}

Exercise: Again, before reading on, see if you can write down a plausible top-level specification. (For extra credit, see how far you can get with verifying it!)

Here is the fully annotated pop code:

#include "queue_headers.h"
#include "queue_pop_lemma.h"

int IntQueue_pop (struct int_queue *q)
/*@ requires take before = IntQueuePtr(q);
             before != Seq_Nil{};
    ensures take after = IntQueuePtr(q);
            after == tl(before);
            return == hd(before);
@*/
{
  /*@ split_case is_null(q->front); @*/
  struct int_queueCell* h = q->front;
  if (h == q->back) {
    /*@ assert ((alloc_id) h == (alloc_id) (q->back)); @*/
    int x = h->first;
    freeIntQueueCell(h);
    q->front = 0;
    q->back = 0;
    /*@ unfold snoc(Seq_Nil{}, x); @*/
    return x;
  } else {
    int x = h->first;
    /*@ apply snoc_facts(h->next, q->back, x); @*/
    q->front = h->next;
    freeIntQueueCell(h);
    return x;
  }
}

There are three annotations to explain. Let’s consider them in order.

First, the split_case on is_null(q->front) is needed to tell CN which of the branches of the if at the beginning of the IntQueueFB predicate it can "unpack". (IntQueuePtr can be unpacked immediately because it is unconditional, but IntQueueFB cannot.)

The guard/condition for IntQueueFB is is_null(front), which is why we need to do a split_case on this value. On one branch of the split_case, we have a contradiction: the fact that before == Seq_Nil{} (from IntQueueFB) conflicts with before != Seq_Nil from the precondition, so that case is immediate. On the other branch, CN now knows that the queue is non-empty as required and type checking proceeds.

When h == q->back, we are in the case where the queue contains just a single element, so we just need to NULL out its front and back fields and deallocate the dead cell. The unfold annotation is needed because the snoc function is recursive, so CN doesn’t do the unfolding automatically.

Finally, when the queue contains two or more elements, we need to deallocate the front cell, return its first field, and redirect the front field of the queue structure to point to the next cell. To push the verification through, we need a simple lemma about the snoc function:

/*@
lemma snoc_facts (pointer front, pointer back, i32 x)
  requires
      take Q = IntQueueAux(front, back);
      take B = Owned<struct int_queueCell>(back);
  ensures
      take NewQ = IntQueueAux(front, back);
      take NewB = Owned<struct int_queueCell>(back);
      Q == NewQ; B == NewB;
      let L = snoc (Seq_Cons{head: x, tail: Q}, B.first);
      hd(L) == x;
      tl(L) == snoc (Q, B.first);
@*/

The crucial part of this lemma is the last three lines, which express a simple, general fact about snoc: if we form a sequence by calling snoc to add a final element B.first to a sequence with head element x and tail Q, then the head of the resulting sequence is still x, and its tail is snoc (Q, B.first).

The requires clause and the first three lines of the ensures clause simply set things up so that we can name the various values we are talking about. Since these values come from structures in the heap, we need to take ownership of them. And since lemmas in CN are effectively just trusted functions that can also take in ghost values, we need to take ownership in both the requires and ensures clauses. (Taking them just in the requires clause would imply that they are consumed and deallocated when the lemma is applied — not what we want!)

(The only reason we can’t currently prove this lemma in CN is that we don’t have `take`s in CN statements, because this is just a simple unfolding.)

Exercise: Investigate what happens when you make each of the following changes to the queue definitions. What error does CN report? Where are the telltale clues in the error report that suggest what the problem was?

  • Remove assert (is_null(B.next)); from InqQueueFB.

  • Remove assert (is_null(B.next)); from InqQueueAux.

  • Remove one or both of occurrences of freeIntQueueCell(f) in IntQueue_pop.

  • Remove, in turn, each of the CN annotations in the bodies of IntQueue_pop and IntQueue_push.

Exercise: The conditional in the pop function tests whether or not f == b to find out whether we have reached the last element of the queue. Another way to get the same information would be to test whether f->next == 0. Can you verify this version?

Note: I (BCP) have not worked out the details, so am not sure how hard this is (or if it is even not possible, though I’d be surprised). Please let me know if you get it working!

Exercise: Looking at the code for the pop operation, it might seem reasonable to move the identical assignments to x in both branches to above the if. This doesn’t "just work" because the ownership reasoning is different. In the first case, ownership of h comes from IntQueueFB (because h == q->back). In the second case, it comes from IntQueueAux (because h != q->back).

Can you generalize the snoc_facts lemma to handle both cases? You can get past the dereference with a split_case but formulating the lemma before the return will be a bit more complicated.

Note: Again, this has not been shown to be possible, but Dhruv believes it should be!

Doubly Linked Lists

A doubly linked list is a linked list where each node has a pointer to both the next node and the previous node. This allows for O(1) operations for adding or removing nodes anywhere in the list. Here is the C type definition:

struct node {
  int data;  
  struct node* prev;
  struct node* next;
};

The idea behind the representation of this list is that we don’t keep track of the front or back, but rather we take any node in the list and have a sequence to the left and to the right of that node. The left and right are from the point of view of the node itself, so left is kept in reverse order. Additionally, similarly to in the Imperative Queues example, we can reuse the seq type.

/*@
datatype Dll {
    Empty_Dll {},
    Dll {datatype seq left, struct node curr, datatype seq right}
}
@*/

The predicate for this datatype is a bit complicated. The idea is that we first want to own the node that is passed in. Then, we want to follow all of the prev pointers to own everything backwards from the node. We want to do the same for the next pointers to own everything forwards from the node. This is how we construct our left and right fields.

/*@
predicate (datatype Dll) Dll_at (pointer p) {
    if (is_null(p)) {
        return Empty_Dll{};
    } else {
        take n = Owned<struct node>(p);
        take Left = Own_Backwards(n.prev, p, n);
        take Right = Own_Forwards(n.next, p, n);
        return Dll{left: Left, curr: n, right: Right};
    }
}

predicate (datatype seq) Own_Forwards(pointer p, pointer prev_pointer, struct node prev_node) {
    if (is_null(p)) {
        return Seq_Nil{};
    } else {
        take n = Owned<struct node>(p);
        assert (ptr_eq(n.prev, prev_pointer));
        assert(ptr_eq(prev_node.next, p));
        take Right = Own_Forwards(n.next, p, n);
        return Seq_Cons{head: n.data, tail: Right};
    }
}

predicate (datatype seq) Own_Backwards(pointer p, pointer next_pointer, struct node next_node) {
    if (is_null(p)) {
        return Seq_Nil{};
    } else {
        take n = Owned<struct node>(p);
        assert (ptr_eq(n.next, next_pointer));
        assert(ptr_eq(next_node.prev, p));
        take Left = Own_Backwards(n.prev, p, n);
        return Seq_Cons{head: n.data, tail: Left};
    }
}
@*/

Note that Dll_at takes ownership of the node passed in, and then calls Own_Backwards and Own_Forwards which recursively take ownership of the rest of the list and add their values to the left and right sequences, respectively.

Additionally, you will notice that Own_Forwards and Own_Backwards include ptr_eq assertions for the prev and next pointers. This is to ensure that the nodes in the list are correctly doubly linked. For example, the line assert (ptr_eq(n.prev, prev_pointer)); in Own_Forwards ensures that the current node correctly points backward to the previous node in the list. The line assert(ptr_eq(prev_node.next, p)); ensures that the previous node correctly points forward to the current node. The same can be said for these assertions in Own_Backwards.

All three of these predicates stop once they reach a null pointer. In this way, we can ensure that the only null pointers in the list are at the beginning and end of the list.

Before we move on to the functions that manipulate the doubly linked list, we need to define a few "getter" functions that will allow us to access the fields of our Dll datatype. This will make our specifications much easier to write.

/*@
function (datatype seq) Right (datatype Dll L) {
    match L {
        Empty_Dll {} => { Seq_Nil{} }
        Dll {left: _, curr: _, right: r} => { r }
    }
}

function (datatype seq) Left (datatype Dll L) {
    match L {
        Empty_Dll {} => { Seq_Nil {} }
        Dll {left: l, curr: _, right: _} => { l }
    }
}

function (struct node) Node (datatype Dll L) {
    match L {
        Empty_Dll {} => {  default<struct node> }
        Dll {left: _, curr: n, right: _} => { n }
    }
}
@*/

We also must include some boilerplate code for allocation and deallocation.

extern struct node *malloc_node();
/*@ spec malloc_node();
    requires true;
    ensures take u = Block<struct node>(return);
            !ptr_eq(return,NULL);
@*/ 

extern void free_node (struct node *p);
/*@ spec free_node(pointer p);
    requires take u = Block<struct node>(p);
    ensures true;
@*/

And we compile all of these files into a single header file.

#include "../list.h"
#include "../list_append.h"
#include "../list_rev.h"

#include "./c_types.h"
#include "./cn_types.h"
#include "./getters.h"
#include "./malloc_free.h"
#include "./predicates.h"

Lastly, an important note about this representation of a doubly linked list is that there is no higher level representation of the list (such as the int_queue structure in the Imperative Queues section). This makes it difficult to reason about adding and removing things from a list that may be empty at some times. If we have an empty list, we do not want any identifier of this list to disappear altogether. To work around this problem, we represent an empty list as a null pointer and require that every function that manipulates the list must return a pointer to somewhere in the list. This way, we can always have a pointer to the list, even if it is empty.

Now we can move on to an initialization function. Since an empty list is represented as a null pointer, we will look at initializing a singleton list (or in other words, a list with only one item).

#include "./headers.h"

struct node *singleton(int element)
/*@ ensures take Ret = Dll_at(return);
        Ret == Dll{left: Seq_Nil{}, curr: struct node{data: element, prev: NULL, next: NULL}, right: Seq_Nil{}};
@*/
{
   struct node *n = malloc_node();
   n->data = element;
   n->prev = 0;
   n->next = 0;
   return n;
}

The add and remove functions are where it gets a little tricker. Let’s start with add. Here is the unannotated version:

#include "./headers.h"

// Adds after the given node and returns a pointer to the new node
struct node *add(int element, struct node *n)
{
    struct node *new_node = malloc_node();
    new_node->data = element;
    new_node->prev = 0;
    new_node->next = 0;

    if (n == 0) //empty list case
    {
        new_node->prev = 0;
        new_node->next = 0;
        return new_node;
    } else { 
        new_node->next = n->next;
        new_node->prev = n;

        if (n->next !=0) {
            n->next->prev = new_node;
        }

        n->next = new_node;
        return new_node;
    }
}

Exercise: Before reading on, see if you can figure out what specifications are needed.

Now, here is the annotated version of the add operation:

#include "./headers.h"

// Adds after the given node and returns a pointer to the new node
struct node *add(int element, struct node *n)
/*@ requires take Before = Dll_at(n);
    ensures  take After = Dll_at(return);

             is_null(n) ? After == Dll { left: Seq_Nil{}, curr: Node(After), right: Seq_Nil{}}
                        : After == Dll { left: Seq_Cons{head: Node(Before).data, tail: Left(Before)}, curr: Node(After), right: Right(Before)};
@*/
{
    struct node *new_node = malloc_node();
    new_node->data = element;
    new_node->prev = 0;
    new_node->next = 0;

    if (n == 0) //empty list case
    {
        new_node->prev = 0;
        new_node->next = 0;
        return new_node;
    } else { 
        /*@ split_case(is_null(n->prev)); @*/
        new_node->next = n->next;
        new_node->prev = n;

        if (n->next !=0) {
            /*@ split_case(is_null(n->next->next)); @*/
            n->next->prev = new_node;
        }

        n->next = new_node;
        return new_node;
    }
}

First, let’s look at the pre and post conditions. The requires clause is straightforward. We need to own the list centered around the node that n points to. Before is a Dll that is either empty, or it has a seq to the left, the current node that n points to, and a seq to the right. This corresponds to the state of the list when it is passed in.

In the ensures clause, we again establish ownership of the list, but this time it is centered around the added node. This means that After is a Dll structure similar to Before, except that the node curr is now the created node. The old curr is pushed into the left part of the new list. The ternary operator in the ensures clause is saying that if the list was empty coming in, it now is a singleton list. Otherwise, the left left part of the list now has the data from the old curr node, the new curr node is the added node, and the right part of the list is the same as before.

Now, let’s look at the annotations in the function body. CN can figure out the empty list case for itself, but it needs some help with the non-empty list case. The split_case on is_null((\*n).prev) tells CN to unpack the Own_Backwards predicate. Without this annotation, CN cannot reason that we didn’t lose the left half of the list before we return, and will claim we are missing a resource for returning. The split_case on is_null(n→next→next) is similar, but for unpacking the Own_Forwards predicate. Note that we have to go one more node forward to make sure that everything past n→next is still owned at the end of the function.

Now let’s look at the remove operation. Traditionally, a remove operation for a list returns the integer that was removed. However we also want all of our functions to return a pointer to the list. Because of this, we define a struct that includes an int and a node.

struct node_and_int {
  struct node* node;
  int data;
};

extern struct node_and_int *malloc_node_and_int();
/*@ spec malloc_node_and_int();
    requires true;
    ensures take u = Block<struct node_and_int>(return);
            !ptr_eq(return,NULL);
@*/ 

extern void free_node_and_int (struct node_and_int *p);
/*@ spec free_node_and_int(pointer p);
    requires take u = Block<struct node_and_int>(p);
    ensures true;
@*/

Now we can look at the code for the remove operation. Here is the un-annotated version:

#include "./headers.h"
#include "./node_and_int.h"

// removes the given node from the list and returns another pointer 
// to somewhere in the list, or a null pointer if the list is empty.
struct node_and_int *remove(struct node *n)
{
    struct node *temp = 0;
    if (n->prev != 0) {
        n->prev->next = n->next;
        temp = n->prev;
    }
    if (n->next != 0) {
        n->next->prev = n->prev;
        temp = n->next;
    }

    struct node_and_int *pair = malloc_node_and_int();
    pair->node = temp;
    pair->data = n->data;

    free_node(n);       
    return pair;
}

Exercise: Before reading on, see if you can figure out what specifications are needed.

Now, here is the fully annotated version of the remove operation:

#include "./headers.h"
#include "./node_and_int.h"

// removes the given node from the list and returns another pointer 
// to somewhere in the list, or a null pointer if the list is empty.
struct node_and_int *remove(struct node *n)
/*@ requires !is_null(n);
             take Before = Dll_at(n);
             let del = Node(Before);
    ensures  take ret = Owned<struct node_and_int>(return);
             take After = Dll_at(ret.node);
             ret.data == del.data;
             (is_null(del.prev) && is_null(del.next)) ? After == Empty_Dll{}
                 : (!is_null(del.next) ? After == Dll{left: Left(Before), curr: Node(After), right: tl(Right(Before))}
                     : After == Dll{left: tl(Left(Before)), curr: Node(After), right: Right(Before)});
@*/
{
    struct node *temp = 0;
    if (n->prev != 0) {
        /*@ split_case(is_null(n->prev->prev)); @*/
        n->prev->next = n->next;
        temp = n->prev;
    }
    if (n->next != 0) {
        /*@ split_case(is_null(n->next->next)); @*/
        n->next->prev = n->prev;
        temp = n->next;
    }

    struct node_and_int *pair = malloc_node_and_int();
    pair->node = temp;
    pair->data = n->data;

    free_node(n);       
    return pair;
}

First, let’s look at the pre and post conditions. The requires clause says that we cannot remove a node from an empty list, so the pointer passed in must not be null. Then we take ownership of the list, and we assign the node of that list to the identifier del to make our spec more readable. So Before refers to the Dll when the function is called, and del refers to the node that will be deleted.

Then in the ensures clause, we must take ownership of the node_and_int struct as well as the Dll that the node is part of. Here, After refers to the Dll when the function returns. We ensure that the int that is returned is the value of the deleted node, as intended. Then we have a complicated nested ternary conditional that ensures that After is the same as Before except for the deleted node. Let’s break down this conditional:

  • The first guard asks if both del.prev and del.next are null. In this case, we are removing the only node in the list, so the resulting list will be empty. The else branch of this conditional contains it’s own conditional.

  • For the following conditional, the guard checks if 'del.prev' is NOT null. Note that in the code, this means that the returned node is del.next, regardless of whether or not del.prev is null. If this is the case, After is now centered around del.next, and the left part of the list is the same as before. Since del.next was previously the head of the right side, the right side loses its head in After. This is where we get After == Dll{left: Left(Before), curr: Node(After), right: tl(Right(Before))}.

  • The final else branch is the case where del.next is null, but del.prev is not null. In this case, the returned node is del.prev. This branch follows the same logic as the one before it, except now we are taking the head of the left side rather than the right side. Now the right side is unchanged, and the left side is just the tail, as seen shown in After == Dll{left: tl(Left(Before)), curr: Node(After), right: Right(Before)};

Now, let’s look at the annotations in the function body. These are similar to in the add function. Both of these split_case annotations are needed to unpack the Own_Forwards and Own_Backwards predicates. Without these annotations, CN will not be able to reason that we didn’t lose the left or right half of the list before we return, and will claim we are missing a resource for returning.

Exercise: There are many other functions that one might want to implement for a doubly linked list. For example, one might want to implement a function that appends one list to another, or a function that reverses a list. Try implementing these functions and writing their specifications.

The Runway

Suppose we have been tasked with writing a program that simulates a runway at an airport. This airport is very small, so it only has one runway that is used for both takeoffs and landings. We want to verify that the runway is always safe by implementing the following specifications into CN:

  1. The runway has two modes: departure mode and arrival mode. The two modes can never be active at the same time, and neither mode is active at the beginning of the day.

  2. There is always a waitlist of planes that need to land at the airport and planes that need to leave the airport at a given moment. These can be modeled with counters W_A for the number of planes waiting to arrive, and W_D for the number of planes waiting to depart.

  3. At any time, a plane is either waiting to arrive, waiting to depart, or on the runway. Once a plane has started arriving or departing, the corresponding counter (W_A or W_D) is decremented. There is no need to keep track of planes once they have arrived or departed. Additionally, once a plane is waiting to arrive or depart, it continues waiting until it has arrived or departed.

  4. Let’s say it takes 5 minutes for a plane to arrive or depart. During these 5 minutes, no other plane may use the runway. We can keep track of how long a plane has been on the runway with the Runway_Counter. If the Runway_Counter is at 0, then there is currently no plane using the runway, and it is clear for another plane to begin arriving or departing. Once the Runway_Counter reaches 5, we can reset it at the next clock tick. One clock tick represents 1 minute.

  5. If there is at least one plane waiting to depart and no cars waiting to arrive, then the runway is set to departure mode (and vice versa for arrivals).

  6. If both modes of the runway are inactive and planes become ready to depart and arrive simultaneously, the runway will activate arrival mode first. If the runway is in arrival mode and there are planes waiting to depart, no more than 3 planes may arrive from that time point. When either no more planes are waiting to arrive or 3 planes have arrived, the runway switches to departure mode. If the runway is on arrival mode and no planes are waiting to depart, then the runway may stay in arrival mode until a plane is ready to depart, from which time the 3-plane limit is imposed (and vice versa for departures). Put simply, if any planes are waiting for a mode that is inactive, that mode will become active no more than 15 minutes later (5 minutes for each of 3 planes).

To encode all this in CN, we first need a way to describe the state of the runway at a given time. We can use a struct that includes the following fields:

  • ModeA and ModeD to represent the arrival and departure modes, respectively. We can define constants for ACTIVE and INACTIVE, as described in the Constants section above.

  • W_A and W_D to represent the number of planes waiting to arrive and depart, respectively.

  • Runway_Time to represent the time (in minutes) that a plane has spent on the runway while arriving or departing.

  • Plane_Counter to represent the number of planes that have arrived or departed while planes are waiting for the other mode. This will help us keep track of the 3-plane limit as described in (6).

#define INACTIVE 0
    /*@ function (i32) INACTIVE () @*/
    static int c_INACTIVE() /*@ cn_function INACTIVE; @*/ { return INACTIVE; }

#define ACTIVE 1
    /*@ function (i32) ACTIVE () @*/
    static int c_ACTIVE() /*@ cn_function ACTIVE; @*/ { return ACTIVE; }

struct State {
  int ModeA;
  int ModeD;

  int W_A;
  int W_D;

  int Runway_Time;
  int Plane_Counter;
};

Next, we need to specify what makes a state valid. We must define a rigorous specification in order to ensure that the runway is always safe and working as intended. Try thinking about what this might look like before looking at the code below.

/*@
function (boolean) valid_state (struct State s) {
     (s.ModeA == INACTIVE() || s.ModeA == ACTIVE()) &&
     (s.ModeD == INACTIVE() || s.ModeD == ACTIVE()) &&
     (s.ModeA == INACTIVE() || s.ModeD == INACTIVE()) &&
     (s.W_A >= 0i32 && s.W_D >= 0i32) &&
     (0i32 <= s.Runway_Time && s.Runway_Time <= 5i32) && 
     (0i32 <= s.Plane_Counter && s.Plane_Counter <= 3i32) && 
     (s.ModeA == INACTIVE() && s.ModeD == INACTIVE() implies s.Plane_Counter == 0i32) &&
     (s.Runway_Time > 0i32 implies (s.ModeA == ACTIVE() || s.ModeD == ACTIVE())) &&

     (s.Plane_Counter > 0i32 && s.ModeA == ACTIVE() implies s.W_D > 0i32) &&
     (s.Plane_Counter > 0i32 && s.ModeD == ACTIVE() implies s.W_A > 0i32)

}
@*/

Let’s walk through the specifications in valid_state:

  • The first two lines ensure that both modes in our model behave as intended: they can only be active or inactive. Any other value for these fields would be invalid.

  • The third line says that either arrival mode or departure mode must be inactive. This specification ensures that the runway is never in both modes at the same time.

  • The fourth line says that the number of planes waiting to arrive or depart must be non-negative. This makes sense: we can’t have a negative number of planes!

  • The fifth line ensures that the runway time is between 0 and 5. This addresses how a plane takes 5 minutes on the runway as described in (4).

  • The sixth line ensures that the plane counter is between 0 and 3. This is important for the 3-plane limit as described in (6).

  • The seventh line refers to the state at the beginning of the day. If both modes are inactive, then the day has just begun, and thus no planes have departed yet. This is why the plane counter must be 0.

  • The eighth line says that if there is a plane on the runway, then one of the modes must be active. This is because a plane can only be on the runway if it is either arriving or departing.

  • The final two lines ensure that we are incrementing Plane_Counter only if there are planes waiting for the other mode, as described in (6).

Now that we have the tools to reason about the state of the runway formally, let’s start writing some functions.

First, let’s look at an initialization function and functions to update Plane_Counter. Step through these yourself and make sure you understand the reasoning behind each specification.

struct State init()
/*@ requires true;
    ensures valid_state(return);
@*/
{
    struct State initial = {INACTIVE,INACTIVE,0,0,0,0};
    return initial;
}

struct State increment_Plane_Counter(struct State s)
/*@ requires valid_state(s);
             0i32 <= s.Plane_Counter;
             s.Plane_Counter <= 2i32;
             s.ModeA == ACTIVE() || s.ModeD == ACTIVE();
             s.ModeA == ACTIVE() implies s.W_D > 0i32;
             s.ModeD == ACTIVE() implies s.W_A > 0i32;
    ensures  valid_state(return);
             s.Plane_Counter == return.Plane_Counter - 1i32;
             s.Runway_Time == return.Runway_Time;
             s.ModeA == return.ModeA;
             s.ModeD == return.ModeD;
             s.W_A == return.W_A;
             s.W_D == return.W_D;
@*/
{
  struct State temp = s;
  temp.Plane_Counter = s.Plane_Counter + 1;
  return temp;
}

struct State reset_Plane_Counter(struct State s)
/*@ requires valid_state(s);
    ensures  valid_state(return);
             return.Plane_Counter == 0i32;
             s.Runway_Time == return.Runway_Time;
             s.ModeA == return.ModeA;
             s.ModeD == return.ModeD;
             s.W_A == return.W_A;
             s.W_D == return.W_D;
@*/
{
  struct State temp = s;
  temp.Plane_Counter = 0;
  return temp;
}

Exercise: Now try adding your own specifications to the following functions. Make sure that you specify a valid state as a pre and post condition for every function. If you get stuck, the solution is in the solutions folder.

#include "state.h"
#include "valid_state.h"
#include "funcs1.h"

struct State increment_Runway_Time(struct State s)
{
  struct State temp = s;
  temp.Runway_Time = s.Runway_Time + 1;
  return temp;
}


struct State reset_Runway_Time(struct State s)
{
  struct State temp = s;
  temp.Runway_Time = 0;
  return temp;
}

struct State arrive(struct State s)
{
  struct State temp = s;
  temp.W_A = s.W_A - 1;
  if (s.W_D > 0) {
    temp = increment_Plane_Counter(temp);
  }
  return temp;
}

struct State depart(struct State s)
{
  struct State temp = s;
  temp.W_D = s.W_D - 1;
  if (s.W_A > 0) {
    temp = increment_Plane_Counter(temp);
  }
  return temp;
}

struct State switch_modes(struct State s) 
{
  struct State temp = s;
  if (s.ModeA == INACTIVE) {
    // if arrival mode is inactive, make it active
    temp.ModeA = ACTIVE;
  } else {
    // if arrival mode is active, make it inactive
    temp.ModeA = INACTIVE;
  }
  if (s.ModeD == INACTIVE) {
    // if departure mode is inactive, make it active
    temp.ModeD = ACTIVE;
  } else {
    // if departure mode is active, make it inactive
    temp.ModeD = INACTIVE;
  }
  return temp;
}


// This function represents what happens every minute at the airport. 
// One `tick` corresponds to one minute.
struct State tick(struct State s)
{

    // Plane on the runway
    if (s.Runway_Time > 0)
    {
      if (s.Runway_Time == 5)
      {
        s = reset_Runway_Time(s);
      } else {
        s = increment_Runway_Time(s);
      }
      return s;
    }
    // Plane chosen to arrive
    else if (s.ModeA == ACTIVE && s.W_A > 0 && s.Plane_Counter < 3) {
      s = arrive(s);
      s = increment_Runway_Time(s);
      return s;
    } 
    // Plane chosen to depart
    else if (s.ModeD == ACTIVE && s.W_D > 0 && s.Plane_Counter < 3) {
      s = depart(s);
      s = increment_Runway_Time(s);
      return s;
    }
    // Need to switch modes
    else {
      // Need to switch from arrival to departure mode
      if (s.ModeA == ACTIVE) {
        s = reset_Plane_Counter(s);
        s = switch_modes(s);
        // If there are planes waiting to depart, let one depart
        if (s.W_D > 0) {
          s = depart(s);
          s = increment_Runway_Time(s);
        }
        return s;
      }
      // Need to switch from departure to arrival mode
      else if (s.ModeD == ACTIVE) {
        s = reset_Plane_Counter(s);
        s = switch_modes(s);


        // If there are planes waiting to arrive, let one arrive
        if (s.W_A > 0) {
          s = arrive(s);
          
          s = increment_Runway_Time(s);
        }
        return s;
      }
      // If neither mode is active, then it must be the beginning of the day.
      else {
        if (s.W_A > 0) {
          s.ModeA = ACTIVE;
          s = arrive(s);
          s = increment_Runway_Time(s);
          return s;
        } else if (s.W_D > 0) {
          s.ModeD = ACTIVE;
          s = depart(s);
          s = increment_Runway_Time(s);
          return s;
        } else {
          // No planes are waiting, so we choose arrival mode and wait
          s.ModeA = ACTIVE;
          return s;
        }
      }
    }
}

Exercise: For extra practice, try coming up with different specifications or variations for this exercise and implementing them yourself!