Pointer and Simple Ownership¶

So far we’ve only considered example functions manipulating integer values. Verification becomes more interesting and challenging when pointers are involved, because the safety of memory accesses via pointers has to be verified.

CN uses separation logic resources and the concept of ownership to reason about memory accesses. A resource is the permission to access a region of memory. Unlike logical constraints, resource ownership is unique, meaning resources cannot be duplicated.

Let’s look at a simple example. The function read takes an int pointer p and returns the pointee value.

exercises/read.c

int read (int *p)
{
  return *p;
}

Running CN on this example produces the following error:

cn verify exercises/read.c
[1/1]: read
exercises/read.c:3:10: error: Missing resource for reading
return \*p;
^~
Resource needed: Owned<signed int>(p)
Consider the state in /var/folders/\_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_403624.html

For the read *p to be safe, ownership of a resource is missing: a resource Owned<signed int>(p).

Owned resources¶

Given a C-type T and pointer p, the resource Owned<T>(p) asserts ownership of a memory cell at location p of the size of C-type T. It is CN’s equivalent of a points-to assertion in separation logic (indexed by C-types T).

In this example we can ensure the safe execution of read by adding a precondition that requires ownership of Owned<int>(p), as shown below. For now ignore the notation take ... = Owned<int>(p). Since reading the pointer does not disturb its value, we also add a corresponding postcondition, whereby read returns ownership of p after it is finished executing, in the form of another Owned<int>(p) resource.

solutions/read.c

int read (int *p)
/*@ requires take P = Owned<int>(p);
    ensures take P_post = Owned<int>(p);
@*/
{
  return *p;
}

This specification means that:

any function calling read has to be able to provide a resource Owned<int>(p) to pass into read, and
the caller will receive back a resource Owned<int>(p) when read returns.

Resource outputs¶

A caller of read may also wish to know that read actually returns the correct value, the pointee of p, and also that it does not change memory at location p. To phrase both we need a way to refer to the pointee of p.

In CN, resources have outputs. Each resource outputs the information that can be derived from ownership of the resource. What information is returned is specific to the type of resource. A resource Owned<T>(p) (for some C-type T) outputs the pointee value of p, since that can be derived from the resource ownership: assume you have a pointer p and the associated ownership, then this uniquely determines the pointee value of p.

CN uses the take-notation seen in the example above to bind the output of a resource to a new name. The precondition take P = Owned<int>(p) does two things: (1) it assert ownership of resource Owned<int>(p), and (2) it binds the name P to the resource output, here the pointee value of p at the start of the function. Similarly, the postcondition introduces the name P_post for the pointee value on function return.

That means we can use the resource outputs from the pre- and postcondition to strengthen the specification of read as planned. We add two new postconditions specifying

that read returns P (the initial pointee value of p), and
that the pointee values P and P_post before and after execution of read (respectively) are the same.

exercises/read2.c

int read (int *p)
/*@ requires take P = Owned<int>(p);
    ensures take P_post = Owned<int>(p);
            return == P;
            P_post == P;
@*/
{
  return *p;
}

Aside. In standard separation logic, the equivalent specification for read could have been phrased as follows (where \return binds the return value in the postcondition):

∀p.
∀v1.
{ p ↦ P }
read(p)
{ \return. ∃P_post. (p ↦ P_post) /\ return = P /\ P = P_post }

CN’s take notation is just an alternative syntax for quantification over the values of resources, but a useful one: the take notation syntactically restricts how these quantifiers can be used to ensure CN can always infer them.

Linear resource ownership¶

In the specifications we have written so far, functions that receive resources as part of their precondition also return this ownership in their postcondition.

Let’s try the read example from earlier again, but with a postcondition that does not return the ownership:

exercises/read.broken.c

int read (int *p)
/*@ requires take v1 = Owned<int>(p); @*/
{
  return *p;
}

CN rejects this program with the following message:

cn verify exercises/read.broken.c
[1/1]: read
build/exercises/read.broken.c:4:3: error: Left_Sublist-over unused resource 'Owned<signed int>(p)(v1)'
return \*p;
^~~~~~~~~~
Consider the state in /var/folders/\_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_17eb4a.html

CN has typechecked the function and verified (1) that it is safe to execute under the precondition (given ownership Owned<int>(p)) and (2) that the function (vacuously) satisfies its postcondition. But following the check of the postcondition it finds that not all resources have been "used up".

Indeed, given the above specification, read leaks memory: it takes ownership, does not return it, but also does not deallocate the owned memory or otherwise dispose of it. In CN this is a type error.

CN’s resources are linear. This means not only that resources cannot be duplicated, but also that resources cannot simply be dropped or "forgotten". Every resource passed into a function has to be either returned to the caller or else destroyed by deallocating the owned area of memory (as we shall see later).

CN’s motivation for linear tracking of resources is its focus on low-level systems software in which memory is managed manually; in this context, memory leaks are typically very undesirable. As a consequence, function specifications have to do precise bookkeeping of their resource footprint and, in particular, return any unused resources back to the caller.

Exercises¶

Quadruple. Specify the function quadruple_mem, which is similar to the earlier quadruple function, except that the input is passed as an int pointer. Write a specification that takes ownership of this pointer on entry and returns this ownership on exit, leaving the pointee value unchanged.

exercises/slf_quadruple_mem.c

int quadruple_mem (int *p)
{
  int m = *p + *p;
  return m + m;
}

Abs. Give a specification to the function abs_mem, which computes the absolute value of a number passed as an int pointer.

exercises/abs_mem.c

int abs_mem (int *p)
{
  int x = *p;
  if (x >= 0) {
    return x;
  }
  else {
    return -x;
  }
}

Block resources¶

Aside from the Owned resources seen so far, CN has another built-in type of resource called Block. Given a C-type T and pointer p, Block<T>(p) asserts the same ownership as Owned<T>(p) — ownership of a memory cell at p the size of type T — but, in contrast to Owned, Block memory is not assumed to be initialised.

CN uses this distinction to prevent reads from uninitialised memory:

A read at C-type T and pointer p requires a resource Owned<T>(p), i.e., ownership of initialised memory at the right C-type. The load returns the Owned resource unchanged.
A write at C-type T and pointer p needs only a Block<T>(p) (so, unlike reads, writes to uninitialised memory are fine). The write consumes ownership of the Block resource (it destroys it) and returns a new resource Owned<T>(p) with the value written as the output. This means the resource returned from a write records the fact that this memory cell is now initialised and can be read from.

Since Owned carries the same ownership as Block, just with the additional information that the Owned memory is initalised, a resource Owned<T>(p) is "at least as good" as Block<T>(p) — an Owned<T>(p) resource can be used whenever Block<T>(p) is needed. For instance CN’s type checking of a write to p requires a Block<T>(p), but if an Owned<T>(p) resource is what is available, this can be used just the same. This allows an already-initialised memory cell to be over-written again.

Unlike Owned, whose output is the pointee value, Block has no meaningful output.

Writing through pointers¶

Let’s explore resources and their outputs in another example. The C function incr takes an int pointer p and increments the value in the memory cell that it poinbts to.

exercises/slf0_basic_incr.signed.c

void incr (int *p)
/*@ requires take P = Owned<int>(p);
             ((i64) P) + 1i64 <= (i64) MAXi32();
    ensures take P_post = Owned<int>(p);
            P_post == P + 1i32;
@*/
{
  *p = *p+1;
}

In the precondition we assert ownership of resource Owned<int>(p), binding its output/pointee value to P, and use P to specify that p must point to a sufficiently small value at the start of the function so as not to overflow when incremented. The postcondition asserts ownership of p with output P_post, as before, and uses this to express that the value p points to is incremented by incr: P_post == P + 1i32.

If we incorrectly tweaked this specification and used Block<int>(p) instead of Owned<int>(p) in the precondition, as below, then CN would reject the program.

exercises/slf0_basic_incr.signed.broken.c

void incr (int *p)
/*@ requires take P = Block<int>(p); 
    ensures take P_post = Owned<int>(p); 
@*/
{
  *p = *p+1;
}

CN reports:

build/solutions/slf0_basic_incr.signed.broken.c:6:11: error: Missing resource for reading
int n = \*p;
^~
Resource needed: Owned<signed int>(p)
Consider the state in /var/folders/\_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_5da0f3.html

The Owned<int>(p) resource required for reading is missing, since, per the precondition, only Block<int>(p) is available. Checking the linked HTML file confirms this. Here the section "Available resources" lists all resource ownership at the point of the failure:

Block<signed int>(p)(u), i.e., ownership of uninitialised memory at location p; the output is a void/unit value u (specified in the second pair of parentheses)
Owned<signed int*>(&ARG0)(p), the ownership of (initialised) memory at location &ARG0, i.e., the memory location where the first function argument is stored; its output is the pointer p (not to be confused with the pointee of p); and finally
__CN_Alloc(&ARG0)(void) is a resource that records allocation information for location &ARG0; this is related to CN’s memory-object semantics, which we ignore for the moment.

Exercises¶

Zero. Write a specification for the function zero, which takes a pointer to uninitialised memory and initialises it to 0.

exercises/zero.c

void zero (int *p)
{
  *p = 0;
}

In-place double. Give a specification for the function inplace_double, which takes an int pointer p and doubles the pointee value: specify the precondition needed to guarantee safe execution and a postcondition that captures the function’s behaviour.

exercises/slf3_basic_inplace_double.c

void inplace_double (int *p)
{
  int n = *p;
  int m = n + n;
  *p = m;
}

Multiple owned pointers¶

When functions manipulate multiple pointers, we can assert their ownership just like before. However pointer ownership in CN is unique -- that is, simultaneously owning Owned or Block resources for two pointers implies that these pointers are disjoint.

The following example shows the use of two Owned resources for accessing two different pointers by a function add, which reads two int values in memory, at locations p and q, and returns their sum.

exercises/add_read.c

unsigned int add (unsigned int *p, unsigned int *q)
/*@ requires take P = Owned<unsigned int>(p);
             take Q = Owned<unsigned int>(q);
    ensures take P_post = Owned<unsigned int>(p);
            take Q_post = Owned<unsigned int>(q);
            P == P_post && Q == Q_post;
            return == P + Q;
@*/
{
  unsigned int m = *p;
  unsigned int n = *q;
  return m+n;
}

This time we use C’s unsigned int type. In C, over- and underflow of unsigned integers is not undefined behaviour, so we do not need any special preconditions to rule this out. Instead, when an arithmetic operation at unsigned type goes outside the representable range, the value "wraps around".

The CN variables P and Q (resp. P_post and Q_post) for the pointee values of p and q before (resp. after) the execution of add have CN basetype u32, so unsigned 32-bit integers, matching the C unsigned int type. Like C’s unsigned integer arithmetic, CN unsigned int values wrap around when exceeding the value range of the type.

Hence, the postcondition return == P + Q holds also when the sum of P and Q is greater than the maximal unsigned int value.

In the following we will sometimes use unsigned integer types to focus on specifying memory ownership, rather than the conditions necessary to show absence of C arithmetic undefined behaviour.

Exercises¶

Swap. Specify the function swap, which takes two owned unsigned int pointers and swaps their values.

exercises/swap.c

void swap (unsigned int *p, unsigned int *q)
{
  unsigned int m = *p;
  unsigned int n = *q;
  *p = n;
  *q = m;
}

Transfer. Write a specification for the function transfer, shown below.

exercises/slf8_basic_transfer.c

void transfer (unsigned int *p, unsigned int *q)
{
  unsigned int n = *p;
  unsigned int m = *q;
  unsigned int s = n + m;
  *p = s;
  *q = 0;
}