Pointer and Simple Ownership¶
So far we’ve only considered example functions manipulating integer values. Verification becomes more interesting and challenging when pointers are involved, because the safety of memory accesses via pointers has to be verified.
CN uses separation logic resources and the concept of ownership to reason about memory accesses. A resource is the permission to access a region of memory. Unlike logical constraints, resource ownership is unique, meaning resources cannot be duplicated.
Let’s look at a simple example. The function read
takes an int
pointer p
and returns the pointee value.
Running CN on this example produces the following error:
cn verify exercises/read.c
[1/1]: read
exercises/read.c:3:10: error: Missing resource for reading
return \*p;
^~
Resource needed: Owned<signed int>(p)
Consider the state in /var/folders/\_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_403624.html
For the read *p
to be safe, ownership of a resource is missing: a resource Owned<signed int>(p)
.
Owned resources¶
Given a C-type T
and pointer p
, the resource Owned<T>(p)
asserts ownership of a memory cell at location p
of the size of C-type T
. It is CN’s equivalent of a points-to assertion in separation logic (indexed by C-types T
).
In this example we can ensure the safe execution of read
by adding a precondition that requires ownership of Owned<int>(p)
, as shown below. For now ignore the notation take ... = Owned<int>(p)
. Since reading the pointer does not disturb its value, we also add a corresponding postcondition, whereby read
returns ownership of p
after it is finished executing, in the form of another Owned<int>(p)
resource.
int read (int *p)
/*@ requires take P = Owned<int>(p);
ensures take P_post = Owned<int>(p);
@*/
{
return *p;
}
This specification means that:
-
any function calling
read
has to be able to provide a resourceOwned<int>(p)
to pass intoread
, and -
the caller will receive back a resource
Owned<int>(p)
whenread
returns.
Resource outputs¶
A caller of read
may also wish to know that read
actually returns the correct value, the pointee of p
, and also that it does not change memory at location p
. To phrase both we need a way to refer to the pointee of p
.
In CN, resources have outputs. Each resource outputs the information that can be derived from ownership of the resource. What information is returned is specific to the type of resource. A resource Owned<T>(p)
(for some C-type T
) outputs the pointee value of p
, since that can be derived from the resource ownership: assume you have a pointer p
and the associated ownership, then this uniquely determines the pointee value of p
.
CN uses the take
-notation seen in the example above to bind the output of a resource to a new name. The precondition take P = Owned<int>(p)
does two things: (1) it assert ownership of resource Owned<int>(p)
, and (2) it binds the name P
to the resource output, here the pointee value of p
at the start of the function. Similarly, the postcondition introduces the name P_post
for the pointee value on function return.
That means we can use the resource outputs from the pre- and postcondition to strengthen the specification of read
as planned. We add two new postconditions specifying
- that
read
returnsP
(the initial pointee value ofp
), and - that the pointee values
P
andP_post
before and after execution ofread
(respectively) are the same.
int read (int *p)
/*@ requires take P = Owned<int>(p);
ensures take P_post = Owned<int>(p);
return == P;
P_post == P;
@*/
{
return *p;
}
Aside. In standard separation logic, the equivalent specification for read
could have been phrased as follows (where \return
binds the return value in the postcondition):
CN’s take
notation is just an alternative syntax for quantification over the values of resources, but a useful one: the take
notation syntactically restricts how these quantifiers can be used to ensure CN can always infer them.
Linear resource ownership¶
In the specifications we have written so far, functions that receive resources as part of their precondition also return this ownership in their postcondition.
Let’s try the read
example from earlier again, but with a postcondition that does not return the ownership:
CN rejects this program with the following message:
cn verify exercises/read.broken.c
[1/1]: read
build/exercises/read.broken.c:4:3: error: Left_Sublist-over unused resource 'Owned<signed int>(p)(v1)'
return \*p;
^~~~~~~~~~
Consider the state in /var/folders/\_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_17eb4a.html
CN has typechecked the function and verified (1) that it is safe to
execute under the precondition (given ownership Owned<int>(p)
)
and (2) that the function (vacuously) satisfies its postcondition. But
following the check of the postcondition it finds that not all
resources have been "used up
".
Indeed, given the above specification, read
leaks memory: it takes ownership, does not return it, but also does not deallocate the owned memory or otherwise dispose of it. In CN this is a type error.
CN’s resources are linear. This means not only that resources cannot be duplicated, but also that resources cannot simply be dropped or "forgotten
". Every resource passed into a function has to be either returned to the caller or else destroyed by deallocating the owned area of memory (as we shall see later).
CN’s motivation for linear tracking of resources is its focus on low-level systems software in which memory is managed manually; in this context, memory leaks are typically very undesirable. As a consequence, function specifications have to do precise bookkeeping of their resource footprint and, in particular, return any unused resources back to the caller.
Exercises¶
Quadruple. Specify the function quadruple_mem
, which is similar to the earlier quadruple
function, except that the input is passed as an int
pointer. Write a specification that takes ownership of this pointer on entry and returns this ownership on exit, leaving the pointee value unchanged.
Abs. Give a specification to the function abs_mem
, which computes the absolute value of a number passed as an int
pointer.
int abs_mem (int *p)
{
int x = *p;
if (x >= 0) {
return x;
}
else {
return -x;
}
}
Block resources¶
Aside from the Owned
resources seen so far, CN has another
built-in type of resource called Block
. Given a C-type T
and
pointer p
, Block<T>(p)
asserts the same ownership as
Owned<T>(p)
— ownership of a memory cell at p
the size of type
T
— but, in contrast to Owned
, Block
memory is not assumed
to be initialised.
CN uses this distinction to prevent reads from uninitialised memory:
-
A read at C-type
T
and pointerp
requires a resourceOwned<T>(p)
, i.e., ownership of initialised memory at the right C-type. The load returns theOwned
resource unchanged. -
A write at C-type
T
and pointerp
needs only aBlock<T>(p)
(so, unlike reads, writes to uninitialised memory are fine). The write consumes ownership of theBlock
resource (it destroys it) and returns a new resourceOwned<T>(p)
with the value written as the output. This means the resource returned from a write records the fact that this memory cell is now initialised and can be read from.
Since Owned
carries the same ownership as Block
, just with the
additional information that the Owned
memory is initalised, a
resource Owned<T>(p)
is "at least as good
" as Block<T>(p)
—
an Owned<T>(p)
resource can be used whenever Block<T>(p)
is
needed. For instance CN’s type checking of a write to p
requires a
Block<T>(p)
, but if an Owned<T>(p)
resource is what is
available, this can be used just the same. This allows an
already-initialised memory cell to be over-written again.
Unlike Owned
, whose output is the pointee value, Block
has no meaningful output.
Writing through pointers¶
Let’s explore resources and their outputs in another example. The C function incr
takes an int
pointer p
and increments the value in the memory cell that it poinbts to.
void incr (int *p)
/*@ requires take P = Owned<int>(p);
((i64) P) + 1i64 <= (i64) MAXi32();
ensures take P_post = Owned<int>(p);
P_post == P + 1i32;
@*/
{
*p = *p+1;
}
In the precondition we assert ownership of resource Owned<int>(p)
,
binding its output/pointee value to P
, and use P
to specify
that p
must point to a sufficiently small value at the start of
the function so as not to overflow when incremented. The postcondition
asserts ownership of p
with output P_post
, as before, and uses
this to express that the value p
points to is incremented by
incr
: P_post == P + 1i32
.
If we incorrectly tweaked this specification and used Block<int>(p)
instead of Owned<int>(p)
in the precondition, as below, then CN would reject the program.
void incr (int *p)
/*@ requires take P = Block<int>(p);
ensures take P_post = Owned<int>(p);
@*/
{
*p = *p+1;
}
CN reports:
build/solutions/slf0_basic_incr.signed.broken.c:6:11: error: Missing resource for reading
int n = \*p;
^~
Resource needed: Owned<signed int>(p)
Consider the state in /var/folders/\_v/ndl32wpj4bb3y9dg11rvc8ph0000gn/T/state_5da0f3.html
The Owned<int>(p)
resource required for reading is missing, since, per the precondition, only Block<int>(p)
is available. Checking the linked HTML file confirms this. Here the section "Available resources
" lists all resource ownership at the point of the failure:
-
Block<signed int>(p)(u)
, i.e., ownership of uninitialised memory at locationp
; the output is avoid
/unit
valueu
(specified in the second pair of parentheses) -
Owned<signed int*>(&ARG0)(p)
, the ownership of (initialised) memory at location&ARG0
, i.e., the memory location where the first function argument is stored; its output is the pointerp
(not to be confused with the pointee ofp
); and finally -
__CN_Alloc(&ARG0)(void)
is a resource that records allocation information for location&ARG0
; this is related to CN’s memory-object semantics, which we ignore for the moment.
Exercises¶
Zero. Write a specification for the function zero
, which takes a pointer to uninitialised memory and initialises it to 0
.
In-place double. Give a specification for the function inplace_double
, which takes an int
pointer p
and doubles the pointee value: specify the precondition needed to guarantee safe execution and a postcondition that captures the function’s behaviour.
void inplace_double (int *p)
{
int n = *p;
int m = n + n;
*p = m;
}
Multiple owned pointers¶
When functions manipulate multiple pointers, we can assert their
ownership just like before. However
pointer ownership in CN is unique -- that is, simultaneously owning
Owned
or Block
resources for two pointers implies that these
pointers are disjoint.
The following example shows the use of two Owned
resources for
accessing two different pointers by a function add
, which reads
two int
values in memory, at locations p
and q
, and
returns their sum.
unsigned int add (unsigned int *p, unsigned int *q)
/*@ requires take P = Owned<unsigned int>(p);
take Q = Owned<unsigned int>(q);
ensures take P_post = Owned<unsigned int>(p);
take Q_post = Owned<unsigned int>(q);
P == P_post && Q == Q_post;
return == P + Q;
@*/
{
unsigned int m = *p;
unsigned int n = *q;
return m+n;
}
This time we use C’s unsigned int
type. In C, over- and underflow of unsigned integers is not undefined behaviour, so we do not need any special preconditions to rule this out. Instead, when an arithmetic operation at unsigned type goes outside the representable range, the value "wraps around
".
The CN variables P
and Q
(resp. P_post
and Q_post
) for the pointee values of p
and q
before (resp. after) the execution of add
have CN basetype u32
, so unsigned 32-bit integers, matching the C unsigned int
type. Like C’s unsigned integer arithmetic, CN unsigned int values wrap around when exceeding the value range of the type.
Hence, the postcondition return == P + Q
holds also when the sum of P
and Q
is greater than the maximal unsigned int
value.
In the following we will sometimes use unsigned integer types to focus on specifying memory ownership, rather than the conditions necessary to show absence of C arithmetic undefined behaviour.
Exercises¶
Swap. Specify the function swap
, which takes two owned unsigned int
pointers and swaps their values.
void swap (unsigned int *p, unsigned int *q)
{
unsigned int m = *p;
unsigned int n = *q;
*p = n;
*q = m;
}
Transfer. Write a specification for the function transfer
, shown below.