[FoRK] [Cryptography] The Case for Formal Verification
eugen at leitl.org
Thu Aug 29 23:50:08 PDT 2013
----- Forwarded message from "Perry E. Metzger" <perry at piermont.com> -----
Date: Thu, 29 Aug 2013 16:46:30 -0400
From: "Perry E. Metzger" <perry at piermont.com>
To: cryptography at metzdowd.com
Subject: [Cryptography] The Case for Formal Verification
X-Mailer: Claws Mail 3.9.0 (GTK+ 2.24.20; x86_64-apple-darwin12.4.0)
Taking a break from our discussion of new privacy enhancing protocols,
I thought I'd share something I've been mumbling about in various
private groups for a while. This is almost 100% on the security side
of things, and almost 0% on the cryptography side of things. It is
long, but I promise that I think it is interesting to people doing
When I was a student the first time, in the early to mid 1980s, formal
verification was clearly a dead end that would never get anywhere. A
boss of mine once asserted (circa 1988) that there would never be a
verified program that did anything terribly interesting, and at time
he seemed right.
Today, there is a formally verified microkernel called seL4, a
formally verified C compiler called CompCert, a formally verified
experimental web browser called Quark, and lots of other stuff, much
of which I doubtless don't even know about.
_Things have changed_.
Much of what has changed is proof technology, and it is a
technology. The tools for doing formal verification are now, for the
first time, just barely usable for real work on interesting programs,
and getting better all the time. Over the last twenty five years, we
figured out a lot of stuff people didn't know before about how to
write verification tools and how to verify programs, and the results
have been impressive.
There are usually several arguments against formal verification:
1) We don't know what to specify, so what help does proving a buggy
specification do us?
2) Who would bother writing a proof vastly larger than their program?
3) You can't prove programs of interesting size anyway.
So, taking these in reverse order:
For 3 ("you can't prove anything big enough to be useful!"), the Quark
showed you don't need to prove a program of interesting size. You can
defend millions of lines of buggy code with a "software firewall" made
of formally verified code. Verify the right thousand lines of code
that the rest needs to use to talk to anything else, and you have very
strong security properties for the rest of the code. seL4 and CompCert
are clearly also quite useful programs.
For 2 ("Who would bother with all that work?"), we have libraries in
daily use like sqlite:
where the system has a fairly small amount of production code and
literally 1000 times more lines of test code than production code. If
you're willing to write ninety million lines of test to defend ninety
thousand lines of code, formal verification is totally doable.
Sure, it might not be worth it for throw away code or for your new
video game or conference room scheduler where failure isn't a big
deal, but it is *very* clear why you would want to do this for
foundational code of all sorts.
For 1 ("We'll never write a correct spec anyway, so what good is the
proof?"), I think we've been suffering from sour grapes. We didn't
have the ability to prove big things anyway, so why not tell ourselves
that there's nothing interesting and large we could prove that would
be worth proving?
CompCert is a fine counterexample, a formally verified C compiler:
It works by having a formal spec for C, and a formal spec for the
machine language output. The theorem they prove is that the
compilation process preserves observational equivalence between the
behavior of the C program and the output, which, given correct
notation, is a very small theorem to write down.
You might claim "so what, it is probably actually buggy as hell, the
spec probably isn't really correct anyway, etc." -- except when John
Regehr's group built tools to torture test C compilers, the only
compiler they did *not* find bugs in was CompCert. They found hundreds
of bugs each in every other compiler tested. (They actually found one,
but arguably it was a bug in a Linux include file, not in the
Similarly, one might claim "there is no way to formally specify a web
browser that won't be just as buggy as the web browser!", but Quark's
formal verification doesn't try to show that the entire Web browser is
correct, and doesn't need to -- it shows that some insecure behaviors
are simply impossible. *Those* are much simpler to describe.
Certainly there may be other properties that turn out to be important
that no one has considered yet. However, unlike testing, if people
discovered a hole in the set of theorems being proven -- some property
that was important but which had not been considered -- then that
could be added to what was proved, and _then the problem would be gone
forever_. Verification means you get actual incremental progress that
you can trust. Testing is much less powerful. (Furthermore, future
systems can learn from what you did and add the needed theorem to what
they prove about their own system.)
I don't think the technology is up to proving huge systems correct --
a fairly unambitious C compiler or a microkernel is the current limit
-- but it is up to proving the right thousand lines of code here and
there without much fuss, and that might make an incredible difference
in systems security if people actually take it seriously.
So, if you're interested, how do you get started doing such things?
At the moment, the best system for doing this sort of work is Coq:
Coq is sort of a programming language (although it is not quite Turing
equivalent -- all programs must be guaranteed to terminate for
technical reasons), sort of a form of constructivist logic (i.e. all
existence proofs construct examples, no law of the excluded middle).
It uses a neat trick called the Curry-Howard isomorphism that may take
some getting used to for people not exposed to modern type theory. In
it, types and logical propositions are the same thing, and programs
and proofs are the same thing. You can write a function that adds two
numbers, or a function that proves that there are an infinite number
of primes. The type of the former is clearly an integer, but the type
of the latter is the theorem itself.
A proof that proposition A implies proposition B is function of type
A -> B -- any function that takes a proof of A and yields a proof of B
is after all a proof that if A is true then B is true. (This is why
all functions in Coq itself must terminate -- otherwise all types
would be inhabited so all theorems would be true. That in no way
restricts one's ability to build verified non-terminating programs
using the system, you just have to build them by reasoning about
programs that Coq itself can't execute.)
Proofs in Coq are generally not written by hand, though. Instead one
uses a sublanguage called the "tactic language" in which one invokes
help from Coq to interactively assemble a proof, which makes doing the
proof far easier. For many theorems, you can almost completely
automate the proof, while for others, you need to help Coq along
more. (Some of the tactics are quite amazing, "Omega" for example will
prove any assertion about numbers that does not involve multiplication
by a variable, aka "Presburger arithmetic".)
Often, one builds software using Coq by constructing a sort of
formally verified assertion about what the program would do, and then
using a built in Coq facility to mechanically extract that into a
working verified program written in OCaml, Haskell or Scheme. Nothing
in theory prevents you from doing things many in other ways, though --
the system is quite general.
Coq is, sadly, needlessly hard for the beginner. It has poor
documentation, bad error messages and bad error behavior. These are
not inherent problems, they're problems just with this instance of
things -- people could build better if there was enough interest, and
I hope that as these technologies become more popular people will
build far better versions of the tools.
However, bad documentation or no, at the moment, this is the right
place to go I think. It is the system Quark and CompCert were built
in, and not by accident. It is not for the faint of heart -- the
learning curve is very steep. Still, it is quite doable.
The right places to start with Coq are probably Benjamin Pierce's
online Software Foundations textbook:
and, when one is done with that, possibly Adam Chlipala's online book
"Certified Programming with Dependent Types"
(There's also a manual for the system itself, as well as this book on
Coq: https://www.springer.com/computer/swe/book/978-3-540-20854-9 )
I'm happy to give help to anyone trying to learn how to do this sort
of thing -- we need more people in the world experimenting with
verification if we're going to get truly trustworthy software going
Let me assert that we *do* need truly trustworthy software, too. We've
been very, very good for years now at finding hole after hole in
running code and making the systems we depend on every day into a
minefield that we dare not take an unconsidered step in. Wouldn't it
be nice to make some progress in the other direction?
Perry E. Metzger perry at piermont.com
The cryptography mailing list
cryptography at metzdowd.com
----- End forwarded message -----
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
ICBM: 48.07100, 11.36820 http://ativel.com http://postbiota.org
AC894EC5: 38A5 5F46 A4FF 59B8 336B 47EE F46E 3489 AC89 4EC5
More information about the FoRK