Dept. of Computer Systems
P.O. Box 325
S-751 05 Uppsala, Sweden
{parosh,bengt,d93ame}@docs.uu.se
Abstract
In this paper we present a general framework for modeling infinite-state authentication protocols that allows an unbounded number of protocol participants. We present a method for automated verification by performing a backwards reachability analysis where sets of insecure states are specified using constraints. We illustrate our method by showing how to model the Needham-Schroeder public-key protocol.
1 Introduction
The increasing popularity of distributed systems and the emergence of new technologies, such as electronic commerce, demands new security solutions. The cornerstone of security is often authentication, therefore easy-to-use methods and tools for modeling and verification of authentication protocols are needed.
Authentication protocols are designed for networks of protocol participants which are able to communicate asynchronously. The communication medium is essentially passive and can be manipulated by intruders that can intercept or interfere with network traffic. These protocols typically have few messages that may be encrypted with symmetric or asymmetric cryptographic keys.
Formal modeling and verification of cryptographic protocols has received much attention in recent years. Several frameworks for modeling authentication protocols and associated correctness properties have been presented. Burrows, Abadi and Needham [BAN89] have developed a belief-based logic for expressing properties of authentication protocols together with inference rules for reasoning about such properties. It is far from obvious how proofs in this framework could be automated. A simpler approach to modeling, which is not based on special purpose logic has been presented by Bolignano [Bol96] and by Schneider [Sch96a, Sch96b]. These approaches model protocols in rather standard transition-system based formalisms, and investigate how authentication can be expressed and analyzed. In general, the mechanization of proofs requires the use of interactive theorem provers. However, by employing carefully chosen abstractions, a finite-state model of the protocol can be created and subsequently analyzed in a finite-state automatic verifier. Manual intervention is still necessary for supplying and verifying the abstraction.
In this work, we intend to improve on these approaches by attempting to automate as much as possible the verification of authentication protocols. The main problems of verifying security protocols are that they are intended to run on networks of arbitrary size, i.e. networks with an unbounded number of protocol participants, and that the participants operate on data such as keys and nonces with values that are hard to bound. We intend to use and further develop recent advances in the analysis of infinite-state systems, which have been able to automate the analysis of certain classes of parameterized systems and systems operating on infinite-valued data structures. In particular, some of our earlier work in [AJ97] have concentrated on proving safety properties: this is applicable since most security properties of interest are safety properties [Bol97].
We present a network model based on the ideas of [Bol96]. Our goals in designing this model was to simplify the task of modeling authentication protocols, and to model authentication in a way that can be used with mechanized techniques of proving invariants of parameterized systems. Security properties are generally safety properties that can be decided by analyzing reachability of network configurations. We present a constraint system that can be used to specify insecure network configurations.
In the next section, we introduce the basics of our protocol model. In Section 3 the network model is presented. In Section 4 we show how insecure network configurations can be specified using constraints and Section 7 explains how we perform verification using this model. A model of the Needham-Schroeder public-key protocol is presented in Section 8.
2 Basics
Authentication protocols usually run on networks of protocol participants which are able to communicate asynchronously over a medium. The medium is essentially passive and can be manipulated by malicious agents that can intercept or interfere with network traffic. The network consists of an arbitrary number of principals. Some of these principals will be considered to be trustworthy, they always act according to the protocol. The non-trustworthy principals may act according to the protocol, but they may also perform malicious actions, such as interfering with network traffic, impersonating other principals and perhaps even cooperating with each other. Since the non-trustworthy principals can manipulate the medium we will model these principals together with the medium as a single process, the enemy. The enemy can store and (if possible) decrypt the messages circulating on the medium. It can also encrypt data to create new messages and send messages in order to mislead trustworthy principals. Similar approaches have been used in [Sch96b, Bol96, Mea96].
The trustworthy principals will be modeled as a set of identical processes called users. The users can play different roles in different protocol sessions, e.g. the user that initiates a session acts as an initiator and the user that gets contacted by the initiator acts as a responder. In general, a user may be engaged in several sessions simultaneously.
In our framework, we model a protocol as a transition system, with states (called configurations), and transitions among configurations. A configuration is built from (typed) objects and their attributes. Thus, for each protocol we must first decide on a finite set of types of objects. This choice can vary between protocols. A typical choice could be to let objects be of the following types.
We will use typewriter font to designate specific objects. Thus we use e.g. A, B to describe specific principals, and S to describe a specific session local state, etc.
The sets of principals, session local states, keys, and nonces are all infinite, while the set of control states is finite. A key is said to be symmetric if it is used both for encryption and decryption, otherwise it is asymmetric. An asymmetric encryption (decryption) key K has a corresponding decryption (encryption) key K-1, which is said to be the inverse key of K. The inverse key of a symmetric key is the key itself.
In the sequel, we make the assumption that encrypted messages are not altered, e.g. decomposed into encrypted parts, except by decryption and re-encryption. In other words, messages cannot be altered by a principal that does not know the decryption and encryption keys.
In the description of a specific protocol, we will assume a finite set of attributes. An attribute is associated with objects of a specific type. The value of an attribute is an object. A typical use of attributes is to describe different properties of sessions, or to specify the keys belonging to different principals, etc. We use the notation object1.attr = object2 to denote that the attribute attr of object1 has value object2.
As an example, in the Needham-Schroeder Protocol (which we shall consider in Section 8, a session local state S has the following attributes, with corresponding types.
Messages which are exchanged during a session, are formed according to the following rules.
In a particular protocol configuration, the enemy knows particular messages. These messages can be used to create new messages, which can then be sent to principals. Initially, the enemy knows some messages. He can deduce new messages from given ones. This deduction relation is usually generated by the operations:
If M is a set of messages, then M* denotes the union of M and the set of messages that can be deduced from M by the operations listed above.
3 The Network Model
We assume that we work with infinite sets A of principals, S of session local states, K of keys, N of nonces.
A protocol P is a triple <Q,Decl,R>, where
A configuration of P is defined by giving (i) a set of active objects, (ii) the values of attributes of these objects, and (iii) the state of the intruder. The state of the intruder is represented by the set of messages known to him. Formally, a configuration g is a triple <X,V,known>, where
Observe that known* is the set of messages which can be deduced from known.
Rules The dynamic behavior of P is defined by the finite set R of rules. A rule r in R has a finite set of parameters, i.e., free variables. Each free variable ranges over objects of a given type. We use lowercase letters to range over variables: thus a,b,c range over free variables of type principal, and similarly for other types. We use x,y to range over variables of all types.
A meta-message is a ``message'' built only from variables, i.e., it is formed according to the following rules.
A rule consists of a guard and a body.
We now define the effect of executing a rule in a given protocol configuration. An instance of a rule is obtained by replacing the parameters of a rule by objects. The rule is enabled if its guard is true (in the obvious way), where the conjunct fresh(X) requires that the object X is not present in the configuration. An enabled rule is executed by possibly adding to the configuration those objects in the instantiated rule which were not present before, and then executing the statements in the body.
For configurations g1,g2, and a rule r, we use g1-->rg2 to denote that we can reach g2 from g1 through the execution of r. We define the discrete transition relation --> by $-->=unionr in R-->r. We let -->* denote the reflexive transitive closure of -->.
4 Constraints
A constraint is a formula of form exists x1 , ... , xnf(x1 , ... , xn)
where x1 , ... , xn are variables as usual, and where f is the conjunction of atomic formulas of the following form
Since all variables are always implicitly existentially quantified, we will sometimes for brevity drop the quantification, and simply write f(x1 , ... , xn).
Intuitively, a constraint f asserts the existence of a set of objects and known messages, whose attributes (parameters) are in some cases identified. Fields are identified if the variables that occupy them are identical. Different variables may or may not represent the same objects in a given configuration which satisfies the particular constraint.
A configuration g satisfies a constraint exists x1 , ... , xnf(x1 , ... , xn) if for each xi there is an object in the configuration, such that the f(x1 , ... , xn) is satisfied. Note that different variables can be mapped to the same object. We let denotation(f) denote the set of configurations satisfying f.
We define an ordering leq on constraints by f1leq f2 if and only if denotation(f2) is a subset of denotation(f1). The ordering can be characterized as follows. We have f1(x1 , ... , xn) leq f2(y1 , ... , ym)
if for each xi we can substitute a corresponding yj such that each conjunct in f1(x1 , ... , xm) occurs in f2(y1 , ... , yn), or can be deduced by decryption.
Reachability
A configuration g0 is said to be initial if there are no active objects at g0. We say that a configuration g is reachable if g0-->*g, for some initial configuration g0. For a finite set of constraints F, we say that F is reachable if there is a f in F and a reachable g in denotation(f).
The reachability problem is defined as follows
To check the reachability of F from g, we perform a reachability analysis backwards. Let pre(f) denote the set {g : exists g' in denotation(f) : g --> g'}, and pre(F) denote the set {g : exists f in F : exists g' in denotation(f) : g --> g'}. Note that pre(F) is equivalent to unionf in F pre(f). Starting from F we define the sequence F0,F1,F2, ... of finite sets of constraints by F0 = F and Fj+1=union(Fj, pre(Fj)). Intuitively, Fj denotes the set of configurations from which F is reachable by a sequence of at most j transitions. Note that the sequence is increasing i.e., that denotation(F0) subset of denotation(F1) subset of denotation(F2) subset of ... If there is a k such that denotation(Fk) = denotation(Fk+1), then denotation(Fk) = denotation(Fj) for all j >= k. It follows that F is reachable if and only if there is an initial configuration g0 such that g0 in denotation(Fk), which is easily checked since Fk is a finite set of constraints. In general, it cannot be guaranteed that a fixpoint is reached.
Computing pre
We have already covered the ordering relation between constraints. Let us now turn to calculating pre of a constraint. So, assume a constraint and a rule. We will compute pre(f) as unionr in R pre(r,f), where pre(r,f) denotes the set {g : exists g' in denotation(f) : g -->r g'}. If f is a constraint and r is a rule, then pre(r, f) is calculated according to the following steps:
For all substitutions of variables in the constraint and in the rule
We observe that in general, the smallest constraint is obtained by choosing those parameters of the rule that occur only in the guard distinct from all other variables.
Authentication
We verify authentication by checking whether insecure states, i.e. states
where violation of authentication has occured, can be reached from an initial
state. We model sets of insecure states by constraints. Usually, violation
of authentication can be described a conjunction of conditions on local
states and on the set of messages that the intruder knows. For example
the following constraint characterizes an insecure set of states, where
a nonce used in a certain session is also known to the intruder.
Example: The Needham-Schroeder Public-Key Protocol
The Needham-Schroeder public-key protocol ([NS78]) is based on public-key cryptography. Each principal A possesses a public key, denoted KA, which any other principal can obtain from a key server. It also possesses a secret key, KA-1, which is the inverse of KA.
The messages exchanged consists of agent ids and nonces, random numbers intended to be used in a single run of the protocol. The goal of the protocol is to establish mutual authentication between an initiating principal A and a responding principal B. The protocol described and modeled here is a reduced version of the protocol. The reduced version omits the steps where the agents request and receive each others public keys from the key server and instead assumes that the participants initially have knowledge of all public keys.
The reduced protocol can be described as:
The first message, which can only be understood by responder B, tells B that initiator A wants to establish communication and secretly communicates a nonce identifier NA. In the second message A receives a proof of Bs presence in the form of the same nonce identifier encrypted with As public key. The second message also contains a nonce challenge which if properly responded to, in message 3, proves As presence to B. Thus after a protocol run the secrets NA and NB will have been established and the presence of both A and B will have been proved.
In [Low96], Lowe presented an attack upon this protocol, which allows an intruder to impersonate another agent. He also suggests an adaption of the protocol to make it secure, in which the second message has been changed to
This version of the protocol will be used in our example model. The Needham-Schroeder Public-Key protocol is defined by the control states idle, init_wait, resp_wait and connected, where idle is the initial value and the state connected denotes that authentication is completed. The attributes of the session local states are:
The rules describing the transitions of the protocol are listed in Figure 1. The first rule corresponds to the initiator sending the first protocol message. The second rule is a combination of the responders receiving the first message and sending the second. The third rule describes the initiators receiving the second message and sending the third message and the fourth rule is where the responder receives the third message and the protocol run is completed.
In an initial configuration, the set knows consists of all public
keys, the enemy's private key, and the names of all principals in the network.
References
[AJ97] Parosh Aziz Abdulla and Bengt Jonsson. Verifying networks of timed processes, 1997. in preparation.
[BAN89] Michael Burrows, Martin Abadi and Roger Needham. A logic of authentication. In Proceedings of the royal Society of London A, volume 426, pages 233-271, 1989.
[Bol96]Dominique Bolignano. An approach to the formal verification of cryptographic protocols. In Proc. 3rd ACM Conference on Computer and Communication Security, March 1996.
[Bol97]Dominique Bolignano. Towards a mechanization of cryptographic verification. In 9th International Computer-Aided Verification Conference, CAV'97, June 1997.
[Low96]Gavin Lowe. Breaking and fixing the Needham-Schroeder public-key protocol using FDR. In Proceedings of TACAS, volume 1055 of Lecture Notes in Computer Science, pages 147-166. Springer Verlag, 1996.
[Mea96]Cathrine Meadows. The NRL protocol analyzer: An overview. Journal of Logic Programming, 1996.
[NS78]Roger M. Needham and Michael D. Schroeder. Using encryption for authentication in large networks of computers. Communications of the ACM, 21(12):993-999, December 1978.
[Sch96a]Steve Schneider. Security properties and CSP. In IEEE Computer Society Symposium on research in Security and Privacy, Oakland, 1996.
[Sch96b]Steve Schneider. Using CSP for protocol analysis: the Needham-Schroeder public-key protocol. Technical Report CSD-TR-96-14, Royal Holloway, University of London, 1996.