Once a secure channel has been established through IKEv1 Phase 1, the endpoints are ready to use that channel to exchange information about how to encrypt traffic and what specific data should be protected. This is where ISAKMP (Internet Security Association and Key Management Protocol) comes into play, as it allows both peers to negotiate and agree on the necessary parameters for securing the IP traffic.

The process of defining what traffic to protect and how to protect it is handled through Quick Mode. In Quick Mode, each message is denoted as QMx, where “x” indicates the sequence number of the message. This mode is typically a three-message exchange. The initiator sends the first message (QM1) and includes the cryptographic algorithms to be used, as well as the traffic selectors, which specify the type of IP traffic (e.g., source/destination IPs, protocols, and port ranges) to be secured.

The responder replies with QM2, which mirrors the exchange by including its selected cryptographic algorithms and traffic selectors. Finally, the initiator sends QM3, which acknowledges the responder’s parameters, effectively finalizing the negotiation. However, if the commit bit is set, which signals the need for explicit confirmation, then a fourth message is added to the sequence.

Assuming Quick Mode sticks to the standard three messages, a full IKEv1 negotiation takes six messages with Aggressive Mode (3 for Phase 1 + 3 for Phase 2) and nine messages with Main Mode (6 for Phase 1 + 3 for Phase 2). Regardless of the mode used in Phase 1, Quick Mode remains the mechanism through which IPsec Security Associations are established, enabling secure and encrypted communication between the peers.

Quick Mode Phase 2

Quick Mode Phase 2 is where the real IPsec parameters come into play. At this point in the exchange, the initiator sends a new proposal, often referred to as a Transform Set. This proposal lets the endpoints define how data will be encrypted and secured, even if different encryption methods are needed than those used during Phase 1. On a Cisco IOS router, these parameters are configured with:

crypto ipsec transform-set

In this phase, the initiator sends a proposal to the responder. The proposal lists the encryption and integrity algorithms the initiator wants to use for the IPsec Security Association. For example, it might suggest ESP (Encapsulating Security Payload) as the security protocol, with AES for encryption and SHA1 for integrity. Another possible combination is ESP with 3DES for encryption and MD5 for integrity. These sets of options are included in the proposal and are referred to as the Transform Set.

However, defining how to secure the data is only part of the equation. The endpoint must also define what data is to be secured, which is where Proxy IDs come in. Proxy IDs essentially represent the networks or endpoints to be protected. For example, the initiator might specify that traffic between 10.1.1.0/24 and 10.2.2.0/24 should be encrypted using the proposed transform set. This exchange not only dictates how the traffic will be protected, but also clearly defines what traffic is included in that protection.

By structuring the QM1 message in this way, the initiator communicates both the encryption methods and the scope of traffic that the IPsec tunnel should secure. This ensures that both endpoints agree not only on the security protocols but also on the specific traffic selectors, laying the groundwork for a successful and secure tunnel session.

The second message (QM2) is the responder’s reply to the initiator. This message confirms the single proposal the responder selected and returns the Proxy IDs. These Proxy IDs, which define the network traffic that the tunnel will protect, must correspond to those sent in the first message, so the responder’s configuration has to mirror the initiator’s. The reply reflects the encryption and integrity protocols selected, such as ESP with AES for encryption and SHA1 for integrity, as well as the specific network addresses or subnets being secured (e.g., 10.1.1.0/24 and 10.2.2.0/24).
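The QM1 proposal and the responder's QM2 selection can be sketched in Python. This is a simplified model, not a packet parser: the class names, field names, and the `build_qm2` helper are hypothetical, and the rule "pick the first offered transform that is also supported" is an assumption about responder policy.

```python
from dataclasses import dataclass
from ipaddress import IPv4Network, ip_network


@dataclass(frozen=True)
class Transform:
    protocol: str      # e.g. "ESP"
    encryption: str    # e.g. "AES"
    integrity: str     # e.g. "SHA1"


@dataclass(frozen=True)
class QuickModeProposal:
    transforms: tuple              # offered in order of preference
    initiator_net: IPv4Network     # Proxy ID for the initiator side
    responder_net: IPv4Network     # Proxy ID for the responder side


def build_qm2(qm1, supported, local_net, remote_net):
    """Return the responder's reply, or None if nothing is acceptable."""
    # The received Proxy IDs must mirror the responder's own configuration.
    if (qm1.initiator_net, qm1.responder_net) != (remote_net, local_net):
        return None
    # Assumed policy: accept the first offered transform that is supported.
    for transform in qm1.transforms:
        if transform in supported:
            return QuickModeProposal(
                (transform,), qm1.initiator_net, qm1.responder_net
            )
    return None


qm1 = QuickModeProposal(
    transforms=(
        Transform("ESP", "3DES", "MD5"),
        Transform("ESP", "AES", "SHA1"),
    ),
    initiator_net=ip_network("10.1.1.0/24"),
    responder_net=ip_network("10.2.2.0/24"),
)
responder_supported = {Transform("ESP", "AES", "SHA1")}

qm2 = build_qm2(
    qm1,
    responder_supported,
    local_net=ip_network("10.2.2.0/24"),
    remote_net=ip_network("10.1.1.0/24"),
)
print(qm2)
```

Here the responder rejects 3DES/MD5 because it is not in its supported set and answers with the single AES/SHA1 transform, returning the same Proxy IDs it received.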

While the Proxy IDs should ideally be mirrored on both sides, they don’t always need to be completely identical. For example, in a hub-and-spoke VPN topology, R1 (the hub) may use a single crypto ACL to reach both R2 and R3’s networks, while R2 and R3 may only need to mirror the portion relevant to R1. As long as there is at least one common entry mirrored on both sides for a specific pair of routers, the tunnel can be successfully negotiated. Any unmatched entries are ignored during negotiation.

This flexibility is crucial in real-world deployments where full mirroring might not be practical or necessary. The key takeaway is that each router only needs to have mirrored entries for the specific traffic selectors it is negotiating with its peer, ensuring that secure, encrypted communication can still be established within asymmetric topologies.
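The mirroring rule for the hub-and-spoke case can be illustrated with a short Python sketch. The addressing (10.1.1.0/24 behind R1, 10.2.2.0/24 behind R2, 10.3.3.0/24 behind R3) and the helper names are assumptions made for the example, and each crypto ACL is reduced to a list of (source, destination) pairs.

```python
from ipaddress import ip_network


def acl(*pairs):
    """Build a crypto ACL as a list of (source, destination) networks."""
    return [(ip_network(src), ip_network(dst)) for src, dst in pairs]


def mirrored_entries(local_acl, peer_acl):
    """Return the local entries for which the peer holds the exact mirror."""
    peer = set(peer_acl)
    return [(src, dst) for src, dst in local_acl if (dst, src) in peer]


# The hub carries one ACL covering both spokes; each spoke mirrors only
# the entry that concerns its own network.
r1 = acl(("10.1.1.0/24", "10.2.2.0/24"), ("10.1.1.0/24", "10.3.3.0/24"))
r2 = acl(("10.2.2.0/24", "10.1.1.0/24"))
r3 = acl(("10.3.3.0/24", "10.1.1.0/24"))

print(mirrored_entries(r1, r2))  # the single R1<->R2 entry
print(mirrored_entries(r1, r3))  # the single R1<->R3 entry
print(mirrored_entries(r2, r3))  # empty: the spokes share no mirrored entry
```

Each hub-to-spoke pair has exactly one mirrored entry, which is enough for that pair to negotiate; the hub's other entry simply plays no part in that negotiation.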

The third and final step in Quick Mode, QM3, is a simple confirmation message from the initiator to the responder. It contains a hash value used to validate the exchange and confirm that both parties are synchronized. The hash is computed with the negotiated pseudo-random function (prf), keyed with the authentication key (SKEYID_a), over a single zero octet, the unique message ID, and the two nonces (N_I and N_R) that were exchanged in the earlier Quick Mode messages.

The formula for this hash looks like:
HASH(3) = prf(SKEYID_a, 0 | Message_ID | N_I | N_R)

This cryptographic hash ensures the integrity of the communication and confirms that both peers are using the correct parameters. It finalizes the Quick Mode exchange and transitions the tunnel into its operational state, ready for encrypted traffic to flow securely. While the message may seem minimal, its role is vital in securing the IPsec SA (Security Association) and providing mutual assurance that both sides are in agreement before protected data transmission begins.
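As a rough illustration, the QM3 hash can be computed in Python. RFC 2409 writes it as prf(SKEYID_a, 0 | M-ID | Ni_b | Nr_b), with a single zero octet in front of the message ID. The sketch assumes the negotiated prf is HMAC-SHA1, and all key and nonce values are random placeholders rather than output of a real Phase 1.

```python
import hashlib
import hmac
import os


def qm_hash3(skeyid_a: bytes, message_id: bytes, ni_b: bytes, nr_b: bytes) -> bytes:
    """HASH(3) = prf(SKEYID_a, 0 | M-ID | Ni_b | Nr_b), with prf = HMAC-SHA1."""
    data = b"\x00" + message_id + ni_b + nr_b
    return hmac.new(skeyid_a, data, hashlib.sha1).digest()


# Placeholder values: real ones come from Phase 1 and from QM1/QM2.
skeyid_a = os.urandom(20)
message_id = os.urandom(4)          # the ISAKMP message ID is 4 octets
ni_b, nr_b = os.urandom(16), os.urandom(16)

sent = qm_hash3(skeyid_a, message_id, ni_b, nr_b)       # initiator computes
expected = qm_hash3(skeyid_a, message_id, ni_b, nr_b)   # responder recomputes

# Constant-time comparison, as the responder would do on receipt of QM3.
assert hmac.compare_digest(sent, expected)
```

Because both nonces and the message ID feed the hash, a replayed QM3 from an earlier exchange would not verify against the current one.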

In Quick Mode, all packets (QM1 through QM3) are encrypted using the key SKEYID_e. A common misconception is that these packets travel through a tunnel; in reality, they are ordinary ISAKMP messages whose payloads are encrypted. The exchange results in two unidirectional IPsec Security Associations (SAs), one per direction, which are the logical tunnels used to transmit protected data.

The default lifetime of an IPsec SA is typically one hour (3600 seconds). This relatively short lifespan enforces periodic rekeying, which limits how much traffic any single key protects. By default, each new set of IPsec keys is derived from keying material that belongs to the ISAKMP SA (negotiated in Phase 1 of IKE), so a compromise of that SA exposes every IPsec key derived from it. Because the keys are renewed every hour, administrators often prefer not to depend on the ISAKMP SA alone for rekeying. This is where Perfect Forward Secrecy (PFS) becomes relevant. PFS mixes a fresh Diffie-Hellman secret into every rekey so that new keys do not depend solely on the original keying material, which preserves the confidentiality of session keys even if the original ISAKMP SA is compromised.
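The difference PFS makes shows up in how the IPsec keying material is derived. RFC 2409 (section 5.5) defines KEYMAT = prf(SKEYID_d, protocol | SPI | Ni_b | Nr_b) without PFS, and prepends the new Diffie-Hellman shared secret to the prf input when PFS is used. The Python sketch below assumes HMAC-SHA1 as the prf, produces only a single prf block (the RFC's expansion for longer keys is omitted), and uses made-up placeholder values throughout.

```python
import hashlib
import hmac

PROTO_IPSEC_ESP = 3  # protocol ID for ESP in the IPsec DOI


def keymat(skeyid_d, protocol, spi, ni_b, nr_b, dh_shared=b""):
    """One prf block of IPsec keying material.

    Without PFS, dh_shared is empty and the result depends only on Phase 1
    material plus public values. With PFS, the fresh Quick Mode
    Diffie-Hellman secret is prepended to the prf input.
    """
    data = dh_shared + bytes([protocol]) + spi + ni_b + nr_b
    return hmac.new(skeyid_d, data, hashlib.sha1).digest()


# Placeholder inputs for illustration only.
skeyid_d = bytes(range(20))
spi = bytes.fromhex("1a2b3c4d")
ni_b = b"initiator-nonce!"
nr_b = b"responder-nonce!"

no_pfs = keymat(skeyid_d, PROTO_IPSEC_ESP, spi, ni_b, nr_b)
with_pfs = keymat(skeyid_d, PROTO_IPSEC_ESP, spi, ni_b, nr_b,
                  dh_shared=b"\x42" * 16)

print(no_pfs.hex())
print(with_pfs.hex())
```

In the first call, everything except SKEYID_d is visible on the wire to anyone who can decrypt the Quick Mode messages, so knowing SKEYID_d is enough to rebuild the key. In the second, an attacker would also need the per-rekey Diffie-Hellman secret.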

When PFS is enabled, Quick Mode must adapt slightly. The Transform Set and Proxy IDs are still exchanged, but the endpoint also includes a new Diffie-Hellman Key Exchange (KE) payload. The prime (P) and generator (G) are not sent individually; they are fixed by the Diffie-Hellman group named in the proposal. The KE payload carries only the public value G^Xa mod P, computed from a new secret exponent Xa that never leaves the endpoint. Combined with the peer’s public value, it yields a fresh shared secret. As in every Quick Mode exchange, a new nonce (Nonce_I) is included to contribute to the randomness of the new keys. These elements are all added to the packet, forming a revised Quick Mode message.

Both sides must use the same P and G to arrive at the same shared secret and complete the Diffie-Hellman exchange. Thus, the Diffie-Hellman group (often just called the “group”) used for PFS must be configured identically on both ends of the tunnel. Administrators should statically set the PFS group on each side to avoid interoperability issues, especially when the devices are from different vendors. Any mismatch in configuration can prevent the establishment or rekeying of the tunnel. Even where a fallback is defined for peers that offer different values, such as selecting the lowest proposed group, not all vendors follow it, which leads to potential issues. Therefore, the best practice is to explicitly match the PFS group configuration on both peers to guarantee compatibility and smooth operation.
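The PFS exchange and the group check can be sketched in Python. The two groups below are toy parameters chosen only to keep the example small; they are not real IKE groups, whose primes are far larger (the standardized MODP groups such as 2, 5, or 14). The function names are hypothetical, and treating a group mismatch as an immediate failure is a simplifying assumption.

```python
import secrets

# Toy groups for illustration only, as (prime P, generator G).
TOY_GROUPS = {
    "toy-A": (2**127 - 1, 3),
    "toy-B": (2**89 - 1, 3),
}


def make_keypair(group):
    """Return (secret exponent, public value) for the given group."""
    p, g = TOY_GROUPS[group]
    x = secrets.randbelow(p - 3) + 2      # secret exponent in [2, p - 2]
    return x, pow(g, x, p)                # only the public value goes in KE


def shared_secret(group, own_secret, peer_public):
    """Combine our secret exponent with the peer's public value."""
    p, _ = TOY_GROUPS[group]
    return pow(peer_public, own_secret, p)


def negotiate_pfs(initiator_group, responder_group):
    """Run one PFS exchange; fail if the configured groups differ."""
    if initiator_group != responder_group:
        raise ValueError("PFS group mismatch: Quick Mode would not complete")
    xa, pub_a = make_keypair(initiator_group)
    xb, pub_b = make_keypair(responder_group)
    return (
        shared_secret(initiator_group, xa, pub_b),   # initiator's result
        shared_secret(responder_group, xb, pub_a),   # responder's result
    )


secret_i, secret_r = negotiate_pfs("toy-A", "toy-A")
print(secret_i == secret_r)

try:
    negotiate_pfs("toy-A", "toy-B")
except ValueError as err:
    print(err)
```

With matching groups both peers compute the same secret without ever transmitting their exponents. With different groups there is no common P and G to compute against, which is why the group is best pinned identically on both peers.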