As network engineers, we frequently navigate migrations from traditional stacking technologies like Cisco’s StackWise to more robust data center architectures like Virtual Port-Channel (vPC) on the Nexus platform. While this move offers significant improvements in high availability and bandwidth, it also introduces new design considerations.
A common scenario, recently highlighted in a community forum, involves connecting devices that don’t use LACP to a Nexus vPC domain. This raises a critical question: Should vpc orphan-port suspend be enabled?
Let’s break down the scenario, the technology, and the definitive best practice.
A user was migrating their core from a Catalyst 3850 StackWise stack to a pair of Nexus 9000 switches configured with vPC. During the planning phase, they noted that some servers and firewalls were connected to the old stack using two physical links, but without an LACP EtherChannel configured. The plan was to replicate this non-LACP, dual-homed design on the new Nexus vPC pair.
This design means a server, for example, has one link connected to the first Nexus switch (N9K-A) and a second link to the other (N9K-B). From the server’s perspective, these are two independent interfaces, likely managed by a NIC teaming or bonding driver in an active/standby or active/active mode.
From the Nexus vPC domain’s perspective, any port that is not part of a vPC port-channel is considered an orphan port. In this specific case, the server is connected to two separate orphan ports.
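On a running vPC pair, NX-OS can report which ports it currently treats as orphans, which is a quick way to find devices wired like this server. A minimal check, run on each peer (the N9K-A and N9K-B hostnames follow the example above):

```
N9K-A# show vpc orphan-ports
N9K-B# show vpc orphan-ports
```

Any server-facing interface that shows up in this list is not protected by a vPC port-channel.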
To understand the importance of the vpc orphan-port suspend command, we must first understand the problem it solves: the vPC “split-brain” condition.
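A stable vPC domain relies on two key communication paths between the peer switches: the vPC peer-link, which carries control-plane synchronization and data traffic between the peers, and the peer-keepalive link, a separate Layer 3 path used as a heartbeat to confirm that the other peer is still alive.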
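A split-brain threatens whenever the peer-link fails. As long as the keepalive link stays up, the secondary switch can see that the primary is still alive and steps aside. If the peer-link and the keepalive link fail simultaneously, neither peer can see the other, and both continue forwarding as if they were primary.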
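For reference, the peer-link and the keepalive link are both defined in a few lines of configuration. The following is only a sketch for N9K-A: the keepalive addresses (192.0.2.1 and 192.0.2.2), the use of the management VRF, and port-channel 1 are assumed values for illustration, and the physical member interfaces of the peer-link are omitted.

```
! Sketch for N9K-A; mirror on N9K-B with source and destination swapped.
! Assumed values: keepalive addresses 192.0.2.1/192.0.2.2, port-channel 1.
feature vpc
feature lacp

vpc domain 10
  peer-keepalive destination 192.0.2.2 source 192.0.2.1 vrf management

interface port-channel 1
  switchport
  switchport mode trunk
  vpc peer-link
```

Keeping the keepalive on a path that does not share fate with the peer-link is what lets each switch distinguish "the peer-link is down" from "my peer is down".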
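Here’s the sequence of events in a split-brain: the switch holding the vPC secondary role suspends its own vPC member ports, and dual-homed devices keep forwarding through the primary, so the two peers never forward the same traffic independently.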
However, this safety mechanism does not apply to orphan ports by default.
Let’s revisit our server with two independent links.
Assume the server’s NIC teaming has chosen the link to N9K-B as its active path.
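Now, a split-brain occurs. Suppose N9K-B holds the secondary role: it suspends its vPC member ports but leaves the orphan port facing the server up, even though N9K-B can no longer reach the rest of the network through those vPCs.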
The result is a traffic black hole. The server believes it has a working connection, but its traffic is being sent to a dead end. The server’s NIC teaming driver may not fail over because it doesn’t detect a link-down event.
This is precisely where vpc orphan-port suspend comes into play. When this command is configured on an orphan port’s interface, it instructs the switch to extend its split-brain protection mechanism to that port: whenever the secondary suspends its vPC member ports, it suspends the orphan port as well.
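With the command enabled, let’s replay the split-brain scenario: N9K-B, as the secondary, suspends the orphan port facing the server together with its vPC member ports, the server’s NIC sees the link go down, and the teaming driver fails over to the link on N9K-A.
Traffic flow is restored, and the black hole is avoided. The command provides a deterministic and immediate failure signal to the connected device, allowing its own high-availability mechanism to function correctly.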
1. Best Practice: Use vPC Port-Channels When Possible
The first and best recommendation is always to use LACP and configure a proper vPC port-channel to the end device if it supports it. This provides true active/active load balancing and is the most resilient design.
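As a rough sketch of what that looks like, the same lines would be applied on both N9K-A and N9K-B. The interface (Ethernet1/10), port-channel number (20), and vPC number (20) are assumed values for illustration; VLAN configuration is omitted, and the LACP feature must already be enabled.

```
! Apply on both peers. Assumed values: Ethernet1/10, port-channel 20, vPC 20.
interface ethernet 1/10
  switchport
  channel-group 20 mode active

interface port-channel 20
  switchport
  vpc 20
```

The vPC number has to match on both peers so that the two local port-channels are treated as one logical bundle toward the server.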
2. Essential Safety Net: Enable vpc orphan-port suspend
For any device that is single-homed to one vPC peer, or dual-homed without LACP as described in the user’s scenario, enabling vpc orphan-port suspend is not just a good idea—it’s a critical best practice. It acts as an essential safety net for scenarios that a standard vPC port-channel would not encounter.
Configuration:
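The command is simple, but note that it is an interface-level command: it is applied to each orphan port on both switches, not under the vPC domain. Ethernet1/10 below is a placeholder for the server-facing port:
N9K-A(config)# interface ethernet 1/10
N9K-A(config-if)# vpc orphan-port suspend
N9K-B(config)# interface ethernet 1/10
N9K-B(config-if)# vpc orphan-port suspend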
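One way to check the result afterwards, assuming the orphan port is a hypothetical Ethernet1/10, is to confirm that the command appears on the interface and to see which peer currently holds the secondary role, since that is the switch that will suspend the port:

```
N9K-B# show running-config interface ethernet 1/10
N9K-B# show vpc role
N9K-B# show vpc brief
```

If possible, test the behavior in a maintenance window by shutting the peer-link and confirming that the server’s teaming driver fails over as expected.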