Chapter 12

High Availability

This chapter covers the following topics:

Nexus OS (NX-OS) is a resilient operating system designed around high availability, not just at the system level but at the network and process levels as well. Some Nexus switches provide system-level high availability through redundant hardware such as redundant fabric, supervisor cards, and power supplies. Network-level high availability is provided by features such as virtual port-channels (vPC) and First Hop Redundancy Protocol (FHRP), which give users backup paths to fail over to if the primary path fails. NX-OS leverages various system components to provide process restartability and virtualization capability, thus providing process-level high availability. This chapter covers some of the important features and components within NX-OS that provide high availability in the network.

Bidirectional Forwarding Detection

Bidirectional Forwarding Detection (BFD) is a simple, fixed-length hello protocol used for fast failure detection. BFD provides a low-overhead, short-duration mechanism for detecting failures in the path between adjacent forwarding engines. Defined in RFC 5880 through RFC 5884, BFD supports adaptive detection times and a three-way handshake that ensures both systems are aware of any changes. BFD control packets carry the sender's desired transmit (tx) and receive (rx) intervals. For example, if a node cannot handle a high rate of BFD packets, it can advertise a large rx interval so that its neighbors do not send packets at a shorter interval. The following features make BFD a desirable protocol for failure detection:

  • Subsecond failure detection

  • Media independence (Ethernet, POS, Serial, and so on)

  • Capability to run over User Datagram Protocol (UDP), with data protocol independence (IPv4, IPv6, Label Switched Path [LSP])

  • Application independence (Interior Gateway Protocol [IGP], tunnel liveliness, Fast Reroute [FRR] trigger)

When an application (Border Gateway Protocol [BGP], Open Shortest Path First [OSPF], and so on) creates or modifies a BFD session, it provides the following information:

  • Interface handle (single-hop session)

  • Address of the neighbor

  • Local address

  • Desired interval

  • Multiplier

The product of the desired interval and the multiplier indicates the desired failure detection interval. The operational workflow of BFD for a given protocol P is as follows:

  • The user configures BFD for P on the physical interface.

  • P initiates creation of the BFD session.

  • After the BFD session is created, timers are negotiated.

  • BFD sends periodic control packets to its peer.

  • If a link failure occurs, BFD detects the failure in the desired failure detection interval (desired interval × multiplier) and informs both the peer and the local BFD client (such as BGP) of the failure.

  • The session for P goes down immediately instead of waiting for the Hold Timer to expire.
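The timer negotiation and detection-time arithmetic in the workflow above can be sketched in Python. This is a minimal illustration that follows the asynchronous-mode rules in RFC 5880, not NX-OS code; the function names and sample values are hypothetical.

```python
def negotiated_tx_interval_us(local_desired_min_tx_us: int,
                              remote_required_min_rx_us: int) -> int:
    """Interval at which the local system sends control packets.

    A system may not transmit faster than its peer is willing to receive,
    so the larger (slower) of the two values wins.
    """
    return max(local_desired_min_tx_us, remote_required_min_rx_us)


def detection_time_us(local_required_min_rx_us: int,
                      remote_desired_min_tx_us: int,
                      remote_detect_mult: int) -> int:
    """Time without a received control packet before the session is declared down."""
    return remote_detect_mult * max(local_required_min_rx_us,
                                    remote_desired_min_tx_us)


# Both peers use a 300 ms tx interval, a 300 ms rx interval, and a multiplier of 3.
tx = negotiated_tx_interval_us(300_000, 300_000)
detect = detection_time_us(300_000, 300_000, 3)
print(tx, detect)  # 300000 900000

# A peer that can accept only one packet per second slows down a 50 ms sender.
slow_tx = negotiated_tx_interval_us(50_000, 1_000_000)
print(slow_tx)  # 1000000
```

With matching timers on both sides, the detection time reduces to the simple product of interval and multiplier described earlier (300 ms × 3 = 900 ms).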

BFD runs in two modes:

  • Asynchronous mode

  • Demand mode

Note

Demand mode is not supported on Cisco platforms. In demand mode, no hello packets are exchanged after the session is established. In this mode, BFD assumes there is another way to verify connectivity between the two endpoints. Either host may still send hello packets if needed, but they are not generally exchanged.

Asynchronous Mode

Asynchronous mode is the primary mode of operation and is mandatory for BFD to function. In this mode, the two systems periodically send BFD control packets to each other. For example, packets sent by NX-1 have the source address of NX-1 and the destination address of NX-2, as Figure 12-1 shows.

Figure 12-1 BFD Asynchronous Mode

Each stream of BFD control packets is independent and does not follow a request-response cycle. If a system does not receive the configured number of packets in a row from its peer (based on the BFD timer and multiplier), it declares the session down. An adaptive failure detection time is used to avoid false failures if a neighbor is sending packets more slowly than it advertises.

BFD async packets are sent on UDP port 3784. The BFD source port must be in the range of 49152 through 65535. The BFD control packets contain the fields in Table 12-1.

Table 12-1 BFD Control Packet Fields

Control Packet Field            Description
------------------------------  ------------------------------------------------------------
Version                         Version of the BFD control header.
Diag                            A diagnostic code specifying the local system’s reason for
                                the last change in session state (for example, detection
                                time expired or echo function failed).
State                           The current BFD session state, as seen by the transmitting
                                system.
P (Poll Bit)                    If set, the transmitting system is requesting verification
                                of connectivity or a parameter change and is expecting a
                                packet with the Final (F) bit in reply.
F (Final Bit)                   If set, the transmitting system is responding to a received
                                BFD control packet that has the Poll (P) bit set.
Detect Multiplier               Detection time multiplier. The negotiated transmit interval,
                                multiplied by this value, provides the detection time for
                                the receiving system in asynchronous mode.
My Discriminator                A unique, nonzero discriminator value generated by the
                                transmitting system, used to demultiplex multiple BFD
                                sessions between the same pair of systems.
Your Discriminator              The discriminator received from the corresponding remote
                                system. This field reflects back the received value of My
                                Discriminator; if that value is unknown, this field is zero.
Desired Min TX Interval         The minimum interval, in microseconds, that the local system
                                wants to use when transmitting BFD control packets.
Required Min RX Interval        The minimum interval, in microseconds, between received BFD
                                control packets that this system is capable of supporting.
Required Min Echo RX Interval   The minimum interval, in microseconds, between received BFD
                                echo packets that this system is capable of supporting.
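The fields in Table 12-1 map onto the 24-byte mandatory section of a BFD control packet as laid out in RFC 5880. The following Python sketch decodes that section; the function name is hypothetical, and the sample packet is built locally from values resembling those shown in this chapter's outputs rather than captured from a device.

```python
import struct

# Session state values carried in the 2-bit State field (RFC 5880).
STATE_NAMES = {0: "AdminDown", 1: "Down", 2: "Init", 3: "Up"}


def parse_bfd_control(data: bytes) -> dict:
    """Decode the mandatory 24-byte section of a BFD control packet."""
    if len(data) < 24:
        raise ValueError("BFD control packet needs at least 24 bytes")
    (vers_diag, flags, detect_mult, length,
     my_disc, your_disc, min_tx, min_rx, min_echo_rx) = struct.unpack(
        "!BBBBIIIII", data[:24])
    return {
        "version": vers_diag >> 5,          # top 3 bits
        "diag": vers_diag & 0x1F,           # low 5 bits
        "state": STATE_NAMES[flags >> 6],   # top 2 bits of the flags byte
        "poll": bool(flags & 0x20),
        "final": bool(flags & 0x10),
        "detect_mult": detect_mult,
        "length": length,
        "my_discriminator": my_disc,
        "your_discriminator": your_disc,
        "desired_min_tx_us": min_tx,
        "required_min_rx_us": min_rx,
        "required_min_echo_rx_us": min_echo_rx,
    }


# Illustrative packet: version 1, diag 0, state Up, multiplier 3, length 24.
sample = struct.pack("!BBBBIIIII", 1 << 5, 3 << 6, 3, 24,
                     1107296259, 1090519044, 300000, 300000, 50000)
packet = parse_bfd_control(sample)
print(packet["state"], packet["detect_mult"], packet["my_discriminator"])
# Up 3 1107296259
```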

Figure 12-2 shows the BFD control packets defined by the IETF.

Figure 12-2 BFD Control Plane Packet Format

Note

BFD supports keyed SHA-1 authentication on NX-OS beginning with Release 5.2.

Asynchronous Mode with Echo Function

Asynchronous mode with echo function is designed to test only the forwarding path, not the host stack on the remote system. The echo function is activated only after the BFD session is up. BFD echo packets are sent in such a way that the other end simply loops them back through its forwarding path. For example, an echo packet sent by NX-1 has both its source and destination address set to an address belonging to NX-1 (see Figure 12-3).

Figure 12-3 BFD Asynchronous Mode with Echo Function

Because echo packets do not require application or host stack processing on the remote end, this function can be used with aggressive detection timers. Another benefit of the echo function is that the sender has complete control of the response time. For the echo function to work, the remote node must also support it. BFD echo packets are sent as UDP packets with source and destination port 3785.

Configuring and Verifying BFD Sessions

To enable BFD on the Nexus device, configure the command feature bfd. When the feature is enabled, the switch prints a notification to disable Internet Control Message Protocol (ICMP) and ICMPv6 redirects on all BFD-enabled IPv4 and IPv6 interfaces. Example 12-1 displays the terminal message printed when the BFD feature is enabled.

Example 12-1 Enabling BFD Feature

NX-1(config)# feature bfd
Please disable the ICMP / ICMPv6 redirects on all IPv4 and IPv6 interfaces
running BFD sessions using the command below
'no ip redirects '
'no ipv6 redirects '

BFD configuration must be enabled under the routing protocol configuration and also under the interface that will be participating in BFD. To enable configuration under the routing protocol (for instance, OSPF), use the command bfd under the router ospf configuration. Under the interface, two important BFD commands are defined:

  • BFD interval

  • BFD echo function

The BFD interval can be defined both under the interface and in global configuration mode. It is defined using the command bfd interval tx-interval min_rx rx-interval multiplier number. The BFD echo function is enabled by default. To disable or enable the BFD echo function, use the command [no] bfd echo. Example 12-2 illustrates the configuration for enabling BFD for OSPF.

Example 12-2 Configuring BFD for OSPF

NX-1
NX-1(config)# int e4/1
NX-1(config-if)# no ip redirects
NX-1(config-if)# no ipv6 redirects
NX-1(config-if)# bfd interval 300 min_rx 300 multiplier 3
NX-1(config-if)# ip ospf bfd
NX-1(config-if)# no bfd echo
NX-1(config-if)# exit
NX-1(config)# router ospf 100
NX-1(config-router)# bfd

Note

To enable BFD for other routing protocols, refer to the Cisco documentation for the configuration on different Nexus devices.

When BFD is enabled, a BFD session is established. Use the command show bfd neighbors [detail] to verify the status of BFD. The show bfd neighbors command displays the state of the BFD neighbor, along with the interface, local and remote discriminators, and Virtual Routing and Forwarding (VRF) details. The output with the detail keyword displays all the fields that are part of the BFD control packet, which is useful when debugging to see whether a mismatch could be causing the BFD session to flap. Ensure that the State bit is set to Up instead of AdminDown. The output also shows whether the echo function is enabled or disabled. Example 12-3 displays the output of the command show bfd neighbors detail.

Example 12-3 Verifying BFD Neighbors

NX-1# show bfd neighbors detail

OurAddr    NeighAddr     LD/RD                 RH/RS  Holdown(mult) State  Int     Vrf                       
10.1.12.1  10.1.12.2  1090519044/1107296259     Up     667(3)        Up   Eth4/1  default

Session state is Up and not using echo function
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 300000 us, MinRxInt: 300000 us, Multiplier: 3
Received MinRxInt: 300000 us, Received Multiplier: 3
Holdown (hits): 900 ms (0), Hello (hits): 300 ms (47)
Rx Count: 47, Rx Interval (ms) min/max/avg: 0/1600/260 last: 134 ms ago
Tx Count: 47, Tx Interval (ms) min/max/avg: 236/236/236 last: 136 ms ago
Registered protocols:  ospf
Uptime: 0 days 0 hrs 36 mins 9 secs
Last packet: Version: 1                - Diagnostic: 0  
             State bit: Up             - Demand bit: 0  
             Poll bit: 0               - Final bit: 0  
             Multiplier: 3             - Length: 24  
             My Discr.: 1107296259     - Your Discr.: 1090519044  
             Min tx interval: 300000   - Min rx interval: 300000  
             Min Echo interval: 50000  - Authentication bit: 0
Hosting LC: 4, Down reason: None, Reason not-hosted: None
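When many sessions need checking, the timer lines of the detail output can be extracted with a short script. The Python sketch below parses a few lines copied from Example 12-3; the function name is hypothetical, and the check that Holdown equals Hello multiplied by the received multiplier is an observation drawn from the sample outputs in this chapter, not a documented guarantee.

```python
import re

SAMPLE = """\
Session state is Up and not using echo function
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 300000 us, MinRxInt: 300000 us, Multiplier: 3
Received MinRxInt: 300000 us, Received Multiplier: 3
Holdown (hits): 900 ms (0), Hello (hits): 300 ms (47)
"""


def summarize_bfd_detail(text: str) -> dict:
    """Pull session state and timers out of 'show bfd neighbors detail' text."""
    state = re.search(r"Session state is (\w+)", text)
    local = re.search(
        r"^MinTxInt: (\d+) us, MinRxInt: (\d+) us, Multiplier: (\d+)",
        text, re.MULTILINE)
    remote = re.search(
        r"Received MinRxInt: (\d+) us, Received Multiplier: (\d+)", text)
    timers = re.search(
        r"Holdown \(hits\): (\d+) ms \(\d+\), Hello \(hits\): (\d+) ms", text)
    if not (state and local and remote and timers):
        raise ValueError("unexpected output format")
    holdown_ms, hello_ms = int(timers.group(1)), int(timers.group(2))
    remote_mult = int(remote.group(2))
    return {
        "state": state.group(1),
        "local_min_tx_us": int(local.group(1)),
        "local_min_rx_us": int(local.group(2)),
        "local_multiplier": int(local.group(3)),
        "remote_min_rx_us": int(remote.group(1)),
        "remote_multiplier": remote_mult,
        "holdown_ms": holdown_ms,
        "hello_ms": hello_ms,
        # True when the hold-down timer equals hello interval x multiplier.
        "holdown_matches": holdown_ms == hello_ms * remote_mult,
    }


summary = summarize_bfd_detail(SAMPLE)
print(summary["state"], summary["holdown_ms"], summary["holdown_matches"])
# Up 900 True
```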

Before troubleshooting any BFD-related issue, it is important to verify the state of the feature. This is done by using the command show system internal feature-mgr feature feature-name current status. If a problem arises with the process (for instance, the process is not running or has crashed), the state does not show as Running. Example 12-4 displays the state of the BFD feature. Here, BFD is currently in the Running state.

Example 12-4 BFD Feature Status

NX-1# show system internal feature-mgr feature bfd current status
Feature Name       State      Feature   ID UUID   SAP     PID      Service State
----------------   --------   ------    -------   ------  ------   --------------
bfd                enabled    87        706       121     2574     Running

As with other features, BFD maintains internal event-history logs that are useful in debugging state machine-related issues or BFD flaps. The event-history for BFD provides various command-line options. To view the BFD event-history, use the command show system internal bfd event-history [all | errors | logs | msgs | session [discriminator]]. The all option shows all the event-history (that is, all the events and error event-history logs). The errors option shows only the BFD-related errors. The logs option shows all the events for BFD. The msgs option shows BFD-related messages, and the session option shows the errors, log messages, and app-events for a particular session.

Example 12-5 displays the BFD event-history logs for a BFD session hosted on an interface on module 4 with the discriminator 0x41000004. This example also helps you understand the information exchange and steps the system goes through in bringing up a BFD session. These are the steps, listed in sequence:

Step 1. The session begins with an Admin Down state.

Step 2. The BFD client (BFDC) adds a BFD session with the interface and IP addresses of the devices between which the session will be established.

Step 3. The BFD component sends an MTS message to the BFDC component on the line card.

Step 4. BFD sends a received session notification to its clients.

Note

The BFD process runs on the supervisor, whereas the BFDC runs on the line card.

Example 12-5 BFD Event-History Logs

NX-1# show system internal bfd event-history logs

1) Event:E_DEBUG, length:95, at 686796 usecs after Sat Oct 28 14:07:10 2017
    [102] bfd_mts_send_msg_to_bfdc(4848): opc 116747 length 36 sent to host_module 4 rrtok 0x27de77d


2) Event:E_DEBUG, length:95, at 676629 usecs after Sat Oct 28 13:14:38 2017
    [102] bfd_mts_send_msg_to_bfdc(4848): opc 116747 length 36 sent to host_module 4 rrtok 0x27ceb4e


3) Event:E_DEBUG, length:95, at 506685 usecs after Sat Oct 28 13:14:11 2017
    [102] bfd_mts_send_msg_to_bfdc(4848): opc 116747 length 36 sent to host_module 4 rrtok 0x27ce825


4) Event:E_DEBUG, length:106, at 92550 usecs after Sat Oct 28 13:14:08 2017
    [102] bfd_mts_sess_change_state_notif_cb(2379): Received sess 0x41000004 notif 3 reason_code No Diagnostic


5) Event:E_DEBUG, length:151, at 92524 usecs after Sat Oct 28 13:14:08 2017
    [102] bfd_mts_sess_change_state_notif_cb(2366): notif 0x41000004 3: [1 if Eth4/1 0x1a180000 iod 0x26 10c010a:0:0:0=10.1.12.1 -> 20c010a:0:0:0=10.1.12.2]

6) Event:E_DEBUG, length:96, at 315822 usecs after Sat Oct 28 13:14:06 2017
    [102] bfd_mts_send_msg_to_bfdc(4848): opc 116745 length 396 sent to host_module 4 rrtok 0x27ce7ef


7) Event:E_DEBUG, length:81, at 315599 usecs after Sat Oct 28 13:14:06 2017
    [102] bfd_fu_timer_cancel_app_client_expiry(244): disc 0x41000004 app 1090519321:1


8) Event:E_DEBUG, length:167, at 315560 usecs after Sat Oct 28 13:14:06 2017
    [102] bfd_sess_create_session(1713): Client Add 1:1090519321 to session 0x41000004  [1 if Eth4/1 0x1a180000 iod 0x26 10c010a:0:0:0=10.1.12.1 -> 20c010a:0:0:0=10.1.12.2]

9) Event:E_DEBUG, length:114, at 399344 usecs after Sat Oct 28 13:14:02 2017
    [102] bfd_mts_sess_change_state_notif_cb(2379): Received sess 0x41000004 notif 1 reason_code Administratively Down

Example 12-6 displays detailed information about the session using the command show system internal bfd event-history session discriminator. The discriminator is the local discriminator (LD) from the show bfd neighbors detail output, which also appears as Your Discr. in the last packet received from the peer. Convert this decimal value to hex, as shown in Example 12-6, and use it with the event-history command. The event-history session command displays the errors, the logs (such as parameters exchanged and state changes), and the app-events related to a given BFD session.

Example 12-6 BFD Session-Based Event-History

NX-1# hex 1090519044
0x41000004

NX-1# show system internal bfd event-history session 0x41000004

Start of errors for session 0x41000004
1:1365  292509 usecs after Sat Oct 28 13:13:15 2017
        : Code 0x1 0x0 0x0 0x0

End of errors for session 0x41000004

Start of Logs for session 0x41000004
1:2455  612556 usecs after Sat Oct 28 13:14:08 2017
        : Session active params changed: State 3(Up), TX(300000), RX(300000), Mult(3)
2:2455  332770 usecs after Sat Oct 28 13:14:08 2017
        : Session active params changed: State 3(Up), TX(300000), RX(300000), Mult(3)
3:649  92566 usecs after Sat Oct 28 13:14:08 2017
        : Session Up
4:628  92526 usecs after Sat Oct 28 13:14:08 2017
        : Session state changed: 1(Down) -> 3(Up), New diag: 0(No Diagnostic), After: 6 secs
5:2523  92282 usecs after Sat Oct 28 13:14:08 2017
        : Session remote disc changed: 0(0x0) -> 1107296261(0x42000005)
6:2523  732472 usecs after Sat Oct 28 13:14:02 2017
        : Session remote disc changed: 1107296260(0x42000004) -> 0(0x0)
7:2523  732438 usecs after Sat Oct 28 13:14:02 2017
        : Session remote disc changed: 0(0x0) -> 1107296260(0x42000004)
8:2523  452189 usecs after Sat Oct 28 13:14:02 2017
        : Session remote disc changed: 1107296260(0x42000004) -> 0(0x0)
9:2523  452163 usecs after Sat Oct 28 13:14:02 2017
        : Session remote disc changed: 0(0x0) -> 1107296260(0x42000004)
10:707  399365 usecs after Sat Oct 28 13:14:02 2017
        : Session Down Diag 7(Administratively Down)
11:628  399285 usecs after Sat Oct 28 13:14:02 2017
        : Session state changed: 3(Up) -> 1(Down), New diag: 7(Administratively Down), After: 44 secs
12:2523  398654 usecs after Sat Oct 28 13:14:02 2017
        : Session remote disc changed: 1107296260(0x42000004) -> 0(0x0)
13:2455  49895 usecs after Sat Oct 28 13:13:19 2017
        : Session active params changed: State 3(Up), TX(300000), RX(300000), Mult(3)
14:2455  49490 usecs after Sat Oct 28 13:13:19 2017
        : Session active params changed: State 3(Up), TX(300000), RX(300000), Mult(3)
15:2455  770894 usecs after Sat Oct 28 13:13:18 2017
        : Session active params changed: State 3(Up), TX(300000), RX(300000), Mult(3)
16:649  769776 usecs after Sat Oct 28 13:13:18 2017
        : Session Up
17:628  769732 usecs after Sat Oct 28 13:13:18 2017
        : Session state changed: 1(Down) -> 3(Up), New diag: 0(No Diagnostic), After: 3 secs
18:2523  59514 usecs after Sat Oct 28 13:13:17 2017
        : Session remote disc changed: 0(0x0) -> 1107296260(0x42000004)
19:1396  347952 usecs after Sat Oct 28 13:13:15 2017
        : ACL installed
20:602  293945 usecs after Sat Oct 28 13:13:15 2017
        : Session installed on LC 4
21:112  292529 usecs after Sat Oct 28 13:13:15 2017
        : Session Created if 0x1a180000 iod 38 (Eth4/1) src 10.1.12.1, dst 10.1.12.2
22:1364  292508 usecs after Sat Oct 28 13:13:15 2017
        : Code 0x1 0x0 0x0 0x0

End of Logs for session 0x41000004

Start of app-events for session 0x41000004
1:958  315615 usecs after Sat Oct 28 13:14:06 2017
        : Client Add type 1, 1090519321 in state 14
2:1709  292536 usecs after Sat Oct 28 13:13:15 2017
        : Client Add type 1, 1090519321 in state 10
3:1363  292507 usecs after Sat Oct 28 13:13:15 2017
        : Code 0x1 0x0 0x0 0x0

End of app-events for session 0x41000004
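The decimal-to-hex conversion performed by the hex command in Example 12-6 can also be done off-box, which is handy when correlating discriminators across saved logs. This is a minimal Python sketch; the helper name is hypothetical.

```python
def discriminator_to_hex(disc: int) -> str:
    """Format a decimal BFD discriminator as the 32-bit hex value used in the logs."""
    return f"0x{disc:08x}"


# Local discriminator (LD) from the show bfd neighbors detail output.
print(discriminator_to_hex(1090519044))  # 0x41000004

# The reverse direction: a remote discriminator taken from the session log.
print(int("0x42000005", 16))  # 1107296261
```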

The command show system internal bfd transition-history shows the internal state machine events that the BFD session goes through (see Example 12-7). Note that the final state of a BFD session should be BFD_SESS_ST_SESSION_UP. If the BFD session is stuck in one of the other states, this command can identify where the session is stuck.

Example 12-7 BFD Transition History Logs

NX-1# show system internal bfd transition-history

>>>>FSM: <Proto  Sess 0x41000004> has 8 logged transitions<<<<<

1) FSM:<Proto  Sess 0x41000004> Transition at 292788 usecs after Sat Oct 28 13:13:15 2017
    Previous state: [BFD_SESS_ST_INIT]
    Triggered event: [BFD_SESS_EV_INTERFACE]
    Next state: [BFD_SESS_ST_INSTALLING_SESSION]

2) FSM:<Proto  Sess 0x41000004> Transition at 293898 usecs after Sat Oct 28 13:13:15 2017
    Previous state: [BFD_SESS_ST_INSTALLING_SESSION]
    Triggered event: [BFD_SESS_EV_SESSION_INSTALL_SUCCESS]
    Next state: [BFD_SESS_ST_INSTALLING_ACL]

3) FSM:<Proto  Sess 0x41000004> Transition at 347878 usecs after Sat Oct 28 13:13:15 2017
    Previous state: [BFD_SESS_ST_INSTALLING_ACL]
    Triggered event: [BFD_SESS_EV_ACL_RESPONSE]
    Next state: [FSM_ST_NO_CHANGE]

4) FSM:<Proto  Sess 0x41000004> Transition at 347948 usecs after Sat Oct 28 13:13:15 2017
    Previous state: [BFD_SESS_ST_INSTALLING_ACL]
    Triggered event: [BFD_SESS_EV_ACL_INSTALL_SUCCESS]
    Next state: [BFD_SESS_ST_SESSION_DOWN]

5) FSM:<Proto  Sess 0x41000004> Transition at 769773 usecs after Sat Oct 28 13:13:18 2017
    Previous state: [BFD_SESS_ST_SESSION_DOWN]
    Triggered event: [BFD_SESS_EV_SESSION_UP]
    Next state: [BFD_SESS_ST_SESSION_UP]

6) FSM:<Proto  Sess 0x41000004> Transition at 399361 usecs after Sat Oct 28 13:14:02 2017
    Previous state: [BFD_SESS_ST_SESSION_UP]
    Triggered event: [BFD_SESS_EV_SESSION_DOWN]
    Next state: [BFD_SESS_ST_SESSION_DOWN]

7) FSM:<Proto  Sess 0x41000004> Transition at 315593 usecs after Sat Oct 28 13:14:06 2017
    Previous state: [BFD_SESS_ST_SESSION_DOWN]
    Triggered event: [BFD_SESS_EV_CLIENT_ADD]
    Next state: [FSM_ST_NO_CHANGE]

8) FSM:<Proto  Sess 0x41000004> Transition at 92563 usecs after Sat Oct 28 13:14:08 2017
    Previous state: [BFD_SESS_ST_SESSION_DOWN]
    Triggered event: [BFD_SESS_EV_SESSION_UP]
    Next state: [BFD_SESS_ST_SESSION_UP]

    Curr state: [BFD_SESS_ST_SESSION_UP]
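To spot a stuck session quickly in saved output, the transition history can be replayed with a short script. The Python sketch below uses three transitions copied from Example 12-7; the function name is hypothetical, and treating FSM_ST_NO_CHANGE as "remain in the previous state" is an interpretation of the sample output rather than documented behavior.

```python
import re

SAMPLE = """\
6) FSM:<Proto  Sess 0x41000004> Transition at 399361 usecs after Sat Oct 28 13:14:02 2017
    Previous state: [BFD_SESS_ST_SESSION_UP]
    Triggered event: [BFD_SESS_EV_SESSION_DOWN]
    Next state: [BFD_SESS_ST_SESSION_DOWN]

7) FSM:<Proto  Sess 0x41000004> Transition at 315593 usecs after Sat Oct 28 13:14:06 2017
    Previous state: [BFD_SESS_ST_SESSION_DOWN]
    Triggered event: [BFD_SESS_EV_CLIENT_ADD]
    Next state: [FSM_ST_NO_CHANGE]

8) FSM:<Proto  Sess 0x41000004> Transition at 92563 usecs after Sat Oct 28 13:14:08 2017
    Previous state: [BFD_SESS_ST_SESSION_DOWN]
    Triggered event: [BFD_SESS_EV_SESSION_UP]
    Next state: [BFD_SESS_ST_SESSION_UP]
"""

TRANSITION = re.compile(
    r"Previous state: \[(\w+)\]\s+"
    r"Triggered event: \[(\w+)\]\s+"
    r"Next state: \[(\w+)\]")


def final_state(text: str):
    """Replay the logged transitions and return the state the session ends in."""
    state = None
    for previous, _event, nxt in TRANSITION.findall(text):
        # Assumption: FSM_ST_NO_CHANGE leaves the session in its previous state.
        state = previous if nxt == "FSM_ST_NO_CHANGE" else nxt
    return state


state = final_state(SAMPLE)
print(state, state == "BFD_SESS_ST_SESSION_UP")
# BFD_SESS_ST_SESSION_UP True
```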

When a BFD session is configured, an access list is installed in the hardware; this is verified using the command show system internal access-list interface interface-id module slot. The relevant statistics for the hardware Access Control List (ACL) can be viewed using the command show system internal access-list input statistics module slot. Note that when BFD is enabled on an interface, the ACL is installed in the hardware for both IPv4 and IPv6. Example 12-8 illustrates the ACL programmed in the hardware for BFD on the Nexus 7000 switch.

Example 12-8 ACL for BFD in Hardware

NX-1# show system internal access-list interface ethernet 4/1 module 4
Policies in ingress direction:
         Policy type              Policy Id      Policy name
------------------------------------------------------------
    QoS                                3     
    BFD                                6     

No Netflow profiles in ingress direction


INSTANCE 0x0
---------------

  Tcam 1 resource usage:
  ----------------------
   Label_b = 0x2
   Bank 0
   ------
     IPv4 Class
       Policies:  BFD()  [Merged]
       Netflow profile: 0
       Netflow deny profile: 0
       4 tcam entries
     IPv6 Class
       Policies:  BFD()  [Merged]
       Netflow profile: 0
       Netflow deny profile: 0
       2 tcam entries

   0 l4 protocol cam entries
   0 mac etype/proto cam entries
   2 lous
   0 tcp flags table entries
   1 adjacency entries

! Output omitted for brevity
NX-1# show system internal access-list input statistics module 4
                VDC-1 Ethernet4/1 :
                ====================

INSTANCE 0x0
---------------

  Tcam 1 resource usage:
  ----------------------
  Label_b = 0x2
   Bank 0
   ------
     IPv4 Class
       Policies: BFD()  [Merged]
       Netflow profile: 0
       Netflow deny profile: 0
       Entries:
         [Index] Entry [Stats]
         ---------------------
  [0008:0408:0006] prec 1 redirect(0x40001)-routed udp 0.0.0.0/0 0.0.0.0/0 eq 3785 ttl eq 254  flow-label 3785  [0]
  [0009:0508:0007] prec 1 redirect(0x40001)-routed udp 0.0.0.0/0 0.0.0.0/0 eq 3784 ttl eq 255  flow-label 3784  [26874]
  [000a:0608:0008] prec 1 permit-routed ip 0.0.0.0/0 0.0.0.0/0   [1641]
  [000b:0488:0009] prec 1 permit-routed ip 0.0.0.0/0 0.0.0.0/0 fragment   [0]
     IPv6 Class
       Policies: BFD()  [Merged]
       Netflow profile: 0
       Netflow deny profile: 0
       Entries:
         [Index] Entry [Stats]
         ---------------------
 [000c:0509:000a] prec 1 redirect(0x40001)-routed udp 0x0/0 0x0/0 eq 3784 ttl eq 255  flow-label 3784  [0]
  [000d:0409:000b] prec 1 redirect(0x40001)-routed udp 0x0/0 0x0/0 eq 3785 ttl eq 254  flow-label 3785  [0]

Note

The ACL programming on the hardware is dependent on the underlying line card hardware and the Nexus platform. The behavior might differ among Nexus hardware platforms.

To enable the BFD echo function, configure the command bfd echo under the interface. When the session is configured with the echo function, the BFD session starts in asynchronous mode using a slow interval of 2 seconds. When the session is up, and if the interval specified by the client is less than 2 seconds, the echo function gets activated (assuming that the echo function is enabled on the remote peer as well).
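The activation rule described above can be written as a small predicate. This Python sketch only restates the conditions in the preceding paragraph; the function and parameter names are hypothetical and do not correspond to any NX-OS interface.

```python
SLOW_INTERVAL_MS = 2000  # slow asynchronous interval used while the session starts


def echo_function_active(session_up: bool,
                         requested_interval_ms: int,
                         local_echo_enabled: bool = True,
                         remote_echo_enabled: bool = True) -> bool:
    """Return True when the echo function would be activated for the session."""
    return (session_up
            and local_echo_enabled
            and remote_echo_enabled
            and requested_interval_ms < SLOW_INTERVAL_MS)


print(echo_function_active(True, 50))          # True
print(echo_function_active(False, 50))         # False: session is not up yet
print(echo_function_active(True, 50, remote_echo_enabled=False))  # False
```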

Example 12-9 illustrates the configuration of the BFD echo function between NX-1 and NX-2 and the changes in the show bfd neighbors detail command output after the BFD session is established.

Example 12-9 BFD with Echo Function Configuration and Verification

NX-1(config)# interface Ethernet4/1
NX-1(config-if)# bfd echo
NX-1(config-if)# bfd interval 50 min_rx 50 multiplier 3
NX-1# show bfd neighbors detail

OurAddr    NeighAddr     LD/RD                 RH/RS  Holdown(mult) State  Int     Vrf                       
10.1.12.1  10.1.12.2  1090519047/1107296265     Up     667(3)        Up   Eth4/1  default

Session state is Up and using echo function with 50 ms interval
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 50000 us, MinRxInt: 2000000 us, Multiplier: 3
Received MinRxInt: 2000000 us, Received Multiplier: 3
Holdown (hits): 6000 ms (0), Hello (hits): 2000 ms (1690)
Rx Count: 1690, Rx Interval (ms) min/max/avg: 0/1880/1807 last: 268 ms ago
Tx Count: 1690, Tx Interval (ms) min/max/avg: 1806/1806/1806 last: 269 ms ago
Registered protocols:  ospf
Uptime: 0 days 0 hrs 50 mins 52 secs
Last packet: Version: 1                - Diagnostic: 0  
             State bit: Up             - Demand bit: 0  
             Poll bit: 0               - Final bit: 0  
             Multiplier: 3             - Length: 24  
             My Discr.: 1107296265     - Your Discr.: 1090519047  
             Min tx interval: 50000    - Min rx interval: 2000000  
             Min Echo interval: 50000  - Authentication bit: 0
Hosting LC: 4, Down reason: None, Reason not-hosted: None

If a failure occurs, NX-OS logs a syslog message for BFD failure along with a reason code for the failure and the session discriminator value. Example 12-10 displays the syslog message of a BFD failure on NX-1. Notice that, in this case, the reason is 0x2, which indicates “Echo Function Failed.”

Example 12-10 BFD Failure Log

02:42:01 NX-1  BFD-5-SESSION_STATE_DOWN  BFD session  1107296259  to neighbor 10.1.12.2 on interface Eth4/1 has gone down. Reason: 0x2.
02:44:01 NX-1  BFD-5-SESSION_STATE_DOWN  BFD session  1090519047  to neighbor 10.1.12.2 on interface Eth4/1 has gone down. Reason: 0x2.

Table 12-2 lists the BFD failure reason codes, along with their descriptions.

Table 12-2 BFD Failure Reason Codes and Description

Reason Code   Description
-----------   --------------------------------------
0             No Diag
1             Control packet detection timer expired
2             Echo function failed
3             Neighbor signaled session down
4             Forwarding plane reset
5             Path down
6             Concatenated path down
7             Administratively down
8             Reverse concatenated path down
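The reason codes in Table 12-2 can be applied to syslog messages automatically. The Python sketch below decodes a message copied from Example 12-10; the regular expression is written against that sample only, and the function name is hypothetical.

```python
import re

# Reason codes and descriptions from Table 12-2.
REASON_CODES = {
    0: "No Diag",
    1: "Control packet detection timer expired",
    2: "Echo function failed",
    3: "Neighbor signaled session down",
    4: "Forwarding plane reset",
    5: "Path down",
    6: "Concatenated path down",
    7: "Administratively down",
    8: "Reverse concatenated path down",
}

LOG_PATTERN = re.compile(
    r"BFD session\s+(\d+)\s+to neighbor (\S+) on interface (\S+) "
    r"has gone down\. Reason: (0x[0-9a-fA-F]+)")


def explain_bfd_down(line: str):
    """Return the session details and decoded reason, or None if no match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    disc, neighbor, interface, reason = match.groups()
    code = int(reason, 16)
    return {
        "discriminator": int(disc),
        "neighbor": neighbor,
        "interface": interface,
        "reason": REASON_CODES.get(code, "Unknown reason code"),
    }


line = ("02:42:01 NX-1  BFD-5-SESSION_STATE_DOWN  BFD session  1107296259  "
        "to neighbor 10.1.12.2 on interface Eth4/1 has gone down. Reason: 0x2.")
result = explain_bfd_down(line)
print(result["interface"], result["reason"])
# Eth4/1 Echo function failed
```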

Note

In case of any BFD failure event, capturing show tech bfd soon after the BFD flap event is recommended. It is also necessary to capture the show tech feature output for the relevant feature with which BFD is associated; for instance, in case of OSPF, this is show tech ospf.

Nexus also supports BFD over L3 port-channels and BFD on SVI interfaces over L2 port-channels. In both cases, Link Aggregation Control Protocol (LACP) must be enabled for the port-channel interface. BFD is enabled on L3 port-channel interfaces using one of two methods:

  • BFD per-link

  • Micro BFD session

To enable BFD per-link, use the command bfd per-link under the port-channel interface, along with the no ip redirects command. This enables BFD for the client protocol running on that L3 port-channel interface. In per-link mode, BFD creates a session for each link in the port-channel and provides the aggregated result to the client protocol. Example 12-11 demonstrates per-link BFD configuration on a port-channel interface and its verification using the show bfd neighbors [detail] command. Use the command show port-channel summary to verify the member ports of the port-channel interface.

Example 12-11 BFD over Port-Channel per-Link Configuration

NX-1
NX-1(config)# interface port-channel1
NX-1(config-if)# no ip redirects
NX-1(config-if)# bfd per-link
NX-1(config-if)# ip router ospf 100 area 0.0.0.0
NX-1(config-if)# ip ospf network point-to-point
NX-1(config-if)# exit
NX-1(config)# router ospf 100
NX-1(config-router)# bfd

NX-1# show port-channel summary
! Output omitted for brevity
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
1     Po1(RU)     Eth      LACP      Eth4/1(P)

NX-1# show bfd neighbors details

OurAddr     NeighAddr    LD/RD               RH/RS  Holdown(mult)  State  Int     Vrf
10.1.12.1   10.1.12.2  1090519048/0          Up    N/A(3)         Up      Po1     default

Session state is Up
Local Diag: 0
Registered protocols:  ospf
Uptime: 0 days 0 hrs 0 mins 9 secs
Hosting LC: 0, Down reason: None, Reason not-hosted: None
Parent session, please check port channel config for member info

OurAddr     NeighAddr    LD/RD               RH/RS  Holdown(mult)  State  Int     Vrf
10.1.12.1   10.1.12.2  1090519049/1107296267  Up    148(3)         Up     Eth4/1 default

Session state is Up and not using echo function
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 50000 us, MinRxInt: 50000 us, Multiplier: 3
Received MinRxInt: 50000 us, Received Multiplier: 3
Holdown (hits): 150 ms (0), Hello (hits): 50 ms (176)
Rx Count: 176, Rx Interval (ms) min/max/avg: 0/2133/72 last: 1 ms ago
Tx Count: 176, Tx Interval (ms) min/max/avg: 48/48/48 last: 2 ms ago
Registered protocols:  
Uptime: 0 days 0 hrs 0 mins 9 secs
Last packet: Version: 1                - Diagnostic: 0  
             State bit: Up             - Demand bit: 0  
             Poll bit: 0               - Final bit: 0  
             Multiplier: 3             - Length: 24  
             My Discr.: 1107296267     - Your Discr.: 1090519049  
             Min tx interval: 50000    - Min rx interval: 50000  
             Min Echo interval: 50000  - Authentication bit: 0  
Hosting LC: 4, Down reason: None, Reason not-hosted: None
Member session under parent interface Po1

Nexus 9000 also supports BFD on each link aggregation group (LAG) member interface, as defined in RFC 7130. This method is called an IETF micro BFD session. The echo function is not supported on micro BFD sessions. The benefit of micro BFD sessions is that if any member port goes down, that port is removed from the forwarding table, which prevents traffic disruption on that member link.

Micro BFD sessions are configured using the commands port-channel bfd track-member-link and port-channel bfd destination ip-address on an active L3 port-channel interface. Example 12-12 demonstrates micro BFD session configuration on Nexus 9000 switches N9k-1 and N9k-2.

Example 12-12 BFD over Port-Channel (Micro BFD Session Configuration)

N9k-1
N9k-1(config)# interface port-channel2
N9k-1(config-if)# port-channel bfd track-member-link
N9k-1(config-if)# port-channel bfd destination 172.16.0.1
N9k-2
N9k-2(config)# interface port-channel2
N9k-2(config-if)# port-channel bfd track-member-link
N9k-2(config-if)# port-channel bfd destination 172.16.0.0

Verification shows that a BFD session is established on each member port of the port-channel. In this method, the BFD client is the port-channel itself. Example 12-13 verifies the BFD session on the port-channel interface configured with the micro BFD session. Notice that the registered protocol is eth_port_channel.

Example 12-13 BFD over Port-Channel

N9k-1
N9k-1# show bfd neighbors

OurAddr     NeighAddr    LD/RD               RH/RS  Holdown(mult)  State  Int       Vrf               
172.16.0.0  172.16.0.1  1090519044/0          Up    N/A(3)         Up     Po2    default
172.16.0.0  172.16.0.1  1090519045/1090519045 Up    121(3)         Up     Eth1/3 default
N9k-1# show bfd neighbors details

OurAddr     NeighAddr    LD/RD               RH/RS  Holdown(mult)  State  Int       Vrf               
172.16.0.1  172.16.0.0  1090519044/0          Up    N/A(3)         Up     Po2    default

Session state is Up
Local Diag: 0
Registered protocols:  eth_port_channel
Uptime: 0 days 0 hrs 9 mins 56 secs
Hosting LC: 0, Down reason: None, Reason not-hosted: None
Parent session, please check port channel config for member info

172.16.0.1  172.16.0.0  1090519045/1090519045 Up    121(3)         Up     Eth1/3 default


Session state is Up and not using echo function
Local Diag: 0, Demand mode: 0, Poll bit: 0, Authentication: None
MinTxInt: 50000 us, MinRxInt: 50000 us, Multiplier: 3
Received MinRxInt: 50000 us, Received Multiplier: 3
Holdown (hits): 150 ms (0), Hello (hits): 50 ms (12619)
Rx Count: 12357, Rx Interval (ms) min/max/avg: 1/1987/48 last: 25 ms ago
Tx Count: 12619, Tx Interval (ms) min/max/avg: 47/47/47 last: 32 ms ago
Registered protocols:  eth_port_channel
Uptime: 0 days 0 hrs 9 mins 56 secs
Last packet: Version: 1                - Diagnostic: 0  
             State bit: Up             - Demand bit: 0  
             Poll bit: 0               - Final bit: 0  
             Multiplier: 3             - Length: 24  
             My Discr.: 1090519045     - Your Discr.: 1090519045  
             Min tx interval: 50000    - Min rx interval: 50000  
             Min Echo interval: 50000  - Authentication bit: 0  
Hosting LC: 1, Down reason: None, Reason not-hosted: None
Member session under parent interface Po2
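
The Holdown value of 150 ms in the member session output follows from the negotiated timers. As a minimal sketch in Python (the function and parameter names are illustrative and not part of NX-OS), RFC 5880 asynchronous mode computes the local detection time as the multiplier received from the neighbor times the larger of the local required minimum rx interval and the neighbor's desired minimum tx interval:

```python
def bfd_detection_time_us(local_min_rx_us, remote_min_tx_us, remote_multiplier):
    """Detection time for BFD asynchronous mode, in microseconds.

    All three inputs are assumed to be taken from the session details:
    the local MinRxInt, and the Min tx interval and Multiplier carried
    in the last packet received from the neighbor.
    """
    negotiated_interval_us = max(local_min_rx_us, remote_min_tx_us)
    return remote_multiplier * negotiated_interval_us


# Values from Example 12-13: 50000 us intervals and a multiplier of 3.
holdown_us = bfd_detection_time_us(50000, 50000, 3)
print(holdown_us // 1000, "ms")
```

If the local node asked for a slower rate (a larger MinRxInt), the same function shows the detection time growing accordingly, which is the adaptive behavior described at the start of this chapter.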

Note

In case of any issues with a per-link BFD or micro BFD session, collect the show tech bfd and show tech lacp all output and share the captured logs with Cisco Technical Assistance Center (TAC) for investigation purposes.

Nexus High Availability

NX-OS was built around the pillars of high availability (HA) and virtualization. Because Nexus devices are designed for data centers and enterprises, the NX-OS architecture provides several features and capabilities that help maintain high availability and reduce downtime in such environments:

  • Stateful switchover (SSO)

  • In-service software upgrade (ISSU)

  • Graceful insertion and removal

This section discusses these features in detail and shows how they provide HA capability to Nexus devices.

Stateful Switchover

Various Nexus platforms (including the Nexus 7000, Nexus 7700, and Nexus 9500) have support for fabric as well as supervisor redundancy. The benefit of the hardware-based redundancy is that if the active hardware (fabric or supervisor card) fails, the standby hardware takes over the role of active and prevents any kind of traffic and service disruption. In addition, some of the software-based HA features, such as nonstop routing (NSR), nonstop forwarding (NSF), and graceful restart (GR), are leveraged only when a redundant supervisor card is available to synchronize the state to the standby supervisor and seamlessly take over the role of active supervisor when the old active supervisor fails.

With redundant hardware, the supervisor cards must stay in active/ha-standby mode. The supervisor states are verified using the command show module. This command displays all the supervisor cards, line cards, and fabric cards present in the chassis. Example 12-14 displays the show module output on the Nexus 7000 switch. Notice that, in the output, the supervisor card in slot 1 is in ha-standby state and the one in slot 2 is in active state.

Example 12-14 show module Command Output

NX-1# show module
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    0      Supervisor Module-2                 N7K-SUP2E          ha-standby
2    0      Supervisor Module-2                 N7K-SUP2E          active *
3    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
4    32     1/10 Gbps Ethernet Module           N7K-F132XP-15      powered-dn
5    48     10/100/1000 Mbps Ethernet XL Module N7K-M148GT-11L     ok
6    48     1/10 Gbps Ethernet Module           N7K-F248XP-25E     ok
7    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
8    48     1/10 Gbps Ethernet Module           N7K-F248XP-25      ok
! Output omitted for brevity
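
When this check has to be repeated across many chassis, the supervisor rows can be pulled out of the captured text. The following Python sketch assumes the column layout shown in Example 12-14; the function names are hypothetical, and the parsing is only as reliable as that layout:

```python
SHOW_MODULE = """\
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    0      Supervisor Module-2                 N7K-SUP2E          ha-standby
2    0      Supervisor Module-2                 N7K-SUP2E          active *
3    32     10 Gbps Ethernet XL Module          N7K-M132XP-12L     ok
4    32     1/10 Gbps Ethernet Module           N7K-F132XP-15      powered-dn
"""


def supervisor_states(show_module_text):
    """Return {slot: status} for rows whose module type contains 'Supervisor'."""
    states = {}
    for line in show_module_text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit() or "Supervisor" not in fields:
            continue
        if fields[-1] == "*":  # trailing marker printed next to the active row
            fields = fields[:-1]
        states[int(fields[0])] = fields[-1]
    return states


def is_redundant(states):
    """True when exactly one supervisor is active and one is ha-standby."""
    return sorted(states.values()) == ["active", "ha-standby"]


states = supervisor_states(SHOW_MODULE)
print(states, is_redundant(states))
```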

The HA state is also verified using the command show system redundancy status. When the standby supervisor is booting up, or after a switchover event when the active supervisor moves to a standby role, the ha-standby state is not achieved immediately. The standby supervisor must first synchronize its state with that of the active supervisor. This is achieved with the system manager (sysmgr) component on the active supervisor. The sysmgr component initiates a global sync (gsync) of the active supervisor state to the standby supervisor. During the synchronization process, the state is seen as HA synchronization in progress. Note that the standby supervisor should not remain in this state for long; a prolonged synchronization can indicate a failure or another underlying issue.

When all the components and states are synchronized between the active and standby supervisor, the Module-Manager is informed that the standby supervisor is up. The Module-Manager then informs all the software components on active supervisor about the availability of the standby supervisor and configures them. This event is known as the Standby Sup Insertion Sequence. Any error faced during this sequence results in a reboot of the standby supervisor.

Example 12-15 displays the system redundancy status. An ideal state for redundancy is active/standby state. In this example, the standby supervisor is currently synchronizing its states with the active supervisor in slot 2.

Example 12-15 System Redundancy State

NX-1# show system redundancy status
Redundancy mode
---------------
      administrative:   HA
         operational:   HA

This supervisor (sup-2)
-----------------------
    Redundancy state:   Active
    Supervisor state:   Active
      Internal state:   Active with HA standby

Other supervisor (sup-1)
------------------------
    Redundancy state:   Standby
    Supervisor state:   HA standby
      Internal state:   HA synchronization in progress
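
Because a standby that stays in the synchronizing state is a warning sign, it can be useful to extract the per-supervisor fields from this output and test for that condition. The following Python sketch is one way to do it, assuming the section headings and key: value layout shown in Example 12-15; the helper names are hypothetical:

```python
REDUNDANCY_STATUS = """\
Redundancy mode
---------------
      administrative:   HA
         operational:   HA

This supervisor (sup-2)
-----------------------
    Redundancy state:   Active
    Supervisor state:   Active
      Internal state:   Active with HA standby

Other supervisor (sup-1)
------------------------
    Redundancy state:   Standby
    Supervisor state:   HA standby
      Internal state:   HA synchronization in progress
"""


def parse_redundancy(text):
    """Return {'This supervisor': {...}, 'Other supervisor': {...}}."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("This supervisor", "Other supervisor")):
            current = stripped.split(" (")[0]
            sections[current] = {}
        elif current and ":" in stripped:
            key, value = (part.strip() for part in stripped.split(":", 1))
            sections[current][key] = value
    return sections


def standby_still_syncing(sections):
    """True while the other supervisor reports synchronization in progress."""
    internal = sections.get("Other supervisor", {}).get("Internal state", "")
    return "synchronization in progress" in internal


print(standby_still_syncing(parse_redundancy(REDUNDANCY_STATUS)))
```

A monitoring script could call standby_still_syncing on successive captures and raise an alert if it keeps returning True.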

Note

In case of failure during Standby Sup Insertion Sequence, collect the following commands to help identify where the failure has occurred:

  • show logging [nvram]

  • show module internal exception-log

  • show system reset-reason

  • show module internal event-history module slot

On the Nexus 7000 or Nexus 7700 series platform, where virtual device context (VDC) is supported, the HA state should also be maintained across all VDCs configured on the system. This is verified using the command show system redundancy ha status. Example 12-16 verifies the system redundancy state across all VDCs.

Example 12-16 System Redundancy HA Status

NX-1# show system redundancy ha status
VDC No    This supervisor                         Other supervisor                     
------    ---------------                         ---------------                      
vdc 1     Active with HA standby                  HA standby                         
vdc 2     Active with HA standby                  HA standby
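
On a chassis with many VDCs, the same idea applies per row. The following Python sketch flags any VDC whose supervisor pair is not in the expected state, assuming the three-column layout of Example 12-16 with columns separated by two or more spaces; the function name is hypothetical:

```python
import re

HA_STATUS = """\
VDC No    This supervisor                         Other supervisor
------    ---------------                         ---------------
vdc 1     Active with HA standby                  HA standby
vdc 2     Active with HA standby                  HA standby
"""


def vdcs_not_in_ha(text):
    """Return the VDC labels whose supervisors are not active/HA standby."""
    problems = []
    for line in text.splitlines():
        columns = re.split(r"\s{2,}", line.strip())
        if len(columns) == 3 and re.fullmatch(r"vdc \d+", columns[0]):
            if (columns[1], columns[2]) != ("Active with HA standby", "HA standby"):
                problems.append(columns[0])
    return problems


print(vdcs_not_in_ha(HA_STATUS))
```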

Synchronization is achieved using the sysmgr component, so the state information can also be verified using the sysmgr state command show system internal sysmgr state. In this command, verify that the sysmgr state is set to Active/HotStandby, as shown in Example 12-17. This command also shows the current state of the active supervisor card, which is set to Active (SYSMGR_CARDSTATE_ACTIVE) here.

Example 12-17 System Manager State Information

NX-1# show system internal sysmgr state

The master System Manager has PID 4967 and UUID 0x1.
Last time System Manager was gracefully shutdown.
The state is SRV_STATE_MASTER_ACTIVE_HOTSTDBY entered at time Thu Oct 26 13:20:54 2017.

The '-b' option (disable heartbeat) is currently disabled.

The '-n' (don't use rlimit) option is currently disabled.

Hap-reset is currently enabled.

Process restart capability is currently disabled.

Watchdog checking is currently enabled.

Watchdog kgdb setting is currently disabled.


        Debugging info:

The trace mask is 0x00000000, the syslog priority enabled is 3.
The '-d' option is currently disabled.
The statistics generation is currently enabled.


        HA info:

slotid = 2    supid = 0
cardstate = SYSMGR_CARDSTATE_ACTIVE .
cardstate = SYSMGR_CARDSTATE_ACTIVE (hot switchover is configured enabled).

Configured to use the real platform manager.
Configured to use the real redundancy driver.
Redundancy register: this_sup = RDN_ST_AC, other_sup = RDN_ST_SB.
EOBC device name: veobc.
Remote addresses:  MTS - 0x00000101/3      IP - 127.1.1.1
MSYNC done.
Remote MSYNC not done.
Module online notification received.
Local super-state is: SYSMGR_SUPERSTATE_STABLE
Standby super-state is: SYSMGR_SUPERSTATE_STABLE
Swover Reason : SYSMGR_UNKNOWN_SWOVER
Total number of Switchovers: 0
Swover threshold settings: 5 switchovers within 4800 seconds
Switchovers within threshold interval: 0
Last switchover time: 0 seconds after system start time
Cumulative time between last 0 switchovers: 0
Start done received for 1 plugins, Total number of plugins = 1


        Statistics:

Message count:           0
Total latency:           0              Max latency:             0
Total exec:              0              Max exec:                0

When the system is in HA or redundancy state, performing a switchover from the active to the standby supervisor card has little impact on services. Switchovers are usually performed when an upgrade is needed, when the Message and Transaction Service (MTS) queue is stuck, when misprogramming has occurred on the supervisor card, and so on. A manual switchover is performed using the command system switchover. When the command is executed, the standby supervisor takes over the role of active and the initial active supervisor is rebooted. Note that when the supervisor switchover happens, some protocols (stateless protocols) might experience flaps, but this does not affect forwarding. Example 12-18 demonstrates a manual switchover of the system and shows that the system remains available through the redundant supervisor module.

Example 12-18 Redundancy Switchover

NX-1 SUP-1
NX-1# system switchover
NX-1#
User Access Verification
NX-1 login:
User Access Verification
NX-1 login:
>>>
>>>
>>>
NX7k SUP BIOS version ( 2.12 ) : Build - 05/29/2013 11:58:20
PM FPGA Version : 0x00000025
Power sequence microcode revision - 0x00000009 : card type - 10156EEA0
Booting Spi Flash : Primary
  CPU Signature - 0x000106e4: Version - 0x000106e0
  CPU - 2 : Cores - 4 : HTEn - 1 : HT - 2 : Features - 0xbfebfbff
  FSB Clk - 532 Mhz :  Freq - 2140 Mhz - 2128 Mhz
  MicroCode Version : 0x00000002
  Memory - 32768 MB : Frequency - 1067 MHZ
  Loading Bootloader: Done
  IO FPGA Version   : 0x1000d
  PLX Version       : 861910b5
Bios digital signature verification - Passed
USB bootflash status : [1-1:1-1]

Reset Reason Registers: 0x1 0x0
 Filesystem type is ext2fs, partition type 0x83


              GNU GRUB  version 0.97

Autobooting bootflash:/n7000-s2-kickstart.7.3.2.D1.1.bin bootflash:/n7000-s2-dk9.7.3.2.D1.1.bin...
 Filesystem type is ext2fs, partition type 0x83
! Output omitted for brevity
NX-1 SUP-2
NX-1 login: admin
Password:


Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
! Output omitted for brevity
NX-1#

Note

During manual switchover, while the initial active supervisor is being rebooted to take over the role as standby, if the newly active supervisor crashes or reloads, it can lead to a whole system reload and cause major outages. Thus, a manual switchover should always take place during a planned maintenance window.

ISSU

Performing upgrades in any network deployment, especially in a large data center or enterprise, is challenging. In most cases, when a device needs to be upgraded, services and traffic are shifted to the backup or redundant devices, boot variables are set, and then the device is brought down using the reload command to perform the upgrade. This becomes more challenging on devices such as the Nexus 7000, with multiple VDCs running on a single box, acting as individual devices and playing different roles. To overcome the challenges of upgrades in the network, leverage the ISSU feature.

ISSU is not a new concept. It is available on multiple Cisco Catalyst platforms, including 4500 and 6500 switches. ISSU follows the same concept on Nexus 7000 series devices. The whole ISSU process takes place in a few simple steps:

Step 1. Upgrade the Basic Input and Output System (BIOS) on supervisors and line card modules.

Step 2. Bring up the standby supervisor card with a new image.

Step 3. Switch over from the active to the standby supervisor, which is running on the new image.

Step 4. Bring up the old active supervisor card with the new image.

Step 5. Perform a hitless line card upgrade (one at a time).

Step 6. Upgrade the Connectivity Management Processor (CMP).

Note

Starting with NX-OS Release 5.2(1), Nexus switches can upgrade multiple line cards simultaneously, which reduces the ISSU upgrade time.

Before ISSU is performed, especially when the software is being downgraded, perform a sanity check for configuration compatibility between the software version running on the system and the target image. For a downgrade, this check informs the network administrators about the features and configurations that are available in the newer release but not in the older release, and those configurations are then removed. The incompatibilities are verified using the command show incompatibility-all system nx-os-file-name, as in Example 12-19.

Example 12-19 Verifying Configuration Incompatibilities

NX-1# show incompatibility-all system bootflash:n7000-s2-dk9.7.3.2.D1.1.bin

Checking incompatible configuration(s) for vdc 'NX-1':
------------------------------------------------------
No incompatible configurations

Checking dynamic incompatibilities for vdc 'NX-1':
--------------------------------------------------
No incompatible configurations

Checking incompatible configuration(s) for vdc 'TEST':
--------------------------------------------------------------
No incompatible configurations

Checking dynamic incompatibilities for vdc 'TEST2':
----------------------------------------------------------
No incompatible configurations

Checking incompatible configuration(s) for vdc 'TEST3':
---------------------------------------------------------
No incompatible configurations

Checking dynamic incompatibilities for vdc 'TEST4':
-----------------------------------------------------
No incompatible configurations
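
With several VDCs, scanning this output by eye for anything other than "No incompatible configurations" is error prone. The following Python sketch groups the lines under each "Checking ... for vdc" heading and reports the VDCs that list anything else. It assumes the heading format shown in Example 12-19 and makes no assumption about what an actual incompatibility line looks like; the function name is hypothetical:

```python
import re

INCOMPAT_OUTPUT = """\
Checking incompatible configuration(s) for vdc 'NX-1':
------------------------------------------------------
No incompatible configurations

Checking dynamic incompatibilities for vdc 'NX-1':
--------------------------------------------------
No incompatible configurations
"""


def incompatible_vdcs(text):
    """Return sorted VDC names that report something other than a clean check."""
    clean = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"Checking .* for vdc '([^']+)':", line)
        stripped = line.strip()
        if header:
            current = header.group(1)
            clean.setdefault(current, True)  # clean until a finding is seen
        elif current and stripped and not set(stripped) <= {"-"}:
            if stripped != "No incompatible configurations":
                clean[current] = False
    return sorted(name for name, ok in clean.items() if not ok)


print(incompatible_vdcs(INCOMPAT_OUTPUT))
```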

An ISSU upgrade is performed using the command install all kickstart kickstart-image system system-image [parallel]. The parallel keyword is used to upgrade the I/O modules in parallel. ISSU is intended to perform a nondisruptive software upgrade, which upgrades the software on the Nexus switch without affecting the data plane. For a nondisruptive upgrade, the software must be compatible across releases. If the image is not compatible, the upgrade can be disruptive. Example 12-20 illustrates a disruptive software upgrade from the 6.2(16) image to the 7.3(2)D1(1) image. The output shows that the image is incompatible, so the impact of the upgrade is disruptive.

Example 12-20 ISSU Upgrade

NX-1# install all kickstart bootflash:n7000-s2-kickstart.7.3.2.D1.1.bin system bootflash:n7000-s2-dk9.7.3.2.D1.1.bin
Installer will perform compatibility check first. Please wait.

Verifying image bootflash:/n7000-s2-kickstart.7.3.2.D1.1.bin for boot variable "kickstart".
[####################] 100% -- SUCCESS

Verifying image bootflash:/n7000-s2-dk9.7.3.2.D1.1.bin for boot variable "system".
[####################] 100% -- SUCCESS

Performing module support checks.
[####################] 100% -- SUCCESS

Verifying image type.
[####################] 100% -- SUCCESS

Extracting "system" version from image bootflash:/n7000-s2-dk9.7.3.2.D1.1.bin.
[####################] 100% -- SUCCESS

Extracting "kickstart" version from image bootflash:/n7000-s2-kickstart.7.3.2.D1.1.bin.
[####################] 100% -- SUCCESS

Extracting "bios" version from image bootflash:/n7000-s2-dk9.7.3.2.D1.1.bin.
[####################] 100% -- SUCCESS

Extracting "lcflnn7k" version from image bootflash:/n7000-s2-dk9.7.3.2.D1.1.bin.
[####################] 100% -- SUCCESS

Notifying services about system upgrade.
[####################] 100% -- SUCCESS



Compatibility check is done:
Module  bootable          Impact  Install-type  Reason
------  --------  --------------  ------------  ------------------
     1       yes      disruptive         reset  Incompatible image
     2       yes      disruptive         reset  Incompatible image
     3       yes      disruptive         reset  Incompatible image
     4       yes      disruptive         reset  Incompatible image



Images will be upgraded according to following table:
Module      Image                Running-Version(pri:alt)  New-Version  Upg-Required
------ ----------    ------------------------------------  -----------  ------------
     1     system                                  6.2(16)   7.3(2)D1(1)        yes
     1  kickstart                                  6.2(16)   7.3(2)D1(1)        yes
     1       bios  v2.12.0(05/29/2013):v2.12.0(05/29/2013) v2.12.0(05/29/2013)  no
     2     system                                  6.2(16)   7.3(2)D1(1)        yes
     2  kickstart                                  6.2(16)   7.3(2)D1(1)        yes
     2       bios  v2.12.0(05/29/2013):v2.12.0(05/29/2013) v2.12.0(05/29/2013)  no
     3   lcflnn7k                                  6.2(16)   7.3(2)D1(1)        yes
     3       bios  v3.0.29(12/15/2015):v3.0.29(12/15/2015) v3.0.29(12/15/2015)  no
     4   lcflnn7k                                  6.2(16)   7.3(2)D1(1)        yes
     4       bios  v3.0.29(12/15/2015):v3.0.29(12/15/2015)  v3.0.29(12/15/2015) no


Switch will be reloaded for disruptive upgrade.
Do you want to continue with the installation (y/n)?  [n] y

Install is in progress, please wait.

Performing runtime checks.
[####################] 100% -- SUCCESS

Syncing image bootflash:/n7000-s2-kickstart.7.3.2.D1.1.bin to standby.
[####################] 100% -- SUCCESS

Syncing image bootflash:/n7000-s2-dk9.7.3.2.D1.1.bin to standby.
[####################] 100% -- SUCCESS

Setting boot variables.
[####################] 100% -- SUCCESS

Performing configuration copy.
[####################] 100% -- SUCCESS

Module 1:  Upgrading bios/loader/bootrom.
Warning: please do not remove or power off the module at this time.
[####################] 100% -- SUCCESS

Module 2:  Upgrading bios/loader/bootrom.
Warning: please do not remove or power off the module at this time.
[####################] 100% -- SUCCESS

Module 3:  Upgrading bios/loader/bootrom.
Warning: please do not remove or power off the module at this time.
[####################] 100% -- SUCCESS

Module 4:  Upgrading bios/loader/bootrom.
Warning: please do not remove or power off the module at this time.
[####################] 100% -- SUCCESS

Finishing the upgrade, switch will reboot in 10 seconds.
NX-1#
>>>
>>>
>>>
NX7k SUP BIOS version ( 2.12 ) : Build - 05/29/2013 11:58:20
PM FPGA Version : 0x00000025
Power sequence microcode revision - 0x00000009 : card type - 10156EEA0
Booting Spi Flash : Primary
  CPU Signature - 0x000106e4: Version - 0x000106e0
  CPU - 2 : Cores - 4 : HTEn - 1 : HT - 2 : Features - 0xbfebfbff
  FSB Clk - 532 Mhz :  Freq - 2144 Mhz - 2128 Mhz
  MicroCode Version : 0x00000002
  Memory - 32768 MB : Frequency - 1067 MHZ
  Loading Bootloader: Done
  IO FPGA Version   : 0x1000d
  PLX Version       : 861910b5
Bios digital signature verification - Passed
USB bootflash status : [1-1:1-1]

Reset Reason Registers: 0x10 0x0
 Filesystem type is ext2fs, partition type 0x83


              GNU GRUB  version 0.97

Autobooting bootflash:/n7000-s2-kickstart.7.3.2.D1.1.bin bootflash:/n7000-s2-dk9.7.3.2.D1.1.bin...
 Filesystem type is ext2fs, partition type 0x83
Booting kickstart image: bootflash:/n7000-s2-kickstart.7.3.2.D1.1.bin....
...............................................................................
.........................................................................
Kickstart digital signature verification Successful
Image verification OK

INIT: version 2boot device node /dev/sda
Bootflash firmware upgrade not required
boot device node /dev/sda
boot mirror device node /dev/sdb
Bootflash mirror firmware upgrade not required
boot mirror device node /dev/sdb
obfl device node /dev/sdc
OBFL firmware upgrade not required
obfl device node /dev/sdc
slot0 flash device node /dev/sdd
Checking obfl filesystem.
Checking all filesystems..r.r.r.retval=[1]
r done.
Starting mcelog daemon
Creating logflash directories
Loading system software
/bootflash//n7000-s2-dk9.7.3.2.D1.1.bin read done
System image digital signature verification successful.
Uncompressing system image: bootflash:/n7000-s2-dk9.7.3.2.D1.1.bin Sun Mar 5 09:19:07 UTC 2017
blogger: nothing to do.
C
..done Sun Mar 5 09:19:12 UTC 2017
INIT: Entering runlevel: 3
Starting portmap daemon...
creating NFS state directory: done

System is coming up ... Please wait ...
System is coming up ... Please wait ...
System is coming up ... Please wait ...
System is coming up ... Please wait ...
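
The compatibility table printed by the installer is the place to decide whether to answer y at the prompt. As a hedged illustration, the following Python sketch reads the Impact column per module from captured installer output, assuming the row layout of Example 12-20 (module, bootable, impact, install type, reason); the function names are hypothetical:

```python
COMPAT_TABLE = """\
Compatibility check is done:
Module  bootable          Impact  Install-type  Reason
------  --------  --------------  ------------  ------------------
     1       yes      disruptive         reset  Incompatible image
     2       yes      disruptive         reset  Incompatible image
     3       yes      disruptive         reset  Incompatible image
     4       yes      disruptive         reset  Incompatible image
"""


def upgrade_impact(installer_text):
    """Return {module: impact} from the compatibility check rows."""
    impacts = {}
    for line in installer_text.splitlines():
        fields = line.split()
        # Data rows start with a module number followed by yes/no (bootable).
        if len(fields) >= 4 and fields[0].isdigit() and fields[1] in ("yes", "no"):
            impacts[int(fields[0])] = fields[2]
    return impacts


def is_disruptive(impacts):
    """True if any module reports a disruptive impact."""
    return any(impact == "disruptive" for impact in impacts.values())


impacts = upgrade_impact(COMPAT_TABLE)
print(impacts, is_disruptive(impacts))
```

Rows from the image table that follows in the installer output are skipped because their second field is an image name rather than yes or no.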

Failures in ISSU can happen at different stages:

  • Pre-upgrade and BIOS upgrade

  • Standby bootup and switchover

  • Line card upgrade

When an ISSU upgrade fails, it is important to determine which component caused the failure. At this point, the first step is to collect the following logs:

  • Installer log

    show system internal log install [details]

  • Sysmgr and HA-related event-history logs

    show system internal log sysmgr state

    show system internal log sysmgr event-history errors

  • Module Manager log

    show module internal event module slot

  • Upgrade log on line card

    show system internal log sysmgr rtdbctrl

After capturing the relevant logs, restore services after the ISSU failure by using the command install all. This command ensures that the system normalizes with the running image and that all the modules are running the same image.

It is important to remember that an ISSU upgrade might not be compatible with all scenarios, such as OTV (in certain releases), LACP Fast rate, and continuous TCNs in the network. Reviewing the ISSU caveats on CCO is thus recommended before performing an upgrade.

Note

In case of ISSU failure, it is also important to collect show tech-support issu and show tech-support ha outputs before the services are recovered.

Graceful Insertion and Removal

In any network deployment, network engineers must perform hardware replacements, hardware and software upgrades, or even an intrusive debugging session to identify the root cause of a problem. In any of these instances, engineers do not want to impact any services running on the network. Usually a maintenance window is scheduled and traffic is diverted to a backup path or redundant device to minimize the impact on any services, but this is a tedious task. NX-OS provides the Graceful Insertion and Removal (GIR) feature, which enables you to put devices in maintenance mode and perform any of the previously stated activities without impacting any services. The intent of GIR is to simplify the isolation of a switch from the network using a single set of commands instead of having to manually shut interfaces or alter metrics. In other words, GIR can essentially be called a macro that automates all manual steps to isolate the switch from the network.

GIR has two modes:

  • Maintenance mode

  • Normal mode

In maintenance mode (also known as the Graceful Removal phase), all data traffic bypasses the node. A parallel path should be available for the GIR to function properly. If no available parallel path exists, service disruptions to the network can arise. Maintenance mode is used to perform maintenance-related activities such as software/hardware upgrades, swaps for bad hardware, or other disruptive activities on the node. The node then can go back to normal mode (also known as Graceful Insertion phase).

To understand the functioning of GIR, examine the topology in Figure 12-4. This topology is a typical spine-leaf topology with two spine nodes and six leaf nodes. The connectivity between spine and leaf is via OSPF.

Image

Figure 12-4 Typical Spine-Leaf Topology

In this topology, suppose that the spine node Spine1 is set to maintenance mode for performing a software upgrade. The first step in GIR is to advertise costly metrics within the routing protocols. Thus, Spine1 advertises the OSPF max-metric to all its OSPF neighbors. When the leaf nodes receive the max-metric, they alter their forwarding path to push all the traffic through Spine2. At this point, the OSPF neighborship is still up between Spine1 and all six leaf nodes (assuming the default Isolate mode, to be discussed), but no data forwarding is happening via Spine1.
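
The effect of the max-metric advertisement on the leaf nodes can be sketched with a small cost comparison. In the following Python example, the link cost of 40 and the function name are assumptions made for illustration; 0xFFFF (65535) is the link metric a router advertises in its router LSA when it is in max-metric state:

```python
MAX_METRIC = 0xFFFF  # 65535, advertised by a router in max-metric state


def best_spines(leaf_uplink_cost, spine_downlink_cost):
    """Return the spine(s) with the lowest leaf -> spine -> remote-leaf cost."""
    totals = {
        spine: leaf_uplink_cost[spine] + spine_downlink_cost[spine]
        for spine in leaf_uplink_cost
    }
    lowest = min(totals.values())
    return sorted(spine for spine, cost in totals.items() if cost == lowest)


uplinks = {"Spine1": 40, "Spine2": 40}  # assumed costs on the leaf uplinks

# Normal operation: both spines advertise the same cost, so both are used.
print(best_spines(uplinks, {"Spine1": 40, "Spine2": 40}))

# Spine1 in maintenance mode: its links are advertised with the max metric.
print(best_spines(uplinks, {"Spine1": MAX_METRIC, "Spine2": 40}))
```

The path through Spine1 remains in the link-state database, which is why the adjacency stays up, but its total cost is far higher than the path through Spine2, so the leaf nodes stop using it.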

Maintenance mode is supported on Nexus 7000 and 7700 series platforms starting with Release 7.2.0 and on Nexus 5500/5600 platforms starting with Release 7.1.0. Maintenance mode is configured using the command system mode maintenance [shutdown]. When the command system mode maintenance is configured, GIR is enabled in default mode, also known as Isolate mode. In this mode, the protocol neighborship is maintained and traffic is diverted to the backup or parallel path. When the command system mode maintenance shutdown is configured, the GIR is enabled in shutdown mode; the protocols go into shutdown state, links are shut down, and traffic loss can occur. Isolate mode for GIR is recommended over shutdown mode.

Example 12-21 demonstrates the differences in feature-level configuration when the device is configured for isolate mode versus shutdown maintenance mode. In both modes, the command show system mode shows that the system mode is Maintenance. Before the system goes into maintenance mode, NX-OS takes a snapshot of the current state of the device and saves it as the before_maintenance snapshot.

Example 12-21 Isolate and Shutdown Maintenance Mode

N7k-1(config)# system mode maintenance

Following configuration will be applied:

router bgp 100
  isolate
router eigrp 100
  isolate
router ospf 100
  isolate
router isis IS-IS
  isolate


Do you want to continue (yes/no)? [no] yes

Generating a snapshot before going into maintenance mode

Starting to apply commands...

Applying : router bgp 100
Applying :   isolate
Applying : router eigrp 100
Applying :   isolate
Applying : router ospf 100
Applying :   isolate
Applying : router isis IS-IS
Applying :   isolate

Maintenance mode operation successful.
N7k-1(config)#
2017 Mar  5 20:40:45 N7k-1 %$ VDC-2 %$ %MMODE-2-MODE_CHANGED: System changed to "maintenance" mode.

N7k-1# show system mode
System Mode: Maintenance
Maintenance Mode Timer: not running
N7k-1(config)# system mode maintenance shutdown

Following configuration will be applied:

router bgp 100
  shutdown
router eigrp 100
  shutdown
  address-family ipv6 unicast
    shutdown
router ospf 100
  shutdown
router isis IS-IS
  shutdown
system interface shutdown

NOTE: 'system interface shutdown' will shutdown all interfaces excluding mgmt 0
Do you want to continue (yes/no)? [no] yes

Generating a snapshot before going into maintenance mode

Starting to apply commands...

Applying : router bgp 100
Applying :   shutdown
Applying : router eigrp 100
Applying :   shutdown
Applying :   address-family ipv6 unicast
Applying :     shutdown
Applying : router ospf 100
Applying :   shutdown
Applying : router isis IS-IS
Applying :   shutdown
Applying : system interface shutdown

Maintenance mode operation successful.

When the system goes into maintenance mode, the processes that were influenced by maintenance mode change their running state to Isolate or Shutdown. Example 12-22 displays the different routing protocol processes and their current state on the system.

Example 12-22 Routing Protocol States during Maintenance Mode

N7k-1# show bgp process

BGP Process Information
BGP Process ID                 : 20105
BGP Protocol Started, reason:  : configuration
BGP Protocol Tag               : 100
BGP Protocol State             : Running (Isolate)
BGP MMODE                      : Initialized
BGP Memory State               : OK
BGP asformat                   : asplain
! Output omitted for brevity
N7k-1# show ip eigrp
IP-EIGRP AS 100 ID 0.0.0.0 VRF default
  Process-tag: 100
  Instance Number: 1
  Status: running (isolate)
  Authentication mode: none
  Authentication key-chain: none
! Output omitted for brevity
  Redistributed max-prefix: Disabled
  MMODE: Initialized
  Suppress-FIB-Pending Configured
N7k-1# show isis protocol

ISIS process : IS-IS
 Instance number :  1
 UUID: 1090519320
 Process ID 20143
VRF: default
  System ID : 0000.0000.0001  IS-Type : L1-L2
  SAP : 412  Queue Handle : 15
  Maximum LSP MTU: 1492
  Stateful HA enabled
  Graceful Restart enabled. State: Inactive
  Last graceful restart status : none
  Start-Mode Complete
  BFD IPv4 is globally disabled for ISIS process: IS-IS
  BFD IPv6 is globally disabled for ISIS process: IS-IS
  Topology-mode is base
  Metric-style : advertise(wide), accept(narrow, wide)
  Area address(es) :
    49.0001
  Process is up and running (isolate)
! Output omitted for brevity
N7k-1# show ip ospf internal

ospf 100 VRF default
ospf process tag 100
ospf process instance number 1
ospf process uuid 1090519321
ospf process linux pid 20064
ospf process state running(isolate)
System uptime 05:18:06
SUP uptime 2 05:18:06
Server up        : L3VM|IFMGR|RPM|AM|CLIS|URIB|U6RIB|IP|IPv6|SNMP|BGP|MMODE
Server required  : L3VM|IFMGR|RPM|AM|CLIS|URIB|IP|SNMP
Server registered: L3VM|IFMGR|RPM|AM|CLIS|URIB|IP|SNMP|BGP|MMODE
Server optional  : BGP|MMODE
Early hello : OFF
Force write PSS: FALSE
OSPF mts pkt sap 324
OSPF mts base sap 320
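
Each protocol reports the isolate state with slightly different wording and spacing, so a tolerant pattern helps when checking all of them in one pass. The following Python sketch assumes the state strings shown in Example 12-22; the dictionary keys and the function name are hypothetical:

```python
import re

# Matches "Running (Isolate)", "running (isolate)", and "running(isolate)".
ISOLATE_PATTERN = re.compile(r"running\s*\(isolate\)", re.IGNORECASE)

STATE_LINES = {
    "bgp": "BGP Protocol State             : Running (Isolate)",
    "eigrp": "  Status: running (isolate)",
    "isis": "  Process is up and running (isolate)",
    "ospf": "ospf process state running(isolate)",
}


def isolated_protocols(outputs):
    """Return {protocol: True/False} depending on whether isolate is reported."""
    return {name: bool(ISOLATE_PATTERN.search(text)) for name, text in outputs.items()}


result = isolated_protocols(STATE_LINES)
print(result)
print("all isolated:", all(result.values()))
```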

After the maintenance activity is performed, the no system mode maintenance configuration command brings the system out of maintenance mode. When this command is configured, the system returns to normal mode and all the configuration changes made for isolate or shutdown maintenance mode are rolled back. Another snapshot is then taken, with the name after_maintenance. Example 12-23 illustrates moving the system from maintenance mode to normal mode.

Example 12-23 Switching from Maintenance Mode to Normal Mode

N7k-1(config)# no system mode maintenance

Following configuration will be applied:

router isis IS-IS
  no isolate
router ospf 100
  no isolate
router eigrp 100
  no isolate
router bgp 100
  no isolate

Do you want to continue (yes/no)? [no] yes

Starting to apply commands...

Applying : router isis IS-IS
Applying :   no isolate
Applying : router ospf 100
Applying :   no isolate
Applying : router eigrp 100
Applying :   no isolate
Applying : router bgp 100
Applying :   no isolate

Maintenance mode operation successful.

The after_maintenance snapshot will be generated in 120 seconds
After that time, please use 'show snapshots compare before_maintenance after_maintenance' to check the health of the system

When the system is back in normal mode, verify that the services have normalized, for example that the expected routes are present in the Routing Information Base (RIB) and the expected VLANs exist. The snapshots taken before and after maintenance make this verification possible with a single command. The currently available snapshots are verified using the command show snapshots. When both the before and after maintenance snapshots are available, use the command show snapshots compare before_maintenance after_maintenance [summary] to compare the system for any differences. Example 12-24 demonstrates the comparison of before and after maintenance snapshots.

Example 12-24 Comparing Before and After Maintenance Snapshots

N7k-1# show snapshots
Snapshot Name           Time                           Description
------------------------------------------------------------------------------
after_maintenance       Wed Nov  1 02:42:07 2017       system-internal-snapshot
before_maintenance      Wed Nov  1 02:38:01 2017       system-internal-snapshot
N7k-1# show snapshots compare before_maintenance after_maintenance summary

================================================================================
Feature                         before_maintenance after_maintenance changed
================================================================================
basic summary
  # of interfaces                         63              63       
  # of vlans                               1               1       
  # of ipv4 routes vrf default            43              43       
  # of ipv4 paths  vrf default            46              46       
  # of ipv4 routes vrf management          9               9       
  # of ipv4 paths  vrf management          9               9       
  # of ipv6 routes vrf default             3               3       
  # of ipv6 paths  vrf default             3               3       

interfaces
  # of eth interfaces                     60              60       
  # of eth interfaces up                   7               7       
  # of eth interfaces down                53              53       
  # of eth interfaces other                0               0       
  # of vlan interfaces                     1               1    
  # of vlan interfaces up                  0               0       
  # of vlan interfaces down                1               1    
  # of vlan interfaces other               0               0
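When the summary comparison is captured as text (for example, from a session log), the before and after counters can also be checked by a script. The following Python sketch is an illustration only: it assumes the row layout shown in Example 12-24, the function names parse_summary and changed_counters are hypothetical, and it compares the two numeric columns directly rather than relying on whatever marker the switch prints in the changed column.

```python
import re

# A few rows copied from the summary layout in Example 12-24.
SAMPLE = """\
basic summary
  # of interfaces                         63              63
  # of vlans                               1               1
  # of ipv4 routes vrf default            43              43
  # of ipv4 paths  vrf default            46              46

interfaces
  # of eth interfaces                     60              60
  # of eth interfaces up                   7               7
"""

# A counter row: "#", a label, the before value, the after value, and an
# optional trailing marker in the "changed" column (format assumed).
ROW = re.compile(
    r"^\s*#\s+(?P<label>.+?)\s{2,}(?P<before>\d+)\s+(?P<after>\d+)(?:\s+\S+)?\s*$"
)

def parse_summary(text):
    """Map each counter label to a (before, after) tuple of integers."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            # Collapse repeated spaces inside the label.
            label = " ".join(match.group("label").split())
            rows[label] = (int(match.group("before")), int(match.group("after")))
    return rows

def changed_counters(rows):
    """Return only the counters whose before and after values differ."""
    return {label: pair for label, pair in rows.items() if pair[0] != pair[1]}

if __name__ == "__main__":
    for label, (before, after) in changed_counters(parse_summary(SAMPLE)).items():
        print(f"{label}: {before} -> {after}")
```

With the sample rows above, nothing is printed because every counter matches, which is the expected result after a clean maintenance window.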

Most production environments limit the duration of the maintenance window. To enforce that limit on the system, configure a timeout value for maintenance mode using the command system mode maintenance timeout time-in-minutes. When the timeout value is reached, the system automatically rolls back from maintenance mode to normal mode. Example 12-25 demonstrates configuring the maintenance timeout to 30 minutes and verifying the timeout value using the command show maintenance timeout.

Example 12-25 Maintenance Mode Timeout Settings

N7k-1(config)# system mode maintenance timeout 30

Timer will be started for 30 minutes when the system switches to maintenance mode.
N7k-1# show maintenance timeout
Maintenance mode timeout value: 30 minutes

Not all maintenance windows are supposed to be nondisruptive. Some maintenance windows require a system reload, and in others the switch reboots automatically because of an expected problem that you are trying to replicate. Thus, before getting into maintenance mode, define the reset-reasons for the reload. The command system mode maintenance on-reload reset-reason options enables you to set the options for the different kinds of reloads that are expected during the maintenance window. Multiple options can be set for the reset-reason. Example 12-26 displays all the available reset-reason options and also demonstrates how to set multiple reset-reason options for maintenance mode. The command show maintenance on-reload reset-reasons validates the reset-reasons set for a reload event during the maintenance window.

Example 12-26 On-Reload Reset-Reason Configuration and Verification

Spine2(config)# system mode maintenance on-reload reset-reason ?
  HW_ERROR       Hardware Error
  SVC_FAILURE    Critical service failure
  KERN_FAILURE   Kernel panic
  WDOG_TIMEOUT   Watchdog reset
  FATAL_ERROR    Fatal errors
  LC_FAILURE     LC failure
  MANUAL_RELOAD  Manual reload
  MAINTENANCE    Maintenance mode
  ANY_OTHER      Any other reset
  MATCH_ANY      Any of the above listed reasons

Spine2(config)# system mode maintenance on-reload reset-reason MANUAL_RELOAD
Spine2(config)# system mode maintenance on-reload reset-reason MAINTENANCE
Spine2# show maintenance on-reload reset-reasons
Reset reasons for on-reload maintenance mode:
--------------------------------------------
MANUAL_RELOAD
MAINTENANCE

bitmap = 0xc0
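The bitmap value in Example 12-26 can be reproduced with simple arithmetic. The following Python sketch rests on an assumption that is inferred from the output, not documented in this chapter: each reset-reason maps to one bit, in the order the CLI help lists them, starting at bit 0. Under that assumption, MANUAL_RELOAD is bit 6 (0x40) and MAINTENANCE is bit 7 (0x80), which together give 0xc0. ANY_OTHER and MATCH_ANY are left out because the example provides no evidence for how they are encoded.

```python
# Reset reasons in the order shown by the CLI help in Example 12-26.
# ASSUMPTION: bit i of the bitmap corresponds to the i-th entry in this list.
RESET_REASONS = [
    "HW_ERROR",
    "SVC_FAILURE",
    "KERN_FAILURE",
    "WDOG_TIMEOUT",
    "FATAL_ERROR",
    "LC_FAILURE",
    "MANUAL_RELOAD",
    "MAINTENANCE",
]

def encode(reasons):
    """Combine reset-reason names into a bitmap under the assumed mapping."""
    bitmap = 0
    for name in reasons:
        bitmap |= 1 << RESET_REASONS.index(name)
    return bitmap

def decode(bitmap):
    """List the reset-reason names whose assumed bit is set in the bitmap."""
    return [name for i, name in enumerate(RESET_REASONS) if bitmap & (1 << i)]

if __name__ == "__main__":
    print(hex(encode(["MANUAL_RELOAD", "MAINTENANCE"])))  # 0xc0
    print(decode(0xC0))
```

Treat the mapping as a reading aid for the show output; confirm it against the platform documentation before relying on it.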

Note

If any issues arise with maintenance mode, collect the command show tech-support mmode output during or just after the problem is seen.

Custom Maintenance Profile

Both the isolate and shutdown modes of GIR have their respective benefits, but the system-generated behavior does not suit every deployment. For instance, if a Nexus switch is acting as a BGP route reflector and is not in the data path, the best approach might be to neither shut down BGP on that device nor isolate the device from BGP. In such cases, shutdown or isolate maintenance mode could impact services, so a custom maintenance profile can be created instead.

Two primary maintenance profiles exist:

  • Maintenance-mode

  • Normal-mode

Configuration for these profiles is first generated after the system has been put in maintenance mode and switched back to normal mode. When you create custom profiles, the profile names remain the same, but the configuration inside the profiles can be modified. Creating a custom profile appends the commands to the existing maintenance profile. Hence, the first step is to check whether a maintenance profile has already been defined. This is verified using the command show maintenance profile, as in Example 12-27.

Example 12-27 Verifying Maintenance and Normal Profile Configurations

N7k-1# show maintenance profile
[Normal Mode]
router isis IS-IS
  no isolate
router ospf 100
  no isolate
router eigrp 100
  no isolate
router bgp 100
  no isolate

[Maintenance Mode]
router bgp 100
  isolate
router eigrp 100
  isolate
router ospf 100
  isolate
router isis IS-IS
  isolate

If the maintenance-mode and normal-mode profiles are not empty, it is better to remove the existing profile content and then create the custom profile from scratch. To remove the maintenance profiles, use the command no configure maintenance profile [maintenance-mode | normal-mode]. These commands are executed from exec mode. After removing the existing profile configuration, the command configure maintenance profile [maintenance-mode | normal-mode] configures custom profiles from configuration mode. When both the custom maintenance and normal profiles are configured, it is important to also configure the command system mode maintenance always-use-custom-profile so that a system-generated profile is not generated and used in place of the custom one. Example 12-28 demonstrates all the steps to configure the custom profiles for both maintenance and normal modes. In this example, the maintenance-mode profile is configured to isolate BGP and Intermediate System-to-Intermediate System (IS-IS) protocols but shut down OSPF, Enhanced Interior Gateway Routing Protocol (EIGRP), and interface Ethernet 3/1. Along with configuring custom maintenance profiles, it is important to save the configuration so that the custom profiles are retained across reloads.

Example 12-28 Configuring Custom Maintenance Profiles

N7k-1# no configure maintenance profile maintenance-mode
Maintenance mode profile maintenance-mode successfully deleted
Enter configuration commands, one per line.  End with CNTL/Z.
Exit maintenance profile mode.
N7k-1# no configure maintenance profile normal-mode
Maintenance mode profile normal-mode successfully deleted
Enter configuration commands, one per line.  End with CNTL/Z.
Exit maintenance profile mode.
N7k-1(config)# configure maintenance profile maintenance-mode
Please configure 'system mode maintenance always-use-custom-profile' if you want to use custom profile always for maintenance mode.
N7k-1(config-mm-profile)#
N7k-1(config-mm-profile)# router bgp 100
N7k-1(config-mm-profile-router)# isolate
N7k-1(config-mm-profile-router)# router ospf 100
N7k-1(config-mm-profile-router)# shutdown
N7k-1(config-mm-profile-router)# router eigrp 100
N7k-1(config-mm-profile-router)# shutdown
N7k-1(config-mm-profile-router)# router isis IS-IS
N7k-1(config-mm-profile-router)# isolate
N7k-1(config-mm-profile-router)# interface e3/1
N7k-1(config-mm-profile-if-verify)# shutdown
N7k-1(config-mm-profile-if-verify)# end

N7k-1(config)# configure maintenance profile normal-mode
Please configure 'system mode maintenance always-use-custom-profile' if you want to use custom profile always for maintenance mode.
N7k-1(config-mm-profile)# router ospf 100
N7k-1(config-mm-profile-router)# no shutdown
N7k-1(config-mm-profile-router)# router eigrp 100
N7k-1(config-mm-profile-router)# no shutdown
N7k-1(config-mm-profile-router)# router isis IS-IS
N7k-1(config-mm-profile-router)# no isolate
N7k-1(config-mm-profile-router)# router bgp 100
N7k-1(config-mm-profile-router)# no isolate
N7k-1(config-mm-profile-router)# interface ethernet 3/1
N7k-1(config-mm-profile-if-verify)# no shutdown
N7k-1(config-mm-profile-if-verify)# exit
N7k-1(config-mm-profile)# exit
N7k-1(config)# system mode maintenance always-use-custom-profile

N7k-1# copy running-config startup-config
[########################################] 100%
N7k-1# show maintenance profile
[Normal Mode]
router ospf 100
  no shutdown
router eigrp 100
  no shutdown
router isis IS-IS
  no isolate
router bgp 100
  no isolate
interface Ethernet3/1
  no shutdown

[Maintenance Mode]
router bgp 100
  isolate
router ospf 100
  shutdown
router eigrp 100
  shutdown
router isis IS-IS
  isolate
interface Ethernet3/1
  shutdown

Note

Use the command show running-config mmode to validate all the configuration settings related to maintenance mode.
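A custom normal-mode profile is only useful if it undoes everything the maintenance-mode profile does. As a rough illustration, the following Python sketch parses text in the layout printed by show maintenance profile and reports any maintenance-mode command that has no matching reversal in the normal-mode profile. The function names are hypothetical, the sample is abbreviated from Example 12-28, and the check assumes that a command is reversed by adding or removing the no prefix, which holds for isolate and shutdown but is not a general rule.

```python
SAMPLE = """\
[Normal Mode]
router ospf 100
  no shutdown
router bgp 100
  no isolate
interface Ethernet3/1
  no shutdown

[Maintenance Mode]
router bgp 100
  isolate
router ospf 100
  shutdown
interface Ethernet3/1
  shutdown
"""

def parse_profiles(text):
    """Return {section: {context: [commands]}} from profile output text."""
    profiles, section, context = {}, None, None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue
        if line.startswith("[") and line.endswith("]"):
            # Section header such as "[Normal Mode]".
            section, context = line[1:-1], None
            profiles[section] = {}
        elif section is not None and not line.startswith(" "):
            # Unindented line opens a context such as "router bgp 100".
            context = line
            profiles[section][context] = []
        elif section is not None and context is not None:
            # Indented line is a command inside the current context.
            profiles[section][context].append(line.strip())
    return profiles

def negate(command):
    """Toggle the 'no ' prefix (assumed to be how the command is reversed)."""
    return command[3:] if command.startswith("no ") else "no " + command

def unreverted(profiles):
    """List (context, command) pairs from maintenance mode with no reversal."""
    maint = profiles.get("Maintenance Mode", {})
    normal = profiles.get("Normal Mode", {})
    return [
        (context, command)
        for context, commands in maint.items()
        for command in commands
        if negate(command) not in normal.get(context, [])
    ]

if __name__ == "__main__":
    print(unreverted(parse_profiles(SAMPLE)))  # empty list when profiles match
```

An empty result means every isolate or shutdown in the maintenance-mode profile has a counterpart in the normal-mode profile; any pair that is returned points to a protocol or interface that would stay down after leaving maintenance mode.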

To activate maintenance mode with custom profiles, configure the command system mode maintenance dont-generate-profile. This command uses the configuration from the custom profile created on the Nexus switch to get into maintenance mode. Example 12-29 illustrates activating maintenance mode using custom profile configurations.

Example 12-29 Activating Maintenance Mode with Custom Profiles

N7k-1(config)# system mode maintenance dont-generate-profile

Following configuration will be applied:

router bgp 100
  isolate
router ospf 100
  shutdown
router eigrp 100
  shutdown
router isis IS-IS
  isolate
interface Ethernet3/1
  shutdown

Do you want to continue (yes/no)? [no] yes

Generating a snapshot before going into maintenance mode

Starting to apply commands...

Applying : router bgp 100
Applying :   isolate
Applying : router ospf 100
Applying :   shutdown
Applying : router eigrp 100
Applying :   shutdown
Applying : router isis IS-IS
Applying :   isolate
Applying : interface Ethernet3/1
Applying :   shutdown

Maintenance mode operation successful.

Note

To debug maintenance mode, use the command debug mmode logfile. Enabling this debug also enables logging of the debug logs into a logfile that is viewed using the command show system internal mmode logfile. Collecting show tech-support mmode command output is also recommended, in case of any failures with GIR.

Summary

NX-OS, the operating system for Nexus data center switches, was built on the paradigms of high availability (HA). This chapter focused on some of the high availability features that are commonly used on Nexus switches, including BFD, which is used with various routing protocols and features. This chapter detailed verifying the hardware programming and using event-history logs to troubleshoot BFD issues. The following areas should be verified while troubleshooting BFD session issues:

  • Ensure that the no ip redirects or no ipv6 redirects command is configured on the interface.

  • Verify the diagnostic code, which explains the reason for the BFD failure:

    • 0: No diagnostic

    • 1: Control packet detection timer expired

    • 2: Echo function failed

    • 3: Neighbor signaled session down

    • 4: Forwarding plane reset

    • 5: Path down

    • 6: Concatenated path down

    • 7: Administratively down

    • 8: Reverse concatenated path down
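The diagnostic codes listed above are defined in RFC 5880 and are carried in the low 5 bits of the first byte of a BFD control packet, after the 3-bit version field. The following Python sketch shows one way to turn a code, or the first byte of a captured packet, into readable text. The function names are hypothetical, and values above 8 are simply reported as not listed here.

```python
# Diagnostic codes 0 through 8 as defined in RFC 5880.
BFD_DIAG = {
    0: "No Diagnostic",
    1: "Control Detection Time Expired",
    2: "Echo Function Failed",
    3: "Neighbor Signaled Session Down",
    4: "Forwarding Plane Reset",
    5: "Path Down",
    6: "Concatenated Path Down",
    7: "Administratively Down",
    8: "Reverse Concatenated Path Down",
}

def describe_diag(code):
    """Return the text for a diagnostic code, or a placeholder if unlisted."""
    return BFD_DIAG.get(code, f"Not listed here ({code})")

def split_first_byte(first_byte):
    """Split the first control-packet byte into (version, diagnostic code)."""
    version = (first_byte >> 5) & 0x07  # top 3 bits
    diag = first_byte & 0x1F            # low 5 bits
    return version, diag

if __name__ == "__main__":
    version, diag = split_first_byte(0x23)
    print(version, diag, describe_diag(diag))
```

For example, a first byte of 0x23 decodes to version 1 with diagnostic code 3, meaning the neighbor signaled the session down.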

In addition, this chapter covered the system high availability features, such as SSO and ISSU, which are critical in a production environment. Performing incremental ISSU upgrades that are nondisruptive is better than performing upgrades using the reload command.

The chapter also examined Graceful Insertion and Removal (GIR) and looked at how GIR is used to perform maintenance activities in the network without requiring too many changes. With GIR, maintenance mode is enabled in two modes:

  • Isolate mode

  • Shutdown mode

Isolate mode is recommended for use with GIR. Finally, this chapter elaborated on how to create and use custom profiles for maintenance windows instead of using system-generated profiles.

References

RFC 5880, Bidirectional Forwarding Detection. D. Katz and D. Ward. IETF, http://tools.ietf.org/html/rfc5880, June 2010.

RFC 5881, Bidirectional Forwarding Detection for IPv4 and IPv6 (Single Hop). D. Katz and D. Ward. IETF, http://tools.ietf.org/html/rfc5881, June 2010.

RFC 5882, Generic Application of Bidirectional Forwarding Detection. D. Katz and D. Ward. IETF, http://tools.ietf.org/html/rfc5882, June 2010.

RFC 5883, Bidirectional Forwarding Detection for Multihop Paths. D. Katz and D. Ward. IETF, http://tools.ietf.org/html/rfc5883, June 2010.

RFC 5884, Bidirectional Forwarding Detection for MPLS Label Switched Paths. R. Aggarwal, K. Kompella, T. Nadeau, and G. Swallow. IETF, http://tools.ietf.org/html/rfc5884, June 2010.

Cisco.com Cisco NX-OS Software Configuration Guides. http://www.cisco.com.
