Cisco UCS and MDS FibreChannel OLS Errors
Recently we installed a brand new UCS system with 6454 Fabric Interconnects attached to Cisco MDS Fibre Channel switches, and we ran into trouble with the Fibre Channel ports and their PortChannels.
But let's start from the beginning …
Our Setup was the following:
UCS FI 6454 Version 4.2.1f
Cisco 9148 MDS Version 6.2.33
Cisco 9396S MDS Version 8.4(2)
Cisco B200 M5 Blades
I created a PortChannel between the new UCS and the MDS. The PortChannel came up as expected and everything looked fine at this point. So I went on and installed the blades with VMware ESXi 6.7 U3 in my Auto Deploy setup. Still fine. Then I created the new FC zoning to my Huawei storage.
The “big bang” came when I did a “Rescan Adapter” on the ESXi host: I got interface errors and even the complete PortChannel went down.
2021 Sep 1 11:58:46 san-2a %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3270%$ Interface fc1/4 is down (OLS received) port-channel2 ucs3
2021 Sep 1 11:58:46 san-2a %PORT-CHANNEL-5-PORT_DOWN: port-channel2: fc1/4 is down
2021 Sep 1 12:01:14 san-2a %PORT-5-IF_DOWN_NONE: %$VSAN 3270%$ Interface port-channel2 is down (None) ucs3
2021 Sep 1 12:01:14 san-2a %PORT-CHANNEL-5-FOP_CHANGED: port-channel2: first operational port changed from fc1/3 to none
2021 Sep 1 12:01:14 san-2a %PORT-CHANNEL-5-PORT_DOWN: port-channel2: fc1/3 is down
2021 Sep 1 12:01:14 san-2a %PORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: %$VSAN 3270%$ Interface port-channel2 is down (No operational members) ucs3
2021 Sep 1 12:01:15 san-2a %PORT-5-IF_PORT_QUIESCE_FAILED: Interface fc1/3 port quiesce failed due to failure reason: Force Abort Due to Link Failure (NOS/LOS) (0x119)
2021 Sep 1 12:01:15 san-2a %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3270%$ Interface fc1/3 is down (OLS received) port-channel2 ucs3
For troubleshooting purposes I went from PortChannel to a single Interface:
2021 Sep 2 09:17:04 san-2a %PORT-CHANNEL-5-DELETED: port-channel2 deleted
2021 Sep 2 09:18:53 san-2a %PORT-5-IF_UP: %$VSAN 3270%$ Interface fc1/3 is up in mode F ucs3
2021 Sep 2 09:32:04 san-2a %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3270%$ Interface fc1/3 is down (OLS received) ucs3
2021 Sep 2 09:32:05 san-2a %PORT-5-IF_UP: %$VSAN 3270%$ Interface fc1/3 is up in mode F ucs3
2021 Sep 2 09:36:32 san-2a %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3270%$ Interface fc1/3 is down (OLS received) ucs3
2021 Sep 2 09:36:33 san-2a %PORT-5-IF_UP: %$VSAN 3270%$ Interface fc1/3 is up in mode F ucs3
But I got the same Errors :/
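To get a quick feel for how often a port flaps, the syslog lines above can be tallied per interface. The following is only a minimal sketch in Python; the function name `count_ols_downs` and the `SAMPLE` list are my own illustration, and the regular expression assumes the log format shown in this post.

```python
import re
from collections import Counter

# Matches MDS syslog lines of the form shown in this post, e.g.
# "... %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3270%$ Interface fc1/3 is down (OLS received) ucs3"
OLS_DOWN = re.compile(
    r"%PORT-5-IF_DOWN_OLS_RCVD:.*?Interface (?P<intf>fc\d+/\d+) is down"
)


def count_ols_downs(lines):
    """Count 'OLS received' down events per interface (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = OLS_DOWN.search(line)
        if match:
            counts[match.group("intf")] += 1
    return counts


# Log lines copied from the single-interface test above.
SAMPLE = [
    "2021 Sep 2 09:18:53 san-2a %PORT-5-IF_UP: %$VSAN 3270%$ Interface fc1/3 is up in mode F ucs3",
    "2021 Sep 2 09:32:04 san-2a %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3270%$ Interface fc1/3 is down (OLS received) ucs3",
    "2021 Sep 2 09:32:05 san-2a %PORT-5-IF_UP: %$VSAN 3270%$ Interface fc1/3 is up in mode F ucs3",
    "2021 Sep 2 09:36:32 san-2a %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3270%$ Interface fc1/3 is down (OLS received) ucs3",
    "2021 Sep 2 09:36:33 san-2a %PORT-5-IF_UP: %$VSAN 3270%$ Interface fc1/3 is up in mode F ucs3",
]

if __name__ == "__main__":
    for intf, n in count_ols_downs(SAMPLE).items():
        print(f"{intf}: {n} OLS-received down events")
```

Fed with a saved copy of the switch log instead of `SAMPLE`, this makes it easy to compare flap counts before and after a change.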
Next try was a newer Cisco 9396S MDS Switch:
2021 Sep 3 08:41:10 san-1b %PORT-5-IF_DOWN_NONE: %$VSAN 3370%$ Interface port-channel32 is down (None)
2021 Sep 3 08:41:10 san-1b %PORT-CHANNEL-5-FOP_CHANGED: port-channel32: first operational port changed from fc1/11 to none
2021 Sep 3 08:41:10 san-1b %PORT-CHANNEL-5-PORT_DOWN: port-channel32: fc1/11 is down
2021 Sep 3 08:41:10 san-1b %PORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: %$VSAN 3370%$ Interface port-channel32 is down (No operational members)
2021 Sep 3 08:41:10 san-1b %PORT-5-IF_PORT_QUIESCE_FAILED: Interface fc1/11 port quiesce failed due to failure reason: Force Abort Due to Link Failure (NOS/LOS) (0x119)
2021 Sep 3 08:41:10 san-1b %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3370%$ Interface fc1/11 is down (OLS received) port-channel32
2021 Sep 3 08:41:40 san-1b %PORT-5-IF_DOWN: %$VSAN 3370%$ Interface fc1/11 is down (Gracefully shutdown) port-channel32
2021 Sep 3 08:41:40 san-1b %PORT-CHANNEL-5-DELETED: port-channel32 deleted
2021 Sep 3 08:42:42 san-1b %LIBIFMGR-5-INTF_COUNTERS_CLEARED: Interface fc1/11, counters cleared by user
2021 Sep 3 08:42:58 san-1b %PORT-5-IF_UP: %$VSAN 3370%$ Interface fc1/11 is up in mode F
2021 Sep 3 08:46:22 san-1b %PORT-5-IF_DOWN_OLS_RCVD: %$VSAN 3370%$ Interface fc1/11 is down (OLS received)
2021 Sep 3 08:46:23 san-1b %PORT-5-IF_UP: %$VSAN 3370%$ Interface fc1/11 is up in mode F
2021 Sep 3 08:47:42 san-1b %PMON-SLOT1-3-RISING_THRESHOLD_REACHED: TX Credit Not Available has reached the rising threshold (port=fc1/11 [0x100a000], value=20) .
2021 Sep 3 08:47:43 san-1b %PMON-SLOT1-3-RISING_THRESHOLD_REACHED: Credit Loss Reco has reached the rising threshold (port=fc1/11 [0x100a000], value=1) .
2021 Sep 3 08:47:44 san-1b %PMON-SLOT1-3-FALLING_THRESHOLD_REACHED: Credit Loss Reco has reached the falling threshold (port=fc1/11 [0x100a000], value=0) .
2021 Sep 3 08:47:44 san-1b %PMON-SLOT1-3-FALLING_THRESHOLD_REACHED: TX Credit Not Available has reached the falling threshold (port=fc1/11 [0x100a000], value=0) .
2021 Sep 3 08:52:28 san-1b %PMON-SLOT1-3-RISING_THRESHOLD_REACHED: TX Credit Not Available has reached the rising threshold (port=fc1/11 [0x100a000], value=10) .
2021 Sep 3 08:52:29 san-1b %PMON-SLOT1-3-RISING_THRESHOLD_REACHED: Credit Loss Reco has reached the rising threshold (port=fc1/11 [0x100a000], value=1) .
2021 Sep 3 08:52:31 san-1b %PMON-SLOT1-3-FALLING_THRESHOLD_REACHED: Credit Loss Reco has reached the falling threshold (port=fc1/11 [0x100a000], value=0) .
2021 Sep 3 08:52:31 san-1b %PMON-SLOT1-3-FALLING_THRESHOLD_REACHED: TX Credit Not Available has reached the falling threshold (port=fc1/11 [0x100a000], value=0) .
Same OLS errors and some more, but still no stable Fibre Channel connection. The interfaces always showed these errors:
3 input OLS,1 LRR,0 NOS,0 loop inits
2 output OLS,1 LRR, 2 NOS, 2 loop inits
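If you want to track these counters over time, the two lines above are easy to turn into numbers. This is just a rough Python sketch that assumes exactly the comma-separated layout shown here; `parse_link_counters` is a made-up helper name, not part of any Cisco tooling.

```python
def parse_link_counters(line):
    """Parse a counter line such as '3 input OLS,1 LRR,0 NOS,0 loop inits'.

    Returns (direction, counters). Assumes the first field carries the
    direction ('input' or 'output') and every field starts with a number.
    """
    parts = [p.strip() for p in line.split(",")]
    first = parts[0].split()  # e.g. ["3", "input", "OLS"]
    direction = first[1]
    counters = {first[2]: int(first[0])}
    for part in parts[1:]:
        value, name = part.split(maxsplit=1)  # e.g. "0", "loop inits"
        counters[name] = int(value)
    return direction, counters


if __name__ == "__main__":
    for raw in (
        "3 input OLS,1 LRR,0 NOS,0 loop inits",
        "2 output OLS,1 LRR, 2 NOS, 2 loop inits",
    ):
        print(parse_link_counters(raw))
```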
Of course I did some research and found the following:
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvn14381/?rfs=iqvred
-> Should not affect us because our firmware is newer.
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvo08627/?rfs=iqvred
-> Unsure what the solution is.
https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Systems/Fabric%2C_Interconnect_and_Management_Switches/Cisco_MDS_switch_reporting_%22TX_Credit_Not_Available_has_reached_the_rising_threshold%22
-> We had already swapped plenty of cables and SFPs.
https://www.dell.com/support/kbdoc/de-ch/000167985/connectrix-mds-port-channel-between-cisco-mds-and-ucs-not-working
-> Single uplinks already tried.
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvv84472
-> The FI 6454 has the OUI 8c604f, which is included on the 9148 and the 9396S. A single uplink should work as well.
But none of them helped me out.
Frustrated, I went to the UCS GUI and triple-checked my settings, but there is not much to configure. In single uplink mode, however, I found a Fill Pattern setting which cannot be changed and is set to “Idle”.
On my old UCS 6248 system the Fill Pattern can be changed, and the default is “Arbff”.
I did some research about the “Fill Word” and found the following:
“Cisco UCS 6400 Series Fabric Interconnects do not support 8 Gbps direct-attached FC connectivity (FC uplink ports or FC storage ports) without fill-pattern set to IDLE. When migrating to Cisco UCS 6400 Series Fabric Interconnects from Cisco UCS 6200 Series Fabric Interconnects, do one of the following:”
• Use a SAN switch between the Cisco UCS 6400 Series Fabric Interconnect and the storage array with 8 GB FC connectivity.
• Upgrade the storage array to 16 GB or 32 GB FC connectivity.
So that means you can’t change this setting on the UCS, only on the MDS. I went to my MDS console and tried:
san-2a(config-if)# switchport fill-pattern IDLE speed 8000
Since that change all my ports have stayed stable, also later in the PortChannel configuration 😉
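With several uplink ports it gets tedious to type the same line for each one. Below is a small Python sketch that only prints the CLI lines for a list of interfaces so they can be reviewed and pasted by hand. The helper name `fill_pattern_config` and the interface list are my own examples; the `switchport fill-pattern IDLE speed 8000` line is the one from above, and whether a port needs to be bounced afterwards is something I have not verified.

```python
def fill_pattern_config(interfaces, speed=8000):
    """Build MDS CLI lines that set the fill pattern to IDLE (hypothetical helper).

    Only generates text; nothing is sent to a switch.
    """
    lines = ["configure terminal"]
    for intf in interfaces:
        lines.append(f"interface {intf}")
        # Same command as used above, applied per FC interface.
        lines.append(f"  switchport fill-pattern IDLE speed {speed}")
    lines.append("end")
    return "\n".join(lines)


if __name__ == "__main__":
    # fc1/3 and fc1/4 were the PortChannel members in my logs.
    print(fill_pattern_config(["fc1/3", "fc1/4"]))
```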