UCS Upgrade failed
Over the last few days I tried to upgrade my UCS domains from 2.2(7c) to 2.2(8g).
I have two UCS domains. One of them went through the upgrade fine, the other one did not :/
Here is what happened and how we fixed it …
I went through the steps I described in
STUMBLING BLOCKS IN UPGRADING CISCO UCS (PART 1 OF 2)
STUMBLING BLOCKS IN UPGRADING CISCO UCS (PART 2 OF 2)
In short form:
- Upgrade UCS Manager
- Upgrade IO Modules of the Chassis
- Upgrade Fabric Interconnect
After the reboot, the upgraded Fabric Interconnect (FI) came up in setup / config mode. This means there was no active config on it after the reboot.

WARNING: Please do the following with a Cisco TAC Engineer or at your own risk!
First we tried to reconnect to the running peer FI and pull its config:
Type the hot key to suspend the connection: <CTRL>Q
Enter the configuration method. (console/gui) ? console
Installer has detected the presence of a peer Fabric interconnect. This Fabric interconnect will be added to the cluster. Continue (y/n) ? y
Enter the admin password of the peer Fabric interconnect:
Connecting to peer Fabric interconnect... unable to connect! Password could be wrong.
Please ensure that the authentication mode on peer Fabric interconnect is set to 'Local'
Hit enter to try again or type 'restart' to start setup from beginning...
?
Connecting to peer Fabric interconnect... done
Retrieving config from peer Fabric interconnect... done
/isan/bin/getversion: error while loading shared libraries: libosiris.so: cannot open shared object file: No such file or directory
Installer has determined that the peer Fabric Interconnect is running a different firmware version than the local Fabric. Cannot join cluster.
Local Fabric Interconnect
UCSM version :
Kernel version :
System version :
local_model_no : 6248
Peer Fabric Interconnect
UCSM version : 2.2(8g)
Kernel version : 5.2(3)N2(2.27c)
System version : 5.2(3)N2(2.27c)
peer_model_no : 6248
Do you wish to update firmware on this Fabric Interconnect to the Peer's version? (y/n): y
Updating firmware of Fabric Interconnect....... [ Please don't press Ctrl+c while updating firmware ]
Updating images
Please wait for firmware update to complete....
Checking the Compatibility of new Firmware..... [ Please don't Press ctrl+c ].
Verifying image bootflash:/installables/switch/ucs-6100-k9-kickstart.5.2.3.N2.2.27c.bin for boot variable "kickstart".
[# ] 0%[####################] 100% -- SUCCESS
Verifying image bootflash:/installables/switch/ucs-6100-k9-system.5.2.3.N2.2.27c.bin for boot variable "system".
[# ] 0%[####################] 100% -- SUCCESS
Verifying image type.
[# ] 0%[####################] 100% -- SUCCESS
Extracting "system" version from image bootflash:/installables/switch/ucs-6100-k9-system.5.2.3.N2.2.27c.bin.
[# ] 0%[####################] 100% -- SUCCESS
Extracting "kickstart" version from image bootflash:/installables/switch/ucs-6100-k9-kickstart.5.2.3.N2.2.27c.bin.
[# ] 0%[####################] 100% -- SUCCESS
Extracting "bios" version from image bootflash:/installables/switch/ucs-6100-k9-system.5.2.3.N2.2.27c.bin.
[# ] 0%[####################] 100% -- SUCCESS
Performing module support checks.
[####################] 100% -- SUCCESS
Notifying services about system upgrade.
[####################] 100% -- SUCCESS
Compatibility check is done:
Module bootable Impact Install-type Reason
------ -------- -------------- ------------ ------
1 yes disruptive reset Incompatible image
Images will be upgraded according to following table:
Module Image Running-Version New-Version Upg-Required
------ ---------- ---------------------- ---------------------- ------------
1 system 5.2(3)N2(2.28g) 5.2(3)N2(2.27c) yes
1 kickstart 5.2(3)N2(2.28g) 5.2(3)N2(2.27c) yes
1 bios v3.6.0(05/09/2012) v3.6.0(05/09/2012) no
1 SFP-uC v1.1.0.0 v1.0.0.0 no
1 power-seq v3.0 v3.0 no
3 power-seq v2.0 v2.0 no
1 uC v1.2.0.1 v1.2.0.1 no
Switch will be reloaded for disruptive upgrade.
Install is in progress, please wait.
Performing runtime checks.
[####################] 100% -- SUCCESS
Setting boot variables.
[# ] 0%[####################] 100% -- SUCCESS
Performing configuration copy.
[# ] 0%[####################] 100% -- SUCCESS
Converting startup config.
[# ] 0%[####################] 100% -- SUCCESS
Install has been successful.
Firmware Updation Successfully Completed. Please wait to enter the IP address
Type 'reboot' to abort configuration and reboot system
or hit enter to continue. (reboot/<CR>) ?
Peer Fabric interconnect Mgmt0 IPv4 Address: 10.0.0.1
Peer Fabric interconnect Mgmt0 IPv4 Netmask: 255.255.255.0
Cluster IPv4 address : 10.0.0.3
Peer FI is IPv4 Cluster enabled. Please Provide Local Fabric Interconnect Mgmt0 IPv4 Address
Physical Switch Mgmt0 IP address :
Mgmt0 IP must be specified
Physical Switch Mgmt0 IP address : 10.0.0.2
Apply and save the configuration (select 'no' if you want to re-enter)? (yes/no):
Type 'reboot' to abort configuration and reboot system
or hit enter to continue. (reboot/<CR>) ?
Apply and save the configuration (select 'no' if you want to re-enter)? (yes/no): yes
Applying configuration. Please wait.
Tue Oct 31 11:59:53 UTC 2017
Type 'reboot' to abort configuration and reboot system
or hit enter to continue. (reboot/<CR>) ? Configuration file - Ok
2017 Oct 31 12:00:10 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: dhcpd crashed with crash type:256
2017 Oct 31 12:00:10 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:00:11 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: dhcpd crashed with crash type:256
2017 Oct 31 12:00:11 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:00:11 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: dhcpd crashed with crash type:256
2017 Oct 31 12:00:11 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:00:11 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: dhcpd crashed with crash type:256
2017 Oct 31 12:00:11 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:00:12 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: dhcpd crashed with crash type:256
2017 Oct 31 12:00:12 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:00:12 UCS1-B %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: dhcpd - pmon
User Access Verification
UCS1-B login: admin
Password:
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2017, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
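One detail in the installer output above is easy to miss: the peer reports UCSM 2.2(8g) but still runs kernel and system 5.2(3)N2(2.27c), and the install table lists 5.2(3)N2(2.28g) as the running version and 5.2(3)N2(2.27c) as the "new" one. Syncing to the peer therefore rolled this FI back to the old images. A minimal Python sketch of that comparison follows. The regular expression and the function names are my own, derived only from the version strings in this log, and are not a Cisco tool.

```python
import re

# Pattern for version strings as they appear in the log above,
# e.g. "5.2(3)N2(2.27c)". ASSUMPTION: derived from this log only.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)N(\d+)\((\d+)\.(\d+)([a-z]*)\)$")


def parse_nxos_version(text):
    """Turn '5.2(3)N2(2.27c)' into a tuple that sorts in release order."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    *numbers, letter = match.groups()
    return tuple(int(n) for n in numbers) + (letter,)


def describe_change(running, new):
    """Say whether moving from 'running' to 'new' is an upgrade or a downgrade."""
    old, target = parse_nxos_version(running), parse_nxos_version(new)
    if target > old:
        return "upgrade"
    if target < old:
        return "downgrade"
    return "same version"


if __name__ == "__main__":
    # Values taken from the "Images will be upgraded" table above.
    print(describe_change("5.2(3)N2(2.28g)", "5.2(3)N2(2.27c)"))  # prints "downgrade"
```

Comparing tuples instead of raw strings avoids surprises such as "2.9" sorting after "2.27".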
At this point the FI did not accept any commands, so we did a “hard” reboot. Again, the FI came up without a config.
N5000 BIOS v.3.6.0, Wed 05/09/2012, 03:15 PM
989CB4B4B4B4B4B4B4B4B4B49999999299A0A2A3A0A2A3B2 B2Version 2.00.1201. Copyright (C) 2009 American Megatrends, Inc.
Booting kickstart image: bootflash:/installables/switch/ucs-6100-k9-kickstart.5.2.3.N2.2.27c.bin....
...............................................................................
...........................................Image verification OK
Usage: init 0123456SsQqAaBbCcUu
INIT: [ 10.657597] I2C - Mezz absent
Starting system POST.....
Executing Mod 1 1 SEEPROM Test:...done (0 seconds)
Executing Mod 1 1 GigE Port Test:....done (32 seconds)
Executing Mod 1 1 PCIE Test:.................done (0 seconds)
Mod 1 1 Post Completed Successfully
POST is completed
can't create lock file /var/lock/mtab~207: No such file or directory (use -n flag to override)
S10mount-ramfs.supnuovaca Mounting /isan 3000m
Mounted /isan
Creating /callhome..
Mounting /callhome..
Creating /callhome done.
Callhome spool file system init done.
nohup: redirecting stderr to stdout
autoneg unmodified, ignoring
autoneg unmodified, ignoring
Checking all filesystems..r.r.r. done.
Checking NVRAM block device ... done
The startup-config won't be used until the next reboot.
.
Loading system software
Uncompressing system image: bootflash:/installables/switch/ucs-6100-k9-system.5.2.3.N2.2.27c.bin
Loading plugin 0: core_plugin...
Loading plugin 1: eth_plugin...
Loading plugin 2: fc_plugin...
13+1 records in
13+1 records out
10240 bytes (10 kB) copied, 5.7017e-05 s, 180 MB/s
ethernet end-host mode on CA
FC end-host mode on CA
n_port virtualizer mode.
---------------------------------------------------------------
INIT: Entering runlevel: 3
touch: cannot touch `/var/lock/subsys/netfs': No such file or directory
/isan/bin/muxif_config: fex vlan id: -f,4042
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 4042 to IF -:muxif:-
cp: cannot stat `/isan/plugin_img/fex.bin': No such file or directory
---------------------
enabled fc feature
---------------------
2017 Oct 31 12:09:31 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: loading cmd files begin - clis
2017 Oct 31 12:09:34 %$ VDC-1 %$ Oct 31 12:09:34 %KERN-0-SYSTEM_MSG: [ 10.657597] I2C - Mezz absent - kernel
2017 Oct 31 12:09:41 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: loading cmd files end - clis
2017 Oct 31 12:09:41 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: CLIS: init begin - clis
2017 Oct 31 12:09:49 %$ VDC-1 %$ %SNMPD-2-CRITICAL: SNMP log critical : load_mib_module :Error, while loading the mib module /isan/lib/libsvc_sam_extSnmpPlugin.so (/isan/lib/libsvc_sam_extSnmpPlugin.so: cannot open shared object file: No such file or directory)
2017 Oct 31 12:09:54 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_controller crashed with crash type:32512
2017 Oct 31 12:09:54 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:54 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:32512
2017 Oct 31 12:09:54 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:54 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dcosAG crashed with crash type:32512
2017 Oct 31 12:09:54 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:55 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_bladeAG crashed with crash type:32512
2017 Oct 31 12:09:55 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:55 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_portAG crashed with crash type:32512
2017 Oct 31 12:09:55 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:55 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_hostagentAG crashed with crash type:32512
2017 Oct 31 12:09:55 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:56 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_nicAG crashed with crash type:32512
2017 Oct 31 12:09:56 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:56 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_extvmmAG crashed with crash type:32512
2017 Oct 31 12:09:56 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:56 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_cliD crashed with crash type:32256
2017 Oct 31 12:09:56 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:56 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_pamProxy crashed with crash type:32512
2017 Oct 31 12:09:56 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:57 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_controller crashed with crash type:32512
2017 Oct 31 12:09:57 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:57 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:32512
2017 Oct 31 12:09:57 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:57 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dcosAG crashed with crash type:32512
2017 Oct 31 12:09:57 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:58 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_bladeAG crashed with crash type:32512
2017 Oct 31 12:09:58 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:58 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_portAG crashed with crash type:32512
2017 Oct 31 12:09:58 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:09:59 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_hostagentAG crashed with crash type:32512
2017 Oct 31 12:09:59 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:00 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_nicAG crashed with crash type:32512
2017 Oct 31 12:10:00 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:00 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_extvmmAG crashed with crash type:32512
2017 Oct 31 12:10:00 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:01 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_cliD crashed with crash type:32256
2017 Oct 31 12:10:01 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:01 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_pamProxy crashed with crash type:32512
2017 Oct 31 12:10:01 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:02 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_controller crashed with crash type:32512
2017 Oct 31 12:10:02 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:03 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:32512
2017 Oct 31 12:10:03 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:03 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dcosAG crashed with crash type:32512
2017 Oct 31 12:10:03 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:04 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_bladeAG crashed with crash type:32512
2017 Oct 31 12:10:04 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:04 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_portAG crashed with crash type:32512
2017 Oct 31 12:10:04 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:04 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_hostagentAG crashed with crash type:32512
2017 Oct 31 12:10:04 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:05 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_nicAG crashed with crash type:32512
2017 Oct 31 12:10:05 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:05 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_extvmmAG crashed with crash type:32512
2017 Oct 31 12:10:05 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:05 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_cliD crashed with crash type:32256
2017 Oct 31 12:10:05 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:06 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_pamProxy crashed with crash type:32512
2017 Oct 31 12:10:06 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:06 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_controller crashed with crash type:32512
2017 Oct 31 12:10:06 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:06 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_controller - pmon
2017 Oct 31 12:10:06 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:32512
2017 Oct 31 12:10:06 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:06 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_dme - pmon
2017 Oct 31 12:10:07 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dcosAG crashed with crash type:32512
2017 Oct 31 12:10:07 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:07 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_dcosAG - pmon
2017 Oct 31 12:10:07 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_bladeAG crashed with crash type:32512
2017 Oct 31 12:10:07 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:07 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_bladeAG - pmon
2017 Oct 31 12:10:07 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_portAG crashed with crash type:32512
2017 Oct 31 12:10:07 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:07 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_portAG - pmon
2017 Oct 31 12:10:08 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_hostagentAG crashed with crash type:32512
2017 Oct 31 12:10:08 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:08 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_hostagentAG - pmon
2017 Oct 31 12:10:08 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_nicAG crashed with crash type:32512
2017 Oct 31 12:10:08 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:08 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_nicAG - pmon
2017 Oct 31 12:10:08 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_extvmmAG crashed with crash type:32512
2017 Oct 31 12:10:08 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:08 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_extvmmAG - pmon
2017 Oct 31 12:10:09 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_cliD crashed with crash type:32256
2017 Oct 31 12:10:09 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:09 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_cliD - pmon
2017 Oct 31 12:10:09 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_pamProxy crashed with crash type:32512
2017 Oct 31 12:10:09 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:09 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_pamProxy - pmon
2017 Oct 31 12:10:27 %$ VDC-1 %$ %CALLHOME-2-EVENT: httpd.sh crashed with crash type:0
2017 Oct 31 12:10:27 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
System is coming up ... Please wait ...
System is coming up ... Please wait ...
System is coming up ... Please wait ...
2017 Oct 31 12:10:46 %$ VDC-1 %$ %VDC_MGR-2-VDC_ONLINE: vdc 1 has come online
System is coming up ... Please wait ...
nohup: appending output to `nohup.out'
2017 Oct 31 12:11:01 switch %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Running in PIO stats mode - carmelusd
2017 Oct 31 12:11:01 switch %$ VDC-1 %$ %CALLHOME-2-EVENT: httpd.sh crashed with crash type:0
2017 Oct 31 12:11:01 switch %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
---- Basic System Configuration Dialog ----
This setup utility will guide you through the basic configuration of
the system. Only minimal configuration including IP connectivity to
the Fabric interconnect and its clustering mode is performed through these steps.
Type Ctrl-C at any time to abort configuration and reboot system.
To back track or make modifications to already entered values,
complete input till end of section and answer no when prompted
to apply configuration.
Enter the configuration method. (console/gui) ?
Type 'reboot' to abort configuration and reboot system
or hit enter to continue. (reboot/<CR>) ? 2017 Oct 31 12:11:34 switch %$ VDC-1 %$ %CALLHOME-2-EVENT: httpd.sh crashed with crash type:0
2017 Oct 31 12:11:34 switch %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
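The crash storm above is easier to read as a summary per process. The following Python sketch counts the CALLHOME crash events and collects the processes for which pmon gave up. It also shifts the crash type right by 8 bits, on the assumption (mine, not confirmed by any Cisco documentation I know of) that the number is a raw wait()-style status. Under that reading 32512 would be exit code 127 and 32256 exit code 126, and 127 is, as far as I know, what a Linux process returns when the dynamic loader cannot find a shared library. That would fit the "cannot open shared object file" errors in the log.

```python
import re
from collections import Counter

CRASH_RE = re.compile(r"%CALLHOME-2-EVENT: (\S+) crashed with crash type:(\d+)")
EXHAUSTED_RE = re.compile(r"Restart count exhausted for process: (\S+)")


def summarize_crashes(log_text):
    """Return (crash counts per (process, crash type), processes pmon gave up on)."""
    crashes = Counter()
    exhausted = set()
    for line in log_text.splitlines():
        crash = CRASH_RE.search(line)
        if crash:
            crashes[(crash.group(1), int(crash.group(2)))] += 1
        gave_up = EXHAUSTED_RE.search(line)
        if gave_up:
            exhausted.add(gave_up.group(1))
    return crashes, exhausted


# A few lines copied from the log above.
SAMPLE = """\
2017 Oct 31 12:10:06 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_dme crashed with crash type:32512
2017 Oct 31 12:10:06 %$ VDC-1 %$ %CALLHOME-2-EVENT: SW_CRASH
2017 Oct 31 12:10:06 %$ VDC-1 %$ %USER-2-SYSTEM_MSG: Restart count exhausted for process: svc_sam_dme - pmon
2017 Oct 31 12:10:09 %$ VDC-1 %$ %CALLHOME-2-EVENT: svc_sam_cliD crashed with crash type:32256
"""

if __name__ == "__main__":
    crashes, exhausted = summarize_crashes(SAMPLE)
    for (process, crash_type), count in sorted(crashes.items()):
        # ASSUMPTION: crash type is a wait()-style status, so the
        # exit code would be the high byte (crash_type >> 8).
        print(f"{process}: {count} crash(es), type {crash_type}, "
              f"possible exit code {crash_type >> 8}")
    print("pmon gave up on:", sorted(exhausted))
```

Run against the full console capture, this quickly shows that every svc_sam_* service failed the same way, which points at the image rather than at a single broken process.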
We had to fix the boot partitions. To break into the boot loader, you have to keep pressing Ctrl+L during the boot process:
N5000 BIOS v.3.6.0, Wed 05/09/2012, 03:15 PM
989CB4B4B4B4B4B4B4B4B4B49999999299A0A2A3A0A2A3B2 B2Version 2.00.1201. Copyright (C) 2009 American Megatrends, Inc.
User break into bootloader
loader> dir bootflash:
span.log ucs-6100-k9-kickstart.5.0.3.N2.2.02q.bin ucs-6100-k9-system.5.0.3.N2.2.02q.bin chassis.img pnuos nuova-sim-mgmt-nsg.0.1.0.001.bin chassis2.img fexth.bin installables sysdebug distributables_hdr cores techsupport mts.log vdc_2 vdc_3 vdc_4 distributables initial_setup.log license received
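Next we booted the kickstart image from the bootflash. Only the kickstart is loaded this way, so the switch stops at the switch(boot)# prompt with "No system image":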
loader> boot bootflash:/installables/switch/ucs-6100-k9-kickstart.5.2.3.N2.2.27c.bin
Booting kickstart image: bootflash:/installables/switch/ucs-6100-k9-kickstart.5.2.3.N2.2.27c.bin....
...............................................................................
...........................................Image verification OK
Usage: init 0123456SsQqAaBbCcUu
INIT: [ 10.668160] I2C - Mezz absent
Starting system POST.....
Executing Mod 1 1 SEEPROM Test:...done (0 seconds)
Executing Mod 1 1 GigE Port Test:....done (32 seconds)
Executing Mod 1 1 PCIE Test:.................done (0 seconds)
Mod 1 1 Post Completed Successfully
POST is completed
can't create lock file /var/lock/mtab~207: No such file or directory (use -n flag to override)
S10mount-ramfs.supnuovaca Mounting /isan 3000m
Mounted /isan
Creating /callhome..
Mounting /callhome..
Creating /callhome done.
Callhome spool file system init done.
nohup: redirecting stderr to stdout
autoneg unmodified, ignoring
autoneg unmodified, ignoring
Checking all filesystems..... done.
Checking NVRAM block device ... done
The startup-config won't be used until the next reboot.
.
Loading system software
No system image
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2016, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
At this point we had to configure the management IP of the FI:
switch(boot)# conf terminal
Enter configuration commands, one per line. End with CNTL/Z.
switch(boot)(config)# interface mgmt 0
switch(boot)(config-if)# ip address 10.0.0.2 255.255.255.0
switch(boot)(config-if)# no shutdown
switch(boot)(config-if)# exit
switch(boot)(config)# ip default-gateway 10.0.0.254
switch(boot)(config)# exit
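Before typing the addresses it is worth double-checking that the management IP and the default gateway sit in the same subnet, because a typo here can leave the FI unable to reach the TFTP server in the next step. A small sketch with Python's standard `ipaddress` module, using the example addresses from this post (the function name is my own):

```python
import ipaddress


def gateway_reachable(ip, netmask, gateway):
    """True if the gateway lies inside the subnet defined by ip/netmask."""
    interface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return ipaddress.ip_address(gateway) in interface.network


if __name__ == "__main__":
    # Addresses as configured on mgmt 0 above.
    print(gateway_reachable("10.0.0.2", "255.255.255.0", "10.0.0.254"))  # prints True
```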
To get the debug plugin onto the switch, you have to copy it from a TFTP server:
switch(boot)# copy tftp://[my-tftp-ip]/ucs-dplug.5.2.3.N2.2.27c.gbin workspace:debug_plugin/ucs-dplug.5.2.3.N2.2.27c.gbin
Trying to connect to tftp server......
Connection to server Established. Copying Started.....
TFTP get operation was successful
Copy complete, now saving to disk (please wait)...
switch(boot)#
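Note that the debug plugin file carries the same version string as the kickstart image that was booted (5.2.3.N2.2.27c). I assume the two have to match; this post only shows that they did in my case. A hypothetical helper that compares the version token embedded in such file names could look like this:

```python
import re

# Version token as embedded in the file names in this post,
# e.g. "5.2.3.N2.2.27c". ASSUMPTION: pattern derived from these names only.
_FILE_VERSION_RE = re.compile(r"\.(\d+\.\d+\.\d+\.N\d+\.\d+\.\d+[a-z]*)\.g?bin$")


def file_version(filename):
    """Extract the version token from an image or plugin file name."""
    match = _FILE_VERSION_RE.search(filename)
    if match is None:
        raise ValueError(f"no version found in {filename!r}")
    return match.group(1)


def same_release(image, plugin):
    """True if both file names carry the same version token."""
    return file_version(image) == file_version(plugin)


if __name__ == "__main__":
    print(same_release(
        "ucs-6100-k9-kickstart.5.2.3.N2.2.27c.bin",
        "ucs-dplug.5.2.3.N2.2.27c.gbin",
    ))  # prints True
```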
In the next step we loaded the debug plugin, unmounted the filesystems and repaired them:
switch(boot)# copy workspace:debug_plugin/ucs-dplug.5.2.3.N2.2.27c.gbin xy
Copy complete, now saving to disk (please wait)...
switch(boot)# load xy
Loading plugin version 5.2(3)N2(2.27c)
###############################################################
Warning: debug-plugin is for engineering internal use only!
For security reason, plugin image has been deleted.
###############################################################
Successfully loaded debug-plugin!!!
Linux(debug)# umount /dev/mtdblock2
Linux(debug)# umount /dev/mtdblock3
Linux(debug)# umount /dev/sda3
Linux(debug)# umount /dev/sda4
Linux(debug)# umount /dev/sda5
Linux(debug)# umount /dev/sda6
Linux(debug)# umount /dev/sda7
umount: /dev/sda7: not mounted
Linux(debug)# umount /dev/sda8
Linux(debug)# umount /dev/sda9
umount: /dev/sda9: not found
Linux(debug)# e2fsck -y /dev/sda3
e2fsck 1.35 (28-Feb-2004)
/dev/sda3: clean, 1746/2125760 files, 1832918/4247184 blocks
Linux(debug)# e2fsck -y /dev/sda7
e2fsck 1.35 (28-Feb-2004)
Couldn't find ext2 superblock, trying backup blocks...
/dev/sda7 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #0 (24237, counted=21663).
Fix? yes
Free blocks count wrong for group #1 (32461, counted=32442).
Fix? yes
Free blocks count wrong for group #3 (32461, counted=32459).
Fix? yes
Free blocks count wrong for group #8 (30266, counted=24180).
Fix? yes
Free blocks count wrong for group #13 (32463, counted=32462).
Fix? yes
Free blocks count wrong for group #14 (32462, counted=32463).
Fix? yes
Free blocks count wrong for group #23 (32463, counted=32460).
Fix? yes
Free blocks count wrong for group #26 (32460, counted=32456).
Fix? yes
Free blocks count wrong for group #27 (32461, counted=32460).
Fix? yes
Free blocks count wrong for group #29 (32463, counted=32462).
Fix? yes
Free blocks count wrong (982140, counted=973450).
Fix? yes
Free inodes count wrong for group #0 (9684, counted=9682).
Fix? yes
Free inodes count wrong for group #1 (9696, counted=9693).
Fix? yes
Directories count wrong for group #1 (0, counted=1).
Fix? yes
Free inodes count wrong for group #3 (9696, counted=9694).
Fix? yes
Directories count wrong for group #3 (0, counted=2).
Fix? yes
Free inodes count wrong for group #8 (9689, counted=9687).
Fix? yes
Free inodes count wrong for group #13 (9696, counted=9695).
Fix? yes
Directories count wrong for group #13 (0, counted=1).
Fix? yes
Free inodes count wrong for group #14 (9695, counted=9696).
Fix? yes
Directories count wrong for group #14 (1, counted=0).
Fix? yes
Free inodes count wrong for group #23 (9696, counted=9692).
Fix? yes
Directories count wrong for group #23 (0, counted=1).
Fix? yes
Free inodes count wrong for group #26 (9693, counted=9689).
Fix? yes
Directories count wrong for group #26 (1, counted=2).
Fix? yes
Free inodes count wrong for group #27 (9696, counted=9695).
Fix? yes
Directories count wrong for group #27 (0, counted=1).
Fix? yes
Free inodes count wrong for group #29 (9696, counted=9694).
Fix? yes
Directories count wrong for group #29 (0, counted=1).
Fix? yes
Free inodes count wrong (300543, counted=300523).
Fix? yes
/dev/sda7: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda7: 53/300576 files (1.9% non-contiguous), 28596/1002046 blocks
Linux(debug)# e2fsck -n -f /dev/sda8
e2fsck 1.35 (28-Feb-2004)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sda8: 19/501952 files (0.0% non-contiguous), 25374/1002046 blocks
Linux(debug)#
Linux(debug)# e2fsck -n -f /dev/sda9
e2fsck 1.35 (28-Feb-2004)
e2fsck: No such file or directory while trying to open /dev/sda9
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
Linux(debug)#
Linux(debug)#
Linux(debug)# tune2fs -j /dev/sda3
tune2fs 1.35 (28-Feb-2004)
The filesystem already has a journal.
Linux(debug)# reboot
INIT:
INIT: Sending processes the TERM signal
Linux(debug)#
INIT: /isanboot/sbin/loadplugin: line 172: 1490 Hangup $AUTORUN
switch(boot)# Sending all processes the TERM signal...
Sending all processes the KILL signal...
Saving random seed:
Syncing hardware clock to system time
Unmounting file systems:
mount: you must specify the filesystem type
mount: /var not mounted already, or bad option
Please stand by while rebooting the system...
[ 1131.078240] Restarting system.
[ 1131.114620] machine restart
[ 1131.147864] Resetting board (uc)
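With this many "Fix? yes" lines it is easy to lose track of which partition was actually damaged. The sketch below walks through a captured transcript and reports one verdict per e2fsck run. The verdict labels and the function name are my own; the matching relies only on strings visible in the output above.

```python
import re

# Picks the device argument out of a prompt line such as
# "Linux(debug)# e2fsck -y /dev/sda3".
CMD_RE = re.compile(r"e2fsck\s+.*?(/dev/\S+)\s*$")


def summarize_fsck(transcript):
    """Map each checked device to 'clean', 'modified', 'failed' or 'checked'."""
    results = {}
    device = None
    for line in transcript.splitlines():
        line = line.strip()
        if "# e2fsck" in line:
            match = CMD_RE.search(line)
            device = match.group(1) if match else None
            if device:
                results[device] = "checked"  # default until a verdict line shows up
        elif device is None:
            continue
        elif "FILE SYSTEM WAS MODIFIED" in line:
            results[device] = "modified"
        elif line.startswith(device + ": clean"):
            results[device] = "clean"
        elif line.startswith("e2fsck:") and "while trying to open" in line:
            results[device] = "failed"
    return results


# Shortened excerpt of the session above.
SAMPLE = """\
Linux(debug)# e2fsck -y /dev/sda3
e2fsck 1.35 (28-Feb-2004)
/dev/sda3: clean, 1746/2125760 files, 1832918/4247184 blocks
Linux(debug)# e2fsck -y /dev/sda7
Couldn't find ext2 superblock, trying backup blocks...
/dev/sda7: ***** FILE SYSTEM WAS MODIFIED *****
Linux(debug)# e2fsck -n -f /dev/sda9
e2fsck: No such file or directory while trying to open /dev/sda9
"""

if __name__ == "__main__":
    for dev, verdict in summarize_fsck(SAMPLE).items():
        print(dev, verdict)
```

In my session only /dev/sda7 was modified, and /dev/sda9 simply does not exist on this FI.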
After two reboots the FI had its config back:
UCS1-B# show cluster state
Cluster Id: 0x357b36b2b45611e1-0xbaac547fee935324
B: UP, SUBORDINATE
A: UP, PRIMARY
HA NOT READY
Waiting for response from device. Device count, expected: 3, active: 2
Detailed state of the device selected for HA storage:
Chassis 1, serial: MYSERIALNO, state: inactive
Chassis 2, serial: MYSERIALNO, state: active
Chassis 3, serial: MYSERIALNO, state: active
Fabric B, Unable to connect to local chassis-shared-storage management interface : MYSERIALNO
Warning: there are pending management I/O errors on one or more devices, failover may not complete
UCS1-B# 2017 Oct 31 15:36:27 UCS1-B %$ VDC-1 %$ %SATCTRL-FEX1-2-SATCTRL: IOM-1 Module 1: Cold boot
2017 Oct 31 15:36:34 UCS1-B %$ VDC-1 %$ %PFMA-2-FEX_STATUS: Fex 1 is online (Serial number )
2017 Oct 31 15:36:34 UCS1-B %$ VDC-1 %$ %NOHMS-2-NOHMS_ENV_FEX_ONLINE: FEX-1 On-line
2017 Oct 31 15:36:34 UCS1-B %$ VDC-1 %$ %CALLHOME-2-EVENT: FEX_ONLINE
2017 Oct 31 15:36:34 UCS1-B %$ VDC-1 %$ %PFMA-2-FEX_STATUS: Fex 1 is online (Serial number )
2017 Oct 31 15:40:22 UCS1-B %$ VDC-1 %$ %PORT-2-IF_DOWN_ERROR_DISABLED: %$VSAN 3370%$ Interface vfc1217 is down (Error disabled) server 1/5, VHBA vHBA_0B
2017 Oct 31 15:40:22 UCS1-B %$ VDC-1 %$ %PORT-2-IF_DOWN_ERROR_DISABLED: %$VSAN 3370%$ Interface vfc1295 is down (Error disabled) server 1/7, VHBA vHBA_0B
2017 Oct 31 15:40:22 UCS1-B %$ VDC-1 %$ %PORT-2-IF_DOWN_ERROR_DISABLED: %$VSAN 3370%$ Interface vfc1177 is down (Error disabled) server 1/6, VHBA vHBA_0B
2017 Oct 31 15:40:23 UCS1-B %$ VDC-1 %$ %PORT-2-IF_DOWN_ERROR_DISABLED: %$VSAN 3370%$ Interface vfc1153 is down (Error disabled) server 1/8, VHBA vHBA_0B
2017 Oct 31 15:40:24 UCS1-B %$ VDC-1 %$ %PORT-2-IF_DOWN_ERROR_DISABLED: %$VSAN 3370%$ Interface vfc1569 is down (Error disabled) server 1/1, VHBA vHBA_0B
UCS1-B#
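The important parts of this output are the role of each fabric, the "HA NOT READY" marker and the inactive chassis. A rough Python sketch for pulling those out of a saved `show cluster state` capture follows. It is based only on the output format shown here and may need adjusting for other releases; the function name and the dictionary layout are my own.

```python
import re

ROLE_RE = re.compile(r"^([AB]): (\w+), (\w+)$")


def parse_cluster_state(output):
    """Return fabric roles, an HA flag and inactive chassis from the CLI text."""
    state = {"fabrics": {}, "ha_ready": None, "inactive_chassis": []}
    for line in output.splitlines():
        line = line.strip()
        role = ROLE_RE.match(line)
        if role:
            state["fabrics"][role.group(1)] = (role.group(2), role.group(3))
        elif line == "HA NOT READY":
            state["ha_ready"] = False
        elif line == "HA READY":
            state["ha_ready"] = True
        elif line.startswith("Chassis") and line.endswith("state: inactive"):
            state["inactive_chassis"].append(line.split(",")[0])
    return state


# Excerpt of the output above, one item per line.
SAMPLE = """\
Cluster Id: 0x357b36b2b45611e1-0xbaac547fee935324
B: UP, SUBORDINATE
A: UP, PRIMARY
HA NOT READY
Chassis 1, serial: MYSERIALNO, state: inactive
Chassis 2, serial: MYSERIALNO, state: active
Chassis 3, serial: MYSERIALNO, state: active
"""

if __name__ == "__main__":
    print(parse_cluster_state(SAMPLE))
```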
At this point the TAC engineer told me the case was fixed.
Whooot???? This does not look like everything is fine! OK, the FI has its config, but there are still errors.
Nevertheless, I had to open a new TAC case to get these errors fixed:
[FSM:STAGE:RETRY:]: external VM manager extension-key configuration on local fabric(FSM-STAGE:sam:dme:ExtvmmMasterExtKeyConfig:SetPeer) F16898
[FSM:STAGE:REMOTE-ERROR]: Result: service-unavailable Code: unspecified Message: Error syncing extension key(sam:dme:ExtvmmMasterExtKeyConfig:SetPeer) F78338
[FSM:STAGE:REMOTE-ERROR]: Result: service-unavailable Code: unspecified Message: Error syncing extension key(sam:dme:ExtvmmProviderConfig:SetPeer) F78319
[FSM:STAGE:RETRY:]: external VM manager configuration on peer fabric(FSM-STAGE:sam:dme:ExtvmmProviderConfig:SetPeer) F16879
[FSM:FAILED]: external VM manager extension-key configuration(FSM:sam:dme:ExtvmmMasterExtKeyConfig). Remote-Invocation-Error: Error syncing extension key F999938
[FSM:FAILED]: external VM manager configuration(FSM:sam:dme:ExtvmmProviderConfig). Remote-Invocation-Error: Error syncing extension key F999919
At this point I was very disappointed with TAC support.
Digging further, I found bug CSCvf27661.
So I opened a new TAC case to get the internal SSH keys exchanged.
First, this is how to see the error:
UCS1-A# scope system
UCS1-A /system # show managed-entity detail
Managed Entity:
Fabric ID: A
Leadership: Primary
State: Up
Umbilical State: Full
HA Ready: Yes
SSH Internal Root Pub Key Checksum: aLongChecksum1
SSH Internal Root Pub Key Size: 225
SSH Internal Auth Keys Checksum: aLongChecksum2
SSH Internal Auth Keys Size: 225
SSH Internal Keys Status: Matched
Fabric ID: B
Leadership: Subordinate
State: Up
Umbilical State: Full
HA Ready: Yes
SSH Internal Root Pub Key Checksum: aLongChecksum3
SSH Internal Root Pub Key Size: 219
SSH Internal Auth Keys Checksum: aLongChecksum1
SSH Internal Auth Keys Size: 225
SSH Internal Keys Status: Mismatched
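To spot the mismatch without reading the whole listing, the output can be split per fabric and filtered on the key status. A minimal sketch, assuming the field labels appear exactly as in the listing above (the function name is my own):

```python
def mismatched_fabrics(output):
    """Return fabric IDs whose 'SSH Internal Keys Status' is not 'Matched'."""
    fabrics = {}
    current = None
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # not a "label: value" line
        key, value = key.strip(), value.strip()
        if key == "Fabric ID":
            current = value
            fabrics[current] = {}
        elif current is not None:
            fabrics[current][key] = value
    # A fabric without a status line is reported as well, to be on the safe side.
    return [fid for fid, fields in fabrics.items()
            if fields.get("SSH Internal Keys Status") != "Matched"]


# Shortened version of the listing above.
SAMPLE = """\
Managed Entity:
Fabric ID: A
Leadership: Primary
SSH Internal Keys Status: Matched
Fabric ID: B
Leadership: Subordinate
SSH Internal Keys Status: Mismatched
"""

if __name__ == "__main__":
    print(mismatched_fabrics(SAMPLE))  # prints ['B']
```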
Get the keys matching on both systems by editing them with vi 😉
Linux(debug)# cat /root/.ssh/id_rsa.pub
ssh-rsa AveryLongSSHKey1== root@(none)
Linux(debug)# cat /var/home/samdme/.ssh/authorized_keys
ssh-rsa AveryLongSSHKey2== root@(none)
I can't say why the TAC engineer did it in such a weird way; it took more than an hour to edit these two keys. He also made a mistake! Make sure your key ends with “root@(none)”.
In my case the “(none)” was missing in one key. I fixed it after the TAC session and all errors were gone.
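A quick sanity check of an edited key line can save a second TAC session. The sketch below only verifies the shape of an OpenSSH public key line (type, base64 blob, comment) and that the comment is exactly `root@(none)`, which is the requirement described above. It does not validate the key material beyond base64 decoding, and the function name is my own.

```python
import base64
import binascii

EXPECTED_COMMENT = "root@(none)"


def check_key_line(line):
    """Return a list of problems found in one public key line (empty if none)."""
    fields = line.split()
    if len(fields) != 3:
        return [f"expected 3 fields (type, key, comment), found {len(fields)}"]
    key_type, blob, comment = fields
    problems = []
    if key_type != "ssh-rsa":
        problems.append(f"unexpected key type {key_type!r}")
    try:
        base64.b64decode(blob, validate=True)
    except (binascii.Error, ValueError):
        problems.append("key blob is not valid base64")
    if comment != EXPECTED_COMMENT:
        problems.append(f"comment is {comment!r}, expected {EXPECTED_COMMENT!r}")
    return problems


if __name__ == "__main__":
    # "QUJDRA==" is a dummy blob, not a real key.
    print(check_key_line("ssh-rsa QUJDRA== root@(none)"))  # prints []
    print(check_key_line("ssh-rsa QUJDRA== root@"))
```

Copy the line out of the `cat` output and run it through the check before leaving the debug shell.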
Hope this helps you …
Leave a comment and share!
UPDATE:
Please also check CSCva31113. This explained my issue exactly.
One thought on “UCS Upgrade failed”
Hello Chris,
I read your post. It is very helpful. I have a similar problem on a client's UCS FI 2248UP. Could you provide me with the debug plugin ucs-dplug.5.2.3.N2.2.27c.gbin? I would appreciate it.
Best regards
Bosko Kecman