An engineer was remotely upgrading a Catalyst 9300 switch. They issued the one-shot upgrade command:
install add file flash:cat9k_iosxe.17.09.04.SPA.bin activate commit
This command is designed to perform the entire upgrade process—adding the new software, setting it as the active package for the next boot, and committing the change. The final step is a prompt asking for confirmation to reload the switch.
However, the engineer’s SSH session disconnected for a minute right after the command was sent. When they reconnected, the confirmation prompt was gone. Any attempt to re-run the install command resulted in a frustrating error:
File cannot start new install. Operation is already running.
The engineer was now in a difficult position: the switch was “stuck” in an installation state, and they were afraid to simply reload it, fearing the boot configuration (packages.conf) might be corrupted or empty, leading to a boot loop or a trip to ROMMON.
The key to solving this problem is understanding what the commit keyword does. The IOS-XE install process is transactional and has several distinct stages:
Add: The software image (.bin file) is expanded into its component .pkg files in flash memory.
Activate and commit: With the commit keyword, the install activate process modifies the packages.conf file, telling the switch which .pkg files to load upon the next reboot. This action is completed before the final reload prompt appears.
Reload: The switch asks for confirmation and then reboots into the new packages.
When your SSH session disconnected, the add, activate, and commit stages had likely already completed successfully. The switch had updated its boot instructions and was simply waiting on input from a session that no longer existed. This is why the system reports that an operation is already running: it is still technically waiting for that final confirmation.
The good news is that the switch is not in a dangerous state. The “commit” action has already done the heavy lifting. The recovery is a matter of verification followed by a manual reload.
Step 1: Verify the Boot Configuration (Build Your Confidence)
Before you do anything else, verify that the switch is configured to boot the new software. This will confirm it is safe to reload. Connect to the switch and run the following command:
show install summary
This command will likely show the installation in a pending or waiting state. More importantly, check the packages.conf file. This file acts as the bootloader’s instruction manual.
more flash:packages.conf
You should see output that clearly lists the .pkg files from your new IOS-XE version. It will look something like this (version numbers will vary):
#! /usr/binos/bin/packages_conf.sh
# Copyright (c) 2016-2022 by Cisco Systems, Inc.
# All rights reserved.
boot rp 0 0 rp_boot flash:cat9k-rp-boot.17.09.04.SPA.pkg
boot rp 0 0 rp_core flash:cat9k-rp-core.17.09.04.SPA.pkg
boot rp 0 0 ssa flash:cat9k-ssa.17.09.04.SPA.pkg
... (and so on for all packages)
If every .pkg entry listed here carries the new version number, you can be confident that the commit phase was successful. The switch knows exactly what to load when it reboots.
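If you capture the file contents to your workstation, the same check can be scripted. The following is a minimal sketch, not Cisco tooling: the helper names are hypothetical, and it assumes package filenames end in `.<version>.SPA.pkg` as in the sample output above.

```python
import re

# Assumption: package names end in ".<version>.SPA.pkg", as in the sample above.
PKG_VERSION = re.compile(r"\.(\d+\.\d+\.\d+[A-Za-z0-9]*)\.SPA\.pkg\b")


def packages_conf_versions(text):
    """Return the set of version strings found in the .pkg entries."""
    versions = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        match = PKG_VERSION.search(line)
        if match:
            versions.add(match.group(1))
    return versions


def points_only_at(text, expected):
    """True when at least one .pkg entry exists and all carry `expected`."""
    return packages_conf_versions(text) == {expected}


sample = """\
#! /usr/binos/bin/packages_conf.sh
# Copyright (c) 2016-2022 by Cisco Systems, Inc.
boot rp 0 0 rp_boot flash:cat9k-rp-boot.17.09.04.SPA.pkg
boot rp 0 0 rp_core flash:cat9k-rp-core.17.09.04.SPA.pkg
boot rp 0 0 ssa flash:cat9k-ssa.17.09.04.SPA.pkg
"""

print(points_only_at(sample, "17.09.04"))
```

An empty file or a mix of old and new versions makes the check fail, and those are the two cases the engineer was worried about.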
Step 2: Save Your Configuration and Reload
Since the boot variables are correctly set, the only remaining step is the one you were interrupted from completing.
First, as a best practice, save your running configuration:
copy running-config startup-config
or
write memory
Now, manually reload the switch:
reload
Proceed with the confirmation. The switch will now reboot, load the new software packages as directed by packages.conf, and complete the upgrade.
Step 3: Post-Upgrade Verification and Cleanup
Once the switch is back online, SSH into it and verify that the upgrade was successful.
show version
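Check the “Cisco IOS XE Software, Version” line to confirm it is running the new version. In install mode, the “System image file is” line only shows flash:packages.conf, so it does not reveal the version by itself.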
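For a fleet of switches, this check is easy to script against captured output. The sketch below assumes the banner line has the form `Cisco IOS XE Software, Version 17.09.04`. The function names are hypothetical, and the sample text is an abbreviated illustration, not real device output.

```python
import re


def running_version(show_version_output):
    """Return the version from the 'Cisco IOS XE Software, Version' line, or None."""
    match = re.search(
        r"^Cisco IOS XE Software, Version\s+(\S+)",
        show_version_output,
        re.MULTILINE,
    )
    return match.group(1) if match else None


def upgrade_succeeded(show_version_output, expected):
    """True when the running version equals the expected target version."""
    return running_version(show_version_output) == expected


# Abbreviated, illustrative output (assumption, not captured from a device).
sample_output = """\
Cisco IOS XE Software, Version 17.09.04
System image file is "flash:packages.conf"
"""

print(running_version(sample_output))
print(upgrade_succeeded(sample_output, "17.09.04"))
```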
show install summary
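The new version should now be listed with state C (Activated & Committed). If it is listed as U (Activated & Uncommitted) instead, run install commit to make the change persistent.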
Finally, it is a critical best practice to clean up the old, inactive software files to reclaim valuable space on your flash storage.
install remove inactive
Confirm the removal, and the process is complete.
To avoid this stressful situation in the future, follow these tips for any critical remote operation:
Use a Scheduled Reload: Before starting the upgrade, schedule a failsafe reload. If you lose access for any reason, the switch will automatically reboot back to its previous state after the timer expires.
reload in 15
(Reloads in 15 minutes)
reload at 23:30
(Reloads at a specific time)
Cancel the scheduled reload with reload cancel before initiating the final upgrade reboot.
Use a Terminal Multiplexer (Screen/Tmux): If you are working from a Linux/macOS bastion host, run your SSH session inside a terminal multiplexer like screen or tmux. If your local connection to the bastion host drops, the session on the host remains active. You can simply reconnect and re-attach to your session, which will be exactly where you left it.
Out-of-Band Access: For mission-critical infrastructure, always have a reliable out-of-band (OOB) management plan, such as a console server. This gives you direct console access to the device, completely independent of the production network, allowing you to recover from almost any situation.
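The ordering in the scheduled-reload tip matters: the failsafe goes in first, and it is cancelled only once you have confirmed you still have access. A minimal sketch of that ordering follows. The function `failsafe_plan` is hypothetical and only builds the list of commands to type; it does not connect to a device.

```python
def failsafe_plan(risky_commands, minutes=15):
    """Wrap risky commands between a scheduled reload and its cancellation."""
    if minutes < 1:
        raise ValueError("minutes must be at least 1")
    plan = [f"reload in {minutes}"]  # failsafe first
    plan.extend(risky_commands)      # the change you might lose access during
    plan.append("reload cancel")     # only once access is confirmed
    return plan


for command in failsafe_plan(["<risky change goes here>"], minutes=15):
    print(command)
```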