This document provides a technical analysis and resolution path for a Disaster Recovery System (DRS) backup failure on the Cisco IM and Presence (IM&P) platform, observed on version 14.x.
The core symptom is the failure of a scheduled or manual backup job. Analysis of the DRF Master logs (drf_master.log) on the IM&P Publisher node reveals the following critical error message during the backup estimation phase:
file:/common/drf/localDevice/drfComponent.xml, unable to read file.
This error indicates that the DRF Master Agent, which orchestrates the backup process across the cluster, is unable to access a key configuration file required to communicate with local backup components. The failure prevents the backup from proceeding beyond the initial estimation stage.
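To confirm the symptom on a collected log, the error text quoted above can be searched for directly. The following is a minimal sketch, assuming a copy of drf_master.log has already been downloaded to a workstation; the function name and the choice of search string are illustrative and not part of any Cisco tooling.

```python
from pathlib import Path

# Error text as quoted from drf_master.log in this document.
ERROR_MARKER = "drfComponent.xml, unable to read file"


def find_manifest_errors(log_path):
    """Return (line_number, text) pairs for log lines containing the error marker.

    log_path is a hypothetical local copy of drf_master.log.
    """
    hits = []
    # errors="replace" keeps the scan going if the log contains undecodable bytes.
    with Path(log_path).open("r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if ERROR_MARKER in line:
                hits.append((number, line.rstrip("\n")))
    return hits
```

An empty result suggests the backup failure has a different cause than the one described here.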
The drfComponent.xml file, located in the /common/drf/localDevice/ directory, serves as a dynamically generated manifest. The DRF Master Agent uses this file to identify and query all registered components on the local node (e.g., IM&P database, TFTP files) to estimate the total size and scope of the required backup.
The “unable to read file” error typically stems from one of the following conditions:
The drfComponent.xml file has become corrupted due to a disk error, an improper shutdown, or a software anomaly.
Because this file is essential for the initial handshake between the master and local backup agents, any issue with its integrity or accessibility will cause an immediate failure of the backup process.
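Where a copy of the manifest can be obtained for offline inspection, a basic integrity check helps distinguish an unreadable file from a corrupted one. The sketch below assumes the file is ordinary XML, as its name suggests, and checks only that it can be opened and parsed; it does not validate the manifest's schema, which is not described here. The function name and return labels are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_manifest_copy(path):
    """Classify a local copy of the manifest as 'ok', 'unreadable', or 'malformed'."""
    try:
        with open(path, "rb") as handle:
            data = handle.read()
    except OSError:
        # Missing file, permission problem, or I/O error while reading.
        return "unreadable"
    try:
        ET.fromstring(data)
    except ET.ParseError:
        # Empty, truncated, or otherwise not well-formed XML.
        return "malformed"
    return "ok"
```

A result of "malformed" is consistent with the corruption scenario above, while "unreadable" points to an access or storage problem with the copy itself.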
Before proceeding with the resolution, it is imperative to perform the following diagnostic steps to confirm the scope of the issue and rule out other common backup-related problems.
Run file list activelog /common/drf/localDevice/ to check for the existence of drfComponent.xml.
Run file view activelog /common/drf/localDevice/drfComponent.xml. If this command returns garbled text or an I/O error, it strongly suggests file corruption.
This procedure details the steps to force the regeneration of the drfComponent.xml file. It should be performed during a scheduled maintenance window, as it involves restarting a core system service.
Establish CLI Access: Open an SSH session to the IM&P Publisher node using administrative credentials.
Delete the Corrupt Manifest File: Execute the following command precisely. This action removes the problematic file, which will trigger its recreation upon service restart.
file delete activelog /common/drf/localDevice/drfComponent.xml
Confirm the deletion when prompted.
Restart the DRF Master Agent Service: The recommended method is via the CLI to ensure a clean restart.
utils service restart Cisco DRF Master
This service is responsible for regenerating the drfComponent.xml file upon startup. The restart process may take several minutes.
Verify File Regeneration: After the service restart command completes, wait approximately 2-3 minutes. Then, execute the following command to confirm the file has been successfully recreated:
file list activelog /common/drf/localDevice/ detail
Verify that drfComponent.xml is present in the output and has a recent timestamp corresponding to the service restart time.
Initiate a Manual Backup: Navigate to the Disaster Recovery System interface on the CUCM Publisher. Select the IM&P server in the backup device list and initiate a manual backup. Monitor the backup status for successful completion.
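The timestamp comparison in the verification step can be expressed as a simple window check. This is a minimal sketch, assuming the file's modification time and the restart time have been read manually from the CLI output and entered as datetime values; the five-minute tolerance is an illustrative assumption, not a documented value.

```python
from datetime import datetime, timedelta


def regenerated_after_restart(file_mtime, restart_time, tolerance=timedelta(minutes=5)):
    """Return True if file_mtime falls between restart_time and restart_time + tolerance.

    tolerance is an assumed window; adjust it to how long the restart actually took.
    """
    return restart_time <= file_mtime <= restart_time + tolerance


# Example with made-up times: restart at 02:00, file stamped at 02:03.
restart = datetime(2024, 1, 1, 2, 0, 0)
print(regenerated_after_restart(datetime(2024, 1, 1, 2, 3, 0), restart))
```

A timestamp that predates the restart indicates the listed file is not a newly generated manifest.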
If the backup continues to fail after following this procedure, re-examine the drf_master.log and drf_local.log files for new or different error messages. The issue may be related to a specific component agent failing to register with the newly generated manifest. In such cases, collect a log bundle from the DRS interface (Cisco DRF Logs) and escalate the case to the Cisco Technical Assistance Center (TAC) for further investigation.