Operational Resilience: From Business Continuity to Holistic Organizational Resilience

Boris Friedrich

Operational resilience changes how organizations think about disruption. Traditional Business Continuity Management (BCM) focuses on recovering specific processes after defined disaster scenarios. Operational resilience starts from a different question: can we continue delivering our most important services to customers and markets regardless of what goes wrong? The shift is from scenario planning (what if the data center fails?) to outcome assurance (can we restore payment processing within 4 hours no matter what causes the disruption?).

This guide covers the operational resilience framework: identifying important business services, setting impact tolerances, mapping dependencies, testing through scenarios, and aligning with DORA requirements for financial institutions.

BCM vs. Operational Resilience

BCM focuses on disaster recovery plans for specific scenarios: data center failure, pandemic, ransomware attack. Each scenario gets its own recovery plan, tested periodically. The limitation: BCM plans fail when the actual disruption does not match the planned scenario.

Operational resilience starts from the outcome: What are our Important Business Services (IBS)? What is the maximum tolerable disruption for each? Can we stay within those tolerances regardless of what causes the disruption? This outcome-based approach is more demanding but more effective, because it forces organizations to build genuine adaptability rather than scenario-specific playbooks.

The Operational Resilience Framework

1. Identify Important Business Services

Map the services your organization delivers to customers, markets, and counterparties. For each, determine: what happens if this service is unavailable for 1 hour, 4 hours, 24 hours, 1 week? Which services are critical enough to warrant impact tolerances? What is the minimum viable version of each service? Important business services are defined from the customer and market perspective, not the internal process perspective. Payment processing is a business service; the database that supports it is a dependency.
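One way to make the outage questions above concrete is to score each candidate service at the four durations and shortlist those where harm becomes severe quickly. The following is a minimal sketch under stated assumptions: the 0 to 5 harm scale, the threshold of 4, the 24-hour cut-off, and all scores are hypothetical illustrations, not values from any regulation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Outage durations named in the text, in hours: 1 hour, 4 hours, 24 hours, 1 week.
OUTAGE_HOURS = (1, 4, 24, 168)


@dataclass
class ServiceAssessment:
    name: str
    # Hypothetical harm score (0 = none, 5 = severe) at each outage duration,
    # judged from the customer and market perspective.
    harm_by_outage: Dict[int, int]

    def hours_until_harm(self, threshold: int) -> Optional[int]:
        """Return the shortest assessed outage at which harm reaches the threshold."""
        for hours in OUTAGE_HOURS:
            if self.harm_by_outage.get(hours, 0) >= threshold:
                return hours
        return None


def important_services(
    assessments: List[ServiceAssessment],
    threshold: int = 4,      # assumed cut-off for "significant harm"
    within_hours: int = 24,  # assumed horizon for shortlisting
) -> List[Tuple[str, int]]:
    """Select services whose harm reaches `threshold` within `within_hours`."""
    selected = []
    for assessment in assessments:
        hours = assessment.hours_until_harm(threshold)
        if hours is not None and hours <= within_hours:
            selected.append((assessment.name, hours))
    # Most time-critical services first.
    return sorted(selected, key=lambda item: item[1])


# Illustrative scores only.
assessments = [
    ServiceAssessment("payment processing", {1: 2, 4: 4, 24: 5, 168: 5}),
    ServiceAssessment("insurance claims handling", {1: 0, 4: 1, 24: 3, 168: 5}),
    ServiceAssessment("lending decisions", {1: 1, 4: 2, 24: 4, 168: 5}),
]
ibs = important_services(assessments)
print(ibs)
```

In practice the scores would come from structured discussions with business owners rather than a single number, but even a rough table like this makes the shortlisting criteria explicit and reviewable.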

2. Set Impact Tolerances

Impact tolerances define the maximum acceptable level of disruption for each important business service. They are expressed in concrete, measurable terms: maximum downtime (e.g., payment processing must be restored within 2 hours), maximum data loss (e.g., no more than 15 minutes of transaction data may be lost), maximum customer impact (e.g., no more than 5% of customers may be affected simultaneously), and maximum financial impact (e.g., disruption costs must not exceed EUR 500,000 per event). Unlike the Recovery Point Objective (RPO) and Recovery Time Objective (RTO) used in BCM, which are typically set per system or process, impact tolerances apply to the end-to-end service, including all of its dependencies.
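Because impact tolerances are measurable, they can be recorded as data and compared against the outcome of an incident or exercise. The sketch below uses the example figures from the paragraph above; the class names, field names, and the observed outcome are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ImpactTolerance:
    max_downtime_minutes: float
    max_data_loss_minutes: float
    max_customers_affected_pct: float
    max_cost_eur: float


@dataclass(frozen=True)
class DisruptionOutcome:
    downtime_minutes: float
    data_loss_minutes: float
    customers_affected_pct: float
    cost_eur: float


def breaches(tolerance: ImpactTolerance, outcome: DisruptionOutcome) -> List[str]:
    """Return the names of the tolerance dimensions that the outcome exceeded."""
    checks = {
        "downtime": outcome.downtime_minutes > tolerance.max_downtime_minutes,
        "data_loss": outcome.data_loss_minutes > tolerance.max_data_loss_minutes,
        "customer_impact": (
            outcome.customers_affected_pct > tolerance.max_customers_affected_pct
        ),
        "financial_impact": outcome.cost_eur > tolerance.max_cost_eur,
    }
    return [name for name, exceeded in checks.items() if exceeded]


# Tolerances from the examples above: 2 hours, 15 minutes, 5%, EUR 500,000.
payments_tolerance = ImpactTolerance(120, 15, 5.0, 500_000)

# Hypothetical outcome of an exercise: restored in time, but too much data lost.
outcome = DisruptionOutcome(
    downtime_minutes=95,
    data_loss_minutes=22,
    customers_affected_pct=3.5,
    cost_eur=180_000,
)
result = breaches(payments_tolerance, outcome)
print(result)
```

Checking every dimension separately matters: a service can be restored well within its downtime tolerance and still breach the data loss tolerance, as in this example.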

3. Map Dependencies

For each important business service, map all dependencies across five categories: People (roles, skills, minimum staffing, key person risks), Processes (business processes that contribute to service delivery), Technology (applications, infrastructure, data, network services), Third parties (vendors, cloud providers, outsourcing partners, utilities), and Facilities (offices, data centers, specialized equipment). Dependency mapping reveals: single points of failure, concentration risks (multiple services depending on one vendor), hidden dependencies (services that seem independent but share infrastructure), and recovery sequence (which dependencies must be restored first).
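A dependency map can be treated as a directed graph, which makes concentration risks and the recovery sequence computable. The following sketch assumes a small, hypothetical map (all node names are illustrative) and uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to derive an order in which every dependency is restored before whatever relies on it.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, Set

# Hypothetical map: each node lists what it directly depends on.
DEPENDS_ON: Dict[str, Set[str]] = {
    "payment processing": {"payments app", "operations team"},
    "securities settlement": {"settlement app"},
    "payments app": {"core database", "cloud provider A"},
    "settlement app": {"core database", "cloud provider A"},
    "core database": {"data center 1"},
}
SERVICES = ("payment processing", "securities settlement")


def all_dependencies(node: str, graph: Dict[str, Set[str]]) -> Set[str]:
    """Return every direct and indirect dependency of `node`."""
    seen: Set[str] = set()
    stack = list(graph.get(node, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, ()))
    return seen


def shared_dependencies(
    services: Iterable[str], graph: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Map each dependency used by more than one service to those services."""
    users: Dict[str, Set[str]] = {}
    for service in services:
        for dep in all_dependencies(service, graph):
            users.setdefault(dep, set()).add(service)
    return {dep: svcs for dep, svcs in users.items() if len(svcs) > 1}


# Concentration risks and hidden shared infrastructure.
shared = shared_dependencies(SERVICES, DEPENDS_ON)
for dep in sorted(shared):
    print(f"{dep}: shared by {sorted(shared[dep])}")

# Recovery sequence: dependencies appear before the nodes that need them.
recovery_order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(recovery_order)
```

Here the two services look independent at the application level, yet both ultimately rely on the same database, cloud provider, and data center. A shared dependency with no alternative is a candidate single point of failure; this sketch does not model alternatives or redundancy, which a real map would need to capture.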

4. Scenario Testing

Test resilience through increasingly severe scenarios:

Tabletop exercises (quarterly): Discussion-based scenarios that test decision-making and communication. Low cost, high learning value.
Simulation exercises (semi-annually): Partially activate response procedures. Test specific aspects of resilience (e.g., failover to backup systems, activation of crisis communication).
Full-scale exercises (annually): Complete activation simulating a severe disruption. Test end-to-end resilience including third-party dependencies.

The goal is not to verify a specific recovery plan but to test whether the organization can stay within impact tolerances under stress. Scenarios should be plausible but severe, and should include multi-failure scenarios that test resilience beyond individual component failure.
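The outcome-based idea behind scenario testing can be sketched as a simple calculation: given which dependencies fail and for how long, when is the service available again, and is that within tolerance? The example below is a deliberately simplified model with hypothetical names and figures. It assumes an acyclic dependency map, that failed components recover in parallel, and that a service is available as soon as its slowest dependency is back.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical dependency map: each node lists what it directly depends on.
DEPENDS_ON: Dict[str, Set[str]] = {
    "payment processing": {"payments app"},
    "payments app": {"core database", "cloud provider A"},
    "core database": {"data center 1"},
}

# Hypothetical downtime tolerance per important business service, in hours.
TOLERANCE_HOURS = {"payment processing": 2.0}

# Each scenario maps failed components to their assumed outage in hours.
SCENARIOS = {
    "cloud provider outage": {"cloud provider A": 1.5},
    "cloud outage plus data center loss": {
        "cloud provider A": 1.5,
        "data center 1": 3.0,
    },
}


def restored_at(
    node: str,
    graph: Dict[str, Set[str]],
    outage_hours: Dict[str, float],
    cache: Optional[Dict[str, float]] = None,
) -> float:
    """Hours until `node` works again: its own outage or its slowest dependency."""
    if cache is None:
        cache = {}
    if node not in cache:
        own = outage_hours.get(node, 0.0)
        deps = [restored_at(d, graph, outage_hours, cache) for d in graph.get(node, ())]
        cache[node] = max([own, *deps])
    return cache[node]


def run_scenarios() -> Dict[Tuple[str, str], Tuple[float, bool]]:
    """Return (restoration hours, within tolerance?) per scenario and service."""
    results = {}
    for scenario, outage in SCENARIOS.items():
        for service, limit in TOLERANCE_HOURS.items():
            hours = restored_at(service, DEPENDS_ON, outage)
            results[(scenario, service)] = (hours, hours <= limit)
    return results


results = run_scenarios()
for (scenario, service), (hours, ok) in results.items():
    status = "within tolerance" if ok else "BREACH"
    print(f"{scenario} / {service}: restored after {hours} h ({status})")
```

The single-failure scenario stays within the 2-hour tolerance, while the multi-failure scenario breaches it. A calculation like this can help choose which scenarios to exercise; it does not replace the exercises themselves, which test people, decisions, and communication as well as components.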

DORA and Operational Resilience

The Digital Operational Resilience Act (DORA, Regulation (EU) 2022/2554) embeds operational resilience into EU financial regulation for the first time. Articles 5–16 set out the ICT risk management framework. Within it, Articles 11–12 require ICT business continuity management with business impact analysis (BIA), continuity plans, and regular testing. Article 26 requires advanced resilience testing in the form of threat-led penetration testing (TLPT) for significant institutions. Articles 28–44 cover ICT third-party risk management, including the oversight of critical ICT third-party providers. DORA essentially codifies operational resilience for the financial sector, making it legally binding. Financial institutions must demonstrate that they can maintain critical ICT services within defined tolerances, which is the same outcome-based approach that operational resilience demands.

Frequently Asked Questions

Is operational resilience just rebranded BCM?

No. BCM is a subset of operational resilience focused on recovery from specific disruption scenarios. Operational resilience takes a service-centric, outcome-based view: can we continue delivering services within acceptable parameters regardless of cause? It is broader (covers all disruption types), deeper (maps end-to-end dependencies), and more demanding (requires demonstrable tolerance testing).

What are important business services?

Services delivered to external customers, markets, or the broader economy that, if disrupted, could cause significant harm. Examples in financial services: payment processing, securities settlement, lending decisions, insurance claims handling, market making. Each organization defines its own based on business model, customer impact, and regulatory significance.

How often should resilience be tested?

Tabletop exercises: quarterly. Simulation exercises: semi-annually. Full-scale exercises: annually. DORA requires advanced testing (TLPT) for institutions identified by their competent authority, at least every three years, and the authority may adjust that frequency. Beyond these baselines, testing frequency should match the criticality of services and the pace of change in the threat landscape.
