Why the CER Directive matters for critical infrastructure operators
The EU Critical Entities Resilience (CER) Directive (Directive (EU) 2022/2557) raises the bar for how essential service operators prevent, withstand, and recover from disruptions. By 2027, operators designated as “critical entities” should be able to demonstrate, during audits and inspections, that resilience is governed, measurable, tested, and continuously improved.
This checklist translates the Directive’s intent into practical controls you can implement and evidence you can present. It focuses on four areas that commonly determine audit outcomes: detection capability standards, incident logging requirements, response time benchmarks, and documentation formats.
Step 1: Confirm scope, designation, and accountable ownership
Before you build controls, ensure you can explain what is in scope and who is accountable.
Checklist
- Entity designation confirmed: Internal record of your designation decision and the services/assets in scope (sites, systems, third parties).
- Named executive owner: One accountable leader responsible for resilience, with authority to assign resources.
- Resilience governance structure: Roles and responsibilities for operations, security, business continuity, facilities, IT/OT, communications, legal, and procurement.
- Defined “essential service” outcomes: Clear statements of what “acceptable service” means (capacity, quality, availability, safety constraints).
Audit-ready artifacts
- RACI matrix for resilience-related activities
- List of critical services, supporting processes, and enabling assets
- Risk acceptance criteria and escalation thresholds
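A RACI matrix is easier to defend when it can be checked mechanically. The sketch below assumes a simple in-memory layout (activity mapped to role assignments). The activity names, role names, and the rule "exactly one Accountable, at least one Responsible" are illustrative assumptions, not requirements taken from the Directive.

```python
from collections import Counter

# Hypothetical RACI matrix: activity -> {role: "R" | "A" | "C" | "I"}.
raci = {
    "Maintain risk register": {"Head of Resilience": "A", "Risk Analyst": "R", "Legal": "C"},
    "Approve risk acceptance": {"Head of Resilience": "A", "Operations Lead": "R"},
    "Run annual exercise": {"BC Manager": "R", "Communications": "I"},  # no "A" assigned
}

def raci_gaps(matrix):
    """Return activities that lack exactly one Accountable or have no Responsible role."""
    gaps = {}
    for activity, assignments in matrix.items():
        counts = Counter(assignments.values())
        problems = []
        if counts["A"] != 1:
            problems.append(f"expected 1 accountable role, found {counts['A']}")
        if counts["R"] < 1:
            problems.append("no responsible role")
        if problems:
            gaps[activity] = problems
    return gaps

gaps = raci_gaps(raci)
for activity, problems in gaps.items():
    print(activity, "->", "; ".join(problems))
```

Running a check like this before each review cycle gives you a dated record that ownership gaps were looked for, not just assumed absent.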
Step 2: Run a resilience risk assessment that covers all hazards
CER expects an “all-hazards” approach covering physical, cyber, natural, technical, and human factors. Your risk assessment must also consider what your services depend on (other sectors, suppliers, utilities) and which services depend on you.
Checklist
- All-hazards methodology documented (including physical security and cyber resilience in one coherent view).
- Asset and dependency mapping: Upstream and downstream dependencies (energy, telecoms, cloud, transport, water, key suppliers).
- Scenario-based analysis: A defined set of credible disruption scenarios, each tied to a critical service (e.g., site access loss, OT malfunction, insider misuse, supply chain disruption).
- Risk treatment plan: Controls mapped to risks with owners, due dates, and residual risk ratings.
Audit-ready artifacts
- Risk register with versioning and approval signatures
- Dependency map diagram(s) and an assumptions log
- Control implementation plan with milestones
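A dependency map becomes more useful for scenario analysis when you can ask "what is affected if this fails?". The following is a minimal sketch, assuming the map is held as a plain dictionary. The service and asset names are hypothetical, and a real map would normally come from an asset inventory or GRC export.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the items in its list.
depends_on = {
    "water-distribution": ["scada", "pumping-station"],
    "scada": ["telecom-link", "grid-power"],
    "pumping-station": ["grid-power"],
    "billing": ["cloud-hosting"],
}

def impacted_by(failed, depends_on):
    """Return every item that directly or transitively depends on `failed`."""
    # Invert the map so we can walk from a dependency to its dependents.
    dependents = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(item)

    seen, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for item in dependents.get(current, ()):
            if item not in seen:
                seen.add(item)
                queue.append(item)
    return seen

print(sorted(impacted_by("grid-power", depends_on)))
# ['pumping-station', 'scada', 'water-distribution']
```

The breadth-first walk keeps a `seen` set, so circular dependencies (common in real infrastructure) do not cause an infinite loop.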
Step 3: Implement baseline resilience measures (prevention + continuity)
Auditors typically look for evidence that resilience is more than policies: it’s engineering, procedures, and maintained capability.
Checklist
- Physical resilience: Site hardening, access control, surveillance, perimeter controls, visitor management, and maintenance schedules.
- Operational resilience: Redundancy, spare parts strategy, capacity management, and safe failover procedures.
- Business continuity: Business Impact Analysis (BIA), continuity plans, and recovery strategies aligned to service outcomes.
- Third-party resilience: Contractual requirements, supplier risk assessments, and contingency plans for critical providers.
Audit-ready artifacts
- BIA with service priorities and recovery strategies
- Maintenance logs for critical infrastructure
- Supplier assurance pack (critical suppliers, reviews, and contingency measures)
Step 4: Detection capability standards you should be able to demonstrate
CER does not prescribe a single technical stack, but auditors expect timely detection, especially for events that could disrupt essential services. Your goal is to show that you can detect abnormal conditions in both IT and operational/physical environments.
Minimum detection capabilities (practical standard)
- 24/7 monitoring coverage for critical environments (in-house or via a managed service), with defined handoffs.
- Use-case based detection mapped to your scenarios:
  - Service degradation indicators (capacity, latency, throughput, error rates)
  - OT/ICS alarms and safety interlocks (where applicable)
  - Physical intrusion and access anomalies
  - Unauthorized configuration changes and privilege misuse
- Alert triage process with severity classification tied to business impact.
- Time synchronization across systems to support reliable forensics and audit trails.
- Health monitoring for detection tooling (so you can prove monitoring is functioning).
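One way to show that monitoring itself is functioning is to track when each source last reported and flag silence. This is a sketch only: the source names and silence thresholds are placeholders, and in practice the "last seen" timestamps would come from your SIEM or collector rather than a hard-coded dictionary.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical maximum silence per source before monitoring counts as degraded.
MAX_SILENCE = {
    "siem-collector": timedelta(minutes=5),
    "ot-historian": timedelta(minutes=15),
    "badge-system": timedelta(hours=1),
}

def stale_sources(last_seen, now, max_silence=MAX_SILENCE):
    """Return sources whose latest event is older than allowed, or that never reported."""
    stale = []
    for source, limit in max_silence.items():
        seen = last_seen.get(source)
        if seen is None or now - seen > limit:
            stale.append(source)
    return stale

now = datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "siem-collector": now - timedelta(minutes=2),
    "ot-historian": now - timedelta(minutes=40),
    # "badge-system" has never reported
}
print(stale_sources(last_seen, now))
# ['ot-historian', 'badge-system']
```

Using timezone-aware UTC timestamps throughout also supports the time synchronization item above, since mixed local times make silence calculations unreliable.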
Evidence to prepare
- Monitoring architecture diagram (sources → correlation → alerting → ticketing)
- Alert taxonomy (severity definitions and examples)
- Monthly/quarterly monitoring performance reports (coverage, alert volumes, tuning decisions)
- Records of detection control testing (e.g., simulated alerts, sensor checks)
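An alert taxonomy is easier to audit when the severity rule is written down as logic rather than left to judgment. The sketch below reuses the three severity tiers from the example in Step 6. The `Alert` fields and the source names are hypothetical, and your own taxonomy will likely need more inputs (asset criticality, time of day, safety context).

```python
from dataclasses import dataclass

@dataclass
class Alert:
    source: str
    service_disrupted: bool = False
    imminent_safety_impact: bool = False
    service_degraded: bool = False
    localized_outage: bool = False

def classify(alert: Alert) -> int:
    """Map business impact flags to a severity tier (1 is most severe)."""
    if alert.service_disrupted or alert.imminent_safety_impact:
        return 1
    if alert.service_degraded or alert.localized_outage:
        return 2
    return 3  # no service impact, low risk

alerts = [
    Alert("ot-alarm", imminent_safety_impact=True),
    Alert("latency-monitor", service_degraded=True),
    Alert("config-audit"),
]
for alert in alerts:
    print(alert.source, "-> severity", classify(alert))
```

Keeping the rule in one function means a tuning decision shows up as a reviewable change, which fits the performance-report evidence listed above.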
Step 5: Incident logging requirements (what to log, how long, and how to protect it)
Even where some reporting obligations are handled under other regimes (for example, cybersecurity incident reporting), CER-aligned audits will still scrutinize your ability to record, preserve, and reconstruct incidents.
Logging checklist
- Centralized incident record: One system of record (ticketing, case management, or incident platform).
- Event log capture for critical systems and facilities, including:
  - Authentication and access events (including privileged actions)
  - Configuration changes and deployments
  - OT events relevant to safety and service continuity
  - Physical access logs (badge events, visitor logs, alarm activations)
  - Service health metrics and outage timelines
- Chain-of-custody controls: Role-based access to logs, tamper-evident storage, and documented export procedures.
- Retention policy aligned to operational needs and legal requirements, with a documented rationale.
- Data protection safeguards: Minimization, access controls, and secure handling of personal data within logs.
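"Tamper-evident storage" can be implemented in several ways (WORM storage, signed logs, append-only databases). As one illustration, the sketch below uses a SHA-256 hash chain, where each entry commits to the previous one, so a later edit to any entry breaks verification. The event fields are made up for the example, and this is not a substitute for a hardened logging platform.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _digest(prev_hash, event):
    # Canonical JSON so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append_entry(chain, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})

def verify(chain):
    """Recompute every hash; return False if any entry or link was altered."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"ts": "2026-03-01T12:00:00Z", "actor": "op-17", "action": "badge-in"})
append_entry(log, {"ts": "2026-03-01T12:04:10Z", "actor": "op-17", "action": "config-change"})
print(verify(log))                    # True
log[0]["event"]["actor"] = "op-99"    # simulate tampering with an old entry
print(verify(log))                    # False
```

A chain like this only detects tampering if the latest hash is also recorded somewhere the log administrator cannot rewrite, for example in a separate system or a periodic signed export.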
Incident record fields auditors expect
- Unique incident ID, dates/times (detected, declared, resolved)
- Initial detection source and triage notes
- Scope and impacted services/assets
- Actions taken and decision approvals
- Communications and escalation history
- Root cause analysis (or interim cause), lessons learned, follow-up tasks
- Evidence references (log bundles, screenshots, statements)
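The fields above map naturally onto a structured record. The following sketch shows one possible shape, with a helper that lists fields still empty before a record is closed. The field names, the incident ID format, and the sample values are assumptions for illustration, not a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class IncidentRecord:
    incident_id: str
    detected_at: datetime
    detection_source: str
    declared_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None
    triage_notes: str = ""
    impacted_services: List[str] = field(default_factory=list)
    actions_and_approvals: List[str] = field(default_factory=list)
    escalation_history: List[str] = field(default_factory=list)
    root_cause: str = ""  # may hold an interim cause until analysis completes
    follow_up_tasks: List[str] = field(default_factory=list)
    evidence_refs: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Names of fields that are still empty, to review before closing the record."""
        return [name for name, value in vars(self).items() if not value]

record = IncidentRecord(
    incident_id="INC-2026-0007",  # hypothetical ID format
    detected_at=datetime(2026, 3, 1, 2, 0, tzinfo=timezone.utc),
    detection_source="OT alarm, pumping station 2",
    declared_at=datetime(2026, 3, 1, 2, 20, tzinfo=timezone.utc),
    triage_notes="Pressure drop on main line; severity 1 assigned.",
    impacted_services=["water-distribution"],
)
print(record.missing_fields())
```

Most ticketing or case-management tools can enforce the same idea with mandatory fields on the "close" transition, which is usually preferable to a separate script.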
Step 6: Response time benchmarks (set targets you can defend and meet)
CER expects an effective, timely response, not perfection. The key is to define response benchmarks that match your risk profile, then prove through exercises and real incidents that you meet them.
Recommended benchmark structure
Define targets for each phase, by severity:
- Acknowledge (alert is seen and accepted)
- Triage (initial severity, scope hypothesis, and assignment)
- Contain/Control (stop spread, stabilize operations, ensure safety)
- Recover (restore essential service to agreed level)
- Post-incident review (lessons learned and remediation plan)
Example (use your own values)
- Severity 1 (service disruption or imminent safety impact):
  - Acknowledge: minutes
  - Triage: within an hour
  - Contain: same shift
  - Restore essential service: within defined recovery objectives
- Severity 2 (degradation, localized outage):
  - Acknowledge and triage: same business day or faster depending on exposure
- Severity 3 (no service impact, low risk):
  - Managed via standard change/problem workflows
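Benchmarks like these are only defensible if actual times are compared against them. The sketch below shows one way to do that comparison. The numeric targets (15 minutes, 1 hour, 8 hours) are placeholders standing in for "minutes", "within an hour", and "same shift"; substitute your own values.

```python
from datetime import datetime, timedelta, timezone

# Placeholder targets per severity and phase; replace with your agreed values.
TARGETS = {
    1: {
        "acknowledge": timedelta(minutes=15),
        "triage": timedelta(hours=1),
        "contain": timedelta(hours=8),
    },
    2: {"acknowledge": timedelta(hours=8), "triage": timedelta(hours=8)},
}

def compare_to_targets(severity, detected_at, phase_times, targets=TARGETS):
    """Return {phase: (elapsed, target, met)} for phases with a target and a recorded time."""
    results = {}
    for phase, target in targets.get(severity, {}).items():
        reached_at = phase_times.get(phase)
        if reached_at is None:
            continue  # phase not reached or not recorded yet
        elapsed = reached_at - detected_at
        results[phase] = (elapsed, target, elapsed <= target)
    return results

detected = datetime(2026, 3, 1, 2, 0, tzinfo=timezone.utc)
times = {
    "acknowledge": detected + timedelta(minutes=9),
    "triage": detected + timedelta(minutes=80),
    "contain": detected + timedelta(hours=5, minutes=30),
}
results = compare_to_targets(1, detected, times)
for phase, (elapsed, target, met) in results.items():
    print(f"{phase}: {elapsed} vs target {target} -> {'met' if met else 'missed'}")
```

In this example the triage target is missed (80 minutes against a 60-minute target), which is exactly the kind of finding an after-action report should record along with the corrective action.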
Evidence to prepare
- SLA/OLA documents (including 24/7 on-call arrangements)
- Runbooks with decision trees and escalation paths
- After-action reports showing actual times vs targets
- Proof of training and role readiness (on-call rosters, handover logs)
Step 7: Documentation formats accepted for compliance audits
Auditors rarely require a specific template, but they do require clarity, version control, traceability, and approvals. Aim for consistent, reviewable documentation that can be exported.
Accepted formats (practical guidance)
- Policies and standards: PDF or controlled documents with version, owner, approval date.
- Procedures and runbooks: Document format or wiki with change history and access control; exportable to PDF for audits.
- Plans (BCP/IRP/crisis comms): Controlled documents plus exercise records.
- Registers (risk, assets, incidents, third parties): Spreadsheet or GRC tool export with immutable timestamps or audit logs.
- Diagrams: Architecture and dependency maps with date and owner.
- Evidence packs: Bundled exports (tickets, logs, screenshots) with an index.
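An evidence pack index is more convincing when each file is listed with a checksum, so an auditor can confirm nothing changed after export. The sketch below builds such a manifest for a directory. The directory layout and file names are invented for the demonstration, and the JSON output format is an assumption rather than an expected audit format.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def build_manifest(pack_dir):
    """List every file under pack_dir with its relative path, size, and SHA-256 digest."""
    root = Path(pack_dir)
    entries = []
    for path in root.rglob("*"):
        if path.is_file():
            data = path.read_bytes()
            entries.append({
                "path": path.relative_to(root).as_posix(),
                "bytes": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            })
    entries.sort(key=lambda entry: entry["path"])  # stable order for diffing
    return entries

# Demonstration with a throwaway directory and hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "tickets").mkdir()
    (Path(tmp) / "tickets" / "INC-0001.txt").write_text("sample ticket export", encoding="utf-8")
    (Path(tmp) / "index.txt").write_text("evidence index", encoding="utf-8")
    manifest = build_manifest(tmp)

print(json.dumps(manifest, indent=2))
```

This reads each file fully into memory, which is fine for documents and ticket exports; large log bundles would be better hashed in chunks. Store the manifest outside the pack directory, or it will be listed in its own next run.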
Documentation hygiene checklist
- One master index of resilience documents with owners and review cycles
- Review cadence (at least annually, and after major changes/incidents)
- Change control records for critical procedures
- Clear linkage: risks → controls → tests/exercises → improvements
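The linkage from risks to controls to tests to improvements can be checked for broken links before an auditor finds them. This is a minimal sketch assuming three simple lookups exported from your registers. All IDs are hypothetical, and the structure is deliberately simpler than a real GRC data model.

```python
# Hypothetical register extracts; IDs and layout are illustrative.
risk_controls = {"R-01": ["C-01", "C-02"], "R-02": ["C-03"], "R-03": []}
control_tests = {"C-01": ["EX-2026-01"], "C-02": [], "C-03": ["EX-2026-02"]}
test_actions = {"EX-2026-01": ["CA-11"], "EX-2026-02": []}

def traceability_gaps(risk_controls, control_tests, test_actions):
    """Find risks with no control, controls with no test, and tests with no recorded action."""
    mapped_controls = {c for controls in risk_controls.values() for c in controls}
    return {
        "risks_without_controls": sorted(
            r for r, controls in risk_controls.items() if not controls
        ),
        "controls_without_tests": sorted(
            c for c in mapped_controls if not control_tests.get(c)
        ),
        "tests_without_actions": sorted(
            t for t, actions in test_actions.items() if not actions
        ),
    }

report = traceability_gaps(risk_controls, control_tests, test_actions)
for gap, ids in report.items():
    print(gap, ids)
```

A test with no corrective action is not necessarily a problem; it may simply have produced no findings. The point of flagging it is to make sure that outcome is recorded explicitly rather than left blank.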
Step 8: Test, exercise, and continuously improve (the make-or-break audit area)
Capabilities must be proven. Tabletop exercises alone are usually insufficient for critical services.
Checklist
- Annual exercise plan covering a mix of scenarios (cyber/physical/operational/supply chain).
- Operational tests where safe and feasible (failover tests, backup restoration tests, access control drills).
- Lessons learned program that generates tracked corrective actions with deadlines.
- Management review: leadership reviews results, approves improvements, and allocates budget.
Audit-ready artifacts
- Exercise schedule, scripts, participant lists, and results
- Corrective action tracker (status, owner, due date, evidence of closure)
- Metrics dashboard (detection performance, response times, recurring issues)
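A corrective action tracker supports an audit best when it can answer two questions quickly: what is overdue, and what was closed without evidence. The sketch below assumes the tracker is exported as a list of records. The IDs, owners, dates, and status values are invented for the example.

```python
from datetime import date

# Hypothetical tracker export.
actions = [
    {"id": "CA-11", "owner": "OT Lead", "due": date(2026, 2, 1),
     "status": "open", "closure_evidence": None},
    {"id": "CA-12", "owner": "BC Manager", "due": date(2026, 2, 15),
     "status": "closed", "closure_evidence": "Retest report, failover drill"},
    {"id": "CA-13", "owner": "Facilities", "due": date(2026, 4, 1),
     "status": "closed", "closure_evidence": None},
]

def needs_attention(actions, today):
    """Return (overdue open actions, closed actions lacking closure evidence)."""
    overdue = [a["id"] for a in actions if a["status"] != "closed" and a["due"] < today]
    unevidenced = [
        a["id"] for a in actions if a["status"] == "closed" and not a["closure_evidence"]
    ]
    return overdue, unevidenced

overdue, unevidenced = needs_attention(actions, today=date(2026, 3, 1))
print("Overdue:", overdue)
print("Closed without evidence:", unevidenced)
```

Passing `today` in as a parameter, rather than reading the system clock inside the function, makes the check reproducible for a given reporting date.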
Final audit-prep: Build a “CER compliance evidence pack”
Create a single package you can hand to an auditor within days, not weeks.
Minimum contents
- Scope statement and critical service list
- Risk assessment and treatment plan
- Detection coverage summary and monitoring process
- Incident management process + sample incident records (anonymized if needed)
- Response benchmarks (SLAs/OLAs) and proof of performance
- Exercise reports and corrective action closure evidence
- Document index with version control and approvals
If you can show that your controls are implemented, monitored, tested, and improved, you’ll be well placed to meet the practical expectations of CER-aligned audits by 2027, and you’ll be measurably more resilient in real disruptions.