The Mathematics of Safety: Understanding SIL (Safety Integrity Level)

SIL is the global measurement scale for safety performance. Ranging from SIL 1 to SIL 4, it quantifies the required reliability of a safety function, with SIL 4 representing the highest level of protection required for critical railway signalling.

December 8, 2025 12:36 pm | Last Update: March 20, 2026 10:41 pm
⚡ In Brief
  • Safety Integrity Level (SIL) is a four-level scale defined by IEC 61508 that quantifies the required reliability of a safety function — expressed as the probability that the function fails to perform its intended action on demand. Higher SIL means lower tolerable failure probability.
  • SIL 4, the highest level, corresponds to a Probability of Dangerous Failure on Demand (PFD) in the band 10⁻⁵ to 10⁻⁴; at the demanding end of that band, the safety function may fail dangerously at most once per 100,000 demands. For a continuous operation system (like a railway interlocking), the corresponding band is a dangerous failure rate of 10⁻⁹ to 10⁻⁸ per hour. A rate of 10⁻⁹ per hour is approximately one dangerous failure every 114,000 years of continuous operation.
  • Railway interlockings (CBI), ETCS onboard units (EVC), Radio Block Centres (RBC), and axle counter evaluators all require SIL 4 certification — the highest level — because a dangerous failure in these systems can directly cause a collision with multiple fatalities.
  • Achieving SIL 4 is not just about hardware redundancy — it requires a defined safety lifecycle from hazard analysis through design, coding, testing, and validation, with each stage documented and verified by an independent assessor. The software alone for a SIL 4 railway system can take 3–5 years to develop and certify.
  • The CENELEC railway standards (EN 50126, EN 50128, EN 50129) provide the railway-specific implementation of SIL requirements, translating the generic IEC 61508 framework into specific methods and documentation requirements for railway signalling and control systems.

In 1994, a software fault in the signalling system at Cowden, Kent — on a single-line section of the Uckfield branch — allowed conflicting movement authorities to be issued simultaneously to two trains. Five people died when the trains collided head-on. The investigation found that the signalling system software had not been subjected to the kind of systematic testing that would have revealed the fault. At the time, no mandatory standard specified what “adequate” software testing meant for railway safety systems.

The Cowden accident, along with Ladbroke Grove (1999) and a series of comparable incidents across Europe, accelerated the development and adoption of the CENELEC railway safety standards — a framework for quantifying exactly how reliable a safety system must be, and precisely what must be done to demonstrate that reliability. The centrepiece of that framework is the Safety Integrity Level — a number between 1 and 4 that encodes, in a single figure, the probability target for a safety function’s failure, and the engineering effort required to achieve it.

What Is Safety Integrity Level (SIL)?

Safety Integrity Level is a measure of the required safety performance of a safety function — the probability that the function fails to perform correctly when demanded by the system. It is defined in IEC 61508 (the generic functional safety standard) and implemented in railway-specific form in the CENELEC EN 50126/50128/50129 standards.

The SIL concept rests on two foundational ideas:

Not all safety functions need the same reliability. A system that prevents a nuclear power plant meltdown has a different failure consequence than a system that controls a parking barrier. Requiring the same engineering effort for both would be wasteful and impractical. SIL provides a structured way to match engineering rigour to consequence.

Safety performance can be quantified. Rather than saying a system is “safe” or “unsafe” in qualitative terms, SIL assigns a specific numerical probability target that the safety function must meet. This target can then be used to specify hardware architecture, software development processes, testing coverage, and validation requirements that will achieve it.

The SIL Scale: Numbers, Probabilities, and Meaning

| SIL Level | PFD (Demand Mode) | PFH (Continuous Mode) | Equivalent Reliability (Demand Mode) | Railway System Examples |
| --- | --- | --- | --- | --- |
| SIL 1 | 10⁻² to 10⁻¹ | 10⁻⁶ to 10⁻⁵ /hr | 1 failure in 10 to 1 in 100 demands | Non-safety-critical monitoring alarms; passenger information systems |
| SIL 2 | 10⁻³ to 10⁻² | 10⁻⁷ to 10⁻⁶ /hr | 1 failure in 100 to 1 in 1,000 demands | Platform screen doors; CCTV control systems; some ATP sub-functions |
| SIL 3 | 10⁻⁴ to 10⁻³ | 10⁻⁸ to 10⁻⁷ /hr | 1 failure in 1,000 to 1 in 10,000 demands | Level crossing controllers; some ATC onboard functions; fire/life safety systems |
| SIL 4 | 10⁻⁵ to 10⁻⁴ | 10⁻⁹ to 10⁻⁸ /hr | 1 failure in 10,000 to 1 in 100,000 demands | Interlocking (CBI), ETCS EVC, RBC, axle counter evaluators, OBU |

The distinction between demand mode (PFD) and continuous mode (PFH) is important:

  • Demand mode (PFD): Applies to systems that are dormant until needed — an emergency stop button, a fire suppression system. The PFD is the probability the system fails when demanded.
  • Continuous mode (PFH): Applies to systems that are continuously active — a railway interlocking, which must continuously produce correct outputs every second of operation. PFH is the probability of a dangerous failure per hour of continuous operation.

Railway signalling systems operate continuously, so PFH is the relevant metric. A SIL 4 railway interlocking must have a dangerous failure rate within the 10⁻⁹ to 10⁻⁸ per hour band. At 10⁻⁹ per hour that means, on average, one dangerous failure per billion hours of operation (approximately 114,000 years).
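The continuous-mode bands can be turned into a small lookup. The sketch below is illustrative only: the function names are hypothetical, the band edges are the PFH values from the table above, and the year conversion assumes 8,760 operating hours per year.

```python
# Continuous-mode (PFH) bands per hour, taken from the SIL table above.
# Each entry: (SIL, lower bound inclusive, upper bound exclusive).
PFH_BANDS = [
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
]

HOURS_PER_YEAR = 8760  # 365 days x 24 hours, leap years ignored


def sil_for_pfh(pfh_per_hour):
    """Return the SIL whose PFH band contains the rate, or None if no band does."""
    for sil, low, high in PFH_BANDS:
        if low <= pfh_per_hour < high:
            return sil
    return None


def mean_years_between_failures(pfh_per_hour):
    """Average time between dangerous failures, in years, for a constant rate."""
    return 1.0 / pfh_per_hour / HOURS_PER_YEAR


if __name__ == "__main__":
    for rate in (1e-9, 5e-8, 2e-6):
        years = mean_years_between_failures(rate)
        print(f"PFH {rate:.0e}/hr -> SIL {sil_for_pfh(rate)}, about {years:,.0f} years")
```

A rate of 10⁻⁹ per hour lands in the SIL 4 band and works out to roughly 114,000 years between dangerous failures, matching the figure quoted in the text.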

The CENELEC Railway Safety Standards: EN 50126, EN 50128, EN 50129

IEC 61508 defines the generic SIL framework. For railways, the CENELEC family of standards provides the specific implementation:

| Standard | Title | Scope | Key Requirement |
| --- | --- | --- | --- |
| EN 50126 | RAMS: Reliability, Availability, Maintainability, Safety | System-level: hazard identification, risk analysis, SIL allocation to functions | Demonstrate that system risks are tolerable; assign SIL to each safety function |
| EN 50128 | Software for Railway Control and Protection Systems | Software development lifecycle: coding, testing, verification, validation | Specific software development techniques mandatory or recommended per SIL level; independent verification |
| EN 50129 | Safety Related Electronic Systems for Signalling | Hardware and system-level safety assessment; safety case documentation | Formal Safety Case document demonstrating SIL achievement; independent Safety Assessor sign-off |

How SIL Is Assigned: From Hazard to Number

Assigning a SIL to a safety function is not an arbitrary decision — it follows a defined risk analysis process:

Step 1 — Hazard identification (HAZOP/FHA): All hazards that the system could contribute to are identified. For a railway interlocking, hazards include “conflicting routes set simultaneously,” “signal falsely displayed as clear,” and “point failed to move to requested position.”

Step 2 — Risk estimation: For each hazard, two parameters are estimated:

  • Severity: The worst-case consequence if the hazard leads to an accident (death, serious injury, minor injury, property damage).
  • Frequency: How often the hazardous situation arises — how often a demand is placed on the safety function.

Step 3 — Tolerable risk determination: Industry standards and regulatory requirements define tolerable risk levels for different severity categories. For catastrophic hazards (multiple fatalities), the tolerable risk is typically expressed as a fatality rate per train-kilometre operated or per passenger-journey, derived from societal safety expectations.

Step 4 — Risk reduction calculation: The difference between the unmitigated risk (risk without the safety function) and the tolerable risk determines how much risk reduction is needed. This risk reduction factor maps directly to a SIL level:

Required risk reduction = Unmitigated risk / Tolerable risk

Example: Unmitigated failure rate 10⁻³/hr; Tolerable rate 10⁻⁹/hr
Required reduction factor = 10⁻³ / 10⁻⁹ = 10⁶; the tolerable rate of 10⁻⁹/hr falls in the SIL 4 band (10⁻⁹ to 10⁻⁸ /hr PFH)
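The same arithmetic as a minimal sketch. The function names are hypothetical, the two rates are the example values above, and the SIL 4 check uses the continuous-mode band from the SIL table.

```python
import math


def required_risk_reduction(unmitigated_per_hour, tolerable_per_hour):
    """Factor by which the safety function must reduce the hazard rate."""
    return unmitigated_per_hour / tolerable_per_hour


def is_sil4_target(tolerable_per_hour):
    """True if the tolerable rate lies in the SIL 4 PFH band (1e-9 to 1e-8 per hour)."""
    return 1e-9 <= tolerable_per_hour < 1e-8


unmitigated = 1e-3  # example value from the text, per hour
tolerable = 1e-9    # example value from the text, per hour

factor = required_risk_reduction(unmitigated, tolerable)
orders_of_magnitude = round(math.log10(factor))

print(f"Required reduction: about 10^{orders_of_magnitude}")
print(f"Tolerable rate is a SIL 4 target: {is_sil4_target(tolerable)}")
```

Working in orders of magnitude (the logarithm of the factor) is convenient because each SIL band spans exactly one decade.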

Achieving SIL 4: What It Actually Takes

SIL 4 certification is not achieved by any single engineering measure — it requires a comprehensive set of practices applied consistently through the entire system development lifecycle:

Hardware Architecture

Hardware must achieve SIL 4 through redundancy. A single processor, however reliable, cannot achieve SIL 4 failure rates — individual electronic components have failure rates many orders of magnitude higher than 10⁻⁹/hour. Redundant architectures are required:

  • 1-out-of-2 (1oo2): Two channels process the same input; both must agree to produce an output. A disagreement causes a safe failure (system shuts down). Achieves SIL 3–4 depending on component reliability.
  • 2-out-of-3 (2oo3): Three channels vote; two must agree. Can tolerate one channel failure while maintaining operation; detects disagreements. Used in most modern SIL 4 interlockings and EVCs.
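  • 1-out-of-2 with diagnostics (1oo2D): Two channels that cross-monitor each other through built-in diagnostics (in IEC 61508 usage the D denotes diagnostics, not diversity). Both must indicate safe before proceeding; either can trigger safe shutdown. Building the two channels from diverse designs or manufacturers additionally reduces common-cause failures.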
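The voting behaviour of the two-channel and three-channel arrangements described above can be sketched in a few lines. This is a simplified illustration, not how a certified voter is built: real voters run in dedicated fail-safe hardware, and the `SAFE_STATE` marker and function names here are hypothetical.

```python
SAFE_STATE = "SAFE_SHUTDOWN"  # hypothetical marker for the most restrictive output


def vote_two_channel(a, b):
    """Two-channel comparison: output only if both agree, else force the safe state."""
    return a if a == b else SAFE_STATE


def vote_2oo3(a, b, c):
    """Three-channel vote: output the value at least two channels agree on."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    return SAFE_STATE  # no majority: all three channels disagree


if __name__ == "__main__":
    # One faulty channel stops a two-channel system but not a 2oo3 system.
    print(vote_two_channel("PROCEED", "STOP"))
    print(vote_2oo3("PROCEED", "PROCEED", "STOP"))
```

The design trade-off is visible in the example: the two-channel comparison sacrifices availability on any disagreement, while 2oo3 keeps operating with one faulty channel and still falls back to the safe state when no majority exists.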

Software Development (EN 50128)

SIL 4 software development requires a set of mandatory and highly recommended techniques that go far beyond normal commercial software engineering:

| Technique / Measure | SIL 1 | SIL 2 | SIL 3 | SIL 4 |
| --- | --- | --- | --- | --- |
| Formal specification methods (e.g., Z, B Method) | R | R | HR | M |
| Structured programming (no dynamic constructs) | R | HR | M | M |
| Static analysis (MISRA C, LINT) | R | HR | M | M |
| 100% MC/DC test coverage | — | R | HR | M |
| Independent verification and validation (IV&V) | R | R | M | M |
| Diverse software implementation (two teams) | — | — | R | HR |
| Traceability (requirements → code → test) | R | M | M | M |

M = Mandatory; HR = Highly Recommended; R = Recommended; — = Not specified
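MC/DC (modified condition/decision coverage) is the least intuitive entry in the table: every condition in a decision must be shown to independently affect the outcome. The sketch below checks this in its unique-cause form, where a pair of tests differs in exactly one condition and the outcome flips. The decision `a and (b or c)` and the helper name are hypothetical; qualified coverage tools measure this on instrumented code rather than on a Python function.

```python
from itertools import combinations


def decision(a, b, c):
    # Hypothetical three-condition decision, used only for illustration.
    return a and (b or c)


def conditions_shown_independent(func, tests):
    """Return the indices of conditions for which some pair of tests differs
    only in that condition and produces different decision outcomes."""
    shown = set()
    for t1, t2 in combinations(tests, 2):
        differing = [i for i, (x, y) in enumerate(zip(t1, t2)) if x != y]
        if len(differing) == 1 and func(*t1) != func(*t2):
            shown.add(differing[0])
    return shown


tests = [
    (True, True, False),   # outcome True
    (False, True, False),  # outcome False; differs from the first test only in a
    (True, False, False),  # outcome False; differs from the first test only in b
    (True, False, True),   # outcome True; differs from the third test only in c
]

achieved = conditions_shown_independent(decision, tests) == {0, 1, 2}
print("MC/DC achieved:", achieved)
```

Four tests are enough for three conditions here, compared with eight for exhaustive testing; that gap is why MC/DC remains practical for decisions with many conditions.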

The Safety Case: Documenting SIL Achievement

SIL certification is not simply declared by the developer — it must be demonstrated through a formal Safety Case document. The Safety Case is the structured argument that the system meets its required SIL, supported by evidence from design, analysis, and testing. A typical SIL 4 railway signalling Safety Case may run to thousands of pages and encompasses:

  • Hazard analysis and risk assessment results
  • System architecture description and failure mode analysis (FMEA, FTA)
  • Hardware reliability analysis (λ-value calculations for all components)
  • Software development process evidence (inspection records, test results, coverage measurements)
  • Integration test results (hardware-software-system)
  • Independent assessor reports and conclusions
  • Operational safety constraints and maintenance requirements

The Safety Case is reviewed and approved by an independent Safety Assessor — a certified body (in the UK, a NoBo/DeBo under the Railway Interoperability Regulation) that is independent of the developer. Without the Safety Assessor’s sign-off, the system cannot enter service on the railway network.

SIL in Practice: Railway System Assignments

| Railway System | SIL Level | Rationale |
| --- | --- | --- |
| Computer-Based Interlocking (CBI) | SIL 4 | Dangerous failure could allow conflicting train movements → multiple fatalities |
| ETCS European Vital Computer (EVC) | SIL 4 | Failure to enforce braking curve → train collision |
| Radio Block Centre (RBC) | SIL 4 | False movement authority → trains authorised into same section |
| Axle counter evaluator | SIL 4 | False clear section → conflicting train authorised in occupied section |
| Level crossing controller | SIL 3–4 | Failure to lower barriers → vehicle/train collision |
| Platform Screen Door control | SIL 2–3 | Failure to open or close: serious injury risk; lower frequency than collision hazards |
| Tunnel emergency ventilation | SIL 2–3 | Failure in fire emergency: risk to passengers; mitigated by supplementary protection |
| Traction power SCADA | SIL 2–3 (safety functions) | Failure to isolate electrified section: electrocution risk; has local manual backup |
| Passenger information displays | SIL 0 / Non-SIL | No safety function; incorrect display does not directly cause an accident |

Common Cause Failures: The SIL 4 Hidden Risk

One of the most important concepts in SIL engineering is the common cause failure (CCF) — a failure mode that simultaneously affects multiple redundant channels, defeating the independence that redundancy is designed to provide. If two processors in a 1oo2 architecture are both susceptible to the same software bug, a single triggering event can cause both to fail simultaneously, making the system’s actual failure rate much higher than its individual channel failure rates would suggest.

CCF mitigation in SIL 4 systems includes:

  • Hardware diversity: Using processors from different manufacturers with different silicon designs, so that a manufacturing defect affecting one device family does not affect both channels.
  • Software diversity: Implementing the safety function in two independent software versions developed by different teams from the same specification — a systematic bug in one version is unlikely to appear in the other.
  • Physical separation: Installing redundant channels in different physical locations, protected from the same environmental events (fire, flooding, power surge).
  • Independent power supplies: Ensuring the two channels cannot both lose power from the same upstream failure.
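A rough way to see why these measures matter is the beta-factor model, in which a fraction β of each channel's dangerous failures is assumed to hit both channels at once. The sketch below is a back-of-the-envelope approximation, not the formula set from IEC 61508-6, and all numbers (channel failure rate, β, detection window) are hypothetical.

```python
def dual_channel_dangerous_rate(lambda_d, beta, detection_window_hours):
    """Approximate dangerous failure rate (per hour) of a two-channel system
    that fails dangerously only when both channels have failed.

    lambda_d: dangerous failure rate of one channel, per hour
    beta: fraction of failures that are common cause (0.0 to 1.0)
    detection_window_hours: time a first failure can remain undetected
    """
    # Common-cause failures take out both channels in a single event.
    common_cause = beta * lambda_d
    # Independent double failure: either channel fails first (2 * rate), then the
    # other fails before the first failure is detected (rate * window).
    independent_rate = (1.0 - beta) * lambda_d
    independent = 2.0 * independent_rate * (independent_rate * detection_window_hours)
    return common_cause + independent


# Hypothetical figures: each channel fails dangerously at 1e-6 per hour and
# cross-comparison detects a failed channel within one hour.
no_ccf = dual_channel_dangerous_rate(1e-6, beta=0.0, detection_window_hours=1.0)
with_ccf = dual_channel_dangerous_rate(1e-6, beta=0.02, detection_window_hours=1.0)

print(f"beta = 0:    {no_ccf:.1e} per hour")
print(f"beta = 0.02: {with_ccf:.1e} per hour")
```

Under these assumptions, a 2% common-cause fraction dominates the result and pushes the system rate above 10⁻⁸ per hour, outside the SIL 4 band, even though the independent double-failure rate sits far below it.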

Editor’s Analysis

The SIL framework represents one of the railway industry’s most important methodological advances of the past thirty years. Before EN 50126/50128/50129, railway safety systems were designed and approved based on engineering judgement and operational experience — “proven-in-use” arguments that a system was safe because it had not failed recently. This approach was adequate when systems were purely mechanical or based on simple relay circuits whose failure modes were well understood. It became inadequate when software-based systems entered the safety architecture — software can fail in ways that operational experience does not reveal, because the triggering conditions for a dormant bug may never be encountered in normal service. The Cowden accident in 1994 was a software fault in a system that was not subject to the kind of systematic verification that would have found it. SIL 4 software development, with its mandatory formal methods, 100% MC/DC coverage, and independent verification, exists precisely to find Cowden-type faults before they enter service. The cost is real and substantial — a SIL 4 signalling system costs perhaps 10–20 times more to develop than equivalent non-safety software — but the alternative is accepting an unknowable rate of undetected software faults in systems whose failure can kill dozens of people. The ongoing challenge is maintaining the intellectual integrity of SIL certification as commercial pressures push for faster and cheaper development. The Safety Case document, the independent assessor process, and the EN 50128 mandatory measures are all mechanisms for maintaining that integrity. When any of them is compromised — when Safety Cases are superficial, assessors are captured by developers, or mandatory measures are waived on commercial grounds — the SIL number on the certificate becomes a label rather than a guarantee. — Railway News Editorial

Frequently Asked Questions

Q: Is a SIL 4 system completely safe — will it never fail?
No. SIL 4 does not mean zero failures; it means an extremely low probability of dangerous failure. A SIL 4 continuous-mode system with a PFH of 10⁻⁹ per hour is expected to produce one dangerous failure per billion hours of operation, approximately 114,000 years. In a fleet of 100 SIL 4 interlockings each operating 8,760 hours per year, the expected rate is one dangerous failure across the entire fleet approximately every 1,140 years. This is not zero; it is a very small residual risk that society accepts as the price of having a functioning railway system. The safety case for any SIL 4 system includes a demonstration that the residual risk is tolerable according to the applicable standard; it does not claim that risk is zero.
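The fleet arithmetic can be checked with a short sketch. It assumes a constant failure rate (an exponential model), which is a simplification, and the 30-year service life is a hypothetical figure added for illustration.

```python
import math

HOURS_PER_YEAR = 8760


def fleet_failures_per_year(pfh_per_hour, fleet_size):
    """Expected dangerous failures per year across the whole fleet."""
    return pfh_per_hour * HOURS_PER_YEAR * fleet_size


def probability_of_any_failure(pfh_per_hour, fleet_size, years):
    """Probability of at least one dangerous failure in the fleet over the
    period, assuming a constant failure rate."""
    expected = fleet_failures_per_year(pfh_per_hour, fleet_size) * years
    return -math.expm1(-expected)  # equals 1 - exp(-expected), numerically stable


rate = fleet_failures_per_year(1e-9, fleet_size=100)
print(f"Mean interval between fleet failures: about {1.0 / rate:,.0f} years")

p30 = probability_of_any_failure(1e-9, fleet_size=100, years=30)
print(f"Chance of at least one failure in 30 years: {p30:.1%}")
```

Expressed this way, the residual risk is easier to compare against a tolerable-risk target than a mean interval of more than a thousand years.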
Q: How long does SIL 4 certification take and why?
SIL 4 certification for a new railway signalling product typically takes 3–7 years from the start of development to receipt of the certifying assessor’s report. The timeline reflects the depth of evidence required: formal specification and verification of the software (which may involve mathematically proving properties of the code), 100% modified condition/decision coverage (MC/DC) testing of every code branch, independent verification of all test evidence by a separate team, integration testing of the hardware-software combination, and a Safety Case document of thousands of pages reviewed by the independent Safety Assessor. The assessor’s review and challenge process can itself take 12–18 months. The duration is a consequence of the thoroughness required — cutting corners reduces the timeline but also reduces the confidence that the SIL 4 target has genuinely been achieved.
Q: What is the difference between SIL and ASIL (Automotive SIL)?
ASIL (Automotive Safety Integrity Level) is the automotive industry’s equivalent of SIL, defined in ISO 26262 rather than IEC 61508. Both are four-level scales (ASIL A to D, corresponding roughly to SIL 1 to 4) with similar underlying mathematics of probability of dangerous failure. The key differences are the specific technical requirements at each level — automotive and railway environments have different failure modes, testing methodologies, and operational contexts — and the standards bodies involved. An ASIL D-certified automotive system is not automatically considered equivalent to a SIL 4 railway system; the certification must be to the railway-specific standards (EN 50128) for railway deployment. Cross-domain recognition of safety certification is an ongoing area of standardisation work, particularly relevant as automotive-derived electronic components (processors, sensors) are increasingly used in railway applications.
Q: Can a system achieve SIL 4 using off-the-shelf commercial hardware?
Yes — SIL 4 systems regularly use commercial off-the-shelf (COTS) processors and electronics as components. However, the SIL 4 claim applies to the system architecture, not to the individual commercial components. A standard Intel or ARM processor is not SIL 4 rated — it has failure rates, software bugs, and uncontrolled internal architecture that preclude individual certification. But two or three such processors, running in a voted redundant architecture with appropriate monitoring and with verified software, can achieve SIL 4 at the system level because the architecture ensures that a single processor failure produces a safe output and is detected before it can cause a dangerous outcome. The SIL 4 system designer must characterise the failure behaviour of each COTS component, demonstrate that the overall architecture tolerates the expected failure modes, and include the component failure rates in the system-level reliability calculation.
Q: What is an “Independent Safety Assessor” and why is one required?
An Independent Safety Assessor (ISA) — called a NoBo (Notified Body) or DeBo (Designated Body) in European railway regulation — is an organisation accredited to review and certify safety cases for railway systems. The ISA reviews the developer’s Safety Case, challenges assumptions and evidence, may conduct its own testing, and ultimately issues (or withholds) a certificate confirming that the system has achieved its claimed SIL. The independence requirement means the ISA must have no financial or organisational relationship with the developer or the procuring railway — it must be able to withhold certification without commercial consequences. Independence is essential because a developer assessing their own work will have a natural bias toward confirming their conclusions; the ISA provides the adversarial challenge that makes the certification process credible. Without mandatory independent assessment, the SIL number on a certificate would reflect only the developer’s own judgement about their work.