A bit-flip in the ECC-protected register file can go unnoticed when writeback forwarding is enabled #2188

viniul · 2024-07-08T20:12:49Z

Summary

The register file can be protected through ECC. In case this ECC detects an error, an alert signal is raised, and execution is aborted. However, the ECC signal is masked (ignored) in case writeback forwarding is enabled, see ibex_core.sv:919:

    // Calculate errors - qualify with WB forwarding to avoid xprop into the alert signal
    assign rf_ecc_err_a_id = |rf_ecc_err_a & rf_ren_a & ~rf_rd_a_wb_match;

This masking ignores the fact that sometimes, the core uses the value from the register file, even when writeback forwarding is enabled -- namely when encountering data hazards.

As a result, a bit-flip in the register file can propagate to the rest of the core, even with ECC in the register file being enabled.

This issue was found by Konrad Jansen, Niklas Kühnapfel, Peter William Deutsch, and Vincent Ulitzsch.

Issue Details

Consider this waveform, in particular the signals listed in this screenshot:

Consider these two instructions which are about to be executed in the attached screenshot of the waveform.

168:  fec42703               lw     a4,-20(s0)
16c:  fcf740e3               blt    a4,a5,12c <byteArrayCompare+0x24>

The branch instruction needs to wait for the load instruction to finish -- therefore the core is encountering a data hazard. A fault injected in the register file (in register a4), will propagate and disrupt execution, but remains undetected by the ECC. In the attached waveform, the fault is injected at 553 ps. To see why the fault does not trigger an alert, let us look at the waveform:

Consider the signals debug_pc_id (pc decode) and debug_pc_wb. At 550, the load instruction is in the writeback stage and the branch instruction is in the decode + execute stage. The load instruction will cause a stall in the pipeline, see ibex_decoder, line 1003

    // If instruction is reading register that load will be writing stall in
    // ID until load is complete. No need to stall when reading zero register.
    assign rf_rd_a_hz = rf_rd_a_wb_match & rf_ren_a;
    assign stall_ld_hz = outstanding_load_wb_i & (rf_rd_a_hz | rf_rd_b_hz);

We therefore stall for one cycle to get the load.

This load happens one cycle later (552 ps). The rf_write_wb signal is set to low, therefore setting

assign rf_rdata_a_fwd = rf_rd_a_wb_match & rf_write_wb_i ? rf_wdata_fwd_wb_i : rf_rdata_a_i;

to not use the value from the writeback stage, but from the rf_rdata_a_i value, see ibex_id_stage, line 1006. However, the rf_rd_a_wb_match signal is not set to low, thereby disabling the ECC signal:

assign rf_ecc_err_a_id = |rf_ecc_err_a & rf_ren_a & ~rf_rd_a_wb_match;

The branch instruction from the decode stage therefore receives the value from the register file, although the rf_rd_a_wb_match signal is still set to high. Because the ECC signal is set to low, our fault goes unnoticed. In the example waveform, the injected fault at 553 ps (in the register file holding register a4) disrupts the value of alu_operand_a_ex, thereby changing the result of branch (debug_branch_decision), but because rf_rd_a_wb_match is still enabled, the ECC does not trigger.

Fix

Only mask the ECC signal in case writeback forwarding is enabled:

    // Calculate errors - qualify with WB forwarding to avoid xprop into the alert signal
    assign rf_ecc_err_a_id = |rf_ecc_err_a & rf_ren_a & (~rf_rd_a_wb_match | ~rf_write_wb);
    assign rf_ecc_err_b_id = |rf_ecc_err_b & rf_ren_b & (~rf_rd_b_wb_match | ~rf_write_wb);

The text was updated successfully, but these errors were encountered:

Under certain circumstances Ibex ignored the ECC check from the register file when it should not have. This fixes the issue. Fixes lowRISC#2188

viniul added the Type:Bug Bugs label Jul 8, 2024

GregAC added a commit to GregAC/ibex that referenced this issue Jul 15, 2024

[rtl] Fix logic for generating ECC related asserts

1777106

Under certain circumstances Ibex ignored the ECC check from the register file when it should not have. This fixes the issue. Fixes lowRISC#2188

GregAC added a commit to GregAC/ibex that referenced this issue Jul 15, 2024

[rtl] Fix logic for generating ECC related alerts

f34cad4

Under certain circumstances Ibex ignored the ECC check from the register file when it should not have. This fixes the issue. Fixes lowRISC#2188

GregAC closed this as completed in 9e4a950 Jul 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A bit-flip in the ECC-protected register file can go unnoticed when writeback forwarding is enabled #2188

A bit-flip in the ECC-protected register file can go unnoticed when writeback forwarding is enabled #2188

viniul commented Jul 8, 2024 •

edited

Loading

A bit-flip in the ECC-protected register file can go unnoticed when writeback forwarding is enabled #2188

A bit-flip in the ECC-protected register file can go unnoticed when writeback forwarding is enabled #2188

Comments

viniul commented Jul 8, 2024 • edited Loading

Summary

Issue Details

Fix

viniul commented Jul 8, 2024 •

edited

Loading