@AlexJones0
See also the relevant discussion in this comment: #178 (comment). In general, without a multithreaded implementation, we currently prefer a short timer (~100 ns) over Bottom Halves (BHs) for decoupling larger processing tasks and briefly yielding back to the vCPU.

From personal testing of more involved OpenTitan integration flows, I found that under heavy processing load (many tests running in parallel, so that QEMU is pre-empted quite often), these BHs could be quite inconsistent and occasionally take a surprisingly long time to be scheduled. This contributed to unexpected timing-related test failures: although timing is not expected to be accurate, the BH intermittently made some OTP operations take an order of magnitude longer than expected.
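The difference can be illustrated with a toy model. This is a sketch only, not QEMU code: every name below is hypothetical, and it assumes that a timer deadline bounds how much guest time can elapse before the deferred work runs, whereas a BH carries no deadline and simply waits for the host to get around to it while guest time keeps advancing in fixed slices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names are QEMU APIs. */

/* Guest-visible latency when the work is deferred with a timer. The work
 * runs at the first guest time-slice boundary at or after the deadline,
 * regardless of how busy the host is. */
uint64_t timer_guest_latency_ns(uint64_t delay_ns, uint64_t slice_ns)
{
    assert(slice_ns > 0);
    uint64_t now_ns = 0;
    while (now_ns < delay_ns) {
        now_ns += slice_ns; /* guest executes one more slice */
    }
    return now_ns;
}

/* Guest-visible latency when the work is deferred with a BH. The work has
 * no deadline: it runs only after the host has stalled for some number of
 * guest slices, which grows with host load. */
uint64_t bh_guest_latency_ns(uint64_t stalled_slices, uint64_t slice_ns)
{
    return stalled_slices * slice_ns;
}

/* True if the guest would observe the operation as exceeding its timeout. */
bool exceeds_sw_timeout(uint64_t latency_ns, uint64_t timeout_ns)
{
    return latency_ns > timeout_ns;
}
```

With hypothetical 50 ns slices, `timer_guest_latency_ns(100, 50)` gives 100 ns, while a BH stalled for 250000 slices gives 12.5 ms, which `exceeds_sw_timeout` flags against a 10 ms timeout. The slice size and stall count are made-up values chosen to show the shape of the problem, not measurements.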

@pamaury pamaury left a comment

Not an expert on this part of the code, but the change looks reasonable.

@rivos-eblot rivos-eblot left a comment

Being OT-offboarded tonight, keeping the review as comment-only :-)

From testing, it seems that the use of Bottom Halves on hosts under
higher processing loads (where QEMU is more often pre-empted) can
produce inconsistent / slow results. Timers with a short timeout provide
the same decoupling functionality as the BH but with more consistency
and expedience.

This could be especially problematic for SW which expects digest writes
to be processed within a certain time. If handling of the BH was
deferred long enough, then enough guest code could be processed for it
to look like e.g. 10 ms had passed, exceeding common SW timeouts. This
could be seen happening occasionally under high processing loads.

Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
@AlexJones0
Author

The CI failures are unrelated (the //sw/device/silicon_creator/rom_ext/e2e/ownership:flash_permission_test_slot_ba_sim_qemu_rom_ext failure is a rare flake caused by host runner load when many ownership tests run in parallel), so I'm merging this regardless.

@AlexJones0 AlexJones0 merged commit 88d955e into lowRISC:ot-10.1.0 Dec 9, 2025
7 of 10 checks passed