Conversation

@dicej
Contributor

@dicej dicej commented Dec 10, 2025

The spec says we should allow this, so now we do.

Thanks to Alex for the test case!

Fixes #12128

@dicej dicej requested a review from alexcrichton December 10, 2025 22:36
@dicej dicej requested a review from a team as a code owner December 10, 2025 22:36
Member

@alexcrichton alexcrichton left a comment

Code/test all look fine, but I'm trying to correlate this with the spec as well. I thought I remembered that historically may_enter was a thing, but it's no longer present in CanonicalABI.md. Can you clarify which point in the spec I should be looking at to double-check this? (and probably queue up some sort of rename/refactor to handle may_enter to align with the spec)

@dicej
Contributor Author

dicej commented Dec 10, 2025

> Code/test all look fine, but I'm trying to correlate this with the spec as well. I thought I remembered that historically may_enter was a thing, but it's no longer present in CanonicalABI.md. Can you clarify which point in the spec I should be looking at to double-check this? (and probably queue up some sort of rename/refactor to handle may_enter to align with the spec)

Search for trap_if_on_the_stack in CanonicalABI.md and the prose that follows. I agree that some renaming and/or refactoring may be due.

@dicej
Contributor Author

dicej commented Dec 10, 2025

I'm investigating the test failure; looks like dropping a subtask while it's yielding in an infinite loop will require some extra care. EDIT: the problem is more fundamental than that, and the refactoring Alex suggested above is going to be necessary in order to fix #12128 properly.

@github-actions github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label Dec 11, 2025
@alexcrichton
Member

Joel and I discussed this and the conclusion is that the may_enter handling in Wasmtime is outdated and no longer in sync with the spec after component-model-async refactors. Effectively we need to rebuild the reentrance check from scratch throughout Wasmtime and avoid using may_enter for the triple-purpose of: preventing reentrance, requiring post_return, and lockdown-on-trap. This'll require refactors internally to use a new component-model-async helper but will have an impact on component adapter performance as well. This all corresponds to trap_if_on_the_stack in the spec (scroll down a bit)

@dicej dicej requested a review from a team as a code owner December 16, 2025 22:04
@dicej dicej requested review from cfallin and removed request for a team December 16, 2025 22:04
@dicej dicej force-pushed the fix-12128 branch 3 times, most recently from fbb2cf2 to 4abfdce Compare December 17, 2025 00:12
@alexcrichton
Member

Current thinking, assuming I'm remembering this all correctly:

  • Wasmtime will analyze/store a boolean that indicates, for a component, "components with core modules contain no components", aka core modules are located only in the leaves of a component composition. If this is true, then reentrance checks in component<->component adapters can be skipped entirely. If this is false, which is expected to be quite rare today, then this PR's slow path will kick in.
  • The slow path is expected to be tweaked at the spec level. Right now flags are per-component-instance, but they sort of need to "find the least upper bound" in the path of components-to-the-root to set the reentrance flag on, to ensure that implementation details of where implementations live aren't leaked (precise details TBD). This effectively means, however, that the may_enter flag for the host-level component will be component-wide.
  • There will be a new check on the sync<->sync adapter which will manage can_block-style flags. This is necessary to ensure that when a sync-typed function is called it's not allowed to block. This is an accidental omission and bug in today's implementation which will be fixed.

In short, sync<->sync adapters will get faster on one axis (removing reentrance checks) but slower on another axis (manipulating the can_block flag). The can_block flag will live per-vm::ComponentInstance and will be as fast as the current reentrance check. In essence the sync<->sync adapter performance profile is not expected to change.

@dicej dicej force-pushed the fix-12128 branch 4 times, most recently from 9726f92 to 38c4537 Compare December 22, 2025 23:50
Now that the component model supports multiple tasks and cooperative threads
running concurrently in the same instance, the old model of using a per-instance
flag to track whether a component may be entered no longer works.  Instead of
keeping track of whether an instance has been entered at all, we must now keep
track of whether an instance has been entered for each in-progress task.

This commit removes `FLAG_MAY_ENTER` from `InstanceFlags`, relying instead on a
per-task call stack maintained at runtime.  When the `component-model-async`
feature is enabled, we reuse the `GuestTask` stack for this purpose.  When the
`component-model-async` feature is disabled, there's just a single stack used
for the entire store.  Note that these stacks are only updated when crossing
component boundaries -- not at every internal call within an instance.

Each time we're about to enter an instance from either the host or the guest, we
check the call stack, and if the instance is already present, we trap.
Otherwise, we push a new element onto the stack and later pop it back off once
the callee returns a value.
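
As a rough illustration of the check described above, here is a minimal sketch of a per-task call stack. The names `InstanceId`, `Trap`, and `CallStack` are hypothetical and chosen for this example; they are not Wasmtime's actual types.

```rust
/// Hypothetical identifier for a component instance (illustrative only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct InstanceId(pub u32);

/// Hypothetical trap raised on recursive reentrance.
#[derive(Debug, PartialEq, Eq)]
pub enum Trap {
    CannotEnterInstance,
}

/// One call stack per in-progress task, updated only when crossing
/// component boundaries.
#[derive(Default)]
pub struct CallStack {
    frames: Vec<InstanceId>,
}

impl CallStack {
    /// Trap if `callee` is already on the stack; otherwise push it.
    pub fn enter(&mut self, callee: InstanceId) -> Result<(), Trap> {
        if self.frames.contains(&callee) {
            return Err(Trap::CannotEnterInstance);
        }
        self.frames.push(callee);
        Ok(())
    }

    /// Pop the most recent frame once the callee has returned a value.
    pub fn exit(&mut self) -> Option<InstanceId> {
        self.frames.pop()
    }
}
```

A linear scan is used here for simplicity; the stack is expected to be shallow because it only grows at component boundaries.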

In addition to trapping on recursive reentrance, we do a couple of additional
checks for host-to-guest calls, for which we previously relied on
`FLAG_MAY_ENTER`:

- If _any_ instance owned by the store has previously trapped, we disallow entering that or any other instance owned by the store.

- If a post-return call is needed for an instance, we disallow entering it until that call has been made.
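
The two host-to-guest checks above could be modeled roughly as follows. This is a sketch under assumptions: `StoreState`, `EnterError`, and the method names are hypothetical, and instances are identified by plain `u32` values for brevity.

```rust
use std::collections::HashSet;

/// Hypothetical reasons for refusing a host-to-guest call.
#[derive(Debug, PartialEq, Eq)]
pub enum EnterError {
    StoreTrapped,
    PostReturnPending,
}

/// Hypothetical per-store bookkeeping replacing the old per-instance flag.
#[derive(Default)]
pub struct StoreState {
    /// Set once any instance owned by the store has trapped.
    trapped: bool,
    /// Instances that still need their post-return call.
    needs_post_return: HashSet<u32>,
}

impl StoreState {
    /// Checks performed before the host enters `instance`.
    pub fn check_host_to_guest(&self, instance: u32) -> Result<(), EnterError> {
        if self.trapped {
            return Err(EnterError::StoreTrapped);
        }
        if self.needs_post_return.contains(&instance) {
            return Err(EnterError::PostReturnPending);
        }
        Ok(())
    }

    /// Record that some instance in the store trapped.
    pub fn record_trap(&mut self) {
        self.trapped = true;
    }

    /// Record that a call into `instance` returned and awaits post-return.
    pub fn call_returned(&mut self, instance: u32) {
        self.needs_post_return.insert(instance);
    }

    /// Record that the post-return call for `instance` has been made.
    pub fn post_return_done(&mut self, instance: u32) {
        self.needs_post_return.remove(&instance);
    }
}
```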

Note that, for sync-to-sync, guest-to-guest calls, the above process entails
significantly more overhead than the prior code, which only involved checking
and setting a flag and did not require calling into the host at all.  I intend
to follow this PR up with one or more optimizations to reduce that overhead.
See the discussion of `trap_if_on_the_stack` in
https://github.com/WebAssembly/component-model/blob/main/design/mvp/CanonicalABI.md
for examples of such optimizations.

In addition to the above refactoring, this commit does a couple of related
things:

- When a host function recursively calls back into guest code, we now chain the old call stack to the new one.  This allows us to catch recursive reentrance which may span multiple top-level instances.

- Previously, we did not push a new task on the stack for sync-to-sync, guest-to-guest calls, which meant we missed catching some violations of the sync-task-must-not-block-before-returning rule.  Now that we are pushing a new task in that case, we catch all such violations, which means some of the existing WAST tests needed updating.
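
To illustrate the chaining described in the first item, here is a minimal sketch in which a new call stack keeps a link to the stack that was active when the host called back into guest code. `ChainedStack` and its methods are hypothetical names for this example, and borrowing the parent by reference is an assumption made to keep the sketch small.

```rust
/// A call stack segment that optionally links to the stack that was active
/// when a host function called back into guest code (illustrative only).
pub struct ChainedStack<'a> {
    frames: Vec<u32>,
    parent: Option<&'a ChainedStack<'a>>,
}

impl<'a> ChainedStack<'a> {
    pub fn new(parent: Option<&'a ChainedStack<'a>>) -> Self {
        ChainedStack { frames: Vec::new(), parent }
    }

    /// Walk this segment and every chained parent looking for `instance`.
    pub fn contains(&self, instance: u32) -> bool {
        let mut current = Some(self);
        while let Some(stack) = current {
            if stack.frames.contains(&instance) {
                return true;
            }
            current = stack.parent;
        }
        false
    }

    /// Refuse to enter an instance found anywhere along the chain.
    pub fn enter(&mut self, instance: u32) -> Result<(), &'static str> {
        if self.contains(instance) {
            return Err("instance is already on the call stack");
        }
        self.frames.push(instance);
        Ok(())
    }
}
```

Because the lookup follows the parent links, reentrance is detected even when the earlier frame belongs to a different top-level call.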

bless disas/component-model tests

optimize sync-to-sync, guest-to-guest calls

We now omit recursive reentrance checks entirely when generating fused adapters
for components which cannot possibly recursively reenter themselves, noting that
components which only contain modules in leaf (sub)components fall into that
category.
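
The leaf-only condition can be sketched as a recursive walk over a component tree. The `Component` struct below is a hypothetical simplification for this example, and it ignores imported modules and components.

```rust
/// Hypothetical, simplified view of a component composition.
pub struct Component {
    /// Whether this component directly contains core modules.
    pub has_core_modules: bool,
    /// Nested (sub)components.
    pub children: Vec<Component>,
}

/// Returns true when every component that contains core modules is a leaf,
/// i.e. has no child components of its own.
pub fn modules_only_in_leaves(component: &Component) -> bool {
    let violates = component.has_core_modules && !component.children.is_empty();
    !violates && component.children.iter().all(modules_only_in_leaves)
}
```

When this predicate holds, the fused adapters in this sketch's model could omit the reentrance check; otherwise the slower runtime check remains necessary.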

However, we must still include a runtime global variable check-and-set to
enforce the may-block rules for sync-typed tasks.  With some more effort, we
could also eliminate those for components statically known to never make
blocking calls to intrinsics or imports.

The implementation of the may-block check centers around a
per-top-level-instance global variable called `task_may_block`, which we update
each time we switch threads and tasks, as well as whenever `task.return` or
`task.cancel` is called.  This required shuffling some code around and creating
a new `StoreOpaque::set_thread` function which encapsulates switching threads
and updating `task_may_block` for the old and new instances.
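
A rough model of the thread switch described above follows. Everything here is a sketch under assumptions: `Store`, `Thread`, and `may_block` are hypothetical stand-ins rather than Wasmtime's `StoreOpaque` internals, and resetting the outgoing instance's flag to `false` is a placeholder choice made only for this example.

```rust
use std::collections::HashMap;

/// Hypothetical thread record: the top-level instance it runs in and whether
/// its current task is allowed to block.
pub struct Thread {
    pub instance: u32,
    pub task_may_block: bool,
}

/// Hypothetical store holding one `task_may_block` value per top-level instance.
#[derive(Default)]
pub struct Store {
    current: Option<Thread>,
    task_may_block: HashMap<u32, bool>,
}

impl Store {
    /// Switch threads and update the flag for the old and new instances.
    /// Returns the previously running thread, if any.
    pub fn set_thread(&mut self, next: Option<Thread>) -> Option<Thread> {
        let previous = std::mem::replace(&mut self.current, next);
        if let Some(old) = &previous {
            // Assumption: reset the outgoing instance to a conservative value.
            self.task_may_block.insert(old.instance, false);
        }
        if let Some(new) = &self.current {
            self.task_may_block.insert(new.instance, new.task_may_block);
        }
        previous
    }

    /// The value an adapter would read before allowing a blocking call.
    pub fn may_block(&self, instance: u32) -> bool {
        self.task_may_block.get(&instance).copied().unwrap_or(false)
    }
}
```

Updating the new instance after the old one means a switch between two threads of the same instance ends with the incoming thread's value.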

tweak rules to determine whether recursive reentrance is possible

This accounts for components which instantiate imported modules and/or
components.
Labels

wasmtime:api Related to the API of the `wasmtime` crate itself

Development

Successfully merging this pull request may close these issues.

component-model-async: Cannot reenter component during a task's yield?

2 participants