Skip to content
This repository was archived by the owner on Oct 3, 2025. It is now read-only.

Conversation

@WhaleKit
Copy link
Contributor

@WhaleKit WhaleKit commented Jan 5, 2025

Related to issue: #17
This is not final, but I'd like to present my prototype of async/resume support and get feedback

It adds facilities for suspending and resuming execution of wasm code.
Namely:

  • host coroutines - functions that, when called, can yield execution (along with "yielded value") from themselves and wasm vm to host, and take it back from host (along with "resume argument")
  • 3 wasm suspend conditions - time, atomic flag, and callback - that suspend the wasm code execution. They are checked every time a function is called, returned from and control flow "branches" into loop block (i.e. execution jumps back).

from what I can see, I need to do at least few more things:

  • add typed wrappers for async function types and result enums (so that "yielded value" and "resume argument" are typed), for comparable to "typed functions" result would require a "coroutine types registry" in store that would check types when exporting. not impossible, but would make this pr too big when you call wasm function, you may get anything any imported function yielded as yielded value, not sure how would ensuring type safery of that would work
  • make this all optional with feature flags (opt-in or opt-out),
  • consider adding lifetime to "yielded value" and "resume argument" (so that "resume argument" could reference something from host caller and "yielded value" could reference something in "suspended host coro"), tests in rust nope, very likely undoable without unsafe and very careful lifetimancy
  • look for typos

@WhaleKit
Copy link
Contributor Author

WhaleKit commented Jan 8, 2025

I found what i broke, (and fixed it) yay!

On that note, it'd be cool to run entire test suite but interrupt it randomly (which can be done with a should-suspend user callback that uses a random number generator)

@WhaleKit
Copy link
Contributor Author

WhaleKit commented Jan 9, 2025

i did, and it failed the start.wast - which makes sense, because i forgot to handle start function suspending. So i fixed that.

Also added my async-testing edits to test runner, but under a "test_async" feature for now. Not sure how to do it properly - test both normal running and periodically suspended one, and whether it's even desirable

@WhaleKit
Copy link
Contributor Author

WhaleKit commented Jan 9, 2025

i made the api very opt-in: the whole thing is planned to be behind feature flag, and even if some other dependency adds that feature - to engage with "suspend" functionality user needs to go out of their way and call "_coro" versions of functions - the "old" ones just return error when execution is suspended, keeping things backwards-compatible.

However, does it make sense to do all that? It adds burden on users who do want to use async/suspend to remember to call "_coro" versions. and how often do you run code in a sandboxed environment, but aren't interested in controlling how much execution time it takes? Maybe if you're running trusted code, or keep wasm executors in separate processes that you can just kill if they are taking too long. Should i just modify existing api to return "maybe result maybe suspended execution" (rather than "_coro" versions), and not bother with feature flag?
upd: And if users want to ignore possibility of suspend, they can indicate it with ?.suspend_to_err()? i just added

@WhaleKit
Copy link
Contributor Author

WhaleKit commented Jan 9, 2025

something that could be useful, but beyond the scope of this pr
Adding to really long stretches of instructions a special "check if you should suspend" instruction if they don't have function calls (since those already do that),
marking small wasm functions to indicate that executor doesn't need to check if it should suspend when calling them/returning from them

@WhaleKit
Copy link
Contributor Author

So just as i initially planned, resume/async functionality if very opt-in: user needs to enable "async" feature, and even if it's enabled by something else in the dependency tree, none of user's code should break (unless they match on error, where i followed existing practice and feature-gated relevant enum variant). They also need to go out of their way and set set_suspend_conditions on host or create coroutine functions, and use "*_coro" functions when calling exported function or instantiating modules

@WhaleKit WhaleKit marked this pull request as ready for review January 13, 2025 20:31
@explodingcamera explodingcamera self-requested a review January 16, 2025 13:40
Copy link
Owner

@explodingcamera explodingcamera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really cool so far! I'm not fully on board with the API / amount of internal changes, and don't have a lot of time at least in January so it might take some time for me to fully review this and play around with some ideas on how to simplify it. The general principals definitely looks like something I want to add though.

@WhaleKit
Copy link
Contributor Author

That's fair enough
I could move "suspend conditions" (without host coro) to separate PR, but it doesn't contribute that much code - no more than 200 LOC i'd say

There are some compromises here I'm not entirely happy with either. If you or prefer different variants that ones i took, or have entirely different ideas, i'd rewrite with your directions in mind

I'd prefer if CoroState were consumed self by the "resume" and on suspension returned self in "Suspended", variant, but I don't see a way to make it dyn-compatible. I'm guessing that's why it doesn't consume self in experimental std trait either
With that, I could either

  • split "Returned or Suspended" enum into 2: "PotentialCoroCallResult<ReturnedVal, SuspendedState>" with state for first-time call, and "CoroStateResumeResult<ReturnedVal>" without state, for subsequent "resume" calls, and have more boilerplate implementing common functions for both. That also means if users needed to implement some of their own traits for "Returned or Suspended", they'd have to do it for 2 of them, but that's probably not that common.
  • have just one enum "CoroResult<Res, State>", and subsequent calls could return CoroResult<Res, ()> - but then users would have to write CoroResult::Suspended(()) every time they make or match CoroResult<Res, ()>
    i went with former to spare the user at the expense of more boilerplate inside, but i don't insist on it

(I could also not have CoroState at all, and make "host coro"s store their intermediary state in store's userdata (and add said userdata). Also could do away with YieldedValue and ResumeArgument and make host communicate with suspended coro via userdata. That could reduce allocations and make subsequently adding "save interpreter state" feature easier, but I feel like it makes for much less convenient api. Also, api proposed here allows that too, only thing missing being userdata in store which could be added relatively easy)

Another compromise is the degree of op-in-ness of api vs niceness of the code. I am being cautious not to introduce anything to public api if there is no feature, and "conditional pub" approach i use here makes code a fair bit less nice. I could try to make a macro that makes thing conditionally public, or just let async-specific types (and maybe even functions) spill into namespace even if there is no feature enabled, and only conditionally disable the code that affects runtime performance and lets users create host coroutines or set suspend conditions

@WhaleKit
Copy link
Contributor Author

instead of putting "SuspendConditions" in store, it could be passed some other way. I put it into store so that if a host function were to call wasm callback, it couldn't forget to pass "SuspendConditions" to it, or wasm callback could get stuck in the loop there and never exit,
I think any other way of passing it should be similarly hard to forget to pass further

@WhaleKit
Copy link
Contributor Author

WhaleKit commented Sep 8, 2025

I removed feature-gating because i was unhappy with how it looked, and updated to current "next" branch resolving conflicts

@WhaleKit WhaleKit changed the title Async Coro Sep 25, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants