Non-blocking Interface for Payjoin State Machine #1446
Conversation
spacebear21
left a comment
There was a problem hiding this comment.
concept ACK. I mostly reviewed the v1 state machine before realizing those same changes are applied in v2 as well, so all my comments apply to the v2 changes too. I'm actually not sure we need to support an async-compatible interface for v1 at all? It's going to result in a lot of duplication and afaik v1 is a) synchronous protocol and b) getting deprecated.
| /// function sets that parameter to None so that it is ignored in subsequent steps of the | ||
| /// receiver flow. This protects the receiver from accidentally subtracting fees from their own | ||
| /// outputs. | ||
| #[cfg_attr(not(feature = "v1"), allow(dead_code))] |
There was a problem hiding this comment.
Unrelated to this PR but this seems odd.. Why is this "not v1" feature gate here in the v1 module??
There was a problem hiding this comment.
I thought it was saying that that if v1 is not being used allow this as dead code, which seemed logical to me. But I am not really familiar with rust conventions on how to handle features that aren't used.
b8a35e8 to
b0d967f
Compare
Roger that, just removed |
|
@spacebear21 if our current targets will use this interface and they support v1, we need to support v1. We're have a clean cut point for v1 support but we haven't hit it yet and forcing it with a regression (which I think not supporting v1 on this new api + migrating e.g. bbmobile to this new api would cause) is gonna bring us bigger problems than some duplicate code will. |
|
Concept ACK. The new I would prefer if the existing callback based fns were refactored to use the newly introduced ones, since the latter are are lower level. That should reduce complexity and improve readability. |
|
Unit tests are definitely coming!
I generally agree, however the existing callback based fns are more compatible with the |
Yes. The docs have always made a point of emphasizing this library does no IO, and while technically true, it was arguably in only weaker sense than it should be, as normally it would kind of imply the library is completely agnostic to all IO considerations, but the callback API does make it harder if the callbacks need to do async IO. IMO that gap was/is a bug. We reasoned we could close this gap without technically breaking semver compatibility as it strictly adds API surface. Since there was no concrete need at the time, that meant it didn't need to be in the 1.0 milestone. I think there is some benefit in frontloading this refactor (i.e. not putting off to a followup PR), which is that dogfooding this the new non-blocking API will help it be narrower (by finding good answers things like the As an example of something I'm hoping such dogfooding of the API can catch, the allocation of the temporary hashmap in the non blocking ownership check may preclude its use on constrained hardware devices where allocation needs to be avoided. Finding a good type signature which is intuitive and easy to use, and which allows avoiding heap allocations would allow avoid a situation in the future where we might need two versions of this functionality, one which avoids allocations and one which is required for backwards compatibility. |
b0d967f to
7a0c45f
Compare
|
PR has been updated to reflect a lot (but not all) of the feedback. Still very preliminary but just wanted to get this out there to verify I'm on the right track. Things updated:
Things not updated:
Please let me know if there is anything that is glaringly wrong. I will continue to refactor and work on the items that have not yet been updated in the next few days. |
| fn from(e: ProtocolError) -> Self { Error::Protocol(e) } | ||
| } | ||
|
|
||
| impl From<ImplementationError> for Error { |
There was a problem hiding this comment.
this extends the public API, and since ImplementationError has some From impls (which arguably also shouldn't be there) effectively allows any &str or Box<dyn Error + Send + sync>) to be converted to an Error
@chavic IIRC you have been auditing and removing some of these to remove unintended pub functionality, thoughts?
the only thing that stands out at the moment is that there is a lot of new |
Coverage Report for CI Build 26348330495Coverage increased (+0.2%) to 85.339%Details
Uncovered Changes
Coverage Regressions6 previously-covered lines in 2 files lost coverage.
Coverage Stats
💛 - Coveralls |
d6102ed to
837ff11
Compare
0xZaddyy
left a comment
There was a problem hiding this comment.
Hi @xstoicunicornx solid work on the implementations, just little observations and i have suggested some changes. also i noticed some new validation methods doesn't have some error handling like finalize_signed_proposal
837ff11 to
39561c8
Compare
|
Thanks for taking a look @0xZaddyy ! All these errors will be going away with the updated implementation however I definitely agree with your suggestions and will keep that error format in mind moving forward. |
39561c8 to
989b5cf
Compare
|
PR is getting much closer to getting out of draft state. Got comments and tests refined with this latest push. Some items I would like feedback on:
Originally I had scoped the FFI bindings to be part of this PR but at this point I think it might be better to open a separate PR for that if no one has any objections. Please let me know what questions or other feedback you all have. |
989b5cf to
13cb21b
Compare
There was a problem hiding this comment.
Can't speak for everyone but I have some thought on your questions.
Currently the get_inputs_owned_validator is returned as a Result due to the way that the input's previous output script is looked up, is there a better way to grab the script so that doesn't require returning Result?
Seeing the InputsOwnedValidator::new() I think its reasonable to use a Result.
The run_async method on the base Validator runs the callback sequentially rather than concurrently, is there a way to run concurrently without requiring additional dependency? Is it important for this to run concurrently?
I think without using something like futures::join_all the readability would just nosedive and I think it would not be worth the nicety of having it concurrent.
Does the visibility of the validator components look alright?
I think that we either have to choose these to be internal and the below methods to be the public wrappers or make all the new new()'s at least somewhat consistent as pub
Should get_inputs/outputs_owned/seen_validator methods be kept? They are basically just wrappers for the validator constructors.
I don't think I have a complete answer but I think just removing the wrappers in favor of making the new impls pub fn makes more sense to me. What does putting them in pub wrappers give us exactly?
The existing check_inputs_not_owned and identify_receiver_outputs take a callback that has &Script as an argument but the new run and run_async methods in the InputsOwnedValidator and OutputsOwnedValidator take a callback that has &ScriptBuf as an argument, is this inconsistency alright? I only did this because I could not store Script type within the Validator without wrapping it in a heap allocated type like Box.
Is there a real benefit to keeping the &Script consistent vs accepting &ScriptBuf if we would still need it to be different by being in a Box<>
|
I have reimplemented this with async callbacks instead. See PR #1546 . As I was working through implementing the bindings of this PR @spacebear21 prompted me to rethink this approach as the added complexity of introducing the Additionally, the only reason I had pursued this implementation pattern of extracting the item to be validated and then returning the validation result to advance the state machine was because I had seen the |
c572359 to
af81bd7
Compare
e60abbf to
0dde47a
Compare
Cover the success and error paths for each validation step in the receiver's OriginalPayload's broadcast suitability, input ownership, and input known validation as well as PsbtContext's proposal finalization.
Restructure match arms in the v2 receiver typestates to return directly from each branch instead of binding an intermediate value and returning after the match. Also rename a shadowed `inner` binding to `payjoin_psbt` for clarity.
Consolidate standalone receiver processing functions into a ReceiverProcessor class that encapsulates the payjoin module, RPC client, and persister. Fix the PJ helper type to use prototype inference, update the web import paths from src to dist, and correct the CheckInputsNotSeenCallback parameter type.
Introduce an implementation-agnostic interface for receiver typestates that currently require callback-based validation to advance. Previously, each validation step demanded a synchronous closure, coupling the state machine to the caller's execution model. This made integration difficult for wallets where signing, broadcast checks, or ownership lookups are asynchronous or handled by a separate process. Each callback-based transition is now split into a two-phase pattern: a method to extract the data that needs checking (get_*_refs, extract_tx_*, psbt_to_sign) and a corresponding method to submit results and advance the state (apply_*_checks, apply_broadcast_suitability, finalize_signed_proposal). A lightweight Reference/TaggedReference framework with typed tags ensures completeness and ordering of the submitted checks at runtime. This applies across v1 and v2 receiver flows, including input ownership, input-seen, output ownership, broadcast suitability, proposal finalization, and transaction monitoring. The original closure-based methods are preserved as convenience wrappers over the new API, so this is backward-compatible for existing integrators.
Expose the two-phase validation API from the previous commit through the FFI bindings layer. Update integration tests in C#, Dart, JavaScript, and Python to exercise both callback and nonblocking transition modes.
Migrate both v1 and v2 receiver flows in payjoin-cli from the callback-based validation API to the two-phase extract/apply non-blocking API.
0dde47a to
38b475e
Compare
|
I think these commits should be moved out into separate PRs as they do not align with the main goal of this PR and adds noise for the reviewer: |
I agree, the only reason I left them in is because they are, generally speaking, not really necessary. The only time they become something that is nice to have is in preparation for this PR which touches very essential and prominent parts of the payjoin receiver state machine. Another way of saying it is, these were changes that originally were going to be smushed in with relevant commits but splitting them out seemed more readable for reviewers. If others agree with Evan I'd be happy to submit as separate PR. |
|
Based on the comments it seems like you can split those commits out ahead of time to ease review. Sounds like a good idea to me. |
Summary
Currently some states (UncheckedOriginalPayload, MaybeInputsOwned, MaybeInputsSeen, OutputsUnknown, ProvisionalProposal) require the usage of synchronous callbacks to transition to the next state in the payjoin state machine. This is inconvenient for languages which required the use of asynchronous calls for validation, for example when calling the bitcoin rpc, which use the payjoin FFI language bindings. While there are some workarounds that can be used to adequately validate the state machine transitions (except for UncheckedOriginalPayload) when relying on asynchronous validation, these are a bit unwieldily to use. Instead, this PR implements an asynchronous compatible interface that makes it more straightforward to validated the state machine transitions with asynchronous calls.
Background
I was attempting to prove out the Javascript language bindings by implementing a Node version of the payjoin-cli and opened issue #1389 after encountering some of the challenges with using asynchronous validation callbacks. It was revealed to me that this was a known pain point and there was a solution proposed by @arminsabouri of creating new state transition methods that accept the result of a validation call. This is my attempt at implementing this solution.
Any and all feedback is welcome! This is currently in draft status as I am still working on testing and the language bindings.
AI Disclosure
Unit tests were written with the help of Claude AI.
Pull Request Checklist
Please confirm the following before requesting review:
AI
in the body of this PR.