Skip to content

Conversation

@cpetig
Copy link
Contributor

@cpetig cpetig commented Jan 14, 2026

This needed a bit more of file infrastructure than I anticipated. The code in wasip3_file_utils.c needs more work to set the proper errno when writes/reads fail, but for stdout/in it should be good for a first step.

Copy link
Collaborator

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! I'll caution that the test coverage of this repository is very light so I'm going to try to pick these apart in terms of review to try to make them as correct as I can think. In that sense it might be good to expand the test suite over time, but what I'm talking about may also be covered by other tests over time as they get enabled, unsure.

One high-level point though is that reads/writes of streams don't have a default way to communicate an error if you're working with just the stream. Conventionally in wasip3 this means that the error information is always a side-channel of sorts via a future or a task or something like that. Whenever a read or write indicates that the stream is closed we'll want to then block waiting on the resolution of the subtask in question. That'll carry error information with it and indicate if the prior read failed, why it failed, etc.

I'm not sure how best to organize this in wasi-libc in terms of whether we want all this in read.c/write.c, most of it in a helper, or what. I do think a lot of this is going to want to be shared with pipes/tcp eventually so we'll want to keep that in mind to ensure that there's minimal duplication across these files.

// TODO(wasip3)
errno = ENOTSUP;
return -1;
return __wasilibc_read3(fildes, buf, nbyte);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the general style in wasi-libc is to have the implementation of this function inlined here (since I think this will forever be the only caller, right?)

Some helpers I could see being reasonable to put elsewhere, but for the main body of the implementation that seems like it'll be read-specific so I think it's reasonable to inline here (and in write below)

Although I'll be honest in terms of style I'm just winging it I don't have a strong rhyme or reason one way or the other. Keeping things as "one symbol per file and as few deps on other files as possible" seems to be the general ethos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping to factorize out common parts between read and write and having both in a single file (file_util) helps with that, previously I had considerable code inlined here and in write.c.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah if there's a large shared body of code between read/write definitely worth splitting out. I'd imagine though that ABI-wise there's details of read/write that aren't shared which would make sense to go here (vs having this function effectively just tail-call another function), but I'm also ok waiting to see how the dust settles in wasip3 to see what the best organization is

Comment on lines +46 to +49
/// Start an asynchronous read or write, returns zero on success.
/// Stores the waitable, status and offset location.
int (*read3)(void*, void *buf, size_t nbyte, waitable_t *waitable, wasip3_waitable_status_t *out, off_t**);
int (*write3)(void*, void const *buf, size_t nbyte, waitable_t *waitable, wasip3_waitable_status_t *out, off_t**);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to make this look more like get_{read,write}_stream above? Basically where these are in theory simple accessors for the underlying stream and that's it. That way the implementation of read isn't split up across this callback and the wasip3_file_utils.c file (and in the future I think would help deduplicate for, e.g., pipes and tcp streams

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously I just returned a stream_u8_t, then I found that each subsystem has its own name for stream_write (filesystem_, sockets_, stdin_, ...) and moved the write call into the descriptor specific code as well. I don't think it makes a difference, but it just feels wrong to call filesystem_stream_u8_read on a tcp socket.

And as this interface works for read, write and poll I thought that it might be the right level of abstraction.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok that makes sense. Given that those are wit-bindgen-isms I think it'd be reasonable to have something like #define filesystem_... stream_u8... with a comment explaining why and that should be sufficient to have a shared stream utility which doesn't look like it's tied to any one interface

off_t **offs) {
stdio3_t *stdio = (stdio3_t *)data;
if (!stdio->input) {
errno = EBADF;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For wasip2 this returns errno = EOPNOTSUPP; which might be more appropriate here? The fd is valid just of the wrong kind

Copy link
Contributor Author

@cpetig cpetig Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I just checked with the Linux manual page for write which documented EBADF. EOPNOTSUPP makes sense to me as well.

Comment on lines +103 to +107
stdin_tuple2_stream_u8_future_result_void_error_code_t stdin;
stdin_read_via_stream(&stdin);

if (!terminal_stdin_get_terminal_stdin(&stdio->terminal_in))
stdio->terminal_in.__handle = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this perhaps be more of a lazy initialization similar to how wasip2 works? I felt that worked out pretty well and helps reduce runtime dependencies to a bare minimum where possible


typedef struct {
stdin_stream_u8_t input;
stdin_future_result_void_error_code_t future;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps input_future as a name to clearly connect it to the input variable? Also mind adding some //-style comments explaining what these fields are and how the futures/streams/tasks all relate?

}
stdin_stream_u8_t read_side = stdin_stream_u8_new(&stdio->stdout);
wasip3_subtask_status_t res = (*func)(read_side, &stdio->stdout_result);
stdio->stdout_task = WASIP3_SUBTASK_HANDLE(res);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This'll want to handle the case that res wasn't actually "pending" to mean that the stdout stream is closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next write will pick up the error anyway, do you think the open should fail immediately? Wouldn't init returning -1 render the file descriptor infrastructure broken (at least the first open will fail in descriptor_table_insert, IIRC).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that returning -1 shouldn't happen, but handling it in this case (IMO) should effectively move stdout into a state of "it's closed, don't try again". I'd imagine that this would trigger writes to start returning EPIPE (or maybe 0) or something like that, but basically the case that the function returns immediately should be used to update the internal structure vs returning an error

@cpetig
Copy link
Contributor Author

cpetig commented Jan 15, 2026

Thanks for this! I'll caution that the test coverage of this repository is very light so I'm going to try to pick these apart in terms of review to try to make them as correct as I can think. In that sense it might be good to expand the test suite over time, but what I'm talking about may also be covered by other tests over time as they get enabled, unsure.

I support the thankless task to keep high quality standards on open source, even though it clearly means more (early) work for me within this MR!

I will implement my idea for proper error handling to files (there should a proper check on a file API test) on this MR, but it might take me some days. Similarly for factoring code out of read+write.

Thanks for all your work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants