Skip to content

[pull] main from facebook:main#173

Merged
pull[bot] merged 3 commits into
code:mainfrom
facebook:main
Jun 3, 2025
Merged

[pull] main from facebook:main#173
pull[bot] merged 3 commits into
code:mainfrom
facebook:main

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented Jun 3, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

sebmarkbage and others added 3 commits June 3, 2025 11:30
Alternative to #33421. The difference is that this also adds an
underscore between the "R" and the ID.

The reason we wanted to use special characters is because we use the
full spectrum of A-Z 0-9 in our ID generation so we can basically
collide with any common word (or anyone using a similar algorithm,
base64 or even base16). It's a little less likely that someone would put
`_R_` specifically unless you generate like two IDs separated by
underscore.


![9w2ogt](https://github.com/user-attachments/assets/21b2d2ac-1a3a-4657-ba0b-1616e49dfdee)
This lets us track what data each Server Component depended on. This
will be used by Performance Track and React DevTools.

We use Node.js `async_hooks`. This has a number of downside. It is
Node.js specific so this feature is not available in other runtimes
until something equivalent becomes available. It's [discouraged by
Node.js docs](https://nodejs.org/api/async_hooks.html#async-hooks). It's
also slow which makes this approach only really viable in development
mode. At least with stack traces. However, it's really the only solution
that gives us the data that we need.

The [Diagnostic
Channel](https://nodejs.org/api/diagnostics_channel.html) API is not
sufficient. Not only is many Node.js built-in APIs missing but all
libraries like databases are also missing. Were as `async_hooks` covers
pretty much anything async in the Node.js ecosystem.

However, even if coverage was wider it's not actually showing the
information we want. It's not enough to show the low level I/O that is
happening because that doesn't provide the context. We need the stack
trace in user space code where it was initiated and where it was
awaited. It's also not each low level socket operation that we want to
surface but some higher level concept which can span a sequence of I/O
operations but as far as user space is concerned.

Therefore this solution is anchored on stack traces and ignore listing
to determine what the interesting span is. It is somewhat
Promise-centric (and in particular async/await) because it allows us to
model an abstract span instead of just random I/O. Async/await points
are also especially useful because this allows Async Stacks to show the
full sequence which is not supported by random callbacks. However, if no
Promises are involved we still to our best to show the stack causing
plain I/O callbacks.

Additionally, we don't want to track all possible I/O. For example,
side-effects like logging that doesn't affect the rendering performance
doesn't need to be included. We only want to include things that
actually block the rendering output. We also need to track which data
blocks each component so that we can track which data caused a
particular subtree to suspend.

We can do this using `async_hooks` because we can track the graph of
what resolved what and then spawned what.

To track what suspended what, something has to resolve. Therefore it
needs to run to completion before we can show what it was suspended on.
So something that never resolves, won't be tracked for example.

We use the `async_hooks` in `ReactFlightServerConfigDebugNode` to build
up an `ReactFlightAsyncSequence` graph that collects the stack traces
for basically all I/O and Promises allocated in the whole app. This is
pretty heavy, especially the stack traces, but it's because we don't
know which ones we'll need until they resolve. We don't materialize the
stacks until we need them though.

Once they end up pinging the Flight runtime, we collect which current
executing task that pinged the runtime and then log the sequence that
led up until that runtime into the RSC protocol. Currently we only
include things that weren't already resolved before we started rendering
this task/component, so that we don't log the entire history each time.

Each operation is split into two parts. First a `ReactIOInfo` which
represents an I/O operation and its start/end time. Basically the start
point where it was start. This is basically represents where you called
`new Promise()` or when entering an `async function` which has an
implied Promise. It can be started in a different component than where
it's awaited and it can be awaited in multiple places. Therefore this is
global information and not associated with a specific Component.

The second part is `ReactAsyncInfo`. This represents where this I/O was
`await`:ed or `.then()` called. This is associated with a point in the
tree (usually the Promise that's a direct child of a Component). Since
you can have multiple different I/O awaited in a sequence technically it
forms a dependency graph but to simplify the model these awaits as
flattened into the `ReactDebugInfo` list. Basically it contains each
await in a sequence that affected this part from unblocking.

This means that the same `ReactAsyncInfo` can appear in mutliple
components if they all await the same `ReactIOInfo` but the same Promise
only appears once.

Promises that are only resolved by other Promises or immediately are not
considered here. Only if they're resolved by an I/O operation. We pick
the Promise basically on the border between user space code and ignored
listed code (`node_modules`) to pick the most specific span but abstract
enough to not give too much detail irrelevant to the current audience.
Similarly, the deepest `await` in user space is marked as the relevant
`await` point.

This feature is only available in the `node` builds of React. Not if you
use the `edge` builds inside of Node.js.

---------

Co-authored-by: Sebastian "Sebbie" Silbermann <silbermann.sebastian@gmail.com>
Stacked on #33388.

This encodes the I/O entries as their own row type (`"J"`). This makes
it possible to parse them directly without first parsing the debug info
for each component. E.g. if you're just interested in logging the I/O
without all the places it was awaited.

This is not strictly necessary since the debug info is also readily
available without parsing the actual trees. (That's how the Server
Components Performance Track works.) However, we might want to exclude
this information in profiling builds while retaining some limited form
of I/O tracking.

It also allows for logging side-effects that are not awaited if we
wanted to.
@pull pull Bot added the ⤵️ pull label Jun 3, 2025
@pull pull Bot merged commit 3fb17d1 into code:main Jun 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant