Run plot comms on the R thread by lionel- · Pull Request #1100 · posit-dev/ark

lionel- · 2026-03-10T18:05:53Z

Progress towards #689
Branched from #1099
Addresses posit-dev/positron#12825 (planning to close)

I did not foresee all the simplifications falling out of this one!

Plots no longer live in their own threads; they live on the R thread. This frees up memory: each background thread takes about 2 MB of stack space (depending on platform), so 20 open plots would take about 40 MB.
DeviceContext moves from a standalone thread_local! into a regular field of Console. All access goes through Console::get().device_context(). The CommHandlerContext provides access to Console via ctx.console(), so plot handlers reach the device context through a controlled context chain.
The old idle-time polling loop (process_rpc_requests with crossbeam::Select) is removed. Plot RPCs are now dispatched synchronously through the shell handler on the R thread. This improves plot latency when pre-renderings require adjustments, because we no longer poll for plot updates on a timeout (related to positron#5184, "Performance for plotting").
The GraphicsDeviceNotification async channel is removed. Since the UI comm now runs on the R thread (from #1099, "Run UI comm on the R thread"), prerender settings are updated synchronously via Console::get().device_context().set_prerender_settings(). This removes quite a bit of plumbing.
Fixes a dev.hold() regression I introduced with pre-renderings (#775, "Send pre-renderings of new plots to the frontend"). We were previously sending pre-renderings unconditionally; now they are held until dev.flush(). This is tested by new integration tests.
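
The hold behaviour in the last item can be sketched as follows. This is a minimal model under stated assumptions, not Ark's implementation: `HoldState` and its methods are hypothetical names, and the hold is modelled as a nesting counter in the spirit of R's dev.hold() and dev.flush().

```rust
/// Hypothetical model of a device that withholds pre-renderings while held.
#[derive(Default)]
struct HoldState {
    /// Nesting level: incremented by `hold()`, decremented by `flush()`.
    level: u32,
    /// Whether the plot changed while the device was held.
    pending: bool,
    /// Stand-in for the pre-renderings sent to the frontend.
    sent: Vec<&'static str>,
}

impl HoldState {
    fn hold(&mut self) {
        self.level += 1;
    }

    /// Releases one hold level and sends any withheld pre-rendering
    /// once the level is back to zero.
    fn flush(&mut self) {
        self.level = self.level.saturating_sub(1);
        if self.level == 0 && self.pending {
            self.pending = false;
            self.sent.push("pre-render");
        }
    }

    /// Called whenever the plot changes. Sends right away unless held.
    fn on_plot_changed(&mut self) {
        if self.level > 0 {
            self.pending = true;
        } else {
            self.sent.push("pre-render");
        }
    }
}
```

In this model, several changes made between `hold()` and the matching `flush()` result in a single pre-rendering, sent at flush time.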

DavisVaughan

I have paused on review until we can get these regressions resolved

crates/ark/tests/plots.rs

crates/ark/src/console/console_repl.rs

DavisVaughan · 2026-03-16T20:13:06Z

crates/ark/src/console/console_repl.rs

        dap: Arc<Mutex<Dap>>,
        session_mode: SessionMode,
    ) -> Self {
+        let device_context = DeviceContext::new(iopub_tx.clone());


It makes sense to me to remove iopub_tx from DeviceContext now and only pull iopub_tx directly from Console

Over in graphics_device.rs, you can just Console::get().iopub_tx() anytime you need it now!

It feels like this better scopes responsibilities to me

I feel the opposite. It hides that the device context directly communicates with the frontend via IOPub.

Also ideally we should avoid Console::get() calls as these bypass Rust's ownership model. But then the issue is that the device belongs to console and not the other way around. Cloning the IOPub channel is the right ownership structure from that point of view.

It hides that the device context directly communicates with the frontend via IOPub

Hmm I see what you mean

Like the DeviceContext struct should have an "include what you use" kind of policy in the struct fields, and by not having iopub_tx in there, we mask what it uses

DavisVaughan · 2026-03-16T20:15:46Z

crates/ark/src/ui/ui_comm.rs

+        Console::get()
+            .device_context()
+            .set_prerender_settings(params.settings);


Is it possible that we could now make DeviceContext have non-Cell internals?

I am disturbed by being able to call get() to then perform a mutable update, rather than being forced to go through get_mut()

The Cell is actually more reassuring, especially regarding the possibility of reentrancy (flushing calls process_changes which can recurse into graphics device callbacks).

In general get() is not a signal that the state is not mutable, it's only a signal that the state is shared, and can still be mutable e.g. through Cells or other mechanisms.

It's all a bit hand-wavy considering that our callbacks might produce multiple active get_mut() at the same time, but adding to the unsoundness would be... unsound ;)
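
To illustrate the point about shared versus mutable access, here is a minimal sketch using `std::cell::Cell`. The `Settings` and `Context` types and the `flush` method are hypothetical stand-ins, not Ark's types. The update goes through `&self`, and a callback that re-enters the context during `flush` cannot invalidate anything because `Cell` only hands out copies.

```rust
use std::cell::Cell;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Settings {
    width: u32,
    height: u32,
}

/// Hypothetical stand-in for a device context with `Cell` internals.
struct Context {
    prerender_settings: Cell<Settings>,
}

impl Context {
    fn new() -> Self {
        Self {
            prerender_settings: Cell::new(Settings { width: 640, height: 480 }),
        }
    }

    /// Takes `&self`: the state is shared, yet still mutable via `Cell`.
    fn set_prerender_settings(&self, settings: Settings) {
        self.prerender_settings.set(settings);
    }

    fn prerender_settings(&self) -> Settings {
        self.prerender_settings.get()
    }

    /// Runs `callback` while `self` is in use, as a reentrant graphics
    /// callback might. Returns the settings before and after the call.
    fn flush(&self, callback: impl Fn(&Context)) -> (Settings, Settings) {
        let before = self.prerender_settings.get(); // a copy, not a borrow
        callback(self);
        (before, self.prerender_settings.get())
    }
}
```

With `&mut self` methods instead, the reentrant call inside `flush` would need a second mutable reference to the same value, which safe Rust does not allow.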

DavisVaughan · 2026-03-16T20:30:22Z

crates/ark/src/plots/graphics_device.rs

-        // Don't try to render a plot if we're currently drawing.
-        if self.is_drawing.get() {
-            log::trace!("Refusing to render due to `is_drawing`");
-            return;
-        }
-
-        // Don't try to render a plot if someone is asking us not to, i.e. `dev.hold()`
-        if !self.should_render.get() {
-            log::trace!("Refusing to render due to `should_render`");
-            return;
-        }


I am pretty sure these are important

Since we now process the requests when R is idle, is_drawing is structurally false: we're never mid-draw.

Regarding should_render (matching dev.hold()), it's true that holds can span across prompts. I can see that happening when developing a plot function interactively. This would require:

Creating a plot that Positron knows about (opened comm)

Calling dev.hold() in the console

Resizing the plot in the frontend UI

So that's a very narrow case. We show the uncommitted visual changes in that case, which seems like a reasonable compromise. The alternative would be to refuse to render, producing a distorted plot in the frontend after resize.

crates/ark/src/plots/graphics_device.rs

Better reflects the borrowing model

lionel- · 2026-04-03T12:23:38Z

I also fixed a race between comm_open and subsequent comm_msg messages: the tests revealed that it was possible to receive the latter before the former. That's because the Shell thread is in charge of sending comm_open on IOPub, so we had a triangular configuration where the R thread and the Shell thread were racing to send their messages on IOPub.

To fix this, I made comm opening blocking. There is a new CommEvent::Barrier variant to support this. On the Shell side, the shell thread now uses a select loop that drains comm events while a request is in flight.

A nice consequence of this is that the frontend now receives new plots as soon as they are created, rather than at the end of a loop:

for (i in 1:3) {
  plot(i)
  Sys.sleep(1)
}
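
The blocking comm opening described above can be sketched roughly as follows. This is a simplified model, not the PR's code: it uses `std::sync::mpsc` instead of crossbeam, the `CommEvent` payloads are assumptions, and `run` is a hypothetical helper that plays both the R thread and the Shell thread.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Events sent from the R thread to the Shell thread (assumed shapes).
enum CommEvent {
    Opened(String),
    /// Signalled once every earlier event has been processed.
    Barrier(Sender<()>),
}

/// Runs the scenario and returns messages in the order IOPub saw them.
fn run() -> Vec<String> {
    let (iopub_tx, iopub_rx) = channel::<String>();
    let (event_tx, event_rx) = channel::<CommEvent>();

    // Stand-in for the Shell thread, which publishes `comm_open`.
    let shell_iopub_tx = iopub_tx.clone();
    let shell = thread::spawn(move || {
        for event in event_rx {
            match event {
                CommEvent::Opened(id) => {
                    shell_iopub_tx.send(format!("comm_open {id}")).unwrap();
                },
                CommEvent::Barrier(done_tx) => {
                    done_tx.send(()).unwrap();
                },
            }
        }
    });

    // Stand-in for the R thread: open the comm, wait, then send an update.
    event_tx.send(CommEvent::Opened("plot-1".to_string())).unwrap();
    let (done_tx, done_rx) = channel();
    event_tx.send(CommEvent::Barrier(done_tx)).unwrap();
    done_rx.recv().unwrap(); // blocks until Shell has handled `Opened`
    iopub_tx.send("comm_msg plot-1".to_string()).unwrap();

    // Close the channels so both loops terminate.
    drop(event_tx);
    shell.join().unwrap();
    drop(iopub_tx);
    iopub_rx.iter().collect()
}
```

Without the barrier, the R thread could queue its `comm_msg` before the Shell thread had queued `comm_open`, which is the race described above.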

I also noticed that posit-dev/positron#6736 is broken again.
I've opened posit-dev/positron#12825 for this.

And of course this is all now under test coverage.

DavisVaughan

Two main comments to discuss, other things are minor:

Not sure about passing console through handle_msg and friends
Not sure about CommEvent::Barrier vs passing done through CommEvent::Opened

See below for more on both

My test file seems to be working fine now

DavisVaughan · 2026-04-03T16:57:40Z

crates/amalthea/src/socket/shell.rs

+            let ready = sel.ready();
+
+            while let Ok(event) = self.comm_event_rx.try_recv() {
+                self.process_comm_event(event);
+            }
+
+            if ready == resp_idx {
+                break response_rx.recv().unwrap();
+            }


So even if sel.ready() returns the execute response as the first "ready" thing, you want to go off and process the comm_events first?

I think I was expecting to see

if ready == resp_idx { break response_rx.recv().unwrap(); }

right after let ready =
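
To make the two orderings concrete, here is a small sketch of what each does when comm events are already queued at the moment the response becomes ready. It does not settle the question. It polls with `try_recv` on `std::sync::mpsc` channels as a stand-in for the crossbeam select loop, and all function names are hypothetical.

```rust
use std::sync::mpsc::{channel, Receiver};

type Rx = Receiver<&'static str>;
type Outcome = (Vec<&'static str>, &'static str);

/// Mirrors the diff: note readiness, drain comm events, then break.
fn drain_then_break(comm_rx: &Rx, resp_rx: &Rx) -> Outcome {
    let mut handled = Vec::new();
    loop {
        let response = resp_rx.try_recv().ok();
        while let Ok(event) = comm_rx.try_recv() {
            handled.push(event);
        }
        if let Some(response) = response {
            return (handled, response);
        }
        std::thread::yield_now();
    }
}

/// The alternative: return as soon as the response is ready.
fn break_first(comm_rx: &Rx, resp_rx: &Rx) -> Outcome {
    let mut handled = Vec::new();
    loop {
        if let Ok(response) = resp_rx.try_recv() {
            return (handled, response);
        }
        while let Ok(event) = comm_rx.try_recv() {
            handled.push(event);
        }
        std::thread::yield_now();
    }
}

/// Queues two comm events and a response, then counts how many comm
/// events `wait` handled before it returned the response.
fn events_handled_before_reply(wait: fn(&Rx, &Rx) -> Outcome) -> usize {
    let (comm_tx, comm_rx) = channel();
    let (resp_tx, resp_rx) = channel();
    comm_tx.send("comm_open").unwrap();
    comm_tx.send("comm_msg").unwrap();
    resp_tx.send("execute_reply").unwrap();
    let (handled, _reply) = wait(&comm_rx, &resp_rx);
    handled.len()
}
```

In this model, `drain_then_break` handles both queued events before returning the reply, while `break_first` returns the reply and leaves them in the channel for later.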

DavisVaughan · 2026-04-03T17:21:51Z

crates/ark/src/console/console_comm.rs

+        // Block until Shell has processed the Opened event, ensuring the
+        // `comm_open` message is on IOPub before we return. Any updates


Just a gut check, but is this really what it ensures? In CommEvent::Opened we do an iopub_tx.send() call, but that doesn't necessarily mean that IOPub has received it by the end of CommEvent::Opened and the following CommEvent::Barrier event is processed.

I'm not sure if that matters.
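
The distinction between "sent" and "received" can be shown with a plain channel. This sketch uses `std::sync::mpsc` as a stand-in for the IOPub channel, and `demo` is a hypothetical helper. A successful `send()` only means the message is queued; nothing is handled until the receiving side runs. What the queue does preserve is the order of the sends.

```rust
use std::sync::mpsc::channel;

/// Returns how many messages had been handled right after the sends
/// returned, and the order in which they were eventually handled.
fn demo() -> (usize, Vec<&'static str>) {
    let (iopub_tx, iopub_rx) = channel();
    let r_thread_tx = iopub_tx.clone();
    let mut handled = Vec::new();

    // Both sends return immediately; the messages are only queued.
    iopub_tx.send("comm_open").unwrap();
    r_thread_tx.send("comm_msg").unwrap();
    let handled_after_send = handled.len();

    // Only when the receiving side runs are the messages handled,
    // in the order in which they were queued.
    drop(iopub_tx);
    drop(r_thread_tx);
    for msg in iopub_rx {
        handled.push(msg);
    }
    (handled_after_send, handled)
}
```

Under this model, a barrier can ensure that `comm_open` is queued ahead of later messages, which may be all that is needed for ordering, but it says nothing about when the receiver processes it.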

DavisVaughan · 2026-04-03T17:25:22Z

crates/ark/src/console/console_repl.rs

+    #[cfg(feature = "testing")]
+    pub fn test_init() {


Can you put this up with start() and call it test_start()?

DavisVaughan · 2026-04-03T17:40:43Z

crates/ark/src/data_explorer/r_data_explorer.rs

    }

-    fn handle_msg(&mut self, msg: CommMsg, ctx: &CommHandlerContext) {
+    fn handle_msg(&mut self, msg: CommMsg, ctx: &CommHandlerContext, _console: &Console) {


I'm not sure I agree with passing console through all these handlers

It seems like the only current use that I can see is in impl CommHandler for PlotComm for handle_msg() and handle_close().

But those could just do Console::get().

It doesn't seem like it helps clarify ownership very much, because so many other things in graphics_device.rs also just do Console::get() already. And it seems likely that in handle_rpc() (called by handle_msg()) we could very easily just do another Console::get() rather than passing through the &Console we got from handle_msg(). It seems complex to me to know when to use which pattern.

It also seems like this _console: &Console argument is the reason for all of the new test_init() infra, and overall this just seems like it adds quite a bit of complexity for (to me) not a lot of obvious benefit 😢

As an example, should on_did_execute_request() gain a console: &Console argument and thread it through process_changes -> process_new_plot -> should_use_dynamic_plots to the Console::get call there?

I don't think so, I feel like it makes sense that we just invoke Console::get() from there as required like we do everywhere else.

But now introducing this new pattern of handle_msg() and friends having a console passed through confuses me and makes me question how we should do things in all the other cases, and I'm not sure that's worth it

DavisVaughan · 2026-04-03T18:03:04Z

crates/ark/src/plots/graphics_device.rs

@@ -1271,11 +1137,10 @@ pub(crate) fn on_execute_request(
 #[tracing::instrument(level = "trace", skip_all)]
 pub(crate) fn on_did_execute_request() {


I think on_did_execute_request() should perhaps be a &self method

It is only called in handle_active_request(), but we have Console there as self, hence we also have DeviceContext, so we can just call self.device_context().on_did_execute_request() from there?

But I am also a bit weirded out by the ownership implications of all this (both the current state of things and my proposal)

self.device_context().on_did_execute_request() would call DeviceContext::process_changes(), but that goes through process_new_plot() and should_use_dynamic_plots() and you end up accessing the "parent" owner of DeviceContext from there via Console::get(), which feels very mind boggling!

DavisVaughan · 2026-04-03T18:11:22Z

crates/ark/src/plots/graphics_device.rs

-        // Save our new socket.
-        // Refcell Safety: Short borrows in the file.
-        self.sockets.borrow_mut().insert(id.clone(), socket);
+        match Console::get_mut().comm_open_backend(PLOT_COMM_NAME, Box::new(plot_comm)) {


I think this is another place where it feeeels weird that DeviceContext is a field of Console, yet refers to Console as well

DavisVaughan · 2026-04-03T18:16:40Z

crates/ark/src/console.rs

 }

-pub(crate) struct Console {
+pub struct Console {


DavisVaughan · 2026-04-03T18:26:55Z

crates/ark/tests/plots.rs

@@ -699,3 +699,481 @@ fn test_plot_default_size_without_metadata() {
    frontend.recv_iopub_idle();


Random note that a cargo build --release is throwing this for me right now

    warning: unused import: `libr::SEXP`
      --> crates/ark/src/r_task.rs:19:5
       |
    19 | use libr::SEXP;
       |     ^^^^^^^^^^

I think this type is only used in debug builds?
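
If the type really is used only in debug builds, one common way to silence the warning is to gate the import with the same `cfg` as its use. The sketch below is an assumption about the shape of a fix, not the change made in this PR: it uses `std::collections::HashSet` as a stand-in for the real import, and `all_unique` is a hypothetical function.

```rust
// Stand-in for the real import: only compiled into debug builds.
#[cfg(debug_assertions)]
use std::collections::HashSet;

/// Debug builds: performs the check, which needs the gated import.
#[cfg(debug_assertions)]
fn all_unique(values: &[i32]) -> bool {
    let mut seen = HashSet::new();
    values.iter().all(|v| seen.insert(*v))
}

/// Release builds: the check is compiled out, so the import is not needed.
#[cfg(not(debug_assertions))]
fn all_unique(_values: &[i32]) -> bool {
    true
}
```

Since `cargo build --release` disables `debug_assertions` by default, the import and its only user disappear together and the warning goes away.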

DavisVaughan · 2026-04-03T18:29:51Z

crates/ark/tests/plots.rs

    frontend.recv_shell_execute_reply();
 }
+
+/// Test that `dev.hold()` suppresses intermediate plot output.


It looks like plot(lm(disp ~ drat, data = mtcars)) is working again, but I don't see a backend test for it?

lionel- · 2026-04-03T22:03:52Z

Ok let's discuss Console access by comms in our next sync

lionel- force-pushed the task/sync-plot-comm branch 2 times, most recently from 34baa70 to 845b4a7 Compare March 11, 2026 07:08

lionel- requested a review from DavisVaughan March 11, 2026 13:47

DavisVaughan requested changes Mar 16, 2026

lionel- force-pushed the task/sync-ui-comm branch 3 times, most recently from bd54059 to be02717 Compare March 20, 2026 06:19

Base automatically changed from task/sync-ui-comm to main March 20, 2026 06:50

lionel- force-pushed the task/sync-plot-comm branch 2 times, most recently from 7386726 to be3d85c Compare March 31, 2026 13:24

lionel- mentioned this pull request Mar 31, 2026

Add inline data explorers to Ark #1124

Merged

lionel- added 14 commits April 3, 2026 10:39

Run plot comms synchronously on the R thread

f679f8e

Fix dev.hold() issue introduced by pre-renderings

95c50aa

Move device context to Console

c9d5433

Set Prerender settings synchronously from the UI handler

36d8348

Fix message interpolation

8d42daa

Update stale comment

22b4b29

Address code review

78ae528

Pass Console reference to handlers directly

8a80e3a

Better reflects the borrowing model

Add comment about the need for Cell

10c7680

Remove unneeded warmup

bafa67e

Always record plot even if withheld

bc636c5

Ignore UI comm busy events by default in tests

e65e886

Fix synchronisation between comm opening and plot updates

720d592

Fix semantic conflicts from rebase

74abf66

lionel- force-pushed the task/sync-plot-comm branch from be3d85c to 74abf66 Compare April 3, 2026 09:00

Fix workspace inheritance

49a6715

lionel- requested a review from DavisVaughan April 3, 2026 12:13

DavisVaughan approved these changes Apr 3, 2026
