Skip to content

Run teleop_ros2 in all modes in the teleop-ros2-docker test#226

Open
life1ess wants to merge 1 commit intomainfrom
sgrizan/branch1
Open

Run teleop_ros2 in all modes in the teleop-ros2-docker test#226
life1ess wants to merge 1 commit intomainfrom
sgrizan/branch1

Conversation

@life1ess
Copy link
Contributor

@life1ess life1ess commented Mar 6, 2026

Summary by CodeRabbit

  • New Features

    • Added a runtime option to enable mock operators for teleoperation demos; when enabled, the demo can load a synthetic hands controller and extend plugin search paths (including env-provided and discovered ancestor directories).
  • Chores

    • CI validation now runs per-mode checks sequentially with explicit mode logging and environment propagation; each run is constrained to a 5s timeout that is handled so other modes continue.

@life1ess life1ess self-assigned this Mar 6, 2026
@coderabbitai
Copy link

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough

Walkthrough

Replaced CI smoke-test with a per-mode Docker verification loop (four modes, 5s timeout per run). Added use_mock_operators flag to the teleop ROS2 publisher that conditionally discovers plugin directories and injects a synthetic hands PluginConfig.

Changes

Cohort / File(s) Summary
CI Workflow Update
\.github/workflows/build-ubuntu.yml
Replaced single smoke-test with an iterative loop that runs a Docker container per mode (controller_teleop, hand_teleop, controller_raw, full_body) with a 5s timeout; per-mode exit code 124 (timeout) is handled in-command and treated non-fatal so the loop continues.
Teleop publisher & plugin handling
examples/teleop_ros2/python/teleop_ros2_publisher.py
Added _find_plugins_dirs(start: Path) -> List[Path]; introduced use_mock_operators runtime flag (stored as _use_mock_operators); when enabled, collects ISAAC_TELEOP_PLUGIN_PATH and discovered plugin dirs and injects a PluginConfig for a controller_synthetic_hands into the TeleopSession. Adjusted _to_pose typing to accept Sequence[float] and optional orientation.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant CI as CI Workflow
    participant Docker as Docker Container
    participant Node as TeleopRos2PublisherNode
    participant FS as Filesystem/Env
    participant Session as TeleopSession

    CI->>Docker: run container for mode (mode, MOCK env)
    Docker->>Node: start teleop_ros2_publisher (--mode, use_mock flag)
    Node->>FS: read ISAAC_TELEOP_PLUGIN_PATH
    Node->>FS: call _find_plugins_dirs(start) -> plugin dirs
    Node->>Session: assemble plugins (include synthetic_hands if mock)
    Node->>Session: start session (plugins loaded)
    Session-->>Node: return status
    Docker-->>CI: exit code (124 handled as non-fatal)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through modes — four doors to try,
I sniffed up plugin paths beneath the sky.
CI times each hop, a gentle five-second tap,
Mock hands appear, then vanish with a clap.
Carrots, code, and tiny testing paws.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Run teleop_ros2 in all modes in the teleop-ros2-docker test' accurately describes the main change: the PR modifies the build workflow to test all teleop modes sequentially rather than just running a smoke test.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch sgrizan/branch1

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-183: No code change required: the "Verify application modes"
workflow loop correctly handles timeout exit code 124 and tests the four modes
matching the application's _TELEOP_MODES; leave the docker run line and the
escaped \$? check as-is (referencing the Verify application modes block,
matrix.ros_distro and the tested modes controller_teleop, hand_teleop,
controller_raw, full_body).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: df4d90ba-198e-4995-9016-4e9da65eb253

📥 Commits

Reviewing files that changed from the base of the PR and between 904f15b and 3c9122c.

📒 Files selected for processing (1)
  • .github/workflows/build-ubuntu.yml

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-183: The new smoke test job (teleop-ros2-docker) must be added
as a release gate by making the publish jobs depend on it: update the
publish-wheel and publish-github-release-assets jobs (or their parent workflow
triggers) to include teleop-ros2-docker in their needs list (in addition to
build-ubuntu and test-cloudxr) so a failing teleop-ros2-docker run blocks
publishing; locate the job named teleop-ros2-docker and the jobs publish-wheel
and publish-github-release-assets in the workflow and add teleop-ros2-docker to
their needs array (or equivalent dependency configuration).

In `@examples/teleop_ros2/python/teleop_ros2_publisher.py`:
- Around line 84-91: The helper _find_plugins_dirs currently stops after the
first plugins/ directory due to the break, which can hide ancestor plugin
directories (e.g., one containing controller_synthetic_hands) and makes
PluginConfig.search_paths brittle; remove the break so the loop collects every
plugin_dir found for each parent (and the start) and return the full list of
candidate Paths so PluginConfig.search_paths can include all matching plugin
directories.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9171f9ad-98ee-497e-a326-bbcfe2bd092a

📥 Commits

Reviewing files that changed from the base of the PR and between 3c9122c and f86f763.

📒 Files selected for processing (2)
  • .github/workflows/build-ubuntu.yml
  • examples/teleop_ros2/python/teleop_ros2_publisher.py

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
.github/workflows/build-ubuntu.yml (1)

177-183: ⚠️ Potential issue | 🟠 Major

Add teleop-ros2-docker to the publish gates.

This new verification job still does not block publish-wheel or publish-github-release-assets, so artifacts can be published after a failed mode check.

🔒 Proposed gating change
   publish-wheel:  # to internal Artifactory
     runs-on: [self-hosted, linux]
     needs:
       - build-ubuntu
+      - teleop-ros2-docker
       - test-cloudxr
     environment: dev
     # Publish only after build and CloudXR tests succeed.
-    if: ${{ needs.build-ubuntu.result == 'success' && needs.test-cloudxr.result == 'success' }}
+    if: ${{ needs.build-ubuntu.result == 'success' && needs.teleop-ros2-docker.result == 'success' && needs.test-cloudxr.result == 'success' }}
...
   publish-github-release-assets:
     runs-on: ubuntu-latest
     needs:
       - build-ubuntu
+      - teleop-ros2-docker
       - test-cloudxr
...
-    if: ${{ github.event_name == 'push' && needs.build-ubuntu.result == 'success' && needs.test-cloudxr.result == 'success' }}
+    if: ${{ github.event_name == 'push' && needs.build-ubuntu.result == 'success' && needs.teleop-ros2-docker.result == 'success' && needs.test-cloudxr.result == 'success' }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/build-ubuntu.yml around lines 177 - 183, The new
verification job ("Verify application modes") is not included as a prerequisite
for the publish jobs, so publishing can proceed even if mode verification fails;
update the workflow to add the verification job (the job that runs the for-loop
with docker/teleop_ros2_ref) as a gate for the publish jobs by adding its job id
(e.g., "Verify application modes" / the job key for the teleop-ros2-docker
verification) to the needs/dependencies of "publish-wheel" and
"publish-github-release-assets" (or the publish job keys) so those publish jobs
only run if the teleop-ros2-docker verification succeeds.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-183: The loop that runs docker and invokes "timeout 5 uv run
teleop_ros2_publisher.py ..." accepts any zero exit as success; change the
command so you capture the command's exit code and explicitly require it to be
124 (timeout) to pass. Concretely, in the for-loop around the docker run line
(the block invoking teleop_ros2_publisher.py via timeout inside docker), run the
docker command, save its exit code ($?), and then test "[ $exit_code -eq 124 ]"
(fail otherwise) so only a timeout exit from teleop_ros2_publisher.py is
considered success.

In `@examples/teleop_ros2/python/teleop_ros2_publisher.py`:
- Line 93: The _to_pose helper currently types position and orientation as
List[float], which causes false positives when callers pass numpy.ndarray;
change the annotations on _to_pose (function name _to_pose) to accept generic
sequences — e.g. use Sequence[float] for position and Sequence[float] | None (or
Optional[Sequence[float]]) for orientation — and add/import Sequence from typing
so static type checkers accept numpy arrays and other indexable iterables.

---

Duplicate comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-183: The new verification job ("Verify application modes") is
not included as a prerequisite for the publish jobs, so publishing can proceed
even if mode verification fails; update the workflow to add the verification job
(the job that runs the for-loop with docker/teleop_ros2_ref) as a gate for the
publish jobs by adding its job id (e.g., "Verify application modes" / the job
key for the teleop-ros2-docker verification) to the needs/dependencies of
"publish-wheel" and "publish-github-release-assets" (or the publish job keys) so
those publish jobs only run if the teleop-ros2-docker verification succeeds.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d1363755-fb63-43be-9187-0226b84b3003

📥 Commits

Reviewing files that changed from the base of the PR and between f86f763 and 1036f88.

📒 Files selected for processing (2)
  • .github/workflows/build-ubuntu.yml
  • examples/teleop_ros2/python/teleop_ros2_publisher.py

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-184: The CI loop considers a 5s survival as success, but
teleop_ros2_publisher.py's OpenXR retry loop can spin forever and falsely pass;
update the test to assert the node actually enters a session by either (A)
adding a one-time CLI flag to teleop_ros2_publisher.py (e.g., --require-session
or --exit-if-no-session) that causes the process to exit non‑zero if
session.step() is never reached within the timeout, or (B) change the docker run
command to capture and check stdout/stderr for the session start marker (search
for the log line emitted when session.step() is called) within the timeout and
fail if not found; modify the test invocation in the workflow where
teleop_ros2_publisher.py is executed so it fails unless session.step() is
observed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f6263eb1-01c8-439d-806f-5e0669991511

📥 Commits

Reviewing files that changed from the base of the PR and between 1036f88 and ac2c860.

📒 Files selected for processing (2)
  • .github/workflows/build-ubuntu.yml
  • examples/teleop_ros2/python/teleop_ros2_publisher.py

Comment on lines +177 to +184
- name: Verify application modes
run: |
docker run --rm --entrypoint bash teleop_ros2_ref:${{ matrix.ros_distro }} -c \
"source /usr/local/bin/teleop-env-setup && ros2 --help > /dev/null && python3 -c \"import rclpy; rclpy.init(); rclpy.shutdown(); print('rclpy OK')\""
touch /tmp/dummy.json
for mode in controller_teleop hand_teleop controller_raw full_body; do
echo "Testing mode: $mode"
docker run --rm -v /tmp/dummy.json:/tmp/dummy.json -e NV_CXR_RUNTIME_DIR=/tmp -e XR_RUNTIME_JSON=/tmp/dummy.json --entrypoint bash teleop_ros2_ref:${{ matrix.ros_distro }} -c \
"source /usr/local/bin/teleop-env-setup && timeout 5 uv run teleop_ros2_publisher.py --ros-args -p mode:=$mode -p use_mock_operators:=true; rc=\$?; [ \$rc -eq 124 ]"
done
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don’t let the OpenXR retry loop count as a passing mode run.

teleop_ros2_publisher.py catches Failed to get OpenXR system and retries forever, so this step still passes if mock plugin discovery breaks and the node never reaches session.step(). Right now the only success signal is “stayed alive for 5s”, which is weaker than the PR objective.

🔧 Suggested hardening
     - name: Verify application modes
       run: |
         touch /tmp/dummy.json
         for mode in controller_teleop hand_teleop controller_raw full_body; do
           echo "Testing mode: $mode"
-          docker run --rm -v /tmp/dummy.json:/tmp/dummy.json -e NV_CXR_RUNTIME_DIR=/tmp -e XR_RUNTIME_JSON=/tmp/dummy.json --entrypoint bash teleop_ros2_ref:${{ matrix.ros_distro }} -c \
-            "source /usr/local/bin/teleop-env-setup && timeout 5 uv run teleop_ros2_publisher.py --ros-args -p mode:=$mode -p use_mock_operators:=true; rc=\$?; [ \$rc -eq 124 ]"
+          log_file="$(mktemp)"
+          docker run --rm -v /tmp/dummy.json:/tmp/dummy.json -e NV_CXR_RUNTIME_DIR=/tmp -e XR_RUNTIME_JSON=/tmp/dummy.json --entrypoint bash teleop_ros2_ref:${{ matrix.ros_distro }} -c \
+            "source /usr/local/bin/teleop-env-setup && timeout 5 uv run teleop_ros2_publisher.py --ros-args -p mode:=$mode -p use_mock_operators:=true" \
+            >"${log_file}" 2>&1
+          rc=$?
+          cat "${log_file}"
+          [ "${rc}" -eq 124 ]
+          ! grep -q "No XR client connected" "${log_file}"
         done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/build-ubuntu.yml around lines 177 - 184, The CI loop
considers a 5s survival as success, but teleop_ros2_publisher.py's OpenXR retry
loop can spin forever and falsely pass; update the test to assert the node
actually enters a session by either (A) adding a one-time CLI flag to
teleop_ros2_publisher.py (e.g., --require-session or --exit-if-no-session) that
causes the process to exit non‑zero if session.step() is never reached within
the timeout, or (B) change the docker run command to capture and check
stdout/stderr for the session start marker (search for the log line emitted when
session.step() is called) within the timeout and fail if not found; modify the
test invocation in the workflow where teleop_ros2_publisher.py is executed so it
fails unless session.step() is observed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant