Run teleop_ros2 in all modes in the teleop-ros2-docker test#226
Run teleop_ros2 in all modes in the teleop-ros2-docker test#226
Conversation
📝 WalkthroughWalkthroughReplaced CI smoke-test with a per-mode Docker verification loop (four modes, 5s timeout per run). Added Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CI as CI Workflow
participant Docker as Docker Container
participant Node as TeleopRos2PublisherNode
participant FS as Filesystem/Env
participant Session as TeleopSession
CI->>Docker: run container for mode (mode, MOCK env)
Docker->>Node: start teleop_ros2_publisher (--mode, use_mock flag)
Node->>FS: read ISAAC_TELEOP_PLUGIN_PATH
Node->>FS: call _find_plugins_dirs(start) -> plugin dirs
Node->>Session: assemble plugins (include synthetic_hands if mock)
Node->>Session: start session (plugins loaded)
Session-->>Node: return status
Docker-->>CI: exit code (124 handled as non-fatal)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-183: No code change required: the "Verify application modes"
workflow loop correctly handles timeout exit code 124 and tests the four modes
matching the application's _TELEOP_MODES; leave the docker run line and the
escaped \$? check as-is (referencing the Verify application modes block,
matrix.ros_distro and the tested modes controller_teleop, hand_teleop,
controller_raw, full_body).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: df4d90ba-198e-4995-9016-4e9da65eb253
📒 Files selected for processing (1)
.github/workflows/build-ubuntu.yml
3c9122c to
f86f763
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-183: The new smoke test job (teleop-ros2-docker) must be added
as a release gate by making the publish jobs depend on it: update the
publish-wheel and publish-github-release-assets jobs (or their parent workflow
triggers) to include teleop-ros2-docker in their needs list (in addition to
build-ubuntu and test-cloudxr) so a failing teleop-ros2-docker run blocks
publishing; locate the job named teleop-ros2-docker and the jobs publish-wheel
and publish-github-release-assets in the workflow and add teleop-ros2-docker to
their needs array (or equivalent dependency configuration).
In `@examples/teleop_ros2/python/teleop_ros2_publisher.py`:
- Around line 84-91: The helper _find_plugins_dirs currently stops after the
first plugins/ directory due to the break, which can hide ancestor plugin
directories (e.g., one containing controller_synthetic_hands) and makes
PluginConfig.search_paths brittle; remove the break so the loop collects every
plugin_dir found for each parent (and the start) and return the full list of
candidate Paths so PluginConfig.search_paths can include all matching plugin
directories.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9171f9ad-98ee-497e-a326-bbcfe2bd092a
📒 Files selected for processing (2)
.github/workflows/build-ubuntu.ymlexamples/teleop_ros2/python/teleop_ros2_publisher.py
f86f763 to
1036f88
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
.github/workflows/build-ubuntu.yml (1)
177-183:⚠️ Potential issue | 🟠 MajorAdd
teleop-ros2-dockerto the publish gates.This new verification job still does not block
publish-wheelorpublish-github-release-assets, so artifacts can be published after a failed mode check.🔒 Proposed gating change
publish-wheel: # to internal Artifactory runs-on: [self-hosted, linux] needs: - build-ubuntu + - teleop-ros2-docker - test-cloudxr environment: dev # Publish only after build and CloudXR tests succeed. - if: ${{ needs.build-ubuntu.result == 'success' && needs.test-cloudxr.result == 'success' }} + if: ${{ needs.build-ubuntu.result == 'success' && needs.teleop-ros2-docker.result == 'success' && needs.test-cloudxr.result == 'success' }} ... publish-github-release-assets: runs-on: ubuntu-latest needs: - build-ubuntu + - teleop-ros2-docker - test-cloudxr ... - if: ${{ github.event_name == 'push' && needs.build-ubuntu.result == 'success' && needs.test-cloudxr.result == 'success' }} + if: ${{ github.event_name == 'push' && needs.build-ubuntu.result == 'success' && needs.teleop-ros2-docker.result == 'success' && needs.test-cloudxr.result == 'success' }}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/build-ubuntu.yml around lines 177 - 183, The new verification job ("Verify application modes") is not included as a prerequisite for the publish jobs, so publishing can proceed even if mode verification fails; update the workflow to add the verification job (the job that runs the for-loop with docker/teleop_ros2_ref) as a gate for the publish jobs by adding its job id (e.g., "Verify application modes" / the job key for the teleop-ros2-docker verification) to the needs/dependencies of "publish-wheel" and "publish-github-release-assets" (or the publish job keys) so those publish jobs only run if the teleop-ros2-docker verification succeeds.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-183: The loop that runs docker and invokes "timeout 5 uv run
teleop_ros2_publisher.py ..." accepts any zero exit as success; change the
command so you capture the command's exit code and explicitly require it to be
124 (timeout) to pass. Concretely, in the for-loop around the docker run line
(the block invoking teleop_ros2_publisher.py via timeout inside docker), run the
docker command, save its exit code ($?), and then test "[ $exit_code -eq 124 ]"
(fail otherwise) so only a timeout exit from teleop_ros2_publisher.py is
considered success.
In `@examples/teleop_ros2/python/teleop_ros2_publisher.py`:
- Line 93: The _to_pose helper currently types position and orientation as
List[float], which causes false positives when callers pass numpy.ndarray;
change the annotations on _to_pose (function name _to_pose) to accept generic
sequences — e.g. use Sequence[float] for position and Sequence[float] | None (or
Optional[Sequence[float]]) for orientation — and add/import Sequence from typing
so static type checkers accept numpy arrays and other indexable iterables.
---
Duplicate comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-183: The new verification job ("Verify application modes") is
not included as a prerequisite for the publish jobs, so publishing can proceed
even if mode verification fails; update the workflow to add the verification job
(the job that runs the for-loop with docker/teleop_ros2_ref) as a gate for the
publish jobs by adding its job id (e.g., "Verify application modes" / the job
key for the teleop-ros2-docker verification) to the needs/dependencies of
"publish-wheel" and "publish-github-release-assets" (or the publish job keys) so
those publish jobs only run if the teleop-ros2-docker verification succeeds.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d1363755-fb63-43be-9187-0226b84b3003
📒 Files selected for processing (2)
.github/workflows/build-ubuntu.ymlexamples/teleop_ros2/python/teleop_ros2_publisher.py
1036f88 to
ac2c860
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/build-ubuntu.yml:
- Around line 177-184: The CI loop considers a 5s survival as success, but
teleop_ros2_publisher.py's OpenXR retry loop can spin forever and falsely pass;
update the test to assert the node actually enters a session by either (A)
adding a one-time CLI flag to teleop_ros2_publisher.py (e.g., --require-session
or --exit-if-no-session) that causes the process to exit non‑zero if
session.step() is never reached within the timeout, or (B) change the docker run
command to capture and check stdout/stderr for the session start marker (search
for the log line emitted when session.step() is called) within the timeout and
fail if not found; modify the test invocation in the workflow where
teleop_ros2_publisher.py is executed so it fails unless session.step() is
observed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f6263eb1-01c8-439d-806f-5e0669991511
📒 Files selected for processing (2)
.github/workflows/build-ubuntu.ymlexamples/teleop_ros2/python/teleop_ros2_publisher.py
| - name: Verify application modes | ||
| run: | | ||
| docker run --rm --entrypoint bash teleop_ros2_ref:${{ matrix.ros_distro }} -c \ | ||
| "source /usr/local/bin/teleop-env-setup && ros2 --help > /dev/null && python3 -c \"import rclpy; rclpy.init(); rclpy.shutdown(); print('rclpy OK')\"" | ||
| touch /tmp/dummy.json | ||
| for mode in controller_teleop hand_teleop controller_raw full_body; do | ||
| echo "Testing mode: $mode" | ||
| docker run --rm -v /tmp/dummy.json:/tmp/dummy.json -e NV_CXR_RUNTIME_DIR=/tmp -e XR_RUNTIME_JSON=/tmp/dummy.json --entrypoint bash teleop_ros2_ref:${{ matrix.ros_distro }} -c \ | ||
| "source /usr/local/bin/teleop-env-setup && timeout 5 uv run teleop_ros2_publisher.py --ros-args -p mode:=$mode -p use_mock_operators:=true; rc=\$?; [ \$rc -eq 124 ]" | ||
| done |
There was a problem hiding this comment.
Don’t let the OpenXR retry loop count as a passing mode run.
teleop_ros2_publisher.py catches Failed to get OpenXR system and retries forever, so this step still passes if mock plugin discovery breaks and the node never reaches session.step(). Right now the only success signal is “stayed alive for 5s”, which is weaker than the PR objective.
🔧 Suggested hardening
- name: Verify application modes
run: |
touch /tmp/dummy.json
for mode in controller_teleop hand_teleop controller_raw full_body; do
echo "Testing mode: $mode"
- docker run --rm -v /tmp/dummy.json:/tmp/dummy.json -e NV_CXR_RUNTIME_DIR=/tmp -e XR_RUNTIME_JSON=/tmp/dummy.json --entrypoint bash teleop_ros2_ref:${{ matrix.ros_distro }} -c \
- "source /usr/local/bin/teleop-env-setup && timeout 5 uv run teleop_ros2_publisher.py --ros-args -p mode:=$mode -p use_mock_operators:=true; rc=\$?; [ \$rc -eq 124 ]"
+ log_file="$(mktemp)"
+ docker run --rm -v /tmp/dummy.json:/tmp/dummy.json -e NV_CXR_RUNTIME_DIR=/tmp -e XR_RUNTIME_JSON=/tmp/dummy.json --entrypoint bash teleop_ros2_ref:${{ matrix.ros_distro }} -c \
+ "source /usr/local/bin/teleop-env-setup && timeout 5 uv run teleop_ros2_publisher.py --ros-args -p mode:=$mode -p use_mock_operators:=true" \
+ >"${log_file}" 2>&1
+ rc=$?
+ cat "${log_file}"
+ [ "${rc}" -eq 124 ]
+ ! grep -q "No XR client connected" "${log_file}"
done🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/build-ubuntu.yml around lines 177 - 184, The CI loop
considers a 5s survival as success, but teleop_ros2_publisher.py's OpenXR retry
loop can spin forever and falsely pass; update the test to assert the node
actually enters a session by either (A) adding a one-time CLI flag to
teleop_ros2_publisher.py (e.g., --require-session or --exit-if-no-session) that
causes the process to exit non‑zero if session.step() is never reached within
the timeout, or (B) change the docker run command to capture and check
stdout/stderr for the session start marker (search for the log line emitted when
session.step() is called) within the timeout and fail if not found; modify the
test invocation in the workflow where teleop_ros2_publisher.py is executed so it
fails unless session.step() is observed.
Summary by CodeRabbit
New Features
Chores