[RORDEV-1944] investigating socket hangs up issue#82
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a Cypress synchronization helper and integrates a new GitHub composite action that can start/stop a background Elasticsearch connectivity monitor (host, Docker container, Kubernetes pod) into E2E workflow jobs. ChangesTenancy Test Synchronization
ES Connectivity Monitoring Infrastructure
Sequence Diagram(s)sequenceDiagram
autonumber
participant Runner as "GH Actions runner"
participant Host as "Host (curl)"
participant Docker as "Docker kbn container"
participant KubePod as "Kibana pod (k8s)"
participant ES as "Elasticsearch"
Runner->>Host: start es-connectivity-monitor (composite action)
Host->>Host: spawn background loop (every 5s) -> log to /tmp/es-connectivity-monitor.log
Host->>ES: curl inputs.es_url/_cluster/health (record status/timing)
Host->>Docker: exec TCP probe to es-ror:9200 and tail container logs
Host->>KubePod: kubectl exec TCP probe to resolved ES service:9200
Note right of Host: loop repeats until stopped
Runner->>Host: stop es-connectivity-monitor (always)
Host->>Host: kill PID (best-effort), print /tmp/es-connectivity-monitor.log
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@e2e-tests/cypress/support/page-objects/IndexManagement.ts`:
- Around line 113-114: The current spinner waits assume the loader will both
appear and then be hidden, which flaps when loading has already finished or the
element is removed; replace the two strict gets with a resilient check against
the '[data-test-subj="sectionLoading"]' element: attempt to get it with a
timeout, then branch on its existence/visibility so the helper passes if the
loader is already absent, if it appears then waits until it is not
visible/removed, and if it is present but hidden immediately treat that as
success; target the selector '[data-test-subj="sectionLoading"]' when
implementing this logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4dacbc3c-dec7-4a95-943f-11b47c2c9076
📒 Files selected for processing (2)
e2e-tests/cypress/e2e/Tenancy.cy.tse2e-tests/cypress/support/page-objects/IndexManagement.ts
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
.github/workflows/trigger-e2e-tests.yml (2)
64-68: ES connectivity monitor log is not included in S3 artifact uploads.The "Stop ES connectivity monitor" step prints the log to stdout only. The subsequent "S3 Upload Videos & show logs" step (on failure) uploads videos and Docker logs, but the ES connectivity log at
/tmp/es-connectivity-monitor.logis not explicitly included. If you want to correlate ES connectivity loss with test failures, consider having.github/upload-videos(or a dedicated step) pick up/tmp/es-connectivity-monitor.log.Also applies to: 125-129, 186-190
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/trigger-e2e-tests.yml around lines 64 - 68, The ES connectivity monitor's log (/tmp/es-connectivity-monitor.log) is only printed to stdout in the "Stop ES connectivity monitor" step and isn't included in the artifacts; update the upload step (the "S3 Upload Videos & show logs" step or the reusable action .github/upload-videos) to explicitly collect and upload /tmp/es-connectivity-monitor.log (or add a dedicated step after the "Stop ES connectivity monitor" step to upload that file to S3/artifacts), ensuring the file path is added to the list of uploaded logs so ES connectivity traces are available alongside videos and Docker logs.
39-42: Consider explicitly passinges_url,es_user, andes_passwordto make configuration explicit rather than relying on action defaults.The workflow currently relies on the monitor's default values (https://localhost:9200, kibana/kibana credentials), which do match the port mappings in both Docker and ECK test environments. However, explicitly passing these values would improve clarity and reduce fragility if future test infrastructure changes the port or credentials. This is not currently causing issues, but makes the configuration harder to audit.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/trigger-e2e-tests.yml around lines 39 - 42, The ES connectivity monitor step ("Start ES connectivity monitor" using ./.github/es-connectivity-monitor) currently relies on the action's defaults; update that step to explicitly pass the es_url, es_user, and es_password inputs (e.g., set with: es_url: "http://localhost:9200", es_user: "kibana", es_password: "kibana" or the appropriate test creds) so configuration is explicit and not dependent on action defaults; ensure the input names match the action's input parameters and keep the existing with: action: start entry.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/es-connectivity-monitor/action.yml:
- Around line 33-35: The curl run step currently interpolates inputs with `${{
inputs.es_user }}`, `${{ inputs.es_password }}` and `${{ inputs.es_url }}`
directly into the shell command (the `run:` block) which is unsafe; update the
step to bind each input to environment variables (e.g., ES_USER, ES_PASSWORD,
ES_URL) via the step's `env:` and then reference them inside the `run:` script
as `$ES_USER`, `$ES_PASSWORD`, and `$ES_URL`; apply the same change to the other
similar curl steps (those using the same `${{ inputs.* }}` patterns on lines
noted) to avoid command injection and accidental secret leakage.
- Line 45: The workflow currently hardcodes the Docker Elasticsearch target URL
("https://es-ror:9200/_cluster/health?pretty") which isn't configurable; add a
new optional input (e.g. es_docker_host with default "es-ror") to action.yml and
replace the hardcoded host in the probe step with the constructed URL using that
input (e.g. https://${{ inputs.es_docker_host }}:9200/_cluster/health?pretty) so
the Docker probe becomes configurable and consistent with the existing es_url
input; ensure input name matches casing used elsewhere and update any references
to es_ror/es-ror to the new input name.
---
Nitpick comments:
In @.github/workflows/trigger-e2e-tests.yml:
- Around line 64-68: The ES connectivity monitor's log
(/tmp/es-connectivity-monitor.log) is only printed to stdout in the "Stop ES
connectivity monitor" step and isn't included in the artifacts; update the
upload step (the "S3 Upload Videos & show logs" step or the reusable action
.github/upload-videos) to explicitly collect and upload
/tmp/es-connectivity-monitor.log (or add a dedicated step after the "Stop ES
connectivity monitor" step to upload that file to S3/artifacts), ensuring the
file path is added to the list of uploaded logs so ES connectivity traces are
available alongside videos and Docker logs.
- Around line 39-42: The ES connectivity monitor step ("Start ES connectivity
monitor" using ./.github/es-connectivity-monitor) currently relies on the
action's defaults; update that step to explicitly pass the es_url, es_user, and
es_password inputs (e.g., set with: es_url: "http://localhost:9200", es_user:
"kibana", es_password: "kibana" or the appropriate test creds) so configuration
is explicit and not dependent on action defaults; ensure the input names match
the action's input parameters and keep the existing with: action: start entry.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 015c56f8-d984-43b7-9067-2264c7bbafc4
📒 Files selected for processing (2)
.github/es-connectivity-monitor/action.yml.github/workflows/trigger-e2e-tests.yml
| curl -sk -u "${{ inputs.es_user }}:${{ inputs.es_password }}" \ | ||
| "${{ inputs.es_url }}/_cluster/health?pretty" \ | ||
| -o /tmp/es-health.json -w "HTTP %{http_code} in %{time_total}s\n" 2>&1 || true |
There was a problem hiding this comment.
Avoid direct ${{ inputs.* }} interpolation into shell run: blocks — use env: vars instead.
${{ }} expressions in composite-action run: blocks are handled via variable interpolation before shell execution, so embedding them directly is vulnerable to command injection if a caller ever passes a value containing shell metacharacters.
Additionally, the raw input values are written into the runner's temporary shell script on disk before execution, meaning es_password can appear in runner debug traces.
The safe pattern — per GitHub's own composite-action documentation — is to bind inputs to env: variables on each step and reference them as $ENV_VAR inside the shell script.
🔒 Proposed fix — bind credentials via env:
- name: Start monitor
if: inputs.action == 'start'
shell: bash
+ env:
+ ES_USER: ${{ inputs.es_user }}
+ ES_PASSWORD: ${{ inputs.es_password }}
+ ES_URL: ${{ inputs.es_url }}
run: |
(
while true; do
echo "===== $(date -Is) ====="
echo "== ES connectivity from host =="
- curl -sk -u "${{ inputs.es_user }}:${{ inputs.es_password }}" \
- "${{ inputs.es_url }}/_cluster/health?pretty" \
+ curl -sk -u "${ES_USER}:${ES_PASSWORD}" \
+ "${ES_URL}/_cluster/health?pretty" \
-o /tmp/es-health.json -w "HTTP %{http_code} in %{time_total}s\n" 2>&1 || true
cat /tmp/es-health.json 2>/dev/null || true
echo
echo "== ES connectivity from KBN Docker container =="
KBN_CONTAINER=$(docker ps --filter "name=kbn" --format "{{.Names}}" 2>/dev/null | head -1)
if [ -n "$KBN_CONTAINER" ]; then
echo "Container: $KBN_CONTAINER"
docker exec "$KBN_CONTAINER" \
- curl -sk -u "${{ inputs.es_user }}:${{ inputs.es_password }}" \
+ curl -sk -u "${ES_USER}:${ES_PASSWORD}" \
https://es-ror:9200/_cluster/health?pretty \
-w "HTTP %{http_code} in %{time_total}s\n" 2>&1 || true
...
if [ -n "$ES_SVC" ]; then
kubectl exec -n "$KBN_NS" "$KBN_NAME" -- \
- curl -sk -u "${{ inputs.es_user }}:${{ inputs.es_password }}" \
+ curl -sk -u "${ES_USER}:${ES_PASSWORD}" \
"https://${ES_SVC}:9200/_cluster/health?pretty" \Also applies to: 44-46, 64-66
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/es-connectivity-monitor/action.yml around lines 33 - 35, The curl
run step currently interpolates inputs with `${{ inputs.es_user }}`, `${{
inputs.es_password }}` and `${{ inputs.es_url }}` directly into the shell
command (the `run:` block) which is unsafe; update the step to bind each input
to environment variables (e.g., ES_USER, ES_PASSWORD, ES_URL) via the step's
`env:` and then reference them inside the `run:` script as `$ES_USER`,
`$ES_PASSWORD`, and `$ES_URL`; apply the same change to the other similar curl
steps (those using the same `${{ inputs.* }}` patterns on lines noted) to avoid
command injection and accidental secret leakage.
| echo "Container: $KBN_CONTAINER" | ||
| docker exec "$KBN_CONTAINER" \ | ||
| curl -sk -u "${{ inputs.es_user }}:${{ inputs.es_password }}" \ | ||
| https://es-ror:9200/_cluster/health?pretty \ |
There was a problem hiding this comment.
Hardcoded Docker ES hostname es-ror:9200 is not configurable.
The Docker probe target (https://es-ror:9200) is baked in while the host probe target is already an input (es_url). If the internal Docker service name ever changes, this probe silently becomes a no-op (swallowed by || true). Consider exposing it as an optional input (e.g. es_docker_host, defaulting to es-ror) for consistency and future-proofing.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/es-connectivity-monitor/action.yml at line 45, The workflow
currently hardcodes the Docker Elasticsearch target URL
("https://es-ror:9200/_cluster/health?pretty") which isn't configurable; add a
new optional input (e.g. es_docker_host with default "es-ror") to action.yml and
replace the hardcoded host in the probe step with the constructed URL using that
input (e.g. https://${{ inputs.es_docker_host }}:9200/_cluster/health?pretty) so
the Docker probe becomes configurable and consistent with the existing es_url
input; ensure input name matches casing used elsewhere and update any references
to es_ror/es-ror to the new input name.
|
Nothing to merge here. The fix was tested and works well |
Summary by CodeRabbit
Tests
Chores