Skip to content

Conversation

@Scotchman0
Copy link
Member

@Scotchman0 Scotchman0 commented Nov 25, 2025

Update aro-dnsmasq-pre.sh to support more than one search domains - update AWK to exclude first line and include the rest, rather than than only print column 2, as well as update search domain list entry at aro-dns.conf to enable multiple search terms.

Which issue this PR addresses:

Issue/Fixes: Azure/ARO-RP#4485
Issue/Fixes: https://issues.redhat.com/browse/ARO-22870 (this includes detailed testing/implementation and validation steps for this change as well and is accessible externally)

What this PR does / why we need it:

Problem: ARO clusters cannot utilize more than one search cluster-wide which is the default cluster.domain generated entry. If any search in the list for a manual network configuration is injected that is NOT this cluster.domain address this value will override the search parameter for the cluster, breaking traffic flow. cluster.domain is required for healthy lookups, but secondary domains may be needed for streamlined traffic handling.

This change adjusts the awk logic in selecting the search terms from "Select the first search entry in the list from the cloned /etc/resolv.conf.dnsmasq file before building the permanent /etc/resolv.conf file" to enable this value to get set with any/all search domains specified at install time (or day-2).. Additionally the aro-dns.conf file requires a list type to function with multiple values, and so the iteration must be used to facilitiate multiple value injects.

Test plan for issue:

How did you test that this PR works?

  • Cloned the script out, modified the target files from /etc/resolv.conf.dnsmasq to /tmp/resolv.conf.dnsmasq and /tmp/resolv.conf to avoid modifying live cluster config + executed. Validated the results include the test search parameters injected into /etc/NetworkManager/system-connections/eth0.nmconnection which specify additional test domains.

  • After validating, modified the live /usr/local/bin/aro-dnsmasq-pre.sh script build after resetting /etc/resolv.conf from /etc/resolv.conf.dnsmasq and re-executing the script.

  • Subsequent tests involved updates to /etc/NetworkManager/system-connections/eth0.nmconnection to set values explicitly + restarting hosts to enable flow. (This makes machine-config unhappy because on-disk state has changed but this is inconsequential for testing).

  • Are there unit tests?
    No

  • Are there integration/e2e tests?
    No

  • If it is not possible to write automated tests, explain why and document how
    to manually test and verify the feature.

The change can be tested easily by making the adjustments on the existing target script:
/usr/local/bin/aro-dnsmasq-pre.sh and either comment out the cp -Z /etc/resolv.conf /etc/resolv.conf.dnsmasq line or copy the /etc/resolv.conf.dnsmasq file back over /etc/resolv.conf to start so that we reset our template back to defaults, then simply run the script:

/usr/local/bin/aro-dnsmasq-pre.sh which will update /etc/NetworkManager/conf.d/aro-dns.conf and subsequently reload /etc/resolv.conf which will now reflect the desired multi-search parameter config.

Is there any documentation that needs to be updated for this PR?

This is part of an effort to review/consider updating search domains within ARO - it is likely a docs update will be needed after review.

How do you know this will function as expected in production?

  • limited local testing is promising, review and vetting is required.

Update aro-dnsmasq-pre.sh to support more than one search domains - update AWK to exclude first line and include the rest, rather than than only print column 2, as well as update search domain list entry at aro-dns.conf to enable multiple search terms.
@openshift-ci openshift-ci bot requested review from rogbas and tsatam November 25, 2025 14:04
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 25, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Scotchman0
Once this PR has been reviewed and has the lgtm label, please assign hawkowl for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Scotchman0
Copy link
Member Author

/test unit

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 25, 2025

@Scotchman0: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/unit b211c4f link true /test unit

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@Scotchman0
Copy link
Member Author

The Unit test I believe is invalid for this change.

The change appears to be getting rejected because it doesn't match expected state:

assert.Equal(t, expectedIgnitionFileContents[file], string(fileContents), "missing storage data in file %v", file)

        	Error Trace:	/go/src/github.com/openshift/installer-aro-wrapper/pkg/installer/custominstallconfig_test.go:385
        	            				/go/src/github.com/openshift/installer-aro-wrapper/pkg/installer/custominstallconfig_test.go:331
        	Error:      	Not equal: 
        	            	expected: "\n#!/bin/bash\nset -euo pipefail\n\n# This bash script is a part of the ARO DnsMasq configuration\n# It's deployed as part of the 99-aro-dns-* machine config\n# See https://github.com/Azure/ARO-RP\n\n# This file can be rerun and the effect is idempotent, output might change if the DHCP configuration changes\n\nNODEIP=$(/sbin/ip --json route get 168.63.129.16 | /bin/jq -r \".[].prefsrc\")\n\nif [ \"$NODEIP\" != \"\" ]; then\n    /bin/cp -Z /etc/resolv.conf /etc/resolv.conf.dnsmasq\n    SEARCHDOMAIN=$(awk '/^search/ { print $2; }' /etc/resolv.conf.dnsmasq)\n    /bin/chmod 0744 /etc/resolv.conf.dnsmasq\n\n    cat <<EOF | /bin/tee /etc/NetworkManager/conf.d/aro-dns.conf\n# Added by dnsmasq.service\n[global-dns]\nsearches=$SEARCHDOMAIN\n\n[global-dns-domain-*]\nservers=$NODEIP\nEOF\n\n    # network manager may already be running at this point.\n    # reload to update /etc/resolv.conf with this configuration\n    /usr/bin/nmcli general reload conf\n    /usr/bin/nmcli general reload dns-rc\nfi\n"
        	            	actual  : "\n#!/bin/bash\nset -euo pipefail\n\n# This bash script is a part of the ARO DnsMasq configuration\n# It's deployed as part of the 99-aro-dns-* machine config\n# See https://github.com/Azure/ARO-RP\n\n# This file can be rerun and the effect is idempotent, output might change if the DHCP configuration changes\n\nNODEIP=$(/sbin/ip --json route get 168.63.129.16 | /bin/jq -r \".[].prefsrc\")\n\nif [ \"$NODEIP\" != \"\" ]; then\n    /bin/cp -Z /etc/resolv.conf /etc/resolv.conf.dnsmasq\n    SEARCHDOMAIN=$(awk '/^search/ { $1=\"\"; print $0 }' /etc/resolv.conf.dnsmasq)\n    /bin/chmod 0744 /etc/resolv.conf.dnsmasq\n\n    cat <<EOF | /bin/tee /etc/NetworkManager/conf.d/aro-dns.conf\n# Added by dnsmasq.service\n[global-dns]\nsearches=$(for i in ${SEARCHDOMAIN}; do echo -n ${i},; done)\n\n[global-dns-domain-*]\nservers=$NODEIP\nEOF\n\n    # network manager may already be running at this point.\n    # reload to update /etc/resolv.conf with this configuration\n    /usr/bin/nmcli general reload conf\n    /usr/bin/nmcli general reload dns-rc\nfi\n"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -14,3 +14,3 @@
        	            	     /bin/cp -Z /etc/resolv.conf /etc/resolv.conf.dnsmasq
        	            	-    SEARCHDOMAIN=$(awk '/^search/ { print $2; }' /etc/resolv.conf.dnsmasq)
        	            	+    SEARCHDOMAIN=$(awk '/^search/ { $1=""; print $0 }' /etc/resolv.conf.dnsmasq)
        	            	     /bin/chmod 0744 /etc/resolv.conf.dnsmasq
        	            	@@ -20,3 +20,3 @@
        	            	 [global-dns]
        	            	-searches=$SEARCHDOMAIN
        	            	+searches=$(for i in ${SEARCHDOMAIN}; do echo -n ${i},; done)
        	            	 
        	Test:       	TestApplyInstallConfigCustomisations
        	Messages:   	missing storage data in file /usr/local/bin/aro-dnsmasq-pre.sh

This change modifies the ignition files that we are using to check integrity of the platform during the test and so I submit this unit test is invalid and should be ignored. Thank you!

@hlipsig
Copy link
Contributor

hlipsig commented Nov 26, 2025

Hi Thanks for this PR! We're working on fully deprecating and removing DNSMasq entirely in favor of Openshift functionality. That will land in ARO for installs at 4.21. If you feel strongly that we need this meanwhile let's have a meeting to discuss as the implications are non-trivial for the ARO team.

@hlipsig hlipsig added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ARO does not support more than 1 search domain cluster-wide

2 participants