CKS: fix creation on shared network if HA is enabled #8588

Merged
JoaoJandre merged 1 commit into apache:4.18 from weizhouapache:4.18-fix-cks-ha-shared-network on Sep 26, 2024

@weizhouapache
Member

Description

This PR fixes #8585

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@codecov

codecov bot commented Feb 1, 2024

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 13.16%. Comparing base (b34f093) to head (ee0abd9).
Report is 86 commits behind head on 4.18.

Files with missing lines Patch % Lines
...bernetes/cluster/KubernetesClusterManagerImpl.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.18    #8588      +/-   ##
============================================
- Coverage     13.16%   13.16%   -0.01%     
+ Complexity     9203     9201       -2     
============================================
  Files          2724     2724              
  Lines        258087   258087              
  Branches      40223    40223              
============================================
- Hits          33987    33984       -3     
- Misses       219793   219797       +4     
+ Partials       4307     4306       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@DaanHoogland
Contributor

@weizhouapache is this still valid?

@weizhouapache
Member Author

> @weizhouapache is this still valid?

@DaanHoogland
It is still valid, but it needs more changes.

@yadvr yadvr added this to the 4.19.1.0 milestone Apr 30, 2024
Member

@yadvr yadvr left a comment

LGTM - needs testing

Contributor

@sureshanaparti sureshanaparti left a comment

clgtm

@weizhouapache weizhouapache modified the milestones: 4.19.1.0, 4.19.2 Jun 24, 2024
@kiranchavala
Member

@blueorangutan package

@blueorangutan

@kiranchavala a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@kiranchavala kiranchavala self-assigned this Sep 23, 2024
@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11178

Member

@kiranchavala kiranchavala left a comment

@weizhouapache, I am getting the following exception when I provide an external load balancer IP during CKS cluster creation on a shared network:

Screenshot 2024-09-24 at 12 54 51 PM

@weizhouapache
Member Author

> @weizhouapache, I am getting the following exception when I provide an external load balancer IP during CKS cluster creation on a shared network:
>
> Screenshot 2024-09-24 at 12 54 51 PM

Because 10.0.0.96.6 is not a valid address (it has five octets).
Try 10.0.96.6 😄
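
As an illustration of why the first address is rejected, here is a minimal Python sketch using the standard `ipaddress` module. It is not the validation CloudStack itself performs; the helper name `is_valid_ipv4` is made up for this example.

```python
import ipaddress


def is_valid_ipv4(address: str) -> bool:
    """Return True if `address` parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(address)
        return True
    except ipaddress.AddressValueError:
        # Raised for malformed input, e.g. five octets instead of four.
        return False


if __name__ == "__main__":
    for candidate in ("10.0.0.96.6", "10.0.96.6"):
        print(candidate, is_valid_ipv4(candidate))
```

Running a check like this on the external load balancer IP before submitting the form would catch the typo shown in the screenshot.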

@kiranchavala
Member

> because 10.0.0.96.6 is not valid address try 10.0.96.6 😄

My bad, I should check my eyesight :-)

@weizhouapache
Member Author

@kiranchavala
Are you testing it?
Please note that you need to set up the load balancer (nginx/haproxy) for the external IP.

It does not work in an advanced zone with security groups.
Some ports need to be opened.

Member

@kiranchavala kiranchavala left a comment

LGTM

Tested using an nginx configuration as the load balancer.

I was able to deploy a CKS cluster on a shared network with the HA option enabled.

Sample nginx config:

[root@centos7 ~]# cat nginx.conf
error_log stderr notice;
worker_processes auto;
events {
    multi_accept on;
    use epoll;
    worker_connections 1024;
}
stream {
    upstream cloudstack {
        server 10.1.17.89:6443;
        server 10.1.17.90:6443 backup;
        server 10.1.17.86:6443 backup;
    }
    server {
        listen 0.0.0.0:6443;
        proxy_pass cloudstack;
        proxy_timeout 10m;
        proxy_connect_timeout 1s;
    }
    upstream control-1 {
        server 10.1.17.89:22;
    }
    server {
        listen 0.0.0.0:2222;
        proxy_pass control-1;
        proxy_timeout 10m;
        proxy_connect_timeout 1s;
    }
    upstream control-2 {
        server 10.1.17.90:22;
    }
    server {
        listen 0.0.0.0:2223;
        proxy_pass control-2;
        proxy_timeout 10m;
        proxy_connect_timeout 1s;
    }
    upstream control-3 {
        server 10.1.17.86:22;
    }
    server {
        listen 0.0.0.0:2224;
        proxy_pass control-3;
        proxy_timeout 10m;
        proxy_connect_timeout 1s;
    }
}
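
The sample above follows a regular pattern: one upstream for the Kubernetes API on port 6443 (first control node active, the others marked `backup`), plus one SSH forward per control node starting at port 2222. A rough Python sketch that renders the same `stream` block from a list of control node IPs might look like the following. The function names and defaults are hypothetical and only mirror the sample; this is not a tool shipped with CloudStack.

```python
from typing import List


def _server_block(listen_port: int, upstream: str) -> List[str]:
    """Lines for one nginx stream `server` block proxying to `upstream`."""
    return [
        "    server {",
        f"        listen 0.0.0.0:{listen_port};",
        f"        proxy_pass {upstream};",
        "        proxy_timeout 10m;",
        "        proxy_connect_timeout 1s;",
        "    }",
    ]


def render_nginx_stream(control_ips: List[str], api_port: int = 6443,
                        ssh_base_port: int = 2222) -> str:
    """Build an nginx `stream` block shaped like the sample config above."""
    if not control_ips:
        raise ValueError("at least one control node IP is required")

    lines = ["stream {", "    upstream cloudstack {"]
    for index, ip in enumerate(control_ips):
        # Only the first control node is active; the rest are backups.
        suffix = "" if index == 0 else " backup"
        lines.append(f"        server {ip}:{api_port}{suffix};")
    lines.append("    }")
    lines += _server_block(api_port, "cloudstack")

    # One SSH forward per control node: 2222, 2223, 2224, ...
    for index, ip in enumerate(control_ips):
        name = f"control-{index + 1}"
        lines += [f"    upstream {name} {{", f"        server {ip}:22;", "    }"]
        lines += _server_block(ssh_base_port + index, name)

    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_nginx_stream(["10.1.17.89", "10.1.17.90", "10.1.17.86"]))
```

The generated text covers only the `stream` section; the `error_log`, `worker_processes` and `events` settings from the sample would still need to be added by hand.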

If no external IP is given, an exception is thrown:

Screenshot 2024-09-25 at 12 33 29 PM
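
The behaviour described above (an exception when HA is enabled on a shared network and no external IP is supplied) can be sketched roughly as follows. This is a simplified illustration in Python, not the actual change in `KubernetesClusterManagerImpl.java`; the function name, parameters, and error message are all assumptions.

```python
from typing import Optional


def require_external_lb_ip(ha_enabled: bool, network_is_shared: bool,
                           external_lb_ip: Optional[str]) -> None:
    """Reject an HA cluster on a shared network without an external LB IP.

    Hypothetical helper: it only models the rule observed during testing.
    """
    if ha_enabled and network_is_shared and not external_lb_ip:
        raise ValueError(
            "An external load balancer IP is required for an HA cluster "
            "on a shared network"
        )


if __name__ == "__main__":
    # Passes silently: an external IP is supplied.
    require_external_lb_ip(True, True, "10.0.96.6")
    try:
        require_external_lb_ip(True, True, None)
    except ValueError as error:
        print("rejected:", error)
```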

@weizhouapache
Member Author

great, thanks for the testing @kiranchavala

@weizhouapache weizhouapache marked this pull request as ready for review September 25, 2024 12:38
@JoaoJandre
Contributor

@blueorangutan package

@blueorangutan

@JoaoJandre a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11202

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm

@DaanHoogland
Contributor

@blueorangutan test securityGroups

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@DaanHoogland
Contributor

@blueorangutan test

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-11549)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 54370 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8588-t11549-kvm-ol8.zip
Smoke tests completed. 108 look OK, 3 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
ContextSuite context=TestISOUsage>:setup Error 0.00 test_usage.py
test_01_migrate_VM_and_root_volume Error 79.28 test_vm_life_cycle.py
test_02_migrate_VM_with_two_data_disks Error 49.86 test_vm_life_cycle.py
test_08_migrate_vm Error 47.21 test_vm_life_cycle.py
test_02_redundant_VPC_default_routes Failure 361.56 test_vpc_redundant.py
test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers Failure 432.01 test_vpc_redundant.py

@JoaoJandre
Contributor

Merging based on reviews, manual testing (#8588 (review)), and CI results. The failures shown in CI are not related to this PR.

@JoaoJandre JoaoJandre merged commit bb820f7 into apache:4.18 Sep 26, 2024
JoaoJandre added a commit that referenced this pull request Sep 26, 2024
* 4.18:
  CKS: fix creation on shared network if HA is enabled (#8588)
@blueorangutan

[SF] Trillian test result (tid-11548)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 78570 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8588-t11548-kvm-ol8.zip
Smoke tests completed. 57 look OK, 54 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_events_resource Error 2.92 test_events_resource.py
test_DeleteDomain Failure 97.15 test_accounts.py
test_forceDeleteDomain Failure 100.17 test_accounts.py
test_dedicateGuestVlanRange Error 0.00 test_guest_vlan_range.py
ContextSuite context=TestDedicateGuestVlanRange>:teardown Error 0.00 test_guest_vlan_range.py
test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 Failure 3.92 test_internal_lb.py
test_02_internallb_roundrobin_1RVPC_3VM_HTTP_port80 Failure 3.95 test_internal_lb.py
test_03_vpc_internallb_haproxy_stats_on_all_interfaces Failure 3.80 test_internal_lb.py
test_04_rvpc_internallb_haproxy_stats_on_all_interfaces Failure 4.91 test_internal_lb.py
test_01_create_ipv6_public_ip_range Error 0.04 test_ipv6_infra.py
test_04_verify_guest_lspci Error 803.61 test_deploy_virtio_scsi_vm.py
test_06_verify_guest_lspci_again Error 803.52 test_deploy_virtio_scsi_vm.py
ContextSuite context=TestLoadBalance>:setup Error 0.00 test_loadbalance.py
test_01_ping_in_vr_success Failure 0.03 test_diagnostics.py
test_02_ping_in_vr_failure Failure 0.02 test_diagnostics.py
test_07_arping_in_vr Failure 0.03 test_diagnostics.py
test_10_traceroute_in_vr Failure 0.02 test_diagnostics.py
test_13_retrieve_vr_default_files Failure 0.02 test_diagnostics.py
test_14_retrieve_vr_one_file Failure 0.03 test_diagnostics.py
test_01_native_to_native_network_migration Error 4.04 test_migration.py
test_02_native_to_native_vpc_migration Error 6.37 test_migration.py
test_01_deploy_vm_from_direct_download_template_nfs_storage Error 5.53 test_direct_download.py
ContextSuite context=TestDirectDownloadTemplates>:teardown Error 1.10 test_direct_download.py
test_network_acl Error 2.36 test_network_acl.py
test_03_create_network_domain_network_offering Error 7.14 test_domain_network_offerings.py
ContextSuite context=TestIpv6Network>:setup Error 0.00 test_network_ipv6.py
test_03_create_vpc_domain_vpc_offering Error 8.20 test_domain_vpc_offerings.py
test_10_vpc_tier_kubernetes_cluster Error 3.54 test_kubernetes_clusters.py
ContextSuite context=TestNetworkPermissions>:setup Error 0.00 test_network_permissions.py
test_delete_account Error 91.80 test_network.py
test_delete_network_while_vm_on_it Error 8.68 test_network.py
test_delete_network_while_vm_on_it Error 8.68 test_network.py
test_deploy_vm_l2network Error 7.62 test_network.py
test_deploy_vm_l2network Error 7.62 test_network.py
test_l2network_restart Error 9.85 test_network.py
test_l2network_restart Error 9.85 test_network.py
ContextSuite context=TestL2Networks>:teardown Error 10.99 test_network.py
test_01_port_fwd_on_src_nat Failure 0.03 test_network.py
test_02_port_fwd_on_non_src_nat Error 0.04 test_network.py
ContextSuite context=TestPublicIP>:setup Error 5.48 test_network.py
test_reboot_router Error 162.63 test_network.py
test_releaseIP Error 42.51 test_network.py
test_network_rules_acquired_public_ip_1_static_nat_rule Error 0.05 test_network.py
test_network_rules_acquired_public_ip_2_nat_rule Error 0.04 test_network.py
test_network_rules_acquired_public_ip_3_Load_Balancer_Rule Error 0.05 test_network.py
test_01_nic Error 44.24 test_nic.py
test_extendPhysicalNetworkVlan Error 0.06 test_non_contigiousvlan.py
ContextSuite context=TestNonStrictAffinityGroups>:setup Error 0.00 test_nonstrict_affinity_group.py
ContextSuite context=TestIsolatedNetworksPasswdServer>:setup Error 0.00 test_password_server.py
test_01_isolated_persistent_network Error 0.06 test_persistent_network.py
ContextSuite context=TestPortablePublicIPAcquire>:setup Error 0.00 test_portable_publicip.py
test_01_create_delete_portforwarding_fornonvpc Error 2.95 test_portforwardingrules.py
test_01_vpc_privategw_acl Failure 5.42 test_privategw_acl.py
test_02_vpc_privategw_static_routes Failure 5.96 test_privategw_acl.py
test_03_vpc_privategw_restart_vpc_cleanup Failure 5.23 test_privategw_acl.py
test_04_rvpc_privategw_static_routes Failure 4.83 test_privategw_acl.py
test_dedicatePublicIpRange Error 0.03 test_public_ip_range.py
test_dedicate_public_ip_range_for_system_vms Error 0.02 test_public_ip_range.py
test_dedicate_public_ip_range_for_system_vms_01_ssvm Error 0.10 test_public_ip_range.py
test_dedicate_public_ip_range_for_system_vms_02_cpvm Error 0.11 test_public_ip_range.py
test_create_pvlan_network Error 0.08 test_pvlan.py
test_CRUD_operations_userdata Error 2.71 test_register_userdata.py
test_deploy_vm_with_registered_userdata Error 2.90 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_allow Error 2.63 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_append Error 2.58 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_deny Error 2.57 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_params Error 2.53 test_register_userdata.py
test_link_and_unlink_userdata_to_template Error 2.61 test_register_userdata.py
test_user_userdata_crud Error 2.67 test_register_userdata.py
ContextSuite context=TestIsolatedNetworks>:setup Error 0.00 test_routers_network_ops.py
ContextSuite context=TestRedundantIsolateNetworks>:setup Error 0.00 test_routers_network_ops.py
ContextSuite context=TestResetVmOnReboot>:setup Error 0.00 test_reset_vm_on_reboot.py
ContextSuite context=TestRAMCPUResourceAccounting>:setup Error 0.00 test_resource_accounting.py
ContextSuite context=TestRouterServices>:setup Error 0.00 test_routers.py
ContextSuite context=TestRouterDHCPHosts>:setup Error 0.00 test_router_dhcphosts.py
ContextSuite context=TestRouterDHCPOpts>:setup Error 0.00 test_router_dhcphosts.py
ContextSuite context=TestRouterDns>:setup Error 0.00 test_router_dns.py
ContextSuite context=TestRouterDnsService>:setup Error 0.00 test_router_dnsservice.py
ContextSuite context=TestRouterIpTablesPolicies>:setup Error 0.00 test_routers_iptables_default_policy.py
ContextSuite context=TestVPCIpTablesPolicies>:setup Error 0.00 test_routers_iptables_default_policy.py
test_01_sys_vm_start Failure 0.15 test_secondary_storage.py
ContextSuite context=TestCpuCapServiceOfferings>:setup Error 0.00 test_service_offerings.py
ContextSuite context=TestServiceOfferings>:setup Error 0.28 test_service_offerings.py
ContextSuite context=TestSnapshotRootDisk>:setup Error 0.00 test_snapshots.py
ContextSuite context=TestSnapshotStandaloneBackup>:setup Error 0.00 test_snapshots.py
test_01_list_sec_storage_vm Failure 0.05 test_ssvm.py
test_02_list_cpvm_vm Failure 0.05 test_ssvm.py
test_03_ssvm_internals Failure 0.06 test_ssvm.py
test_04_cpvm_internals Failure 0.05 test_ssvm.py
test_05_stop_ssvm Failure 0.05 test_ssvm.py
test_06_stop_cpvm Failure 0.07 test_ssvm.py
test_07_reboot_ssvm Failure 0.05 test_ssvm.py
test_08_reboot_cpvm Failure 0.04 test_ssvm.py
test_09_reboot_ssvm_forced Failure 0.07 test_ssvm.py
test_10_reboot_cpvm_forced Failure 0.05 test_ssvm.py
test_11_destroy_ssvm Failure 0.05 test_ssvm.py
test_12_destroy_cpvm Failure 0.05 test_ssvm.py
ContextSuite context=TestVMWareStoragePolicies>:setup Error 0.00 test_storage_policy.py
test_02_create_template_with_checksum_sha1 Error 65.63 test_templates.py
test_03_create_template_with_checksum_sha256 Error 65.58 test_templates.py
test_04_create_template_with_checksum_md5 Error 65.58 test_templates.py
test_05_create_template_with_no_checksum Error 65.63 test_templates.py
test_02_deploy_vm_from_direct_download_template Error 1.30 test_templates.py
ContextSuite context=TestCreateTemplateWithDirectDownload>:teardown Error 17.10 test_templates.py
ContextSuite context=TestTemplates>:setup Error 23.67 test_templates.py
ContextSuite context=TestISOUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestLBRuleUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestNatRuleUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestPublicIPUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestSnapshotUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestVmUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestVolumeUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestVpnUsage>:setup Error 0.00 test_usage.py
ContextSuite context=TestVmAutoScaling>:setup Error 0.00 test_vm_autoscaling.py
test_01_deploy_vm_on_specific_host Error 1.33 test_vm_deployment_planner.py
test_02_deploy_vm_on_specific_cluster Error 1.32 test_vm_deployment_planner.py
test_03_deploy_vm_on_specific_pod Error 1.35 test_vm_deployment_planner.py
test_04_deploy_vm_on_host_override_pod_and_cluster Error 1.39 test_vm_deployment_planner.py
test_05_deploy_vm_on_cluster_override_pod Error 1.34 test_vm_deployment_planner.py
ContextSuite context=TestDeployVM>:setup Error 0.00 test_vm_life_cycle.py
test_01_migrate_VM_and_root_volume Error 1.36 test_vm_life_cycle.py
test_02_migrate_VM_with_two_data_disks Error 1.41 test_vm_life_cycle.py
test_01_secure_vm_migration Error 235.41 test_vm_life_cycle.py
test_01_secure_vm_migration Error 235.41 test_vm_life_cycle.py
ContextSuite context=TestVMLifeCycle>:setup Error 3.84 test_vm_life_cycle.py
ContextSuite context=TestVmSnapshot>:setup Error 3.90 test_vm_snapshots.py
ContextSuite context=TestCreateVolume>:setup Error 0.00 test_volumes.py
test_01_root_volume_encryption Error 1.32 test_volumes.py
test_02_data_volume_encryption Error 1.34 test_volumes.py
test_03_root_and_data_volume_encryption Error 1.37 test_volumes.py
ContextSuite context=TestVolumes>:setup Error 31.02 test_volumes.py
ContextSuite context=TestIpv6Vpc>:setup Error 0.00 test_vpc_ipv6.py
ContextSuite context=TestVPCRedundancy>:setup Error 0.00 test_vpc_redundant.py
ContextSuite context=TestVPCNics>:setup Error 0.00 test_vpc_router_nics.py
ContextSuite context=TestRVPCSite2SiteVpn>:setup Error 0.00 test_vpc_vpn.py
ContextSuite context=TestVPCSite2SiteVPNMultipleOptions>:setup Error 0.00 test_vpc_vpn.py
ContextSuite context=TestVpcRemoteAccessVpn>:setup Error 0.00 test_vpc_vpn.py
ContextSuite context=TestVpcSite2SiteVpn>:setup Error 0.00 test_vpc_vpn.py
test_02_cancel_host_maintenace_with_migration_jobs Error 1.61 test_host_maintenance.py
test_03_cancel_host_maintenace_with_migration_jobs_failure Error 1.68 test_host_maintenance.py
test_01_cancel_host_maintenance_ssh_enabled_agent_connected Failure 12.62 test_host_maintenance.py
test_03_cancel_host_maintenance_ssh_disabled_agent_connected Failure 15.67 test_host_maintenance.py
test_04_cancel_host_maintenance_ssh_disabled_agent_disconnected Failure 32.50 test_host_maintenance.py
ContextSuite context=TestHostMaintenanceAgents>:teardown Error 33.64 test_host_maintenance.py
test_disable_oobm_ha_state_ineligible Error 1516.49 test_hostha_kvm.py

@DaanHoogland DaanHoogland deleted the 4.18-fix-cks-ha-shared-network branch September 26, 2024 19:42
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Oct 14, 2024
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Oct 14, 2024
* 4.18:
  CKS: fix creation on shared network if HA is enabled (apache#8588)
Development

Successfully merging this pull request may close these issues.

CKS: Fail to create CKS cluster with HA enabled on Shared networks

7 participants