ubuntu16: fix three issues with ubuntu 16.04 hosts #3227

Merged
yadvr merged 3 commits into apache:master from ustcweizhou:4.11.2-ubuntu16-issues
May 5, 2019

Conversation

@ustcweizhou
Contributor

Description

This fixes three issues with Ubuntu 16.04 hosts:

  1. Fix being unable to add a host if cloudbrX is not configured
  2. Stop the libvirt-bin.socket service while adding a host
  3. Disable the libvirt default network

Details can be found in each commit.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

@yadvr
Member

yadvr commented Mar 14, 2019

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

if (network.isActive() == 1) {
network.destroy();
}
if (network.getAutostart()) {
Member

If we destroy the network and set autostart to false, would that cause an issue?

Contributor Author

@rhtyd We have destroyed the default network and have not seen any issues in our production environment.
It seems to break packaging/debian/cloudstack-agent.init; however, since /etc/init.d/cloudstack-agent is no longer in use after the systemd changes, it should not be a problem.

Member

We still do support initd with centos6 packages.

Contributor Author

@rhtyd
I checked the code again: packaging/debian/cloudstack-agent.init is packaged in the Ubuntu debs and deployed on Ubuntu servers as /etc/init.d/cloudstack-agent.
Since we no longer support Ubuntu 12.04, and systemd is used in Ubuntu 14.04 and later, I think this change will not break other functions.

Member

LGTM

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2632

@yadvr
Member

yadvr commented Mar 14, 2019

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-3421)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 26752 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3227-t3421-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_host_maintenance.py
Smoke tests completed. 68 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@wido
Contributor

wido commented Mar 18, 2019

I have never seen these issues, but I can understand why they happen.

LGTM

@yadvr yadvr changed the base branch from 4.11 to master May 4, 2019 15:15
@yadvr
Member

yadvr commented May 4, 2019

@ustcweizhou can you rebase to latest master? We've only deprecated older Ubuntu versions recently, but not in 4.11.

When adding an Ubuntu 16.04 host with a native eth0 (cloudbrX is not configured),
the operation failed with the following error in /var/log/cloudstack/agent/setup.log:

```
DEBUG:root:execute:ifconfig eth0
DEBUG:root:[Errno 2] No such file or directory
  File "/usr/lib/python2.7/dist-packages/cloudutils/serviceConfig.py", line 38, in configration
    result = self.config()
  File "/usr/lib/python2.7/dist-packages/cloudutils/serviceConfig.py", line 211, in config
    super(networkConfigUbuntu, self).cfgNetwork()
  File "/usr/lib/python2.7/dist-packages/cloudutils/serviceConfig.py", line 108, in cfgNetwork
    device = self.netcfg.getDefaultNetwork()
  File "/usr/lib/python2.7/dist-packages/cloudutils/networkConfig.py", line 53, in getDefaultNetwork
    pdi = networkConfig.getDevInfo(dev)
  File "/usr/lib/python2.7/dist-packages/cloudutils/networkConfig.py", line 157, in getDevInfo
    elif networkConfig.isBridge(dev) or networkConfig.isOvsBridge(dev):
```

The issue is caused by commit 9c7cd8c
2017-09-19 16:45 Sigert Goeminne ● CLOUDSTACK-10081: CloudUtils getDevInfo function will now return "bridge" instead o
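
One common source of `[Errno 2] No such file or directory` in Python is an attempt to execute a binary that is not installed. The log above does not say which command was missing, so the following is only a sketch of a defensive wrapper under that assumption, not the change made in this PR; `run_or_none` is a hypothetical helper name.

```python
import subprocess
import sys


def run_or_none(argv):
    """Run argv and return its output as text, or None if it cannot be run.

    Hypothetical helper: a missing binary raises OSError (errno 2), which
    is treated like a failed command instead of aborting host setup.
    """
    try:
        out = subprocess.check_output(argv, stderr=subprocess.STDOUT)
    except OSError:
        # Binary not found or not executable.
        return None
    except subprocess.CalledProcessError:
        # Command ran but exited with a non-zero status.
        return None
    return out.decode("utf-8", "replace")


if __name__ == "__main__":
    # A command that exists returns its output; a missing one returns None.
    print(run_or_none([sys.executable, "-c", "print('ok')"]))
    print(run_or_none(["no-such-binary-for-this-demo"]))
```

A caller probing a device could then treat `None` as "not a bridge" rather than letting the exception propagate up through `getDevInfo`.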

The libvirt-bin.socket service is started when an Ubuntu 16.04 host is added:

DEBUG:root:execute:sudo /usr/sbin/service libvirt-bin start

However, it breaks the libvirt-bin service once that service is restarted.
Stopping the libvirt-bin.socket service fixes the issue.

An example is given below.

```
root@node32:~# /etc/init.d/libvirt-bin restart
[ ok ] Restarting libvirt-bin (via systemctl): libvirt-bin.service.
root@node32:~# virsh list
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory

root@node32:~# systemctl stop libvirt-bin.socket

root@node32:~# /etc/init.d/libvirt-bin restart
[ ok ] Restarting libvirt-bin (via systemctl): libvirt-bin.service.
root@node32:~# virsh list
 Id    Name                           State
----------------------------------------------------

```
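
The order of operations in the session above can be sketched in Python. This is a minimal illustration, not the code from the commit: `restart_libvirt` is a hypothetical name, and the `run` parameter exists only so the sequence can be exercised without touching a real host.

```python
import subprocess


def restart_libvirt(run=subprocess.check_call):
    """Stop the socket unit first, then restart libvirt-bin.

    The two commands are the ones shown in the shell session; `run` is
    any callable that accepts an argv list (hypothetical injection point).
    """
    commands = [
        ["systemctl", "stop", "libvirt-bin.socket"],
        ["/etc/init.d/libvirt-bin", "restart"],
    ]
    for cmd in commands:
        run(cmd)
    return commands
```

Passing a list's `append` method as `run` records the commands instead of executing them, which is enough to check that the socket unit is stopped before the restart.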

By default, libvirt creates the default network virbr0 on KVM hypervisors.
If a VM uses the same IP range, 192.168.122.0/24, the two conflict.

In some cases, running tcpdump inside the VM shows the IP of the KVM hypervisor as the source IP.
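
The address clash described above can be checked ahead of time. A minimal sketch, assuming the default network keeps the 192.168.122.0/24 range mentioned in the commit message; `clashes_with_libvirt_default` is a hypothetical helper, not part of CloudStack.

```python
import ipaddress

# Range named in the commit message for the libvirt default network (virbr0).
LIBVIRT_DEFAULT_NET = ipaddress.ip_network("192.168.122.0/24")


def clashes_with_libvirt_default(guest_cidr, default_net=LIBVIRT_DEFAULT_NET):
    """Return True if guest_cidr overlaps the libvirt default network."""
    # strict=False accepts host addresses such as 192.168.122.10/24.
    guest_net = ipaddress.ip_network(guest_cidr, strict=False)
    return guest_net.overlaps(default_net)
```

A guest network of `192.168.122.10/24` or the wider `192.168.0.0/16` overlaps the default range, while `10.1.1.0/24` does not.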
@ustcweizhou ustcweizhou force-pushed the 4.11.2-ubuntu16-issues branch from 1e90215 to fd8fff4 Compare May 4, 2019 18:12
@ustcweizhou
Contributor Author

@rhtyd thanks for review. rebased with latest master.

@yadvr
Member

yadvr commented May 4, 2019

Thanks Wei
@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2732

@yadvr
Member

yadvr commented May 4, 2019

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-3549)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 34765 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3227-t3549-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_vpn.py
Intermittent failure detected: /marvin/tests/smoke/test_hostha_kvm.py
Smoke tests completed. 68 look OK, 2 have error(s)
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_05_rvpc_multi_tiers | Failure | 408.79 | test_vpc_redundant.py |
| test_05_rvpc_multi_tiers | Error | 433.55 | test_vpc_redundant.py |
| test_hostha_enable_ha_when_host_in_maintenance | Error | 304.45 | test_hostha_kvm.py |

@yadvr yadvr merged commit 3729511 into apache:master May 5, 2019
@yadvr yadvr added this to the 4.13.0.0 milestone May 5, 2019