vmware: don't use redundant worker VM to extract volume #3218
DaanHoogland merged 1 commit into apache:master
Conversation
This fixes the issue that a VM with VM snapshots fails to start after extract volume is done on a stopped VM, on VMware.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
@blueorangutan package

@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2639

@blueorangutan test

@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan test

@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
DaanHoogland
left a comment
stylish comments only
        s_logger.error(msg);
        throw new Exception(msg);
    }
    clonedVm.exportVm(exportPath, exportName, false, false); //Note: volss: not to create ova.
how about unifying both calls by assigning the clonedVm or vmMo to a local var, so it is obvious where the real work is done in both cases?
It's because, if we need to clone (i.e. require a worker VM, clonedVm), then in the finally block we need to detach and destroy that worker VM. This is why we cannot refactor them to use a single variable.
OK; as said, it is a matter of style, but we have that boolean to decide: clonedWorkerVMNeeded
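The cleanup argument from this exchange can be sketched in isolation. This is a minimal, self-contained Java sketch, not CloudStack code: the class and method names (VmMo, export, cloneNeeded) are hypothetical stand-ins, illustrating why keeping the worker clone in its own variable makes the finally block safe — only the clone's variable is ever destroyed, so the user VM cannot be hit by accident.

```java
public class ExportSketch {
    // Hypothetical stand-in for a VMware VM managed object.
    static class VmMo {
        final String name;
        boolean destroyed = false;
        VmMo(String name) { this.name = name; }
        void exportVm(String path) { /* export the disk; stubbed out */ }
        void destroy() { destroyed = true; }
    }

    // Returns the worker clone that was used (so a caller can verify it
    // was destroyed), or null when no clone was needed.
    static VmMo export(VmMo userVm, boolean cloneNeeded) {
        VmMo clonedVm = null;
        try {
            if (cloneNeeded) {
                clonedVm = new VmMo(userVm.name + "-worker");
                clonedVm.exportVm("/export/path");
            } else {
                userVm.exportVm("/export/path");
            }
        } finally {
            // Only the worker clone is ever destroyed here; because it
            // lives in a separate variable, the user VM (userVm) cannot
            // be destroyed by mistake.
            if (clonedVm != null) {
                clonedVm.destroy();
            }
        }
        return clonedVm;
    }
}
```

Collapsing both paths onto one variable, as the review suggests, would force the finally block to consult a flag before destroying — which is exactly the clonedWorkerVMNeeded boolean mentioned above; the two-variable form trades a flag check for an extra name.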
@@ -1014,16 +1018,19 @@ private Pair<String, String> copyVolumeToSecStorage(VmwareHostService hostServic
    String datastoreVolumePath = getVolumePathInDatastore(dsMo, volumePath + ".vmdk", searchExcludedFolders);
    workerVm.attachDisk(new String[] {datastoreVolumePath}, morDs);
    vmMo = workerVm;
if we move clonedWorkerVMNeeded outside the try, workerVm is not needed.
Same reason as before: by using different instances/variables, it's easier to track whether we need to destroy the worker VM after we're done. If we re-use vmMo, we might accidentally destroy the user VM.
    if (vmMo != null && vmMo.getSnapshotMor(exportName) != null) {
        vmMo.removeSnapshot(exportName, false);
    }
    if (workerVm != null) {
same here as in VmwareStorageManagerImpl.java#1020; if we move clonedWorkerVMNeeded outside the try block we don't need this extra workerVm
    Pair<VirtualMachineMO, String[]> cloneResult =
        vmMo.cloneFromCurrentSnapshot(workerVmName, 0, 4, diskDevice, VmwareHelper.getDiskDeviceDatastore(volumeDeviceInfo.first()));
    clonedVm = cloneResult.first();
    clonedVm.exportVm(exportPath, exportName, false, false);
also here, it would be nice to see just one call for doing the 'real' work.
In the current mechanism, we need a worker VM to take a snapshot and export the disk; without at least one worker VM I'm not sure how we can export the template/disk.
borisstoyanov
left a comment
LGTM, manually verified this.
@DaanHoogland I'm not exactly sure how you want us to refactor the changes; we'll need at least one worker VM to extract/snapshot the disks to a single vmdk file, because the disk in question may be a linked one (multiple files). If we do figure out a clever way out, I'll raise another PR without blocking this, if that's okay.

I am not 👎 and was about to merge, @rhtyd.

Actually, asking if merge is OK (tag [WIP] still in place).

Yes, I think so, if @rhtyd doesn't have something in mind.

Thanks @DaanHoogland

@blueorangutan package

@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2886
borisstoyanov
left a comment
LGTM
test-results.xlsx
Problem: A VM that has a VM snapshot fails to start after the extractVolume operation is performed on the stopped VM.
Root Cause: While downloading a volume of a stopped VM, the code creates a worker VM with the volume attached, takes a temporary snapshot, clones another worker VM from that snapshot, exports an OVA file, and deletes the cloned worker VM. Afterwards, it deletes the initial worker VM and the temporary snapshot, which merges any previous snapshot(s) into the main disk. As a result, when the VM starts, VMware cannot find the snapshot disk.
Solution: Export the volume using a single worker VM (when the VM is stopped) without creating a temporary snapshot.
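The root cause and fix described above can be modeled in a few lines. This is a deliberately simplified, hypothetical Java model (the class and disk names are illustrative, not VMware's or CloudStack's): deleting the temporary snapshot consolidates all delta disks into the base disk, so an older VM snapshot that still references its delta disk can no longer find it — which is why skipping the temporary snapshot fixes the start failure.

```java
import java.util.ArrayList;
import java.util.List;

public class SnapshotModel {
    // Disks currently present in the datastore (starts with the base disk).
    final List<String> disks = new ArrayList<>(List.of("base.vmdk"));
    // Delta disks that existing VM snapshots still reference.
    final List<String> snapshotDisks = new ArrayList<>();

    void takeSnapshot(String name) {
        String delta = name + "-delta.vmdk";
        disks.add(delta);
        snapshotDisks.add(delta);
    }

    void deleteSnapshotAndConsolidate(String name) {
        // The deleted snapshot no longer needs its own delta disk...
        snapshotDisks.remove(name + "-delta.vmdk");
        // ...but, per the root cause above, consolidation merges ALL
        // delta disks into the base disk, including ones that other
        // (still-existing) VM snapshots reference.
        disks.removeIf(d -> d.endsWith("-delta.vmdk"));
    }

    boolean canStart() {
        // The VM starts only if every remaining snapshot's delta disk exists.
        return disks.containsAll(snapshotDisks);
    }
}
```

In this model, the old flow (take temp snapshot, export, delete temp snapshot) leaves a VM with an earlier snapshot unable to start, while the fixed flow (export directly from the single worker VM, no temporary snapshot) leaves it intact.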
Upgrade Notes: After the upgrade, the admin should replace the live systemvm ISO file in all secondary storage pools with the new systemvm.iso from the management server (keeping the original name that systemvm.iso had on secondary storage). Finally, the admin should destroy all the old SSVMs.
Types of changes