vmware: don't use redundant worker VM to extract volume #3218
DaanHoogland merged 1 commit into apache:master
Conversation
This fixes the issue that a VM with VM snapshots fails to start after extract volume is done on a stopped VM, on VMware.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
@blueorangutan package

@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2639

@blueorangutan test

@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan test

@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
DaanHoogland
left a comment
stylish comments only
        s_logger.error(msg);
        throw new Exception(msg);
    }
    clonedVm.exportVm(exportPath, exportName, false, false); //Note: volss: not to create ova.
how about unifying both calls by assigning the clonedVm or vmMo to a local var, so it is obvious where the real work is done in both cases?
It's because, if we need to clone (i.e. require a worker VM, clonedVm), then in the finally block we need to detach and destroy that worker VM. This is why we cannot refactor them to use a single variable.
OK; as said, it is a matter of style, but we have that boolean to decide: clonedWorkerVMNeeded
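The cleanup argument from this exchange can be sketched in isolation. This is a minimal, self-contained Java sketch, not CloudStack code: the class and method names (VmMo, export, cloneNeeded) are hypothetical stand-ins, illustrating why keeping the worker clone in its own variable makes the finally block safe — only the clone's variable is ever destroyed, so the user VM cannot be hit by accident.

```java
public class ExportSketch {
    // Hypothetical stand-in for a VMware VM managed object.
    static class VmMo {
        final String name;
        boolean destroyed = false;
        VmMo(String name) { this.name = name; }
        void exportVm(String path) { /* export the disk; stubbed out */ }
        void destroy() { destroyed = true; }
    }

    // Returns the worker clone that was used (so a caller can verify it
    // was destroyed), or null when no clone was needed.
    static VmMo export(VmMo userVm, boolean cloneNeeded) {
        VmMo clonedVm = null;
        try {
            if (cloneNeeded) {
                clonedVm = new VmMo(userVm.name + "-worker");
                clonedVm.exportVm("/export/path");
            } else {
                userVm.exportVm("/export/path");
            }
        } finally {
            // Only the worker clone is ever destroyed here; because it
            // lives in a separate variable, the user VM (userVm) cannot
            // be destroyed by mistake.
            if (clonedVm != null) {
                clonedVm.destroy();
            }
        }
        return clonedVm;
    }
}
```

Collapsing both paths onto one variable, as the review suggests, would force the finally block to consult a flag before destroying — which is exactly the clonedWorkerVMNeeded boolean mentioned above; the two-variable form trades a flag check for an extra name.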
@@ -1014,16 +1018,19 @@ private Pair<String, String> copyVolumeToSecStorage(VmwareHostService hostServic
    String datastoreVolumePath = getVolumePathInDatastore(dsMo, volumePath + ".vmdk", searchExcludedFolders);
    workerVm.attachDisk(new String[] {datastoreVolumePath}, morDs);
    vmMo = workerVm;
if we move clonedWorkerVMNeeded outside the try, workerVm is not needed.
Same reason as before: by using different instances/variables, it's easier to track whether we need to destroy the worker VM after we're done. If we re-use vmMo, we might accidentally destroy the user VM.
    if (vmMo != null && vmMo.getSnapshotMor(exportName) != null) {
        vmMo.removeSnapshot(exportName, false);
    }
    if (workerVm != null) {
same here as in VmwareStorageManagerImpl.java#1020; if we move clonedWorkerVMNeeded outside the try block we don't need this extra workerVm
    Pair<VirtualMachineMO, String[]> cloneResult =
        vmMo.cloneFromCurrentSnapshot(workerVmName, 0, 4, diskDevice, VmwareHelper.getDiskDeviceDatastore(volumeDeviceInfo.first()));
    clonedVm = cloneResult.first();
    clonedVm.exportVm(exportPath, exportName, false, false);
also here, it would be nice to see just one call for doing the 'real' work.
In the current mechanism, we need a worker VM to take a snapshot and export the disk; without at least one worker VM I'm not sure how we can export the template/disk.
borisstoyanov
left a comment
LGTM, manually verified this.
@DaanHoogland I'm not exactly sure how you want us to refactor the changes; we'll need at least one worker VM to extract/snapshot the disks to a single vmdk file, because the disk in question may be a linked one (multiple files). If we do figure out a clever way out, I'll raise another PR without blocking this, if that's okay.

I am not 👎 and was about to merge, @rhtyd.

Actually, asking if merge is OK (tag [WIP] still in place).

Yes, I think so, if @rhtyd doesn't have something in mind.

Thanks @DaanHoogland

@blueorangutan package

@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2886
borisstoyanov
left a comment
LGTM
test-results.xlsx
Problem: A VM that has a VM snapshot fails to start after the extractVolume operation is performed on the stopped VM.
Root Cause: While downloading a volume of a stopped VM, the code creates a worker VM with the volume attached, takes a temporary snapshot, clones another worker VM from that snapshot, exports an OVA file, and deletes the cloned worker VM. Afterwards, it deletes the initial worker VM and the temporary snapshot, which merges any previous snapshot(s) into the main disk. As a result, when the VM starts, VMware cannot find the snapshot disk.
Solution: Export the volume using a single worker VM (when the VM is stopped) without creating a temporary snapshot.
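The root cause and fix described above can be modeled in a few lines. This is a deliberately simplified, hypothetical Java model (the class and disk names are illustrative, not VMware's or CloudStack's): deleting the temporary snapshot consolidates all delta disks into the base disk, so an older VM snapshot that still references its delta disk can no longer find it — which is why skipping the temporary snapshot fixes the start failure.

```java
import java.util.ArrayList;
import java.util.List;

public class SnapshotModel {
    // Disks currently present in the datastore (starts with the base disk).
    final List<String> disks = new ArrayList<>(List.of("base.vmdk"));
    // Delta disks that existing VM snapshots still reference.
    final List<String> snapshotDisks = new ArrayList<>();

    void takeSnapshot(String name) {
        String delta = name + "-delta.vmdk";
        disks.add(delta);
        snapshotDisks.add(delta);
    }

    void deleteSnapshotAndConsolidate(String name) {
        // The deleted snapshot no longer needs its own delta disk...
        snapshotDisks.remove(name + "-delta.vmdk");
        // ...but, per the root cause above, consolidation merges ALL
        // delta disks into the base disk, including ones that other
        // (still-existing) VM snapshots reference.
        disks.removeIf(d -> d.endsWith("-delta.vmdk"));
    }

    boolean canStart() {
        // The VM starts only if every remaining snapshot's delta disk exists.
        return disks.containsAll(snapshotDisks);
    }
}
```

In this model, the old flow (take temp snapshot, export, delete temp snapshot) leaves a VM with an earlier snapshot unable to start, while the fixed flow (export directly from the single worker VM, no temporary snapshot) leaves it intact.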
Upgrade Notes: After the upgrade, the admin should replace the live systemvm ISO file in all secondary storage pools with the new systemvm.iso from the management server (keeping the original name that systemvm.iso had on secondary storage). Finally, the admin should destroy all the old SSVMs.
Types of changes