Skip to content

controller/vm: add unit tests for VM Controller state machine#2634

Open
shreyanshjain7174 wants to merge 1 commit intomicrosoft:mainfrom
shreyanshjain7174:controller/vm-unit-tests
Open

controller/vm: add unit tests for VM Controller state machine#2634
shreyanshjain7174 wants to merge 1 commit intomicrosoft:mainfrom
shreyanshjain7174:controller/vm-unit-tests

Conversation

@shreyanshjain7174
Copy link
Copy Markdown
Contributor

@shreyanshjain7174 shreyanshjain7174 commented Mar 18, 2026

Depends on #2627 — must merge first.

Adds unit tests for the VM Controller state machine (internal/controller/vm/). To make the Manager testable, two field types are changed to their existing interfaces (vmmanager.LifetimeManager, combined GuestManager). Mocks are generated with mockgen following the internal/windows/mock/ pattern.

Tests are biased toward production failure paths — resource cleanup, state corruption, and half-started scenarios — rather than guard-check permutations.

30 test cases covering: TerminateVM transitions (idempotency, Created/Running→Terminated, Close fail→Invalid, Invalid recovery, Terminate() error swallowed with Close still called, CloseConnection() error swallowed with Close still called), CreateVM duplicate-call rejection, StartVM failure paths (uvm.Start fails→Invalid, listener/connection failure→Invalid, idempotency when already Running, wrong-state rejection), ExecIntoHost (terminal+stderr precondition, success, exec error), DumpStacks (supported, not supported, dump error), StartTime/ExitStatus for relevant states, representative guard checks (one per method for realistic wrong-state calls), and State.String() for all values. All tests run without admin or HCS.

@shreyanshjain7174 shreyanshjain7174 force-pushed the controller/vm-unit-tests branch from eeccba4 to 687d39e Compare March 20, 2026 06:32
@shreyanshjain7174 shreyanshjain7174 force-pushed the controller/vm-unit-tests branch from 687d39e to ef82816 Compare March 20, 2026 06:53
@shreyanshjain7174 shreyanshjain7174 marked this pull request as ready for review March 20, 2026 07:13
@shreyanshjain7174 shreyanshjain7174 requested a review from a team as a code owner March 20, 2026 07:13
@shreyanshjain7174 shreyanshjain7174 force-pushed the controller/vm-unit-tests branch 3 times, most recently from e8cacd2 to 4718b32 Compare March 24, 2026 14:28
@shreyanshjain7174 shreyanshjain7174 force-pushed the controller/vm-unit-tests branch from 4718b32 to 09f8fa8 Compare March 30, 2026 09:25
@shreyanshjain7174 shreyanshjain7174 force-pushed the controller/vm-unit-tests branch 2 times, most recently from e179fe0 to aba0d5f Compare April 6, 2026 11:23
@shreyanshjain7174 shreyanshjain7174 force-pushed the controller/vm-unit-tests branch 4 times, most recently from 3d5fad6 to fbae1c6 Compare April 29, 2026 05:22
@shreyanshjain7174 shreyanshjain7174 force-pushed the controller/vm-unit-tests branch 2 times, most recently from 35ea6c5 to a470ed1 Compare April 29, 2026 18:05
Defines per-platform vmLifetime and guestManager seam interfaces over
*vmmanager.UtilityVM and *guestmanager.Guest. Each interface is a superset
of the methods Controller invokes directly plus the device methods consumed
by the wired-in sub-controllers (vpci/network/plan9/scsi). *vmmanager and
*guestmanager satisfy these via Go structural typing; no concrete sister
fields needed.

The per-platform split is necessary because:
  - AddPlan9 / RemovePlan9 are LCOW-only on *vmmanager.UtilityVM.
  - Guest network signatures differ across LCOW and WCOW.
  - vPCI guest and Plan9 guest methods are LCOW-only.
  - UpdateHvSocketAddress is WCOW-only.

Guest() uses a type assertion to preserve its concrete return type.

30 tests covering:

Idempotency (3): duplicate StartVM no-op, TerminateVM from Terminated
  and NotCreated.

State guards (8): StartVM rejects NotCreated/Terminated/Invalid,
  ExecIntoHost rejects Terminated, terminal+stderr input validation,
  Update methods reject non-Running (4 subtests), Stats rejects
  Terminated, Wait rejects NotCreated, DumpStacks rejects Terminated,
  UpdateCPUGroup rejects empty ID.

TerminateVM cleanup chain (6): full Terminate->CloseConnection->Close
  ordering, guest close failure swallowed, uvm.Terminate failure
  swallowed, uvm.Close failure -> StateInvalid, recovery from Invalid,
  retry failure stays Invalid.

Error propagation (4): ExecIntoHost guest error, DumpStacks guest error,
  UpdatePolicyFragment guest error, ExitStatus not-terminated error.

API delegation (4): ExecIntoHost exit code forwarding, DumpStacks
  with and without capability, ExitStatus value assembly.

Wait + background goroutine (5): Wait from Terminated with log drain,
  Wait context cancelled during log drain, Wait Running happy path,
  waitForVMExit Running->Terminated transition, waitForVMExit
  overwrites Invalid->Terminated (pins current behaviour).

Mocks generated per-platform (mocks/mock_lcow_vm.go, mock_wcow_vm.go)
with matching //go:build constraints.

Signed-off-by: Shreyansh Sancheti <shsancheti@microsoft.com>
@shreyanshjain7174 shreyanshjain7174 force-pushed the controller/vm-unit-tests branch from a470ed1 to 1611cc8 Compare April 30, 2026 05:30
"github.com/Microsoft/go-winio/pkg/guid"
)

// vmLifetime and guestManager are defined per-platform in vm_lcow.go and
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this comment. The idea identified here is implicit for V2 shims and therefore, this just adds noise.


// vmLifetime is the LCOW-flavoured seam over [*vmmanager.UtilityVM].
// AddPlan9 / RemovePlan9 are LCOW-only on the host side, hence the per-platform split.
type vmLifetime interface {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename this to utilityVM


// vmLifetime is the WCOW-flavoured seam over [*vmmanager.UtilityVM].
// WCOW has no host-side Plan9, hence the per-platform split.
type vmLifetime interface {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not need to declare the interface for each variant.
You can use approach similar to how it's done here-

platformControllers //nolint:unused,nolintlint // embedded for cross-platform compatibility; empty on WCOW

Idea is keep the common APIs in types.go and then platform specific ones in vm_lcow and vm_wcow.
types.go

type utilityVM interface {
	ID() string
	RuntimeID() guid.GUID
	Start(ctx context.Context) error
	Wait(ctx context.Context) error
	Terminate(ctx context.Context) error
	Close(ctx context.Context) error
	SetCPUGroup(ctx context.Context, settings *hcsschema.CpuGroup) error
	UpdateCPULimits(ctx context.Context, settings *hcsschema.ProcessorLimits) error
	UpdateMemory(ctx context.Context, memory uint64) error
	PropertiesV2(ctx context.Context, types ...hcsschema.PropertyType) (*hcsschema.Properties, error)
	StartedTime() time.Time
	StoppedTime() time.Time
	ExitError() error

	AddDevice(ctx context.Context, vmbusGUID guid.GUID, settings hcsschema.VirtualPciDevice) error
	RemoveDevice(ctx context.Context, vmbusGUID guid.GUID) error
	AddNIC(ctx context.Context, nicID string, settings *hcsschema.NetworkAdapter) error
	RemoveNIC(ctx context.Context, nicID string, settings *hcsschema.NetworkAdapter) error
	AddSCSIDisk(ctx context.Context, disk hcsschema.Attachment, controller uint, lun uint) error
	RemoveSCSIDisk(ctx context.Context, controller uint, lun uint) error

    // Platform specific APIs
    platformUVM
}

type guestManager interface {
	CreateConnection(ctx context.Context, gcsServiceID guid.GUID, opts ...guestmanager.ConfigOption) error
	CloseConnection() error
	AddSecurityPolicy(ctx context.Context, opts guestresource.ConfidentialOptions) error
	InjectPolicyFragment(ctx context.Context, fragment guestresource.SecurityPolicyFragment) error
	Capabilities() gcs.GuestDefinedCapabilities
	DumpStacks(ctx context.Context) (string, error)
	ExecIntoUVM(ctx context.Context, request *cmd.CmdProcessRequest) (int, error)

    // Platform specific APIs
    platformGuestManager 
}

vm_lcow.go

type platformUVM interface {
    // LCOW-specific methods
	AddPlan9(ctx context.Context, settings hcsschema.Plan9Share) error
	RemovePlan9(ctx context.Context, settings hcsschema.Plan9Share) error
}

type platformGuestManager interface {
    // LCOW-specific methods
	AddVPCIDevice(ctx context.Context, settings guestresource.LCOWMappedVPCIDevice) error
	AddNetworkInterface(ctx context.Context, settings *guestresource.LCOWNetworkAdapter) error
	RemoveNetworkInterface(ctx context.Context, settings *guestresource.LCOWNetworkAdapter) error
	AddMappedDirectory(ctx context.Context, settings guestresource.LCOWMappedDirectory) error
	RemoveMappedDirectory(ctx context.Context, settings guestresource.LCOWMappedDirectory) error
	AddMappedVirtualDisk(ctx context.Context, settings guestresource.LCOWMappedVirtualDisk) error
	RemoveMappedVirtualDisk(ctx context.Context, settings guestresource.LCOWMappedVirtualDisk) error
	RemoveSCSIDevice(ctx context.Context, settings guestresource.SCSIDevice) error
}

vm_wcow.go

type platformUVM interface {
}

type platformGuestManager interface {
	// WCOW-specific methods
	UpdateHvSocketAddress(ctx context.Context, address *hcsschema.HvSocketAddress) error
	AddNetworkInterface(ctx context.Context, adapterID string, requestType guestrequest.RequestType, settings *hcn.HostComputeEndpoint) error
	RemoveNetworkInterface(ctx context.Context, adapterID string, requestType guestrequest.RequestType, settings *hcn.HostComputeEndpoint) error
	AddNetworkNamespace(ctx context.Context, settings *hcn.HostComputeNamespace) error
	RemoveNetworkNamespace(ctx context.Context, settings *hcn.HostComputeNamespace) error
	AddMappedVirtualDisk(ctx context.Context, settings guestresource.WCOWMappedVirtualDisk) error
	AddMappedVirtualDiskForContainerScratch(ctx context.Context, settings guestresource.WCOWMappedVirtualDisk) error
	RemoveMappedVirtualDisk(ctx context.Context, settings guestresource.WCOWMappedVirtualDisk) error
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second thoughts- Please see the uber review comment.


// guestManager is the LCOW-flavoured seam over [*guestmanager.Guest].
type guestManager interface {
CreateConnection(ctx context.Context, gcsServiceID guid.GUID, opts ...guestmanager.ConfigOption) error
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase on current main. There is one more Guest API- PrepareConnection.

Copy link
Copy Markdown
Contributor

@rawahars rawahars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface segregation can be simplified. You can possibly cherry-pick rawahars@1ca1a08

This refactors the VM controller to use interfaces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants