-
Notifications
You must be signed in to change notification settings - Fork 140
Add vhost-user support with RNG device implementation #527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
6600bb5 to
f9fe527
Compare
mtjhrc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a look at the tests, and honestly I don't see much value in them, we don't have any arbitrary coverage percent metric for merging stuff, so I would just remove most of them.
The rest of the code looks good, but I've only had a quick look so far, when this was still a draft, I'll try running this and have another look later.
Having vhost-user support in libkrun seems pretty cool, thanks!
f9fe527 to
7d18fa0
Compare
|
Not really a review, just a comment: @dorindabassey This is some great work and a huge addition to the project. @slp passt uses vhost-user to provide near-native network performance in user mode for QEMU. Perhaps we should consider doing the same here, especially as the main network driver for the v2 API. |
Right, that would be great to have! It's mostly about throughput and latency, but also, perhaps, one day, to implement live migration of TCP connections, as it's only available via vhost-user interface. I'm not sure how much effort networking support would take on top of this pull request, and whether it's beyond its scope or not, but let me share a couple of pointers just in case. The part of API that's implemented by passt is this: https://passt.top/passt/tree/vhost_user.c?id=af7b81b5408da8c56bb22dd11679f2b4024a45c8#n1128 And there's a bit of documentation about how it can be used with QEMU here: https://www.qemu.org/docs/master/system/devices/net.html#using-passt-as-the-user-mode-network-stack Let me know if you have any question! I'll also tag @vivier, the author of the vhost-user implementation in passt, for good measure. |
It's fine for this PR to stand as-is. This can be used as a base for another series adding the network implementation. |
include/libkrun.h
Outdated
| * -ENOENT - Context doesn't exist | ||
| * -ENOTSUP - vhost-user support not compiled in | ||
| */ | ||
| int32_t krun_set_vhost_user_rng(uint32_t ctx_id, const char *socket_path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With possible extensions in mind, is there any way this can be generalized to something like:
int32_t krun_add_vhost_user_device(uint32_t ctx_id, ...);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think this is a good point actually. For things like virtio-rng, and the mentioned virtio-net (passt) I think this should be rather simple - basically just configuring the number and sizes of queues and the device ID actually.
(some things would be more tricky to generalize, I can think of virtio-gpu, any device that requires shared memory regions ( that is being worked on: rust-vmm/vhost#339 ).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can define a struct vhost_user_params that is essentially a generic pointer to some data based on the device ID? This can provide the API extensibility needed for eventually adding more complicated devices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm currently refactoring it to do something like this:
int32_t krun_add_vhost_user_device(uint32_t ctx_id,
krun_vhost_user_device_type device_type,
const char *socket_path,
uint16_t num_queues,
uint16_t queue_size);
For RNG specifically, you'd call it like: krun_add_vhost_user_device(ctx, KRUN_VHOST_USER_DEVICE_RNG, "/path/to/rng.sock", 0, 0);
The 0s would use defaults (1 queue, 256 size for RNG).
Other devices like sound, net, can, etc. could be added later with device-specific APIs as needed, for more complex device like gpu we could extend this with a void *device_config parameter for device-specific config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm currently refactoring it to do something like this:
int32_t krun_add_vhost_user_device(uint32_t ctx_id, krun_vhost_user_device_type device_type, const char *socket_path, uint16_t num_queues, uint16_t queue_size);
I think it would be nice to support configuring different queue size for each queue. So just have uint16_t *queue_sizes I guess.
Another thing I would consider adding is a const char *name that would correspond to the VirtioDevice::device_name - seems like it would be helpful for debugging (we currently just use it for logging).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah make sense, I'll add those Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably remove the default (1 queue, 256 size) values and just make it required for the user to configure the virtqueues explicitly.
krun_add_vhost_user_device(ctx, KRUN_VHOST_USER_DEVICE_RNG, "/path/to/rng.sock", 1, 256);
If they submit something like:
krun_add_vhost_user_device(ctx, KRUN_VHOST_USER_DEVICE_RNG, "/path/to/rng.sock", 0, 0);
It should fail with something like Invalid virtqueue configuration or something of the sort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to support configuring different queue size for each queue. So just have uint16_t *queue_sizes I guess.
I've implemented the queue_sizes array change as suggested, so each queue can have a different size.
Another thing I would consider adding is a const char *name that would correspond to the VirtioDevice::device_name - seems like it would be helpful for debugging (we currently just use it for logging).
Done! Added the name parameter (NULL means auto-generate from device_type)
I'd probably remove the default (1 queue, 256 size) values and just make it required for the user to configure the virtqueues explicitly.
I'd rather keep the defaults (0, NULL = use device defaults) reasons being
- Most users won't know what queue configuration to use. For simple devices like
rng, forcing them to specifykrun_add_vhost_user_device(ctx, KRUN_VHOST_USER_DEVICE_RNG, socket, 1, sizes)means they need to know RNG uses 1 queue, and that 256 is a reasonable queue size, and also have to write boilerplate for every device. - When we add more device types e.g (sound, CAN), each will have different default queue configs, adding these in documentation users would need to look up and specify the correct values for each device type in their code.
The current API supports both use cases - simple with defaults, and advanced with full control. @tylerfanelli WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented the queue_sizes array change as suggested, so each queue can have a different size.
👍
Done! Added the name parameter (NULL means auto-generate from device_type)
👍
Most users won't know what queue configuration to use. For simple devices like rng, forcing them to specify krun_add_vhost_user_device(ctx, KRUN_VHOST_USER_DEVICE_RNG, socket, 1, sizes) means they need to know RNG uses 1 queue, and that 256 is a reasonable queue size, and also have to write boilerplate for every device.
I agree that having user know that is a bit annoying (and e.g. QEMU doesn't force the user to know that). However I quite dislike using 0 for the number of queues and NULL for sizes. Instead I would much prefer to #define the suggested defaults in the header.
This way its quite easy for the user to see what the default value is without studying the docs or even worse having to look it up in libkrun source code. I think this is a reasonable middle ground.
#define KRUN_DEVICE_RNG_NUM_QUEUES 1
#define KRUN_DEVICE_RNG_DEFAULT_QUEUE_SIZES (uint16_t[]){256}
krun_add_vhost_user_device(ctx,
KRUN_VHOST_USER_DEVICE_RNG,
"vhost-user-rng",
socket,
KRUN_DEVICE_RNG_NUM_QUEUES,
KRUN_DEVICE_RNG_DEFAULT_QUEUE_SIZES
);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to @mtjhrc's suggestion. With that, the user still has to explicitly configure the number of queues and their size, but they are still "known" default values that the user doesn't need prior knowledge of. I agree that is a good middle ground.
Adding support for virtio-net over vhost doesn't require any changes to any of our networking code. Our All the user would have to do to enable it is:
|
9cfb1ea to
fe4e9ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we switched to a generic API for attaching vhost-user devices, at this point I think we I would remove the specific VhostUserRng altogether. Instead just instantiate the VhostUserDevice with the arguments obtained from user.
While it's possible we might eventually need some device specific front-ends, for devices that don't require anything special this just introduces unnecessary boilerplate.
| typedef enum { | ||
| KRUN_VHOST_USER_DEVICE_RNG = 4, | ||
| KRUN_VHOST_USER_DEVICE_SND = 25, | ||
| KRUN_VHOST_USER_DEVICE_CAN = 36, | ||
| } krun_vhost_user_device_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to this point we have avoided using C enums in the API.
Technically they don't have a very stable ABI, the size of the backing integer is up to the compiler and compiler options - this could be anything from uint8_t to int64_t ( though unlikely, and typically its just a 32-bit "int") .
I would suggest to stay on the safe side and just #define the values
Btw.:
C++, C23 and Rust allow you to specify the backing integer type for enums, but that's not an option for us
| match device_config.device_type { | ||
| VIRTIO_ID_RNG => { | ||
| let rng = Arc::new(Mutex::new( | ||
| devices::virtio::VhostUserRng::new( | ||
| &device_config.socket_path, | ||
| device_config.name.clone(), | ||
| device_config.num_queues, | ||
| &device_config.queue_sizes, | ||
| ) | ||
| .map_err(|e| RegisterRngDevice(device_manager::mmio::Error::VhostUserDevice(e)))?, | ||
| )); | ||
|
|
||
| let id = String::from(rng.lock().unwrap().id()); | ||
| attach_mmio_device(vmm, id, intc.clone(), rng).map_err(RegisterRngDevice)?; | ||
| } | ||
| VIRTIO_ID_SND | VIRTIO_ID_CAN => { | ||
| // TODO: Implement when vhost-user sound/CAN device wrappers are added | ||
| return Err(RegisterRngDevice( | ||
| device_manager::mmio::Error::VhostUserDevice(std::io::Error::new( | ||
| std::io::ErrorKind::Unsupported, | ||
| format!( | ||
| "Vhost-user device type {} not yet implemented", | ||
| device_config.device_type | ||
| ), | ||
| )), | ||
| )); | ||
| } | ||
| _ => { | ||
| return Err(RegisterRngDevice( | ||
| device_manager::mmio::Error::VhostUserDevice(std::io::Error::new( | ||
| std::io::ErrorKind::InvalidInput, | ||
| format!( | ||
| "Unknown vhost-user device type {}", | ||
| device_config.device_type | ||
| ), | ||
| )), | ||
| )); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say we shouldn't care what the VIRTIO_ID_* is at all except for generating the string name for the device, and we should just pass along whatever the user gives us to a generic vhost-user device wrapper.
If we don't know what the device ID is for and user didn't provide a name, I would just generate a name for the device - something like "vhost-user-uknown-40" (where 40 is the user provided id).
Also, I think we should just allow unknown/potentially invalid device IDs to be passed - the user might want to run a custom kernel in the guest which understands their custom device and I don't see a good reason to prevent them.
Implement vhost-user support for connecting to external virtio device backends running in separate processes. Add vhost-user feature flag, vhost dependency, and krun_add_vhost_user_device() generalized API for adding vhost-user devices. Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Add memfd-backed memory region creation to enable memory sharing with vhost-user backends via FD passing. When vhost-user is enabled, all guest RAM regions are created with memfd backing instead of anonymous mmap. This lays the groundwork for vhost-user device support while maintaining backward compatibility such that the VM boots normally with standard memory when vhost-user is not configured. Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
| if protocol_features.contains(VhostUserProtocolFeatures::MQ) { | ||
| our_protocol_features |= VhostUserProtocolFeatures::MQ; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked up the spec and I feel like this is not fully correct. Our frontend advertises that it supports VHOST_USER_PROTOCOL_F_MQ, that is fine. But that feature means the frontend should ask the backend for the number of queues (VhostUserFrontend::get_queue_num), problem is that the backend might not support this feature so we can't rely on this.
This opens up some problems:
- if the user provided
0for the number of queues then that's easy I think - we should create however many queues the device requests (if the device doesn't support this feature than this should be an error as we can't know...) - if the user said the device should have e.g. 2 queues and the device says there should be e.g. 3, then I guess this should be an error? Maybe we could just create
3queues andlog::warn!, but that's a bit sketchy - I'm not sure what should be the sizes of the created queues. This doesn't neatly fit into the user providing an array. Does anyone have some suggestion how to tackle that?
Implement generic vhost-user device wrapper with connection, feature negotiation, and Guest physical address(GPA) to Virtual address(VA) translation. Supports protocol feature negotiation (CONFIG, MQ) and forwards backend interrupts to guest. Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Implements a vhost-user RNG device as a thin wrapper around the core VhostUserDevice. The VMM now switches between the standard RNG device and vhost-user RNG depending on whether a socket path is configured via krun_add_vhost_user_device() This allows us to use the RNG device from the rust-vmm vhost-device running in a separate process for better isolation and flexibility. Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
Adds --vhost-user-rng command line option to specify a vhost-user RNG backend socket path. When provided, the VM uses the external vhost-user RNG device instead of the built-in virtio-rng implementation. Example usage: ./examples/chroot_vm \ --vhost-user-rng=/tmp/vhost-rng.sock0 \ / /bin/sh -c "head -c 32 /dev/hwrng | xxd" Signed-off-by: Dorinda Bassey <dbassey@redhat.com>
fe4e9ba to
b3bfb2e
Compare
This PR adds vhost-user frontend support to libkrun, enabling virtio devices to run in separate processes using rust-vmm's vhost-device backends for improved isolation and flexibility. The RNG frontend is implemented as the initial use case, and is designed to easily support additional devices (sound, GPU, can, etc).