Virtio-Balloon Deep Dive and How to Disable It on a VPS

Preface#

I recently bought a VPS whose provider had Virtio-Balloon enabled by default, which made the system very unstable. So I decided to dig into how Virtio-Balloon actually works.

The Virtio-Balloon driver acts like a memory “balloon” that can be used to dynamically adjust guest memory.

Feature Overview#

The Virtio-Balloon driver works by allocating memory inside the guest VM, then notifying qemu of that allocation. qemu then frees the corresponding memory on the host side so that other VMs can use it, achieving memory overcommit and reuse.
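The accounting behind this can be illustrated with a small Python model. It is only a sketch: the `ToyHost` and `ToyGuest` classes and their fields are invented for illustration and do not correspond to real qemu or kernel structures.

```python
from dataclasses import dataclass


@dataclass
class ToyGuest:
    """Hypothetical guest: total_mib is what the VM was booted with."""
    total_mib: int
    balloon_mib: int = 0  # memory currently held by the balloon driver

    @property
    def usable_mib(self) -> int:
        # Pages inside the balloon are unavailable to guest applications.
        return self.total_mib - self.balloon_mib


@dataclass
class ToyHost:
    """Hypothetical host that tracks memory it can hand to other VMs."""
    free_mib: int

    def inflate(self, guest: ToyGuest, mib: int) -> None:
        # The guest driver allocates pages and reports them; the host
        # then reclaims the backing memory.
        mib = min(mib, guest.usable_mib)
        guest.balloon_mib += mib
        self.free_mib += mib

    def deflate(self, guest: ToyGuest, mib: int) -> None:
        # Giving memory back only works if the host still has it free.
        mib = min(mib, guest.balloon_mib, self.free_mib)
        guest.balloon_mib -= mib
        self.free_mib -= mib


host = ToyHost(free_mib=1024)
guest = ToyGuest(total_mib=8192)
host.inflate(guest, 2048)
print(guest.usable_mib, host.free_mib)  # 6144 3072
```

Inflating the balloon by 2048 MiB shrinks what the guest can use and grows what the host can lend out by the same amount; no memory is created along the way.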

To enable this feature, you must add the Virtio-Balloon backend device at VM startup, and install the Virtio-Balloon driver inside the guest.

Installation#

  • Add the device. If you use libvirt, add the following XML snippet to the VM definition:
<devices>
<memballoon model='virtio'>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
<stats period='10'/>
</memballoon>
</devices>

For qemu command-line, add the following device:

-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
  • Install the driver. For Windows, there are plenty of tutorials online, so we’ll skip that here. On Linux, the Virtio-Balloon driver has been in the kernel for a long time, so mainstream distributions usually ship it by default.
[root@localhost _posts]# modinfo virtio-balloon
filename: /lib/modules/3.10.0-327.el7.x86_64/kernel/drivers/virtio/virtio_balloon.ko
license: GPL
description: Virtio balloon driver
rhelversion: 7.2
srcversion: F2D65C53D0AFD06A3668942
alias: virtio:d00000005v*
depends: virtio,virtio_ring
intree: Y
vermagic: 3.10.0-327.el7.x86_64 SMP mod_unload modversions
signer: CentOS Linux kernel signing key
sig_key: 79:AD:88:6A:11:3C:A0:22:35:26:33:6C:0F:82:5B:8A:94:29:6A:B3
sig_hashalgo: sha256

Usage#

With the above in place, once the VM is started you can begin using the ballooning feature.

  • Check memory usage via libvirt:
virsh # dommemstat test

Check via qemu’s HMP monitor:

info balloon
  • Set the target memory size for the VM. Note this directly sets the current usable memory of the VM. For example, if the VM initially has 8G and you want to reclaim 2G, you should set the current memory to 6G.

On the libvirt side, where setmem takes the size in KiB by default, so 6 GiB is 6291456:

virsh # setmem test 6291456

In qemu’s HMP, where the balloon command takes the size in MiB:

balloon 6144
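The unit handling is easy to get wrong, so here is a minimal sketch of the arithmetic for the 8G-minus-2G example. It assumes that virsh setmem interprets a bare number as KiB and that the HMP balloon command takes MiB; check both against the documentation for your installed versions. The function name `balloon_targets` is made up for this example.

```python
def balloon_targets(initial_gib: int, reclaim_gib: int) -> dict:
    """Return the target size in the units each tool expects by default."""
    if not 0 <= reclaim_gib <= initial_gib:
        raise ValueError("reclaim must be between 0 and the initial size")
    target_gib = initial_gib - reclaim_gib
    return {
        "virsh_setmem_kib": target_gib * 1024 * 1024,  # assumed: virsh defaults to KiB
        "hmp_balloon_mib": target_gib * 1024,          # assumed: HMP balloon takes MiB
    }


targets = balloon_targets(initial_gib=8, reclaim_gib=2)
print(f"virsh setmem test {targets['virsh_setmem_kib']}")  # virsh setmem test 6291456
print(f"balloon {targets['hmp_balloon_mib']}")             # balloon 6144
```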

Code Analysis#

Let’s look at Virtio-Balloon from the code perspective. Starting from the Linux driver side: compared to other virtio drivers, the balloon driver is relatively small and easy to read. All its code lives in drivers/virtio/virtio_balloon.c.

First, look at the definition of virtio_balloon_driver. It’s easy to see there aren’t many callbacks: virtballoon_probe (called on driver load), virtballoon_remove (on unload), and virtballoon_changed. That strongly suggests most of the core logic is in virtballoon_changed.

static struct virtio_driver virtio_balloon_driver = {
.feature_table = features,
.feature_table_size = ARRAY_SIZE(features),
.driver.name = KBUILD_MODNAME,
.driver.owner = THIS_MODULE,
.id_table = id_table,
.probe = virtballoon_probe,
.remove = virtballoon_remove,
.config_changed = virtballoon_changed,
#ifdef CONFIG_PM_SLEEP
.freeze = virtballoon_freeze,
.restore = virtballoon_restore,
#endif
};

The virtballoon_changed function itself is quite simple: unless updates have been stopped, it just queues the update_balloon_size_work work item, whose handler is update_balloon_size_func.

static void virtballoon_changed(struct virtio_device *vdev)
{
struct virtio_balloon *vb = vdev->priv;
unsigned long flags;
spin_lock_irqsave(&vb->stop_update_lock, flags);
if (!vb->stop_update)
queue_work(system_freezable_wq, &vb->update_balloon_size_work);
spin_unlock_irqrestore(&vb->stop_update_lock, flags);
}

To really understand this work item, we need to look at the Virtio-Balloon initialization function virtballoon_probe. The abridged snippet below shows that, on initialization, it registers the stats_request callback to handle stats requests coming from the backend. It also initializes two work items: update_balloon_size_work, whose handler update_balloon_size_func adjusts memory, and update_balloon_stats_work, whose handler update_balloon_stats_func updates memory statistics.

static int virtballoon_probe(struct virtio_device *vdev)
{
struct virtqueue *vqs[3];
vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request };
static const char * const names[] = { "inflate", "deflate", "stats" };
int err, nvqs;
nvqs = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names,
NULL);
INIT_WORK(&vb->update_balloon_stats_work, update_balloon_stats_func);
INIT_WORK(&vb->update_balloon_size_work, update_balloon_size_func);
}

Let’s first introduce the basic update_balloon_size_func functionality. The code is very straightforward: it gets the diff between the current and target size, then either inflates or deflates the balloon, and finally calls update_balloon_size to refresh the current memory info.

static void update_balloon_size_func(struct work_struct *work)
{
struct virtio_balloon *vb;
s64 diff;
vb = container_of(work, struct virtio_balloon,
update_balloon_size_work);
diff = towards_target(vb);
if (diff > 0)
diff -= fill_balloon(vb, diff);
else if (diff < 0)
diff += leak_balloon(vb, -diff);
update_balloon_size(vb);
if (diff)
queue_work(system_freezable_wq, work);
}
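The requeue-until-done behaviour of this work item can be modelled in a few lines of Python. This is a simplified sketch, not the driver's logic verbatim: it assumes every allocation succeeds, that one page equals one balloon page, and that a single run handles at most 256 pages (the assumed size of the vb->pfns array). The `resize` helper is hypothetical.

```python
ARRAY_PFNS_MAX = 256  # assumed per-call batch limit (size of vb->pfns)


def towards_target(target_pages: int, num_pages: int) -> int:
    """Positive: inflate by that many pages; negative: deflate."""
    return target_pages - num_pages


def resize(target_pages: int, num_pages: int) -> tuple[int, int]:
    """Repeat the work item until the balloon reaches the target.

    Returns (final balloon size in pages, number of work-item runs).
    """
    runs = 0
    while True:
        runs += 1
        diff = towards_target(target_pages, num_pages)
        step = min(abs(diff), ARRAY_PFNS_MAX)  # one array's worth per run
        if diff > 0:
            num_pages += step   # stands in for fill_balloon
            diff -= step
        elif diff < 0:
            num_pages -= step   # stands in for leak_balloon
            diff += step
        if diff == 0:           # otherwise the work item requeues itself
            return num_pages, runs


print(resize(target_pages=1000, num_pages=0))  # (1000, 4)
```

Inflating by 1000 pages takes four runs (256 + 256 + 256 + 232), which is why the function requeues itself whenever diff is still nonzero.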

fill_balloon and leak_balloon are quite similar, so we’ll just look at fill_balloon. It calls balloon_page_enqueue to add pages to the balloon, then calls tell_host to send their page frame numbers to the host over the inflate virtqueue. balloon_page_enqueue is a generic helper that the kernel provides for balloon drivers to allocate and track ballooned pages; it lives in mm/balloon_compaction.c. We won’t go deeper here, since this article focuses on Virtio-Balloon itself; we’ll revisit this function when we dive into memory management internals.

static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
{
struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
unsigned num_allocated_pages;
/* We can only do one array worth at a time. */
num = min(num, ARRAY_SIZE(vb->pfns));
mutex_lock(&vb->balloon_lock);
for (vb->num_pfns = 0; vb->num_pfns < num;
vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
struct page *page = balloon_page_enqueue(vb_dev_info);
if (!page) {
dev_info_ratelimited(&vb->vdev->dev,
"Out of puff! Can't get %u pages\n",
VIRTIO_BALLOON_PAGES_PER_PAGE);
/* Sleep for at least 1/5 of a second before retry. */
msleep(200);
break;
}
set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
if (!virtio_has_feature(vb->vdev,
VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
adjust_managed_page_count(page, -1);
}
num_allocated_pages = vb->num_pfns;
/* Did we get any? */
if (vb->num_pfns != 0)
tell_host(vb, vb->inflate_vq);
mutex_unlock(&vb->balloon_lock);
return num_allocated_pages;
}
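Two details of fill_balloon are worth seeing in isolation: it stops early when the allocator runs dry, and it only lowers the guest's managed page count when VIRTIO_BALLOON_F_DEFLATE_ON_OOM is not negotiated. The toy function below sketches both, assuming one page per balloon page and a batch limit of 256; the `free_pages` counter is a stand-in for the real page allocator.

```python
def fill_balloon(want: int, free_pages: int, deflate_on_oom: bool,
                 batch_max: int = 256) -> dict:
    """Toy version of one fill_balloon call.

    want        -- pages the caller asks for
    free_pages  -- pages the (hypothetical) allocator can still provide
    Returns how many pages were ballooned and the change in the guest's
    managed page count.
    """
    want = min(want, batch_max)          # only one array's worth per call
    got = 0
    managed_delta = 0
    while got < want:
        if free_pages == 0:              # "Out of puff!": stop early
            break
        free_pages -= 1
        got += 1
        if not deflate_on_oom:
            managed_delta -= 1           # guest's visible total shrinks
    return {"ballooned": got, "managed_delta": managed_delta}


print(fill_balloon(want=500, free_pages=100, deflate_on_oom=False))
# {'ballooned': 100, 'managed_delta': -100}
```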

With the basic functionality covered, let’s turn to the driver’s other feature: periodic memory monitoring. As we saw during initialization, when the backend requests statistics through the stats virtqueue, the stats_request callback queues another work item, update_balloon_stats_work, whose handler is update_balloon_stats_func.

static void stats_request(struct virtqueue *vq)
{
struct virtio_balloon *vb = vq->vdev->priv;
spin_lock(&vb->stop_update_lock);
if (!vb->stop_update)
queue_work(system_freezable_wq, &vb->update_balloon_stats_work);
spin_unlock(&vb->stop_update_lock);
}

update_balloon_stats_func mainly calls stats_handle_request:

static void update_balloon_stats_func(struct work_struct *work)
{
struct virtio_balloon *vb;
vb = container_of(work, struct virtio_balloon,
update_balloon_stats_work);
stats_handle_request(vb);
}

stats_handle_request refreshes the statistics in vb->stats and then posts that buffer to the backend through the stats virtqueue; vb itself is passed only as the token that identifies the buffer.

static void stats_handle_request(struct virtio_balloon *vb)
{
struct virtqueue *vq;
struct scatterlist sg;
unsigned int len;
update_balloon_stats(vb);
vq = vb->stats_vq;
if (!virtqueue_get_buf(vq, &len))
return;
sg_init_one(&sg, vb->stats, sizeof(vb->stats));
virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
virtqueue_kick(vq);
}

update_balloon_stats is what actually samples the guest memory usage.

static void update_balloon_stats(struct virtio_balloon *vb)
{
unsigned long events[NR_VM_EVENT_ITEMS];
struct sysinfo i;
int idx = 0;
long available;
all_vm_events(events);
si_meminfo(&i);
available = si_mem_available();
update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_IN,
pages_to_bytes(events[PSWPIN]));
update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_OUT,
pages_to_bytes(events[PSWPOUT]));
update_stat(vb, idx++, VIRTIO_BALLOON_S_MAJFLT, events[PGMAJFAULT]);
update_stat(vb, idx++, VIRTIO_BALLOON_S_MINFLT, events[PGFAULT]);
update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMFREE,
pages_to_bytes(i.freeram));
update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMTOT,
pages_to_bytes(i.totalram));
update_stat(vb, idx++, VIRTIO_BALLOON_S_AVAIL,
pages_to_bytes(available));
}
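Each update_stat call fills one (tag, value) record in the stats buffer. The sketch below shows what such a buffer looks like on the wire, assuming packed records with a 16-bit tag followed by a 64-bit value in little-endian order (legacy devices use guest-native byte order instead). The readable names in `TAGS` are chosen for this example; the numbers follow the order of the VIRTIO_BALLOON_S_* constants used above.

```python
import struct

# Tag numbers in the order of the VIRTIO_BALLOON_S_* constants.
TAGS = {
    "swap_in": 0, "swap_out": 1, "major_faults": 2, "minor_faults": 3,
    "free_memory": 4, "total_memory": 5, "available_memory": 6,
}
STAT = struct.Struct("<HQ")  # little-endian u16 tag + u64 value, no padding


def pack_stats(values: dict) -> bytes:
    """Serialize {name: value} into consecutive (tag, val) records."""
    return b"".join(STAT.pack(TAGS[name], val) for name, val in values.items())


buf = pack_stats({"free_memory": 2 << 30, "total_memory": 8 << 30})
print(len(buf))  # 20
```

Two records take 20 bytes because each one is exactly 10 bytes long.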

At this point we’ve covered the key parts of the Virtio-Balloon driver implementation.

The qemu backend code is relatively simple. During balloon device initialization, it calls qemu_add_balloon_handler to register the callbacks that will be triggered by QMP commands. It also initializes the virtqueues ivq, dvq, and svq to receive requests and stats, and then invokes the relevant callbacks.

static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
{
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
VirtIOBalloon *s = VIRTIO_BALLOON(dev);
int ret;
virtio_init(vdev, "virtio-balloon", VIRTIO_ID_BALLOON,
sizeof(struct virtio_balloon_config));
ret = qemu_add_balloon_handler(virtio_balloon_to_target,
virtio_balloon_stat, s);
if (ret < 0) {
error_setg(errp, "Adding balloon handler failed");
virtio_cleanup(vdev);
return;
}
s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
reset_stats(s);
register_savevm(dev, "virtio-balloon", -1, 1,
virtio_balloon_save, virtio_balloon_load, s);
object_property_add(OBJECT(dev), "guest-stats", "guest statistics",
balloon_stats_get_all, NULL, NULL, s, NULL);
object_property_add(OBJECT(dev), "guest-stats-polling-interval", "int",
balloon_stats_get_poll_interval,
balloon_stats_set_poll_interval,
NULL, s, NULL);
}

Using the memory stats refresh path as an example (memory grow/shrink is similar): after registering the guest-stats-polling-interval property, you can configure a polling period. Once set, qemu starts a periodic timer balloon_stats_poll_cb to query memory stats in real time.

static void balloon_stats_set_poll_interval(Object *obj, struct Visitor *v,
void *opaque, const char *name,
Error **errp)
{
VirtIOBalloon *s = opaque;
Error *local_err = NULL;
int64_t value;
visit_type_int(v, &value, name, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
if (value < 0) {
error_setg(errp, "timer value must be greater than zero");
return;
}
if (value > UINT32_MAX) {
error_setg(errp, "timer value is too big");
return;
}
if (value == s->stats_poll_interval) {
return;
}
if (value == 0) {
/* timer=0 disables the timer */
balloon_stats_destroy_timer(s);
return;
}
if (balloon_stats_enabled(s)) {
/* timer interval change */
s->stats_poll_interval = value;
balloon_stats_change_timer(s, value);
return;
}
/* create a new timer */
g_assert(s->stats_timer == NULL);
s->stats_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, balloon_stats_poll_cb, s);
s->stats_poll_interval = value;
balloon_stats_change_timer(s, 0);
}
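The branches of this setter are easier to follow as a pure function. The sketch below mirrors them under two assumptions: polling counts as enabled when the stored interval is greater than zero, and destroying the timer resets the interval to 0. The action strings are made-up labels, not qemu identifiers.

```python
UINT32_MAX = 2**32 - 1


def set_poll_interval(current: int, value: int) -> tuple[int, str]:
    """Return (new interval, action) for a requested interval in seconds.

    The action strings are made-up labels for the branches in the C code.
    """
    if value < 0:
        raise ValueError("timer value must be greater than zero")
    if value > UINT32_MAX:
        raise ValueError("timer value is too big")
    if value == current:
        return current, "unchanged"
    if value == 0:
        return 0, "destroy timer"      # 0 disables polling
    if current > 0:
        return value, "reschedule"     # polling already enabled
    return value, "create timer"       # first time polling is enabled


print(set_poll_interval(0, 10))   # (10, 'create timer')
print(set_poll_interval(10, 2))   # (2, 'reschedule')
print(set_poll_interval(2, 0))    # (0, 'destroy timer')
```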

virtio_balloon_receive_stats reads the stats records we discussed earlier from the virtqueue and stores each value in s->stats, indexed by tag. If polling is enabled (balloon_stats_enabled), it then re-arms the timer for the next poll.

static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
{
VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
VirtQueueElement *elem = &s->stats_vq_elem;
VirtIOBalloonStat stat;
size_t offset = 0;
qemu_timeval tv;
if (!virtqueue_pop(vq, elem)) {
goto out;
}
/* Initialize the stats to get rid of any stale values. This is only
* needed to handle the case where a guest supports fewer stats than it
* used to (ie. it has booted into an old kernel).
*/
reset_stats(s);
while (iov_to_buf(elem->out_sg, elem->out_num, offset, &stat, sizeof(stat))
== sizeof(stat)) {
uint16_t tag = virtio_tswap16(vdev, stat.tag);
uint64_t val = virtio_tswap64(vdev, stat.val);
offset += sizeof(stat);
if (tag < VIRTIO_BALLOON_S_NR)
s->stats[tag] = val;
}
s->stats_vq_offset = offset;
if (qemu_gettimeofday(&tv) < 0) {
fprintf(stderr, "warning: %s: failed to get time of day\n", __func__);
goto out;
}
s->stats_last_update = tv.tv_sec;
out:
if (balloon_stats_enabled(s)) {
balloon_stats_change_timer(s, s->stats_poll_interval);
}
}
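The decoding loop in the middle of this function can be sketched in Python as follows. It assumes packed little-endian records (16-bit tag, 64-bit value) and that the host knows 7 tags; both are assumptions that depend on the device mode and the qemu version. `receive_stats` is a hypothetical stand-in for the C loop, not a qemu API.

```python
import struct

STAT = struct.Struct("<HQ")   # assumed little-endian (tag, val) records
S_NR = 7                      # assumed number of tags the host knows about


def receive_stats(buf: bytes) -> dict:
    """Decode a stats buffer, ignoring tags the host does not know."""
    stats = {}                           # fresh dict, like reset_stats()
    offset = 0
    while offset + STAT.size <= len(buf):
        tag, val = STAT.unpack_from(buf, offset)
        offset += STAT.size
        if tag < S_NR:
            stats[tag] = val
    return stats


sample = STAT.pack(4, 1024) + STAT.pack(5, 4096) + STAT.pack(99, 1)
print(receive_stats(sample))  # {4: 1024, 5: 4096}
```

The record with tag 99 is dropped, which is how a host tolerates a guest that reports statistics it does not understand.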

Summary#

We’ve now walked through the implementation of Virtio-Balloon and its periodic memory statistics feature. Memory overcommit via Virtio-Balloon has some inherent issues:

  • The guest can perceive memory size changes. Ballooning is implemented in the kernel and works by changing the visible amount of memory, which can be unfriendly to applications.

  • Effective memory overcommit requires real-time monitoring: when a guest’s memory usage rises, the system must promptly return memory, which is hard to do robustly at the hypervisor level alone.

  • Balloon-based overcommit doesn’t truly provide extra memory. It’s essentially just “robbing Peter to pay Paul”: when all guests are under heavy memory load at the same time, their combined usage still can’t exceed the host’s physical memory.
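The last point comes down to simple arithmetic. As a rough illustration with made-up numbers, the hypothetical helper below compares the host's physical memory with what the guests actually need at a given moment:

```python
def balloon_plan(host_mib: int, demands_mib: list[int]) -> str:
    """Say whether ballooning can satisfy every guest's current demand."""
    total = sum(demands_mib)
    if total <= host_mib:
        return f"fits: {host_mib - total} MiB spare"
    return f"short by {total - host_mib} MiB"


# Three guests sold as 8 GiB each on a 16 GiB host (numbers are made up).
print(balloon_plan(16384, [2048, 4096, 3072]))   # fits: 7168 MiB spare
print(balloon_plan(16384, [8192, 8192, 8192]))   # short by 8192 MiB
```

Ballooning helps only in the first case, where idle guests can lend memory to busy ones; in the second case there is nothing left to lend.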

How to Disable Virtio-Balloon#

  1. Check whether the module is loaded: use lsmod to list loaded modules and see whether virtio_balloon is present. Run:

    _**lsmod | grep virtio_balloon**_

    If there is no output, the module is not loaded.

  2. Unload the module with rmmod: if virtio_balloon is loaded, you can unload it with:

    **_sudo rmmod virtio_balloon_**

    This attempts to remove the virtio_balloon module. If it succeeds, you shouldn’t see any error messages.
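If you want to script the check instead of reading lsmod output by eye, a minimal Python sketch is shown below. It parses text in the /proc/modules format, which is what lsmod reads; the sample text is illustrative, not real output, and `is_module_loaded` is a name invented for this example. Note that a driver built directly into the kernel does not appear in this list and cannot be unloaded with rmmod.

```python
def is_module_loaded(name: str, proc_modules: str) -> bool:
    """Check /proc/modules-style text for a module (first field per line)."""
    wanted = name.replace("-", "_")      # loaded modules use underscores
    for line in proc_modules.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


# Illustrative sample in the /proc/modules format, not real output.
sample = (
    "virtio_balloon 24576 0 - Live 0x0000000000000000\n"
    "virtio_net 57344 0 - Live 0x0000000000000000\n"
)
print(is_module_loaded("virtio_balloon", sample))  # True

# On a real Linux guest:
# with open("/proc/modules") as f:
#     print(is_module_loaded("virtio_balloon", f.read()))
```

Unlike a plain grep, this matches the module name exactly, so a module whose name merely contains virtio_balloon would not produce a false positive.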

How to Prevent It from Loading at Boot#

  1. Go to the /etc/modprobe.d/ directory by running:

    **_cd /etc/modprobe.d/_**

  2. Create a new configuration file, e.g. blacklist-virtio-balloon.conf:

    **_sudo nano /etc/modprobe.d/blacklist-virtio-balloon.conf_**

  3. Add a rule to blacklist the module. In the file, add the following line:

    **_blacklist virtio_balloon_**

    This tells modprobe not to load virtio_balloon automatically when the balloon device is detected at boot.

  4. Save and close the file: press Ctrl + X, then Y to confirm saving, then Enter to accept the file name.

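  5. Update initramfs so the blacklist also applies in early boot. On Debian or Ubuntu, run the command below; on RHEL-family systems such as CentOS, the equivalent is dracut -f: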

    **_sudo update-initramfs -u_**
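To double-check that the rule is in place, the file can be inspected with a few lines of Python. This is a deliberately simplified parser, written for this example: it treats everything after a `#` as a comment and only looks at `blacklist` lines, so it is not a full implementation of the modprobe.d format.

```python
def is_blacklisted(module: str, conf_text: str) -> bool:
    """Return True if modprobe.d-style text blacklists the module."""
    wanted = module.replace("-", "_")
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            if parts[1].replace("-", "_") == wanted:
                return True
    return False


conf = "# keep the balloon driver from auto-loading\nblacklist virtio_balloon\n"
print(is_blacklisted("virtio_balloon", conf))  # True

# On the guest itself:
# with open("/etc/modprobe.d/blacklist-virtio-balloon.conf") as f:
#     print(is_blacklisted("virtio_balloon", f.read()))
```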

Virtio-Balloon Deep Dive and How to Disable It on a VPS
https://catcat.blog/en/virtio-balloon.html
Author
猫猫博客
Published on
2023-09-19
License
CC BY-NC-SA 4.0