<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/vhost/net.c, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-09-04T13:31:44+00:00</updated>
<entry>
<title>vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()</title>
<updated>2025-09-04T13:31:44+00:00</updated>
<author>
<name>Nikolay Kuratov</name>
<email>kniv@yandex-team.ru</email>
</author>
<published>2025-08-05T13:09:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cbc00a76a5ff91098577d8aac321b00b6d139b75'/>
<id>urn:sha1:cbc00a76a5ff91098577d8aac321b00b6d139b75</id>
<content type='text'>
commit dd54bcf86c91a4455b1f95cbc8e9ac91205f3193 upstream.

When operating on struct vhost_net_ubuf_ref, the following execution
sequence is theoretically possible:
CPU0 is finalizing DMA operation                   CPU1 is doing VHOST_NET_SET_BACKEND
                             // ubufs-&gt;refcount == 2
vhost_net_ubuf_put()                               vhost_net_ubuf_put_wait_and_free(oldubufs)
                                                     vhost_net_ubuf_put_and_wait()
                                                       vhost_net_ubuf_put()
                                                         int r = atomic_sub_return(1, &amp;ubufs-&gt;refcount);
                                                         // r = 1
int r = atomic_sub_return(1, &amp;ubufs-&gt;refcount);
// r = 0
                                                      wait_event(ubufs-&gt;wait, !atomic_read(&amp;ubufs-&gt;refcount));
                                                      // no wait occurs here because condition is already true
                                                    kfree(ubufs);
if (unlikely(!r))
  wake_up(&amp;ubufs-&gt;wait);  // use-after-free

This leads to use-after-free on ubufs access. This happens because CPU1
skips waiting for wake_up() when refcount is already zero.

To prevent that, use a read-side RCU critical section in vhost_net_ubuf_put(),
as suggested by Hillf Danton. For this lock to take effect, free ubufs with
kfree_rcu().
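
As a rough illustration only (a toy Python model, not kernel code), the
interleaving above can be replayed step by step. The class Ubufs, its fields,
and the function replay() are hypothetical names invented for this sketch; the
"readers" counter merely stands in for CPUs inside an RCU read-side critical
section, and the deferred free stands in for kfree_rcu().

```python
class Ubufs:
    """Toy stand-in for struct vhost_net_ubuf_ref (hypothetical model)."""

    def __init__(self, refcount):
        self.refcount = refcount
        self.freed = False
        self.readers = 0           # CPUs inside a modelled rcu_read_lock() section
        self.free_pending = False  # modelled kfree_rcu() waiting for readers


def replay(use_rcu):
    """Replay the CPU0/CPU1 interleaving; return True if wake_up() touches freed memory."""
    ubufs = Ubufs(refcount=2)

    # CPU0 enters vhost_net_ubuf_put(); with the fix it first takes the read lock.
    if use_rcu:
        ubufs.readers += 1

    # CPU1: atomic_sub_return() yields r = 1.
    ubufs.refcount -= 1
    # CPU0: atomic_sub_return() yields r = 0.
    ubufs.refcount -= 1
    r0 = ubufs.refcount

    # CPU1: wait_event() sees refcount == 0 and does not sleep, then frees.
    if ubufs.refcount == 0:
        if use_rcu:
            ubufs.free_pending = True  # free deferred until readers leave
        else:
            ubufs.freed = True         # immediate kfree()

    # CPU0: the wake_up() call on the r == 0 path dereferences ubufs.
    use_after_free = (r0 == 0) and ubufs.freed

    # CPU0 leaves the read-side section; the deferred free may now run.
    if use_rcu:
        ubufs.readers -= 1
        if ubufs.free_pending and ubufs.readers == 0:
            ubufs.freed = True

    return use_after_free


print("without RCU:", replay(use_rcu=False))
print("with RCU:   ", replay(use_rcu=True))
```

In this model the free is postponed until CPU0 has left its read-side section,
so the wake_up() step never sees a freed object.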

Cc: stable@vger.kernel.org
Fixes: 0ad8b480d6ee9 ("vhost: fix ref cnt checking deadlock")
Reported-by: Andrey Ryabinin &lt;arbn@yandex-team.com&gt;
Suggested-by: Hillf Danton &lt;hdanton@sina.com&gt;
Signed-off-by: Nikolay Kuratov &lt;kniv@yandex-team.ru&gt;
Message-Id: &lt;20250805130917.727332-1-kniv@yandex-team.ru&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: extend ubuf_info callback to ops structure</title>
<updated>2024-04-22T23:21:35+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2024-04-19T11:08:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7ab4f16f9e2440e797eae88812f800458e5879d2'/>
<id>urn:sha1:7ab4f16f9e2440e797eae88812f800458e5879d2</id>
<content type='text'>
We'll need to associate additional callbacks with ubuf_info, so introduce
a structure holding ubuf_info callbacks. Apart from the smarter io_uring
notification management introduced in the next patches, it can be
used to generalise msg_zerocopy_put_abort() and also store
-&gt;sg_from_iter, which is currently passed in struct msghdr.

Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Link: https://lore.kernel.org/all/a62015541de49c0e2a8a0377a1d5d0a5aeb07016.1713369317.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost</title>
<updated>2024-03-19T15:57:39+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-19T15:57:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d95fcdf4961d27a3d17e5c7728367197adc89b8d'/>
<id>urn:sha1:d95fcdf4961d27a3d17e5c7728367197adc89b8d</id>
<content type='text'>
Pull virtio updates from Michael Tsirkin:

 - Per vq sizes in vdpa

 - Info query for block devices support in vdpa

 - DMA sync callbacks in vduse

 - Fixes, cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (35 commits)
  virtio_net: rename free_old_xmit_skbs to free_old_xmit
  virtio_net: unify the code for recycling the xmit ptr
  virtio-net: add cond_resched() to the command waiting loop
  virtio-net: convert rx mode setting to use workqueue
  virtio: packed: fix unmap leak for indirect desc table
  vDPA: report virtio-blk flush info to user space
  vDPA: report virtio-block read-only info to user space
  vDPA: report virtio-block write zeroes configuration to user space
  vDPA: report virtio-block discarding configuration to user space
  vDPA: report virtio-block topology info to user space
  vDPA: report virtio-block MQ info to user space
  vDPA: report virtio-block max segments in a request to user space
  vDPA: report virtio-block block-size to user space
  vDPA: report virtio-block max segment size to user space
  vDPA: report virtio-block capacity to user space
  virtio: make virtio_bus const
  vdpa: make vdpa_bus const
  vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min
  vDPA/ifcvf: get_max_vq_size to return max size
  virtio_vdpa: create vqs with the actual size
  ...
</content>
</entry>
<entry>
<title>vhost: Added pad cleanup if vnet_hdr is not present.</title>
<updated>2024-03-19T06:45:49+00:00</updated>
<author>
<name>Andrew Melnychenko</name>
<email>andrew@daynix.com</email>
</author>
<published>2024-01-15T19:48:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f6baca2d32ead51b10e15f1eef812d7c6a7b9a40'/>
<id>urn:sha1:f6baca2d32ead51b10e15f1eef812d7c6a7b9a40</id>
<content type='text'>
When QEMU is launched with vhost but without tap vnet_hdr,
vhost tries to copy vnet_hdr from the socket iter with size 0
to a page that may contain garbage.
That garbage can be interpreted as unpredictable values for
vnet_hdr.
This leads to some packets being dropped and, in some cases, to
the vhost routine stalling when vhost_net tries to process
packets and fails in a loop.

QEMU options:
  -netdev tap,vhost=on,vnet_hdr=off,...

Signed-off-by: Andrew Melnychenko &lt;andrew@daynix.com&gt;
Message-Id: &lt;20240115194840.1183077-1-andrew@daynix.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>vhost/net: remove vhost_net_page_frag_refill()</title>
<updated>2024-03-05T10:38:14+00:00</updated>
<author>
<name>Yunsheng Lin</name>
<email>linyunsheng@huawei.com</email>
</author>
<published>2024-02-28T09:30:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4051bd8129ac6c1faf447264aa7a8f91feb778b8'/>
<id>urn:sha1:4051bd8129ac6c1faf447264aa7a8f91feb778b8</id>
<content type='text'>
The page frag in vhost_net_page_frag_refill() uses the
'struct page_frag' from skb_page_frag_refill(), but its
implementation is similar to page_frag_alloc_align() now.

This patch removes vhost_net_page_frag_refill() by using
'struct page_frag_cache' instead of 'struct page_frag',
and allocating frag using page_frag_alloc_align().

The added benefit is not only that the page frag implementation
is unified a little, but also a performance boost of about 0.5%
when testing with the vhost_net_test introduced in the
last patch.

Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>page_frag: unify gfp bits for order 3 page allocation</title>
<updated>2024-03-05T10:38:14+00:00</updated>
<author>
<name>Yunsheng Lin</name>
<email>linyunsheng@huawei.com</email>
</author>
<published>2024-02-28T09:30:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4bc0d63a23957262b01b5848bd5e1739e98bbf36'/>
<id>urn:sha1:4bc0d63a23957262b01b5848bd5e1739e98bbf36</id>
<content type='text'>
Currently there seem to be three page frag implementations
that all try to allocate an order 3 page and, if that fails,
fall back to allocating an order 0 page. Each of them
allows the order 3 page allocation to fail under certain
conditions by using specific gfp bits.

The gfp bits for order 3 page allocation differ
between the implementations: __GFP_NOMEMALLOC is
or'd in to forbid access to emergency memory reserves in
__page_frag_cache_refill(), but it is not or'd in by the other
implementations; __GFP_DIRECT_RECLAIM is masked off to avoid
direct reclaim in vhost_net_page_frag_refill(), but it is
not masked off in __page_frag_cache_refill().

This patch unifies the gfp bits used by the different
implementations by or'ing in __GFP_NOMEMALLOC and masking off
__GFP_DIRECT_RECLAIM for order 3 page allocation to avoid
possible pressure on mm.
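
The unified policy can be pictured with a small Python model (a sketch only,
not kernel code). Flags are modelled as a set of names rather than real bit
masks, any other flags the kernel may add are omitted, and order3_gfp(),
alloc_frag_page(), and the try_alloc callback are hypothetical names standing
in for the page frag code and the page allocator.

```python
def order3_gfp(gfp):
    """Return the modelled gfp flags for the opportunistic order 3 attempt."""
    # Add __GFP_NOMEMALLOC and mask off __GFP_DIRECT_RECLAIM.
    return (set(gfp) | {"__GFP_NOMEMALLOC"}) - {"__GFP_DIRECT_RECLAIM"}


def alloc_frag_page(gfp, try_alloc):
    """Try order 3 with the relaxed flags, then fall back to order 0."""
    page = try_alloc(order3_gfp(gfp), 3)
    if page is None:
        # The fallback keeps the caller's original flags.
        page = try_alloc(set(gfp), 0)
    return page


def stub(flags, order):
    """Allocator stub (assumption): every order 3 request fails."""
    return None if order == 3 else ("page", order, frozenset(flags))


caller_gfp = {"__GFP_DIRECT_RECLAIM", "__GFP_IO"}
print(sorted(order3_gfp(caller_gfp)))
print(alloc_frag_page(caller_gfp, stub)[1])
```

Here the order 3 attempt is allowed to fail cheaply, while the order 0
fallback still uses the flags the caller asked for.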

Leave the gfp unifying for page frag implementation in sock.c
for now as suggested by Paolo Abeni.

Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Alexander Duyck &lt;alexanderduyck@fb.com&gt;
CC: Alexander Duyck &lt;alexander.duyck@gmail.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
</entry>
<entry>
<title>vhost: convert poll work to be vq based</title>
<updated>2023-07-03T16:15:13+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-06-26T23:22:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=493b94bf5ae0f6d67bd3728e8c723800d99a13ad'/>
<id>urn:sha1:493b94bf5ae0f6d67bd3728e8c723800d99a13ad</id>
<content type='text'>
This has the drivers pass in their poll to vq mapping and then converts
the core poll code to use the vq based helpers. In the next patches we
will allow vqs to be handled by different workers, so to allow drivers
to execute operations like queue, stop, flush, etc on specific polls/vqs
we need to know the mappings.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230626232307.97930-8-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>vhost, vhost_net: add helper to check if vq has work</title>
<updated>2023-07-03T16:15:13+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-06-26T23:22:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9784df151a601fa1d6d146f8a7f35a2d875e2976'/>
<id>urn:sha1:9784df151a601fa1d6d146f8a7f35a2d875e2976</id>
<content type='text'>
In the next patches each vq might have different workers so one could
have work but others do not. For net, we only want to check specific vqs,
so this adds a helper to check if a vq has work pending and converts
vhost-net to use it.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Message-Id: &lt;20230626232307.97930-5-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>vhost_net: revert upend_idx only on retriable error</title>
<updated>2023-06-08T19:43:08+00:00</updated>
<author>
<name>Andrey Smetanin</name>
<email>asmetanin@yandex-team.ru</email>
</author>
<published>2023-04-24T20:44:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1f5d2e3bab16369d5d4b4020a25db4ab1f4f082c'/>
<id>urn:sha1:1f5d2e3bab16369d5d4b4020a25db4ab1f4f082c</id>
<content type='text'>
Fix a possible leak of virtqueue used buffers and the corresponding
hang in case of a temporary -EIO from sendmsg(), which the tun driver
produces while the backend device is not up.

In case of a non-retriable error with zcopy, do not revert upend_idx,
so that the packet data is passed on (that is, used_idx is updated in
the corresponding vhost_zerocopy_signal_used()) as if the packet data
had been transferred successfully.

v2: set vq-&gt;heads[ubuf-&gt;desc].len equal to VHOST_DMA_DONE_LEN
in case of fake successful transmit.

Signed-off-by: Andrey Smetanin &lt;asmetanin@yandex-team.ru&gt;
Message-Id: &lt;20230424204411.24888-1-asmetanin@yandex-team.ru&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Andrey Smetanin &lt;asmetanin@yandex-team.ru&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</content>
</entry>
<entry>
<title>vhost-net: support VIRTIO_F_RING_RESET</title>
<updated>2023-02-21T00:26:59+00:00</updated>
<author>
<name>Kangjie Xu</name>
<email>kangjie.xu@linux.alibaba.com</email>
</author>
<published>2022-08-25T08:56:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=313389be06ff60b4ad23bdb4ff24a12e7c47cb26'/>
<id>urn:sha1:313389be06ff60b4ad23bdb4ff24a12e7c47cb26</id>
<content type='text'>
Add VIRTIO_F_RING_RESET, which indicates that the driver can reset a
queue individually.

The VIRTIO_F_RING_RESET feature was added in virtio-spec 1.2. The relevant
information is in
    oasis-tcs/virtio-spec#124
    oasis-tcs/virtio-spec#139

The implementation only adds the feature bit in supported features. It
does not require any other changes because we reuse the existing vhost
protocol.

The virtqueue reset process can be summarized in two parts:
1. The driver can reset a virtqueue. When it is triggered, we use the
set_backend to disable the virtqueue.
2. After the virtqueue is disabled, the driver may optionally re-enable
it. The process is basically similar to when the device is started,
except that the restart process does not need to set features and set
mem table since they do not change. QEMU will send messages containing
size, base, addr, kickfd and callfd of the virtqueue in order.
Specifically, the host kernel will receive these messages in order:
    a. VHOST_SET_VRING_NUM
    b. VHOST_SET_VRING_BASE
    c. VHOST_SET_VRING_ADDR
    d. VHOST_SET_VRING_KICK
    e. VHOST_SET_VRING_CALL
    f. VHOST_NET_SET_BACKEND
Finally, after we use set_backend to attach the virtqueue, the virtqueue
will be enabled and start to work.
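
For illustration, the message order listed above can be written down as a
small Python check. This is a sketch only: follows_reenable_order() is a
hypothetical helper, and it describes the sequence QEMU is said to send, not a
rule that the kernel is known to enforce.

```python
# Messages the host kernel receives when a virtqueue is re-enabled,
# in the order given in the list above.
REENABLE_ORDER = [
    "VHOST_SET_VRING_NUM",
    "VHOST_SET_VRING_BASE",
    "VHOST_SET_VRING_ADDR",
    "VHOST_SET_VRING_KICK",
    "VHOST_SET_VRING_CALL",
    "VHOST_NET_SET_BACKEND",
]


def follows_reenable_order(messages):
    """Return True if the messages match the described sequence exactly."""
    return list(messages) == REENABLE_ORDER


print(follows_reenable_order(REENABLE_ORDER))
print(follows_reenable_order(reversed(REENABLE_ORDER)))
```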

Signed-off-by: Kangjie Xu &lt;kangjie.xu@linux.alibaba.com&gt;
Signed-off-by: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Message-Id: &lt;20220825085610.80315-1-kangjie.xu@linux.alibaba.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
</content>
</entry>
</feed>
