<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/vhost/scsi.c, branch v6.6.133</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.133</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.133'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-08-15T10:08:57+00:00</updated>
<entry>
<title>vhost-scsi: Fix log flooding with target does not exist errors</title>
<updated>2025-08-15T10:08:57+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2025-06-11T21:01:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d3bf3088f7e929073b074b980efe317bddfad771'/>
<id>urn:sha1:d3bf3088f7e929073b074b980efe317bddfad771</id>
<content type='text'>
[ Upstream commit 69cd720a8a5e9ef0f05ce5dd8c9ea6e018245c82 ]

As part of the normal initiator side scanning the guest's scsi layer
will loop over all possible targets and send an inquiry. Since the
max number of targets for virtio-scsi is 256, this can result in 255
error messages about targets not existing if you only have a single
target. When there's more than 1 vhost-scsi device each with a single
target, then you get N * 255 log messages.

It looks like the log message was added by accident in:

commit 3f8ca2e115e5 ("vhost/scsi: Extract common handling code from
control queue handler")

when we added common helpers. Then in:

commit 09d7583294aa ("vhost/scsi: Use common handling code in request
queue handler")

we converted the scsi command processing path to use the new
helpers so we started to see the extra log messages during scanning.

The patches were just making some code common but added the vq_err
call and I'm guessing the patch author forgot to enable the vq_err
call (vq_err is implemented by pr_debug which defaults to off). So
this patch removes the call since it's expected to hit this path
during device discovery.

Fixes: 09d7583294aa ("vhost/scsi: Use common handling code in request queue handler")
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Reviewed-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Message-Id: &lt;20250611210113.10912-1-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vhost-scsi: Return queue full for page alloc failures during copy</title>
<updated>2025-06-04T12:42:05+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2024-12-03T19:15:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f5121d5ba7ce412d9e6e8287cc83b3e80c56586e'/>
<id>urn:sha1:f5121d5ba7ce412d9e6e8287cc83b3e80c56586e</id>
<content type='text'>
[ Upstream commit 891b99eab0f89dbe08d216f4ab71acbeaf7a3102 ]

This has us return queue full if we can't allocate a page during the
copy operation so the initiator can retry.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20241203191705.19431-5-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vhost-scsi: protect vq-&gt;log_used with vq-&gt;mutex</title>
<updated>2025-06-04T12:41:53+00:00</updated>
<author>
<name>Dongli Zhang</name>
<email>dongli.zhang@oracle.com</email>
</author>
<published>2025-04-03T06:29:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ca85c2d0db5f8309832be45858b960d933c2131c'/>
<id>urn:sha1:ca85c2d0db5f8309832be45858b960d933c2131c</id>
<content type='text'>
[ Upstream commit f591cf9fce724e5075cc67488c43c6e39e8cbe27 ]

The vhost-scsi completion path may access vq-&gt;log_base when vq-&gt;log_used is
already set to false.

    vhost-thread                       QEMU-thread

vhost_scsi_complete_cmd_work()
-&gt; vhost_add_used()
   -&gt; vhost_add_used_n()
      if (unlikely(vq-&gt;log_used))
                                      QEMU disables vq-&gt;log_used
                                      via VHOST_SET_VRING_ADDR.
                                      mutex_lock(&amp;vq-&gt;mutex);
                                      vq-&gt;log_used = false now!
                                      mutex_unlock(&amp;vq-&gt;mutex);

				      QEMU gfree(vq-&gt;log_base)
        log_used()
        -&gt; log_write(vq-&gt;log_base)

Assuming the VMM is QEMU. The vq-&gt;log_base is from QEMU userpace and can be
reclaimed via gfree(). As a result, this causes invalid memory writes to
QEMU userspace.

The control queue path has the same issue.

Signed-off-by: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Reviewed-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20250403063028.16045-2-dongli.zhang@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vhost-scsi: Fix handling of multiple calls to vhost_scsi_set_endpoint</title>
<updated>2025-04-10T12:37:32+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2025-01-29T21:09:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2b34bdc42df047794542f3e220fe989124e4499a'/>
<id>urn:sha1:2b34bdc42df047794542f3e220fe989124e4499a</id>
<content type='text'>
[ Upstream commit 5dd639a1646ef5fe8f4bf270fad47c5c3755b9b6 ]

If vhost_scsi_set_endpoint is called multiple times without a
vhost_scsi_clear_endpoint between them, we can hit multiple bugs
found by Haoran Zhang:

1. Use-after-free when no tpgs are found:

This fixes a use after free that occurs when vhost_scsi_set_endpoint is
called more than once and calls after the first call do not find any
tpgs to add to the vs_tpg. When vhost_scsi_set_endpoint first finds
tpgs to add to the vs_tpg array match=true, so we will do:

vhost_vq_set_backend(vq, vs_tpg);
...

kfree(vs-&gt;vs_tpg);
vs-&gt;vs_tpg = vs_tpg;

If vhost_scsi_set_endpoint is called again and no tpgs are found
match=false so we skip the vhost_vq_set_backend call leaving the
pointer to the vs_tpg we then free via:

kfree(vs-&gt;vs_tpg);
vs-&gt;vs_tpg = vs_tpg;

If a scsi request is then sent we do:

vhost_scsi_handle_vq -&gt; vhost_scsi_get_req -&gt; vhost_vq_get_backend

which sees the vs_tpg we just did a kfree on.

2. Tpg dir removal hang:

This patch fixes an issue where we cannot remove a LIO/target layer
tpg (and structs above it like the target) dir due to the refcount
dropping to -1.

The problem is that if vhost_scsi_set_endpoint detects a tpg is already
in the vs-&gt;vs_tpg array or if the tpg has been removed so
target_depend_item fails, the undepend goto handler will do
target_undepend_item on all tpgs in the vs_tpg array dropping their
refcount to 0. At this time vs_tpg contains both the tpgs we have added
in the current vhost_scsi_set_endpoint call as well as tpgs we added in
previous calls which are also in vs-&gt;vs_tpg.

Later, when vhost_scsi_clear_endpoint runs it will do
target_undepend_item on all the tpgs in the vs-&gt;vs_tpg which will drop
their refcount to -1. Userspace will then not be able to remove the tpg
and will hang when it tries to do rmdir on the tpg dir.

3. Tpg leak:

This fixes a bug where we can leak tpgs and cause them to be
un-removable because the target name is overwritten when
vhost_scsi_set_endpoint is called multiple times but with different
target names.

The bug occurs if a user has called VHOST_SCSI_SET_ENDPOINT and setup
a vhost-scsi device to target/tpg mapping, then calls
VHOST_SCSI_SET_ENDPOINT again with a new target name that has tpgs we
haven't seen before (target1 has tpg1 but target2 has tpg2). When this
happens we don't teardown the old target tpg mapping and just overwrite
the target name and the vs-&gt;vs_tpg array. Later when we do
vhost_scsi_clear_endpoint, we are passed in either target1 or target2's
name and we will only match that target's tpgs when we loop over the
vs-&gt;vs_tpg. We will then return from the function without doing
target_undepend_item on the tpgs.

Because of all these bugs, it looks like being able to call
vhost_scsi_set_endpoint multiple times was never supported. The major
user, QEMU, already has checks to prevent this use case. So to fix the
issues, this patch prevents vhost_scsi_set_endpoint from being called
if it's already successfully added tpgs. To add, remove or change the
tpg config or target name, you must do a vhost_scsi_clear_endpoint
first.

Fixes: 25b98b64e284 ("vhost scsi: alloc cmds per vq instead of session")
Fixes: 4f7f46d32c98 ("tcm_vhost: Use vq-&gt;private_data to indicate if the endpoint is setup")
Reported-by: Haoran Zhang &lt;wh1sper@zju.edu.cn&gt;
Closes: https://lore.kernel.org/virtualization/e418a5ee-45ca-4d18-9b5d-6f8b6b1add8e@oracle.com/T/#me6c0041ce376677419b9b2563494172a01487ecb
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Reviewed-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
Message-Id: &lt;20250129210922.121533-1-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Stefano Garzarella &lt;sgarzare@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vhost/scsi: null-ptr-dereference in vhost_scsi_get_req()</title>
<updated>2024-10-10T09:58:08+00:00</updated>
<author>
<name>Haoran Zhang</name>
<email>wh1sper@zju.edu.cn</email>
</author>
<published>2024-10-01T20:14:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=25613e6d9841a1f9fb985be90df921fa99f800de'/>
<id>urn:sha1:25613e6d9841a1f9fb985be90df921fa99f800de</id>
<content type='text'>
commit 221af82f606d928ccef19a16d35633c63026f1be upstream.

Since commit 3f8ca2e115e5 ("vhost/scsi: Extract common handling code
from control queue handler") a null pointer dereference bug can be
triggered when guest sends an SCSI AN request.

In vhost_scsi_ctl_handle_vq(), `vc.target` is assigned with
`&amp;v_req.tmf.lun[1]` within a switch-case block and is then passed to
vhost_scsi_get_req() which extracts `vc-&gt;req` and `tpg`. However, for
a `VIRTIO_SCSI_T_AN_*` request, tpg is not required, so `vc.target` is
set to NULL in this branch. Later, in vhost_scsi_get_req(),
`vc-&gt;target` is dereferenced without being checked, leading to a null
pointer dereference bug. This bug can be triggered from guest.

When this bug occurs, the vhost_worker process is killed while holding
`vq-&gt;mutex` and the corresponding tpg will remain occupied
indefinitely.

Below is the KASAN report:
Oops: general protection fault, probably for non-canonical address
0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 PID: 840 Comm: poc Not tainted 6.10.0+ #1
Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS
1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:vhost_scsi_get_req+0x165/0x3a0
Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2b 02 00 00
48 b8 00 00 00 00 00 fc ff df 4d 8b 65 30 4c 89 e2 48 c1 ea 03 &lt;0f&gt; b6
04 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 be 01 00 00
RSP: 0018:ffff888017affb50 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff88801b000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888017affcb8
RBP: ffff888017affb80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff888017affc88 R14: ffff888017affd1c R15: ffff888017993000
FS:  000055556e076500(0000) GS:ffff88806b100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000200027c0 CR3: 0000000010ed0004 CR4: 0000000000370ef0
Call Trace:
 &lt;TASK&gt;
 ? show_regs+0x86/0xa0
 ? die_addr+0x4b/0xd0
 ? exc_general_protection+0x163/0x260
 ? asm_exc_general_protection+0x27/0x30
 ? vhost_scsi_get_req+0x165/0x3a0
 vhost_scsi_ctl_handle_vq+0x2a4/0xca0
 ? __pfx_vhost_scsi_ctl_handle_vq+0x10/0x10
 ? __switch_to+0x721/0xeb0
 ? __schedule+0xda5/0x5710
 ? __kasan_check_write+0x14/0x30
 ? _raw_spin_lock+0x82/0xf0
 vhost_scsi_ctl_handle_kick+0x52/0x90
 vhost_run_work_list+0x134/0x1b0
 vhost_task_fn+0x121/0x350
...
 &lt;/TASK&gt;
---[ end trace 0000000000000000 ]---

Let's add a check in vhost_scsi_get_req.

Fixes: 3f8ca2e115e5 ("vhost/scsi: Extract common handling code from control queue handler")
Signed-off-by: Haoran Zhang &lt;wh1sper@zju.edu.cn&gt;
[whitespace fixes]
Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;b26d7ddd-b098-4361-88f8-17ca7f90adf7@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>vhost-scsi: Handle vhost_vq_work_queue failures for events</title>
<updated>2024-07-11T10:49:20+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2024-03-16T00:46:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8f174c5db1e04a67a66b5535f5d0403ea823a44f'/>
<id>urn:sha1:8f174c5db1e04a67a66b5535f5d0403ea823a44f</id>
<content type='text'>
[ Upstream commit b1b2ce58ed23c5d56e0ab299a5271ac01f95b75c ]

Currently, we can try to queue an event's work before the vhost_task is
created. When this happens we just drop it in vhost_scsi_do_plug before
even calling vhost_vq_work_queue. During a device shutdown we do the
same thing after vhost_scsi_clear_endpoint has cleared the backends.

In the next patches we will be able to kill the vhost_task before we
have cleared the endpoint. In that case, vhost_vq_work_queue can fail
and we will leak the event's memory. This has handle the failure by
just freeing the event. This is safe to do, because
vhost_vq_work_queue will only return failure for us when the vhost_task
is killed and so userspace will not be able to handle events if we
sent them.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20240316004707.45557-2-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>vhost-scsi: Rename vhost_scsi_iov_to_sgl</title>
<updated>2023-08-10T19:24:28+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-07-09T20:28:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c5ace19efb0ac884a9a417e2a1499ce9849bdaa5'/>
<id>urn:sha1:c5ace19efb0ac884a9a417e2a1499ce9849bdaa5</id>
<content type='text'>
Rename vhost_scsi_iov_to_sgl to vhost_scsi_map_iov_to_sgl so it matches
matches the naming style used for vhost_scsi_copy_iov_to_sgl.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230709202859.138387-3-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
</content>
</entry>
<entry>
<title>vhost-scsi: Fix alignment handling with windows</title>
<updated>2023-08-10T19:24:27+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-07-09T20:28:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5ced58bfa132c8ba0f9c893eb621595a84cfee12'/>
<id>urn:sha1:5ced58bfa132c8ba0f9c893eb621595a84cfee12</id>
<content type='text'>
The linux block layer requires bios/requests to have lengths with a 512
byte alignment. Some drivers/layers like dm-crypt and the directi IO code
will test for it and just fail. Other drivers like SCSI just assume the
requirement is met and will end up in infinte retry loops. The problem
for drivers like SCSI is that it uses functions like blk_rq_cur_sectors
and blk_rq_sectors which divide the request's length by 512. If there's
lefovers then it just gets dropped. But other code in the block/scsi
layer may use blk_rq_bytes/blk_rq_cur_bytes and end up thinking there is
still data left and try to retry the cmd. We can then end up getting
stuck in retry loops where part of the block/scsi thinks there is data
left, but other parts think we want to do IOs of zero length.

Linux will always check for alignment, but windows will not. When
vhost-scsi then translates the iovec it gets from a windows guest to a
scatterlist, we can end up with sg items where the sg-&gt;length is not
divisible by 512 due to the misaligned offset:

sg[0].offset = 255;
sg[0].length = 3841;
sg...
sg[N].offset = 0;
sg[N].length = 255;

When the lio backends then convert the SG to bios or other iovecs, we
end up sending them with the same misaligned values and can hit the
issues above.

This just has us drop down to allocating a temp page and copying the data
when we detect a misaligned buffer and the IO is large enough that it
will get split into multiple bad IOs.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230709202859.138387-2-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Stefan Hajnoczi &lt;stefanha@redhat.com&gt;
</content>
</entry>
<entry>
<title>vhost_scsi: add support for worker ioctls</title>
<updated>2023-07-03T16:15:14+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-06-26T23:23:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d74b55e6550225ad0a28f0faa590cc9f780ba392'/>
<id>urn:sha1:d74b55e6550225ad0a28f0faa590cc9f780ba392</id>
<content type='text'>
This has vhost-scsi support the worker ioctls by calling the
vhost_worker_ioctl helper.

With a single worker, the single thread becomes a bottlneck when trying
to use 3 or more virtqueues like:

fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
--ioengine=libaio --iodepth=128  --numjobs=3

With the patches and doing a worker per vq, we can scale to at least
16 vCPUs/vqs (that's my system limit) with the same command fio command
above with numjobs=16:

fio --filename=/dev/sdb  --direct=1 --rw=randrw --bs=4k \
--ioengine=libaio --iodepth=64  --numjobs=16

which gives around 2002K IOPs.

Note that for testing I dropped depth to 64 above because the vhost/virt
layer supports only 1024 total commands per device. And the only tuning I
did was set LIO's emulate_pr to 0 to avoid LIO's PR lock in the main IO
path which becomes an issue at around 12 jobs/virtqueues.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230626232307.97930-17-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>vhost_scsi: flush IO vqs then send TMF rsp</title>
<updated>2023-07-03T16:15:14+00:00</updated>
<author>
<name>Mike Christie</name>
<email>michael.christie@oracle.com</email>
</author>
<published>2023-06-26T23:23:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0a3eac5239d2ddcfef8cc5b0c40095981536db90'/>
<id>urn:sha1:0a3eac5239d2ddcfef8cc5b0c40095981536db90</id>
<content type='text'>
With one worker we will always send the scsi cmd responses then send the
TMF rsp, because LIO will always complete the scsi cmds first then call
into us to send the TMF response.

With multiple workers, the IO vq workers could be running while the
TMF/ctl vq worker is running so this has us do a flush before completing
the TMF to make sure cmds are completed when it's work is later queued
and run.

Signed-off-by: Mike Christie &lt;michael.christie@oracle.com&gt;
Message-Id: &lt;20230626232307.97930-12-michael.christie@oracle.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
</feed>
