summaryrefslogtreecommitdiff
path: root/drivers/vhost/scsi.c
AgeCommit message (Collapse)AuthorFilesLines
2021-02-23vhost scsi: alloc vhost_scsi with kvzalloc() to avoid delayDongli Zhang1-6/+3
The size of 'struct vhost_scsi' is order-10 (~2.3MB). It may take long time delay by kzalloc() to compact memory pages by retrying multiple times when there is a lack of high-order pages. As a result, there is latency to create a VM (with vhost-scsi) or to hotadd vhost-scsi-based storage. The prior commit 595cb754983d ("vhost/scsi: use vmalloc for order-10 allocation") prefers to fallback only when really needed, while this patch allocates with kvzalloc() with __GFP_NORETRY implicitly set to avoid retrying memory pages compact for multiple times. The __GFP_NORETRY is implicitly set if the size to allocate is more than PAGE_SZIE and when __GFP_RETRY_MAYFAIL is not explicitly set. Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Link: https://lore.kernel.org/r/20210123080853.4214-1-dongli.zhang@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-12-19vhost scsi: fix error return code in vhost_scsi_set_endpoint()Zhang Changzhong1-1/+2
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 25b98b64e284 ("vhost scsi: alloc cmds per vq instead of session") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1607071411-33484-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-11-25vhost scsi: fix lun reset completion handlingMike Christie1-1/+3
vhost scsi owns the scsi se_cmd but lio frees the se_cmd->se_tmr before calling release_cmd, so while with normal cmd completion we can access the se_cmd from the vhost work, we can't do the same with se_cmd->se_tmr. This has us copy the tmf response in vhost_scsi_queue_tm_rsp to our internal vhost-scsi tmf struct for when it gets sent to the guest from our worker thread. Fixes: efd838fec17b ("vhost scsi: Add support for LUN resets.") Signed-off-by: Mike Christie <michael.christie@oracle.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/1605887459-3864-1-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-11-16vhost scsi: Add support for LUN resets.Mike Christie1-13/+134
In newer versions of virtio-scsi we just reset the timer when an a command times out, so TMFs are never sent for the cmd time out case. However, in older kernels and for the TMF inject cases, we can still get resets and we end up just failing immediately so the guest might see the device get offlined and IO errors. For the older kernel cases, we want the same end result as the modern virtio-scsi driver where we let the lower levels fire their error handling and handle the problem. And at the upper levels we want to wait. This patch ties the LUN reset handling into the LIO TMF code which will just wait for outstanding commands to complete like we are doing in the modern virtio-scsi case. Note: I did not handle the ABORT case to keep this simple. For ABORTs LIO just waits on the cmd like how it does for the RESET case. If an ABORT fails, the guest OS ends up escalating to LUN RESET, so in the end we get the same behavior where we wait on the outstanding cmds. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/1604986403-4931-6-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-11-16vhost scsi: add lun parser helperMike Christie1-2/+7
Move code to parse lun from req's lun_buf to helper, so tmf code can use it in the next patch. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/1604986403-4931-5-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-11-16vhost scsi: fix cmd completion raceMike Christie1-27/+15
We might not do the final se_cmd put from vhost_scsi_complete_cmd_work. When the last put happens a little later then we could race where vhost_scsi_complete_cmd_work does vhost_signal, the guest runs and sends more IO, and vhost_scsi_handle_vq runs but does not find any free cmds. This patch has us delay completing the cmd until the last lio core ref is dropped. We then know that once we signal to the guest that the cmd is completed that if it queues a new command it will find a free cmd. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/1604986403-4931-4-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-11-16vhost scsi: alloc cmds per vq instead of sessionMike Christie1-79/+128
We currently are limited to 256 cmds per session. This leads to problems where if the user has increased virtqueue_size to more than 2 or cmd_per_lun to more than 256 vhost_scsi_get_tag can fail and the guest will get IO errors. This patch moves the cmd allocation to per vq so we can easily match whatever the user has specified for num_queues and virtqueue_size/cmd_per_lun. It also makes it easier to control how much memory we preallocate. For cases, where perf is not as important and we can use the current defaults (1 vq and 128 cmds per vq) memory use from preallocate cmds is cut in half. For cases, where we are willing to use more memory for higher perf, cmd mem use will now increase as the num queues and queue depth increases. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/1604986403-4931-3-git-send-email-michael.christie@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Maurizio Lombardi <mlombard@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-07-29vhost/scsi: fix up req type endian-nessMichael S. Tsirkin1-1/+1
vhost/scsi doesn't handle type conversion correctly for request type when using virtio 1.0 and up for BE, or cross-endian platforms. Fix it up using vhost_32_to_cpu. Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-06-10Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-1/+1
Pull virtio updates from Michael Tsirkin: - virtio-mem: paravirtualized memory hotplug - support doorbell mapping for vdpa - config interrupt support in ifc - fixes all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (40 commits) vhost/test: fix up after API change virtio_mem: convert device block size into 64bit virtio-mem: drop unnecessary initialization ifcvf: implement config interrupt in IFCVF vhost: replace -1 with VHOST_FILE_UNBIND in ioctls vhost_vdpa: Support config interrupt in vdpa ifcvf: ignore continuous setting same status value virtio-mem: Don't rely on implicit compiler padding for requests virtio-mem: Try to unplug the complete online memory block first virtio-mem: Use -ETXTBSY as error code if the device is busy virtio-mem: Unplug subblocks right-to-left virtio-mem: Drop manual check for already present memory virtio-mem: Add parent resource for all added "System RAM" virtio-mem: Better retry handling virtio-mem: Offline and remove completely unplugged memory blocks mm/memory_hotplug: Introduce offline_and_remove_memory() virtio-mem: Allow to offline partially unplugged memory blocks mm: Allow to offline unmovable PageOffline() pages via MEM_GOING_OFFLINE virtio-mem: Paravirtualized memory hotunplug part 2 virtio-mem: Paravirtualized memory hotunplug part 1 ...
2020-06-06Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-0/+1
Pull SCSI updates from James Bottomley: :This series consists of the usual driver updates (qla2xxx, ufs, zfcp, target, scsi_debug, lpfc, qedi, qedf, hisi_sas, mpt3sas) plus a host of other minor updates. There are no major core changes in this series apart from a refactoring in scsi_lib.c" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits) scsi: ufs: ti-j721e-ufs: Fix unwinding of pm_runtime changes scsi: cxgb3i: Fix some leaks in init_act_open() scsi: ibmvscsi: Make some functions static scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim scsi: ufs: Fix WriteBooster flush during runtime suspend scsi: ufs: Fix index of attributes query for WriteBooster feature scsi: ufs: Allow WriteBooster on UFS 2.2 devices scsi: ufs: Remove unnecessary memset for dev_info scsi: ufs-qcom: Fix scheduling while atomic issue scsi: mpt3sas: Fix reply queue count in non RDPQ mode scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event scsi: target: tcmu: Fix a use after free in tcmu_check_expired_queue_cmd() scsi: vhost: Notify TCM about the maximum sg entries supported per command scsi: qla2xxx: Remove return value from qla_nvme_ls() scsi: qla2xxx: Remove an unused function scsi: iscsi: Register sysfs for iscsi workqueue scsi: scsi_debug: Parser tables and code interaction scsi: core: Refactor scsi_mq_setup_tags function scsi: core: Fix incorrect usage of shost_for_each_device scsi: qla2xxx: Fix endianness annotations in source files ...
2020-06-04vhost: allow device that does not depend on vhost workerJason Wang1-1/+1
vDPA device currently relays the eventfd via vhost worker. This is inefficient due the latency of wakeup and scheduling, so this patch tries to introduce a use_worker attribute for the vhost device. When use_worker is not set with vhost_dev_init(), vhost won't try to allocate a worker thread and the vhost_poll will be processed directly in the wakeup function. This help for vDPA since it reduces the latency caused by vhost worker. In my testing, it saves 0.2 ms in pings between VMs on a mutual host. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200529080303.15449-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-05-26scsi: vhost: Notify TCM about the maximum sg entries supported per commandSudhakar Panneerselvam1-0/+1
vhost-scsi pre-allocates the maximum sg entries per command and if a command requires more than VHOST_SCSI_PREALLOC_SGLS entries, then that command is failed by it. This patch lets vhost communicate the max sg limit when it registers vhost_scsi_ops with TCM. With this change, TCM would report the max sg entries through "Block Limits" VPD page which will be typically queried by the SCSI initiator during device discovery. By knowing this limit, the initiator could ensure the maximum transfer length is less than or equal to what is reported by vhost-scsi. Link: https://lore.kernel.org/r/1590166317-953-1-git-send-email-sudhakar.panneerselvam@oracle.com Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-17vhost: Create accessors for virtqueues private_dataEugenio Pérez1-7/+7
Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Link: https://lore.kernel.org/r/20200331192804.6019-2-eperezma@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-04-01vhost: allow per device message handlerJason Wang1-1/+1
This patch allow device to register its own message handler during vhost_dev_init(). vDPA device will use it to implement its own DMA mapping logic. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200326140125.19794-3-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-23compat_ioctl: move drivers to compat_ptr_ioctlArnd Bergmann1-11/+1
Each of these drivers has a copy of the same trivial helper function to convert the pointer argument and then call the native ioctl handler. We now have a generic implementation of that, so use it. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-05-27vhost: scsi: add weight supportJason Wang1-6/+6
This patch will check the weight and exit the loop if we exceeds the weight. This is useful for preventing scsi kthread from hogging cpu which is guest triggerable. This addresses CVE-2019-3900. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Fixes: 057cbf49a1f0 ("tcm_vhost: Initial merge for vhost level target fabric driver") Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-05-27vhost: introduce vhost_exceeds_weight()Jason Wang1-1/+8
We used to have vhost_exceeds_weight() for vhost-net to: - prevent vhost kthread from hogging the cpu - balance the time spent between TX and RX This function could be useful for vsock and scsi as well. So move it to vhost.c. Device must specify a weight which counts the number of requests, or it can also specific a byte_weight which counts the number of bytes that has been processed. Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-05-12vhost-scsi: remove incorrect memory barrierPaolo Bonzini1-1/+0
At this point, vs_tpg is not public at all; tv_tpg_vhost_count is accessed under tpg->tv_tpg_mutex; tpg->vhost_scsi is accessed under vhost_scsi_mutex. Therefor there are no atomic operations involved at all here, just remove the barrier. Reported-by: Andrea Parri <andrea.parri@amarulasolutions.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2019-03-10Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-6/+0
Pull SCSI updates from James Bottomley: "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc, hisi_sas, target/iscsi and target/core. Additionally Christoph refactored gdth as part of the dma changes. The major mid-layer change this time is the removal of bidi commands and with them the whole of the osd/exofs driver and filesystem. This is a major simplification for block and mq in particular" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits) scsi: cxgb4i: validate tcp sequence number only if chip version <= T5 scsi: cxgb4i: get pf number from lldi->pf scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c scsi: mpt3sas: Add missing breaks in switch statements scsi: aacraid: Fix missing break in switch statement scsi: kill command serial number scsi: csiostor: drop serial_number usage scsi: mvumi: use request tag instead of serial_number scsi: dpt_i2o: remove serial number usage scsi: st: osst: Remove negative constant left-shifts scsi: ufs-bsg: Allow reading descriptors scsi: ufs: Allow reading descriptor via raw upiu scsi: ufs-bsg: Change the calling convention for write descriptor scsi: ufs: Remove unused device quirks Revert "scsi: ufs: disable vccq if it's not needed by UFS device" scsi: megaraid_sas: Remove a bunch of set but not used variables scsi: clean obsolete return values of eh_timed_out scsi: sd: Optimal I/O size should be a multiple of physical block size scsi: MAINTAINERS: SCSI initiator and target tweaks scsi: fcoe: make use of fip_mode enum complete ...
2019-02-05scsi: target/core: Remove the write_pending_status() callback functionBart Van Assche1-6/+0
Due to the patch that makes TMF handling synchronous the write_pending_status() callback function is no longer called. Hence remove it. Acked-by: Felipe Balbi <balbi@ti.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Andy Grover <agrover@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Saurav Kashyap <saurav.kashyap@qlogic.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29vhost: fix OOB in get_rx_bufs()Jason Wang1-1/+1
After batched used ring updating was introduced in commit e2b3b35eb989 ("vhost_net: batch used ring update in rx"). We tend to batch heads in vq->heads for more than one packet. But the quota passed to get_rx_bufs() was not correctly limited, which can result a OOB write in vq->heads. headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx, vhost_len, &in, vq_log, &log, likely(mergeable) ? UIO_MAXIOV : 1); UIO_MAXIOV was still used which is wrong since we could have batched used in vq->heads, this will cause OOB if the next buffer needs more than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after we've batched 64 (VHOST_NET_BATCH) heads: Acked-by: Stefan Hajnoczi <stefanha@redhat.com> ============================================================================= BUG kmalloc-8k (Tainted: G B ): Redzone overwritten ----------------------------------------------------------------------------- INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674 kmem_cache_alloc_trace+0xbb/0x140 alloc_pd+0x22/0x60 gen8_ppgtt_create+0x11d/0x5f0 i915_ppgtt_create+0x16/0x80 i915_gem_create_context+0x248/0x390 i915_gem_context_create_ioctl+0x4b/0xe0 drm_ioctl_kernel+0xa5/0xf0 drm_ioctl+0x2ed/0x3a0 do_vfs_ioctl+0x9f/0x620 ksys_ioctl+0x6b/0x80 __x64_sys_ioctl+0x11/0x20 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x (null) flags=0x200000000010201 INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b Fixing this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for vhost-net. This is done through set the limitation through vhost_dev_init(), then set_owner can allocate the number of iov in a per device manner. This fixes CVE-2018-16880. Fixes: e2b3b35eb989 ("vhost_net: batch used ring update in rx") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-15vhost/scsi: Use copy_to_iter() to send control queue responseBijan Mottahedeh1-8/+12
Uses copy_to_iter() instead of __copy_to_user() in order to ensure we support arbitrary layouts and an input buffer split across iov entries. Fixes: 0d02dbd68c47b ("vhost/scsi: Respond to control queue operations") Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-01-03Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-2/+2
Pull virtio/vhost updates from Michael Tsirkin: "Features, fixes, cleanups: - discard in virtio blk - misc fixes and cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: correct the related warning message vhost: split structs into a separate header file virtio: remove deprecated VIRTIO_PCI_CONFIG() vhost/vsock: switch to a mutex for vhost_vsock_hash virtio_blk: add discard and write zeroes support
2018-12-20vhost: correct the related warning messagewangyan1-2/+2
Fixes: 'commit d588cf8f618d ("target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem")' 'commit cbbd26b8b1a6 ("[iov_iter] new primitives - copy_from_iter_full() and friends")' Signed-off-by: Yan Wang <wangyan122@huawei.com> Reviewed-by: Jun Piao <piaojun@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-11-29scsi: target: replace fabric_ops.name with fabric_aliasDavid Disseldorp1-1/+0
iscsi_target_mod is the only LIO fabric where fabric_ops.name differs from the fabric_ops.fabric_name string. fabric_ops.name is used when matching target/$fabric ConfigFS create paths, so rename it .fabric_alias and fallback to target/$fabric vs .fabric_name comparison if .fabric_alias isn't initialised. iscsi_target_mod is the only fabric module to set .fabric_alias . All other fabric modules rely on .fabric_name matching and can drop the duplicate string. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-29scsi: target: drop unnecessary get_fabric_name() accessor from fabric_opsDavid Disseldorp1-6/+1
All fabrics return a const string. In all cases *except* iSCSI the get_fabric_name() string matches fabric_ops.name. Both fabric_ops.get_fabric_name() and fabric_ops.name are user-facing, with the former being used for PR/ALUA state and the latter for ConfigFS (config/target/$name), so we unfortunately need to keep both strings around for now. Replace the useless .get_fabric_name() accessor function with a const string fabric_name member variable. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-25vhost/scsi: Use common handling code in request queue handlerBijan Mottahedeh1-197/+164
Change the request queue handler to use common handling routines same as the control queue handler. Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-10-25vhost/scsi: Extract common handling code from control queue handlerBijan Mottahedeh1-99/+172
Prepare to change the request queue handler to use common handling routines. Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-10-25vhost/scsi: Respond to control queue operationsBijan Mottahedeh1-0/+190
The vhost-scsi driver currently does not handle any control queue operations. In particular, vhost_scsi_ctl_handle_kick, merely prints out a debug message but does nothing else. This can cause guest VMs to hang. As part of SCSI recovery from an error, e.g., an I/O timeout, the SCSI midlayer attempts to abort the failed operation. The SCSI virtio driver translates the abort to a SCSI TMF request that gets put on the control queue (virtscsi_abort -> virtscsi_tmf). The SCSI virtio driver then waits indefinitely for this request to be completed, but it never will because vhost-scsi never responds to that request. To avoid a hang, always respond to control queue operations; explicitly reject TMF requests, and return a no-op response to event requests. Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-10-25vhost/scsi: truncate T10 PI iov_iter to prot_bytesGreg Edwards1-1/+3
Commands with protection information included were not truncating the protection iov_iter to the number of protection bytes in the command. This resulted in vhost_scsi mis-calculating the size of the protection SGL in vhost_scsi_calc_sgls(), and including both the protection and data SG entries in the protection SGL. Fixes: 09b13fa8c1a1 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq") Signed-off-by: Greg Edwards <gedwards@ddn.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: 09b13fa8c1a1093e9458549ac8bb203a7c65c62a Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-24Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-1/+1
Pull virtio updates from Michael Tsirkin: "virtio, vhost: fixes, tweaks No new features but a bunch of tweaks such as switching balloon from oom notifier to shrinker" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost/scsi: increase VHOST_SCSI_PREALLOC_PROT_SGLS to 2048 vhost: allow vhost-scsi driver to be built-in virtio: pci-legacy: Validate queue pfn virtio: mmio-v1: Validate queue PFN virtio_balloon: replace oom notifier with shrinker virtio-balloon: kzalloc the vb struct virtio-balloon: remove BUG() in init_vqs
2018-08-22vhost/scsi: increase VHOST_SCSI_PREALLOC_PROT_SGLS to 2048Greg Edwards1-1/+1
The current value of VHOST_SCSI_PREALLOC_PROT_SGLS is too small to accommodate larger I/Os, e.g. 16-32 MiB, when the VIRTIO_SCSI_F_T10_PI feature bit is negotiated and the backing store supports T10 PI. vhost-scsi rejects the command with errors like: [ 59.581317] vhost_scsi_calc_sgls: requested sgl_count: 1820 exceeds pre-allocated max_sgls: 512 Signed-off-by: Greg Edwards <gedwards@ddn.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-08-02scsi: target: loop, usb, vhost, xen: use target_remove_sessionMike Christie1-1/+1
This converts drivers that were only calling transport_deregister_session to use target_remove_session. The calling of transport_deregister_session_configfs via target_remove_session for these types of drivers is ok, because they were not exporting info from fields like sess_acl_list, sess->se_tpg and sess->fabric_sess_ptr from configfs accessible functions, so they will see no difference. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Felipe Balbi <balbi@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-08-02scsi: target: rename target_alloc_sessionMike Christie1-1/+1
Rename target_alloc_session to target_setup_session to avoid confusion with the other transport session allocation function that only allocates the session and because the target_alloc_session does so much more. It allocates the session, sets up the nacl and registers the session. The next patch will then add a remove function to match the setup in this one, so it should make sense for all drivers, except iscsi, to just call those 2 functions to setup and remove a session. iscsi will continue to be the odd driver. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Chris Boot <bootc@bootc.net> Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Cc: Michael Cyr <mikecyr@linux.vnet.ibm.com> Cc: <qla2xxx-upstream@qlogic.com> Cc: Johannes Thumshirn <jth@kernel.org> Cc: Felipe Balbi <balbi@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-02scsi: target: Remove second argument from fabric_make_tpg()Bart Van Assche1-3/+1
Since most target drivers do not use the second fabric_make_tpg() argument ("group") and since it is trivial to derive the group pointer from the wwn pointer, do not pass the group pointer to fabric_make_tpg(). Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-20scsi: target: Convert target drivers to use sbitmapMatthew Wilcox1-3/+3
The sbitmap and the percpu_ida perform essentially the same task, allocating tags for commands. The sbitmap outperforms the percpu_ida as documented here: https://lkml.org/lkml/2014/4/22/553 The sbitmap interface is a little harder to use, but being able to remove the percpu_ida code and getting better performance justifies the additional complexity. Signed-off-by: Matthew Wilcox <willy@infradead.org> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> # f_tcm Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-20scsi: target: Abstract tag freeingMatthew Wilcox1-1/+1
Introduce target_free_tag() and convert all drivers to use it. Signed-off-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-13treewide: kzalloc() -> kcalloc()Kees Cook1-6/+9
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-13treewide: kmalloc() -> kmalloc_array()Kees Cook1-1/+1
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This patch replaces cases of: kmalloc(a * b, gfp) with: kmalloc_array(a * b, gfp) as well as handling cases of: kmalloc(a * b * c, gfp) with: kmalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kmalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kmalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The tools/ directory was manually excluded, since it has its own implementation of kmalloc(). The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kmalloc( - sizeof(u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kmalloc( - sizeof(char) * COUNT + COUNT , ...) | kmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kmalloc + kmalloc_array ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kmalloc(C1 * C2 * C3, ...) | kmalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kmalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kmalloc(sizeof(THING) * C2, ...) | kmalloc(sizeof(TYPE) * C2, ...) | kmalloc(C1 * C2 * C3, ...) | kmalloc(C1 * C2, ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kmalloc + kmalloc_array ( - (E1) * E2 + E1, E2 , ...) | - kmalloc + kmalloc_array ( - (E1) * (E2) + E1, E2 , ...) | - kmalloc + kmalloc_array ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-02-01vhost: remove unused lock check flag in vhost_dev_cleanup()夷则(Caspar)1-1/+1
In commit ea5d404655ba ("vhost: fix release path lockdep checks"), Michael added a flag to check whether we should hold a lock in vhost_dev_cleanup(), however, in commit 47283bef7ed3 ("vhost: move memory pointer to VQs"), RCU operations have been replaced by mutex, we can remove the no-longer-used `locked' parameter now. Signed-off-by: Caspar Zhang <jinli.zjl@alibaba-inc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-31vhost/scsi: Improve a size determination in four functionsMarkus Elfring1-5/+4
Replace the specification of four data structures by pointer dereferences as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-11-17Merge branch 'work.iov_iter' of ↵Linus Torvalds1-50/+23
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter updates from Al Viro: - bio_{map,copy}_user_iov() series; those are cleanups - fixes from the same pile went into mainline (and stable) in late September. - fs/iomap.c iov_iter-related fixes - new primitive - iov_iter_for_each_range(), which applies a function to kernel-mapped segments of an iov_iter. Usable for kvec and bvec ones, the latter does kmap()/kunmap() around the callback. _Not_ usable for iovec- or pipe-backed iov_iter; the latter is not hard to fix if the need ever appears, the former is by design. Another related primitive will have to wait for the next cycle - it passes page + offset + size instead of pointer + size, and that one will be usable for everything _except_ kvec. Unfortunately, that one didn't get exposure in -next yet, so... - a bit more lustre iov_iter work, including a use case for iov_iter_for_each_range() (checksum calculation) - vhost/scsi leak fix in failure exit - misc cleanups and detritectomy... * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits) iomap_dio_actor(): fix iov_iter bugs switch ksocknal_lib_recv_...() to use of iov_iter_for_each_range() lustre: switch struct ksock_conn to iov_iter vhost/scsi: switch to iov_iter_get_pages() fix a page leak in vhost_scsi_iov_to_sgl() error recovery new primitive: iov_iter_for_each_range() lnet_return_rx_credits_locked: don't abuse list_entry xen: don't open-code iov_iter_kvec() orangefs: remove detritus from struct orangefs_kiocb_s kill iov_shorten() bio_alloc_map_data(): do bmd->iter setup right there bio_copy_user_iov(): saner bio size calculation bio_map_user_iov(): get rid of copying iov_iter bio_copy_from_iter(): get rid of copying iov_iter move more stuff down into bio_copy_user_iov() blk_rq_map_user_iov(): move iov_iter_advance() down bio_map_user_iov(): get rid of the iov_for_each() bio_map_user_iov(): move alignment check into the main loop don't rely upon subsequent bio_add_pc_page() calls failing ... and with iov_iter_get_pages_alloc() it becomes even simpler ...
2017-11-17Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-2/+2
Pull virtio updates from Michael Tsirkin: "Fixes in qemu, vhost and virtio" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: fw_cfg: fix the command line module name vhost/vsock: fix uninitialized vhost_vsock->guest_cid vhost: fix end of range for access_ok vhost/scsi: Use safe iteration in vhost_scsi_complete_cmd_work() virtio_balloon: fix deadlock on OOM
2017-11-15vhost/scsi: Use safe iteration in vhost_scsi_complete_cmd_work()Byungchul Park1-2/+2
The following patch changed the behavior which originally did safe iteration. Make it safe as it was. 12bdcbd539c6327c09da0503c674733cb2d82cb5 vhost/scsi: Don't reinvent the wheel but use existing llist API Signed-off-by: Byungchul Park <byungchul.park@lge.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-25locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland1-1/+1
to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-12vhost/scsi: switch to iov_iter_get_pages()Al Viro1-48/+20
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-10-12fix a page leak in vhost_scsi_iov_to_sgl() error recoveryAl Viro1-2/+3
we are advancing sg as we go, so the pages we need to drop in case of error are *before* the current sg. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-07-14Merge branch 'for-next' of ↵Linus Torvalds1-8/+3
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "It's been usually busy for summer, with most of the efforts centered around TCMU developments and various target-core + fabric driver bug fixing activities. Not particularly large in terms of LoC, but lots of smaller patches from many different folks. The highlights include: - ibmvscsis logical partition manager support (Michael Cyr + Bryant Ly) - Convert target/iblock WRITE_SAME to blkdev_issue_zeroout (hch + nab) - Add support for TMR percpu LUN reference counting (nab) - Fix a potential deadlock between EXTENDED_COPY and iscsi shutdown (Bart) - Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce (Jiang Yi) - Fix TMCU module removal (Xiubo Li) - Fix iser-target OOPs during login failure (Andrea Righi + Sagi) - Breakup target-core free_device backend driver callback (mnc) - Perform TCMU add/delete/reconfig synchronously (mnc) - Fix TCMU multiple UIO open/close sequences (mnc) - Fix TCMU CHECK_CONDITION sense handling (mnc) - Fix target-core SAM_STAT_BUSY + TASK_SET_FULL handling (mnc + nab) - Introduce TYPE_ZBC support in PSCSI (Damien Le Moal) - Fix possible TCMU memory leak + OOPs when recalculating cmd base size (Xiubo Li + Bryant Ly + Damien Le Moal + mnc) - Add login_keys_workaround attribute for non RFC initiators (Robert LeBlanc + Arun Easi + nab)" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (68 commits) iscsi-target: Add login_keys_workaround attribute for non RFC initiators Revert "qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT" tcmu: clean up the code and with one small fix tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size target: export lio pgr/alua support as device attr target: Fix return sense reason in target_scsi3_emulate_pr_out target: Fix cmd size for PR-OUT in passthrough_parse_cdb tcmu: Fix dev_config_store target: pscsi: Introduce TYPE_ZBC support target: Use macro for WRITE_VERIFY_32 operation codes target: fix SAM_STAT_BUSY/TASK_SET_FULL handling target: remove transport_complete pscsi: finish cmd processing from pscsi_req_done tcmu: fix sense handling during completion target: add helper to copy sense to se_cmd buffer target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE target: make device_mutex and device_list static tcmu: Fix flushing cmd entry dcache page tcmu: fix multiple uio open/close sequences tcmu: drop configured check in destroy ...
2017-07-13mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful ↵Michal Hocko1-1/+1
semantic __GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to the page allocator. This has been true but only for allocations requests larger than PAGE_ALLOC_COSTLY_ORDER. It has been always ignored for smaller sizes. This is a bit unfortunate because there is no way to express the same semantic for those requests and they are considered too important to fail so they might end up looping in the page allocator for ever, similarly to GFP_NOFAIL requests. Now that the whole tree has been cleaned up and accidental or misled usage of __GFP_REPEAT flag has been removed for !costly requests we can give the original flag a better name and more importantly a more useful semantic. Let's rename it to __GFP_RETRY_MAYFAIL which tells the user that the allocator would try really hard but there is no promise of a success. This will work independent of the order and overrides the default allocator behavior. Page allocator users have several levels of guarantee vs. cost options (take GFP_KERNEL as an example) - GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_ attempt to free memory at all. The most light weight mode which even doesn't kick the background reclaim. Should be used carefully because it might deplete the memory and the next user might hit the more aggressive reclaim - GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT)- optimistic allocation without any attempt to free memory from the current context but can wake kswapd to reclaim memory if the zone is below the low watermark. Can be used from either atomic contexts or when the request is a performance optimization and there is another fallback for a slow path. - (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) - non sleeping allocation with an expensive fallback so it can access some portion of memory reserves. Usually used from interrupt/bh context with an expensive slow path fallback. - GFP_KERNEL - both background and direct reclaim are allowed and the _default_ page allocator behavior is used. That means that !costly allocation requests are basically nofail but there is no guarantee of that behavior so failures have to be checked properly by callers (e.g. OOM killer victim is allowed to fail currently). - GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior and all allocation requests fail early rather than cause disruptive reclaim (one round of reclaim in this implementation). The OOM killer is not invoked. - GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator behavior and all allocation requests try really hard. The request will fail if the reclaim cannot make any progress. The OOM killer won't be triggered. - GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior and all allocation requests will loop endlessly until they succeed. This might be really dangerous especially for larger orders. Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL because they already had their semantic. No new users are added. __alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if there is no progress and we have already passed the OOM point. This means that all the reclaim opportunities have been exhausted except the most disruptive one (the OOM killer) and a user defined fallback behavior is more sensible than keep retrying in the page allocator. [akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c] [mhocko@suse.com: semantic fix] Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz [mhocko@kernel.org: address other thing spotted by Vlastimil] Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alex Belits <alex.belits@cavium.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David Daney <david.daney@cavium.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@suse.de> Cc: NeilBrown <neilb@suse.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-09vhost/scsi: Don't reinvent the wheel but use existing llist APIByungchul Park1-8/+3
Although llist provides proper APIs, they are not used. Make them used. Signed-off-by: Byungchul Park <byungchul.park@lge.com> Acked-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>