summaryrefslogtreecommitdiff
path: root/drivers/vdpa
AgeCommit message (Collapse)AuthorFilesLines
2022-06-24vduse: Tie vduse mgmtdev and its deviceParav Pandit1-23/+37
vduse devices are not backed by any real devices such as PCI. Hence it doesn't have any parent device linked to it. Kernel driver model in [1] suggests to avoid an empty device release callback. Hence tie the mgmtdevice object's life cycle to an allocate dummy struct device instead of static one. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/core-api/kobject.rst?h=v5.18-rc7#n284 Signed-off-by: Parav Pandit <parav@nvidia.com> Message-Id: <20220613195223.473966-1-parav@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-24vdpa/mlx5: Initialize CVQ vringh only onceEli Cohen1-11/+20
Currently, CVQ vringh is initialized inside setup_virtqueues() which is called every time a memory update is done. This is undesirable since it resets all the context of the vring, including the available and used indices. Move the initialization to mlx5_vdpa_set_status() when VIRTIO_CONFIG_S_DRIVER_OK is set. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220613075958.511064-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
2022-06-24vdpa/mlx5: Update Control VQ callback informationEli Cohen1-0/+2
The control VQ specific information is stored in the dedicated struct mlx5_control_vq. When the callback is updated through mlx5_vdpa_set_vq_cb(), make sure to update the control VQ struct. Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220613075958.511064-1-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com)
2022-06-08vduse: Fix NULL pointer dereference on sysfs accessXie Yongji1-4/+3
The control device has no drvdata. So we will get a NULL pointer dereference when accessing control device's msg_timeout attribute via sysfs: [ 132.841881][ T3644] BUG: kernel NULL pointer dereference, address: 00000000000000f8 [ 132.850619][ T3644] RIP: 0010:msg_timeout_show (drivers/vdpa/vdpa_user/vduse_dev.c:1271) [ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094) [ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59) [ 132.871164][ T3644] ? device_remove_bin_file (drivers/base/core.c:2088) [ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164) [ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230) [ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041) [ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238) [ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 (discriminator 1)) [ 132.876319][ T3644] kernel_read (fs/read_write.c:459) [ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94) [ 132.877978][ T3644] kernel_read_file_from_fd (include/linux/file.h:45 fs/kernel_read_file.c:186) [ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207) [ 132.879930][ T3644] __ia32_sys_finit_module (kernel/module.c:4189) [ 132.880930][ T3644] do_int80_syscall_32 (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132) [ 132.881847][ T3644] entry_INT80_compat (arch/x86/entry/entry_64_compat.S:419) To fix it, don't create the unneeded attribute for control device anymore. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Reported-by: kernel test robot <oliver.sang@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220426073656.229-1-xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-08vdpa/mlx5: clean up indenting in handle_ctrl_vlan()Dan Carpenter1-3/+3
These lines were supposed to be indented. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <Yp71IYMP+QfuCJ8t@kili> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com> Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-06-08vdpa/mlx5: fix error code for deleting vlanDan Carpenter1-0/+1
Return success if we were able to delete a vlan. The current code always returns failure. Fixes: baf2ad3f6a98 ("vdpa/mlx5: Add RX MAC VLAN filter support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <Yp709f1g9NcMBCHg@kili> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com> Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
2022-06-08vdpa/mlx5: Fix syntax errors in commentsXiang wangx1-1/+1
Delete the redundant word 'is'. Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com> Message-Id: <20220604143858.16073-1-wangxiang@cdjrlc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-03Merge tag 'driver-core-5.19-rc1' of ↵Linus Torvalds1-25/+4
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core changes for 5.19-rc1. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things are: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes include: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time-outs" * tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) driver core: fix deadlock in __device_attach kernfs: Separate kernfs_pr_cont_buf and rename_lock. topology: Remove unused cpu_cluster_mask() driver core: Extend deferred probe timeout on driver registration MAINTAINERS: add Russ Weight as a firmware loader maintainer driver: base: fix UAF when driver_attach failed test_firmware: fix end of loop test in upload_read_show() driver core: location: Add "back" as a possible output for panel driver core: location: Free struct acpi_pld_info *pld driver core: Add "*" wildcard support to driver_async_probe cmdline param driver core: location: Check for allocations failure arch_topology: Trace the update thermal pressure kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file. export: fix string handling of namespace in EXPORT_SYMBOL_NS rpmsg: use local 'dev' variable rpmsg: Fix calling device_lock() on non-initialized device firmware_loader: describe 'module' parameter of firmware_upload_register() firmware_loader: Move definitions from sysfs_upload.h to sysfs.h firmware_loader: Fix configs for sysfs split selftests: firmware: Add firmware upload selftests ...
2022-06-01vdpa: ifcvf: set pci driver data in probeJason Wang1-1/+2
We should set the pci driver data in probe instead of the vdpa device adding callback. Otherwise if no vDPA device is created we will lose the pointer to the management device. Fixes: 6b5df347c6482 ("vDPA/ifcvf: implement management netlink framework for ifcvf") Tested-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220524055557.1938-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-06-01vdpa/mlx5: Add RX MAC VLAN filter supportEli Cohen1-58/+216
Support HW offloaded filtering of MAC/VLAN packets. To allow that, we add a handler to handle VLAN configurations coming through the control VQ. Two operations are supported. 1. Adding VLAN - in this case, an entry will be added to the RX flow table that will allow the combination of the MAC/VLAN to be forwarded to the TIR. 2. Removing VLAN - will remove the entry from the flow table, effectively blocking such packets from going through. Currently the control VQ does not propagate changes to the MAC of the VLAN device so we always use the MAC of the parent device. Examples: 1. Create vlan device: $ ip link add link ens1 name ens1.8 type vlan id 8 Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220411122942.225717-4-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-06-01vdpa/mlx5: Remove flow counter from steeringEli Cohen1-18/+6
The flow counter has been introduced in early versions of the driver to aid in debugging. It is no longer needed and can harm performance. Remove it. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220411122942.225717-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpasim: Off by one in vdpasim_set_group_asid()Dan Carpenter1-1/+1
The > comparison needs to be >= to prevent an out of bounds access of the vdpasim->iommu[] array. The vdpasim->iommu[] is allocated in vdpasim_create() and it has vdpasim->dev_attr.nas elements. Fixes: 87e5afeac247 ("vdpasim: control virtqueue support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <YotGQU1q224RKZR8@kili> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31vdpasim: allow to enable a vq repeatedlyEugenio Pérez1-1/+4
Code must be resilient to enable a queue many times. At the moment the queue is resetting so it's definitely not the expected behavior. v2: set vq->ready = 0 at disable. Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator") Cc: stable@vger.kernel.org Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20220519145919.772896-1-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-05-31vDPA/ifcvf: fix uninitialized config_vector warningZhu Lingshan1-6/+6
Static checkers are not informed that config_vector is controlled by vf->msix_vector_status, which can only be MSIX_VECTOR_SHARED_VQ_AND_CONFIG, MSIX_VECTOR_SHARED_VQ_AND_CONFIG and MSIX_VECTOR_DEV_SHARED. This commit uses an "if...elseif...else" code block to tell the checkers that it is a complete set, and config_vector can be initialized anyway Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Message-Id: <20220424072806.1083189-1-lingshan.zhu@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31vdpa/vp_vdpa : add vdpa tool support in vp_vdpaCindy Lu1-32/+129
this patch is to add the support for vdpa tool in vp_vdpa here is the example steps modprobe vp_vdpa modprobe vhost_vdpa echo 0000:00:06.0>/sys/bus/pci/drivers/virtio-pci/unbind echo 1af4 1041 > /sys/bus/pci/drivers/vp-vdpa/new_id vdpa dev add name vdpa1 mgmtdev pci/0000:00:06.0 Signed-off-by: Cindy Lu <lulu@redhat.com> Message-Id: <20220429091030.547434-1-lulu@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-05-31vdpasim: control virtqueue supportGautam Dawar3-20/+161
This patch introduces the control virtqueue support for vDPA simulator. This is a requirement for supporting advanced features like multiqueue. A requirement for control virtqueue is to isolate its memory access from the rx/tx virtqueues. This is because when using vDPA device for VM, the control virqueue is not directly assigned to VM. Userspace (Qemu) will present a shadow control virtqueue to control for recording the device states. The isolation is done via the virtqueue groups and ASID support in vDPA through vhost-vdpa. The simulator is extended to have: 1) three virtqueues: RXVQ, TXVQ and CVQ (control virtqueue) 2) two virtqueue groups: group 0 contains RXVQ and TXVQ; group 1 contains CVQ 3) two address spaces and the simulator simply implements the address spaces by mapping it 1:1 to IOTLB. For the VM use cases, userspace(Qemu) may set AS 0 to group 0 and AS 1 to group 1. So we have: 1) The IOTLB for virtqueue group 0 contains the mappings of guest, so RX and TX can be assigned to guest directly. 2) The IOTLB for virtqueue group 1 contains the mappings of CVQ which is the buffers that allocated and managed by VMM only. So CVQ of vhost-vdpa is visible to VMM only. And Guest can not access the CVQ of vhost-vdpa. For the other use cases, since AS 0 is associated to all virtqueue groups by default. All virtqueues share the same mapping by default. To demonstrate the function, VIRITO_NET_F_CTRL_MACADDR is implemented in the simulator for the driver to set mac address. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-20-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa_sim: filter destination mac addressGautam Dawar1-18/+31
This patch implements a simple unicast filter for vDPA simulator. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-19-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa_sim: factor out buffer completion logicGautam Dawar1-15/+18
Wrap up common buffer completion logic in to vdpasim_net_complete Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-18-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa_sim: advertise VIRTIO_NET_F_MTUGautam Dawar1-1/+2
We've already reported maximum mtu via config space, so let's advertise the feature. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-17-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa: multiple address spaces supportGautam Dawar7-11/+17
This patches introduces the multiple address spaces support for vDPA device. This idea is to identify a specific address space via an dedicated identifier - ASID. During vDPA device allocation, vDPA device driver needs to report the number of address spaces supported by the device then the DMA mapping ops of the vDPA device needs to be extended to support ASID. This helps to isolate the environments for the virtqueue that will not be assigned directly. E.g in the case of virtio-net, the control virtqueue will not be assigned directly to guest. As a start, simply claim 1 virtqueue groups and 1 address spaces for all vDPA devices. And vhost-vDPA will simply reject the device with more than 1 virtqueue groups or address spaces. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-7-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa: introduce virtqueue groupsGautam Dawar8-6/+29
This patch introduces virtqueue groups to vDPA device. The virtqueue group is the minimal set of virtqueues that must share an address space. And the address space identifier could only be attached to a specific virtqueue group. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Gautam Dawar <gdawar@xilinx.com> Message-Id: <20220330180436.24644-6-gdawar@xilinx.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa/mlx5: Use readers/writers semaphore instead of mutexEli Cohen1-22/+19
Reading statistics could be done intensively and by several processes concurrently. Reader's lock is sufficient in this case. Change reslock from mutex to a rwsem. Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-7-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa/mlx5: Add support for reading descriptor statisticsEli Cohen2-0/+151
Implement the get_vq_stats calback of vdpa_config_ops to return the statistics for a virtqueue. The statistics are provided as vendor specific statistics where the driver provides a pair of attribute name and attribute value. Currently supported are received descriptors and completed descriptors. Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-6-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31net/vdpa: Use readers/writers semaphore instead of cf_mutexEli Cohen1-13/+12
Replace cf_mutex with rw_semaphore to reflect the fact that some calls could be called concurrently but can suffice with read lock. Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-5-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31net/vdpa: Use readers/writers semaphore instead of vdpa_dev_mutexEli Cohen1-32/+32
Use rw_semaphore instead of mutex to control access to vdpa devices. This can be especially beneficial in case processes poll on statistics information. Suggested-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-4-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa: Add support for querying vendor statisticsEli Cohen1-0/+162
Allows to read vendor statistics of a vdpa device. The specific statistics data are received from the upstream driver in the form of an (attribute name, attribute value) pairs. An example of statistics for mlx5_vdpa device are: received_desc - number of descriptors received by the virtqueue completed_desc - number of descriptors completed by the virtqueue A descriptor using indirect buffers is still counted as 1. In addition, N chained descriptors are counted correctly N times as one would expect. A new callback was added to vdpa_config_ops which provides the means for the vdpa driver to return statistics results. The interface allows for reading all the supported virtqueues, including the control virtqueue if it exists. Below are some examples taken from mlx5_vdpa which are introduced in the following patch: 1. Read statistics for the virtqueue at index 1 $ vdpa dev vstats show vdpa-a qidx 1 vdpa-a: queue_type tx queue_index 1 received_desc 3844836 completed_desc 3844836 2. Read statistics for the virtqueue at index 32 $ vdpa dev vstats show vdpa-a qidx 32 vdpa-a: queue_type control_vq queue_index 32 received_desc 62 completed_desc 62 3. Read statisitics for the virtqueue at index 0 with json output $ vdpa -j dev vstats show vdpa-a qidx 0 {"vstats":{"vdpa-a":{ "queue_type":"rx","queue_index":0,"name":"received_desc","value":417776,\ "name":"completed_desc","value":417548}}} 4. Read statistics for the virtqueue at index 0 with preety json output $ vdpa -jp dev vstats show vdpa-a qidx 0 { "vstats": { "vdpa-a": { "queue_type": "rx", "queue_index": 0, "name": "received_desc", "value": 417776, "name": "completed_desc", "value": 417548 } } } Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-3-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-31vdpa: Fix error logic in vdpa_nl_cmd_dev_get_doitEli Cohen1-4/+9
In vdpa_nl_cmd_dev_get_doit(), if the call to genlmsg_reply() fails we must not call nlmsg_free() since this is done inside genlmsg_reply(). Fix it. Fixes: bc0d90ee021f ("vdpa: Enable user to query vdpa device info") Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220518133804.1075129-2-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-05-18vdpa/mlx5: Use consistent RQT sizeEli Cohen1-40/+21
The current code evaluates RQT size based on the configured number of virtqueues. This can raise an issue in the following scenario: Assume MQ was negotiated. 1. mlx5_vdpa_set_map() gets called. 2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower than the configured max VQs. 3. A second set_map gets called, but now a smaller number of VQs is used to evaluate the size of the RQT. 4. handle_ctrl_mq() is called with a value larger than what the RQT can hold. This will emit errors and the driver state is compromised. To fix this, we use a new field in struct mlx5_vdpa_net to hold the required number of entries in the RQT. This value is evaluated in mlx5_vdpa_set_driver_features() where we have the negotiated features all set up. In addition to that, we take into consideration the max capability of RQT entries early when the device is added so we don't need to take consider it when creating the RQT. Last, we remove the use of mlx5_vdpa_max_qps() which just returns the max_vas / 2 and make the code clearer. Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support") Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-04-22vdpa: Use helper for safer setting of driver_overrideKrzysztof Kozlowski1-25/+4
Use a helper to set driver_override to the reduce amount of duplicated code. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220419113435.246203-9-krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-05Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-21/+41
Pull virtio fixes from Michael Tsirkin: "Fixes and cleanups: - A couple of mlx5 fixes related to cvq - A couple of reverts dropping useless code (code that used it got reverted earlier)" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa: mlx5: synchronize driver status with CVQ vdpa: mlx5: prevent cvq work from hogging CPU Revert "virtio_config: introduce a new .enable_cbs method" Revert "virtio: use virtio_device_ready() in virtio_device_restore()"
2022-03-31Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds5-127/+428
Pull virtio updates from Michael Tsirkin: - vdpa generic device type support - more virtio hardening for broken devices (but on the same theme, revert some virtio hotplug hardening patches - they were misusing some interrupt flags and had to be reverted) - RSS support in virtio-net - max device MTU support in mlx5 vdpa - akcipher support in virtio-crypto - shared IRQ support in ifcvf vdpa - a minor performance improvement in vhost - enable virtio mem for ARM64 - beginnings of advance dma support - cleanups, fixes all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (33 commits) vdpa/mlx5: Avoid processing works if workqueue was destroyed vhost: handle error while adding split ranges to iotlb vdpa: support exposing the count of vqs to userspace vdpa: change the type of nvqs to u32 vdpa: support exposing the config size to userspace vdpa/mlx5: re-create forwarding rules after mac modified virtio: pci: check bar values read from virtio config space Revert "virtio_pci: harden MSI-X interrupts" Revert "virtio-pci: harden INTX interrupts" drivers/net/virtio_net: Added RSS hash report control. drivers/net/virtio_net: Added RSS hash report. drivers/net/virtio_net: Added basic RSS support. drivers/net/virtio_net: Fixed padded vheader to use v1 with hash. virtio: use virtio_device_ready() in virtio_device_restore() tools/virtio: compile with -pthread tools/virtio: fix after premapped buf support virtio_ring: remove flags check for unmap packed indirect desc virtio_ring: remove flags check for unmap split indirect desc virtio_ring: rename vring_unmap_state_packed() to vring_unmap_extra_packed() net/mlx5: Add support for configuring max device MTU ...
2022-03-30vdpa: mlx5: synchronize driver status with CVQJason Wang1-14/+37
Currently, CVQ doesn't have any synchronization with the driver status. Then CVQ emulation code run in the middle of: 1) device reset 2) device status changed 3) map updating The will lead several unexpected issue like trying to execute CVQ command after the driver has been teared down. Fixing this by using reslock to synchronize CVQ emulation code with the driver status changing: - protect the whole device reset, status changing and set_map() updating with reslock - protect the CVQ handler with the reslock and check VIRTIO_CONFIG_S_DRIVER_OK in the CVQ handler This will guarantee that: 1) CVQ handler won't work if VIRTIO_CONFIG_S_DRIVER_OK is not set 2) CVQ handler will see a consistent state of the driver instead of the partial one when it is running in the middle of the teardown_driver() or setup_driver(). Cc: 5262912ef3cfc ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220329042109.4029-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com>
2022-03-30vdpa: mlx5: prevent cvq work from hogging CPUJason Wang1-12/+9
A userspace triggerable infinite loop could happen in mlx5_cvq_kick_handler() if userspace keeps sending a huge amount of cvq requests. Fixing this by introducing a quota and re-queue the work if we're out of the budget (currently the implicit budget is one) . While at it, using a per device work struct to avoid on demand memory allocation for cvq. Fixes: 5262912ef3cfc ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220329042109.4029-1-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com>
2022-03-28vdpa/mlx5: Avoid processing works if workqueue was destroyedEli Cohen1-2/+5
If mlx5_vdpa gets unloaded while a VM is running, the workqueue will be destroyed. However, vhost might still have reference to the kick function and might attempt to push new works. This could lead to null pointer dereference. To fix this, set mvdev->wq to NULL just before destroying and verify that the workqueue is not NULL in mlx5_vdpa_kick_vq before attempting to push a new work. Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220321141303.9586-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28vdpa: change the type of nvqs to u32Longpeng1-3/+3
Change vdpa_device.nvqs and vhost_vdpa.nvqs to use u32 Signed-off-by: Longpeng <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20220315032553.455-3-longpeng2@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Longpeng &lt;<a href="mailto:longpeng2@huawei.com" target="_blank">longpeng2@huawei.com</a>&gt;<br></blockquote><div><br></div><div>Acked-by: Jason Wang &lt;<a href="mailto:jasowang@redhat.com">jasowang@redhat.com</a>&gt;</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-03-28vdpa/mlx5: re-create forwarding rules after mac modifiedMichael Qiu1-1/+44
When MAC Address has been modified in guest, we only re-add the Mac to mpfs, it is not enough, because the guest network will not work correctly: the reply package from outside will go straight away to the host VF net interface. This patch recreate the flow rules, and make it work correctly. Signed-off-by: Michael Qiu <qiudayu@archeros.com> Link: https://lore.kernel.org/r/1648446492-17614-1-git-send-email-08005325@163.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com>
2022-03-28net/mlx5: Add support for configuring max device MTUEli Cohen1-1/+31
Allow an admin creating a vdpa device to specify the max MTU for the net device. For example, to create a device with max MTU of 1000, the following command can be used: $ vdpa dev add name vdpa-a mgmtdev auxiliary/mlx5_core.sf.1 mtu 1000 This configuration mechanism assumes that vdpa is the sole real user of the function. mlx5_core could theoretically change the mtu of the function using the ip command on the mlx5_core net device but this should not be done. Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220221121927.194728-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-28vDPA/ifcvf: cacheline alignment for ifcvf_hwZhu Lingshan2-9/+5
This commit introduces a new cacheline aligned layout for ifcvf_hw. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20220222115428.998334-6-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28vDPA/ifcvf: implement shared IRQ featureZhu Lingshan3-57/+300
On some platforms/devices, there may not be enough MSI vectors allocated for the virtqueues and config changes. In such a case, the interrupt sources(virtqueues, config changes) must share an IRQ/vector, to avoid initialization failures, keep the device functional. This commit handles three cases: (1) number of the allocated vectors == the number of virtqueues + 1 (config changes), every virtqueue and the config interrupt has a separated vector/IRQ, the best and the most likely case. (2) number of the allocated vectors is less than the best case, but greater than 1. In this case, all virtqueues share a vector/IRQ, the config interrupt has a separated vector/IRQ (3) only one vector is allocated, in this case, the virtqueues and the config interrupt share a vector/IRQ. The worst and most unlikely case. Otherwise, it needs to fail. This commit introduces some helper functions: ifcvf_set_vq_vector() and ifcvf_set_config_vector() sets virtqueue vector and config vector in the device config space, so that the device can send interrupt DMA. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20220222115428.998334-5-lingshan.zhu@intel.com Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20220315124130.1710030-1-trix@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28vDPA/ifcvf: implement device MSIX vector allocatorZhu Lingshan1-5/+26
This commit implements a MSIX vector allocation helper for vqs and config interrupts. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20220222115428.998334-4-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28vDPA/ifcvf: make use of virtio pci modern IO helpers in ifcvfZhu Lingshan3-71/+36
This commit discards ifcvf_ioreadX()/writeX(), use virtio pci modern IO helpers instead Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20220222115428.998334-2-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-25Merge tag 'iommu-updates-v5.18' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - IOMMU Core changes: - Removal of aux domain related code as it is basically dead and will be replaced by iommu-fd framework - Split of iommu_ops to carry domain-specific call-backs separatly - Cleanup to remove useless ops->capable implementations - Improve 32-bit free space estimate in iova allocator - Intel VT-d updates: - Various cleanups of the driver - Support for ATS of SoC-integrated devices listed in ACPI/SATC table - ARM SMMU updates: - Fix SMMUv3 soft lockup during continuous stream of events - Fix error path for Qualcomm SMMU probe() - Rework SMMU IRQ setup to prepare the ground for PMU support - Minor cleanups and refactoring - AMD IOMMU driver: - Some minor cleanups and error-handling fixes - Rockchip IOMMU driver: - Use standard driver registration - MSM IOMMU driver: - Minor cleanup and change to standard driver registration - Mediatek IOMMU driver: - Fixes for IOTLB flushing logic * tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits) iommu/amd: Improve amd_iommu_v2_exit() iommu/amd: Remove unused struct fault.devid iommu/amd: Clean up function declarations iommu/amd: Call memunmap in error path iommu/arm-smmu: Account for PMU interrupts iommu/vt-d: Enable ATS for the devices in SATC table iommu/vt-d: Remove unused function intel_svm_capable() iommu/vt-d: Add missing "__init" for rmrr_sanity_check() iommu/vt-d: Move intel_iommu_ops to header file iommu/vt-d: Fix indentation of goto labels iommu/vt-d: Remove unnecessary prototypes iommu/vt-d: Remove unnecessary includes iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO iommu/vt-d: Remove domain and devinfo mempool iommu/vt-d: Remove iova_cache_get/put() iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info() iommu/vt-d: Remove intel_iommu::domains iommu/mediatek: Always tlb_flush_all when each PM resume iommu/mediatek: Add tlb_lock in tlb_flush_all iommu/mediatek: Remove the power status checking in tlb flush all ...
2022-03-06vdpa: fix use-after-free on vp_vdpa_removeZhang Min1-1/+1
When vp_vdpa driver is unbind, vp_vdpa is freed in vdpa_unregister_device and then vp_vdpa->mdev.pci_dev is dereferenced in vp_modern_remove, triggering use-after-free. Call Trace of unbinding driver free vp_vdpa : do_syscall_64 vfs_write kernfs_fop_write_iter device_release_driver_internal pci_device_remove vp_vdpa_remove vdpa_unregister_device kobject_release device_release kfree Call Trace of dereference vp_vdpa->mdev.pci_dev: vp_modern_remove pci_release_selected_regions pci_release_region pci_resource_len pci_resource_end (dev)->resource[(bar)].end Signed-off-by: Zhang Min <zhang.min9@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Link: https://lore.kernel.org/r/20220301091059.46869-1-wang.yi59@zte.com.cn Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: 64b9f64f80a6 ("vdpa: introduce virtio pci driver") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2022-03-04vduse: Fix returning wrong type in vduse_domain_alloc_iova()Xie Yongji1-1/+1
This fixes the following smatch warnings: drivers/vdpa/vdpa_user/iova_domain.c:305 vduse_domain_alloc_iova() warn: should 'iova_pfn << shift' be a 64 bit type? Fixes: 8c773d53fb7b ("vduse: Implement an MMU-based software IOTLB") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20220121083940.102-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04vdpa/mlx5: add validation for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET commandSi-Wei Liu1-0/+16
When control vq receives a VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command request from the driver, presently there is no validation against the number of queue pairs to configure, or even if multiqueue had been negotiated or not is unverified. This may lead to kernel panic due to uninitialized resource for the queues were there any bogus request sent down by untrusted driver. Tie up the loose ends there. Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support") Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Link: https://lore.kernel.org/r/1642206481-30721-4-git-send-email-si-wei.liu@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04vdpa/mlx5: should verify CTRL_VQ feature exists for MQSi-Wei Liu1-2/+16
Per VIRTIO v1.1 specification, section 5.1.3.1 Feature bit requirements: "VIRTIO_NET_F_MQ Requires VIRTIO_NET_F_CTRL_VQ". There's assumption in the mlx5_vdpa multiqueue code that MQ must come together with CTRL_VQ. However, there's nowhere in the upper layer to guarantee this assumption would hold. Were there an untrusted driver sending down MQ without CTRL_VQ, it would compromise various spots for e.g. is_index_valid() and is_ctrl_vq_idx(). Although this doesn't end up with immediate panic or security loophole as of today's code, the chance for this to be taken advantage of due to future code change is not zero. Harden the crispy assumption by failing the set_driver_features() call when seeing (MQ && !CTRL_VQ). For that end, verify_min_features() is renamed to verify_driver_features() to reflect the fact that it now does more than just validate the minimum features. verify_driver_features() is now used to accommodate various checks against the driver features for set_driver_features(). Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Link: https://lore.kernel.org/r/1642206481-30721-3-git-send-email-si-wei.liu@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04vdpa: factor out vdpa_set_features_unlocked for vdpa internal useSi-Wei Liu1-1/+1
No functional change introduced. vdpa bus driver such as virtio_vdpa or vhost_vdpa is not supposed to take care of the locking for core by its own. The locked API vdpa_set_features should suffice the bus driver's need. Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/1642206481-30721-2-git-send-email-si-wei.liu@oracle.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2022-02-14iommu/iova: Separate out rcache initJohn Garry1-0/+11
Currently the rcache structures are allocated for all IOVA domains, even if they do not use "fast" alloc+free interface. This is wasteful of memory. In addition, fails in init_iova_rcaches() are not handled safely, which is less than ideal. Make "fast" users call a separate rcache init explicitly, which includes error checking. Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/1643882360-241739-1-git-send-email-john.garry@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-01-18Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds10-145/+350
Pull virtio updates from Michael Tsirkin: "virtio,vdpa,qemu_fw_cfg: features, cleanups, and fixes. - partial support for < MAX_ORDER - 1 granularity for virtio-mem - driver_override for vdpa - sysfs ABI documentation for vdpa - multiqueue config support for mlx5 vdpa - and misc fixes, cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (42 commits) vdpa/mlx5: Fix tracking of current number of VQs vdpa/mlx5: Fix is_index_valid() to refer to features vdpa: Protect vdpa reset with cf_mutex vdpa: Avoid taking cf_mutex lock on get status vdpa/vdpa_sim_net: Report max device capabilities vdpa: Use BIT_ULL for bit operations vdpa/vdpa_sim: Configure max supported virtqueues vdpa/mlx5: Report max device capabilities vdpa: Support reporting max device capabilities vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps() vdpa: Add support for returning device configuration information vdpa/mlx5: Support configuring max data virtqueue vdpa/mlx5: Fix config_attr_mask assignment vdpa: Allow to configure max data virtqueues vdpa: Read device configuration only if FEATURES_OK vdpa: Sync calls set/get config/status with cf_mutex vdpa/mlx5: Distribute RX virtqueues in RQT object vdpa: Provide interface to read driver features vdpa: clean up get_config_size ret value handling virtio_ring: mark ring unused on error ...
2022-01-15vdpa/mlx5: Fix tracking of current number of VQsEli Cohen1-4/+8
Modify the code such that ndev->cur_num_vqs better reflects the actual number of data virtqueues. The value can be accurately realized after features have been negotiated. This is to prevent possible failures when modifying the RQT object if the cur_num_vqs bears invalid value. No issue was actually encountered but this also makes the code more readable. Fixes: c5a5cd3d3217 ("vdpa/mlx5: Support configuring max data virtqueue") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20220111183400.38418-5-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Si-Wei Liu<si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com>