summaryrefslogtreecommitdiff
path: root/drivers/vfio/pci
AgeCommit message (Collapse)AuthorFilesLines
2022-04-20vfio/pci: Fix vf_token mechanism when device-specific VF drivers are usedJason Gunthorpe1-50/+74
[ Upstream commit 1ef3342a934e235aca72b4bcc0d6854d80a65077 ] get_pf_vdev() tries to check if a PF is a VFIO PF by looking at the driver: if (pci_dev_driver(physfn) != pci_dev_driver(vdev->pdev)) { However now that we have multiple VF and PF drivers this is no longer reliable. This means that security tests realted to vf_token can be skipped by mixing and matching different VFIO PCI drivers. Instead of trying to use the driver core to find the PF devices maintain a linked list of all PF vfio_pci_core_device's that we have called pci_enable_sriov() on. When registering a VF just search the list to see if the PF is present and record the match permanently in the struct. PCI core locking prevents a PF from passing pci_disable_sriov() while VF drivers are attached so the VFIO owned PF becomes a static property of the VF. In common cases where vfio does not own the PF the global list remains empty and the VF's pointer is statically NULL. This also fixes a lockdep splat from recursive locking of the vfio_group::device_lock between vfio_device_get_from_name() and vfio_device_get_from_dev(). If the VF and PF share the same group this would deadlock. Fixes: ff53edf6d6ab ("vfio/pci: Split the pci_driver code out of vfio_pci_core.c") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v3-876570980634+f2e8-vfio_vf_token_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13vfio/pci: Stub vfio_pci_vga_rw when !CONFIG_VFIO_PCI_VGAAlex Williamson1-0/+2
[ Upstream commit 6e031ec0e5a2dda53e12e0d2a7e9b15b47a3c502 ] Resolve build errors reported against UML build for undefined ioport_map() and ioport_unmap() functions. Without this config option a device cannot have vfio_pci_core_device.has_vga set, so the existing function would always return -EINVAL anyway. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20220123125737.2658758-1-geert@linux-m68k.org Link: https://lore.kernel.org/r/164306582968.3758255.15192949639574660648.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08vfio/pci: wake-up devices around reset functionsAbhishek Sahu1-0/+48
[ Upstream commit 26a17b12d7f3dd8a7aa45a290e5b46e9cc775ddf ] If 'vfio_pci_core_device::needs_pm_restore' is set (PCI device does not have No_Soft_Reset bit set in its PMCSR config register), then the current PCI state will be saved locally in 'vfio_pci_core_device::pm_save' during D0->D3hot transition and same will be restored back during D3hot->D0 transition. For reset-related functionalities, vfio driver uses PCI reset API's. These API's internally change the PCI power state back to D0 first if the device power state is non-D0. This state change to D0 will happen without the involvement of vfio driver. Let's consider the following example: 1. The device is in D3hot. 2. User invokes VFIO_DEVICE_RESET ioctl. 3. pci_try_reset_function() will be called which internally invokes pci_dev_save_and_disable(). 4. pci_set_power_state(dev, PCI_D0) will be called first. 5. pci_save_state() will happen then. Now, for the devices which has NoSoftRst-, the pci_set_power_state() can trigger soft reset and the original PCI config state will be lost at step (4) and this state cannot be restored again. This original PCI state can include any setting which is performed by SBIOS or host linux kernel (for example LTR, ASPM L1 substates, etc.). When this soft reset will be triggered, then all these settings will be reset, and the device state saved at step (5) will also have this setting cleared so it cannot be restored. Since the vfio driver only exposes limited PCI capabilities to its user, so the vfio driver user also won't have the option to save and restore these capabilities state either and these original settings will be permanently lost. For pci_reset_bus() also, we can have the above situation. The other functions/devices can be in D3hot and the reset will change the power state of all devices to D0 without the involvement of vfio driver. So, before calling any reset-related API's, we need to make sure that the device state is D0. This is mainly to preserve the state around soft reset. For vfio_pci_core_disable(), we use __pci_reset_function_locked() which internally can use pci_pm_reset() for the function reset. pci_pm_reset() requires the device power state to be in D0, otherwise it returns error. This patch changes the device power state to D0 by invoking vfio_pci_set_power_state() explicitly before calling any reset related API's. Fixes: 51ef3a004b1e ("vfio/pci: Restore device state on PM transition") Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220217122107.22434-3-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08vfio/pci: fix memory leak during D3hot to D0 transitionAbhishek Sahu1-0/+13
[ Upstream commit eadf88ecf6ac7d6a9f47a76c6055d9a1987a8991 ] If 'vfio_pci_core_device::needs_pm_restore' is set (PCI device does not have No_Soft_Reset bit set in its PMCSR config register), then the current PCI state will be saved locally in 'vfio_pci_core_device::pm_save' during D0->D3hot transition and same will be restored back during D3hot->D0 transition. For saving the PCI state locally, pci_store_saved_state() is being used and the pci_load_and_free_saved_state() will free the allocated memory. But for reset related IOCTLs, vfio driver calls PCI reset-related API's which will internally change the PCI power state back to D0. So, when the guest resumes, then it will get the current state as D0 and it will skip the call to vfio_pci_set_power_state() for changing the power state to D0 explicitly. In this case, the memory pointed by 'pm_save' will never be freed. In a malicious sequence, the state changing to D3hot followed by VFIO_DEVICE_RESET/VFIO_DEVICE_PCI_HOT_RESET can be run in a loop and it can cause an OOM situation. This patch frees the earlier allocated memory first before overwriting 'pm_save' to prevent the mentioned memory leak. Fixes: 51ef3a004b1e ("vfio/pci: Restore device state on PM transition") Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com> Link: https://lore.kernel.org/r/20220217122107.22434-2-abhsahu@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-23vfio/pci: add missing identifier name in argument of function prototypeColin Ian King1-1/+1
The function prototype is missing an identifier name. Add one. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20210902212631.54260-1-colin.king@canonical.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-09-02Merge tag 'vfio-v5.15-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds10-2544/+2310
Pull VFIO updates from Alex Williamson: - Fix dma-valid return WAITED implementation (Anthony Yznaga) - SPDX license cleanups (Cai Huoqing) - Split vfio-pci-core from vfio-pci and enhance PCI driver matching to support future vendor provided vfio-pci variants (Yishai Hadas, Max Gurtovoy, Jason Gunthorpe) - Replace duplicated reflck with core support for managing first open, last close, and device sets (Jason Gunthorpe, Max Gurtovoy, Yishai Hadas) - Fix non-modular mdev support and don't nag about request callback support (Christoph Hellwig) - Add semaphore to protect instruction intercept handler and replace open-coded locks in vfio-ap driver (Tony Krowiak) - Convert vfio-ap to vfio_register_group_dev() API (Jason Gunthorpe) * tag 'vfio-v5.15-rc1' of git://github.com/awilliam/linux-vfio: (37 commits) vfio/pci: Introduce vfio_pci_core.ko vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on' vfio: Use select for eventfd PCI / VFIO: Add 'override_only' support for VFIO PCI sub system PCI: Add 'override_only' field to struct pci_device_id vfio/pci: Move module parameters to vfio_pci.c vfio/pci: Move igd initialization to vfio_pci.c vfio/pci: Split the pci_driver code out of vfio_pci_core.c vfio/pci: Include vfio header in vfio_pci_core.h vfio/pci: Rename ops functions to fit core namings vfio/pci: Rename vfio_pci_device to vfio_pci_core_device vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.h vfio/pci: Rename vfio_pci.c to vfio_pci_core.c vfio/ap_ops: Convert to use vfio_register_group_dev() s390/vfio-ap: replace open coded locks for VFIO_GROUP_NOTIFY_SET_KVM notification s390/vfio-ap: r/w lock for PQAP interception handler function pointer vfio/type1: Fix vfio_find_dma_valid return vfio-pci/zdev: Remove repeated verbose license text vfio: platform: reset: Convert to SPDX identifier vfio: Remove struct vfio_device_ops open/release ...
2021-08-26Merge branches 'v5.15/vfio/spdx-license-cleanups', ↵Alex Williamson10-2400/+2309
'v5.15/vfio/dma-valid-waited-v3', 'v5.15/vfio/vfio-pci-core-v5' and 'v5.15/vfio/vfio-ap' into v5.15/vfio/next
2021-08-26vfio/pci: Introduce vfio_pci_core.koMax Gurtovoy10-281/+64
Now that vfio_pci has been split into two source modules, one focusing on the "struct pci_driver" (vfio_pci.c) and a toolbox library of code (vfio_pci_core.c), complete the split and move them into two different kernel modules. As before vfio_pci.ko continues to present the same interface under sysfs and this change will have no functional impact. Splitting into another module and adding exports allows creating new HW specific VFIO PCI drivers that can implement device specific functionality, such as VFIO migration interfaces or specialized device requirements. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-14-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on'Jason Gunthorpe1-5/+6
This results in less kconfig wordage and a simpler understanding of the required "depends on" to create the menu structure. The next patch increases the nesting level a lot so this is a nice preparatory simplification. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-13-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio: Use select for eventfdJason Gunthorpe1-1/+1
If VFIO_VIRQFD is required then turn on eventfd automatically. The majority of kconfig users of the EVENTFD use select not depends on. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-12-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26PCI / VFIO: Add 'override_only' support for VFIO PCI sub systemMax Gurtovoy1-1/+8
Expose an 'override_only' helper macro (i.e. PCI_DRIVER_OVERRIDE_DEVICE_VFIO) for VFIO PCI sub system and add the required code to prefix its matching entries with "vfio_" in modules.alias file. It allows VFIO device drivers to include match entries in the modules.alias file produced by kbuild that are not used for normal driver autoprobing and module autoloading. Drivers using these match entries can be connected to the PCI device manually, by userspace, using the existing driver_override sysfs. For example the resulting modules.alias may have: alias pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_core alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci In this example mlx5_core and mlx5_vfio_pci match to the same PCI device. The kernel will autoload and autobind to mlx5_core but the kernel and udev mechanisms will ignore mlx5_vfio_pci. When userspace wants to change a device to the VFIO subsystem it can implement a generic algorithm: 1) Identify the sysfs path to the device: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 2) Get the modalias string from the kernel: $ cat /sys/bus/pci/devices/0000:01:00.0/modalias pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00 3) Prefix it with vfio_: vfio_pci:v000015B3d00001021sv000015B3sd00000001bc02sc00i00 4) Search modules.alias for the above string and select the entry that has the fewest *'s: alias vfio_pci:v000015B3d00001021sv*sd*bc*sc*i* mlx5_vfio_pci 5) modprobe the matched module name: $ modprobe mlx5_vfio_pci 6) cat the matched module name to driver_override: echo mlx5_vfio_pci > /sys/bus/pci/devices/0000:01:00.0/driver_override 7) unbind device from original module echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind 8) probe PCI drivers (or explicitly bind to mlx5_vfio_pci) echo 0000:01:00.0 > /sys/bus/pci/drivers_probe The algorithm is independent of bus type. In future the other buses with VFIO device drivers, like platform and ACPI, can use this algorithm as well. This patch is the infrastructure to provide the information in the modules.alias to userspace. Convert the only VFIO pci_driver which results in one new line in the modules.alias: alias vfio_pci:v*d*sv*sd*bc*sc*i* vfio_pci Later series introduce additional HW specific VFIO PCI drivers, such as mlx5_vfio_pci. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # for pci.h Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-11-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio/pci: Move module parameters to vfio_pci.cYishai Hadas3-12/+33
This is a preparation before splitting vfio_pci.ko to 2 modules. As module parameters are a kind of uAPI they need to stay on vfio_pci.ko to avoid a user visible impact. For now continue to keep the implementation of these options in vfio_pci_core.c. Arguably they are vfio_pci functionality, but further splitting of vfio_pci_core.c will be better done in another series Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-9-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio/pci: Move igd initialization to vfio_pci.cMax Gurtovoy3-36/+41
igd is related to the vfio_pci pci_driver implementation, move it out of vfio_pci_core.c. This is preparation for splitting vfio_pci.ko into 2 drivers. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-8-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio/pci: Split the pci_driver code out of vfio_pci_core.cMax Gurtovoy4-215/+304
Split the vfio_pci driver into two logical parts, the 'struct pci_driver' (vfio_pci.c) which implements "Generic VFIO support for any PCI device" and a library of code (vfio_pci_core.c) that helps implementing a struct vfio_device on top of a PCI device. vfio_pci.ko continues to present the same interface under sysfs and this change should have no functional impact. Following patches will turn vfio_pci and vfio_pci_core into a separate module. This is a preparation for allowing another module to provide the pci_driver and allow that module to customize how VFIO is setup, inject its own operations, and easily extend vendor specific functionality. At this point the vfio_pci_core still contains a lot of vfio_pci functionality mixed into it. Following patches will move more of the large scale items out, but another cleanup series will be needed to get everything. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-7-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio/pci: Include vfio header in vfio_pci_core.hMax Gurtovoy2-1/+1
The vfio_device structure is embedded into the vfio_pci_core_device structure, so there is no reason for not including the header file in the vfio_pci_core header as well. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-6-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio/pci: Rename ops functions to fit core namingsMax Gurtovoy1-16/+16
This is another preparation patch for separating the vfio_pci driver to a subsystem driver and a generic pci driver. This patch doesn't change any logic. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-5-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio/pci: Rename vfio_pci_device to vfio_pci_core_deviceMax Gurtovoy7-155/+158
This is a preparation patch for separating the vfio_pci driver to a subsystem driver and a generic pci driver. This patch doesn't change any logic. The new vfio_pci_core_device structure will be the main structure of the core driver and later on vfio_pci_device structure will be the main structure of the generic vfio_pci driver. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-4-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio/pci: Rename vfio_pci_private.h to vfio_pci_core.hMax Gurtovoy7-9/+9
This is a preparation patch for separating the vfio_pci driver to a subsystem driver and a generic pci driver. This patch doesn't change any logic. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-3-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-26vfio/pci: Rename vfio_pci.c to vfio_pci_core.cMax Gurtovoy2-1/+1
This is a preparation patch for separating the vfio_pci driver to a subsystem driver and a generic pci driver. This patch doesn't change any logic. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Link: https://lore.kernel.org/r/20210826103912.128972-2-yishaih@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-24vfio-pci/zdev: Remove repeated verbose license textCai Huoqing1-6/+1
The SPDX and verbose license text are redundant, however in this case the verbose license indicates a GPL v2 only while SPDX specifies v2+. Remove the verbose license and correct SPDX to the more restricted version. Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20210824003749.1039-1-caihuoqing@baidu.com [aw: commit log] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-11vfio/pci: Reorganize VFIO_DEVICE_PCI_HOT_RESET to use the device setJason Gunthorpe1-124/+89
Like vfio_pci_dev_set_try_reset() this code wants to reset all of the devices in the "reset group" which is the same membership as the device set. Instead of trying to reconstruct the device set from the PCI list go directly from the device set's device list to execute the reset. The same basic structure as vfio_pci_dev_set_try_reset() is used. The 'vfio_devices' struct is replaced with the device set linked list and we simply sweep it multiple times under the lock. This eliminates a memory allocation and get/put traffic and another improperly locked test of pci_dev_driver(). Reviewed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/10-v4-9ea22c5e6afb+1adf-vfio_reflck_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-11vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_setJason Gunthorpe1-92/+82
vfio_pci_try_bus_reset() is triggering a reset of the entire_dev set if any device within it has accumulated a needs_reset. This reset can only be done once all of the drivers operating the PCI devices to be reset are in a known safe state. Make this clearer by directly operating on the dev_set instead of the vfio_pci_device. Rename the function to vfio_pci_dev_set_try_reset(). Use the device list inside the dev_set to check that all drivers are in a safe state instead of working backwards from the pci_device. The dev_set->lock directly prevents devices from joining/leaving the set, or changing their state, which further implies the pci_device cannot change drivers or that the vfio_device be freed, eliminating the need for get/put's. If a pci_device to be reset is not in the dev_set then the reset cannot be used as we can't know what the state of that driver is. Directly measure this by checking that every pci_device is in the dev_set - which effectively proves that VFIO drivers are attached to everything. Remove the odd interaction around vfio_pci_set_power_state() - have the only caller avoid its redundant vfio_pci_set_power_state() instead of avoiding it inside vfio_pci_dev_set_try_reset(). This restructuring corrects a call to pci_dev_driver() without holding the device_lock() and removes a hard wiring to &vfio_pci_driver. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/9-v4-9ea22c5e6afb+1adf-vfio_reflck_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-11vfio/pci: Move to the device set infrastructureYishai Hadas2-132/+37
PCI wants to have the usual open/close_device() logic with the slight twist that the open/close_device() must be done under a singelton lock shared by all of the vfio_devices that are in the PCI "reset group". The reset group, and thus the device set, is determined by what devices pci_reset_bus() touches, which is either the entire bus or only the slot. Rely on the core code to do everything reflck was doing and delete reflck entirely. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/8-v4-9ea22c5e6afb+1adf-vfio_reflck_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-11vfio: Introduce a vfio_uninit_group_dev() API callMax Gurtovoy1-2/+4
This pairs with vfio_init_group_dev() and allows undoing any state that is stored in the vfio_device unrelated to registration. Add appropriately placed calls to all the drivers. The following patch will use this to add pre-registration state for the device set. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v4-9ea22c5e6afb+1adf-vfio_reflck_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-03vfio/pci: Make vfio_pci_regops->rw() return ssize_tYishai Hadas2-6/+6
The only implementation of this in IGD returns a -ERRNO which is implicitly cast through a size_t and then casted again and returned as a ssize_t in vfio_pci_rw(). Fix the vfio_pci_regops->rw() return type to be ssize_t so all is consistent. Fixes: 28541d41c9e0 ("vfio/pci: Add infrastructure for additional device specific regions") Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/0-v3-5db12d1bf576+c910-vfio_rw_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-07-23Merge tag 'drm-misc-next-2021-07-22' of ↵Dave Airlie1-6/+5
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.15-rc1: UAPI Changes: - Remove sysfs stats for dma-buf attachments, as it causes a performance regression. Previous merge is not in a rc kernel yet, so no userspace regression possible. Cross-subsystem Changes: - Sanitize user input in kyro's viewport ioctl. - Use refcount_t in fb_info->count - Assorted fixes to dma-buf. - Extend x86 efifb handling to all archs. - Fix neofb divide by 0. - Document corpro,gm7123 bridge dt bindings. Core Changes: - Slightly rework drm master handling. - Cleanup vgaarb handling. - Assorted fixes. Driver Changes: - Add support for ws2401 panel. - Assorted fixes to stm, ast, bochs. - Demidlayer ingenic irq. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2d0d2fe8-01fc-e216-c3fd-38db9e69944e@linux.intel.com
2021-07-21vgaarb: don't pass a cookie to vga_client_registerChristoph Hellwig1-5/+4
The VGA arbitration is entirely based on pci_dev structures, so just pass that back to the set_vga_decode callback. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patchwork.freedesktop.org/patch/msgid/20210716061634.2446357-8-hch@lst.de Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-21vgaarb: remove the unused irq_set_state argument to vga_client_registerChristoph Hellwig1-1/+1
All callers pass NULL as the irq_set_state argument, so remove it and the ->irq_set_state member in struct vga_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patchwork.freedesktop.org/patch/msgid/20210716061634.2446357-7-hch@lst.de Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-21vgaarb: provide a vga_client_unregister wrapperChristoph Hellwig1-1/+1
Add a trivial wrapper for the unregister case that sets all fields to NULL. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://patchwork.freedesktop.org/patch/msgid/20210716061634.2446357-6-hch@lst.de Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2021-06-30vfio/pci: Handle concurrent vma faultsAlex Williamson1-8/+21
io_remap_pfn_range() will trigger a BUG_ON if it encounters a populated pte within the mapping range. This can occur because we map the entire vma on fault and multiple faults can be blocked behind the vma_lock. This leads to traces like the one reported below. We can use our vma_list to test whether a given vma is mapped to avoid this issue. [ 1591.733256] kernel BUG at mm/memory.c:2177! [ 1591.739515] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 1591.747381] Modules linked in: vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O) [ 1591.760536] CPU: 2 PID: 227 Comm: lcore-worker-2 Tainted: G O 5.11.0-rc3+ #1 [ 1591.770735] Hardware name: , BIOS HixxxxFPGA 1P B600 V121-1 [ 1591.778872] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--) [ 1591.786134] pc : remap_pfn_range+0x214/0x340 [ 1591.793564] lr : remap_pfn_range+0x1b8/0x340 [ 1591.799117] sp : ffff80001068bbd0 [ 1591.803476] x29: ffff80001068bbd0 x28: 0000042eff6f0000 [ 1591.810404] x27: 0000001100910000 x26: 0000001300910000 [ 1591.817457] x25: 0068000000000fd3 x24: ffffa92f1338e358 [ 1591.825144] x23: 0000001140000000 x22: 0000000000000041 [ 1591.832506] x21: 0000001300910000 x20: ffffa92f141a4000 [ 1591.839520] x19: 0000001100a00000 x18: 0000000000000000 [ 1591.846108] x17: 0000000000000000 x16: ffffa92f11844540 [ 1591.853570] x15: 0000000000000000 x14: 0000000000000000 [ 1591.860768] x13: fffffc0000000000 x12: 0000000000000880 [ 1591.868053] x11: ffff0821bf3d01d0 x10: ffff5ef2abd89000 [ 1591.875932] x9 : ffffa92f12ab0064 x8 : ffffa92f136471c0 [ 1591.883208] x7 : 0000001140910000 x6 : 0000000200000000 [ 1591.890177] x5 : 0000000000000001 x4 : 0000000000000001 [ 1591.896656] x3 : 0000000000000000 x2 : 0168044000000fd3 [ 1591.903215] x1 : ffff082126261880 x0 : fffffc2084989868 [ 1591.910234] Call trace: [ 1591.914837] remap_pfn_range+0x214/0x340 [ 1591.921765] vfio_pci_mmap_fault+0xac/0x130 [vfio_pci] [ 1591.931200] __do_fault+0x44/0x12c [ 1591.937031] handle_mm_fault+0xcc8/0x1230 [ 1591.942475] do_page_fault+0x16c/0x484 [ 1591.948635] do_translation_fault+0xbc/0xd8 [ 1591.954171] do_mem_abort+0x4c/0xc0 [ 1591.960316] el0_da+0x40/0x80 [ 1591.965585] el0_sync_handler+0x168/0x1b0 [ 1591.971608] el0_sync+0x174/0x180 [ 1591.978312] Code: eb1b027f 540000c0 f9400022 b4fffe02 (d4210000) Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Reported-by: Zeng Tao <prime.zeng@hisilicon.com> Suggested-by: Zeng Tao <prime.zeng@hisilicon.com> Link: https://lore.kernel.org/r/162497742783.3883260.3282953006487785034.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-24vfio: use the new pci_dev_trylock() helper to simplify try lockLuis Chamberlain1-7/+4
Use the new pci_dev_trylock() helper to simplify our locking. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20210623022824.308041-3-mcgrof@kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-06-15vfio: centralize module refcount in subsystem layerMax Gurtovoy1-7/+0
Remove code duplication and move module refcounting to the subsystem module. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20210518192133.59195-2-mgurtovoy@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-05-24vfio/pci: zap_vma_ptes() needs MMURandy Dunlap1-0/+1
zap_vma_ptes() is only available when CONFIG_MMU is set/enabled. Without CONFIG_MMU, vfio_pci.o has build errors, so make VFIO_PCI depend on MMU. riscv64-linux-ld: drivers/vfio/pci/vfio_pci.o: in function `vfio_pci_mmap_open': vfio_pci.c:(.text+0x1ec): undefined reference to `zap_vma_ptes' riscv64-linux-ld: drivers/vfio/pci/vfio_pci.o: in function `.L0 ': vfio_pci.c:(.text+0x165c): undefined reference to `zap_vma_ptes' Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: kvm@vger.kernel.org Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Eric Auger <eric.auger@redhat.com> Message-Id: <20210515190856.2130-1-rdunlap@infradead.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-05-24vfio/pci: Fix error return code in vfio_ecap_init()Zhen Lei1-1/+1
The error code returned from vfio_ext_cap_len() is stored in 'len', not in 'ret'. Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <20210515020458.6771-1-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-29Merge tag 'vfio-v5.13-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds7-646/+195
Pull VFIO updates from Alex Williamson: - Embed struct vfio_device into vfio driver structures (Jason Gunthorpe) - Make vfio_mdev type safe (Jason Gunthorpe) - Remove vfio-pci NVLink2 extensions for POWER9 (Christoph Hellwig) - Update vfio-pci IGD extensions for OpRegion 2.1+ (Fred Gao) - Various spelling/blank line fixes (Zhen Lei, Zhou Wang, Bhaskar Chowdhury) - Simplify unpin_pages error handling (Shenming Lu) - Fix i915 mdev Kconfig dependency (Arnd Bergmann) - Remove unused structure member (Keqian Zhu) * tag 'vfio-v5.13-rc1' of git://github.com/awilliam/linux-vfio: (43 commits) vfio/gvt: fix DRM_I915_GVT dependency on VFIO_MDEV vfio/iommu_type1: Remove unused pinned_page_dirty_scope in vfio_iommu vfio/mdev: Correct the function signatures for the mdev_type_attributes vfio/mdev: Remove kobj from mdev_parent_ops->create() vfio/gvt: Use mdev_get_type_group_id() vfio/gvt: Make DRM_I915_GVT depend on VFIO_MDEV vfio/mbochs: Use mdev_get_type_group_id() vfio/mdpy: Use mdev_get_type_group_id() vfio/mtty: Use mdev_get_type_group_id() vfio/mdev: Add mdev/mtype_get_type_group_id() vfio/mdev: Remove duplicate storage of parent in mdev_device vfio/mdev: Add missing error handling to dev_set_name() vfio/mdev: Reorganize mdev_device_create() vfio/mdev: Add missing reference counting to mdev_type vfio/mdev: Expose mdev_get/put_parent to mdev_private.h vfio/mdev: Use struct mdev_type in struct mdev_device vfio/mdev: Simplify driver registration vfio/mdev: Add missing typesafety around mdev_device vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer vfio/mdev: Fix missing static's on MDEV_TYPE_ATTR's ...
2021-04-13vfio/pci: Add missing range check in vfio_pci_mmapChristian A. Ehrhardt1-1/+3
When mmaping an extra device region verify that the region index derived from the mmap offset is valid. Fixes: a15b1883fee1 ("vfio_pci: Allow mapping extra regions") Cc: stable@vger.kernel.org Signed-off-by: Christian A. Ehrhardt <lk@c--e.de> Message-Id: <20210412214124.GA241759@lisa.in-ulm.de> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06Merge branches 'v5.13/vfio/embed-vfio_device', 'v5.13/vfio/misc' and ↵Alex Williamson7-532/+55
'v5.13/vfio/nvlink' into v5.13/vfio/next Spelling fixes merged with file deletion. Conflicts: drivers/vfio/pci/vfio_pci_nvlink2.c Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio: Remove device_data from the vfio bus driver APIJason Gunthorpe1-1/+1
There are no longer any users, so it can go away. Everything is using container_of now. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <14-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio/pci: Replace uses of vfio_device_data() with container_ofJason Gunthorpe1-43/+24
This tidies a few confused places that think they can have a refcount on the vfio_device but the device_data could be NULL, that isn't possible by design. Most of the change falls out when struct vfio_devices is updated to just store the struct vfio_pci_device itself. This wasn't possible before because there was no easy way to get from the 'struct vfio_pci_device' to the 'struct vfio_device' to put back the refcount. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <13-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *'Jason Gunthorpe1-18/+29
This is the standard kernel pattern, the ops associated with a struct get the struct pointer in for typesafety. The expected design is to use container_of to cleanly go from the subsystem level type to the driver level type without having any type erasure in a void *. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <12-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio/pci: Use vfio_init/register/unregister_group_devJason Gunthorpe2-5/+6
pci already allocates a struct vfio_pci_device with exactly the same lifetime as vfio_device, switch to the new API and embed vfio_device in vfio_pci_device. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <9-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio/pci: Re-order vfio_pci_probe()Jason Gunthorpe1-8/+9
vfio_add_group_dev() must be called only after all of the private data in vdev is fully setup and ready, otherwise there could be races with user space instantiating a device file descriptor and starting to call ops. For instance vfio_pci_reflck_attach() sets vdev->reflck and vfio_pci_open(), called by fops open, unconditionally derefs it, which will crash if things get out of order. Fixes: cc20d7999000 ("vfio/pci: Introduce VF token") Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release") Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state") Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <8-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio/pci: Move VGA and VF initialization to functionsJason Gunthorpe1-42/+74
vfio_pci_probe() is quite complicated, with optional VF and VGA sub components. Move these into clear init/uninit functions and have a linear flow in probe/remove. This fixes a few little buglets: - vfio_pci_remove() is in the wrong order, vga_client_register() removes a notifier and is after kfree(vdev), but the notifier refers to vdev, so it can use after free in a race. - vga_client_register() can fail but was ignored Organize things so destruction order is the reverse of creation order. Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <7-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio/pci: remove vfio_pci_nvlink2Christoph Hellwig5-529/+0
This driver never had any open userspace (which for VFIO would include VM kernel drivers) that use it, and thus should never have been added by our normal userspace ABI rules. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20210326061311.1497642-2-hch@lst.de> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio/pci: fix a couple of spelling mistakesZhen Lei2-3/+3
There are several spelling mistakes, as follows: thru ==> through presense ==> presence Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20210326083528.1329-4-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio/pci: Add support for opregion v2.1+Fred Gao1-0/+53
Before opregion version 2.0 VBT data is stored in opregion mailbox #4, but when VBT data exceeds 6KB size and cannot be within mailbox #4 then from opregion v2.0+, Extended VBT region, next to opregion is used to hold the VBT data, so the total size will be opregion size plus extended VBT region size. Since opregion v2.0 with physical host VBT address would not be practically available for end user and guest can not directly access host physical address, so it is not supported. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Swee Yee Fonn <swee.yee.fonn@intel.com> Signed-off-by: Fred Gao <fred.gao@intel.com> Message-Id: <20210325170953.24549-1-fred.gao@intel.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio/pci: Remove an unnecessary blank line in vfio_pci_enableZhou Wang1-1/+0
This blank line is unnecessary, so remove it. Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Message-Id: <1615808073-178604-1-git-send-email-wangzhou1@hisilicon.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-04-06vfio: pci: Spello fix in the file vfio_pci.cBhaskar Chowdhury1-1/+1
s/permision/permission/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Message-Id: <20210314052925.3560-1-unixbhaskar@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-03-29vfio/nvlink: Add missing SPAPR_TCE_IOMMU dependsJason Gunthorpe1-1/+1
Compiling the nvlink stuff relies on the SPAPR_TCE_IOMMU otherwise there are compile errors: drivers/vfio/pci/vfio_pci_nvlink2.c:101:10: error: implicit declaration of function 'mm_iommu_put' [-Werror,-Wimplicit-function-declaration] ret = mm_iommu_put(data->mm, data->mem); As PPC only defines these functions when the config is set. Previously this wasn't a problem by chance as SPAPR_TCE_IOMMU was the only IOMMU that could have satisfied IOMMU_API on POWERNV. Fixes: 179209fa1270 ("vfio: IOMMU_API should be selected") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <0-v1-83dba9768fc3+419-vfio_nvlink2_kconfig_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-02-19vfio/pci: remove CONFIG_VFIO_PCI_ZDEV from KconfigMax Gurtovoy4-21/+7
In case we're running on s390 system always expose the capabilities for configuration of zPCI devices. In case we're running on different platform, continue as usual. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>