summaryrefslogtreecommitdiff
path: root/drivers/vfio
AgeCommit message (Collapse)AuthorFilesLines
2024-04-13vfio/fsl-mc: Block calling interrupt handler without triggerAlex Williamson1-3/+4
[ Upstream commit 7447d911af699a15f8d050dfcb7c680a86f87012 ] The eventfd_ctx trigger pointer of the vfio_fsl_mc_irq object is initially NULL and may become NULL if the user sets the trigger eventfd to -1. The interrupt handler itself is guaranteed that trigger is always valid between request_irq() and free_irq(), but the loopback testing mechanisms to invoke the handler function need to test the trigger. The triggering and setting ioctl paths both make use of igate and are therefore mutually exclusive. The vfio-fsl-mc driver does not make use of irqfds, nor does it support any sort of masking operations, therefore unlike vfio-pci and vfio-platform, the flow can remain essentially unchanged. Cc: Diana Craciun <diana.craciun@oss.nxp.com> Cc: <stable@vger.kernel.org> Fixes: cc0ee20bd969 ("vfio/fsl-mc: trigger an interrupt via eventfd") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240308230557.805580-8-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-13vfio/platform: Create persistent IRQ handlersAlex Williamson1-33/+68
[ Upstream commit 675daf435e9f8e5a5eab140a9864dfad6668b375 ] The vfio-platform SET_IRQS ioctl currently allows loopback triggering of an interrupt before a signaling eventfd has been configured by the user, which thereby allows a NULL pointer dereference. Rather than register the IRQ relative to a valid trigger, register all IRQs in a disabled state in the device open path. This allows mask operations on the IRQ to nest within the overall enable state governed by a valid eventfd signal. This decouples @masked, protected by the @locked spinlock from @trigger, protected via the @igate mutex. In doing so, it's guaranteed that changes to @trigger cannot race the IRQ handlers because the IRQ handler is synchronously disabled before modifying the trigger, and loopback triggering of the IRQ via ioctl is safe due to serialization with trigger changes via igate. For compatibility, request_irq() failures are maintained to be local to the SET_IRQS ioctl rather than a fatal error in the open device path. This allows, for example, a userspace driver with polling mode support to continue to work regardless of moving the request_irq() call site. This necessarily blocks all SET_IRQS access to the failed index. Cc: Eric Auger <eric.auger@redhat.com> Cc: <stable@vger.kernel.org> Fixes: 57f972e2b341 ("vfio/platform: trigger an interrupt via eventfd") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240308230557.805580-7-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-13vfio/pci: Create persistent INTx handlerAlex Williamson1-67/+82
[ Upstream commit 18c198c96a815c962adc2b9b77909eec0be7df4d ] A vulnerability exists where the eventfd for INTx signaling can be deconfigured, which unregisters the IRQ handler but still allows eventfds to be signaled with a NULL context through the SET_IRQS ioctl or through unmask irqfd if the device interrupt is pending. Ideally this could be solved with some additional locking; the igate mutex serializes the ioctl and config space accesses, and the interrupt handler is unregistered relative to the trigger, but the irqfd path runs asynchronous to those. The igate mutex cannot be acquired from the atomic context of the eventfd wake function. Disabling the irqfd relative to the eventfd registration is potentially incompatible with existing userspace. As a result, the solution implemented here moves configuration of the INTx interrupt handler to track the lifetime of the INTx context object and irq_type configuration, rather than registration of a particular trigger eventfd. Synchronization is added between the ioctl path and eventfd_signal() wrapper such that the eventfd trigger can be dynamically updated relative to in-flight interrupts or irqfd callbacks. Cc: <stable@vger.kernel.org> Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240308230557.805580-5-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-13vfio: Introduce interface to flush virqfd inject workqueueAlex Williamson1-0/+21
[ Upstream commit b620ecbd17a03cacd06f014a5d3f3a11285ce053 ] In order to synchronize changes that can affect the thread callback, introduce an interface to force a flush of the inject workqueue. The irqfd pointer is only valid under spinlock, but the workqueue cannot be flushed under spinlock. Therefore the flush work for the irqfd is queued under spinlock. The vfio_irqfd_cleanup_wq workqueue is re-used for queuing this work such that flushing the workqueue is also ordered relative to shutdown. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240308230557.805580-4-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-13vfio/pci: Lock external INTx masking opsAlex Williamson1-6/+24
[ Upstream commit 810cd4bb53456d0503cc4e7934e063835152c1b7 ] Mask operations through config space changes to DisINTx may race INTx configuration changes via ioctl. Create wrappers that add locking for paths outside of the core interrupt code. In particular, irq_type is updated holding igate, therefore testing is_intx() requires holding igate. For example clearing DisINTx from config space can otherwise race changes of the interrupt configuration. This aligns interfaces which may trigger the INTx eventfd into two camps, one side serialized by igate and the other only enabled while INTx is configured. A subsequent patch introduces synchronization for the latter flows. Cc: <stable@vger.kernel.org> Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240308230557.805580-3-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-13vfio/pci: Disable auto-enable of exclusive INTx IRQAlex Williamson1-7/+10
[ Upstream commit fe9a7082684eb059b925c535682e68c34d487d43 ] Currently for devices requiring masking at the irqchip for INTx, ie. devices without DisINTx support, the IRQ is enabled in request_irq() and subsequently disabled as necessary to align with the masked status flag. This presents a window where the interrupt could fire between these events, resulting in the IRQ incrementing the disable depth twice. This would be unrecoverable for a user since the masked flag prevents nested enables through vfio. Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx is never auto-enabled, then unmask as required. Cc: <stable@vger.kernel.org> Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240308230557.805580-2-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-13vfio/platform: Disable virqfds on cleanupAlex Williamson1-1/+4
[ Upstream commit fcdc0d3d40bc26c105acf8467f7d9018970944ae ] irqfds for mask and unmask that are not specifically disabled by the user are leaked. Remove any irqfds during cleanup Cc: Eric Auger <eric.auger@redhat.com> Cc: <stable@vger.kernel.org> Fixes: a7fa7c77cf15 ("vfio/platform: implement IRQ masking/unmasking via an eventfd") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240308230557.805580-6-alex.williamson@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19vfio/type1: fix cap_migration information leakStefan Hajnoczi1-1/+1
[ Upstream commit cd24e2a60af633f157d7e59c0a6dba64f131c0b1 ] Fix an information leak where an uninitialized hole in struct vfio_iommu_type1_info_cap_migration on the stack is exposed to userspace. The definition of struct vfio_iommu_type1_info_cap_migration contains a hole as shown in this pahole(1) output: struct vfio_iommu_type1_info_cap_migration { struct vfio_info_cap_header header; /* 0 8 */ __u32 flags; /* 8 4 */ /* XXX 4 bytes hole, try to pack */ __u64 pgsize_bitmap; /* 16 8 */ __u64 max_dirty_bitmap_size; /* 24 8 */ /* size: 32, cachelines: 1, members: 4 */ /* sum members: 28, holes: 1, sum holes: 4 */ /* last cacheline: 32 bytes */ }; The cap_mig variable is filled in without initializing the hole: static int vfio_iommu_migration_build_caps(struct vfio_iommu *iommu, struct vfio_info_cap *caps) { struct vfio_iommu_type1_info_cap_migration cap_mig; cap_mig.header.id = VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION; cap_mig.header.version = 1; cap_mig.flags = 0; /* support minimum pgsize */ cap_mig.pgsize_bitmap = (size_t)1 << __ffs(iommu->pgsize_bitmap); cap_mig.max_dirty_bitmap_size = DIRTY_BITMAP_SIZE_MAX; return vfio_info_add_capability(caps, &cap_mig.header, sizeof(cap_mig)); } The structure is then copied to a temporary location on the heap. At this point it's already too late and ioctl(VFIO_IOMMU_GET_INFO) copies it to userspace later: int vfio_info_add_capability(struct vfio_info_cap *caps, struct vfio_info_cap_header *cap, size_t size) { struct vfio_info_cap_header *header; header = vfio_info_cap_add(caps, size, cap->id, cap->version); if (IS_ERR(header)) return PTR_ERR(header); memcpy(header + 1, cap + 1, size - sizeof(*header)); return 0; } This issue was found by code inspection. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Fixes: ad721705d09c ("vfio iommu: Add migration capability to report supported features") Link: https://lore.kernel.org/r/20230801155352.1391945-1-stefanha@redhat.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11vfio/type1: prevent underflow of locked_vm via exec()Steve Sistare1-27/+14
commit 046eca5018f8a5dd1dc2cedf87fb5843b9ea3026 upstream. When a vfio container is preserved across exec, the task does not change, but it gets a new mm with locked_vm=0, and loses the count from existing dma mappings. If the user later unmaps a dma mapping, locked_vm underflows to a large unsigned value, and a subsequent dma map request fails with ENOMEM in __account_locked_vm. To avoid underflow, grab and save the mm at the time a dma is mapped. Use that mm when adjusting locked_vm, rather than re-acquiring the saved task's mm, which may have changed. If the saved mm is dead, do nothing. locked_vm is incremented for existing mappings in a subsequent patch. Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation") Cc: stable@vger.kernel.org Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1675184289-267876-3-git-send-email-steven.sistare@oracle.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-14vfio: platform: Do not pass return buffer to ACPI _RST methodRafael Mendonca1-2/+1
[ Upstream commit e67e070632a665c932d534b8b800477bb3111449 ] The ACPI _RST method has no return value, there's no need to pass a return buffer to acpi_evaluate_object(). Fixes: d30daa33ec1d ("vfio: platform: call _RST method when using ACPI") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20221018152825.891032-1-rafaelmendsr@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28vfio/type1: fix vaddr_get_pfns() return in vfio_pin_page_external()Daniel Jordan1-1/+7
commit 4ab4fcfce5b540227d80eb32f1db45ab615f7c92 upstream. vaddr_get_pfns() now returns the positive number of pfns successfully gotten instead of zero. vfio_pin_page_external() might return 1 to vfio_iommu_type1_pin_pages(), which will treat it as an error, if vaddr_get_pfns() is successful but vfio_pin_page_external() doesn't reach vfio_lock_acct(). Fix it up in vfio_pin_page_external(). Found by inspection. Fixes: be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Message-Id: <20210308172452.38864-1-daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-28vfio/type1: Unpin zero pagesAlex Williamson1-0/+12
[ Upstream commit 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4 ] There's currently a reference count leak on the zero page. We increment the reference via pin_user_pages_remote(), but the page is later handled as an invalid/reserved page, therefore it's not accounted against the user and not unpinned by our put_pfn(). Introducing special zero page handling in put_pfn() would resolve the leak, but without accounting of the zero page, a single user could still create enough mappings to generate a reference count overflow. The zero page is always resident, so for our purposes there's no reason to keep it pinned. Therefore, add a loop to walk pages returned from pin_user_pages_remote() and unpin any zero pages. Cc: stable@vger.kernel.org Reported-by: Luboslav Pivarc <lpivarc@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/166182871735.3518559.8884121293045337358.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28vfio/type1: Prepare for batched pinning with struct vfio_batchDaniel Jordan1-13/+58
[ Upstream commit 4b6c33b3229678e38a6b0bbd4367d4b91366b523 ] Get ready to pin more pages at once with struct vfio_batch, which represents a batch of pinned pages. The struct has a fallback page pointer to avoid two unlikely scenarios: pointlessly allocating a page if disable_hugepages is enabled or failing the whole pinning operation if the kernel can't allocate memory. vaddr_get_pfn() becomes vaddr_get_pfns() to prepare for handling multiple pages, though for now only one page is stored in the pages array. Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Stable-dep-of: 873aefb376bb ("vfio/type1: Unpin zero pages") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28vfio/type1: Change success value of vaddr_get_pfn()Daniel Jordan1-7/+14
[ Upstream commit be16c1fd99f41abebc0bf965d5d29cd18c9d271e ] vaddr_get_pfn() simply returns 0 on success. Have it report the number of pfns successfully gotten instead, whether from page pinning or follow_fault_pfn(), which will be used later when batching pinning. Change the last check in vfio_pin_pages_remote() for consistency with the other two. Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Stable-dep-of: 873aefb376bb ("vfio/type1: Unpin zero pages") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25vfio: Clear the caps->buf to NULL after freeSchspa Shi1-0/+1
[ Upstream commit 6641085e8d7b3f061911517f79a2a15a0a21b97b ] On buffer resize failure, vfio_info_cap_add() will free the buffer, report zero for the size, and return -ENOMEM. As additional hardening, also clear the buffer pointer to prevent any chance of a double free. Signed-off-by: Schspa Shi <schspa@gmail.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220629022948.55608-1-schspa@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21vfio/mdev: Make to_mdev_device() into a static inlineJason Gunthorpe1-1/+4
[ Upstream commit 66873b5fa738ca02b5c075ca4a410b13d88e6e9a ] The macro wrongly uses 'dev' as both the macro argument and the member name, which means it fails compilation if any caller uses a word other than 'dev' as the single argument. Fix this defect by making it into proper static inline, which is more clear and typesafe anyhow. Fixes: 99e3123e3d72 ("vfio-mdev: Make mdev_device private and abstract interfaces") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <11-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21vfio: Split creation of a vfio_device into init and register opsJason Gunthorpe1-60/+65
[ Upstream commit 0bfc6a4ea63c2adac71a824397ef48f28dbc5e47 ] This makes the struct vfio_device part of the public interface so it can be used with container_of and so forth, as is typical for a Linux subystem. This is the first step to bring some type-safety to the vfio interface by allowing the replacement of 'void *' and 'struct device *' inputs with a simple and clear 'struct vfio_device *' For now the self-allocating vfio_add_group_dev() interface is kept so each user can be updated as a separate patch. The expected usage pattern is driver core probe() function: my_device = kzalloc(sizeof(*mydevice)); vfio_init_group_dev(&my_device->vdev, dev, ops, mydevice); /* other driver specific prep */ vfio_register_group_dev(&my_device->vdev); dev_set_drvdata(dev, my_device); driver core remove() function: my_device = dev_get_drvdata(dev); vfio_unregister_group_dev(&my_device->vdev); /* other driver specific tear down */ kfree(my_device); Allowing the driver to be able to use the drvdata and vfio_device to go to/from its own data. The pattern also makes it clear that vfio_register_group_dev() must be last in the sequence, as once it is called the core code can immediately start calling ops. The init/register gap is provided to allow for the driver to do setup before ops can be called and thus avoid races. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <3-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21vfio: Simplify the lifetime logic for vfio_deviceJason Gunthorpe1-54/+25
[ Upstream commit 5e42c999445bd0ae86e35affeb3e7c473d74a893 ] The vfio_device is using a 'sleep until all refs go to zero' pattern for its lifetime, but it is indirectly coded by repeatedly scanning the group list waiting for the device to be removed on its own. Switch this around to be a direct representation, use a refcount to count the number of places that are blocking destruction and sleep directly on a completion until that counter goes to zero. kfree the device after other accesses have been excluded in vfio_del_group_dev(). This is a fairly common Linux idiom. Due to this we can now remove kref_put_mutex(), which is very rarely used in the kernel. Here it is being used to prevent a zero ref device from being seen in the group list. Instead allow the zero ref device to continue to exist in the device_list and use refcount_inc_not_zero() to exclude it once refs go to zero. This patch is organized so the next patch will be able to alter the API to allow drivers to provide the kfree. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <2-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21vfio: Remove extra put/gets around vfio_device->groupJason Gunthorpe1-19/+2
[ Upstream commit e572bfb2b6a83b05acd30c03010e661b1967960f ] The vfio_device->group value has a get obtained during vfio_add_group_dev() which gets moved from the stack to vfio_device->group in vfio_group_create_device(). The reference remains until we reach the end of vfio_del_group_dev() when it is put back. Thus anything that already has a kref on the vfio_device is guaranteed a valid group pointer. Remove all the extra reference traffic. It is tricky to see, but the get at the start of vfio_del_group_dev() is actually pairing with the put hidden inside vfio_device_put() a few lines below. A later patch merges vfio_group_create_device() into vfio_add_group_dev() which makes the ownership and error flow on the create side easier to follow. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <1-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08amba: Make the remove callback return voidUwe Kleine-König1-2/+1
[ Upstream commit 3fd269e74f2feec973f45ee11d822faeda4fe284 ] All amba drivers return 0 in their remove callback. Together with the driver core ignoring the return value anyhow, it doesn't make sense to return a value here. Change the remove prototype to return void, which makes it explicit that returning an error value doesn't work as expected. This simplifies changing the core remove callback to return void, too. Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> # for drivers/memory Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> # for hwtracing/coresight Acked-By: Vinod Koul <vkoul@kernel.org> # for dmaengine Acked-by: Guenter Roeck <linux@roeck-us.net> # for watchdog Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C Acked-by: Takashi Iwai <tiwai@suse.de> # for sound Acked-by: Vladimir Zapolskiy <vz@mleia.com> # for memory/pl172 Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20210126165835.687514-5-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08vfio: platform: simplify device removalUwe Kleine-König1-9/+5
[ Upstream commit 5b495ac8fe03b9e0d2e775f9064c3e2a340ff440 ] vfio_platform_remove_common() cannot return non-NULL in vfio_amba_remove() as the latter is only called if vfio_amba_probe() returned success. Diagnosed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20210126165835.687514-4-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18vfio: Use config not menuconfig for VFIO_NOIOMMUJason Gunthorpe1-1/+1
[ Upstream commit 26c22cfde5dd6e63f25c48458b0185dcb0fbb2fd ] VFIO_NOIOMMU is supposed to be an element in the VFIO menu, not start a new menu. Correct this copy-paste mistake. Fixes: 03a76b60f8ba ("vfio: Include No-IOMMU mode") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/0-v1-3f0b685c3679+478-vfio_menuconfig_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14vfio/pci: Handle concurrent vma faultsAlex Williamson1-8/+21
[ Upstream commit 6a45ece4c9af473555f01f0f8b97eba56e3c7d0d ] io_remap_pfn_range() will trigger a BUG_ON if it encounters a populated pte within the mapping range. This can occur because we map the entire vma on fault and multiple faults can be blocked behind the vma_lock. This leads to traces like the one reported below. We can use our vma_list to test whether a given vma is mapped to avoid this issue. [ 1591.733256] kernel BUG at mm/memory.c:2177! [ 1591.739515] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 1591.747381] Modules linked in: vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O) [ 1591.760536] CPU: 2 PID: 227 Comm: lcore-worker-2 Tainted: G O 5.11.0-rc3+ #1 [ 1591.770735] Hardware name: , BIOS HixxxxFPGA 1P B600 V121-1 [ 1591.778872] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--) [ 1591.786134] pc : remap_pfn_range+0x214/0x340 [ 1591.793564] lr : remap_pfn_range+0x1b8/0x340 [ 1591.799117] sp : ffff80001068bbd0 [ 1591.803476] x29: ffff80001068bbd0 x28: 0000042eff6f0000 [ 1591.810404] x27: 0000001100910000 x26: 0000001300910000 [ 1591.817457] x25: 0068000000000fd3 x24: ffffa92f1338e358 [ 1591.825144] x23: 0000001140000000 x22: 0000000000000041 [ 1591.832506] x21: 0000001300910000 x20: ffffa92f141a4000 [ 1591.839520] x19: 0000001100a00000 x18: 0000000000000000 [ 1591.846108] x17: 0000000000000000 x16: ffffa92f11844540 [ 1591.853570] x15: 0000000000000000 x14: 0000000000000000 [ 1591.860768] x13: fffffc0000000000 x12: 0000000000000880 [ 1591.868053] x11: ffff0821bf3d01d0 x10: ffff5ef2abd89000 [ 1591.875932] x9 : ffffa92f12ab0064 x8 : ffffa92f136471c0 [ 1591.883208] x7 : 0000001140910000 x6 : 0000000200000000 [ 1591.890177] x5 : 0000000000000001 x4 : 0000000000000001 [ 1591.896656] x3 : 0000000000000000 x2 : 0168044000000fd3 [ 1591.903215] x1 : ffff082126261880 x0 : fffffc2084989868 [ 1591.910234] Call trace: [ 1591.914837] remap_pfn_range+0x214/0x340 [ 1591.921765] vfio_pci_mmap_fault+0xac/0x130 [vfio_pci] [ 1591.931200] __do_fault+0x44/0x12c [ 1591.937031] handle_mm_fault+0xcc8/0x1230 [ 1591.942475] do_page_fault+0x16c/0x484 [ 1591.948635] do_translation_fault+0xbc/0xd8 [ 1591.954171] do_mem_abort+0x4c/0xc0 [ 1591.960316] el0_da+0x40/0x80 [ 1591.965585] el0_sync_handler+0x168/0x1b0 [ 1591.971608] el0_sync+0x174/0x180 [ 1591.978312] Code: eb1b027f 540000c0 f9400022 b4fffe02 (d4210000) Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Reported-by: Zeng Tao <prime.zeng@hisilicon.com> Suggested-by: Zeng Tao <prime.zeng@hisilicon.com> Link: https://lore.kernel.org/r/162497742783.3883260.3282953006487785034.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-10vfio/platform: fix module_put call in error flowMax Gurtovoy1-1/+1
[ Upstream commit dc51ff91cf2d1e9a2d941da483602f71d4a51472 ] The ->parent_module is the one that use in try_module_get. It should also be the one the we use in module_put during vfio_platform_open(). Fixes: 32a2d71c4e80 ("vfio: platform: introduce vfio-platform-base module") Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <20210518192133.59195-1-mgurtovoy@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-10vfio/pci: zap_vma_ptes() needs MMURandy Dunlap1-0/+1
[ Upstream commit 2a55ca37350171d9b43d561528f23d4130097255 ] zap_vma_ptes() is only available when CONFIG_MMU is set/enabled. Without CONFIG_MMU, vfio_pci.o has build errors, so make VFIO_PCI depend on MMU. riscv64-linux-ld: drivers/vfio/pci/vfio_pci.o: in function `vfio_pci_mmap_open': vfio_pci.c:(.text+0x1ec): undefined reference to `zap_vma_ptes' riscv64-linux-ld: drivers/vfio/pci/vfio_pci.o: in function `.L0 ': vfio_pci.c:(.text+0x165c): undefined reference to `zap_vma_ptes' Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: kvm@vger.kernel.org Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Eric Auger <eric.auger@redhat.com> Message-Id: <20210515190856.2130-1-rdunlap@infradead.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-10vfio/pci: Fix error return code in vfio_ecap_init()Zhen Lei1-1/+1
[ Upstream commit d1ce2c79156d3baf0830990ab06d296477b93c26 ] The error code returned from vfio_ext_cap_len() is stored in 'len', not in 'ret'. Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <20210515020458.6771-1-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14vfio/mdev: Do not allow a mdev_type to have a NULL parent pointerJason Gunthorpe1-1/+1
[ Upstream commit b5a1f8921d5040bb788492bf33a66758021e4be5 ] There is a small race where the parent is NULL even though the kobj has already been made visible in sysfs. For instance the attribute_group is made visible in sysfs_create_files() and the mdev_type_attr_show() does: ret = attr->show(kobj, type->parent->dev, buf); Which will crash on NULL parent. Move the parent setup to before the type pointer leaves the stack frame. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <2-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14vfio/pci: Re-order vfio_pci_probe()Jason Gunthorpe1-8/+9
[ Upstream commit 4aeec3984ddc853f7c65903bde472ffdef738bae ] vfio_add_group_dev() must be called only after all of the private data in vdev is fully setup and ready, otherwise there could be races with user space instantiating a device file descriptor and starting to call ops. For instance vfio_pci_reflck_attach() sets vdev->reflck and vfio_pci_open(), called by fops open, unconditionally derefs it, which will crash if things get out of order. Fixes: cc20d7999000 ("vfio/pci: Introduce VF token") Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release") Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state") Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <8-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14vfio/pci: Move VGA and VF initialization to functionsJason Gunthorpe1-42/+74
[ Upstream commit 61e90817482871b614133c0f20feb1aba2faec86 ] vfio_pci_probe() is quite complicated, with optional VF and VGA sub components. Move these into clear init/uninit functions and have a linear flow in probe/remove. This fixes a few little buglets: - vfio_pci_remove() is in the wrong order, vga_client_register() removes a notifier and is after kfree(vdev), but the notifier refers to vdev, so it can use after free in a race. - vga_client_register() can fail but was ignored Organize things so destruction order is the reverse of creation order. Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <7-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14vfio/fsl-mc: Re-order vfio_fsl_mc_probe()Jason Gunthorpe1-27/+47
[ Upstream commit 2b1fe162e584a88ec7f12a651a2a50f94dd8cfac ] vfio_add_group_dev() must be called only after all of the private data in vdev is fully setup and ready, otherwise there could be races with user space instantiating a device file descriptor and starting to call ops. For instance vfio_fsl_mc_reflck_attach() sets vdev->reflck and vfio_fsl_mc_open(), called by fops open, unconditionally derefs it, which will crash if things get out of order. This driver started life with the right sequence, but two commits added stuff after vfio_add_group_dev(). Fixes: 2e0d29561f59 ("vfio/fsl-mc: Add irq infrastructure for fsl-mc devices") Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling") Co-developed-by: Diana Craciun OSS <diana.craciun@oss.nxp.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <5-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-07vfio: Depend on MMUJason Gunthorpe1-1/+1
commit b2b12db53507bc97d96f6b7cb279e831e5eafb00 upstream. VFIO_IOMMU_TYPE1 does not compile with !MMU: ../drivers/vfio/vfio_iommu_type1.c: In function 'follow_fault_pfn': ../drivers/vfio/vfio_iommu_type1.c:536:22: error: implicit declaration of function 'pte_write'; did you mean 'vfs_write'? [-Werror=implicit-function-declaration] So require it. Suggested-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <0-v1-02cb5500df6e+78-vfio_no_mmu_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-21vfio/pci: Add missing range check in vfio_pci_mmapChristian A. Ehrhardt1-1/+3
commit 909290786ea335366e21d7f1ed5812b90f2f0a92 upstream. When mmaping an extra device region verify that the region index derived from the mmap offset is valid. Fixes: a15b1883fee1 ("vfio_pci: Allow mapping extra regions") Cc: stable@vger.kernel.org Signed-off-by: Christian A. Ehrhardt <lk@c--e.de> Message-Id: <20210412214124.GA241759@lisa.in-ulm.de> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07vfio/nvlink: Add missing SPAPR_TCE_IOMMU dependsJason Gunthorpe1-1/+1
commit e0146a108ce4d2c22b9510fd12268e3ee72a0161 upstream. Compiling the nvlink stuff relies on the SPAPR_TCE_IOMMU otherwise there are compile errors: drivers/vfio/pci/vfio_pci_nvlink2.c:101:10: error: implicit declaration of function 'mm_iommu_put' [-Werror,-Wimplicit-function-declaration] ret = mm_iommu_put(data->mm, data->mem); As PPC only defines these functions when the config is set. Previously this wasn't a problem by chance as SPAPR_TCE_IOMMU was the only IOMMU that could have satisfied IOMMU_API on POWERNV. Fixes: 179209fa1270 ("vfio: IOMMU_API should be selected") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <0-v1-83dba9768fc3+419-vfio_nvlink2_kconfig_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-25vfio: IOMMU_API should be selectedJason Gunthorpe1-1/+1
commit 179209fa12709a3df8888c323b37315da2683c24 upstream. As IOMMU_API is a kconfig without a description (eg does not show in the menu) the correct operator is select not 'depends on'. Using 'depends on' for this kind of symbol means VFIO is not selectable unless some other random kconfig has already enabled IOMMU_API for it. Fixes: cba3345cc494 ("vfio: VFIO core") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <1-v1-df057e0f92c3+91-vfio_arm_compile_test_jgg@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04vfio/type1: Use follow_pte()Alex Williamson1-2/+13
[ Upstream commit 07956b6269d3ed05d854233d5bb776dca91751dd ] follow_pfn() doesn't make sure that we're using the correct page protections, get the pte with follow_pte() so that we can test protections and get the pfn from the pte. Fixes: 5cbf3264bc71 ("vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()") Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04vfio-pci/zdev: fix possible segmentation fault issueMax Gurtovoy1-0/+4
[ Upstream commit 7e31d6dc2c78b2a0ba0039ca97ca98a581e8db82 ] In case allocation fails, we must behave correctly and exit with error. Fixes: e6b817d4b821 ("vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO") Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04vfio/iommu_type1: Fix some sanity checks in detach groupKeqian Zhu1-23/+8
[ Upstream commit 4a19f37a3dd3f29997735e61b25ddad24b8abe73 ] vfio_sanity_check_pfn_list() is used to check whether pfn_list and notifier are empty when remove the external domain, so it makes a wrong assumption that only external domain will use the pinning interface. Now we apply the pfn_list check when a vfio_dma is removed and apply the notifier check when all domains are removed. Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices") Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04vfio/iommu_type1: Populate full dirty when detach non-pinned groupKeqian Zhu1-1/+16
[ Upstream commit d0a78f91761fcd837da1e7a4b0f8368873adc646 ] If a group with non-pinned-page dirty scope is detached with dirty logging enabled, we should fully populate the dirty bitmaps at the time it's removed since we don't know the extent of its previous DMA, nor will the group be present to trigger the full bitmap when the user retrieves the dirty bitmap. Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking") Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30vfio/pci/nvlink2: Do not attempt NPU2 setup on POWER8NVL NPUAlexey Kardashevskiy1-2/+5
commit d22f9a6c92de96304c81792942ae7c306f08ac77 upstream. We execute certain NPU2 setup code (such as mapping an LPID to a device in NPU2) unconditionally if an Nvlink bridge is detected. However this cannot succeed on POWER8NVL machines as the init helpers return an error other than ENODEV which means the device is there is and setup failed so vfio_pci_enable() fails and pass through is not possible. This changes the two NPU2 related init helpers to return -ENODEV if there is no "memory-region" device tree property as this is the distinction between NPU and NPU2. Tested on - POWER9 pvr=004e1201, Ubuntu 19.04 host, Ubuntu 18.04 vm, NVIDIA GV100 10de:1db1 driver 418.39 - POWER8 pvr=004c0100, RHEL 7.6 host, Ubuntu 16.10 vm, NVIDIA P100 10de:15f9 driver 396.47 Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver") Cc: stable@vger.kernel.org # 5.0 Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30vfio/pci: Move dummy_resources_list init in vfio_pci_probe()Eric Auger1-2/+1
commit 16b8fe4caf499ae8e12d2ab1b1324497e36a7b83 upstream. In case an error occurs in vfio_pci_enable() before the call to vfio_pci_probe_mmaps(), vfio_pci_disable() will try to iterate on an uninitialized list and cause a kernel panic. Lets move to the initialization to vfio_pci_probe() to fix the issue. Signed-off-by: Eric Auger <eric.auger@redhat.com> Fixes: 05f0c03fbac1 ("vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive") CC: Stable <stable@vger.kernel.org> # v4.7+ Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30vfio-pci: Use io_remap_pfn_range() for PCI IO memoryJason Gunthorpe1-2/+2
[ Upstream commit 7b06a56d468b756ad6bb43ac21b11e474ebc54a0 ] commit f8f6ae5d077a ("mm: always have io_remap_pfn_range() set pgprot_decrypted()") allows drivers using mmap to put PCI memory mapped BAR space into userspace to work correctly on AMD SME systems that default to all memory encrypted. Since vfio_pci_mmap_fault() is working with PCI memory mapped BAR space it should be calling io_remap_pfn_range() otherwise it will not work on SME systems. Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Peter Xu <peterx@redhat.com> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-03vfio/pci: Bypass IGD init in case of -ENODEVFred Gao1-1/+1
Bypass the IGD initialization when -ENODEV returns, that should be the case if opregion is not available for IGD or within discrete graphics device's option ROM, or host/lpc bridge is not found. Then use of -ENODEV here means no special device resources found which needs special care for VFIO, but we still allow other normal device resource access. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Xiong Zhang <xiong.y.zhang@intel.com> Cc: Hang Yuan <hang.yuan@linux.intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Fred Gao <fred.gao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio: platform: fix reference leak in vfio_platform_openZhang Qilong1-2/+1
pm_runtime_get_sync() will increment pm usage counter even it failed. Forgetting to call pm_runtime_put will result in reference leak in vfio_platform_open, so we should fix it. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Acked-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/pci: Implement ioeventfd thread handler for contended memory lockAlex Williamson1-8/+35
The ioeventfd is called under spinlock with interrupts disabled, therefore if the memory lock is contended defer code that might sleep to a thread context. Fixes: bc93b9ae0151 ("vfio-pci: Avoid recursive read-lock usage") Link: https://bugzilla.kernel.org/show_bug.cgi?id=209253#c1 Reported-by: Ian Pilcher <arequipeno@gmail.com> Tested-by: Ian Pilcher <arequipeno@gmail.com> Tested-by: Justin Gatzen <justin.gatzen@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/fsl-mc: Make vfio_fsl_mc_irqs_allocate staticDiana Craciun1-1/+1
Fixed compiler warning: drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c:16:5: warning: no previous prototype for function 'vfio_fsl_mc_irqs_allocate' [-Wmissing-prototypes] ^ drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c:16:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int vfio_fsl_mc_irqs_allocate(struct vfio_fsl_mc_device *vdev) Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/fsl-mc: prevent underflow in vfio_fsl_mc_mmap()Dan Carpenter1-1/+1
My static analsysis tool complains that the "index" can be negative. There are some checks in do_mmap() which try to prevent underflows but I don't know if they are sufficient for this situation. Either way, making "index" unsigned is harmless so let's do it just to be safe. Fixes: 67247289688d ("vfio/fsl-mc: Allow userspace to MMAP fsl-mc device MMIO regions") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/fsl-mc: return -EFAULT if copy_to_user() failsDan Carpenter1-2/+6
The copy_to_user() function returns the number of bytes remaining to be copied, but this code should return -EFAULT. Fixes: df747bcd5b21 ("vfio/fsl-mc: Implement VFIO_DEVICE_GET_REGION_INFO ioctl call") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/type1: Use the new helper to find vfio_groupZenghui Yu1-12/+5
When attaching a new group to the container, let's use the new helper vfio_iommu_find_iommu_group() to check if it's already attached. There is no functional change. Also take this chance to add a missing blank line. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-22Merge tag 'vfio-v5.10-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds16-16/+1200
Pull VFIO updates from Alex Williamson: - New fsl-mc vfio bus driver supporting userspace drivers of objects within NXP's DPAA2 architecture (Diana Craciun) - Support for exposing zPCI information on s390 (Matthew Rosato) - Fixes for "detached" VFs on s390 (Matthew Rosato) - Fixes for pin-pages and dma-rw accesses (Yan Zhao) - Cleanups and optimize vconfig regen (Zenghui Yu) - Fix duplicate irq-bypass token registration (Alex Williamson) * tag 'vfio-v5.10-rc1' of git://github.com/awilliam/linux-vfio: (30 commits) vfio iommu type1: Fix memory leak in vfio_iommu_type1_pin_pages vfio/pci: Clear token on bypass registration failure vfio/fsl-mc: fix the return of the uninitialized variable ret vfio/fsl-mc: Fix the dead code in vfio_fsl_mc_set_irq_trigger vfio/fsl-mc: Fixed vfio-fsl-mc driver compilation on 32 bit MAINTAINERS: Add entry for s390 vfio-pci vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO vfio/fsl-mc: Add support for device reset vfio/fsl-mc: Add read/write support for fsl-mc devices vfio/fsl-mc: trigger an interrupt via eventfd vfio/fsl-mc: Add irq infrastructure for fsl-mc devices vfio/fsl-mc: Added lock support in preparation for interrupt handling vfio/fsl-mc: Allow userspace to MMAP fsl-mc device MMIO regions vfio/fsl-mc: Implement VFIO_DEVICE_GET_REGION_INFO ioctl call vfio/fsl-mc: Implement VFIO_DEVICE_GET_INFO ioctl vfio/fsl-mc: Scan DPRC objects on vfio-fsl-mc driver bind vfio: Introduce capability definitions for VFIO_DEVICE_GET_INFO s390/pci: track whether util_str is valid in the zpci_dev s390/pci: stash version in the zpci_dev vfio/fsl-mc: Add VFIO framework skeleton for fsl-mc devices ...
2020-10-20vfio iommu type1: Fix memory leak in vfio_iommu_type1_pin_pagesXiaoyang Xu1-1/+2
pfn is not added to pfn_list when vfio_add_to_pfn_list fails. vfio_unpin_page_external will exit directly without calling vfio_iova_put_vfio_pfn. This will lead to a memory leak. Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices") Signed-off-by: Xiaoyang Xu <xuxiaoyang2@huawei.com> [aw: simplified logic, add Fixes] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>