kernel/linux.git/drivers/xen, branch v4.14.263

xen: detect uninitialized xenbus in xenbus_init

2021-12-08T07:46:48+00:00

commit 36e8f60f0867d3b70d398d653c17108459a04efe upstream. If the xenstore page hasn't been allocated properly, reading the value of the related hvm_param (HVM_PARAM_STORE_PFN) won't actually return error. Instead, it will succeed and return zero. Instead of attempting to xen_remap a bad guest physical address, detect this condition and return early. Note that although a guest physical address of zero for HVM_PARAM_STORE_PFN is theoretically possible, it is not a good choice and zero has never been validly used in that capacity. Also recognize all bits set as an invalid value. For 32-bit Linux, any pfn above ULONG_MAX would get truncated. Pfns above ULONG_MAX should never be passed by the Xen tools to HVM guests anyway, so check for this condition and return early. Cc: stable@vger.kernel.org Signed-off-by: Stefano Stabellini Reviewed-by: Juergen Gross Reviewed-by: Jan Beulich Link: https://lore.kernel.org/r/20211123210748.1910236-1-sstabellini@kernel.org Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman

xen: don't continue xenstore initialization in case of errors

2021-12-08T07:46:48+00:00

commit 08f6c2b09ebd4b326dbe96d13f94fee8f9814c78 upstream. In case of errors in xenbus_init (e.g. missing xen_store_gfn parameter), we goto out_error but we forget to reset xen_store_domain_type to XS_UNKNOWN. As a consequence xenbus_probe_initcall and other initcalls will still try to initialize xenstore resulting into a crash at boot. [ 2.479830] Call trace: [ 2.482314] xb_init_comms+0x18/0x150 [ 2.486354] xs_init+0x34/0x138 [ 2.489786] xenbus_probe+0x4c/0x70 [ 2.498432] xenbus_probe_initcall+0x2c/0x7c [ 2.503944] do_one_initcall+0x54/0x1b8 [ 2.507358] kernel_init_freeable+0x1ac/0x210 [ 2.511617] kernel_init+0x28/0x130 [ 2.516112] ret_from_fork+0x10/0x20 Cc: Cc: jbeulich@suse.com Signed-off-by: Stefano Stabellini Link: https://lore.kernel.org/r/20211115222719.2558207-1-sstabellini@kernel.org Reviewed-by: Jan Beulich Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman

xen-pciback: Fix return in pm_ctrl_init()

2021-11-26T10:40:35+00:00

[ Upstream commit 4745ea2628bb43a7ec34b71763b5a56407b33990 ] Return NULL instead of passing to ERR_PTR while err is zero, this fix smatch warnings: drivers/xen/xen-pciback/conf_space_capability.c:163 pm_ctrl_init() warn: passing zero to 'ERR_PTR' Fixes: a92336a1176b ("xen/pciback: Drop two backends, squash and cleanup some code.") Signed-off-by: YueHaibing Reviewed-by: Juergen Gross Link: https://lore.kernel.org/r/20211008074417.8260-1-yuehaibing@huawei.com Signed-off-by: Boris Ostrovsky Signed-off-by: Sasha Levin

xen/balloon: add late_initcall_sync() for initial ballooning done

2021-11-26T10:40:25+00:00

commit 40fdea0284bb20814399da0484a658a96c735d90 upstream. When running as PVH or HVM guest with actual memory < max memory the hypervisor is using "populate on demand" in order to allow the guest to balloon down from its maximum memory size. For this to work correctly the guest must not touch more memory pages than its target memory size as otherwise the PoD cache will be exhausted and the guest is crashed as a result of that. In extreme cases ballooning down might not be finished today before the init process is started, which can consume lots of memory. In order to avoid random boot crashes in such cases, add a late init call to wait for ballooning down having finished for PVH/HVM guests. Warn on console if initial ballooning fails, panic() after stalling for more than 3 minutes per default. Add a module parameter for changing this timeout. [boris: replaced pr_info() with pr_notice()] Cc: Reported-by: Marek Marczykowski-Górecki Signed-off-by: Juergen Gross Link: https://lore.kernel.org/r/20211102091944.17487-1-jgross@suse.com Reviewed-by: Boris Ostrovsky Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman

xen/balloon: fix cancelled balloon action

2021-10-17T08:08:32+00:00

commit 319933a80fd4f07122466a77f93e5019d71be74c upstream. In case a ballooning action is cancelled the new kernel thread handling the ballooning might end up in a busy loop. Fix that by handling the cancelled action gracefully. While at it introduce a short wait for the BP_WAIT case. Cc: stable@vger.kernel.org Fixes: 8480ed9c2bbd56 ("xen/balloon: use a kernel thread instead a workqueue") Reported-by: Marek Marczykowski-Górecki Signed-off-by: Juergen Gross Tested-by: Jason Andryuk Reviewed-by: Boris Ostrovsky Link: https://lore.kernel.org/r/20211005133433.32008-1-jgross@suse.com Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman

xen/balloon: fix balloon kthread freezing

2021-10-06T13:05:07+00:00

commit 96f5bd03e1be606987644b71899ea56a8d05f825 upstream. Commit 8480ed9c2bbd56 ("xen/balloon: use a kernel thread instead a workqueue") switched the Xen balloon driver to use a kernel thread. Unfortunately the patch omitted to call try_to_freeze() or to use wait_event_freezable_timeout(), causing a system suspend to fail. Fixes: 8480ed9c2bbd56 ("xen/balloon: use a kernel thread instead a workqueue") Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky Link: https://lore.kernel.org/r/20210920100345.21939-1-jgross@suse.com Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman

xen/balloon: use a kernel thread instead a workqueue

2021-10-06T13:05:06+00:00

[ Upstream commit 8480ed9c2bbd56fc86524998e5f2e3e22f5038f6 ] Today the Xen ballooning is done via delayed work in a workqueue. This might result in workqueue hangups being reported in case of large amounts of memory are being ballooned in one go (here 16GB): BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 stuck for 64s! Showing busy workqueues and worker pools: workqueue events: flags=0x0 pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=2/256 refcnt=3 in-flight: 229:balloon_process pending: cache_reap workqueue events_freezable_power_: flags=0x84 pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 pending: disk_events_workfn workqueue mm_percpu_wq: flags=0x8 pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2 pending: vmstat_update pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=64s workers=3 idle: 2222 43 This can easily be avoided by using a dedicated kernel thread for doing the ballooning work. Reported-by: Jan Beulich Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky Link: https://lore.kernel.org/r/20210827123206.15429-1-jgross@suse.com Signed-off-by: Juergen Gross Signed-off-by: Sasha Levin

xen/events: Fix race in set_evtchn_to_irq

2021-08-26T12:37:02+00:00

[ Upstream commit 88ca2521bd5b4e8b83743c01a2d4cb09325b51e9 ] There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq mapping are lazily allocated in this function. The check whether the row is already present and the row initialization is not synchronized. Two threads can at the same time allocate a new row for evtchn_to_irq and add the irq mapping to the their newly allocated row. One thread will overwrite what the other has set for evtchn_to_irq[row] and therefore the irq mapping is lost. This will trigger a BUG_ON later in bind_evtchn_to_cpu: INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802 INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002) INFO: nvme nvme77: 1/0/0 default/read/poll queues CRIT: kernel BUG at drivers/xen/events/events_base.c:427! WARN: invalid opcode: 0000 [#1] SMP NOPTI WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme] WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0 WARN: Call Trace: WARN: set_affinity_irq+0x121/0x150 WARN: irq_do_set_affinity+0x37/0xe0 WARN: irq_setup_affinity+0xf6/0x170 WARN: irq_startup+0x64/0xe0 WARN: __setup_irq+0x69e/0x740 WARN: ? request_threaded_irq+0xad/0x160 WARN: request_threaded_irq+0xf5/0x160 WARN: ? nvme_timeout+0x2f0/0x2f0 [nvme] WARN: pci_request_irq+0xa9/0xf0 WARN: ? pci_alloc_irq_vectors_affinity+0xbb/0x130 WARN: queue_request_irq+0x4c/0x70 [nvme] WARN: nvme_reset_work+0x82d/0x1550 [nvme] WARN: ? check_preempt_wakeup+0x14f/0x230 WARN: ? check_preempt_curr+0x29/0x80 WARN: ? nvme_irq_check+0x30/0x30 [nvme] WARN: process_one_work+0x18e/0x3c0 WARN: worker_thread+0x30/0x3a0 WARN: ? process_one_work+0x3c0/0x3c0 WARN: kthread+0x113/0x130 WARN: ? kthread_park+0x90/0x90 WARN: ret_from_fork+0x3a/0x50 This patch sets evtchn_to_irq rows via a cmpxchg operation so that they will be set only once. The row is now cleared before writing it to evtchn_to_irq in order to not create a race once the row is visible for other threads. While at it, do not require the page to be zeroed, because it will be overwritten with -1's in clear_evtchn_to_irq_row anyway. Signed-off-by: Maximilian Heyne Fixes: d0b075ffeede ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated") Link: https://lore.kernel.org/r/20210812130930.127134-1-mheyne@amazon.de Reviewed-by: Boris Ostrovsky Signed-off-by: Boris Ostrovsky Signed-off-by: Sasha Levin

xen/events: reset active flag for lateeoi events later

2021-07-11T10:48:13+00:00

commit 3de218ff39b9e3f0d453fe3154f12a174de44b25 upstream. In order to avoid a race condition for user events when changing cpu affinity reset the active flag only when EOI-ing the event. This is working fine as all user events are lateeoi events. Note that lateeoi_ack_mask_dynirq() is not modified as there is no explicit call to xen_irq_lateeoi() expected later. Cc: stable@vger.kernel.org Reported-by: Julien Grall Fixes: b6622798bc50b62 ("xen/events: avoid handling the same event on two cpus at the same time") Tested-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Boris Ostrovsky Link: https://lore.kernel.org/r/20210623130913.9405-1-jgross@suse.com Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman

xen-pciback: redo VF placement in the virtual topology

2021-06-10T10:43:53+00:00

The commit referenced below was incomplete: It merely affected what would get written to the vdev- xenstore node. The guest would still find the function at the original function number as long as __xen_pcibk_get_pci_dev() wouldn't be in sync. The same goes for AER wrt __xen_pcibk_get_pcifront_dev(). Undo overriding the function to zero and instead make sure that VFs at function zero remain alone in their slot. This has the added benefit of improving overall capacity, considering that there's only a total of 32 slots available right now (PCI segment and bus can both only ever be zero at present). This is upstream commit 4ba50e7c423c29639878c00573288869aa627068. Fixes: 8a5248fe10b1 ("xen PV passthru: assign SR-IOV virtual functions to separate virtual slots") Signed-off-by: Jan Beulich Reviewed-by: Boris Ostrovsky Link: https://lore.kernel.org/r/8def783b-404c-3452-196d-3f3fd4d72c9e@suse.com Signed-off-by: Juergen Gross Signed-off-by: Greg Kroah-Hartman