summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
30 hoursusb: core: Fix SuperSpeed root hub wMaxPacketSizeMichal Pecio1-3/+1
commit d1e280334b7f0a1df441e08bd1f6a1bcc36b3bbb upstream. There is no good reason to have wBytesPerInterval < wMaxPacketSize - either one is too low or the other too high, and we may want to warn about such descriptors. Start with cleaning up our own root hubs. USB 3.2 section 10.15.1 sets wMaxPacketSize and wBytesPerInterval of SuperSpeed hub status endpoints at 2 bytes, so reduce wMaxPacketSize from its former value of 4, which was derived from USB 2.0 spec and the kernel's USB_MAXCHILDREN limit. They don't apply because USB 3.2 10.15.2.1 specifies SuperSpeed hubs to have up to 15 ports. Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Link: https://patch.msgid.link/20260518073121.7bc1da0f.michal.pecio@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursmailbox: Fix NULL message support in mbox_send_message()Jassi Brar2-8/+9
commit c58e9456e30c7098cbcd9f04571992be8a2e4e63 upstream. The active_req field serves double duty as both the "is a TX in flight" flag (NULL means idle) and the storage for the in-flight message pointer. When a client sends NULL via mbox_send_message(), active_req is set to NULL, which the framework misinterprets as "no active request". This breaks the TX state machine by: - tx_tick() short-circuits on (!mssg), skipping the tx_done callback and the tx_complete completion - txdone_hrtimer() skips the channel entirely since active_req is NULL, so poll-based TX-done detection never fires. Fix this by introducing a MBOX_NO_MSG sentinel value that means "no active request," freeing NULL to be valid message data. The sentinel is defined in the subsystem-internal mailbox.h so that controller drivers within drivers/mailbox/ can reference it, but it is not exposed to clients outside the subsystem. Fifteen in-tree callers send NULL (doorbell-style IPCs on Qualcomm, Tegra, TI, Xilinx, i.MX, SCMI, and PCC platforms). All were audited for regression: - Most already work around the bug via knows_txdone=true with a manual mbox_client_txdone() call, making the framework's tracking irrelevant. These are unaffected. - Poll-based callers (Xilinx zynqmp/r5) are strictly better off: the poll timer now correctly detects NULL-active channels instead of silently skipping them. - irq-qcom-mpm.c was a pre-existing bug -- the only Qualcomm caller that omitted the knows_txdone + mbox_client_txdone() pattern. Fixed in a companion commit ("irqchip/qcom-mpm: Fix missing mailbox TX done acknowledgment"). - No caller sets both a tx_done callback and sends NULL, nor combines tx_block=true with NULL sends, so the newly reachable callback/completion paths are never exercised. Also update tegra-hsp's flush callback, which directly inspects active_req to wait for the channel to drain: the old "!= NULL" check becomes "!= MBOX_NO_MSG", otherwise flush spins until timeout since the sentinel is non-NULL. The only tradeoff is that 'MBOX_NO_MSG' can not be used as a message by clients. Reported-by: Joonwon Kang <joonwonkang@google.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com> Signed-off-by: Joonwon Kang <joonwonkang@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursxhci: tegra: Fix ghost USB device on dual-role port unplugWei-Cheng Chen1-33/+44
[ Upstream commit 5a4c828b8b29b47534814ade26d9aee09d5101fc ] When a USB device is unplugged from the dual-role port, the device-mode path in tegra_xhci_id_work() explicitly clears both SS and HS port power via direct hub_control ClearPortFeature(POWER) calls. This preempts the xHCI controller's normal disconnect processing -- PORT_CSC is never generated, the USB core never sees the disconnect, and the device remains in its internal tree as a ghost visible in lsusb. Add an otg_set_port_power flag to control whether the dual-role switch path performs explicit port power management. SoCs that need it (Tegra124 / Tegra210 / Tegra186) set the flag; later SoCs (Tegra194 and beyond) rely on the PHY mode change to handle disconnect naturally and skip all port power calls. Within the port power path, otg_reset_sspi additionally gates the SSPI reset sequence on host-mode entry for SoCs that require it. Flags set per SoC: Tegra124, Tegra186 -> otg_set_port_power Tegra210 -> otg_set_port_power, otg_reset_sspi Tegra194 and later -> (none) [ Backport to 7.0.y: keep the host-mode snapshot in the existing tegra->lock section, preserve str_on_off(), and resolve context around the SoC ops/Tegra234 entries. ] Fixes: f836e7843036 ("usb: xhci-tegra: Add OTG support") Cc: stable@vger.kernel.org Signed-off-by: Wei-Cheng Chen <weichengc@nvidia.com> Link: https://patch.msgid.link/20260505112630.217704-1-weichengc@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
30 hoursnet: phy: micrel: fix LAN8814 QSGMII soft resetRobert Marko1-7/+8
[ Upstream commit e027c218c482c6a0ae1948129ccda3b0a2033368 ] LAN8814 QSGMII soft reset was moved into the probe function to avoid triggering it for each of 4 PHY-s in the package. However, that broke QSGMII link between the MAC and PHY on most LAN8814 PHY-s, specificaly for us on the Microchip LAN969x switch. Reading the QSGMII status registers it was visible that lanes were only partially synced. It looks like the reset timing is crucial, so lets move the reset back into the .config_init function but guard it with phy_package_init_once() to avoid it being triggered on each of 4 PHY-s in the package. Change the probe function to use phy_package_probe_once() for coma and PtP setup. Fixes: 96a9178a29a6 ("net: phy: micrel: lan8814 fix reset of the QSGMII interface") Signed-off-by: Robert Marko <robert.marko@sartura.hr> Link: https://patch.msgid.link/20260428134138.1741253-1-robert.marko@sartura.hr Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
30 hourshwmon: (pmbus/adm1266) serialize GPIO PMBus accesses with pmbus_lockAbdurrahman Hussain1-0/+6
[ Upstream commit bab8c6fb5af8df7e753d196c1262cb78e92ca872 ] adm1266_gpio_get(), adm1266_gpio_get_multiple(), and adm1266_gpio_dbg_show() all issue PMBus reads against the device but none of them take pmbus_lock. The pmbus_core framework holds pmbus_lock around its own multi-transaction sequences (notably the "set PAGE, then read paged register" pattern used by hwmon attributes), so an unlocked GPIO accessor can land between a PAGE write and the subsequent paged read in another thread and corrupt either side's view of the device state machine. Take pmbus_lock at the top of each of the three accessors via the scope-based guard(). The lock is uncontended in the common case and adds only a single mutex round-trip per call. Fixes: d98dfad35c38 ("hwmon: (pmbus/adm1266) Add support for GPIOs") Cc: stable@vger.kernel.org Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260518-adm1266-gpio-fixes-v3-6-e425e4f88139@nexthop.ai Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hourshwmon: (pmbus/adm1266) serialize sequencer_state debugfs read with pmbus_lockAbdurrahman Hussain1-0/+1
[ Upstream commit 4e4af55aaca7f6d7673d5f9889ad0529db86a048 ] adm1266_state_read() backs the sequencer_state debugfs entry and issues an i2c_smbus_read_word_data(client, ADM1266_READ_STATE) against the device without taking pmbus_lock. pmbus_core holds pmbus_lock around its own multi-transaction sequences (notably the "set PAGE, then read paged register" pattern used by hwmon attributes), so an unlocked debugfs reader can land between a PAGE write and the subsequent paged read in another thread. READ_STATE itself is not paged, so it cannot corrupt PAGE in flight, but the same defensive serialisation that applies to the GPIO accessors applies here: any direct device access from outside pmbus_core should be ordered with respect to pmbus_core's own. Take pmbus_lock at the top of adm1266_state_read() via the scope-based guard(). Fixes: ed1ff457e187 ("hwmon: (pmbus/adm1266) add debugfs for states") Cc: stable@vger.kernel.org Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai> Link: https://lore.kernel.org/r/20260518-adm1266-gpio-fixes-v3-8-e425e4f88139@nexthop.ai Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hourshwmon: (pmbus) Add support for guarded PMBus lockGuenter Roeck2-0/+13
[ Upstream commit 1814f4d3ff358277a5b6957e7f133c2812dc80ec ] Add support for guard(pmbus_lock)() and scoped_guard(pmbus_lock)() to be able to simplify the PMBus code. Also introduce pmbus_lock() as pre-requisite for supporting guard(). Reviewed-by: Sanman Pradhan <psanman@juniper.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Stable-dep-of: 4e4af55aaca7 ("hwmon: (pmbus/adm1266) serialize sequencer_state debugfs read with pmbus_lock") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursi2c: tegra: make tegra_i2c_mutex_unlock() return voidSaurav Sachidanand1-9/+6
[ Upstream commit 30792d12842901f5276f466a960962d5bfa15cc8 ] tegra_i2c_mutex_unlock() returning an error that overwrites the transfer result causes silent loss of I2C transfer errors. If the transfer failed but the unlock succeeded, the error was lost and the function incorrectly reported success. Rather than propagating the unlock error (which is not actionable by the caller - the I2C message may have been sent regardless), convert the function to return void and WARN on the unexpected condition. If the unlock fails, subsequent lock attempts will fail anyway, making the error visible on the next transfer. Fixes: 6077cfd716fb ("i2c: tegra: Add support for SW mutex register") Signed-off-by: Saurav Sachidanand <sauravsc@amazon.com> Cc: <stable@vger.kernel.org> # v7.0+ Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20260507221145.62183-3-sauravsc@amazon.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursplatform/x86/intel/vsec: Fix enable_cnt imbalance on PCIe error recoveryLukas Wunner1-24/+30
[ Upstream commit 348ccc754d8939e21ca5956ff45720b81d6e407f ] After a PCIe Uncorrectable Error has been reported by a device with Intel Vendor Specific Extended Capabilities and has been recovered through a Secondary Bus Reset, its driver calls intel_vsec_pci_probe() to rescan and reinitialize VSECs. intel_vsec_pci_probe() invokes pcim_enable_device() and thereby adds another devm action which calls pcim_disable_device() on driver unbind. So once the driver unbinds, pcim_disable_device() will be called as many times as an Uncorrectable Error occurred, plus one. This will lead to an enable_cnt imbalance on driver unbind. Additionally, since commit dc957ab6aa05 ("platform/x86/intel/vsec: Add private data for per-device data"), a devm_kzalloc() allocation is leaked on every Uncorrectable Error. Avoid by splitting the VSEC rescan out of intel_vsec_pci_probe() into a separate helper and calling that on PCIe error recovery. Fixes: 936874b77dd0 ("platform/x86/intel/vsec: Add PCI error recovery support to Intel PMT") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v6.0+ Link: https://patch.msgid.link/bd594d09fa866dc51dddc9a447c3b23f9b1402cc.1778736835.git.lukas@wunner.de Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursplatform/x86/intel/vsec: Make driver_data info constDavid E. Box1-10/+10
[ Upstream commit 9577c74c96f88d807d1ba005adbf5952e7127e55 ] Treat PCI id->driver_data (intel_vsec_platform_info) as read-only by making vsec_priv->info a const pointer and updating all function signatures to accept const intel_vsec_platform_info *. This improves const-correctness and clarifies that the platform info data from the driver_data table is not meant to be modified at runtime. No functional changes intended. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://patch.msgid.link/20260313015202.3660072-3-david.e.box@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Stable-dep-of: 348ccc754d89 ("platform/x86/intel/vsec: Fix enable_cnt imbalance on PCIe error recovery") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursplatform/x86/intel/vsec: Refactor base_addr handlingDavid E. Box1-13/+10
[ Upstream commit 904b333fc51cc045941df9656302449a0fc9978e ] The base_addr field in intel_vsec_platform_info was originally added to support devices that emulate PCI VSEC capabilities in MMIO. Previously, the code would check at registration time whether base_addr was set, falling back to the PCI BAR if not. Refactor this by making base_addr an explicit function parameter. This clarifies ownership of the value and removes conditional logic from intel_vsec_add_dev(). It also enables making intel_vsec_platform_info const in a later patch, since the function no longer needs to write to info->base_addr. No functional change intended. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://patch.msgid.link/20260313015202.3660072-2-david.e.box@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Stable-dep-of: 348ccc754d89 ("platform/x86/intel/vsec: Fix enable_cnt imbalance on PCIe error recovery") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: 8250_dw: dispatch SysRq character in dw8250_handle_irq()Jacques Nilo1-1/+1
commit 2e211723953f7740e54b53f3d3a0d5e351a5e223 upstream. dw8250_handle_irq() calls serial8250_handle_irq_locked() with the port lock held via guard(uart_port_lock_irqsave). The guard destructor is plain uart_port_unlock_irqrestore(), so a SysRq character captured into port->sysrq_ch by uart_prepare_sysrq_char() is dropped without ever being dispatched to handle_sysrq(). This is the same regression pattern as in serial8250_handle_irq(), introduced when 883c5a2bc934 ("serial: 8250_dw: Rework dw8250_handle_irq() locking and IIR handling") moved the function to the guard()-based locking scheme without using the sysrq-aware unlock helper. Switch to guard(uart_port_lock_check_sysrq_irqsave) so that captured sysrq_ch is dispatched on scope exit, matching the fix in serial8250_handle_irq(). Fixes: 883c5a2bc934 ("serial: 8250_dw: Rework dw8250_handle_irq() locking and IIR handling") Cc: stable@vger.kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Jacques Nilo <jnilo@free.fr> Link: https://patch.msgid.link/ed56fcaf4af24e4ed011a7bce206e0182acb761c.1778675349.git.jnilo@free.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: 8250: dispatch SysRq character in serial8250_handle_irq()Jacques Nilo1-2/+5
commit 71f42b2149a1307a97165b409493665579462ea0 upstream. serial8250_handle_irq() captures a SysRq character into port->sysrq_ch inside serial8250_handle_irq_locked() via uart_prepare_sysrq_char() (reached from serial8250_read_char()). Dispatch of that captured character to handle_sysrq() is expected to happen at port-unlock time, through uart_unlock_and_check_sysrq[_irqrestore](). After commit 8324a54f604d ("serial: 8250: Add serial8250_handle_irq_locked()") the function was reduced to a wrapper that takes the port lock via guard(uart_port_lock_irqsave) whose destructor is plain uart_port_unlock_irqrestore(). The sysrq-aware unlock helper is no longer called, so port->sysrq_ch is captured but never dispatched: BREAK + SysRq key is consumed silently. This was the very condition Johan Hovold's 853a9ae29e978 ("serial: 8250: fix handle_irq locking", 2021) introduced uart_unlock_and_check_sysrq_irqrestore() to address. Switch to the new guard(uart_port_lock_check_sysrq_irqsave), whose destructor is the sysrq-aware unlock helper, restoring the pre-split behaviour. Update the Context: comment on serial8250_handle_irq_locked() so future HW-specific 8250 wrappers know to use the same guard or the explicit sysrq-aware unlock. Verified on RTL8196E with CONFIG_MAGIC_SYSRQ_SERIAL=y: BREAK + 'h' on the console UART produces the SysRq help dump in dmesg and the brk counter in /proc/tty/driver/serial increments correctly. Fixes: 8324a54f604d ("serial: 8250: Add serial8250_handle_irq_locked()") Cc: stable@vger.kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Jacques Nilo <jnilo@free.fr> Link: https://patch.msgid.link/52692ae6c3501f7940347cef364ad7fcacaab7e5.1778675349.git.jnilo@free.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: zs: Convert to use a platform deviceMaciej W. Rozycki2-121/+72
commit 7cac59d08a73cb866ec51a483a6f3fe0f531947c upstream. Prevent a crash from happening as the first serial port is initialised: Console: switching to mono frame buffer device 160x64 fb0: PMAG-AA frame buffer device at tc0 DECstation Z85C30 serial driver version 0.10 CPU 0 Unable to handle kernel paging request at virtual address 0000002c, epc == 803ab00c, ra == 803aafe0 Oops[#1]: CPU: 0 PID: 1 Comm: swapper Not tainted 6.4.0-rc3-00031-g84a9582fd203-dirty #57 $ 0 : 00000000 10012c00 803aaeb0 00000000 $ 4 : 80e12f60 80e12f50 80e12f58 81000030 $ 8 : 00000000 805ff37c 00000000 33433538 $12 : 65732030 00000006 80c2915d 6c616972 $16 : 80e12f00 807b7630 00000000 00000000 $20 : 00000004 00000348 000001a0 807623b8 $24 : 00000018 00000000 $28 : 80c24000 80c25d60 8078b148 803aafe0 Hi : 00000000 Lo : 00000000 epc : 803ab00c serial_base_ctrl_add+0x78/0xf4 ra : 803aafe0 serial_base_ctrl_add+0x4c/0xf4 Status: 10012c03 KERNEL EXL IE Cause : 00000008 (ExcCode 02) BadVA : 0000002c PrId : 00000440 (R4400SC) Modules linked in: Process swapper (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=00000000) Stack : 80760000 00000cc0 00400044 00400040 803aa02c 80d61ab8 00000000 807b7630 80760000 807623b8 807b7628 803aa644 80386998 00000000 80e17780 80220f68 80e17780 80d61ab8 80c17d80 80e17780 80e17780 8063c798 80e17780 80383fa0 00000010 80e17780 00000000 80386998 807a0000 00000000 00400040 8038f848 807623b8 80d61ab8 00000004 80e17780 00000000 803a68e4 80c25e2c 803bb884 ... Call Trace: [<803ab00c>] serial_base_ctrl_add+0x78/0xf4 [<803aa644>] serial_core_register_port+0x174/0x69c [<8077e9ac>] zs_init+0xc8/0xfc [<800404d4>] do_one_initcall+0x40/0x2ac [<8076cecc>] kernel_init_freeable+0x1e4/0x270 [<80605bec>] kernel_init+0x20/0x108 [<800431e8>] ret_from_kernel_thread+0x14/0x1c Code: 2442aeb0 ae120024 ae0200d0 <8c67002c> 50e00001 8c670000 3c06806e 3c05806e afb30010 ---[ end trace 0000000000000000 ]--- (report at the offending commit) -- where a pointer is dereferenced that has been derived from a null pointer to the port's parent device. Since no device is available with legacy probing and it's not anymore a preferable way to discover devices anyway, switch the driver to using a platform device and use it as the port's parent device. Update resource handling accordingly and only request the actual span of addresses used within the slot, which will have had its resource already requested by generic platform device code. Use platform_driver_probe() not just because SCC devices are fixed with solder on board and not straightforward to remove, but foremost because the associated TTY's major device number is the same as used by the dz driver and the first driver to claim it will prevent the other one from using it. Either one DZ device or some SCC devices will be present in a given system but never both at a time, and therefore we want the major device number to be claimed by the first driver to actually successfully bind to its device and platform_driver_probe() is a way to fulfil that. An unfortunate consequence of the switch to a platform device is we now hand the console over from the bootconsole much later in the bootstrap. The firmware console handler appears good enough though to work so late and in particular with interrupts enabled. Since there is one way only remaining to reach zs_reset() now, remove the port initialisation marker as no longer needed and go through the channel reset unconditionally. Fixes: 84a9582fd203 ("serial: core: Start managing serial controllers to enable runtime PM") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: stable@vger.kernel.org # needs to use .remove_new for <= 6.10 Link: https://patch.msgid.link/alpine.DEB.2.21.2605062328480.46195@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: zs: Switch to using channel resetMaciej W. Rozycki2-4/+5
commit 8572955630f30948837088aa98bcbe0532d1ceac upstream. Switch the driver to using the channel reset rather than hardware reset, simplifying handling by removing an interference between channels that causes the other channel to become uninitialised afterwards. There is little difference between the two kinds of reset in terms of register settings that result, and we initialise the whole register set right away anyway. However this prevents a hang from happening should the console output handler in the firmware try to access the other port whose transmitter has been disabled and line parameters messed up. For example this will happen if the keyboard port (port A) is chosen for the system console, unusually but not insanely for a headless system, as the port is wired to a standard DA-15 connector and an adapter can be easily made. Or with the next change in place this would happen for the regular console port (port B), since the keyboard port (port A) will be initialised first. Just remove the unnecessary complication then, a channel reset is good enough. We still need the initialisation marker, now per channel rather than per SCC, as for the console port zs_reset() will be called twice: once early on via zs_serial_console_init() for the console setup only, and then again via zs_config_port() as the port is associated with a TTY device. Fixes: 8b4a40809e53 ("zs: move to the serial subsystem") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: stable@vger.kernel.org # v2.6.23+ Link: https://patch.msgid.link/alpine.DEB.2.21.2605062323430.46195@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: zs: Fix bootconsole handover lockupMaciej W. Rozycki1-21/+8
commit 6c05cf72e13314ce9b770b5951695dc5a2152920 upstream. Calling zs_reset() in the course of setting up the serial device causes line parameters to be reset and the transmitter disabled. We've been lucky in that no message is usually produced to the kernel log between this call and the later call to uart_set_options() in the course of console setup done by zs_serial_console_init(), or the system would hang as the console output handler in the firmware tried to access a port the transmitter of which has been disabled and line parameters messed up. This will change with the next change to the driver, so fix zs_reset() such that line parameters are set for 9600n8 console operation as with the system firmware and the transmitter re-enabled after reset. This also means zs_pm() serves no purpose anymore, so drop it. Fixes: 8b4a40809e53 ("zs: move to the serial subsystem") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: stable@vger.kernel.org # v2.6.23+ Link: https://patch.msgid.link/alpine.DEB.2.21.2605062308040.46195@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: dz: Convert to use a platform deviceMaciej W. Rozycki1-60/+56
commit 5d7a49d60b8fda66da60e240fd7315232fa1754f upstream. Prevent a crash from happening as the first serial port is initialised: Console: switching to colour frame buffer device 160x64 tgafb: SFB+ detected, rev=0x02 fb0: Digital ZLX-E1 frame buffer device at 0x1e000000 DECstation DZ serial driver version 1.04 CPU 0 Unable to handle kernel paging request at virtual address 000000bc, epc == 8048b3a4, ra == 80470a78 Oops[#1]: CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.19.0-dirty #35 NONE $ 0 : 00000000 1000ac00 00000004 804707ac $ 4 : 00000000 80e20850 80e20858 81000030 $ 8 : 00000000 8072c81c 00000008 fefefeff $12 : 6c616972 00000006 80c5917f 69726420 $16 : 80e20800 00000000 808f8968 80e20800 $20 : 00000000 807f5a90 808b0094 808d3bc8 $24 : 00000018 80479030 $28 : 80c2e000 80c2fd70 00000069 80470a78 Hi : 00000004 Lo : 00000000 epc : 8048b3a4 __dev_fwnode+0x0/0xc ra : 80470a78 serial_base_ctrl_add+0xa0/0x168 Status: 1000ac04 IEp Cause : 30000008 (ExcCode 02) BadVA : 000000bc PrId : 00000220 (R3000) Modules linked in: Process swapper/0 (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=00000000) Stack : 00400044 00400040 8046f4cc 00000000 808a6148 808a0000 808f8968 8086983c 808e0000 8046fc84 1000ac01 00000028 80e20700 802ba3f8 80e20700 80d34a94 80c1b900 80e20700 80e20700 80e20700 80e20700 80444650 00000000 00000000 00000000 807f5a90 808b0094 80447080 00400040 808e0000 80d34a94 808a6148 80d34a94 00000004 80e20700 00000000 8076974c 80469810 80c2fe3c 1000ac01 ... Call Trace: [<8048b3a4>] __dev_fwnode+0x0/0xc [<80470a78>] serial_base_ctrl_add+0xa0/0x168 [<8046fc84>] serial_core_register_port+0x1c8/0x974 [<808c6af0>] dz_init+0x74/0xc8 [<800470e0>] do_one_initcall+0x44/0x2d4 [<808b111c>] kernel_init_freeable+0x258/0x308 [<8072e434>] kernel_init+0x20/0x114 [<80049cd0>] ret_from_kernel_thread+0x14/0x1c Code: 27bd0018 03e00008 2402ffea <8c8200bc> 03e00008 00000000 27bdffc0 afbe0038 afb30024 ---[ end trace 0000000000000000 ]--- -- where a pointer is dereferenced that has been derived from a null pointer to the port's parent device. Since no device is available with legacy probing and it's not anymore a preferable way to discover devices anyway, switch the driver to using a platform device and use it as the port's parent device. Update resource handling accordingly and only request the actual span of addresses used within the slot, which will have had its resource already requested by generic platform device code. Use platform_driver_probe() not just because the DZ device is fixed with solder on board and not straightforward to remove, but foremost because the associated TTY's major device number is the same as used by the zs driver and the first driver to claim it will prevent the other one from using it. Either one DZ device or some SCC devices will be present in a given system but never both at a time, and therefore we want the major device number to be claimed by the first driver to actually successfully bind to its device and platform_driver_probe() is a way to fulfil that. An unfortunate consequence of the switch to a platform device is we now hand the console over from the bootconsole much later in the bootstrap. The firmware console handler appears good enough though to work so late and in particular with interrupts enabled. Conversely only starting the console port so late lets the reset code fully utilise our delay handlers, so switch from udelay() to fsleep() for transmitter draining so as to avoid busy-waiting for an excessive amount of time. Fixes: 84a9582fd203 ("serial: core: Start managing serial controllers to enable runtime PM") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: stable@vger.kernel.org # needs to use .remove_new for <= 6.10 Link: https://patch.msgid.link/alpine.DEB.2.21.2605062326540.46195@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: dz: Fix bootconsole handover lockupMaciej W. Rozycki1-24/+12
commit 7f127b2208e5e2b817243cad41fe4211a6d5a7a3 upstream. Calling dz_reset() in the course of setting up the serial device causes line parameters to be reset and the transmitter disabled. We've been lucky in that no message is usually produced to the kernel log between this call and the later call to uart_set_options() in the course of console setup done by dz_serial_console_init(), or the system would hang as the console output handler in the firmware tried to access a port the transmitter of which has been disabled and line parameters messed up. This will change with the next change to the driver, so fix dz_reset() such that line parameters are set for 9600n8 console operation as with the system firmware and the transmitter re-enabled after reset. This also means dz_pm() serves no purpose anymore, so drop it. Fixes: e6ee512f5a77 ("dz.c: Resource management") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: stable@vger.kernel.org # v2.6.25+ Link: https://patch.msgid.link/alpine.DEB.2.21.2605062302010.46195@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: dz: Fix bootconsole message clobbering at chip resetMaciej W. Rozycki1-0/+21
commit ca904f4b42355287bc5ce8b7550ebe909cda4c2c upstream. In the DZ interface as implemented by the DC7085 gate array the serial transmitters are double buffered, meaning that at the time a transmitter is ready to accept the next character there is one in the transmit shift register still being sent to the line. Issuing a master clear at this time causes this character to be lost, so wait an extra amount of time sufficient for the transmit shift register to drain at 9600bps, which is the baud rate setting used by the firmware console. Mind the specified 1.4us TRDY recovery time in the course and continue using iob() as the completion barrier, since the platforms involved use a write buffer that can delay and combine writes, and reorder them with respect to reads regardless of the MMIO locations accessed and we still lack a platform-independent handler for that. When called from dz_serial_console_init() this is too early for fsleep() to work and even before lpj has been calculated and therefore the delay is actually not sufficient for the transmitter to drain and is merely a placeholder now. This will be addressed in a follow-up change. Fixes: e6ee512f5a77 ("dz.c: Resource management") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: stable@vger.kernel.org # v2.6.25+ Link: https://patch.msgid.link/alpine.DEB.2.21.2605062259080.46195@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/amdgpu: check num_entries in GEM_OP GET_MAPPING_INFOZiyi Guo1-0/+5
commit a1ba4594232c87c3b8defd6f89a2e40f8b08395d upstream. kvcalloc(args->num_entries, sizeof(*vm_entries), GFP_KERNEL) at amdgpu_gem.c:1050 uses the user-supplied num_entries directly without any upper bounds check. Since num_entries is a __u32 and sizeof(drm_amdgpu_gem_vm_entry) is 32 bytes, a large num_entries produces an allocation exceeding INT_MAX, triggering WARNING in __kvmalloc_node_noprof(), causing a kernel WARNING, TAINT_WARN, and panic on CONFIG_PANIC_ON_WARN=y systems. Add a size bounds check before we invoke the kvzalloc() to reject oversized num_entries early with -EINVAL. Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl") Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1fe7bf5457f6efd7be60b17e23163ba54341d73d) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/amdgpu: fix amdgpu_hmm_range_get_pagesChristian König1-8/+8
commit 962d684b5dc0741dcd93485d41b450de402d5592 upstream. The notifier sequence must only be read once or otherwise we could work with invalid pages. While at it also fix the coding style, e.g. drop the pre-initialized return value and use the common define for 2G range. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c08972f555945cda57b0adb72272a37910153390) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/amdgpu: fix calling VM invalidation in amdgpu_hmm_invalidate_gfxChristian König2-2/+6
commit 1c824497d8acd3187d585d6187cedc1897dcc871 upstream. Otherwise we don't invalidate page tables on next CS. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Tested-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b6444d1bcbc34f6f2a31a3aab3059be082f3683e) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/amdgpu: fix lock leak on ENOMEM in AMDGPU_GEM_OP_GET_MAPPING_INFOMichael Bommarito1-2/+4
commit 2e7f55eb408c3f72ee1957a0d0ad11d8648a6379 upstream. The AMDGPU_GEM_OP_GET_MAPPING_INFO branch of amdgpu_gem_op_ioctl() holds three cleanup-tracked resources before calling kvcalloc(): the drm_gem_object reference from drm_gem_object_lookup(), the drm_exec lock on the looked-up GEM via drm_exec_lock_obj(), and the drm_exec lock on the per-process VM root page directory via amdgpu_vm_lock_pd(). All three are released by the out_exec label that every other error path in this function jumps to. The kvcalloc() failure path returns -ENOMEM directly, skipping out_exec and leaking all three. The leaked per-process VM root PD dma_resv lock is the load-bearing leak: any subsequent operation on the same VM (further GEM ops, command-submission, eviction, TTM shrinker callbacks) blocks on the held lock. DRM_IOCTL_AMDGPU_GEM_OP is DRM_AUTH | DRM_RENDER_ALLOW, so this is an unprivileged-local denial of service against the caller's GPU context, reachable by any process with /dev/dri/renderD* access. Route the failure through out_exec so drm_exec_fini() and drm_gem_object_put() run. Reproduced on stock 7.0.0-10, Ryzen 7 5700U / Radeon Vega (Lucienne): the failing ioctl returns -ENOMEM and a second GET_MAPPING_INFO on the same fd then blocks in drm_exec_lock_obj() on the leaked dma_resv. SIGKILL on the caller does not reap the task; the fd-release path during process exit goes through amdgpu_gem_object_close() -> drm_exec_prepare_obj() on the same lock, leaving the task in D state until the box is rebooted. The patched kernel was not rebuilt and re-tested on this hardware; the fix is mechanical. Tested on a single Lucienne / Vega box only. Ziyi Guo posted an independent INT_MAX-bound check for args->num_entries in the same branch [1]; the two patches are complementary and can land in either order. Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl") Link: https://lore.kernel.org/all/20260208000255.4073363-1-n7l8m4@u.northwestern.edu/ # [1] Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b69d3256d79de15f54c322986ff4da68f1d65b0a) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/amdkfd: Check for pdd drm file first in CRIU restore pathDavid Francis1-5/+5
commit 6842b6a4b72da9b2906ffc5ca9d846ace2c54c14 upstream. CRIU restore ioctls are meant to be called by CRIU with no existing drm file. There's an error path for if the drm file unexpectedly exists. It was positioned so it was missing a fput(drm_file). Do that check earlier, as soon as we have the pdd. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2bab781dac78916c5cc8de76345a4102449267d7) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/amdkfd: fix a vulnerability of integer overflow in kfd debuggerEric Huang1-3/+5
commit 93f5534b35a05ef8a0109c1eefa800062fee810a upstream. get_queue_ids() computes array_size = num_queues * sizeof(uint32_t), which could overflow on 32-bit size_t build. using array_size() instead, it saturates to SIZE_MAX on overflow. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2d57a0475f085c08b49312dfd8edcb461845f285) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/amdkfd: fix NULL pointer bug in svm_range_set_attrEric Huang1-0/+3
commit e984d61d92e702096058f0f828f4b2b8563b88ce upstream. The process_info could be NULL if user doesn't call kfd_ioctl_acquire_vm before calling kfd_ioctl_svm. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 83a26c812e0529eb040d31a76f73e33e637243d4) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: fsl_lpuart: fix rx buffer and DMA map leaks in start_rx_dmaShitalkumar Gandhi1-3/+12
commit 9a9254c4a2a3ca2b3da16d173f3b0dd01f397ff6 upstream. lpuart_start_rx_dma() allocates sport->rx_ring.buf with kzalloc() and then maps a scatterlist via dma_map_sg(). On three subsequent error paths the function returns directly without releasing those resources: - when dma_map_sg() returns 0 (-EINVAL): ring->buf is leaked. - when dmaengine_slave_config() fails: ring->buf and the DMA mapping are leaked. - when dmaengine_prep_dma_cyclic() returns NULL: ring->buf and the DMA mapping are leaked. The sole cleanup path, lpuart_dma_rx_free(), is only reached when lpuart_dma_rx_use is set, and the caller lpuart_rx_dma_startup() clears that flag on failure of lpuart_start_rx_dma(). So these resources are permanently leaked on every failure in this function. Repeated port open/close or termios changes under error conditions will slowly consume memory and leave stale streaming DMA mappings behind. Fix it by introducing two error labels that unmap the scatterlist and free the ring buffer as appropriate. While here, replace the misleading -EFAULT (bad userspace pointer) returned when dmaengine_prep_dma_cyclic() fails with the more accurate -ENOMEM, matching how other dmaengine users in the tree treat this failure. No functional change on the success path. Fixes: 5887ad43ee02 ("tty: serial: fsl_lpuart: Use cyclic DMA for Rx") Cc: stable <stable@kernel.org> Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260420135903.2062024-1-shitalkumar.gandhi@cambiumnetworks.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: zs: Fix swapped RI/DSR modem line transition countingMaciej W. Rozycki1-2/+2
commit d15cd40cb1858f75846eaafa9a6bca841b790a92 upstream. Fix a thinko in the status interrupt handler that has caused counters for the RI and DSR modem line transitions to be used for the other line each. Fixes: 8b4a40809e53 ("zs: move to the serial subsystem") Cc: stable <stable@kernel.org> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://patch.msgid.link/alpine.DEB.2.21.2604101747110.29980@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: sh-sci: fix memory region release in error pathHongling Zeng1-1/+1
commit 92b1ea22454b08a39baef3a7290fb3ec50366616 upstream. The sci_request_port() function uses request_mem_region() to reserve I/O memory, but in the error path when sci_remap_port() fails, it incorrectly calls release_resource() instead of release_mem_region(). This mismatch can cause resource accounting issues. Fix it by using the correct release function, consistent with sci_release_port(). Fixes: e2651647080930a1 ("serial: sh-sci: Handle port memory region reservations.") Cc: stable <stable@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202604032356.SzEjYkBC-lkp@intel.com/ Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20260421065737.724187-1-zenghongling@kylinos.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: qcom_geni: fix kfifo underflow when flush precedes DMA completion IRQViken Dadhaniya1-1/+13
commit 452d6fa37ae9b021f4f6d397dbae077f7296f6f4 upstream. When uart_flush_buffer() runs before the DMA completion IRQ is delivered, the following race can occur (all steps serialized by uart_port_lock): 1. DMA starts: tx_remaining = N, kfifo contains N bytes 2. DMA completes in hardware; IRQ is pending but not yet delivered 3. uart_flush_buffer() acquires the port lock and calls kfifo_reset(), making kfifo_len() = 0 while tx_remaining remains N 4. uart_flush_buffer() releases the port lock 5. DMA IRQ fires; handle_tx_dma() acquires the port lock and calls uart_xmit_advance(uport, tx_remaining) on an empty kfifo uart_xmit_advance() increments kfifo->out by tx_remaining. Since kfifo_reset() already set both in and out to 0, out wraps past in, causing kfifo_len() to return UART_XMIT_SIZE - tx_remaining. The next start_tx_dma() call then submits a DMA transfer of stale buffer data. Fix this by snapshotting kfifo_len() at the start of handle_tx_dma() and skipping uart_xmit_advance() when fifo_len < tx_remaining, which indicates the kfifo was reset by a preceding flush. Fixes: 2aaa43c70778 ("tty: serial: qcom-geni-serial: add support for serial engine DMA") Cc: stable <stable@kernel.org> Signed-off-by: Viken Dadhaniya <viken.dadhaniya@oss.qualcomm.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://patch.msgid.link/20260506-serial-dma-stale-tx-buf-v1-1-e3ccb360d719@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: qcom-geni: fix UART_RX_PAR_EN bit positionPrasanna S1-1/+1
commit ca2584d841b69391ffc4144840563d2e1a0018df upstream. UART_RX_PAR_EN is incorrectly defined as bit 3, which triggers false framing errors (S_GP_IRQ_1_EN) and causes received data to be dropped when parity is enabled and the parity bit is 0. Define UART_RX_PAR_EN as bit 4 of the SE_UART_RX_TRANS_CFG register, as specified in the reference manual. Fixes: c4f528795d1a ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP") Cc: stable <stable@kernel.org> Signed-off-by: Prasanna S <prasanna.s@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://patch.msgid.link/20260428-serial-bit-correct-v1-1-9131ad5b97d8@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursserial: altera_jtaguart: handle uart_add_one_port() failuresMyeonghun Pak1-1/+6
commit ea66be25f0e934f49d24cd0c5845d13cdba3520b upstream. altera_jtaguart_probe() maps the register window before registering the UART port, but it ignores failures from uart_add_one_port(). If port registration fails, probe still returns success and the mapping remains live until a later remove path that is not part of probe failure cleanup. Return the uart_add_one_port() error and unmap the register window on that failure path. This issue was identified during our ongoing static-analysis research while reviewing kernel code. Fixes: 5bcd601049c6 ("serial: Add driver for the Altera JTAG UART") Cc: stable <stable@kernel.org> Co-developed-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Ijae Kim <ae878000@gmail.com> Signed-off-by: Myeonghun Pak <mhun512@gmail.com> Link: https://patch.msgid.link/20260512065837.79528-1-mhun512@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/amd/pm/si: Disregard vblank time when no displays are connectedTimur Kristóf1-0/+4
commit dd4f3ee535b3b0ac027f75dbf9dc5fc88733c765 upstream. When no displays are connected, there is no vblank happening so the power management code shouldn't worry about it. This fixes a regression that caused the memory clock to be stuck at maximum when there were no displays connected to a SI GPU. Fixes: 9003a0746864 ("drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3)") Fixes: 9d73b107a61b ("drm/amd/pm: Use pm_display_cfg in legacy DPM (v2)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6d87e0199f7b83735b56e422d59f170a201897a8) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/i915: Fix potential UAF in TTM object purgeJanusz Krzysztofik1-12/+16
commit 5c4063c87a619e4df954c179d24628636f5db15f upstream. TLDR: The bo->ttm object might be changed by calling ttm_bo_validate(), move casting it to an i915_tt object later to actually get the right pointer. A user reported hitting the following bug under heavy use on DG2: [26620.095550] Oops: general protection fault, probably for non-canonical address 0xa56b6b6b6b6b6b8b: 0000 1 SMP NOPTI [26620.095556] CPU: 2 UID: 0 PID: 631 Comm: Xorg Not tainted 6.18.8 #1 PREEMPT(lazy) [26620.095558] Hardware name: ASRock B850M Steel Legend WiFi/B850M Steel Legend WiFi, BIOS 3.50 09/18/2025 [26620.095559] RIP: 0010:i915_ttm_purge+0x84/0x100 [i915] [26620.095604] Code: 00 00 00 48 8d 54 24 10 48 89 e6 48 89 fb e8 83 aa ae ff 85 c0 75 6f 48 83 bb a8 01 00 00 00 74 2c 48 8b 45 78 48 85 c0 74 23 <48> 8b 78 20 48 c7 c2 ff ff ff ff 31 f6 e8 7a 73 e3 e0 48 8b 7d 78 [26620.095605] RSP: 0018:ffffc90005fd7430 EFLAGS: 00010282 [26620.095607] RAX: a56b6b6b6b6b6b6b RBX: ffff8881f46c3dc0 RCX: 0000000000000000 [26620.095608] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 00000000ffffffff [26620.095609] RBP: ffff888289610f00 R08: 0000000000000001 R09: ffff88823b022000 [26620.095609] R10: ffff888103029b28 R11: ffff8881fc7f3800 R12: ffff88810b6150d0 [26620.095609] R13: ffff888289610f00 R14: 0000000000000000 R15: ffff8881f46c3dc0 [26620.095610] FS: 00007f1004d86900(0000) GS:ffff88901c858000(0000) knlGS:0000000000000000 [26620.095611] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [26620.095611] CR2: 00007f0fdf489000 CR3: 000000035b0c1000 CR4: 0000000000750ef0 [26620.095612] PKRU: 55555554 [26620.095612] Call Trace: [26620.095615] <TASK> [26620.095615] i915_ttm_move+0x2b9/0x420 [i915] [26620.095642] ? ttm_tt_init+0x65/0x80 [ttm] [26620.095644] ? i915_ttm_tt_create+0xc6/0x150 [i915] [26620.095667] ttm_bo_handle_move_mem+0xb6/0x160 [ttm] [26620.095669] ttm_bo_evict+0x100/0x150 [ttm] [26620.095671] ? preempt_count_add+0x64/0xa0 [26620.095673] ? _raw_spin_lock+0xe/0x30 [26620.095675] ? _raw_spin_unlock+0xd/0x30 [26620.095675] ? i915_gem_object_evictable+0xb7/0xd0 [i915] [26620.095704] ttm_bo_evict_cb+0x6e/0xd0 [ttm] [26620.095705] ttm_lru_walk_for_evict+0xa6/0x200 [ttm] [26620.095708] ttm_bo_alloc_resource+0x185/0x4f0 [ttm] [26620.095709] ? init_object+0x62/0xd0 [26620.095712] ttm_bo_validate+0x7a/0x180 [ttm] [26620.095713] ? _raw_spin_unlock_irqrestore+0x16/0x30 [26620.095714] __i915_ttm_get_pages+0xb0/0x170 [i915] [26620.095737] i915_ttm_get_pages+0x9f/0x150 [i915] [26620.095759] ? i915_gem_do_execbuffer+0xedc/0x2b40 [i915] [26620.095786] ? alloc_debug_processing+0xd0/0x100 [26620.095787] ? _raw_spin_unlock_irqrestore+0x16/0x30 [26620.095788] ? i915_vma_instance+0xa0/0x4e0 [i915] [26620.095822] __i915_gem_object_get_pages+0x2f/0x40 [i915] [26620.095848] i915_vma_pin_ww+0x706/0x980 [i915] [26620.095875] ? i915_gem_do_execbuffer+0xedc/0x2b40 [i915] [26620.095904] eb_validate_vmas+0x170/0xa00 [i915] [26620.095930] i915_gem_do_execbuffer+0x1201/0x2b40 [i915] [26620.095953] ? alloc_debug_processing+0xd0/0x100 [26620.095954] ? _raw_spin_unlock_irqrestore+0x16/0x30 [26620.095955] ? i915_gem_execbuffer2_ioctl+0xc9/0x240 [i915] [26620.095977] ? __wake_up_sync_key+0x32/0x50 [26620.095979] ? i915_gem_execbuffer2_ioctl+0xc9/0x240 [i915] [26620.096001] ? __slab_alloc.isra.0+0x67/0xc0 [26620.096003] i915_gem_execbuffer2_ioctl+0x11a/0x240 [i915] Results from decode_stacktrace.sh pointed to dereference of a file pointer field of a i915 TTM page vector container associated with an object being purged on eviction. That path is taken when the object is marked as no longer needed. Code analysis revealed a possibility of the i915 TTM page vector container being replaced with a new instance inside a function that purges content of the object, should it be still busy. That function is called, indirectly via a more general function that changes the object's placement and caching policy, before the problematic dereference, but still after a pointer to the container is captured, rendering the pointer no longer valid. Fix the issue by capturing the pointer to the container only after its potential replacement. v2: Move the container_of() inside the if block (Sebastian), - a simplified version of the commit description that explains briefly why the change is necessary (Christian). Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/work_items/14882 Fixes: 7ae034590ceae ("drm/i915/ttm: add tt shmem backend") Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: stable@vger.kernel.org # v5.17+ Cc: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20260508122612.469227-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit 4462966a93eb185849b7f174f0d0de53476d00a4) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/i915/psr: Use DC_OFF wake reference to block DC6 on vblank enableJouni Högander4-17/+18
commit 3549a9649dc7c5fc586ab12f675279283cdcb2a7 upstream. We are observing following warnings: *ERROR* power well DC_off state mismatch (refcount 0/enabled 1) gen9_dc_off_power_well_enabled is considering target state DC_STATE_DISABLE as DC_OFF power well being enabled. Fix this by using wakeref for the purpose. To achieve this we need to modify notification code as well. Currently it is possible that PSR gets notified vblank enable/disable twice on same status. This is currently not a problem as it is just triggering call to intel_display_power_set_target_dc_state with same target state as a parameter. When using wakeref this becomes a problem due to reference counting. Fix this storing vbank status on last notification and use that to ensure there are no more than one notification with same vblank status. v2: ensure there is no subsequent notifications with same status Fixes: aa451abcffb5 ("drm/i915/display: Prevent DC6 while vblank is enabled for Panel Replay") Cc: <stable@vger.kernel.org> # v6.13+ Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Michał Grzelak <michal.grzelak@intel.com> Link: https://patch.msgid.link/20260520104944.239797-2-jouni.hogander@intel.com (cherry picked from commit 35485ac56d878192a3829a58cb26503125ec7104) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/i915/psr: Block DC states on vblank enable when Panel Replay supportedJouni Högander1-9/+10
commit 8bb9093df555f9e89fdbe1405118b11384c03e04 upstream. Currently we are blocking DC states only when Panel Replay is enabled on vblank enable. It may happen that Panel Replay is getting enabled when vblank is already enabled. Fix this by blocking DC states always if Panel Replay is supported. While at it take care of possible dual eDP case by looping all encoders supporting PSR. Fixes: 0c427ac78a1d ("drm/i915/psr: Add interface to notify PSR of vblank enable/disable") Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Michał Grzelak <michal.grzelak@intel.com> Link: https://patch.msgid.link/20260520104944.239797-1-jouni.hogander@intel.com (cherry picked from commit eb5911f990554f7ce947dd53df00c114362e4465) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/i915/color: Fix HDR pre-CSC LUT programming loopPranay Samala1-1/+1
commit d196136a988051173f68f91de0b5a1bd32122dd7 upstream. The integer lut programming loop never executes completely due to incorrect condition (i++ > 130). Fix to properly program 129th+ entries for values > 1.0. Cc: <stable@vger.kernel.org> #v6.19 Fixes: 82caa1c8813f ("drm/i915/color: Program Pre-CSC registers") Signed-off-by: Pranay Samala <pranay.samala@intel.com> Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260519075308.383877-1-pranay.samala@intel.com (cherry picked from commit f33862ec3e8849ad7c0a3dd46719083b13ade248) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/gem: fix race between change_handle and handle_deleteZhenghang Xiao1-0/+2
commit 7164d78559b0ff29931a366a840a9e5dd53d4b7c upstream. drm_gem_change_handle_ioctl leaves the old handle live in the IDR during the window between spin_unlock(table_lock) and the final spin_lock(table_lock). A concurrent drm_gem_handle_delete on the old handle succeeds in this window, decrements handle_count to 0, and frees the GEM object while the new handle's IDR entry still references it. NULL the old handle's IDR entry before dropping table_lock so that any concurrent GEM_CLOSE on the old handle sees NULL and returns -EINVAL. Restore the old entry on the prime-bookkeeping error path. Fixes: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle") Signed-off-by: Zhenghang Xiao <kipreyyy@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patch.msgid.link/20260526085313.26791-1-kipreyyy@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/hyperv: validate VMBus packet size in receive callbackBerkant Koc1-13/+87
commit 7f87763f47a3c22fb50265a00619ef10f2394b18 upstream. hyperv_receive_sub() reads msg->vid_hdr.type and dispatches into one of four message-type branches without knowing how many bytes the host wrote into hv->recv_buf. The completion path then runs memcpy(hv->init_buf, msg, VMBUS_MAX_PACKET_SIZE), so the consumer that wakes on wait_for_completion_timeout() can read up to 16 KiB of residue from a prior message as if it were the response payload. Pass bytes_recvd into hyperv_receive_sub() and reject any packet that does not cover the pipe + synthvid header. A single switch on msg->vid_hdr.type then computes the type-specific payload size: the three completion-driving types (SYNTHVID_VERSION_RESPONSE, SYNTHVID_RESOLUTION_RESPONSE, SYNTHVID_VRAM_LOCATION_ACK) fall through to a shared exit that requires that size before memcpy/complete, while SYNTHVID_FEATURE_CHANGE validates its own payload and returns before reading is_dirt_needed. Unknown types are dropped. SYNTHVID_RESOLUTION_RESPONSE is variable length: the host fills resolution_count entries, not the full SYNTHVID_MAX_RESOLUTION_COUNT array. Validate the fixed prefix first so resolution_count can be read, bound it against the array, then require only the count-sized array, so the shorter responses the host actually sends are accepted. Only run the sub-handler when vmbus_recvpacket() returned success. The memcpy length is bytes_recvd, which is bounded by VMBUS_MAX_PACKET_SIZE only on a successful receive; on -ENOBUFS vmbus_recvpacket() instead reports the required length, which can exceed hv->recv_buf, so copying bytes_recvd would read and write past the 16 KiB buffers. Gating on the success return keeps the copy bounded. The nonzero-return path is itself a malformed-message case and is now logged rather than silently skipped; channel recovery is not attempted. Rejected packets are reported via drm_err_ratelimited() rather than silently dropped, matching the CoCo-hardened pattern in hv_kvp_onchannelcallback(). Fixes: 76c56a5affeb ("drm/hyperv: Add DRM driver for hyperv synthetic video device") Cc: stable@vger.kernel.org # 5.14+ Signed-off-by: Berkant Koc <me@berkoc.com> Assisted-by: Claude:claude-opus-4-7 berkoc-pipeline Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Link: https://patch.msgid.link/8200dbc199c7a9b75ac7e8af6c748d2189b5ebd5.1779542874.git.me@berkoc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursdrm/hyperv: validate resolution_count and fix WIN8 fallbackBerkant Koc1-3/+10
commit 13d33b9ef67066c77c84273fac5a1d3fde3533d1 upstream. A SYNTHVID_RESOLUTION_RESPONSE with resolution_count > 64 walks past the supported_resolution[SYNTHVID_MAX_RESOLUTION_COUNT] array in the parse loop. Bound resolution_count against the array size, folded into the existing zero-check. When the WIN10 resolution probe fails, the caller in hyperv_connect_vsp() left hv->screen_*_max / preferred_* unpopulated, which sets mode_config.max_width / max_height to 0 and makes drm_internal_framebuffer_create() reject every userspace framebuffer with -EINVAL. The pre-WIN10 branch had the same gap for preferred_width / preferred_height. Use a single post-probe fallback guarded by screen_width_max == 0 so both paths converge on the WIN8 defaults. Signed-off-by: Berkant Koc <me@berkoc.com> Assisted-by: Claude:claude-opus-4-7 berkoc-pipeline Fixes: 76c56a5affeb ("drm/hyperv: Add DRM driver for hyperv synthetic video device") Cc: stable@vger.kernel.org # 5.14+ Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Link: https://patch.msgid.link/6945b22419c7d404b4954a113de2ac9c900dba93.1779542874.git.me@berkoc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursscsi: target: iscsi: Validate CHAP_R length before base64 decodeAlexandru Hossu1-1/+18
commit 85db7391310b1304d2dc8ae3b0b12105a9567147 upstream. chap_server_compute_hash() allocates client_digest as kzalloc(chap->digest_size) and then, for BASE64-encoded responses, passes chap_r directly to chap_base64_decode() without checking whether the input length could produce more than digest_size bytes of output. chap_base64_decode() writes to the destination unconditionally as long as there is input to consume. With MAX_RESPONSE_LENGTH set to 128 and the "0b" prefix stripped by extract_param(), up to 127 base64 characters can reach the decoder. 127 characters decode to 95 bytes. For SHA-256 (digest_size=32) this overflows client_digest by 63 bytes; for MD5 (digest_size=16) the overflow is 79 bytes. The length check at line 344 fires after the write has already happened. The HEX branch in the same switch statement already validates the length up front. Apply the same approach to the BASE64 branch: strip trailing base64 padding characters, then reject any input whose data length exceeds DIV_ROUND_UP(digest_size * 4, 3) before calling the decoder. Stripping trailing '=' before the comparison handles both padded and unpadded encodings. chap_base64_decode() already returns early on '=', so the full original string is still passed to the decoder unchanged. The mutual CHAP path decodes CHAP_C into initiatorchg_binhex, which is kzalloc(CHAP_CHALLENGE_STR_LEN). extract_param() caps initiatorchg at CHAP_CHALLENGE_STR_LEN characters, so at most CHAP_CHALLENGE_STR_LEN-1 base64 characters reach the decoder. The maximum decoded size, DIV_ROUND_UP((CHAP_CHALLENGE_STR_LEN-1) * 3, 4), is less than CHAP_CHALLENGE_STR_LEN, so no overflow is possible there. A comment is added at the call site to document this. Fixes: 1e5733883421 ("scsi: target: iscsi: Support base64 in CHAP") Cc: stable@vger.kernel.org Signed-off-by: Alexandru Hossu <hossu.alexandru@gmail.com> Reviewed-by: David Disseldorp <ddiss@suse.de> Link: https://patch.msgid.link/20260521151121.808477-1-hossu.alexandru@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursscsi: target: iscsi: Bound iscsi_encode_text_output() appends to rsp_bufMichael Bommarito3-16/+55
commit bf33e01f88388c43e285492a63e539df6ffed64c upstream. iscsi_encode_text_output() concatenates "key=value\0" records into login->rsp_buf, an 8192-byte kzalloc(MAX_KEY_VALUE_PAIRS) buffer allocated in iscsit_alloc_login_setup_buffer(). The three sprintf() call sites in this function (lines 1398, 1411, 1424 in v7.1-rc2) never check the remaining buffer capacity: *length += sprintf(output_buf, "%s=%s", er->key, er->value); *length += 1; output_buf = textbuf + *length; The 8192-byte ceiling at iscsi_target_check_login_request() bounds the *input* Login PDU payload, but a single PDU can carry up to 2048 minimal four-byte "a=b\0" pairs, each unknown key expanding to a 16-byte "a=NotUnderstood\0" output record via iscsi_add_notunderstood_response(). 2048 * 16 = 32 KiB of output into an 8 KiB buffer, producing a ~24 KiB heap overrun in the kmalloc-8k slab. The fix introduces a static iscsi_encode_text_record() helper that uses snprintf() with a per-call bounds check against the remaining buffer, and threads a u32 textbuf_size parameter through iscsi_encode_text_output(). Both call sites in iscsi_target_handle_csg_zero() (PHASE_SECURITY) and iscsi_target_handle_csg_one() (PHASE_OPERATIONAL) pass MAX_KEY_VALUE_PAIRS. On overflow the encoder logs the condition, calls iscsi_release_extra_responses() to drop queued records, and returns -1; both caller sites now emit ISCSI_STATUS_CLS_INITIATOR_ERR / ISCSI_LOGIN_STATUS_INIT_ERR via iscsit_tx_login_rsp() before returning, so the initiator sees an explicit failed-login response rather than a silent connection drop. (Prior to this patch only the PHASE_OPERATIONAL caller did that; the PHASE_SECURITY caller is converted to the same shape.) Fixes: e48354ce078c ("iscsi-target: Add iSCSI fabric support for target v4.1") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Tested-by: John Garry <john.g.garry@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursscsi: target: iscsi: Fix CRC overread and double-free in ↵Michael Bommarito1-1/+4
iscsit_handle_text_cmd() commit 778c2ab142c625a8a8afa570e0f9b7873f445d99 upstream. Two latent bugs in the Text-phase handler, both present since the original LIO integration in commit e48354ce078c ("iscsi-target: Add iSCSI fabric support for target v4.1"): 1) DataDigest CRC buffer overread (4 bytes past text_in). text_in is kzalloc()'d at ALIGN(payload_length, 4). rx_size is then incremented by ISCSI_CRC_LEN to make room for the received DataDigest in the iovec, but the same (now-bumped) rx_size is passed as the buffer length to iscsit_crc_buf(): if (conn->conn_ops->DataDigest) { ... rx_size += ISCSI_CRC_LEN; } ... if (conn->conn_ops->DataDigest) { data_crc = iscsit_crc_buf(text_in, rx_size, 0, NULL); iscsit_crc_buf() walks rx_size bytes of text_in with crc32c(), so when DataDigest is negotiated it reads 4 bytes past the end of the text_in allocation. KASAN reproduces this directly on the unpatched mainline tree as slab-out-of-bounds in crc32c() called from the Text PDU path. The OOB bytes feed crc32c() and are then compared against the initiator-supplied checksum, so the value does not flow back to the attacker, but the kernel does read past the buffer on every Text PDU with DataDigest=CRC32C. Fix by passing the actual padded payload length (ALIGN(payload_length, 4)) that was used for the kzalloc(). 2) Stale cmd->text_in_ptr re-free (double-free) on ERL>0 bad DataDigest drop. On DataDigest mismatch with ErrorRecoveryLevel > 0 the handler silently drops the PDU and lets the initiator plug the CmdSN gap: kfree(text_in); return 0; cmd->text_in_ptr still points at the freed buffer. The next Text Request on the same ITT re-enters iscsit_setup_text_cmd(), which unconditionally does kfree(cmd->text_in_ptr); cmd->text_in_ptr = NULL; freeing the same pointer a second time. Session teardown via iscsit_release_cmd() has the same shape and hits the same double-free if the connection is dropped before a second Text Request arrives. On an unmodified mainline tree the bug-1 CRC overread fires first on the initial valid Text Request and perturbs the subsequent state, so #4 was isolated by building a kernel with only the bug-1 hunk of this patch applied plus temporary printk() observability around the three relevant kfree() sites. The observability prints are not part of this patch. On that build, a three-PDU Text Request sequence after login produces two back-to-back splats: BUG: KASAN: double-free in iscsit_setup_text_cmd+0x?? BUG: KASAN: double-free in iscsit_release_cmd+0x?? showing the same pointer freed in the ERL>0 drop path and again in iscsit_setup_text_cmd() (next Text Request on the same ITT) and once more in iscsit_release_cmd() (session teardown). On distro kernels with CONFIG_SLAB_FREELIST_HARDENED=y (default) the double-free becomes a remote kernel BUG(); on non-hardened kernels it corrupts the slab freelist. Fix by clearing cmd->text_in_ptr after the kfree() in the ERL>0 drop path. With both hunks applied #4 is directly observable on the stock tree without observability printks; fixing bug-1 alone would mask #4 less, not more, so the hunks are submitted together. Both fixes are one-liners. The Text PDU state machine is unchanged and the wire protocol is unaffected. Fixes: e48354ce078c ("iscsi-target: Add iSCSI fabric support for target v4.1") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Tested-by: John Garry <john.g.garry@oracle.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursscsi: scsi_transport_fc: Widen FPIN pname walker counter to u32Michael Bommarito1-36/+41
commit a9a39233ec1fc9f97ea1340a4d09bb7ec2be5153 upstream. An adjacent Fibre Channel fabric actor that can deliver an FPIN ELS frame to an lpfc or qla2xxx Linux initiator can trigger a non-return in the generic FC transport. This is not a local userspace or IP network path; the attacker must be able to inject fabric traffic, for example as a compromised switch or fabric controller, or as a same-zone N_Port on a fabric that permits source spoofing. The Link-Integrity and Peer-Congestion FPIN walkers used a u8 loop counter against the 32-bit on-wire pname_count field, and did not bound pname_count by the descriptor body already validated by the TLV walker. A pname_count of 256 therefore wraps the counter and keeps the loop condition true indefinitely. Factor the shared pname_list[] walk into one helper, widen the counter to u32, and clamp pname_count against the entries that fit in the descriptor body before iterating. Fixes: 3dcfe0de5a97 ("scsi: fc: Parse FPIN packets and update statistics") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20260520133015.1018937-1-michael.bommarito@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursscsi: fcoe: Reject FIP descriptors with zero fip_dlen in CVL walkerMichael Bommarito1-1/+1
commit 9eed1bd59937e6828b00d2f2dfef631d964f3636 upstream. drivers/scsi/fcoe/fcoe_ctlr.c::fcoe_ctlr_recv_clr_vlink() advanced the descriptor cursor by an attacker-supplied fip_dlen without ever requiring dlen >= sizeof(struct fip_desc) in the default branch. The named descriptor cases (FIP_DT_MAC, FIP_DT_NAME, FIP_DT_VN_ID) checked their per-type minimum lengths, but a FIP_DT_NON_CRITICAL descriptor (fip_dtype >= 128, which the standard requires receivers to silently ignore) skipped that check entirely. An unauthenticated L2 peer on the FCoE control VLAN could hang fcoe_ctlr_recv_work on an fcoe, qedf, or bnx2fc initiator indefinitely by emitting one FIP CVL frame whose single descriptor had fip_dtype == FIP_DT_NON_CRITICAL and fip_dlen == 0: the cursor advanced zero bytes per iteration and the loop condition rlen >= sizeof(*desc) stayed true forever, blocking every subsequent FIP frame on that controller. Tighten the outer dlen guard to also reject dlen < sizeof(struct fip_desc), so a malformed descriptor whose length cannot even cover the descriptor header is rejected before the switch. This is the same lower-bound the named cases already apply and is the minimum scope that closes the loop. Fixes: 97c8389d54b9 ("[SCSI] fcoe, libfcoe: Add support for FIP. FCoE discovery and keep-alive.") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Reviewed-by: Hannes Reinecke <hare@kernel.org> Link: https://patch.msgid.link/20260518144307.2820961-1-michael.bommarito@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursthunderbolt: property: Cap recursion depth in __tb_property_parse_dir()Michael Bommarito1-6/+12
commit 928abe19fbf0127003abcb1ea69cabc1c897d0ab upstream. A DIRECTORY entry's value field is used as the dir_offset for a recursive call into __tb_property_parse_dir() with no depth counter. A crafted peer that chains DIRECTORY entries into a back-reference loop drives the parser until the kernel stack is exhausted and the guard page fires. Any untrusted XDomain peer (cable, dock, in-line inspector, adjacent host) that reaches the PROPERTIES_REQUEST control-plane exchange can trigger this without authentication. Thread a depth counter through tb_property_parse() and __tb_property_parse_dir(), and reject blocks that exceed TB_PROPERTY_MAX_DEPTH = 8. That is comfortably larger than any observed legitimate XDomain layout. Operators who do not need XDomain host-to-host discovery can disable the path entirely with thunderbolt.xdomain=0 on the kernel command line. Fixes: cdae7c07e3e3 ("thunderbolt: Add support for XDomain properties") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursthunderbolt: property: Reject dir_len < 4 to prevent size_t underflowMichael Bommarito1-2/+6
commit de21b59c29e31c5108ddc04210631bbfab81b997 upstream. On the non-root path, __tb_property_parse_dir() takes dir_len from entry->length (u16 widened to size_t). Two distinct OOB conditions follow when entry->length < 4: 1. The non-root path begins with kmemdup(&block[dir_offset], sizeof(*dir->uuid), ...) which always reads 4 dwords from dir_offset. tb_property_entry_valid() only enforces dir_offset + entry->length <= block_len, so a crafted entry with dir_offset close to the end of the property block and entry->length in 0..3 passes that gate but lets the UUID copy run off the block (e.g. dir_offset = 497, dir_len = 3 in a 500-dword block reads block[497..501]). 2. After the kmemdup, content_len = dir_len - 4 underflows size_t to ~SIZE_MAX, nentries becomes SIZE_MAX / 4, and the entry walk runs OOB on each iteration until an entry fails validation or the kernel oopses on an unmapped page. Reject dir_len < 4 on the non-root path *before* the UUID kmemdup, which closes both holes. Also move INIT_LIST_HEAD(&dir->properties) up to immediately after the dir allocation so the new error-return path (and the existing uuid-alloc failure path) calling tb_property_free_dir() sees a walkable list rather than the zero-initialized NULL next/prev that list_for_each_entry_safe() would oops on. Fixes: cdae7c07e3e3 ("thunderbolt: Add support for XDomain properties") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursthunderbolt: property: Reject u32 wrap in tb_property_entry_valid()Michael Bommarito1-1/+5
commit 01deda0152066c6c955f0619114ea6afa070aaec upstream. entry->value is u32 and entry->length is u16; the sum is performed in u32 and wraps. A malicious XDomain peer can pick value = 0xffffff00, length = 0x100 so the sum 0x100000000 wraps to 0 and passes the > block_len check. tb_property_parse() then passes entry->value to parse_dwdata() as a dword offset into the property block, reading attacker-directed memory far past the allocation. For TEXT-typed entries with the "deviceid" or "vendorid" keys this lands in xd->device_name / xd->vendor_name and is readable back via the per-XDomain device_name / vendor_name sysfs attributes; the leak is NUL-bounded (kstrdup() stops at the first zero byte) and untargeted (the attacker picks a delta, not an absolute address). DATA-typed entries are parsed into property->value.data but not generically surfaced to userspace. Use check_add_overflow() so a wrapped sum is rejected. Fixes: cdae7c07e3e3 ("thunderbolt: Add support for XDomain properties") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Assisted-by: Codex:gpt-5-4 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursusb: gadget: f_fs: serialize DMABUF cancel against request completionMichael Bommarito1-2/+22
commit 2796646f6d892c1eb6818c7ca41fdfa12568e8d1 upstream. ffs_epfile_dmabuf_io_complete() calls usb_ep_free_request() on the completed request but leaves priv->req, the back-pointer that ffs_dmabuf_transfer() set on submission, pointing at the freed memory. A later FUNCTIONFS_DMABUF_DETACH ioctl or ffs_epfile_release() on the close path still sees priv->req non-NULL under ffs->eps_lock: if (priv->ep && priv->req) usb_ep_dequeue(priv->ep, priv->req); so usb_ep_dequeue() is called on a freed usb_request. On dummy_hcd the dequeue path only walks a live queue and pointer-compares, so the freed pointer reads without faulting and KASAN requires an explicit check at the FunctionFS call site to surface the use-after-free. On SG-capable in-tree UDCs the dequeue path dereferences the supplied request immediately: * chipidea's ep_dequeue() does container_of(req, struct ci_hw_req, req) and reads hwreq->req.status before acquiring its own lock. * cdnsp's cdnsp_gadget_ep_dequeue() reads request->status first. The narrower option of clearing priv->req via cmpxchg() in the completion does not close the race: the completion runs without eps_lock, so a cancel path holding eps_lock can still observe priv->req non-NULL, race a concurrent completion that clears and frees, and pass the freed pointer to usb_ep_dequeue(). A slightly longer fix that moves the free into the cleanup work is needed. Same class of lifetime race as the recent usbip-vudc timer fix [1]. Take eps_lock in the sole place that mutates priv->req from the callback direction by moving usb_ep_free_request() out of the completion into ffs_dmabuf_cleanup(), the existing work handler scheduled by ffs_dmabuf_signal_done() on ffs->io_completion_wq. Clear priv->req there under eps_lock before freeing, and only clear if priv->req still names our request (a subsequent ffs_dmabuf_transfer() on the same attachment may have queued a new one). This keeps the existing dummy_hcd sync-dequeue invariant: the completion callback is still invoked by the UDC without eps_lock held (dummy_hcd drops its own lock before calling the callback), and the callback now takes no f_fs lock at all. Serialization against the cancel path happens in cleanup, which runs from the workqueue with no f_fs lock held on entry. The priv ref count protects the containing ffs_dmabuf_priv: ffs_dmabuf_transfer() takes a ref via ffs_dmabuf_get(), cleanup drops it via ffs_dmabuf_put(), so priv stays live for the cleanup even after the cancel path's list_del + ffs_dmabuf_put. The ffs_dmabuf_transfer() error path no longer frees usb_req inline: fence->req and fence->ep are set before usb_ep_queue(), so ffs_dmabuf_cleanup() (scheduled by the error-path ffs_dmabuf_signal_done()) owns the free regardless of whether the queue succeeded. Reproduced under KASAN on both detach and close paths against dummy_hcd with an observability hook (kasan_check_byte(priv->req) immediately before usb_ep_dequeue) at the two FunctionFS cancel sites to surface the stale-pointer access; the hook is not part of this patch. The KASAN allocator / free stacks in the captured splats identify the same request: alloc in dummy_alloc_request, free in dummy_timer, fault reached from ffs_epfile_release (close) and from the FUNCTIONFS_DMABUF_DETACH ioctl (detach). With the patch applied, both paths are silent under the same hook. The bug is reached from the FunctionFS device node, which in real deployments is owned by the privileged gadget daemon (adbd, UMS, composite gadget services, etc.); it is not reachable from unprivileged userspace or from a USB host on the cable. FunctionFS mounts default to GLOBAL_ROOT_UID, but the filesystem supports uid=, gid=, and fmode= delegation to a non-root gadget daemon, so on real deployments the attacker may be a less-privileged service rather than root. Fixes: 7b07a2a7ca02 ("usb: gadget: functionfs: Add DMABUF import interface") Link: https://lore.kernel.org/all/20260417163552.807548-1-michael.bommarito@gmail.com/ [1] Cc: stable <stable@kernel.org> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/20260419161227.1587668-1-michael.bommarito@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
30 hoursusb: gadget: f_fs: copy only received bytes on short ep0 readMichael Bommarito1-1/+1
commit 4e036c10e7f4df5d951c69cc3697bc8e209c6d02 upstream. ffs_ep0_read() allocates its control-OUT data buffer with kmalloc() (not kzalloc) at the Length value from the Setup packet, then copies that full len to userspace regardless of how many bytes were actually received: data = kmalloc(len, GFP_KERNEL); ... ret = __ffs_ep0_queue_wait(ffs, data, len); if ((ret > 0) && (copy_to_user(buf, data, len))) ret = -EFAULT; __ffs_ep0_queue_wait() returns req->actual, which on a short control OUT transfer is strictly less than len. The copy_to_user() call still copies len bytes, so on a short OUT the last (len - ret) bytes of the kmalloc() buffer -- uninitialised slab residue -- are delivered to the FunctionFS daemon. Short ep0 OUT completions are specified USB control-transfer behavior and are produced by in-tree UDCs: * dwc2 continues on req->actual < req->length for ep0 DATA OUT (short-not-ok is the only ep0-OUT stall path). * aspeed_udc ends ep0 OUT on rx_len < ep->ep.maxpacket. * renesas_usbf logs "ep0 short packet" and completes the request. * dwc3 stalls on short IN but not on short OUT. A short ep0 OUT is therefore not evidence of a broken UDC; it is a normal condition f_fs has to cope with. The sibling gadgetfs implementation in drivers/usb/gadget/legacy/inode.c already does this correctly via min(len, dev->req->actual) before copy_to_user(). This patch brings f_fs.c to the same safe pattern rather than trimming at a defensive layer. The bug is reached from the FunctionFS device node, which in real deployments is owned by the privileged gadget daemon (adbd, UMS, composite gadget services, etc.); it is not reachable from unprivileged userspace. Linux host stacks normally reject short-wLength control OUTs before they reach the gadget, so reproducing this required a build that bypasses that host-side check. With the bypass in place, a 1-byte payload on a 64-byte Setup produces 63 bytes of non-canary slab residue in the daemon's read buffer. Fix by copying only ret (actually received) bytes to userspace. Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver") Cc: stable <stable@kernel.org> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/20260419160359.1577270-1-michael.bommarito@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>