summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2025-06-19net: pse-pd: Add support for budget evaluation strategiesKory Maincent (Dent Project)1-0/+76
This patch introduces the ability to configure the PSE PI budget evaluation strategies. Budget evaluation strategies is utilized by PSE controllers to determine which ports to turn off first in scenarios such as power budget exceedance. The pis_prio_max value is used to define the maximum priority level supported by the controller. Both the current priority and the maximum priority are exposed to the user through the pse_ethtool_get_status call. This patch add support for two mode of budget evaluation strategies. 1. Static Method: This method involves distributing power based on PD classification. It’s straightforward and stable, the PSE core keeping track of the budget and subtracting the power requested by each PD’s class. Advantages: Every PD gets its promised power at any time, which guarantees reliability. Disadvantages: PD classification steps are large, meaning devices request much more power than they actually need. As a result, the power supply may only operate at, say, 50% capacity, which is inefficient and wastes money. Priority max value is matching the number of PSE PIs within the PSE. 2. Dynamic Method: To address the inefficiencies of the static method, vendors like Microchip have introduced dynamic power budgeting, as seen in the PD692x0 firmware. This method monitors the current consumption per port and subtracts it from the available power budget. When the budget is exceeded, lower-priority ports are shut down. Advantages: This method optimizes resource utilization, saving costs. Disadvantages: Low-priority devices may experience instability. Priority max value is set by the PSE controller driver. For now, budget evaluation methods are not configurable and cannot be mixed. They are hardcoded in the PSE driver itself, as no current PSE controller supports both methods. Signed-off-by: Kory Maincent (Dent Project) <kory.maincent@bootlin.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20250617-feature_poe_port_prio-v14-7-78a1a645e2ee@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-19net: ethtool: Add support for new power domains index descriptionKory Maincent (Dent Project)1-0/+2
Report the index of the newly introduced PSE power domain to the user, enabling improved management of the power budget for PSE devices. Signed-off-by: Kory Maincent (Dent Project) <kory.maincent@bootlin.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20250617-feature_poe_port_prio-v14-5-78a1a645e2ee@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-19net: pse-pd: Add support for PSE power domainsKory Maincent (Dent Project)1-0/+2
Introduce PSE power domain support as groundwork for upcoming port priority features. Multiple PSE PIs can now be grouped under a single PSE power domain, enabling future enhancements like defining available power budgets, port priority modes, and disconnection policies. This setup will allow the system to assess whether activating a port would exceed the available power budget, preventing over-budget states proactively. Signed-off-by: Kory Maincent (Dent Project) <kory.maincent@bootlin.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20250617-feature_poe_port_prio-v14-4-78a1a645e2ee@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-19net: pse-pd: Add support for reporting eventsKory Maincent (Dent Project)2-0/+27
Add support for devm_pse_irq_helper() to register PSE interrupts and report events such as over-current or over-temperature conditions. This follows a similar approach to the regulator API but also sends notifications using a dedicated PSE ethtool netlink socket. Signed-off-by: Kory Maincent (Dent Project) <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20250617-feature_poe_port_prio-v14-2-78a1a645e2ee@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-19net: pse-pd: Introduce attached_phydev to pse controlKory Maincent (Dent Project)1-2/+4
In preparation for reporting PSE events via ethtool notifications, introduce an attached_phydev field in the pse_control structure. This field stores the phy_device associated with the PSE PI, ensuring that notifications are sent to the correct network interface. The attached_phydev pointer is directly tied to the PHY lifecycle. It is set when the PHY is registered and cleared when the PHY is removed. There is no need to use a refcount, as doing so could interfere with the PHY removal process. Signed-off-by: Kory Maincent (Dent Project) <kory.maincent@bootlin.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20250617-feature_poe_port_prio-v14-1-78a1a645e2ee@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-19stddef: Introduce TRAILING_OVERLAP() helper macroGustavo A. R. Silva1-0/+20
Add new TRAILING_OVERLAP() helper macro to create a union between a flexible-array member (FAM) and a set of members that would otherwise follow it. This overlays the trailing members onto the FAM while preserving the original memory layout. Co-developed-by: Kees Cook <kees@kernel.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/aFG8gEwKXAWWIvX0@kspp Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-19mux: Convert mux_control_ops to a flex array member in mux_chipThorsten Blum1-2/+2
Convert mux_control_ops to a flexible array member at the end of the mux_chip struct and add the __counted_by() compiler attribute to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Use struct_size() to calculate the number of bytes to allocate for a new mux chip and to remove the following Coccinelle/coccicheck warning: WARNING: Use struct_size Use size_add() to safely add any extra bytes. No functional changes intended. Link: https://github.com/KSPP/linux/issues/83 Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20250610104106.1948-2-thorsten.blum@linux.dev Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-18PM: runtime: Mark last busy stamp in pm_request_autosuspend()Sakari Ailus1-3/+5
Set device's last busy timestamp to current time in pm_request_autosuspend(). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://patch.msgid.link/20250616061212.2286741-6-sakari.ailus@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-18PM: runtime: Mark last busy stamp in pm_runtime_autosuspend()Sakari Ailus1-3/+6
Set device's last busy timestamp to current time in pm_runtime_autosuspend(). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://patch.msgid.link/20250616061212.2286741-5-sakari.ailus@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-18PM: runtime: Mark last busy stamp in pm_runtime_put_sync_autosuspend()Sakari Ailus1-4/+7
Set device's last busy timestamp to current time in pm_runtime_put_sync_autosuspend(). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://patch.msgid.link/20250616061212.2286741-4-sakari.ailus@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-18PM: runtime: Mark last busy stamp in pm_runtime_put_autosuspend()Sakari Ailus1-5/+7
Set device's last busy timestamp to current time in pm_runtime_put_autosuspend(). Callers wishing not to do that will need to use __pm_runtime_put_autosuspend(). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://patch.msgid.link/20250616061212.2286741-3-sakari.ailus@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-18PM: runtime: Document return values of suspend-related API functionsSakari Ailus1-9/+138
Document return values for device suspend and idle related API functions. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Link: https://patch.msgid.link/20250616061212.2286741-2-sakari.ailus@linux.intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-18Merge tag 'ata-6.16-rc3' of ↵Linus Torvalds1-4/+3
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fixes from Niklas Cassel: - Force PIO for ATAPI devices on VT6415/VT6330 as the controller locks up on ATAPI DMA (Tasos) - Fix ACPI PATA cable type detection such that the controller is not forced down to a slow transfer mode (Tasos) - Fix build error on 32-bit UML (Johannes) - Fix a PCI region leak in the pata_macio driver so that the driver no longer fails to load after rmmod (Philipp) - Use correct DMI BIOS build date for ThinkPad W541 quirk (me) - Disallow LPM for ASUSPRO-D840SA motherboard as this board interestingly enough gets graphical corruptions on the iGPU when LPM is enabled (me) - Disallow LPM for Asus B550-F motherboard as this board will get command timeouts on ports 5 and 6, yet LPM with the same drive works fine on all other ports (Mikko) * tag 'ata-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: ahci: Disallow LPM for Asus B550-F motherboard ata: ahci: Disallow LPM for ASUSPRO-D840SA motherboard ata: ahci: Use correct BIOS build date for ThinkPad W541 quirk ata: pata_macio: Fix PCI region leak ata: pata_cs5536: fix build on 32-bit UML ata: libata-acpi: Do not assume 40 wire cable if no devices are enabled ata: pata_via: Force PIO for ATAPI devices on VT6415/VT6330
2025-06-18pinctrl: Constify pointers to 'pinctrl_desc'Krzysztof Kozlowski1-4/+4
Pin controller core code only stores the pointer to 'struct pinctrl_desc' and does not modify it anywhere. The pointer can be changed to pointer to const which makes the code safer, explicit and later allows constifying 'pinctrl_desc' allocations in individual drivers. Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/20250611-pinctrl-const-desc-v2-4-b11c1d650384@linaro.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-06-18padata: Remove comment for reorder_workHerbert Xu1-1/+0
Remove comment for reorder_work which no longer exists. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Fixes: 71203f68c774 ("padata: Fix pd UAF once and for all") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-06-18fs_context: fix parameter name in infofc() macroRubenKelevra1-1/+1
The macro takes a parameter called "p" but references "fc" internally. This happens to compile as long as callers pass a variable named fc, but breaks otherwise. Rename the first parameter to “fc” to match the usage and to be consistent with warnfc() / errorfc(). Fixes: a3ff937b33d9 ("prefix-handling analogues of errorf() and friends") Signed-off-by: RubenKelevra <rubenkelevra@gmail.com> Link: https://lore.kernel.org/20250617230927.1790401-1-rubenkelevra@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-18mtd: spinand: winbond: Prevent unsupported frequencies on dual/quad I/O variantsMiquel Raynal1-4/+6
Dual and quad capable chips natively support dual and quad I/O variants at up to 104MHz (1-2-2 and 1-4-4 operations). Reaching the maximum speed of 166MHz is theoretically possible (while still unsupported in the field) by adding a few more dummy cycles. Let's be accurate and clearly state this limit. Setting a maximum frequency implies adding the frequency parameter to the macro, which is done using a variadic argument to avoid impacting all the other drivers which already make use of this macro. Fixes: 1ea808b4d15b ("mtd: spinand: winbond: Update the *JW chip definitions") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2025-06-18Revert "mtd: core: always create master device"Miquel Raynal1-1/+1
The idea behind this patch was to always let a "master" mtd device available to anchor runtime PM. Historically, there was no mtd device representing the whole storage as soon as partitions were coming into play. The introduction of CONFIG_MTD_PARTITIONED_MASTER allowed to keep this "master" device, but was not enabled by default to avoid breaking existing users (otherwise the mtd device numbering would be totally messed up with an off by 1, at least). The approach of adding an mtd_master class on top of partitioned mtd devices is breaking the mtd core in many creative ways, so better think again this approach and revert the faulty changes for now. This reverts commit 0aa7b390fc40a871267a2328bbbefca8b37ad307. Fixes: 0aa7b390fc40 ("mtd: core: always create master device") Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2025-06-18atm: Revert atm_account_tx() if copy_from_iter_full() fails.Kuniyuki Iwashima1-0/+6
In vcc_sendmsg(), we account skb->truesize to sk->sk_wmem_alloc by atm_account_tx(). It is expected to be reverted by atm_pop_raw() later called by vcc->dev->ops->send(vcc, skb). However, vcc_sendmsg() misses the same revert when copy_from_iter_full() fails, and then we will leak a socket. Let's factorise the revert part as atm_return_tx() and call it in the failure path. Note that the corresponding sk_wmem_alloc operation can be found in alloc_tx() as of the blamed commit. $ git blame -L:alloc_tx net/atm/common.c c55fa3cccbc2c~ Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Simon Horman <horms@kernel.org> Closes: https://lore.kernel.org/netdev/20250614161959.GR414686@horms.kernel.org/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250616182147.963333-3-kuni1840@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18net: Add skb_can_coalesce for netmemDragos Tatulea1-3/+9
Allow drivers that have moved over to netmem to do fragment coalescing. Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250616141441.1243044-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18net: ipv6: Add ip6_mr_output()Petr Machata2-0/+8
Multicast routing is today handled in the input path. Locally generated MC packets don't hit the IPMR code today. Thus if a VXLAN remote address is multicast, the driver needs to set an OIF during route lookup. Thus MC routing configuration needs to be kept in sync with the VXLAN FDB and MDB. Ideally, the VXLAN packets would be routed by the MC routing code instead. To that end, this patch adds support to route locally generated multicast packets. The newly-added routines do largely what ip6_mr_input() and ip6_mr_forward() do: make an MR cache lookup to find where to send the packets, and use ip6_output() to send each of them. When no cache entry is found, the packet is punted to the daemon for resolution. Similarly to the IPv4 case in a previous patch, the new logic is contingent on a newly-added IP6CB flag being set. Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/3bcc034a3ab4d3c291072fff38f78d7fbbeef4e6.1750113335.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18net: vlan: Use IS_ENABLED() helper for CONFIG_VLAN_8021Q guardGal Pressman1-1/+1
The header currently tests the VLAN core with an explicit pair of 'if defined' checks: #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE) Instead, use IS_ENABLED() which is the kernel way to test whether an option is configured as builtin/module. This is purely cosmetic – no functional changes. Reviewed-by: Alex Lazar <alazar@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250616132626.1749331-4-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18net: vlan: Replace BUG() with WARN_ON_ONCE() in vlan_dev_* stubsGal Pressman1-3/+3
When CONFIG_VLAN_8021Q=n, a set of stub helpers are used, three of these helpers use BUG() unconditionally. This code should not be reached, as callers of these functions should always check for is_vlan_dev() first, but the usage of BUG() is not recommended, replace it with WARN_ON() instead. Reviewed-by: Alex Lazar <alazar@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250616132626.1749331-3-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18net: vlan: Make is_vlan_dev() a stub when VLAN is not configuredGal Pressman1-5/+10
Add a stub implementation of is_vlan_dev() that returns false when VLAN support is not compiled in (CONFIG_VLAN_8021Q=n). This allows us to compile-out VLAN-dependent dead code when it is not needed. This also resolves the following compilation error when: * CONFIG_VLAN_8021Q=n * CONFIG_OBJTOOL=y * CONFIG_OBJTOOL_WERROR=y drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.o: error: objtool: parse_mirred.isra.0+0x370: mlx5e_tc_act_vlan_add_push_action() missing __noreturn in .c/.h or NORETURN() in noreturns.h The error occurs because objtool cannot determine that unreachable BUG() (which doesn't return) calls in VLAN code paths are actually dead code when VLAN support is disabled. Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250616132626.1749331-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18tcp: remove RFC3517/RFC6675 hint state: lost_skb_hint, lost_cnt_hintNeal Cardwell1-3/+0
Now that obsolete RFC3517/RFC6675 TCP loss detection has been removed, we can remove the somewhat complex and intrusive code to maintain its hint state: lost_skb_hint and lost_cnt_hint. This commit makes tcp_clear_retrans_hints_partial() empty. We will remove tcp_clear_retrans_hints_partial() and its call sites in the next commit. Suggested-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250615001435.2390793-3-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-18tpm: don't bother with removal of files in directory we'll be removingAl Viro1-1/+1
FWIW, there is a reliable indication of removal - ->i_nlink going to 0 ;-) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-18Merge tag 'drm-misc-next-2025-06-12' of ↵Dave Airlie1-13/+19
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for 6.17: UAPI Changes: Cross-subsystem Changes: Core Changes: - atomic-helpers: Tune the enable / disable sequence - bridge: Add destroy hook - color management: Add helpers for hardware gamma LUT handling - HDMI: Add CEC handling, YUV420 output support - sched: tracing improvements Driver Changes: - hyperv: Move out of simple-kms, drm_panic support - i915: drm_panel_follower support - imx: Add IMX8qxq Display Controller Support - lima: Add Rockchip RK3528 GPU Support - nouveau: fence handling cleanup - panfrost: Add BO labeling, 64-bit registers access - qaic: Add RAS Support - rz-du: Add RZ/V2H(P) Support, MIPI-DSI DCS Support - sun4i: Add H616 Support - tidss: Add TI AM62L Support - vkms: YUV and R* formats support - bridges: - Switched to reference counted drm_bridge allocations - panels: - Switched to reference counted drm_panel allocations - Add support for fwnode-based panel lookup - himax-hx8394: Support for Huiling hl055fhv028c - ilitek-ili9881c: Support for 7" Raspberry Pi 720x1280 - panel-edp: Support for KDC KD116N3730A05, N160JCE-ELL CMN, - panel-simple: Support for AUO P238HAN01 - st7701: Support for Winstar wf40eswaa6mnn0 - visionox-rm69299: Support for rm69299-shift - New panels: Renesas R61307, Renesas R69328 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250612-coucal-of-impossible-cleaning-a5eecf@houat
2025-06-18Merge branch 'shradha_v6.16-rc1' of https://github.com/shradhagupta6/linuxJakub Kicinski1-0/+2
Shradha Gupta says: ==================== Allow dyn MSI-X vector allocation of MANA In this patchset we want to enable the MANA driver to be able to allocate MSI-X vectors in PCI dynamically. The first patch exports pci_msix_prepare_desc() in PCI to be able to correctly prepare descriptors for dynamically added MSI-X vectors. The second patch adds the support of dynamic vector allocation in pci-hyperv PCI controller by enabling the MSI_FLAG_PCI_MSIX_ALLOC_DYN flag and using the pci_msix_prepare_desc() exported in first patch. The third patch adds a detailed description of the irq_setup(), to help understand the function design better. The fourth patch is a preparation patch for mana changes to support dynamic IRQ allocation. It contains changes in irq_setup() to allow skipping first sibling CPU sets, in case certain IRQs are already affinitized to them. The fifth patch has the changes in MANA driver to be able to allocate MSI-X vectors dynamically. If the support does not exist it defaults to older behavior. * 'shradha_v6.16-rc1' of https://github.com/shradhagupta6/linux: net: mana: Allocate MSI-X vectors dynamically net: mana: Allow irq_setup() to skip cpus for affinity net: mana: explain irq_setup() algorithm PCI: hv: Allow dynamic MSI-X vector allocation PCI/MSI: Export pci_msix_prepare_desc() for dynamic MSI-X allocations ==================== Link: https://patch.msgid.link/1749650984-9193-1-git-send-email-shradhagupta@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-17soc: qcom: fix endianness for QMI headerAlexander Wilhelm1-3/+3
The members of QMI header have to be swapped on big endian platforms. Use __le16 types instead of u16 ones. Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com> Fixes: 9b8a11e82615 ("soc: qcom: Introduce QMI encoder/decoder") Fixes: 3830d0771ef6 ("soc: qcom: Introduce QMI helpers") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250522143530.3623809-3-alexander.wilhelm@westermo.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-06-17cgroup: remove per-cpu per-subsystem locksShakeel Butt1-7/+0
The rstat update side used to insert the cgroup whose stats are updated in the update tree and the read side flush the update tree to get the latest uptodate stats. The per-cpu per-subsystem locks were used to synchronize the update and flush side. However now the update side does not access update tree but uses per-cpu lockless lists. So there is no need for locks to synchronize update and flush side. Let's remove them. Suggested-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Tested-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-17cgroup: support to enable nmi-safe css_rstat_updatedShakeel Butt1-0/+4
Add necessary infrastructure to enable the nmi-safe execution of css_rstat_updated(). Currently css_rstat_updated() takes a per-cpu per-css raw spinlock to add the given css in the per-cpu per-css update tree. However the kernel can not spin in nmi context, so we need to remove the spinning on the raw spinlock in css_rstat_updated(). To support lockless css_rstat_updated(), let's add necessary data structures in the css and ss structures. Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Tested-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-17Merge branch 'WQ_PERCPU' into for-6.17Tejun Heo1-3/+6
2025-06-17workqueue: Add new WQ_PERCPU flagMarco Crivellari1-0/+1
Currently if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. This patch adds a new WQ_PERCPU flag to explicitly request the use of the per-CPU behavior. Both flags coexist for one release cycle to allow callers to transition their calls. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. tj: Merged doc patch. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-17workqueue: Add system_percpu_wq and system_dfl_wqMarco Crivellari1-3/+5
Currently, if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be addressed without refactoring the API. system_wq is a per-CPU worqueue, yet nothing in its name tells about that CPU affinity constraint, which is very often not required by users. Make it clear by adding a system_percpu_wq. system_unbound_wq should be the default workqueue so as not to enforce locality constraints for random work whenever it's not required. Adding system_dfl_wq to encourage its use when unbound work should be used. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-17x86/bugs: Add a Transient Scheduler Attacks mitigationBorislav Petkov (AMD)1-0/+1
Add the required features detection glue to bugs.c et all in order to support the TSA mitigation. Co-developed-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
2025-06-17wifi: remove zero-length arraysJohannes Berg1-9/+9
All of these are really meant to be variable-length, and in the case of s1g_beacon it's actually accessed. Make that one in particular, and a couple of others (that aren't used as arrays now), actually variable. Reported-by: syzbot+fd222bb38e916df26fa4@syzkaller.appspotmail.com Fixes: 1e1f706fc2ce ("wifi: cfg80211/mac80211: correctly parse S1G beacon optional elements") Link: https://patch.msgid.link/20250614003037.a3e82e882251.I2e8b58e56ff2a9f8b06c66f036578b7c1d4e4685@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-17mm/filemap: introduce generic_file_*_mmap_prepare() helpersLorenzo Stoakes1-2/+4
Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"), the f_op->mmap() hook has been deprecated in favour of f_op->mmap_prepare(). The generic mmap handlers are very simple, so we can very easily convert these in advance of converting file systems which use them. This patch does so. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Link: https://lore.kernel.org/30622c1f0b98c66840bc8c02668bda276a810b70.1750099179.git.lorenzo.stoakes@oracle.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-17fs/dax: make it possible to check dev dax support without a VMALorenzo Stoakes1-7/+9
This is a prerequisite for adapting those filesystems to use the .mmap_prepare() hook for mmap()'ing which invoke this check as this hook does not have access to a VMA pointer. To effect this, change the signature of daxdev_mapping_supported() and update its callers (ext4 and xfs mmap()'ing hook code). Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Link: https://lore.kernel.org/b09de1e8544384074165d92d048e80058d971286.1750099179.git.lorenzo.stoakes@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-17fs: consistently use can_mmap_file() helperLorenzo Stoakes1-1/+1
Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"), the f_op->mmap() hook has been deprecated in favour of f_op->mmap_prepare(). Additionally, commit bb666b7c2707 ("mm: add mmap_prepare() compatibility layer for nested file systems") permits the use of the .mmap_prepare() hook even in nested filesystems like overlayfs. There are a number of places where we check only for f_op->mmap - this is incorrect now mmap_prepare exists, so update all of these to use the general helper can_mmap_file(). Most notably, this updates the elf logic to allow for the ability to execute binaries on filesystems which have the .mmap_prepare hook, but additionally we update nested filesystems. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Link: https://lore.kernel.org/b68145b609532e62bab603dd9686faa6562046ec.1750099179.git.lorenzo.stoakes@oracle.com Acked-by: Kees Cook <kees@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-17serial: 8250: sanitize uart_port::serial_{in,out}() typesJiri Slaby (SUSE)2-4/+4
uart_port::{serial_in,serial_out} (and plat_serial8250_port::* likewise) historically use: * 'unsigned int' for 32-bit register values in reads and writes, and * 'int' for offsets. Make them sane such that: * 'u32' is used for register values, and * 'unsigned int' is used for offsets. While at it, name hooks' parameters, so it is clear what is what. Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Paul Cercueil <paul@crapouillou.net> Cc: Vladimir Zapolskiy <vz@mleia.com> Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250611100319.186924-9-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-17tty: introduce and use tty_port_tty_vhangup() helperJiri Slaby (SUSE)1-1/+11
This code (tty_get -> vhangup -> tty_put) is repeated on few places. Introduce a helper similar to tty_port_tty_hangup() (asynchronous) to handle even vhangup (synchronous). And use it on those places. In fact, reuse the tty_port_tty_hangup()'s code and call tty_vhangup() depending on a new bool parameter. Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: David Lin <dtwlin@gmail.com> Cc: Johan Hovold <johan@kernel.org> Cc: Alex Elder <elder@kernel.org> Cc: Oliver Neukum <oneukum@suse.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20250611100319.186924-2-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-17mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepareLorenzo Stoakes1-3/+2
The call_mmap() function violates the existing convention in include/linux/fs.h whereby invocations of virtual file system hooks is performed by functions prefixed with vfs_xxx(). Correct this by renaming call_mmap() to vfs_mmap(). This also avoids confusion as to the fact that f_op->mmap_prepare may be invoked here. Also rename __call_mmap_prepare() function to vfs_mmap_prepare() and adjust to accept a file parameter, this is useful later for nested file systems. Finally, fix up the VMA userland tests and ensure the mmap_prepare -> mmap shim is implemented there. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Link: https://lore.kernel.org/8d389f4994fa736aa8f9172bef8533c10a9e9011.1750099179.git.lorenzo.stoakes@oracle.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-17mm, slab: use frozen pages for large kmallocVlastimil Babka1-1/+3
Since slab pages are now frozen, it makes sense to have large kmalloc() objects behave same as small kmalloc(), as the choice between the two is an implementation detail depending on allocation size. Notably, increasing refcount on a slab page containing kmalloc() object is not possible anymore, so it should be consistent for large kmalloc pages. Therefore, change large kmalloc to use the frozen pages API. Because of some unexpected fallout in the slab pages case (see commit b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page"), implement the same kind of checks and warnings as part of this change. Notably, networking code using sendpage_ok() to determine whether the page refcount can be manipulated in the network stack should continue behaving correctly. Before this change, the function returns true for large kmalloc pages and page refcount can be manipulated. After this change, the function will return false. Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-06-17driver core: Add device_link_test() for testing device link flagsRafael J. Wysocki1-0/+5
To avoid coding mistakes like the one fixed by commit 3860cbe23963 ("PM: sleep: Fix bit masking operation"), introduce device_link_test() for testing device link flags and use it where applicable. No intentional functional impact. Signed-off-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/2793309.mvXUDI8C0e@rjwysocki.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-17PCI/MSI: Export pci_msix_prepare_desc() for dynamic MSI-X allocationsShradha Gupta1-0/+2
For supporting dynamic MSI-X vector allocation by PCI controllers, enabling the flag MSI_FLAG_PCI_MSIX_ALLOC_DYN is not enough, msix_prepare_msi_desc() to prepare the MSI descriptor is also needed. Export pci_msix_prepare_desc() to allow PCI controllers to support dynamic MSI-X vector allocation. Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2025-06-17net: phy: remove phy_driver_is_genphy_10gHeiner Kallweit1-2/+0
Remove now unused function phy_driver_is_genphy_10g(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/49b0589a-9604-4ee9-add5-28fbbbe2c2f3@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-17net: phy: improve phy_driver_is_genphyHeiner Kallweit1-1/+11
Use new flag phydev->is_genphy_driven to simplify this function. Note that this includes a minor functional change: Now this function returns true if ANY of the genphy drivers is bound to the PHY device. We have only one user in DSA driver mt7530, and there the functional change doesn't matter. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/c9ac3a7d-262a-425d-9153-97fe3ca6280a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-17net: phy: add flag is_genphy_driven to struct phy_deviceHeiner Kallweit1-0/+2
In order to get rid of phy_driver_is_genphy() and phy_driver_is_genphy_10g(), as first step add and use a flag phydev->is_genphy_driven. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/3f3ad6dc-402e-4915-8d5a-2306b6d5562b@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-17octeontx2: Set appropriate PF, VF masks and shifts based on siliconSubbaraya Sundeep1-0/+25
Number of RVU PFs on CN20K silicon have increased to 96 from maximum of 32 that were supported on earlier silicons. Every RVU PF and VF is identified by HW using a 16bit PF_FUNC value. Due to the change in Max number of PFs in CN20K, the bit encoding of this PF_FUNC has changed. This patch handles the change by using helper functions(using silicon check) to use PF,VF masks and shifts to support both new silicon CN20K, OcteonTx series. These helper functions are used in different modules. Also moved the NIX AF register offset macros to other files which will be posted in coming patches. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Link: https://patch.msgid.link/1749639716-13868-2-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-17netpoll: move netpoll_print_options to netconsoleBreno Leitao1-1/+0
Move netpoll_print_options() from net/core/netpoll.c to drivers/net/netconsole.c and make it static. This function is only used by netconsole, so there's no need to export it or keep it in the public netpoll API. This reduces the netpoll API surface and improves code locality by keeping netconsole-specific functionality within the netconsole driver. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250613-rework-v3-4-0752bf2e6912@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>