Age | Commit message | Author | Files | Lines
2026-04-10 | netfilter: require Ethernet MAC header before using eth_hdr() | Zhengchuan Liang | 6 | -14/+24
`ip6t_eui64`, `xt_mac`, the `bitmap:ip,mac`, `hash:ip,mac`, and `hash:mac` ipset types, and `nf_log_syslog` access `eth_hdr(skb)` after either assuming that the skb is associated with an Ethernet device or checking only that the `ETH_HLEN` bytes at `skb_mac_header(skb)` lie between `skb->head` and `skb->data`. Make these paths first verify that the skb is associated with an Ethernet device, that the MAC header was set, and that it spans at least a full Ethernet header before accessing `eth_hdr(skb)`. Suggested-by: Florian Westphal <fw@strlen.de> Tested-by: Ren Wei <enjou1224z@gmail.com> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Florian Westphal <fw@strlen.de>
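The three conditions named above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct fake_skb`, `MAC_UNSET`, and `fake_skb_has_eth_hdr()` are hypothetical stand-ins, not the kernel's `sk_buff` API or the helper the patch actually adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN      14      /* length of an Ethernet header in bytes */
#define ARPHRD_ETHER  1       /* device type value for Ethernet */
#define MAC_UNSET     0xFFFFu /* hypothetical sentinel: MAC header never set */

/* Hypothetical stand-in for the few sk_buff fields the check needs. */
struct fake_skb {
    bool     has_dev;        /* skb is attached to a device */
    uint16_t dev_type;       /* ARPHRD_* type of that device */
    uint16_t mac_header;     /* offset of the MAC header, or MAC_UNSET */
    uint16_t network_header; /* offset of the network header */
};

/* Return true only if the MAC header may be read as an Ethernet header. */
static bool fake_skb_has_eth_hdr(const struct fake_skb *skb)
{
    assert(skb != NULL);

    if (!skb->has_dev || skb->dev_type != ARPHRD_ETHER)
        return false;   /* not associated with an Ethernet device */
    if (skb->mac_header == MAC_UNSET)
        return false;   /* MAC header was never set */
    if (skb->network_header < skb->mac_header)
        return false;   /* inconsistent offsets */
    /* The MAC header must span at least a full Ethernet header. */
    return (skb->network_header - skb->mac_header) >= ETH_HLEN;
}
```

A caller would run this check first and only then dereference the header, treating a `false` result as "no match" rather than reading the bytes.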
2026-04-10 | netfilter: nft_fwd_netdev: check ttl/hl before forwarding | Florian Westphal | 1 | -0/+10
Drop packets if their ttl/hl is too small for forwarding. Fixes: d32de98ea70f ("netfilter: nft_fwd_netdev: allow to forward packets via neighbour layer") Signed-off-by: Florian Westphal <fw@strlen.de>
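A minimal sketch of such a check follows. The threshold is an assumption: it uses the conventional forwarding rule that a TTL or hop limit of 0 or 1 cannot survive another hop. `fwd_ttl_ok()` and `fwd_prepare()` are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: may a packet with this TTL / hop limit be forwarded?
 * Assumption: a value of 0 or 1 would expire at the next hop. */
static bool fwd_ttl_ok(uint8_t ttl_or_hoplimit)
{
    return ttl_or_hoplimit > 1;
}

/* Check first, then decrement. Returns false when the packet must be
 * dropped, in which case the TTL is left untouched. */
static bool fwd_prepare(uint8_t *ttl)
{
    assert(ttl != NULL);

    if (!fwd_ttl_ok(*ttl))
        return false;
    (*ttl)--;
    return true;
}
```

Doing the check before the decrement avoids wrapping a TTL of 0 around to 255.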
2026-04-10 | netfilter: x_tables: Avoid a couple -Wflex-array-member-not-at-end warnings | Gustavo A. R. Silva | 1 | -4/+8
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Use the TRAILING_OVERLAP() helper to fix the following warnings: 1 net/netfilter/x_tables.c:816:39: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] 1 net/netfilter/x_tables.c:811:39: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] This helper creates a union between a flexible-array member (FAM) and a set of members that would otherwise follow it. This overlays the trailing members onto the FAM while preserving the original memory layout. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-10 | netfilter: conntrack: remove UDP-Lite conntrack support | Fernando Fernandez Mancera | 11 | -170/+0
UDP-Lite (RFC 3828) socket support was recently retired from the core networking stack. As a follow-up, drop the connection tracker and NAT support for UDP-Lite in Netfilter. This patch removes CONFIG_NF_CT_PROTO_UDPLITE and scrubs UDP-Lite awareness from the conntrack core, NAT core, nft_ct, and ctnetlink. Please note that stateless packet inspection, matching, ipsets or logging support for IPPROTO_UDPLITE is preserved. As conntrack no longer extracts UDP-Lite ports or tracks its L4 state, the UDP-Lite checksum can no longer be updated when performing NAT. That is an expected and acceptable consequence of removing the UDP-Lite conntrack module. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-10 | netfilter: xt_socket: enable defrag after all other checks | Florian Westphal | 1 | -17/+6
Originally this did not matter because defrag was enabled once per netns and only disabled again on netns dismantle. When this got changed I should have adjusted checkentry to not leave defrag enabled on error. Fixes: de8c12110a13 ("netfilter: disable defrag once its no longer needed") Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-10 | netfilter: xt_HL: add pr_fmt and checkentry validation | Marino Dzalto | 1 | -0/+27
Add pr_fmt to prefix log messages with the module name for easier debugging in dmesg. Add checkentry functions for IPv4 (ttl_mt_check) and IPv6 (hl_mt6_check) to validate the match mode at rule registration time, rejecting invalid modes with -EINVAL. The evaluation function returns false in case the mode is unknown, so this is a cleanup, not a bug fix. Signed-off-by: Marino Dzalto <marino.dzalto@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
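A checkentry-style validation of the match mode might look like the sketch below. The enum ordering mirrors the uapi `IPT_TTL_*` modes (EQ, NE, LT, GT), but `struct ttl_info` and `ttl_check_sketch()` are hypothetical and not the code added by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Match modes, in the same order as the uapi IPT_TTL_* enum. */
enum ttl_mode { TTL_EQ = 0, TTL_NE, TTL_LT, TTL_GT };

/* Hypothetical stand-in for the match info handed to checkentry. */
struct ttl_info {
    uint8_t mode; /* one of enum ttl_mode */
    uint8_t ttl;  /* value to compare against */
};

/* Reject unknown modes when the rule is added, not at packet time. */
static int ttl_check_sketch(const struct ttl_info *info)
{
    assert(info != NULL);

    if (info->mode > TTL_GT)
        return -EINVAL;
    return 0;
}
```

Failing at rule registration gives the user an immediate error instead of a rule that silently never matches.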
2026-04-10 | netfilter: nfnetlink: prefer skb_mac_header helpers | Florian Westphal | 2 | -22/+22
This adds implicit DEBUG_WARN_ON_ONCE for debug configurations. No other changes intended. Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-10 | netfilter: x_physdev: reject empty or not-nul terminated device names | Florian Westphal | 1 | -0/+22
Reject names that lack a \0 character and reject the empty string as well. iptables allows this but it fails to re-parse iptables-save output that contains such rules. Signed-off-by: Florian Westphal <fw@strlen.de>
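The two rejections can be sketched with a bounded search for the terminator. `check_devname()` is a hypothetical helper. The real match presumably validates a name only when the corresponding flag is set, which this sketch does not model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ 16 /* size of an interface name buffer, including the NUL */

/* Validate a fixed-size device name buffer supplied by user space. */
static int check_devname(const char name[IFNAMSIZ])
{
    assert(name != NULL);

    if (name[0] == '\0')
        return -EINVAL; /* empty string */
    if (memchr(name, '\0', IFNAMSIZ) == NULL)
        return -EINVAL; /* no terminator inside the buffer */
    return 0;
}
```

`memchr()` never reads past `IFNAMSIZ` bytes, so the check itself is safe on an unterminated buffer.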
2026-04-10 | ipvs: add conn_lfactor and svc_lfactor sysctl vars | Julian Anastasov | 2 | -0/+113
Allow the default load factor for the connection and service tables to be configured. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-10 | ipvs: add ip_vs_status info | Julian Anastasov | 1 | -0/+145
Add /proc/net/ip_vs_status to show current state of IPVS. The motivation for this new /proc interface is to provide the output for the users to help them decide when to tune the load factor for hash tables, which is possible with the new sysctl knobs coming in followup patch. The output also includes information for the kthreads used for stats. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-10 | ipvs: show the current conn_tab size to users | Julian Anastasov | 1 | -4/+22
As conn_tab is per-net, better to show the current hash table size to users instead of the ip_vs_conn_tab_size (max). Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-10 | arm64: Kconfig: fix duplicate word in CMDLINE help text | Michael Ugrin | 1 | -1/+1
Remove duplicate 'the' in the CMDLINE config help text. Signed-off-by: Michael Ugrin <mugrinphoto@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-04-10 | Merge branch 'pm-cpufreq' | Rafael J. Wysocki | 28 | -200/+1105
Merge cpufreq updates for 7.1-rc1: - Update qcom-hw DT bindings to include Eliza hardware (Abel Vesa) - Update cpufreq-dt-platdev blocklist (Faruque Ansari) - Minor updates to driver and dt-bindings for Tegra (Thierry Reding, Rosen Penev) - Add MAINTAINERS entry for CPPC driver (Viresh Kumar) - Add support for new features: CPPC performance priority, Dynamic EPP, Raw EPP, and new unit tests for them to amd-pstate (Gautham Shenoy, Mario Limonciello) - Fix sysfs files being present when HW missing and broken/outdated documentation in the amd-pstate driver (Ninad Naik, Gautham Shenoy) - Pass the policy to cpufreq_driver->adjust_perf() to avoid using cpufreq_cpu_get() in the .adjust_perf() callback in amd-pstate which leads to a scheduling-while-atomic bug (K Prateek Nayak) - Clean up dead code in Kconfig for cpufreq (Julian Braha) - Remove max_freq_req update for pre-existing cpufreq policy and add a boost_freq_req QoS request to save the boost constraint instead of overwriting the last scaling_max_freq constraint (Pierre Gondois) - Embed cpufreq QoS freq_req objects in cpufreq policy so they all are allocated in one go along with the policy to simplify lifetime rules and avoid error handling issues (Viresh Kumar) - Use DMI max speed when CPPC is unavailable in the acpi-cpufreq scaling driver (Henry Tseng) - Switch policy_is_shared() in cpufreq to using cpumask_nth() instead of cpumask_weight() because the former is more efficient (Yury Norov) - Use sysfs_emit() in sysfs show functions for cpufreq governor attributes (Thorsten Blum) - Update intel_pstate to stop returning an error when "off" is written to its status sysfs attribute while the driver is already off (Fabio De Francesco) - Include current frequency in the debug message printed by __cpufreq_driver_target() (Pengjie Zhang) * pm-cpufreq: (38 commits) cpufreq/amd-pstate: Add POWER_SUPPLY select for dynamic EPP MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer cpufreq: Pass the policy 
to cpufreq_driver->adjust_perf() cpufreq/amd-pstate: Pass the policy to amd_pstate_update() cpufreq/amd-pstate-ut: Add a unit test for raw EPP cpufreq/amd-pstate: Add support for raw EPP writes cpufreq/amd-pstate: Add support for platform profile class cpufreq/amd-pstate: add kernel command line to override dynamic epp cpufreq/amd-pstate: Add dynamic energy performance preference Documentation: amd-pstate: fix dead links in the reference section cpufreq/amd-pstate: Cache the max frequency in cpudata Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count} Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file amd-pstate-ut: Add a testcase to validate the visibility of driver attributes amd-pstate-ut: Add module parameter to select testcases amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2() amd-pstate: Add sysfs support for floor_freq and floor_count amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF x86/cpufeatures: Add AMD CPPC Performance Priority feature. ...
2026-04-10 | xen/grant-table: guard gnttab_suspend/resume with CONFIG_HIBERNATE_CALLBACKS | Pengpeng Hou | 2 | -1/+14
In current linux.git, gnttab_suspend() and gnttab_resume() are defined and declared unconditionally. However, their only in-tree callers reside in drivers/xen/manage.c, which are guarded by CONFIG_HIBERNATE_CALLBACKS. Match the helper scope to their callers by wrapping the definitions in CONFIG_HIBERNATE_CALLBACKS and providing no-op stubs in the header. This fixes the config-scope mismatch and reduces the code footprint when hibernation callbacks are disabled. Signed-off-by: Pengpeng Hou <pengpeng.hou@isrc.iscas.ac.cn> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260310080800.742223-1-pengpeng.hou@isrc.iscas.ac.cn>
2026-04-10 | hvc/xen: Check console connection flag | Jason Andryuk | 2 | -0/+16
When the console out buffer is filled, __write_console() will return 0 as it cannot send any data. domU_write_console() will then spin in `while (len)` as len doesn't decrement until xenconsoled attaches. This would block a domU and nullify the parallelism of Hyperlaunch until dom0 userspace starts xenconsoled, which empties the buffer. Xen 4.21 added a connection field to the xen console page. This is set to XENCONSOLE_DISCONNECTED (1) when a domain is built, and xenconsoled will set it to XENCONSOLE_CONNECTED (0) when it connects. Update the hvc_xen driver to check the field. When the field is disconnected, drop the write with -ENOTCONN. We only drop the write when the field is XENCONSOLE_DISCONNECTED (1) to try for maximum compatibility. The Xen toolstack has historically zero-initialized the console, so it should see XENCONSOLE_CONNECTED (0) by default. If an implementation used uninitialized memory, only checking for XENCONSOLE_DISCONNECTED could have the lowest chance of not connecting. This lets the hyperlaunched domU boot without stalling. Once dom0 starts xenconsoled, xl console can be used to access the domU's hvc0. Partially sync console.h from xen.git to bring in the new field. Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260318235326.14568-1-jason.andryuk@amd.com>
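The compatibility rule described above (drop only on an explicit XENCONSOLE_DISCONNECTED, treat every other value as connected) can be sketched as follows. `struct fake_console_page` and `fake_write_console()` are hypothetical. Only the two constant values come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define XENCONSOLE_CONNECTED    0
#define XENCONSOLE_DISCONNECTED 1

/* Hypothetical model of the shared console page. */
struct fake_console_page {
    uint8_t connection; /* connection field described in the commit */
    size_t  out_free;   /* free bytes left in the output ring */
};

/* Returns the number of bytes queued, or -ENOTCONN when the backend has
 * explicitly marked the console as disconnected. */
static long fake_write_console(struct fake_console_page *pg, size_t len)
{
    assert(pg != NULL);

    /* Only the explicit "disconnected" value drops the write; any other
     * value (including garbage) is treated as connected. */
    if (pg->connection == XENCONSOLE_DISCONNECTED)
        return -ENOTCONN;

    size_t n = len < pg->out_free ? len : pg->out_free;
    pg->out_free -= n;
    return (long)n;
}
```

A caller looping on the remaining length would break out on the negative return instead of spinning until the backend attaches.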
2026-04-10 | xen/swiotlb: fix stale reference to swiotlb_unmap_page() | Kexin Sun | 1 | -1/+1
Commit af85de5a9f00 ("xen: swiotlb: Switch to physical address mapping callbacks") renamed xen_swiotlb_unmap_page() to xen_swiotlb_unmap_phys(). The comment in xen_swiotlb_unmap_sg() had already been missing the xen_ prefix (reading swiotlb_unmap_page()), and the rename only changed _page to _phys without correcting this, leaving it as swiotlb_unmap_phys(). Fix the reference to use the correct function name xen_swiotlb_unmap_phys(). Assisted-by: unnamed:deepseek-v3.2 coccinelle Signed-off-by: Kexin Sun <kexinsun@smail.nju.edu.cn> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260321110039.8905-1-kexinsun@smail.nju.edu.cn>
2026-04-10 | xen/manage: unwind partial shutdown watcher setup on error | GuoHan Zhao | 1 | -3/+17
setup_shutdown_watcher() registers shutdown_watch first, then the sysrq watch, and finally publishes the supported feature-* nodes in xenstore. If sysrq watch registration fails, or xenbus_printf() fails after one or more feature nodes were created, the function returns immediately without undoing the earlier setup. This leaves the system in a partially initialized state, with registered watches and/or stale xenstore entries despite the function reporting failure. Unwind the partial setup before returning an error by unregistering any watches that were already registered and removing feature nodes that were already published. Signed-off-by: GuoHan Zhao <zhaoguohan@kylinos.cn> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260407022443.12971-1-zhaoguohan@kylinos.cn>
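The unwind order described above follows the usual goto-based cleanup idiom. The sketch below models it in user space with failure injection. `struct fake_state`, `fake_register()`, and `fake_setup()` are hypothetical, and the count of three feature nodes is an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_FEATURES 3 /* assumption: arbitrary number of feature nodes */

/* Hypothetical model of what setup leaves behind. */
struct fake_state {
    bool shutdown_watch; /* first watch registered */
    bool sysrq_watch;    /* second watch registered */
    int  features;       /* feature nodes published so far */
};

static int fake_register(bool *slot, bool fail)
{
    if (fail)
        return -ENOMEM;
    *slot = true;
    return 0;
}

/* fail_feature_at < 0 means no feature publication fails. */
static int fake_setup(struct fake_state *st, bool fail_sysrq,
                      int fail_feature_at)
{
    int err;

    assert(st != NULL);

    err = fake_register(&st->shutdown_watch, false);
    if (err)
        return err;

    err = fake_register(&st->sysrq_watch, fail_sysrq);
    if (err)
        goto err_unreg_shutdown;

    for (int i = 0; i < NR_FEATURES; i++) {
        if (i == fail_feature_at) {
            err = -EIO;
            goto err_remove_features;
        }
        st->features++;
    }
    return 0;

err_remove_features:
    st->features = 0;        /* remove nodes already published */
    st->sysrq_watch = false; /* unregister the second watch */
err_unreg_shutdown:
    st->shutdown_watch = false;
    return err;
}
```

Each label undoes exactly the steps that succeeded before the jump, so a failed call leaves the state as it was on entry.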
2026-04-10 | selftests/sched_ext: Fix wrong DSQ ID in peek_dsq error message | fangqiurong | 1 | -1/+1
The error path after scx_bpf_create_dsq(real_dsq_id, ...) was reporting test_dsq_id instead of real_dsq_id in the error message, which would mislead debugging. Signed-off-by: fangqiurong <fangqiurong@kylinos.cn> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-10 | erofs: error out obviously illegal extents in advance | Gao Xiang | 2 | -10/+15
Detect some corrupted extent cases during metadata parsing rather than letting them result in harmless decompression failures later: - For full-reference compressed extents, the compressed size must not exceed the decompressed size, which is a strict on-disk layout constraint; - For plain (shifted/interlaced) extents, the decoded size must not exceed the encoded size, even accounting for partial decoding. Both ways work but it should be better to report illegal extents as metadata layout violations rather than deferring as decompression failure. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2026-04-10 | erofs: clean up encoded map flags | Gao Xiang | 4 | -35/+33
- Remove EROFS_MAP_ENCODED since it was always set together with EROFS_MAP_MAPPED for compressed extents and checked redundantly; - Replace the EROFS_MAP_FULL_MAPPED flag with the opposite EROFS_MAP_PARTIAL_MAPPED flag so that extents are implicitly fully mapped initially to simplify the logic; - Make fragment extents independent of EROFS_MAP_MAPPED since they are not directly allocated on disk; thus fragment extents are no longer twisted with mapped extents. Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2026-04-10 | ARM: xen: validate hypervisor compatible before parsing its version | Pengpeng Hou | 1 | -4/+6
fdt_find_hyper_node() reads the raw compatible property and then derives hyper_node.version from a prefix match before later printing it with %s. Flat DT properties are external boot input, and this path does not prove that the first compatible entry is NUL-terminated within the returned property length. Keep the existing flat-DT lookup path, but verify that the first compatible entry terminates within the returned property length before deriving the version suffix from it. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260405094005.5-arm-xen-v2-pengpeng@iscas.ac.cn>
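The bounded-termination check can be sketched as below. `hyper_version()` is a hypothetical helper, and the prefix is passed in by the caller rather than taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the version suffix that follows `prefix` in the first
 * compatible entry, or NULL if the entry is not NUL-terminated within `len`
 * bytes or does not start with the prefix. */
static const char *hyper_version(const char *prop, size_t len,
                                 const char *prefix)
{
    assert(prefix != NULL);

    size_t plen = strlen(prefix);

    if (prop == NULL || len == 0)
        return NULL;
    /* The first entry must terminate inside the property. */
    if (memchr(prop, '\0', len) == NULL)
        return NULL;
    /* Safe now: strncmp() stops at the terminator found above. */
    if (strncmp(prop, prefix, plen) != 0)
        return NULL;
    return prop + plen;
}
```

Once termination is proven, printing the suffix with `%s` can no longer run past the end of the property.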
2026-04-10 | sched_ext: Documentation: improve accuracy of task lifecycle pseudo-code | Kuba Piecuch | 1 | -7/+36
* Add ops.quiescent() and ops.runnable() to the sched_change path. When a queued task has one of its scheduling properties changed (e.g. nice, affinity), it goes through dequeue() -> quiescent() -> (property change callback, e.g. ops.set_weight()) -> runnable() -> enqueue(). * Change && to || in ops.enqueue() condition. We want to enqueue tasks that have a non-zero slice and are not in any DSQ. * Call ops.dispatch() and ops.dequeue() only for tasks that have had ops.enqueue() called. This is to account for tasks direct-dispatched from ops.select_cpu(). * Add a note explaining that the pseudo-code provides a simplified view of the task lifecycle and list some examples of cases that the pseudo-code does not account for. Fixes: a4f61f0a1afd ("sched_ext: Documentation: Add ops.dequeue() to task lifecycle") Signed-off-by: Kuba Piecuch <jpiecuch@google.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-10 | cgroup/rdma: fix swapped arguments in pr_warn() format string | cuitao | 1 | -1/+1
The format string says "device %p ... rdma cgroup %p" but the arguments were passed as (cg, device), printing them in the wrong order. Signed-off-by: cuitao <cuitao@kylinos.cn> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-10 | mmc: sdhci-msm: Fix the wrapped key handling | Neeraj Soni | 1 | -5/+0
Inline Crypto Engine (ICE) supports wrapped key generation. While registering crypto profile the supported key types are queried from ICE driver. So the explicit check for RAW key is not needed. Fixes: fd78e2b582a0 ("mmc: sdhci-msm: Add support for wrapped keys") Signed-off-by: Neeraj Soni <neeraj.soni@oss.qualcomm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2026-04-10 | um: Disable GCOV_PROFILE_ALL on 32-bit UML with Clang 20/21 | Kees Cook | 1 | -1/+3
Clang 20 and 21 miscompute __builtin_object_size() when -fprofile-arcs is active on 32-bit UML targets, which passes incorrect object size calculations for local variables through always_inline copy_to_user() and check_copy_size(), causing spurious compile-time errors: include/linux/ucopysize.h:52:4: error: call to '__bad_copy_from' declared with 'error' attribute: copy source size is too small The regression was introduced in LLVM commit 02b8ee281947 ("[llvm] Improve llvm.objectsize computation by computing GEP, alloca and malloc parameters bound"), which shipped in Clang 20. It was fixed in LLVM by commit 45b697e610fd ("[MemoryBuiltins] Consider index type size when aggregating gep offsets"), which was backported to the LLVM 22.x release branch. The bug requires 32-bit UML + GCOV_PROFILE_ALL (which uses -fprofile-arcs), though the exact trigger depends on optimizer decisions influenced by other enabled configs. Prevent the bad combination by disabling UML's ARCH_HAS_GCOV_PROFILE_ALL on 32-bit when using Clang 20.x or 21.x. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604030531.O6FveVgn-lkp@intel.com/ Suggested-by: Nathan Chancellor <nathan@kernel.org> Assisted-by: Claude:claude-opus-4-6[1m] Signed-off-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/20260409052038.make.995-kees@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-04-10 | gpio: tegra: return -ENOMEM on allocation failure in probe | Samasth Norway Ananda | 1 | -1/+1
devm_kzalloc() failure in tegra_gpio_probe() returns -ENODEV, which indicates "no such device". The correct error code for a memory allocation failure is -ENOMEM. Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com> Link: https://patch.msgid.link/20260409185853.2163034-1-samasth.norway.ananda@oracle.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2026-04-10 | Merge tag 'drm-misc-fixes-2026-04-09' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes | Dave Airlie | 4 | -8/+16
Several fixes for v3d about memory leak, runtime PM, and locking, and a Kconfig improvement for ethosu. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20260409-omniscient-tomato-coucal-edbadc@penduick
2026-04-10 | tools: ynl: tests: fix leading space on Makefile target | Hangbin Liu | 1 | -1/+1
The ../generated/protos.a rule had a spurious leading space before the target name. In make, target rules must start at column 0; only recipe lines are indented with a tab. The extra space caused make to misparse the rule. Remove the leading space to match the style of the adjacent ../lib/ynl.a rule. Fixes: e0aa0c61758f ("tools: ynl: move samples to tests") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260408-ynl_makefile-v1-1-f9624acc2ad9@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | selftests: net: py: explicitly forbid multiple ksft_run() calls | Jakub Kicinski | 1 | -1/+4
People (do people still write code or is it all AI?) seem to not get that ksft_run() can only be called once. If we call it multiple times KTAP parsers will likely cut off after the first batch has finished. Link: https://patch.msgid.link/20260408221952.819822-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | ipv6: sit: remove redundant ret = 0 assignment | Yue Haibing | 1 | -1/+1
The variable ret is assigned a value everywhere it is used, so there is no need to initialize it where it is defined. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Link: https://patch.msgid.link/20260408032051.3096449-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | eth: fbnic: Use wake instead of start | Mohsin Bashir | 1 | -1/+1
fbnic_up() calls netif_tx_start_all_queues(), which only clears __QUEUE_STATE_DRV_XOFF. If qdisc backlog has accumulated on any TX queue before the reconfiguration (e.g. ring resize via ethtool -G), start does not call __netif_schedule() to kick the qdisc, so the pending backlog is never drained and the queue stalls. Switch to netif_tx_wake_all_queues(), which clears DRV_XOFF and also calls __netif_schedule() on every queue, ensuring any backlog that built up before the down/up cycle is promptly dequeued. Fixes: bc6107771bb4 ("eth: fbnic: Allocate a netdevice and napi vectors with queues") Signed-off-by: Mohsin Bashir <hmohsin@meta.com> Link: https://patch.msgid.link/20260408002415.2963915-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
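The difference between "start" and "wake" can be illustrated with a toy model of a single TX queue. All names here (`struct fake_txq`, `fake_tx_start()`, `fake_tx_wake()`, `fake_run_scheduled()`) are hypothetical. The model only captures the behaviour described in the commit message: start clears the stop flag, while wake also schedules the qdisc so the backlog is drained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one TX queue and its qdisc. */
struct fake_txq {
    bool drv_xoff;  /* driver has stopped the queue */
    int  backlog;   /* packets waiting in the qdisc */
    bool scheduled; /* qdisc run has been requested */
};

/* "start": only clears the stop flag. */
static void fake_tx_start(struct fake_txq *q)
{
    assert(q != NULL);
    q->drv_xoff = false;
}

/* "wake": clears the stop flag and also requests a qdisc run. */
static void fake_tx_wake(struct fake_txq *q)
{
    assert(q != NULL);
    if (q->drv_xoff) {
        q->drv_xoff = false;
        q->scheduled = true;
    }
}

/* Stand-in for the deferred qdisc run: drains the backlog if one was
 * requested and the queue is not stopped. */
static void fake_run_scheduled(struct fake_txq *q)
{
    assert(q != NULL);
    if (q->scheduled && !q->drv_xoff) {
        q->backlog = 0;
        q->scheduled = false;
    }
}
```

In this model a queue restarted with `fake_tx_start()` keeps its backlog until something else schedules it, which is the stall the commit describes.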
2026-04-10 | net: airoha: Add dma_rmb() and READ_ONCE() in airoha_qdma_rx_process() | Lorenzo Bianconi | 1 | -6/+10
Add missing dma_rmb() in airoha_qdma_rx_process routine to make sure the DMA read operations are completed when the NIC reports the processing on the current descriptor is done. Moreover, add missing READ_ONCE() in airoha_qdma_rx_process() for DMA descriptor control fields in order to avoid any compiler reordering. Fixes: 23020f0493270 ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260407-airoha_qdma_rx_process-fix-reordering-v3-1-91c36e9da31f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
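The ordering this fix is after (read the control word exactly once, check the done bit, apply a read barrier, and only then read the rest of the descriptor) can be sketched in user-space C. This is a loose analogy only: the volatile read stands in for READ_ONCE() and the C11 acquire fence stands in for dma_rmb(); neither is equivalent to the kernel primitive. `struct fake_desc`, `DESC_DONE`, and `fake_rx_poll()` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_DONE 0x1u /* hypothetical "device finished" bit */

/* Hypothetical RX descriptor written by the device. */
struct fake_desc {
    volatile uint32_t ctrl; /* control word, includes DESC_DONE */
    uint32_t len;           /* payload length, valid once DONE is set */
};

/* Returns true and fills *len_out when the descriptor is complete. */
static bool fake_rx_poll(const struct fake_desc *d, uint32_t *len_out)
{
    assert(d != NULL && len_out != NULL);

    /* Single read of the control word (stand-in for READ_ONCE()). */
    uint32_t ctrl = d->ctrl;

    if (!(ctrl & DESC_DONE))
        return false;

    /* Read barrier (stand-in for dma_rmb()): the fields below must not
     * be read before the DONE bit has been observed. */
    atomic_thread_fence(memory_order_acquire);

    *len_out = d->len;
    return true;
}
```

Without the barrier, a CPU could fetch `len` before the DONE bit is seen and act on a stale value.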
2026-04-10 | net: txgbe: fix RTNL assertion warning when remove module | Jiawen Wu | 1 | -0/+2
For the copper NIC with external PHY, the driver called phylink_connect_phy() during probe and phylink_disconnect_phy() during remove. It caused an RTNL assertion warning in phylink_disconnect_phy() upon module remove. To fix this, add rtnl_lock() and rtnl_unlock() around the phylink_disconnect_phy() in remove function. ------------[ cut here ]------------ RTNL: assertion failed at drivers/net/phy/phylink.c (2351) WARNING: drivers/net/phy/phylink.c:2351 at phylink_disconnect_phy+0xd8/0xf0 [phylink], CPU#0: rmmod/4464 Modules linked in: ... CPU: 0 UID: 0 PID: 4464 Comm: rmmod Kdump: loaded Not tainted 7.0.0-rc4+ Hardware name: Micro-Star International Co., Ltd. MS-7E16/X670E GAMING PLUS WIFI (MS-7E16), BIOS 1.90 12/31/2024 RIP: 0010:phylink_disconnect_phy+0xe4/0xf0 [phylink] Code: 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 f6 31 ff e9 3a 38 8f e7 48 8d 3d 48 87 e2 ff ba 2f 09 00 00 48 c7 c6 c1 22 24 c0 <67> 48 0f b9 3a e9 34 ff ff ff 66 90 90 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffce7288363ac0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff89654b2a1a00 RCX: 0000000000000000 RDX: 000000000000092f RSI: ffffffffc02422c1 RDI: ffffffffc0239020 RBP: ffffce7288363ae8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8964c4022000 R13: ffff89654fce3028 R14: ffff89654ebb4000 R15: ffffffffc0226348 FS: 0000795e80d93780(0000) GS:ffff896c52857000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005b528b592000 CR3: 0000000170d0f000 CR4: 0000000000f50ef0 PKRU: 55555554 Call Trace: <TASK> txgbe_remove_phy+0xbb/0xd0 [txgbe] txgbe_remove+0x4c/0xb0 [txgbe] pci_device_remove+0x41/0xb0 device_remove+0x43/0x80 device_release_driver_internal+0x206/0x270 driver_detach+0x4a/0xa0 bus_remove_driver+0x83/0x120 driver_unregister+0x2f/0x60 pci_unregister_driver+0x40/0x90 txgbe_driver_exit+0x10/0x850 [txgbe] __do_sys_delete_module.isra.0+0x1c3/0x2f0 __x64_sys_delete_module+0x12/0x20 x64_sys_call+0x20c3/0x2390 
do_syscall_64+0x11c/0x1500 ? srso_alias_return_thunk+0x5/0xfbef5 ? do_syscall_64+0x15a/0x1500 ? srso_alias_return_thunk+0x5/0xfbef5 ? do_fault+0x312/0x580 ? srso_alias_return_thunk+0x5/0xfbef5 ? __handle_mm_fault+0x9d5/0x1040 ? srso_alias_return_thunk+0x5/0xfbef5 ? count_memcg_events+0x101/0x1d0 ? srso_alias_return_thunk+0x5/0xfbef5 ? handle_mm_fault+0x1e8/0x2f0 ? srso_alias_return_thunk+0x5/0xfbef5 ? do_user_addr_fault+0x2f8/0x820 ? srso_alias_return_thunk+0x5/0xfbef5 ? irqentry_exit+0xb2/0x600 ? srso_alias_return_thunk+0x5/0xfbef5 ? exc_page_fault+0x92/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: 02b2a6f91b90 ("net: txgbe: support copper NIC with external PHY") Cc: stable@vger.kernel.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/8B47A5872884147D+20260407094041.4646-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | Merge branch 'net-bcmgenet-fix-queue-lock-up' | Jakub Kicinski | 1 | -16/+14
Justin Chen says: ==================== net: bcmgenet: fix queue lock up We have been seeing reports of logs like this. [ 41.761198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 10039 ms [ 43.745198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 12023 ms [ 45.729198] bcmgenet 1001300000.ethernet eth0: NETDEV WATCHDOG: CPU: 0: transmit queue 2 timed out 14007 ms We have two issues: the persistent queue timeouts and the eventual lock up of the entire transmit. We address the lock up issue first. The queue timeouts are due to a fundamental design issue, not a bug per se. Timeouts still persist, but we should no longer lock up. ==================== Link: https://patch.msgid.link/20260406175756.134567-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | net: bcmgenet: fix racing timeout handler | Justin Chen | 1 | -13/+9
The bcmgenet_timeout handler tries to take down all tx queues when a single queue times out. This is overzealous and causes many race conditions with queues that are still chugging along. Instead, let's only restart the timed-out queue. Fixes: 13ea657806cf ("net: bcmgenet: improve TX timeout") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Tested-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260406175756.134567-4-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | net: bcmgenet: fix leaking free_bds | Justin Chen | 1 | -0/+2
While reclaiming the tx queue we fast forward the write pointer to drop any data in flight. These dropped frames are not added back to the pool of free bds. We also need to tell the netdev that we are dropping said data. Fixes: f1bacae8b655 ("net: bcmgenet: support reclaiming unsent Tx packets") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Tested-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260406175756.134567-3-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | net: bcmgenet: fix off-by-one in bcmgenet_put_txcb | Justin Chen | 1 | -3/+3
The write_ptr points to the next open tx_cb. We want to return the tx_cb that gets rewound, so we must rewind the pointer first, then return the tx_cb that it points to. That way the txcb can be correctly cleaned up. Fixes: 876dbadd53a7 ("net: bcmgenet: Fix unmapping of fragments in bcmgenet_xmit()") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260406175756.134567-2-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
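The "rewind first, then return" ordering can be sketched on a small ring. `struct fake_ring`, `RING_SIZE`, and `fake_put_txcb()` are hypothetical, and the ring is simplified to start at index 0, which the real driver's rings need not do.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 4 /* assumption: tiny ring for illustration */

/* Hypothetical TX ring: write_ptr indexes the NEXT open control block. */
struct fake_ring {
    int          cbs[RING_SIZE];
    unsigned int write_ptr;
};

/* Give back the most recently claimed control block. The pointer is
 * rewound BEFORE it is used to pick the block, so the returned block is
 * the one that was actually claimed, not the next open one. */
static int *fake_put_txcb(struct fake_ring *r)
{
    assert(r != NULL);

    if (r->write_ptr == 0)
        r->write_ptr = RING_SIZE - 1; /* wrap around to the end */
    else
        r->write_ptr--;

    return &r->cbs[r->write_ptr];
}
```

Picking the block before rewinding would instead hand back the slot one past the last claimed entry, which is the off-by-one the title refers to.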
2026-04-10 | net: macb: Use napi_schedule_irqoff() in IRQ handler | Kevin Hao | 1 | -2/+2
For non-PREEMPT_RT kernels, the IRQ handler runs with interrupts disabled, allowing the use of napi_schedule_irqoff() to save a pair of local_irq_{save,restore} operations. For PREEMPT_RT kernels, napi_schedule_irqoff() behaves identically to napi_schedule(). Signed-off-by: Kevin Hao <haokexin@gmail.com> Link: https://patch.msgid.link/20260407-macb-napi-irqoff-v1-1-61bec60047d7@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | ppp: consolidate refcount decrements | Qingfang Deng | 1 | -33/+28
ppp_destroy_{channel,interface} are always called after refcount_dec_and_test(). To reduce boilerplate code, consolidate the decrements by moving them into the two functions. To reflect this change in semantics, rename the functions to ppp_release_*. Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev> Link: https://patch.msgid.link/20260407094058.257246-1-qingfang.deng@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
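Folding the decrement into the release function can be sketched with C11 atomics. `struct fake_chan` and `fake_release_channel()` are hypothetical, and `atomic_fetch_sub()` is used here as a user-space stand-in for the kernel's refcount_dec_and_test().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object. */
struct fake_chan {
    atomic_int refcnt;
    bool       destroyed; /* set when teardown has run */
};

/* Drop one reference and tear the object down if it was the last one.
 * Callers no longer open-code the "decrement and test" step themselves. */
static void fake_release_channel(struct fake_chan *c)
{
    assert(c != NULL);

    /* atomic_fetch_sub() returns the previous value, so 1 means this
     * call dropped the final reference. */
    if (atomic_fetch_sub(&c->refcnt, 1) == 1)
        c->destroyed = true; /* stand-in for the real teardown */
}
```

Every call site shrinks to a single `fake_release_channel(c)` call, which is the boilerplate reduction the commit describes.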
2026-04-10 | net: phy: realtek: Add property to enable SSC | Marek Vasut | 1 | -0/+127
Add support for spread spectrum clocking (SSC) on RTL8211F(D)(I)-CG, RTL8211FS(I)(-VS)-CG, RTL8211FG(I)(-VS)-CG PHYs. The implementation follows EMI improvement application note Rev. 1.2 for these PHYs. The current implementation enables SSC for both RXC and SYSCLK clock signals. Introduce DT properties 'realtek,clkout-ssc-enable', 'realtek,rxc-ssc-enable' and 'realtek,sysclk-ssc-enable' which control CLKOUT, RXC and SYSCLK SSC spread spectrum clocking enablement on these signals. Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Link: https://patch.msgid.link/20260405233008.148974-3-marek.vasut@mailbox.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | dt-bindings: net: realtek,rtl82xx: Document realtek,*-ssc-enable property | Marek Vasut | 1 | -0/+15
Document support for spread spectrum clocking (SSC) on RTL8211F(D)(I)-CG, RTL8211FS(I)(-VS)-CG, RTL8211FG(I)(-VS)-CG PHYs. Introduce DT properties 'realtek,clkout-ssc-enable', 'realtek,rxc-ssc-enable' and 'realtek,sysclk-ssc-enable' which control whether SSC is enabled on the CLKOUT, RXC and SYSCLK signals. These clocks are not exposed via the clock API, therefore the assigned-clock-sscs property does not apply. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Link: https://patch.msgid.link/20260405233008.148974-2-marek.vasut@mailbox.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | dt-bindings: net: realtek,rtl82xx: Keep property list sorted | Marek Vasut | 1 | -4/+4
Sort the documented properties alphabetically, no functional change. Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Link: https://patch.msgid.link/20260405233008.148974-1-marek.vasut@mailbox.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10 | Merge branch 'macsec-add-support-for-vlan-filtering-in-offload-mode' | Jakub Kicinski | 9 | -128/+489
Cosmin Ratiu says: ==================== macsec: Add support for VLAN filtering in offload mode This short series adds support for VLANs in MACsec devices when offload mode is enabled. This allows VLAN netdevs on top of MACsec netdevs to function, which accidentally used to be the case in the past, but was broken. This series adds back proper support. As part of this, the existing nsim-only MACsec offload tests were translated to Python so they can run against real HW and new traffic-based tests were added for VLAN filter propagation, since there's currently no uAPI to check VLAN filters. ==================== Link: https://patch.msgid.link/20260408115240.1636047-1-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10macsec: Support VLAN-filtering lower devicesCosmin Ratiu1-8/+63
VLAN-filtering is done through two netdev features (NETIF_F_HW_VLAN_CTAG_FILTER and NETIF_F_HW_VLAN_STAG_FILTER) and two netdev ops (ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid). Implement these and advertise the features if the lower device supports them. This allows proper VLAN filtering to work on top of MACsec devices, when the lower device is capable of VLAN filtering. As a concrete example, having this chain of interfaces now works: vlan_filtering_capable_dev(1) -> macsec_dev(2) -> macsec_vlan_dev(3). Before commit 0349659fd72f ("macsec: set IFF_UNICAST_FLT priv flag") this used to work by accident because the MACsec device (and thus the lower device) was put in promiscuous mode and the VLAN filter was not used. But after that commit correctly made the macsec driver expose the IFF_UNICAST_FLT flag, promiscuous mode was no longer used and VLAN filters on dev 1 kicked in. Without support in dev 2 for propagating VLAN filters down, the register_vlan_dev -> vlan_vid_add -> __vlan_vid_add -> vlan_add_rx_filter_info call from dev 3 is silently eaten (because vlan_hw_filter_capable returns false and vlan_add_rx_filter_info silently succeeds). For MACsec, VLAN filters are only relevant for offload; otherwise the VLANs are encrypted and the lower devices don't care about them. So VLAN filters are only passed on to lower devices in offload mode. Flipping between offload modes now needs to offload/unoffload the filters with vlan_{get,drop}_rx_*_filter_info(). To avoid the back-and-forth filter updating during rollback, the setting of macsec->offload is moved after the add/del secy ops. This is safe since none of the code called from those requires macsec->offload. In case adding the filters fails, the added ones are rolled back and an error is returned to the operation toggling the offload state.
Fixes: 0349659fd72f ("macsec: set IFF_UNICAST_FLT priv flag") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260408115240.1636047-5-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
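The propagation behaviour described above can be sketched with a toy userspace model. This is only an illustration under stated assumptions, not the driver code: the `Dev` class, its fields, and `make_chain()` are hypothetical names, and `vlan_vid_add()` here only mimics the "silently succeeds when the feature bit is missing" behaviour that the commit message attributes to the kernel helper.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Dev:
    """Toy stand-in for a netdev; every field name here is hypothetical."""
    name: str
    hw_vlan_filter: bool = False      # models NETIF_F_HW_VLAN_*_FILTER
    lower: Optional["Dev"] = None     # device this one is stacked on
    offload: bool = False             # models MACsec offload mode
    vids: Set[int] = field(default_factory=set)

    def rx_add_vid(self, vid: int) -> int:
        """Models ndo_vlan_rx_add_vid: record the VID, pass it down only when offloading."""
        self.vids.add(vid)
        if self.lower is not None and self.offload:
            return vlan_vid_add(self.lower, vid)
        return 0


def vlan_vid_add(dev: Dev, vid: int) -> int:
    """Models the core helper: a device without the feature bit silently succeeds."""
    if not dev.hw_vlan_filter:
        return 0                      # filter is dropped, no error is reported
    return dev.rx_add_vid(vid)


def make_chain(macsec_propagates: bool, offload: bool):
    """Build dev1 (filtering NIC) -> dev2 (MACsec); dev3 is the caller of vlan_vid_add()."""
    nic = Dev("dev1", hw_vlan_filter=True)
    macsec = Dev(
        "dev2",
        # The MACsec device advertises the feature only if the lower device has it.
        hw_vlan_filter=macsec_propagates and nic.hw_vlan_filter,
        lower=nic,
        offload=offload,
    )
    return nic, macsec
```

With `macsec_propagates=False` the VID never reaches `dev1`, which corresponds to the broken state; with propagation and offload enabled it does, and in software mode it is intentionally kept away from the lower device.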
2026-04-10selftests: Add MACsec VLAN propagation traffic testCosmin Ratiu3-0/+151
Add VLAN filter propagation tests through offloaded MACsec devices via actual traffic. The tests create MACsec tunnels with matching SAs on both endpoints, stack VLANs on top, and verify connectivity with ping.

Covered:
- Offloaded MACsec with VLAN (filters propagate to HW)
- Software MACsec with VLAN (no HW filter propagation)
- Offload on/off toggle and verifying traffic still works

On netdevsim this makes use of the VLAN filter debugfs file to actually validate that filters are applied/removed correctly. On real hardware the traffic should validate actual VLAN filter propagation. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260408115240.1636047-4-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10nsim: Add support for VLAN filtersCosmin Ratiu2-2/+71
Add support for storing the list of VLANs in nsim devices, together with ops for adding/removing them and a debug file to show them. This will be used in upcoming tests. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260408115240.1636047-3-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10selftests: Migrate nsim-only MACsec tests to PythonCosmin Ratiu5-118/+204
Move MACsec offload API and ethtool feature tests from tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh to tools/testing/selftests/drivers/net/macsec.py using the NetDrvEnv framework so tests can run against both netdevsim (default) and real hardware (NETIF=ethX). As some real hardware requires MACsec to use encryption, add that to the tests. Netdevsim-specific limit checks (max SecY, max RX SC) were moved into separate test cases to avoid failures on real hardware. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20260408115240.1636047-2-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10ipv6: move IFA_F_PERMANENT percpu allocation in process scopePaolo Abeni1-12/+19
Observed at boot time:

 CPU: 43 UID: 0 PID: 3595 Comm: (t-daemon) Not tainted 6.12.0 #1
 Call Trace:
  <TASK>
  dump_stack_lvl+0x4e/0x70
  pcpu_alloc_noprof.cold+0x1f/0x4b
  fib_nh_common_init+0x4c/0x110
  fib6_nh_init+0x387/0x740
  ip6_route_info_create+0x46d/0x640
  addrconf_f6i_alloc+0x13b/0x180
  addrconf_permanent_addr+0xd0/0x220
  addrconf_notify+0x93/0x540
  notifier_call_chain+0x5a/0xd0
  __dev_notify_flags+0x5c/0xf0
  dev_change_flags+0x54/0x70
  do_setlink+0x36c/0xce0
  rtnl_setlink+0x11f/0x1d0
  rtnetlink_rcv_msg+0x142/0x3f0
  netlink_rcv_skb+0x50/0x100
  netlink_unicast+0x242/0x390
  netlink_sendmsg+0x21b/0x470
  __sys_sendto+0x1dc/0x1f0
  __x64_sys_sendto+0x24/0x30
  do_syscall_64+0x7d/0x160
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
 RIP: 0033:0x7f5c3852f127
 Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 80 3d 85 ef 0c 00 00 41 89 ca 74 10 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 71 c3 55 48 83 ec 30 44 89 4c 24 2c 4c 89 44
 RSP: 002b:00007ffe86caf4c8 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
 RAX: ffffffffffffffda RBX: 0000556c5cd93210 RCX: 00007f5c3852f127
 RDX: 0000000000000020 RSI: 0000556c5cd938b0 RDI: 0000000000000003
 RBP: 00007ffe86caf5a0 R08: 00007ffe86caf4e0 R09: 0000000000000080
 R10: 0000000000000000 R11: 0000000000000202 R12: 0000556c5cd932d0
 R13: 00000000021d05d1 R14: 00000000021d05d1 R15: 0000000000000001

IFA_F_PERMANENT addresses require the allocation of a bunch of percpu pointers, currently in atomic scope. Similar to commit 51454ea42c1a ("ipv6: fix locking issues with loops over idev->addr_list"), move fixup_permanent_addr() outside the &idev->lock scope, and do the allocations with GFP_KERNEL. With such change fixup_permanent_addr() is invoked with the BH enabled, and the ifp lock acquired there needs the BH variant.

Note that we don't need to acquire a reference to the permanent addresses before releasing the mentioned write lock, because addrconf_permanent_addr() runs under RTNL and ifa removal always happens under RTNL, too.
Also the PERMANENT flag is constant in the relevant scope, as it can be cleared only by inet6_addr_modify() under the RTNL lock. Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/46a7a030727e236af2dc7752994cd4f04f4a91d2.1775658924.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
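The general shape of "select entries under the lock, do the blocking work after dropping it" can be sketched in userspace as follows. This is a generic illustration, not the patch: `Idev`, `fixup_permanent()`, and the dictionary-based address records are hypothetical, and a `threading.Lock` merely stands in for `idev->lock`. The sketch relies on the same kind of assumption the commit states, namely that an outer serialization (RTNL in the kernel) keeps the selected entries alive once the inner lock is released.

```python
import threading


class Idev:
    """Hypothetical stand-in for an inet6 device with a locked address list."""

    def __init__(self, addrs):
        self.lock = threading.Lock()      # plays the role of idev->lock
        self.addr_list = list(addrs)


def fixup_permanent(idev, fixup):
    """Pick permanent addresses under the lock, then run fixup() without it.

    fixup() may block (think of a GFP_KERNEL allocation), so it must not be
    called while idev.lock is held.
    """
    with idev.lock:
        pending = [a for a in idev.addr_list if a["permanent"]]

    # The lock is released here; an outer lock (RTNL in the kernel case) is
    # assumed to prevent concurrent removal of the selected entries.
    for addr in pending:
        fixup(addr)
    return pending
```

A caller can confirm that the callback never runs with the lock held by checking `idev.lock.locked()` from inside `fixup`.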
2026-04-10net: use get_random_u{16,32,64}() where appropriateDavid Carlier7-10/+10
Use the typed random integer helpers instead of get_random_bytes() when filling a single integer variable. The helpers return the value directly, require no pointer or size argument, and better express intent. Skipped sites writing into __be16 (netdevsim) and __le64 (ceph) fields where a direct assignment would trigger sparse endianness warnings. Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260407150758.5889-1-devnexen@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10jbd2: fix deadlock in jbd2_journal_cancel_revoke()Zhang Yi1-3/+5
Commit f76d4c28a46a ("fs/jbd2: use sleeping version of __find_get_block()") changed jbd2_journal_cancel_revoke() to use __find_get_block_nonatomic() which holds the folio lock instead of i_private_lock. This breaks the lock ordering (folio -> buffer) and causes an ABBA deadlock when the filesystem blocksize < pagesize:

 T1                                      T2
 ext4_mkdir()
  ext4_init_new_dir()
   ext4_append()
    ext4_getblk()
     lock_buffer()    <- A
                                         sync_blockdev()
                                          blkdev_writepages()
                                           writeback_iter()
                                            writeback_get_folio()
                                             folio_lock()   <- B
    ext4_journal_get_create_access()
     jbd2_journal_cancel_revoke()
      __find_get_block_nonatomic()
       folio_lock()   <- B
                                           block_write_full_folio()
                                            lock_buffer()   <- A

This can occasionally cause generic/013 to hang. Fix by only calling __find_get_block_nonatomic() when the passed buffer_head doesn't belong to the bdev, which is the only case that we need to look up its bdev alias. Otherwise, the lookup is redundant since the found buffer_head is equal to the one we passed in.

Fixes: f76d4c28a46a ("fs/jbd2: use sleeping version of __find_get_block()") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20260409114204.917154-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
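The ABBA pattern in this report can be checked mechanically from per-thread lock acquisition orders. The helper below is a minimal sketch for illustration only: `find_inversions()` and the trace format are hypothetical, it is not how the kernel's lockdep works internally, and it assumes every lock listed for a thread is still held when the later ones are taken.

```python
def find_inversions(traces):
    """Return lock pairs that two threads acquire in opposite orders.

    traces maps a thread name to the list of lock names it takes, in order,
    while still holding the earlier ones.
    """
    order = {}
    for thread, locks in traces.items():
        for i, first in enumerate(locks):
            for second in locks[i + 1:]:
                # Record "first was held when second was taken" for this thread.
                order.setdefault((first, second), set()).add(thread)
    # A pair is an inversion when both (a, b) and (b, a) were observed.
    return sorted((a, b) for (a, b) in order if (b, a) in order and a < b)


# Before the fix: T1 takes buffer then folio, T2 takes folio then buffer.
before = {"T1": ["buffer", "folio"], "T2": ["folio", "buffer"]}
# After the fix: T1 skips the folio-locking lookup for a bdev buffer_head.
after = {"T1": ["buffer"], "T2": ["folio", "buffer"]}
```

Calling `find_inversions(before)` reports the buffer/folio pair, while `find_inversions(after)` reports nothing, matching the idea that removing the redundant lookup removes the second half of the cycle.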