kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
2026-04-07	selftests/nolibc: explicitly handle ENOSYS from ptrace()	Thomas Weißschuh	1	-1/+1
	The automatic ENOSYS handling in EXPECT_SYSER() is about to be removed. ptrace() will legitimately return ENOSYS on qemu-user, so handle it explicitly. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260406-nolibc-no-skip-enosys-v1-1-c046b1ac7d73@weissschuh.net/
2026-04-07	tools/nolibc: add byteorder conversions	Thomas Weißschuh	5	-0/+70
	Add some standard functions to convert between different byte orders. Conveniently the UAPI headers provide all the necessary functionality. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260405-nolibc-bswap-v1-1-f7699ca9cee0@weissschuh.net
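As a rough illustration of what byte order conversion involves, here is a minimal userspace sketch in C. The names sketch_bswap32() and sketch_htonl() are hypothetical and are not taken from the nolibc patch, which relies on the UAPI headers instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reverse the four bytes of a 32-bit value. */
static uint32_t sketch_bswap32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/*
 * Hypothetical helper: convert a host value to big-endian (network)
 * order by storing its bytes most significant first. This works the
 * same way on little-endian and big-endian hosts.
 */
static uint32_t sketch_htonl(uint32_t host)
{
	unsigned char bytes[4] = {
		(unsigned char)(host >> 24), (unsigned char)(host >> 16),
		(unsigned char)(host >> 8),  (unsigned char)host
	};
	uint32_t net;

	memcpy(&net, bytes, sizeof(net));
	return net;
}

/* Usage example: swapping twice is the identity. */
static void byteorder_selfcheck(void)
{
	assert(sketch_bswap32(0x11223344u) == 0x44332211u);
	assert(sketch_bswap32(sketch_bswap32(0xdeadbeefu)) == 0xdeadbeefu);
}
```

Building the network-order value byte by byte avoids any compile-time endianness test, at the cost of a few extra shifts.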
2026-04-07	tools/nolibc: add the _syscall() macro	Thomas Weißschuh	2	-1/+6
	The standard syscall() function or macro uses the libc return value convention. Errors returned from the kernel as negative values are stored in errno and -1 is returned. Users who want to avoid using errno don't have a way to call raw syscalls and check the returned error. Add a new macro _syscall() which works like the standard syscall() but passes through the return value from the kernel unchanged. The naming scheme and return values match the named _sys_foo() system call wrappers already part of nolibc. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260405-nolibc-syscall-v1-3-e5b12bc63211@weissschuh.net
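The difference between the two return value conventions described above can be sketched in plain C. This is a simplified userspace model, not the nolibc code: fake_kernel_call() and libc_style() are hypothetical names standing in for a raw system call and for the errno conversion step.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for a raw kernel entry point: like the kernel,
 * it reports failure as a negative errno value in the return value.
 */
static long fake_kernel_call(int should_fail)
{
	return should_fail ? -ENOSYS : 42;
}

/* libc convention: store the error in errno and return -1. */
static long libc_style(long ret)
{
	if (ret < 0) {
		errno = (int)-ret;
		return -1;
	}
	return ret;
}

static void conventions_selfcheck(void)
{
	/* Raw convention: the error code is the return value itself. */
	assert(fake_kernel_call(1) == -ENOSYS);

	/* libc convention: the same failure becomes -1 plus errno. */
	errno = 0;
	assert(libc_style(fake_kernel_call(1)) == -1);
	assert(errno == ENOSYS);

	/* Success passes through unchanged in both conventions. */
	assert(libc_style(fake_kernel_call(0)) == 42);
}
```

A caller that wants to avoid errno checks the raw value directly, which is the use case the commit message gives for _syscall().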
2026-04-07	tools/nolibc: move the call to __sysret() into syscall()	Thomas Weißschuh	1	-2/+2
	__sysret() transforms the return value from the kernel into the libc return value convention. There is no reason for it to be called in the middle of the internals of the syscall() implementation macros. Move the call up, directly into syscall(), to make the code simpler. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260405-nolibc-syscall-v1-2-e5b12bc63211@weissschuh.net
2026-04-07	tools/nolibc: rename the internal macros used in syscall()	Thomas Weißschuh	1	-5/+5
	These macros are the internal implementation of syscall(). They can not be used by users. Align them with the standard naming scheme for internal symbols. The current name also prevents the addition of an application-usable _syscall() symbol. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu> Link: https://patch.msgid.link/20260405-nolibc-syscall-v1-1-e5b12bc63211@weissschuh.net
2026-04-07	sched: Use u64 for bandwidth ratio calculations	Joseph Salisbury	3	-3/+3
	to_ratio() computes BW_SHIFT-scaled bandwidth ratios from u64 period and runtime values, but it returns unsigned long. tg_rt_schedulable() also stores the current group limit and the accumulated child sum in unsigned long. On 32-bit builds, large bandwidth ratios can be truncated and the RT group sum can wrap when enough siblings are present. That can let an overcommitted RT hierarchy pass the schedulability check, and it also narrows the helper result for other callers. Return u64 from to_ratio() and use u64 for the RT group totals so bandwidth ratios are preserved and compared at full width on both 32-bit and 64-bit builds. Fixes: b40b2e8eb521 ("sched: rt: multi level group constraints") Assisted-by: Codex:GPT-5 Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260403210014.2713404-1-joseph.salisbury@oracle.com
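The truncation and wrap-around described above can be reproduced in userspace. The sketch below is an illustration under stated assumptions: SKETCH_BW_SHIFT is an assumed fixed-point shift of 20, uint32_t models unsigned long on a 32-bit build, and ratio_u64() and ratio_u32() are hypothetical names rather than the scheduler's helpers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BW_SHIFT 20	/* assumed fixed-point shift, for illustration */

/* Full-width ratio: (runtime << shift) / period, kept in 64 bits. */
static uint64_t ratio_u64(uint64_t period, uint64_t runtime)
{
	return (runtime << SKETCH_BW_SHIFT) / period;
}

/* Same computation narrowed to 32 bits, modelling a 32-bit unsigned long. */
static uint32_t ratio_u32(uint64_t period, uint64_t runtime)
{
	return (uint32_t)ratio_u64(period, runtime);
}

static void ratio_selfcheck(void)
{
	uint64_t period = 1000;
	uint64_t runtime = 4096 * period;	/* ratio is exactly 2^32 */
	uint32_t sum32 = 0;
	uint64_t sum64 = 0;

	/* A large ratio survives in 64 bits but truncates to 0 in 32 bits. */
	assert(ratio_u64(period, runtime) == (UINT64_C(1) << 32));
	assert(ratio_u32(period, runtime) == 0);

	/* 4096 siblings at ratio 1.0 (2^20 each) wrap a 32-bit sum to 0. */
	for (int i = 0; i < 4096; i++) {
		sum32 += ratio_u32(period, period);
		sum64 += ratio_u64(period, period);
	}
	assert(sum32 == 0);
	assert(sum64 == (UINT64_C(1) << 32));
}
```

In both cases the 32-bit result looks like "no bandwidth used", which is how an overcommitted hierarchy could pass a comparison against the group limit.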
2026-04-07	Merge tag 'fpga-for-7.1-rc1' of ↵	Greg Kroah-Hartman	2	-2/+2
	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/fpga/linux-fpga into char-misc-next Xu writes: FPGA Manager changes for 7.1-rc1 - Dinh & Yury's changes to use sysfs_emit() for sysfs read All patches have been reviewed on the mailing list, and have been in the last linux-next releases (as part of our for-next branch). Sorry for the late post due to personal affairs. Signed-off-by: Xu Yilun <yilun.xu@intel.com> * tag 'fpga-for-7.1-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/fpga/linux-fpga: fpga: m10bmc-sec: switch show_canceled_csk() to using sysfs_emit() fpga: bridge: Use sysfs_emit() instead of sprintf()
2026-04-07	md/raid5: fix soft lockup in retry_aligned_read()	Chia-Ming Chang	1	-1/+7
	When retry_aligned_read() encounters an overlapped stripe, it releases the stripe via raid5_release_stripe() which puts it on the lockless released_stripes llist. In the next raid5d loop iteration, release_stripe_list() drains the stripe onto handle_list (since STRIPE_HANDLE is set by the original IO), but retry_aligned_read() runs before handle_active_stripes() and removes the stripe from handle_list via find_get_stripe() -> list_del_init(). This prevents handle_stripe() from ever processing the stripe to resolve the overlap, causing an infinite loop and soft lockup. Fix this by using __release_stripe() with temp_inactive_list instead of raid5_release_stripe() in the failure path, so the stripe does not go through the released_stripes llist. This allows raid5d to break out of its loop, and the overlap will be resolved when the stripe is eventually processed by handle_stripe(). Fixes: 773ca82fa1ee ("raid5: make release_stripe lockless") Cc: stable@vger.kernel.org Signed-off-by: FengWei Shih <dannyshih@synology.com> Signed-off-by: Chia-Ming Chang <chiamingc@synology.com> Link: https://lore.kernel.org/linux-raid/20260402061406.455755-1-chiamingc@synology.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2026-04-07	USB: serial: option: add Telit Cinterion FN990A MBIM composition	Fabio Porcedda	1	-0/+2
	Add the following Telit Cinterion FN990A MBIM composition:
	0x1074: MBIM + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (diag) + DPL (Data Packet Logging) + adb
	T: Bus=01 Lev=01 Prnt=04 Port=06 Cnt=01 Dev#= 7 Spd=480 MxCh= 0
	D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1
	P: Vendor=1bc7 ProdID=1074 Rev=05.04
	S: Manufacturer=Telit Wireless Solutions
	S: Product=FN990
	S: SerialNumber=70628d0c
	C: #Ifs= 8 Cfg#= 1 Atr=e0 MxPwr=500mA
	I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
	E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=32ms
	I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
	E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
	E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
	I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
	E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
	I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
	E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
	I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option
	E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	I: If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none)
	E: Ad=8f(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	I: If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
	E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
	Cc: stable@vger.kernel.org Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org>
2026-04-07	perf/x86/intel/uncore: Remove extra double quote mark	Zide Chen	1	-24/+24
	The third argument in INTEL_UNCORE_FR_EVENT_DESC() is subject to __stringify(), and the extra double quote marks can result in the expansion "3.814697266e-6" in the sysfs knobs, instead of 3.814697266e-6. This is incorrect, though it may still work for perf, e.g. perf stat -e uncore_iio_free_running_0/bw_in_port0/ Fixes: d8987048f665 ("perf/x86/intel/uncore: Support IIO free-running counters on DMR") Closes: https://lore.kernel.org/all/20251231224233.113839-1-zide.chen@intel.com/ Reported-by: Chun-Tse Shao <ctshao@google.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chun-Tse Shao <ctshao@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://patch.msgid.link/20260313174050.171704-5-zide.chen@intel.com
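The effect of stringifying an argument that is already a string literal can be shown with a small standalone C sketch. SKETCH_STR() is a hypothetical two-level stringification macro written for this example; it follows the usual pattern but is not the kernel's __stringify() definition.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification, the usual pattern behind such macros. */
#define SKETCH_STR_1(...) #__VA_ARGS__
#define SKETCH_STR(...)   SKETCH_STR_1(__VA_ARGS__)

/* Argument passed with its own quotes: the quotes end up in the result. */
static const char *scale_quoted(void)
{
	return SKETCH_STR("3.814697266e-6");
}

/* Argument passed bare: the result is the plain number text. */
static const char *scale_unquoted(void)
{
	return SKETCH_STR(3.814697266e-6);
}

static void stringify_selfcheck(void)
{
	assert(strcmp(scale_unquoted(), "3.814697266e-6") == 0);
	assert(strcmp(scale_quoted(), "\"3.814697266e-6\"") == 0);
}
```

The quoted variant is two characters longer because the stringification operator escapes and keeps the embedded double quotes.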
2026-04-07	perf/x86/intel/uncore: Fix die ID init and look up bugs	Zide Chen	2	-7/+7
	In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path, uncore_device_to_die() may return -1 when all CPUs associated with the UBOX device are offline. Remove the WARN_ON_ONCE(die_id == -1) check for two reasons: - The current code breaks out of the loop. This is incorrect because pci_get_device() does not guarantee iteration in domain or bus order, so additional UBOX devices may be skipped during the scan. - Returning -EINVAL is incorrect, since marking offline buses with die_id == -1 is expected and should not be treated as an error. Separately, when NUMA is disabled on a NUMA-capable platform, pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die() to return -1 for all PCI devices. As a result, spr_update_device_location(), used on Intel SPR and EMR, ignores the corresponding PMON units and does not add them to the RB tree. Fix this by using uncore_pcibus_to_dieid(), which retrieves topology from the UBOX GIDNIDMAP register and works regardless of whether NUMA is enabled in Linux. This requires snbep_pci2phy_map_init() to be added in spr_uncore_pci_init(). Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where NUMA is expected to be enabled. Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info") Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR") Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Steve Wahl <steve.wahl@hpe.com> Link: https://patch.msgid.link/20260313174050.171704-4-zide.chen@intel.com
2026-04-07	perf/x86/intel/uncore: Skip discovery table for offline dies	Zide Chen	1	-1/+1
	This warning can be triggered if NUMA is disabled and the system boots with fewer CPUs than the number of CPUs in die 0. WARNING: CPU: 9 PID: 7257 at uncore.c:1157 uncore_pci_pmu_register+0x136/0x160 [intel_uncore] Currently, the discovery table continues to be parsed even if all CPUs in the associated die are offline. This can lead to an array overflow at "pmu->boxes[die] = box" in uncore_pci_pmu_register(), which may trigger the warning above or cause other issues. Fixes: edae1f06c2cd ("perf/x86/intel/uncore: Parse uncore discovery tables") Reported-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Tested-by: Steve Wahl <steve.wahl@hpe.com> Link: https://patch.msgid.link/20260313174050.171704-3-zide.chen@intel.com
2026-04-07	perf/x86/intel/uncore: Fix iounmap() leak on global_init failure	Zide Chen	1	-5/+10
	Kernel test robot reported: Unverified Error/Warning (likely false positive, kindly check if interested): arch/x86/events/intel/uncore_discovery.c:293:2-8: ERROR: missing iounmap; ioremap on line 288 and execution via conditional on line 292 If domain->global_init() fails in __parse_discovery_table(), the ioremap'ed MMIO region is not released before returning, resulting in an MMIO mapping leak. Fixes: b575fc0e3357 ("perf/x86/intel/uncore: Add domain global init callback") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Zide Chen <zide.chen@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Link: https://patch.msgid.link/20260313174050.171704-2-zide.chen@intel.com
2026-04-07	pinctrl: qcom: add sdm670 lpi tlmm	Richard Acayan	3	-0/+177
	The Snapdragon 670 has a Low-Power Island (LPI) TLMM for configuring pins related to audio. Add the driver for this. Signed-off-by: Richard Acayan <mailingradian@gmail.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-04-07	dt-bindings: pinctrl: qcom: Add SDM670 LPASS LPI pinctrl	Richard Acayan	1	-0/+81
	Add the pin controller for the audio Low-Power Island (LPI) on SDM670. Signed-off-by: Richard Acayan <mailingradian@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-04-07	dt-bindings: qcom: lpass-lpi-common: add reserved GPIOs property	Richard Acayan	1	-0/+8
	There can be reserved GPIOs on the LPASS LPI pin controller to possibly control sensors. Add the property for reserved GPIOs so they can be avoided appropriately. Adapted from the same entry in qcom,tlmm-common.yaml. Signed-off-by: Richard Acayan <mailingradian@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-04-07	thunderbolt: tunnel: Simplify allocation	Rosen Penev	2	-10/+5
	Use a flexible array member and kzalloc_flex to combine allocations. Add __counted_by for extra runtime analysis. Move counting variable assignment after allocation. kzalloc_flex with GCC >= 15 does this automatically. Signed-off-by: Rosen Penev <rosenp@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
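The idea of folding a separate array allocation into the owning structure can be sketched in userspace C. This is an analogy using calloc(), not the kernel helper: struct sketch_list and sketch_list_alloc() are hypothetical, and the __counted_by annotation is only mentioned in a comment because it is compiler specific.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_list {
	size_t count;	/* number of entries in items[] */
	int items[];	/* flexible array member, must be last;
			 * in the kernel this is where a __counted_by(count)
			 * annotation would go */
};

/* One zeroed allocation covers both the header and the trailing array. */
static struct sketch_list *sketch_list_alloc(size_t count)
{
	struct sketch_list *l;

	/* Reject sizes whose byte count would overflow size_t. */
	if (count > (SIZE_MAX - sizeof(*l)) / sizeof(l->items[0]))
		return NULL;

	l = calloc(1, sizeof(*l) + count * sizeof(l->items[0]));
	if (!l)
		return NULL;

	/* The counter is assigned right after allocation, before any access. */
	l->count = count;
	return l;
}

static void flex_selfcheck(void)
{
	struct sketch_list *l = sketch_list_alloc(4);

	assert(l != NULL);
	assert(l->count == 4);
	assert(l->items[3] == 0);
	free(l);
}
```

A single allocation also means a single free, which removes one error path compared with allocating the array separately.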
2026-04-07	Merge tag 'intel-pinctrl-v7.0-2' of ↵	Linus Walleij	2	-10/+27
	git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/intel into fixes intel-pinctrl for v7.0-2 * Fix 1kOhm, debounce, and PWM capability support * Add support for new PAD_OWN layout Signed-off-by: Linus Walleij <linusw@kernel.org>
2026-04-07	Input: aiptek - validate raw macro indices before updating state	Pengpeng Hou	1	-4/+9
	aiptek_irq() derives macro key indices directly from tablet reports and then uses them to index macroKeyEvents[]. Report types 4 and 5 also save the derived value in aiptek->lastMacro and later use that state to release the previous key. Validate the raw macro index once before it enters that state machine, so lastMacro only ever stores an in-range macro key. Keep direct bounds checks for report type 6, which reads the macro number from the packet body and uses it immediately. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Link: https://patch.msgid.link/20260329001711.88076-1-pengpeng@iscas.ac.cn [dtor: fix macro fallback in report 5s to use -1] Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2026-04-07	Input: gf2k - skip invalid hat lookup values	Pengpeng Hou	1	-2/+4
	gf2k_read() decodes the hat position from a 4-bit field and uses it directly to index gf2k_hat_to_axis[]. The lookup table only has nine entries, so malformed packets can read past the end of the fixed table. Skip hat reporting when the decoded value falls outside the lookup table instead of forcing it to the neutral position. This keeps the fix local and avoids reporting a made-up axis state for malformed packets. Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn> Link: https://patch.msgid.link/20260407120001.1-gf2k-v2-pengpeng@iscas.ac.cn Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
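A minimal sketch of the "skip instead of clamp" approach, in standalone C. The table contents and the names hat_to_axis and decode_hat() are hypothetical and chosen for illustration; only the shape of the check (a 4-bit index against a nine-entry table) follows the description above.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical nine-entry table: neutral plus eight directions, as {x, y}. */
static const int hat_to_axis[9][2] = {
	{ 0, 0 }, { 0, -1 }, { 1, -1 }, { 1, 0 }, { 1, 1 },
	{ 0, 1 }, { -1, 1 }, { -1, 0 }, { -1, -1 },
};

/*
 * Returns 1 and fills x/y when the decoded hat value is in range.
 * Returns 0 and leaves x/y untouched otherwise, so the caller reports
 * nothing rather than a made-up neutral position.
 */
static int decode_hat(unsigned int raw, int *x, int *y)
{
	unsigned int hat = raw & 0xf;	/* 4-bit field: values 0..15 */

	if (hat >= ARRAY_LEN(hat_to_axis))
		return 0;

	*x = hat_to_axis[hat][0];
	*y = hat_to_axis[hat][1];
	return 1;
}

static void hat_selfcheck(void)
{
	int x = 7, y = 7;

	assert(decode_hat(3, &x, &y) == 1 && x == 1 && y == 0);
	assert(decode_hat(9, &x, &y) == 0 && x == 1 && y == 0);
}
```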
2026-04-07	md: wake raid456 reshape waiters before suspend	Yu Kuai	1	-0/+11
	During raid456 reshape, direct IO across the reshape position can sleep in raid5_make_request() waiting for reshape progress while still holding an active_io reference. If userspace then freezes reshape and writes md/suspend_lo or md/suspend_hi, mddev_suspend() kills active_io and waits for all in-flight IO to drain. This can deadlock: the IO needs reshape progress to continue, but the reshape thread is already frozen, so the active_io reference is never dropped and suspend never completes. raid5_prepare_suspend() already wakes wait_for_reshape for dm-raid. Do the same for normal md suspend when reshape is already interrupted, so waiting raid456 IO can abort, drop its reference, and let suspend finish. The mdadm test tests/25raid456-reshape-deadlock reproduces the hang. Fixes: 714d20150ed8 ("md: add new helpers to suspend/resume array") Link: https://lore.kernel.org/linux-raid/20260327140729.2030564-1-yukuai@fnnas.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2026-04-07	md/raid1: serialize overlap io for writemostly disk	Xiao Ni	3	-14/+39
	Previously, using wait_event() would wake up all waiters simultaneously, and they would compete for the tree lock. The bio which gets the lock first will be handled, so the write sequence cannot be guaranteed. For example: bio1(100,200) bio2(150,200) bio3(150,300) The write sequence of fast device is bio1,bio2,bio3. But the write sequence of slow device could be bio1,bio3,bio2 due to lock competition. This causes data corruption. Replace waitqueue with a fifo list to guarantee the write sequence. And it also needs to iterate the list when removing one entry. If not, it may miss the opportunity to wake up the waiting io. For example: bio1(1,3), bio2(2,4) bio3(5,7), bio4(6,8) These four bios are in the same bucket. bio1 and bio3 are inserted into the rbtree. bio2 and bio4 are added to the waiting list and bio2 is the first one. bio3 returns from slow disk and tries to wake up the waiting bios. bio2 is removed from the list and will be handled. But bio1 hasn't finished. So bio2 will be added into waiting list again. Then bio1 returns from slow disk and wakes up waiting bios. bio4 is removed from the list and will be handled. Now bio1, bio3 and bio4 all finish and bio2 is left on the waiting list. So it needs to iterate the waiting list to wake up the right bio. Signed-off-by: Xiao Ni <xni@redhat.com> Link: https://lore.kernel.org/linux-raid/20260324072501.59865-1-xni@redhat.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2026-04-07	md/md-llbitmap: optimize initial sync with write_zeroes_unmap support	Yu Kuai	1	-1/+61
	For RAID-456 arrays with llbitmap, if all underlying disks support write_zeroes with unmap, issue write_zeroes to zero all disk data regions and initialize the bitmap to BitCleanUnwritten instead of BitUnwritten. This optimization skips the initial XOR parity building because: 1. write_zeroes with unmap guarantees zeroed reads after the operation 2. For RAID-456, when all data is zero, parity is automatically consistent (0 XOR 0 XOR ... = 0) 3. BitCleanUnwritten indicates parity is valid but no user data has been written The implementation adds two helper functions: - llbitmap_all_disks_support_wzeroes_unmap(): Checks if all active disks support write_zeroes with unmap - llbitmap_zero_all_disks(): Issues blkdev_issue_zeroout() to each rdev's data region to zero all disks The zeroing and bitmap state setting happens in llbitmap_init_state() during bitmap initialization. If any disk fails to zero, we fall back to BitUnwritten and normal lazy recovery. This significantly reduces array initialization time for RAID-456 arrays built on modern NVMe SSDs or other devices that support write_zeroes with unmap. Reviewed-by: Xiao Ni <xni@redhat.com> Link: https://lore.kernel.org/linux-raid/20260323054644.3351791-4-yukuai@fnnas.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
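The claim that all-zero data implies consistent XOR parity can be checked with a few lines of C. This sketch only models single XOR parity over in-memory buffers; the sizes and the names xor_parity() and parity_is_consistent() are hypothetical and unrelated to the md code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NDATA 3	/* data chunks per stripe, chosen for illustration */
#define SKETCH_CHUNK 8	/* bytes per chunk, chosen for illustration */

/* Compute XOR parity across all data chunks of one stripe. */
static void xor_parity(unsigned char data[SKETCH_NDATA][SKETCH_CHUNK],
		       unsigned char parity[SKETCH_CHUNK])
{
	memset(parity, 0, SKETCH_CHUNK);
	for (size_t d = 0; d < SKETCH_NDATA; d++)
		for (size_t i = 0; i < SKETCH_CHUNK; i++)
			parity[i] ^= data[d][i];
}

/* Returns 1 when the stored parity already matches the data. */
static int parity_is_consistent(unsigned char data[SKETCH_NDATA][SKETCH_CHUNK],
				unsigned char stored[SKETCH_CHUNK])
{
	unsigned char expected[SKETCH_CHUNK];

	xor_parity(data, expected);
	return memcmp(expected, stored, SKETCH_CHUNK) == 0;
}

static void parity_selfcheck(void)
{
	unsigned char data[SKETCH_NDATA][SKETCH_CHUNK] = { { 0 } };
	unsigned char stored[SKETCH_CHUNK] = { 0 };

	/* Zeroed data and a zeroed parity chunk: nothing to rebuild. */
	assert(parity_is_consistent(data, stored) == 1);
}
```

Once any data byte changes without a parity update, the check fails, which is the situation a normal resync exists to repair.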
2026-04-07	md/md-llbitmap: add CleanUnwritten state for RAID-5 proactive parity building	Yu Kuai	1	-12/+128
	Add new states to the llbitmap state machine to support proactive XOR parity building for RAID-5 arrays. This allows users to pre-build parity data for unwritten regions before any user data is written. New states added: - BitNeedSyncUnwritten: Transitional state when proactive sync is triggered via sysfs on Unwritten regions. - BitSyncingUnwritten: Proactive sync in progress for unwritten region. - BitCleanUnwritten: XOR parity has been pre-built, but no user data written yet. When user writes to this region, it transitions to BitDirty. New actions added: - BitmapActionProactiveSync: Trigger for proactive XOR parity building. - BitmapActionClearUnwritten: Convert CleanUnwritten/NeedSyncUnwritten/ SyncingUnwritten states back to Unwritten before recovery starts. State flows: - Current (lazy): Unwritten -> (write) -> NeedSync -> (sync) -> Dirty -> Clean - New (proactive): Unwritten -> (sysfs) -> NeedSyncUnwritten -> (sync) -> CleanUnwritten - On write to CleanUnwritten: CleanUnwritten -> (write) -> Dirty -> Clean - On disk replacement: CleanUnwritten regions are converted to Unwritten before recovery starts, so recovery only rebuilds regions with user data A new sysfs interface is added at /sys/block/mdX/md/llbitmap/proactive_sync (write-only) to trigger proactive sync. This only works for RAID-456 arrays. Link: https://lore.kernel.org/linux-raid/20260323054644.3351791-3-yukuai@fnnas.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2026-04-07	md: add fallback to correct bitmap_ops on version mismatch	Yu Kuai	1	-1/+110
	If the default bitmap version and the on-disk version don't match, and mdadm is not the latest version to set bitmap_type, set bitmap_ops based on the disk version. Link: https://lore.kernel.org/linux-raid/20260323054644.3351791-2-yukuai@fnnas.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2026-04-07	md/raid5: validate payload size before accessing journal metadata	Junrui Luo	1	-15/+33
	r5c_recovery_analyze_meta_block() and r5l_recovery_verify_data_checksum_for_mb() iterate over payloads in a journal metadata block using on-disk payload size fields without validating them against the remaining space in the metadata block. A corrupted journal containing payload sizes extending beyond the PAGE_SIZE boundary can cause out-of-bounds reads when accessing payload fields or computing offsets. Add bounds validation for each payload type to ensure the full payload fits within meta_size before processing. Fixes: b4c625c67362 ("md/r5cache: r5cache recovery: part 1") Cc: stable@vger.kernel.org Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://lore.kernel.org/linux-raid/SYBPR01MB78815E78D829BB86CD7C8015AF5FA@SYBPR01MB7881.ausprd01.prod.outlook.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
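The general pattern of validating an untrusted size field before advancing through a buffer can be sketched as follows. This is a generic illustration, not the raid5 journal code: payload_fits() and walk_payloads() are hypothetical names, and the payloads are reduced to a plain array of sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if a payload of payload_size bytes starting at offset lies
 * entirely within meta_size. Written as a subtraction so that
 * offset + payload_size can never overflow.
 */
static int payload_fits(size_t offset, size_t payload_size, size_t meta_size)
{
	if (offset > meta_size)
		return 0;
	return payload_size <= meta_size - offset;
}

/*
 * Walk a list of untrusted payload sizes, starting after a header of
 * 'start' bytes. Returns 0 if every payload fits, -1 on the first one
 * that does not, before anything past the block would be read.
 */
static int walk_payloads(const uint32_t *sizes, size_t n,
			 size_t start, size_t meta_size)
{
	size_t offset = start;

	for (size_t i = 0; i < n; i++) {
		if (!payload_fits(offset, sizes[i], meta_size))
			return -1;
		offset += sizes[i];
	}
	return 0;
}

static void payload_selfcheck(void)
{
	const uint32_t good[] = { 16, 32 };
	const uint32_t bad[] = { 16, 4096 };

	assert(walk_payloads(good, 2, 32, 80) == 0);	/* ends exactly at 80 */
	assert(walk_payloads(bad, 2, 32, 80) == -1);
}
```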
2026-04-07	md: remove unused static md_wq workqueue	Abd-Alrhman Masalkhi	1	-8/+0
	The md_wq workqueue is defined as static and initialized in md_init(), but it is not used anywhere within md.c. All asynchronous and deferred work in this file is handled via md_misc_wq or dedicated md threads. Fixes: b75197e86e6d3 ("md: Remove flush handling") Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com> Link: https://lore.kernel.org/linux-raid/20260328193522.3624-1-abd.masalkhi@gmail.com/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2026-04-07	md/raid0: use kvzalloc/kvfree for strip_zone and devlist allocations	Gregory Price	1	-9/+9
	syzbot reported a WARNING at mm/page_alloc.c:__alloc_frozen_pages_noprof() triggered by create_strip_zones() in the RAID0 driver. When raid_disks is large, the allocation size exceeds MAX_PAGE_ORDER (4MB on x86), causing WARN_ON_ONCE_GFP(order > MAX_PAGE_ORDER). Convert the strip_zone and devlist allocations from kzalloc/kzalloc_objs to kvzalloc/kvzalloc_objs, which first attempts a contiguous allocation with __GFP_NOWARN and then falls back to vmalloc for large sizes. Convert the corresponding kfree calls to kvfree. Both arrays are pure metadata lookup tables (arrays of pointers and zone descriptors) accessed only via indexing, so they do not require physically contiguous memory. Reported-by: syzbot+924649752adf0d3ac9dd@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69adaba8.a00a0220.b130.0005.GAE@google.com/ Signed-off-by: Gregory Price <gourry@gourry.net> Reviewed-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Li Nan <linan122@huawei.com> Link: https://lore.kernel.org/linux-raid/20260308234202.3118119-1-gourry@gourry.net/ Signed-off-by: Yu Kuai <yukuai@fnnas.com>
2026-04-07	erofs: handle 48-bit blocks/uniaddr for extra devices	Zhan Xusheng	2	-4/+8
	erofs_init_device() only reads blocks_lo and uniaddr_lo from the on-disk device slot, ignoring blocks_hi and uniaddr_hi that were introduced alongside the 48-bit block addressing feature. For the primary device (dif0), erofs_read_superblock() already handles this correctly by combining blocks_lo with blocks_hi when 48-bit layout is enabled. But the same logic was not applied to extra devices. With a 48-bit EROFS image using extra devices whose uniaddr or blocks exceed 32-bit range, the truncated values cause erofs_map_dev() to compute wrong physical addresses, leading to silent data corruption. Fix this by reading blocks_hi and uniaddr_hi in erofs_init_device() when 48-bit layout is enabled, consistent with the primary device handling. Also fix the erofs_deviceslot on-disk definition where blocks_hi was incorrectly declared as __le32 instead of __le16. Fixes: 61ba89b57905 ("erofs: add 48-bit block addressing on-disk support") Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Zhan Xusheng <zhanxusheng@xiaomi.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
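Combining a 32-bit low part with a 16-bit high part into a 48-bit value can be sketched like this. The function name combine_48() and its flag parameter are hypothetical; the sketch also ignores the little-endian to CPU conversion that on-disk fields need.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a block count or address from its on-disk halves. Without the
 * 48-bit layout only the low 32 bits are meaningful.
 */
static uint64_t combine_48(uint32_t lo, uint16_t hi, int has_48bit)
{
	uint64_t v = lo;

	if (has_48bit)
		v |= (uint64_t)hi << 32;	/* widen before shifting */
	return v;
}

static void combine_selfcheck(void)
{
	assert(combine_48(0x89abcdefu, 0x0123, 1) == UINT64_C(0x012389abcdef));
	/* Ignoring the high half silently truncates the value. */
	assert(combine_48(0x89abcdefu, 0x0123, 0) == UINT64_C(0x89abcdef));
}
```

The cast before the shift matters: shifting a 16-bit or 32-bit value left by 32 without widening it first would be undefined behavior in C.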
2026-04-07	drbd: remove DRBD_GENLA_F_MANDATORY flag handling	Christoph Böhmwalder	7	-197/+114
	DRBD used a custom mechanism to mark netlink attributes as "mandatory": bit 14 of nla_type was repurposed as DRBD_GENLA_F_MANDATORY. Attributes sent from userspace that had this bit present and that were unknown to the kernel would lead to an error. Since commit ef6243acb478 ("genetlink: optionally validate strictly/dumps"), the generic netlink layer rejects unknown top-level attributes when strict validation is enabled. DRBD never opted out of strict validation, so unknown top-level attributes are already rejected by the netlink core. The mandatory flag mechanism was required for nested attributes, because these are parsed liberally, silently dropping attributes unknown to the kernel. This prepares for the move to a new YNL-based family, which will use the now-default strict parsing. The current family is not expected to gain any new attributes, which makes this change safe. Old userspace that still sets bit 14 is unaffected: nla_type() strips it before __nla_validate_parse() performs attribute validation, so the bit never reaches DRBD. Remove all references to the mandatory flag in DRBD. Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://patch.msgid.link/20260403132953.2248751-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
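Why old userspace that sets bit 14 is unaffected can be seen from how the attribute type is masked. The sketch below mirrors the flag layout of the netlink UAPI header (bit 15 for nested, bit 14 for network byte order) under SKETCH_ prefixed names; sketch_nla_type() is a hypothetical userspace stand-in for the kernel's nla_type().

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits carried in the upper part of the 16-bit attribute type. */
#define SKETCH_NLA_F_NESTED		(1u << 15)
#define SKETCH_NLA_F_NET_BYTEORDER	(1u << 14)
#define SKETCH_NLA_TYPE_MASK \
	(~(SKETCH_NLA_F_NESTED | SKETCH_NLA_F_NET_BYTEORDER))

/* Bit 14 as DRBD repurposed it, per the commit message above. */
#define SKETCH_DRBD_F_MANDATORY		(1u << 14)

/* Strip the flag bits and return the plain attribute type. */
static unsigned int sketch_nla_type(uint16_t raw)
{
	return raw & SKETCH_NLA_TYPE_MASK;
}

static void nla_selfcheck(void)
{
	/* With or without bit 14, the validated type is the same. */
	assert(sketch_nla_type(5) == 5);
	assert(sketch_nla_type((uint16_t)(SKETCH_DRBD_F_MANDATORY | 5)) == 5);
}
```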
2026-04-07	net/mlx5: Update the list of the PCI supported devices	Michael Guralnik	1	-0/+1
	Add the upcoming ConnectX-10 NVLink-C2C device ID to the table of supported PCI device IDs. Cc: stable@vger.kernel.org Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Reviewed-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260403091756.139583-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	Merge branch 'mptcp-support-msg_eor-and-small-cleanups'	Jakub Kicinski	5	-21/+44
	Matthieu Baerts says: ==================== mptcp: support MSG_EOR and small cleanups This series contains various unrelated patches: - Patches 1 & 2: support MSG_EOR instead of ignoring it. - Patch 3: avoid duplicated code in TCP and MPTCP by using a new helper. - Patch 4: adapt test to reproduce bug and increase code coverage. ==================== Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-0-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	selftests: mptcp: join: recreate signal endp with same ID	Matthieu Baerts (NGI0)	1	-2/+2
	In this "delete re-add signal" MPTCP Join subtest, the endpoint linked to the initial subflow is removed, but readded once with a different ID. It appears that there was an issue when reusing the same ID, recently fixed by commit d191101dee25 ("mptcp: pm: in-kernel: always set ID as avail when rm endp"). The test now reuses the same ID the first time, but continues to use another one (88) the second time. This should then cover more cases. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/615 Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-5-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	tcp: add recv_should_stop helper	Geliang Tang	3	-15/+13
	Factor out a new helper tcp_recv_should_stop() from tcp_recvmsg_locked() and tcp_splice_read() to check whether to stop receiving. And use this helper in mptcp_recvmsg() and mptcp_splice_read() to reduce redundant code. Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-3-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	mptcp: preserve MSG_EOR semantics in sendmsg path	Gang Yan	2	-4/+22
	Extend MPTCP's sendmsg handling to recognize and honor the MSG_EOR flag, which marks the end of a record for application-level message boundaries. Data fragments tagged with MSG_EOR are explicitly marked in the mptcp_data_frag structure and skb context to prevent unintended coalescing with subsequent data chunks. This ensures the intent of applications using MSG_EOR is preserved across MPTCP subflows, maintaining consistent message segmentation behavior. Signed-off-by: Gang Yan <yangang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-2-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	mptcp: reduce 'overhead' from u16 to u8	Gang Yan	2	-1/+8
	The 'overhead' in struct mptcp_data_frag can safely use u8, as it represents 'alignment + sizeof(mptcp_data_frag)'. With a maximum alignment of 7('ALIGN(1, sizeof(long)) - 1'), the overhead is at most 47, well below U8_MAX and validated with BUILD_BUG_ON(). This patch also adds a field named 'unused' for further extensions. Signed-off-by: Gang Yan <yangang@kylinos.cn> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260403-net-next-mptcp-msg_eor-misc-v1-1-b0b33bea3fed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	dpaa2: avoid linking objects into multiple modules	Arnd Bergmann	2	-4/+21
	Each object file contains information about which module it gets linked into, so linking the same file into multiple modules now causes a warning: scripts/Makefile.build:254: drivers/net/ethernet/freescale/dpaa2/Makefile: dpaa2-mac.o is added to multiple modules: fsl-dpaa2-eth fsl-dpaa2-switch scripts/Makefile.build:254: drivers/net/ethernet/freescale/dpaa2/Makefile: dpmac.o is added to multiple modules: fsl-dpaa2-eth fsl-dpaa2-switch Change the way that dpaa2 is built by moving the two common files into a separate module with exported symbols instead. Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260402184726.3746487-3-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	net: ethernet: ti-cpsw: fix linking built-in code to modules	Arnd Bergmann	6	-15/+139
	There are six variants of the cpsw driver, sharing various parts of the code: davinci-emac, cpsw, cpsw-switchdev, netcp, netcp_ethss and am65-cpsw-nuss. I noticed that this means some files can be linked into more than one loadable module, or even part of vmlinux but also linked into a loadable module, both of which mess up assumptions of the build system, and causes warnings: scripts/Makefile.build:279: cpsw_ale.o is added to multiple modules: ti-am65-cpsw-nuss ti_cpsw ti_cpsw_new scripts/Makefile.build:279: cpsw_priv.o is added to multiple modules: ti_cpsw ti_cpsw_new scripts/Makefile.build:279: cpsw_sl.o is added to multiple modules: ti-am65-cpsw-nuss ti_cpsw ti_cpsw_new scripts/Makefile.build:279: cpsw_ethtool.o is added to multiple modules: ti_cpsw ti_cpsw_new scripts/Makefile.build:279: davinci_cpdma.o is added to multiple modules: ti_cpsw ti_cpsw_new ti_davinci_emac Change this back to having separate modules for each portion that can be linked standalone, exporting symbols as needed: - ti-cpsw-common.ko now contains both cpsw-common.o and davinci_cpdma.o as they are always used together - ti-cpsw-priv.ko contains cpsw_priv.o, cpsw_sl.o and cpsw_ethtool.o, which are the core of the cpsw and cpsw-new drivers. - ti-cpsw-sl.ko contains the cpsw-sl.o object and is used on ti-am65-cpsw-nuss.ko in addition to the two other cpsw variants. - ti-cpsw-ale.o is the one standalone module that is used by all except davinci_emac. Each of these will be built-in if any of its users are built-in, otherwise it's a loadable module if there is at least one module using it. I did not bring back the separate Kconfig symbols for this, but just handle it using Makefile logic. Note: ideally this is something that Kbuild complains about, but usually we just notice when something using THIS_MODULE misbehaves in a way that a user notices. Fixes: 99f6297182729 ("net: ethernet: ti: cpsw: drop TI_DAVINCI_CPDMA config option") Link: https://lore.kernel.org/lkml/20240417084400.3034104-1-arnd@kernel.org/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260402184726.3746487-2-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	net: ethernet: ti-cpsw: rename soft_reset() function	Arnd Bergmann	4	-4/+4
	While looking at the global symbols shared between the cpsw drivers, I noticed that soft_reset() is the only one that is missing a proper namespace prefix, and will pollute the kernel namespace, so rename it to be consistent with the other symbols. Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260402184726.3746487-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	eth: remove the driver for acenic / tigon1&2	Jakub Kicinski	29	-4057/+0
	The entire git history for this driver looks like tree-wide and automated cleanups. There's even more coming now with AI, so let's try to delete it instead. Acked-by: Jes Sorensen <jes@trained-monkey.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20260403220501.2263835-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	net: macb: Use netif_napi_add_tx() instead of netif_napi_add() for TX NAPI	Kevin Hao	1	-1/+1
	The TX NAPI should be registered via netif_napi_add_tx() to avoid unnecessarily polluting the napi_hash table. Signed-off-by: Kevin Hao <haokexin@gmail.com> Link: https://patch.msgid.link/20260403-macb-napi-tx-v1-1-08126a60c65e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	Merge branch 'nfc-support-for-five-qualcomm-sdm845-phones'	Jakub Kicinski	1	-0/+1
	David Heidelberg says: ==================== NFC support for five Qualcomm SDM845 phones - OnePlus 6 / 6T - Pixel 3 / 3 XL - SHIFT 6MQ Verified with NFC card using neard: systemctl enable --now neard nfctool --device nfc0 -1 nfctool -d nfc0 -p gdbus introspect --system --dest org.neard --object-path /org/neard/nfc0/tag0/record0 or use gNFC: https://gitlab.gnome.org/dh/gnfc/ successfully detecting and reading a tag. ==================== Link: https://patch.msgid.link/20260403-oneplus-nfc-v3-0-fbdce57d63c1@ixit.cz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	dt-bindings: nfc: nxp,nci: Document PN557 compatible	David Heidelberg	1	-0/+1
	The PN557 uses the same hardware as the PN553 but ships with firmware compliant with NCI 2.0. Document PN557 as a compatible device. Signed-off-by: David Heidelberg <david@ixit.cz> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20260403-oneplus-nfc-v3-1-fbdce57d63c1@ixit.cz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	net: skb: fix cross-cache free of KFENCE-allocated skb head	Jiayuan Chen	1	-4/+1
	SKB_SMALL_HEAD_CACHE_SIZE is intentionally set to a non-power-of-2 value (e.g. 704 on x86_64) to avoid collisions with generic kmalloc bucket sizes. This ensures that skb_kfree_head() can reliably use skb_end_offset to distinguish skb heads allocated from skb_small_head_cache vs. generic kmalloc caches. However, when KFENCE is enabled, kfence_ksize() returns the exact requested allocation size instead of the slab bucket size. If a caller (e.g. bpf_test_init) allocates skb head data via kzalloc() and the requested size happens to equal SKB_SMALL_HEAD_CACHE_SIZE, then slab_build_skb() -> ksize() returns that exact value. After subtracting skb_shared_info overhead, skb_end_offset ends up matching SKB_SMALL_HEAD_HEADROOM, causing skb_kfree_head() to incorrectly free the object to skb_small_head_cache instead of back to the original kmalloc cache, resulting in a slab cross-cache free: kmem_cache_free(skbuff_small_head): Wrong slab cache. Expected skbuff_small_head but got kmalloc-1k Fix this by always calling kfree(head) in skb_kfree_head(). This keeps the free path generic and avoids allocator-specific misclassification for KFENCE objects. Fixes: bf9f1baa279f ("net: add dedicated kmem_cache for typical/small skb->head") Reported-by: Antonius <antonius@bluedragonsec.com> Closes: https://lore.kernel.org/netdev/CAK8a0jxC5L5N7hq-DT2_NhUyjBxrPocoiDazzsBk4TGgT1r4-A@mail.gmail.com/ Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260403014517.142550-1-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	vsock/test: fix send_buf()/recv_buf() EINTR handling	Stefano Garzarella	1	-2/+6
	When send() or recv() returns -1 with errno == EINTR, the code skips the break but still adds the return value to nwritten/nread, making it decrease by 1. This leads to wrong buffer offsets and a wrong byte count. Fix it by explicitly continuing the loop on EINTR, so the return value is only added when it is positive. Fixes: a8ed71a27ef5 ("vsock/test: add recv_buf() utility function") Fixes: 12329bd51fdc ("vsock/test: add send_buf() utility function") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Luigi Leonardi <leonardi@redhat.com> Link: https://patch.msgid.link/20260403093251.30662-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
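The loop shape described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the vsock test utilities: `send_all()` and `recv_all()` are hypothetical helpers, and their return convention (bytes transferred, or -1 on a real error) is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send up to len bytes, retrying on EINTR. */
static ssize_t send_all(int fd, const void *buf, size_t len, int flags)
{
	size_t nwritten = 0;

	assert(buf != NULL || len == 0);
	while (nwritten < len) {
		ssize_t ret = send(fd, (const char *)buf + nwritten,
				   len - nwritten, flags);

		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* retry; -1 is never added to nwritten */
			return -1;		/* real error */
		}
		if (ret == 0)
			break;			/* nothing sent, stop */
		nwritten += (size_t)ret;	/* only positive values are added */
	}
	return (ssize_t)nwritten;
}

/* Hypothetical helper: receive up to len bytes, retrying on EINTR. */
static ssize_t recv_all(int fd, void *buf, size_t len, int flags)
{
	size_t nread = 0;

	assert(buf != NULL || len == 0);
	while (nread < len) {
		ssize_t ret = recv(fd, (char *)buf + nread, len - nread, flags);

		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* retry; offset stays unchanged */
			return -1;		/* real error */
		}
		if (ret == 0)
			break;			/* peer closed the connection */
		nread += (size_t)ret;
	}
	return (ssize_t)nread;
}
```

Checking for EINTR before touching the counter is what keeps the buffer offset correct: the counter only ever moves forward by a positive return value.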
2026-04-07	Merge branch 'xsk-tailroom-reservation-and-mtu-validation'	Jakub Kicinski	10	-38/+150
	Maciej Fijalkowski says: ==================== xsk: tailroom reservation and MTU validation Here we fix a long-standing issue regarding the multi-buffer scenario in ZC mode: we have not been providing space at the end of the buffer where multi-buffer XDP works on skb_shared_info. This has been brought to our attention via [0]. Unaligned mode does not get any specific treatment; it is the user's responsibility to properly handle XSK addresses in queues. With the xskxceiver adjustments included in this set, I have been able to pass the full test suite on ice. [0]: https://community.intel.com/t5/Ethernet-Products/X710-XDP-Packet-Corruption-Issue-DRV-MODE-Zero-Copy-Multi-Buffer/m-p/1724208 ==================== Link: https://patch.msgid.link/20260402154958.562179-1-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	selftests: bpf: adjust rx_dropped xskxceiver's test to respect tailroom	Maciej Fijalkowski	1	-2/+4
	Since we have changed how big the user-defined headroom in umem can be, change the logic in testapp_stats_rx_dropped() so that we pass the updated headroom validation in xdp_umem_reg() and still drop half of the frames. The test runs on a non-mbuf setup, so __xsk_pool_get_rx_frame_size(), which is called from xsk_rcv_check(), will not account for the skb_shared_info size. Taking the tailroom size into account in the test being fixed is needed because xdp_umem_reg() respects it by default. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-9-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	selftests: bpf: have a separate variable for drop test	Maciej Fijalkowski	1	-1/+3
	Currently two different XDP programs share a static variable for different purposes (picking where to redirect in the shared umem test and deciding whether to drop a packet). This can be a problem when running the full test suite: idx can be written by the shared umem test, and this value can cause incorrect behavior within the XDP drop half test. Introduce a dedicated variable for the drop half test so that these two don't step on each other's toes. There is no real need for using __sync_fetch_and_add here as XSK tests are executed on a single CPU. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-8-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	selftests: bpf: fix pkt grow tests	Maciej Fijalkowski	1	-3/+21
	Skip tail adjust tests in xskxceiver for SKB mode as it is not very friendly for it. The multi-buffer case does not work as the xdp_rxq_info that is registered for generic XDP does not report ::frag_size. The non-mbuf path copies the packet via skb_pp_cow_data(), which only accounts for headroom, leaving us with no tailroom and therefore causing the underlying XDP prog to drop packets. For the multi-buffer test in other modes, change the number of bytes we use for growth, assume the worst-case scenario and take care of headroom and tailroom. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-7-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-07	selftests: bpf: introduce a common routine for reading procfs	Maciej Fijalkowski	4	-24/+66
	Parametrize the current way of getting the MAX_SKB_FRAGS value from {sys,proc}fs so that it can be reused to get the cache line size of the system's CPU. All that just to mimic and compute the size of the kernel's struct skb_shared_info, which xsk and the test suite interpret as tailroom. Introduce two variables in the ifobject struct that will carry the count of skb frags and the tailroom size. Do the reading and computing once, at the beginning of test suite execution in xskxceiver; for test_progs this is not possible, as in that environment each test sets up and tears down ifobject structs. Reviewed-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://patch.msgid.link/20260402154958.562179-6-maciej.fijalkowski@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>