starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2025-09-26	Merge tag 'v6.17rc7-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6	Linus Torvalds	1	-12/+89
	Pull smb client fixes from Steve French: - Fix unlink bug - Fix potential out of bounds access in processing compound requests * tag 'v6.17rc7-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix wrong index reference in smb2_compound_op() smb: client: handle unlink(2) of files open by different clients
2025-09-26	Merge tag 'vfs-6.17-rc8.fixes' of ↵	Linus Torvalds	10	-16/+50
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Prevent double unlock in netfs - Fix a NULL pointer dereference in afs_put_server() - Fix a reference leak in netfs * tag 'vfs-6.17-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs: fix reference leak afs: Fix potential null pointer dereference in afs_put_server netfs: Prevent duplicate unlocking
2025-09-26	Merge tag 'pmdomain-v6.17-rc2-2' of ↵	Linus Torvalds	1	-0/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fix from Ulf Hansson: - mediatek: Make sure MT8195 AUDIO power domain isn't left powered-on * tag 'pmdomain-v6.17-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: mediatek: set default off flag for MT8195 AUDIO power domain
2025-09-26	Merge tag 'platform-drivers-x86-v6.17-5' of ↵	Linus Torvalds	5	-22/+33
	git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: "Fixes and New HW Supoort - amd/pmc: Use 8042 quirk for Stellaris Slim Gen6 AMD - dell: Set USTT mode according to BIOS after reboot - dell-lis3lv02d: Add Latitude E6530 - lg-laptop: Fix setting the fan mode" * tag 'platform-drivers-x86-v6.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: lg-laptop: Fix WMAB call in fan_mode_store() platform/x86: dell-lis3lv02d: Add Latitude E6530 platform/x86/dell: Set USTT mode according to BIOS after reboot platform/x86/amd/pmc: Add Stellaris Slim Gen6 AMD to spurious 8042 quirks list
2025-09-26	Merge tag 'gpio-fixes-for-v6.17' of ↵	Linus Torvalds	2	-3/+20
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - allow looking up GPIOs by the secondary firmware node too - fix memory leak in gpio-regmap * tag 'gpio-fixes-for-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: regmap: fix memory leak of gpio_regmap structure gpiolib: Extend software-node support to support secondary software-nodes
2025-09-26	Merge tag 'block-6.17-20250925' of ↵	Linus Torvalds	2	-4/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block fixes from Jens Axboe: "A regression fix for this series where an attempt to silence an EOD error got messed up a bit, and then a change of git trees for the block and io_uring trees. Switching the git trees to kernel.org now, as I've just about had it trying to battle AI bots that bring the box to its knees, continually. At least I don't have to maintain the kernel.org side" * tag 'block-6.17-20250925' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: MAINTAINERS: update io_uring and block tree git trees block: fix EOD return for device with nr_sectors == 0
2025-09-26	Merge tag 'drm-fixes-2025-09-26' of https://gitlab.freedesktop.org/drm/kernel	Linus Torvalds	17	-35/+167
	Pull drm fixes from Dave Airlie: "Weekly fixes, some fbcon font handling fixes, then amdgpu/xe/i915 with a few, and a few misc fixes for other drivers. Seems about right for this stage, and I don't know of anything outstanding. fbcon: - fix OOB access in font allocation - fix integer overflow in font handling amdgpu: - Backlight fix - DC preblend fix - DCN 3.5 fix - Cleanup output_tf_change xe: - Don't expose sysfs attributes not applicable for VFs - Fix build with CONFIG_MODULES=n - Don't copy pinned kernel bos twice on suspend i915: - Set O_LARGEFILE in __create_shmem() - Guard reg_val against a INVALID_TRANSCODER [ddi] ast: - sleeps causing cpu stall fix panthor: - scheduler race condition fix gma500: - NULL ptr deref in hdmi teardown fix" * tag 'drm-fixes-2025-09-26' of https://gitlab.freedesktop.org/drm/kernel: drm/panthor: Defer scheduler entitiy destruction to queue release drm/amd/display: remove output_tf_change flag drm/amd/display: Init DCN35 clocks from pre-os HW values drm/amd/display: Use mpc.preblend flag to indicate preblend drm/amd/display: Only restore backlight after amdgpu_dm_init or dm_resume fbcon: Fix OOB access in font allocation drm/i915/ddi: Guard reg_val against a INVALID_TRANSCODER drm/i915: set O_LARGEFILE in __create_shmem() drm/xe: Don't copy pinned kernel bos twice on suspend drm/xe: Fix build with CONFIG_MODULES=n drm/xe/vf: Don't expose sysfs attributes not applicable for VFs fbcon: fix integer overflow in fbcon_do_set_font drm/gma500: Fix null dereference in hdmi teardown drm/ast: Use msleep instead of mdelay for edid read
2025-09-26	smb: client: fix wrong index reference in smb2_compound_op()	Sang-Heon Jeon	1	-1/+1
	In smb2_compound_op(), the loop that processes each command's response uses wrong indices when accessing response bufferes. This incorrect indexing leads to improper handling of command results. Also, if incorrectly computed index is greather than or equal to MAX_COMPOUND, it can cause out-of-bounds accesses. Fixes: 3681c74d342d ("smb: client: handle lack of EA support in smb2_query_path_info()") # 6.14 Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-09-26	netfs: fix reference leak	Max Kellermann	8	-14/+47
	Commit 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref") modified netfs_alloc_request() to initialize the reference counter to 2 instead of 1. The rationale was that the requet's "work" would release the second reference after completion (via netfs_{read,write}_collection_worker()). That works most of the time if all goes well. However, it leaks this additional reference if the request is released before the I/O operation has been submitted: the error code path only decrements the reference counter once and the work item will never be queued because there will never be a completion. This has caused outages of our whole server cluster today because tasks were blocked in netfs_wait_for_outstanding_io(), leading to deadlocks in Ceph (another bug that I will address soon in another patch). This was caused by a netfs_pgpriv2_begin_copy_to_cache() call which failed in fscache_begin_write_operation(). The leaked netfs_io_request was never completed, leaving `netfs_inode.io_count` with a positive value forever. All of this is super-fragile code. Finding out which code paths will lead to an eventual completion and which do not is hard to see: - Some functions like netfs_create_write_req() allocate a request, but will never submit any I/O. - netfs_unbuffered_read_iter_locked() calls netfs_unbuffered_read() and then netfs_put_request(); however, netfs_unbuffered_read() can also fail early before submitting the I/O request, therefore another netfs_put_request() call must be added there. A rule of thumb is that functions that return a `netfs_io_request` do not submit I/O, and all of their callers must be checked. For my taste, the whole netfs code needs an overhaul to make reference counting easier to understand and less fragile & obscure. But to fix this bug here and now and produce a patch that is adequate for a stable backport, I tried a minimal approach that quickly frees the request object upon early failure. I decided against adding a second netfs_put_request() each time because that would cause code duplication which obscures the code further. Instead, I added the function netfs_put_failed_request() which frees such a failed request synchronously under the assumption that the reference count is exactly 2 (as initially set by netfs_alloc_request() and never touched), verified by a WARN_ON_ONCE(). It then deinitializes the request object (without going through the "cleanup_work" indirection) and frees the allocation (with RCU protection to protect against concurrent access by netfs_requests_seq_start()). All code paths that fail early have been changed to call netfs_put_failed_request() instead of netfs_put_request(). Additionally, I have added a netfs_put_request() call to netfs_unbuffered_read() as explained above because the netfs_put_failed_request() approach does not work there. Fixes: 20d72b00ca81 ("netfs: Fix the request's work item to not require a ref") Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Paulo Alcantara <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: stable@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-26	Merge tag 'drm-xe-fixes-2025-09-25' of ↵	Dave Airlie	3	-4/+4
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Don't expose sysfs attributes not applicable for VFs (Michal) - Fix build with CONFIG_MODULES=n (Lucas) - Don't copy pinned kernel bos twice on suspend (Thomas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aNU-FkJEcA3T4aDB@intel.com
2025-09-26	Merge tag 'drm-misc-fixes-2025-09-25' of ↵	Dave Airlie	4	-12/+13
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A CPU stall fix for ast, a NULL pointer dereference fix for gma500, an OOB and overflow fixes for fbcon, and a race condition fix for panthor. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250925-smilodon-of-luxurious-genius-4ebee7@penduick
2025-09-26	Merge tag 'drm-intel-fixes-2025-09-25' of ↵	Dave Airlie	2	-2/+10
	https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Set O_LARGEFILE in __create_shmem() (Taotao Chen) - Guard reg_val against a INVALID_TRANSCODER [ddi] (Suraj Kandpal) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://lore.kernel.org/r/aNTxWfhsMkFZ3Q-a@linux
2025-09-26	Merge tag 'amd-drm-fixes-6.17-2025-09-24' of ↵	Dave Airlie	8	-17/+140
	https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-09-24: amdgpu: - Backlight fix - DC preblend fix - DCN 3.5 fix - Cleanup output_tf_change Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250924200632.531102-1-alexander.deucher@amd.com
2025-09-26	include/linux/pgtable.h: convert arch_enter_lazy_mmu_mode() and friends to ↵	Andrew Morton	1	-3/+3
	static inlines commit c519c3c0a113 ("mm/kasan: avoid lazy MMU mode hazards") introduced the use of arch_enter_lazy_mmu_mode(), which results in the compiler complaining about "statement has no effect", when __HAVE_ARCH_LAZY_MMU_MODE is not defined in include/linux/pgtable.h The exact warning/error is: In file included from ./include/linux/kasan.h:37, from mm/kasan/shadow.c:14: mm/kasan/shadow.c: In function kasan_populate_vmalloc_pte: ./include/linux/pgtable.h:247:41: error: statement with no effect [-Werror=unused-value] 247 \| #define arch_enter_lazy_mmu_mode() (LAZY_MMU_DEFAULT) \| ^ mm/kasan/shadow.c:322:9: note: in expansion of macro arch_enter_lazy_mmu_mode> 322 \| arch_enter_lazy_mmu_mode(); \| ^~~~~~~~~~~~~~~~~~~~~~~~ switching these "functions" to static inlines fixes this up. Fixes: c519c3c0a113 ("mm/kasan: avoid lazy MMU mode hazards") Reported-by: Balbir Singh <balbirs@nvidia.com> Closes: https://lkml.kernel.org/r/20250912235515.367061-1-balbirs@nvidia.com Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-26	mm/damon/sysfs: do not ignore callback's return value in ↵	Akinobu Mita	1	-1/+3
	damon_sysfs_damon_call() The callback return value is ignored in damon_sysfs_damon_call(), which means that it is not possible to detect invalid user input when writing commands such as 'commit' to /sys/kernel/mm/damon/admin/kdamonds/<K>/state. Fix it. Link: https://lkml.kernel.org/r/20250920132546.5822-1-akinobu.mita@gmail.com Fixes: f64539dcdb87 ("mm/damon/sysfs: use damon_call() for update_schemes_stats") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> [6.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-26	mailmap: add entry for Bence Csókás	Bence Csókás	1	-0/+1
	I will be leaving Prolan this week. You can reach me by my personal email for now. Link: https://lkml.kernel.org/r/20250915-mailmap-v1-1-9ebdea93c6a7@prolan.hu Signed-off-by: Bence Csókás <bence98@sch.bme.hu> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-26	fs/proc/task_mmu: check p->vec_buf for NULL	Jakub Acs	1	-0/+3
	When the PAGEMAP_SCAN ioctl is invoked with vec_len = 0 reaches pagemap_scan_backout_range(), kernel panics with null-ptr-deref: [ 44.936808] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI [ 44.937797] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 44.938391] CPU: 1 UID: 0 PID: 2480 Comm: reproducer Not tainted 6.17.0-rc6 #22 PREEMPT(none) [ 44.939062] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 44.939935] RIP: 0010:pagemap_scan_thp_entry.isra.0+0x741/0xa80 <snip registers, unreliable trace> [ 44.946828] Call Trace: [ 44.947030] <TASK> [ 44.949219] pagemap_scan_pmd_entry+0xec/0xfa0 [ 44.952593] walk_pmd_range.isra.0+0x302/0x910 [ 44.954069] walk_pud_range.isra.0+0x419/0x790 [ 44.954427] walk_p4d_range+0x41e/0x620 [ 44.954743] walk_pgd_range+0x31e/0x630 [ 44.955057] __walk_page_range+0x160/0x670 [ 44.956883] walk_page_range_mm+0x408/0x980 [ 44.958677] walk_page_range+0x66/0x90 [ 44.958984] do_pagemap_scan+0x28d/0x9c0 [ 44.961833] do_pagemap_cmd+0x59/0x80 [ 44.962484] __x64_sys_ioctl+0x18d/0x210 [ 44.962804] do_syscall_64+0x5b/0x290 [ 44.963111] entry_SYSCALL_64_after_hwframe+0x76/0x7e vec_len = 0 in pagemap_scan_init_bounce_buffer() means no buffers are allocated and p->vec_buf remains set to NULL. This breaks an assumption made later in pagemap_scan_backout_range(), that page_region is always allocated for p->vec_buf_index. Fix it by explicitly checking p->vec_buf for NULL before dereferencing. Other sites that might run into same deref-issue are already (directly or transitively) protected by checking p->vec_buf. Note: From PAGEMAP_SCAN man page, it seems vec_len = 0 is valid when no output is requested and it's only the side effects caller is interested in, hence it passes check in pagemap_scan_get_args(). This issue was found by syzkaller. Link: https://lkml.kernel.org/r/20250922082206.6889-1-acsjakub@amazon.de Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs") Signed-off-by: Jakub Acs <acsjakub@amazon.de> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jinjiang Tu <tujinjiang@huawei.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Penglei Jiang <superman.xpt@gmail.com> Cc: Mark Brown <broonie@kernel.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-26	kmsan: fix out-of-bounds access to shadow memory	Eric Biggers	2	-3/+23
	Running sha224_kunit on a KMSAN-enabled kernel results in a crash in kmsan_internal_set_shadow_origin(): BUG: unable to handle page fault for address: ffffbc3840291000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1810067 P4D 1810067 PUD 192d067 PMD 3c17067 PTE 0 Oops: 0000 [#1] SMP NOPTI CPU: 0 UID: 0 PID: 81 Comm: kunit_try_catch Tainted: G N 6.17.0-rc3 #10 PREEMPT(voluntary) Tainted: [N]=TEST Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 RIP: 0010:kmsan_internal_set_shadow_origin+0x91/0x100 [...] Call Trace: <TASK> __msan_memset+0xee/0x1a0 sha224_final+0x9e/0x350 test_hash_buffer_overruns+0x46f/0x5f0 ? kmsan_get_shadow_origin_ptr+0x46/0xa0 ? __pfx_test_hash_buffer_overruns+0x10/0x10 kunit_try_run_case+0x198/0xa00 This occurs when memset() is called on a buffer that is not 4-byte aligned and extends to the end of a guard page, i.e. the next page is unmapped. The bug is that the loop at the end of kmsan_internal_set_shadow_origin() accesses the wrong shadow memory bytes when the address is not 4-byte aligned. Since each 4 bytes are associated with an origin, it rounds the address and size so that it can access all the origins that contain the buffer. However, when it checks the corresponding shadow bytes for a particular origin, it incorrectly uses the original unrounded shadow address. This results in reads from shadow memory beyond the end of the buffer's shadow memory, which crashes when that memory is not mapped. To fix this, correctly align the shadow address before accessing the 4 shadow bytes corresponding to each origin. Link: https://lkml.kernel.org/r/20250911195858.394235-1-ebiggers@kernel.org Fixes: 2ef3cec44c60 ("kmsan: do not wipe out origin when doing partial unpoisoning") Signed-off-by: Eric Biggers <ebiggers@kernel.org> Tested-by: Alexander Potapenko <glider@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-26	mm/hugetlb: fix copy_hugetlb_page_range() to use ->pt_share_count	Jane Chu	2	-10/+10
	commit 59d9094df3d79 ("mm: hugetlb: independent PMD page table shared count") introduced ->pt_share_count dedicated to hugetlb PMD share count tracking, but omitted fixing copy_hugetlb_page_range(), leaving the function relying on page_count() for tracking that no longer works. When lazy page table copy for hugetlb is disabled, that is, revert commit bcd51a3c679d ("hugetlb: lazy page table copies in fork()") fork()'ing with hugetlb PMD sharing quickly lockup - [ 239.446559] watchdog: BUG: soft lockup - CPU#75 stuck for 27s! [ 239.446611] RIP: 0010:native_queued_spin_lock_slowpath+0x7e/0x2e0 [ 239.446631] Call Trace: [ 239.446633] <TASK> [ 239.446636] _raw_spin_lock+0x3f/0x60 [ 239.446639] copy_hugetlb_page_range+0x258/0xb50 [ 239.446645] copy_page_range+0x22b/0x2c0 [ 239.446651] dup_mmap+0x3e2/0x770 [ 239.446654] dup_mm.constprop.0+0x5e/0x230 [ 239.446657] copy_process+0xd17/0x1760 [ 239.446660] kernel_clone+0xc0/0x3e0 [ 239.446661] __do_sys_clone+0x65/0xa0 [ 239.446664] do_syscall_64+0x82/0x930 [ 239.446668] ? count_memcg_events+0xd2/0x190 [ 239.446671] ? syscall_trace_enter+0x14e/0x1f0 [ 239.446676] ? syscall_exit_work+0x118/0x150 [ 239.446677] ? arch_exit_to_user_mode_prepare.constprop.0+0x9/0xb0 [ 239.446681] ? clear_bhb_loop+0x30/0x80 [ 239.446684] ? clear_bhb_loop+0x30/0x80 [ 239.446686] entry_SYSCALL_64_after_hwframe+0x76/0x7e There are two options to resolve the potential latent issue: 1. warn against PMD sharing in copy_hugetlb_page_range(), 2. fix it. This patch opts for the second option. While at it, simplify the comment, the details are not actually relevant anymore. Link: https://lkml.kernel.org/r/20250916004520.1604530-1-jane.chu@oracle.com Fixes: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-26	mm/hugetlb: fix folio is still mapped when deleted	Jinjiang Tu	1	-4/+6
	Migration may be raced with fallocating hole. remove_inode_single_folio will unmap the folio if the folio is still mapped. However, it's called without folio lock. If the folio is migrated and the mapped pte has been converted to migration entry, folio_mapped() returns false, and won't unmap it. Due to extra refcount held by remove_inode_single_folio, migration fails, restores migration entry to normal pte, and the folio is mapped again. As a result, we triggered BUG in filemap_unaccount_folio. The log is as follows: BUG: Bad page cache in process hugetlb pfn:156c00 page: refcount:515 mapcount:0 mapping:0000000099fef6e1 index:0x0 pfn:0x156c00 head: order:9 mapcount:1 entire_mapcount:1 nr_pages_mapped:0 pincount:0 aops:hugetlbfs_aops ino:dcc dentry name(?):"my_hugepage_file" flags: 0x17ffffc00000c1(locked\|waiters\|head\|node=0\|zone=2\|lastcpupid=0x1fffff) page_type: f4(hugetlb) page dumped because: still mapped when deleted CPU: 1 UID: 0 PID: 395 Comm: hugetlb Not tainted 6.17.0-rc5-00044-g7aac71907bde-dirty #484 NONE Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x4f/0x70 filemap_unaccount_folio+0xc4/0x1c0 __filemap_remove_folio+0x38/0x1c0 filemap_remove_folio+0x41/0xd0 remove_inode_hugepages+0x142/0x250 hugetlbfs_fallocate+0x471/0x5a0 vfs_fallocate+0x149/0x380 Hold folio lock before checking if the folio is mapped to avold race with migration. Link: https://lkml.kernel.org/r/20250912074139.3575005-1-tujinjiang@huawei.com Fixes: 4aae8d1c051e ("mm/hugetlbfs: unmap pages if page fault raced with hole punch") Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-25	Merge tag 'net-6.17-rc8' of ↵	Linus Torvalds	41	-201/+548
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from Bluetooth, IPsec and CAN. No known regressions at this point. Current release - regressions: - xfrm: xfrm_alloc_spi shouldn't use 0 as SPI Previous releases - regressions: - xfrm: fix offloading of cross-family tunnels - bluetooth: fix several races leading to UaFs - dsa: lantiq_gswip: fix FDB entries creation for the CPU port - eth: - tun: update napi->skb after XDP process - mlx: fix UAF in flow counter release Previous releases - always broken: - core: forbid FDB status change while nexthop is in a group - smc: fix warning in smc_rx_splice() when calling get_page() - can: provide missing ndo_change_mtu(), to prevent buffer overflow. - eth: - i40e: fix VF config validation - broadcom: fix support for PTP_EXTTS_REQUEST2 ioctl" * tag 'net-6.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits) octeontx2-pf: Fix potential use after free in otx2_tc_add_flow() net: dsa: lantiq_gswip: suppress -EINVAL errors for bridge FDB entries added to the CPU port net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup() libie: fix string names for AQ error codes net/mlx5e: Fix missing FEC RS stats for RS_544_514_INTERLEAVED_QUAD net/mlx5: HWS, ignore flow level for multi-dest table net/mlx5: fs, fix UAF in flow counter release selftests: fib_nexthops: Add test cases for FDB status change selftests: fib_nexthops: Fix creation of non-FDB nexthops nexthop: Forbid FDB status change while nexthop is in a group net: allow alloc_skb_with_frags() to use MAX_SKB_FRAGS bnxt_en: correct offset handling for IPv6 destination address ptp: document behavior of PTP_STRICT_FLAGS broadcom: fix support for PTP_EXTTS_REQUEST2 ioctl broadcom: fix support for PTP_PEROUT_DUTY_CYCLE Bluetooth: MGMT: Fix possible UAFs Bluetooth: hci_event: Fix UAF in hci_acl_create_conn_sync Bluetooth: hci_event: Fix UAF in hci_conn_tx_dequeue Bluetooth: hci_sync: Fix hci_resume_advertising_sync Bluetooth: Fix build after header cleanup ...
2025-09-25	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost	Linus Torvalds	7	-32/+29
	Pull virtio fixes from Michael Tsirkin: "virtio,vhost: last minute fixes More small fixes. Most notably this fixes crashes and hangs in vhost-net" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: MAINTAINERS, mailmap: Update address for Peter Hilber virtio_config: clarify output parameters uapi: vduse: fix typo in comment vhost: Take a reference on the task in struct vhost_task. vhost-net: flush batched before enabling notifications Revert "vhost/net: Defer TX queue re-enable until after sendmsg" vhost-net: unbreak busy polling vhost-scsi: fix argument order in tport allocation error message
2025-09-25	platform/x86: lg-laptop: Fix WMAB call in fan_mode_store()	Daniel Lee	2	-22/+16
	When WMAB is called to set the fan mode, the new mode is read from either bits 0-1 or bits 4-5 (depending on the value of some other EC register). Thus when WMAB is called with bits 4-5 zeroed and called again with bits 0-1 zeroed, the second call undoes the effect of the first call. This causes writes to /sys/devices/platform/lg-laptop/fan_mode to have no effect (and causes reads to always report a status of zero). Fix this by calling WMAB once, with the mode set in bits 0,1 and 4,5. When the fan mode is returned from WMAB it always has this form, so there is no need to preserve the other bits. As a bonus, the driver now supports the "Performance" fan mode seen in the LG-provided Windows control app, which provides less aggressive CPU throttling but louder fan noise and shorter battery life. Also, correct the documentation to reflect that 0 corresponds to the default mode (what the Windows app calls "Optimal") and 1 corresponds to the silent mode. Fixes: dbf0c5a6b1f8 ("platform/x86: Add LG Gram laptop special features driver") Link: https://bugzilla.kernel.org/show_bug.cgi?id=204913#c4 Signed-off-by: Daniel Lee <dany97@live.ca> Link: https://patch.msgid.link/MN2PR06MB55989CB10E91C8DA00EE868DDC1CA@MN2PR06MB5598.namprd06.prod.outlook.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-09-25	octeontx2-pf: Fix potential use after free in otx2_tc_add_flow()	Dan Carpenter	1	-1/+1
	This code calls kfree_rcu(new_node, rcu) and then dereferences "new_node" and then dereferences it on the next line. Two lines later, we take a mutex so I don't think this is an RCU safe region. Re-order it to do the dereferences before queuing up the free. Fixes: 68fbff68dbea ("octeontx2-pf: Add police action for TC flower") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/aNKCL1jKwK8GRJHh@stanley.mountain Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-25	drm/panthor: Defer scheduler entitiy destruction to queue release	Adrián Larumbe	1	-7/+1
	Commit de8548813824 ("drm/panthor: Add the scheduler logical block") handled destruction of a group's queues' drm scheduler entities early into the group destruction procedure. However, that races with the group submit ioctl, because by the time entities are destroyed (through the group destroy ioctl), the submission procedure might've already obtained a group handle, and therefore the ability to push jobs into entities. This is met with a DRM error message within the drm scheduler core as a situation that should never occur. Fix by deferring drm scheduler entity destruction to queue release time. Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block") Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250919164436.531930-1-adrian.larumbe@collabora.com
2025-09-25	Merge branch 'lantiq_gswip-fixes'	Paolo Abeni	1	-5/+16
	Vladimir Oltean says: ==================== lantiq_gswip fixes This is a small set of fixes which I believe should be backported for the lantiq_gswip driver. Daniel Golle asked me to submit them here: https://lore.kernel.org/netdev/aLiDfrXUbw1O5Vdi@pidgin.makrotopia.org/ As mentioned there, a merge conflict with net-next is expected, due to the movement of the driver to the 'drivers/net/dsa/lantiq' folder there. Good luck :-/ Patch 2/2 fixes an old regression and is the minimal fix for that, as discussed here: https://lore.kernel.org/netdev/aJfNMLNoi1VOsPrN@pidgin.makrotopia.org/ Patch 1/2 was identified by me through static analysis, and I consider it to be a serious deficiency. It needs a test tag. ==================== Link: https://patch.msgid.link/20250918072142.894692-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-25	net: dsa: lantiq_gswip: suppress -EINVAL errors for bridge FDB entries added ↵	Vladimir Oltean	1	-1/+2
	to the CPU port The blamed commit and others in that patch set started the trend of reusing existing DSA driver API for a new purpose: calling ds->ops->port_fdb_add() on the CPU port. The lantiq_gswip driver was not prepared to handle that, as can be seen from the many errors that Daniel presents in the logs: [ 174.050000] gswip 1e108000.switch: port 2 failed to add fa:aa:72:f4:8b:1e vid 1 to fdb: -22 [ 174.060000] gswip 1e108000.switch lan2: entered promiscuous mode [ 174.070000] gswip 1e108000.switch: port 2 failed to add 00:01:02:03:04:02 vid 0 to fdb: -22 [ 174.090000] gswip 1e108000.switch: port 2 failed to add 00:01:02:03:04:02 vid 1 to fdb: -22 [ 174.090000] gswip 1e108000.switch: port 2 failed to delete fa:aa:72:f4:8b:1e vid 1 from fdb: -2 The errors are because gswip_port_fdb() wants to get a handle to the bridge that originated these FDB events, to associate it with a FID. Absolutely honourable purpose, however this only works for user ports. To get the bridge that generated an FDB entry for the CPU port, one would need to look at the db.bridge.dev argument. But this was introduced in commit c26933639b54 ("net: dsa: request drivers to perform FDB isolation"), first appeared in v5.18, and when the blamed commit was introduced in v5.14, no such API existed. So the core DSA feature was introduced way too soon for lantiq_gswip. Not acting on these host FDB entries and suppressing any errors has no other negative effect, and practically returns us to not supporting the host filtering feature at all - peacefully, this time. Fixes: 10fae4ac89ce ("net: dsa: include bridge addresses which are local in the host fdb list") Reported-by: Daniel Golle <daniel@makrotopia.org> Closes: https://lore.kernel.org/netdev/aJfNMLNoi1VOsPrN@pidgin.makrotopia.org/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250918072142.894692-3-vladimir.oltean@nxp.com Tested-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-25	net: dsa: lantiq_gswip: move gswip_add_single_port_br() call to port_setup()	Vladimir Oltean	1	-4/+14
	A port added to a "single port bridge" operates as standalone, and this is mutually exclusive to being part of a Linux bridge. In fact, gswip_port_bridge_join() calls gswip_add_single_port_br() with add=false, i.e. removes the port from the "single port bridge" to enable autonomous forwarding. The blamed commit seems to have incorrectly thought that ds->ops->port_enable() is called one time per port, during the setup phase of the switch. However, it is actually called during the ndo_open() implementation of DSA user ports, which is to say that this sequence of events: 1. ip link set swp0 down 2. ip link add br0 type bridge 3. ip link set swp0 master br0 4. ip link set swp0 up would cause swp0 to join back the "single port bridge" which step 3 had just removed it from. The correct DSA hook for one-time actions per port at switch init time is ds->ops->port_setup(). This is what seems to match the coder's intention; also see the comment at the beginning of the file: * At the initialization the driver allocates one bridge table entry for ~~~~~~~~~~~~~~~~~~~~~ * each switch port which is used when the port is used without an * explicit bridge. Fixes: 8206e0ce96b3 ("net: dsa: lantiq: Add VLAN unaware bridge offloading") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250918072142.894692-2-vladimir.oltean@nxp.com Tested-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-25	sched/deadline: Fix dl_server behaviour	Peter Zijlstra	3	-24/+33
	John reported undesirable behaviour with the dl_server since commit: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling"). When starving fair tasks on purpose (starting spinning FIFO tasks), his fair workload, which often goes (briefly) idle, would delay fair invocations for a second, running one invocation per second was both unexpected and terribly slow. The reason this happens is that when dl_se->server_pick_task() returns NULL, indicating no runnable tasks, it would yield, pushing any later jobs out a whole period (1 second). Instead simply stop the server. This should restore behaviour in that a later wakeup (which restarts the server) will be able to continue running (subject to the CBS wakeup rules). Notably, this does not re-introduce the behaviour cccb45d7c4295 set out to solve, any start/stop cycle is naturally throttled by the timer period (no active cancel). Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling") Reported-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: John Stultz <jstultz@google.com>
2025-09-25	sched/deadline: Fix dl_server getting stuck	Peter Zijlstra	4	-22/+2
	John found it was easy to hit lockup warnings when running locktorture on a 2 CPU VM, which he bisected down to: commit cccb45d7c429 ("sched/deadline: Less agressive dl_server handling"). While debugging it seems there is a chance where we end up with the dl_server dequeued, with dl_se->dl_server_active. This causes dl_server_start() to return without enqueueing the dl_server, thus it fails to run when RT tasks starve the cpu. When this happens, dl_server_timer() catches the '!dl_se->server_has_tasks(dl_se)' case, which then calls replenish_dl_entity() and dl_server_stopped() and finally return HRTIMER_NO_RESTART. This ends in no new timer and also no enqueue, leaving the dl_server 'dead', allowing starvation. What should have happened is for the bandwidth timer to start the zero-laxity timer, which in turn would enqueue the dl_server and cause dl_se->server_pick_task() to be called -- which will stop the dl_server if no fair tasks are observed for a whole period. IOW, it is totally irrelevant if there are fair tasks at the moment of bandwidth refresh. This removes all dl_se->server_has_tasks() users, so remove the whole thing. Fixes: cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling") Reported-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: John Stultz <jstultz@google.com>
2025-09-25	afs: Fix potential null pointer dereference in afs_put_server	Zhen Ni	1	-1/+2
	afs_put_server() accessed server->debug_id before the NULL check, which could lead to a null pointer dereference. Move the debug_id assignment, ensuring we never dereference a NULL server pointer. Fixes: 2757a4dc1849 ("afs: Fix access after dec in put functions") Cc: stable@vger.kernel.org Signed-off-by: Zhen Ni <zhen.ni@easystack.cn> Acked-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-25	Merge tag 'probes-fixes-v6.17-rc7' of ↵	Linus Torvalds	2	-3/+8
	git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - fprobe: Even if there is a memory allocation failure, try to remove the addresses recorded until then from the filter. Previously we just skipped it. - tracing: dynevent: Add a missing lockdown check on dynevent. This dynevent is the interface for all probe events. Thus if there is no check, any probe events can be added after lock down the tracefs. * tag 'probes-fixes-v6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: dynevent: Add a missing lockdown check on dynevent tracing: fprobe: Fix to remove recorded module addresses from filter
2025-09-25	libie: fix string names for AQ error codes	Jacob Keller	1	-1/+1
	The LIBIE_AQ_STR macro() introduced by commit 5feaa7a07b85 ("libie: add adminq helper for converting err to str") is used in order to generate strings for printing human readable error codes. Its definition is missing the separating underscore ('_') character which makes the resulting strings difficult to read. Additionally, the string won't match the source code, preventing search tools from working properly. Add the missing underscore character, fixing the error string names. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Fixes: 5feaa7a07b85 ("libie: add adminq helper for converting err to str") Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250923205657.846759-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-25	crypto: af_alg - Fix incorrect boolean values in af_alg_ctx	Eric Biggers	1	-1/+1
	Commit 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg") changed some fields from bool to 1-bit bitfields of type u32. However, some assignments to these fields, specifically 'more' and 'merge', assign values greater than 1. These relied on C's implicit conversion to bool, such that zero becomes false and nonzero becomes true. With a 1-bit bitfields of type u32 instead, mod 2 of the value is taken instead, resulting in 0 being assigned in some cases when 1 was intended. Fix this by restoring the bool type. Fixes: 1b34cbbf4f01 ("crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-09-25	Merge tag 'soc-fixes-6.17-3' of ↵	Linus Torvalds	22	-28/+113
	git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "There are a few minor code fixes for tegra firmware, i.MX firmware and the eyeq reset controller, and a MAINTAINERS update as Alyssa Rosenzweig moves on to non-kernel projects. The other changes are all for devicetree files: - Multiple Marvell Armada SoCs need changes to fix PCIe, audio and SATA - A socfpga board fails to probe the ethernet phy - The two temperature sensors on i.MX8MP are swapped - Allwinner devicetree files cause build-time warnings - Two Rockchip based boards need corrections for headphone detection and SPI flash" * tag 'soc-fixes-6.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: MAINTAINERS: remove Alyssa Rosenzweig firmware: tegra: Do not warn on missing memory-region property arm64: dts: marvell: cn9132-clearfog: fix multi-lane pci x2 and x4 ports arm64: dts: marvell: cn9132-clearfog: disable eMMC high-speed modes arm64: dts: marvell: cn913x-solidrun: fix sata ports status ARM: dts: kirkwood: Fix sound DAI cells for OpenRD clients arm64: dts: imx8mp: Correct thermal sensor index ARM: imx: Kconfig: Adjust select after renamed config option firmware: imx: Add stub functions for SCMI CPU API firmware: imx: Add stub functions for SCMI LMM API firmware: imx: Add stub functions for SCMI MISC API riscv: dts: allwinner: rename devterm i2c-gpio node to comply with binding arm64: dts: rockchip: Fix the headphone detection on the orangepi 5 arm64: dts: rockchip: Add vcc supply for SPI Flash on NanoPC-T6 ARM: dts: socfpga: sodia: Fix mdio bus probe and PHY address reset: eyeq: fix OF node leak ARM64: dts: mcbin: fix SATA ports on Macchiatobin ARM: dts: armada-370-db: Fix stereo audio input routing on Armada 370 ARM: dts: allwinner: Minor whitespace cleanup
2025-09-24	Merge tag 'pm-6.17-rc8' of ↵	Linus Torvalds	1	-9/+11
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Rafael: "Fix a locking issue in the cpufreq core introduced recently and caught by lockdep (Christian Loehle)" * tag 'pm-6.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Initialize cpufreq-based invariance before subsys
2025-09-24	Merge tag 'for-6.17-rc7-tag' of ↵	Linus Torvalds	1	-0/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "One more regression fix for a problem in zoned mode: mounting would fail if the number of open and active zones reached a common limit that didn't use to be checked" * tag 'for-6.17-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zoned: don't fail mount needlessly due to too many active zones
2025-09-24	Merge tag '6.17-rc7-ksmbd-server-fixes' of git://git.samba.org/ksmbd	Linus Torvalds	1	-12/+10
	Pull smb server fixes from Steve French: - free_transport fix for disconnect races - minor delayed work fix * tag '6.17-rc7-ksmbd-server-fixes' of git://git.samba.org/ksmbd: smb: server: use disable_work_sync in transport_rdma.c smb: server: don't use delayed_work for post_recv_credits_work
2025-09-24	tracing: dynevent: Add a missing lockdown check on dynevent	Masami Hiramatsu (Google)	1	-0/+4
	Since dynamic_events interface on tracefs is compatible with kprobe_events and uprobe_events, it should also check the lockdown status and reject if it is set. Link: https://lore.kernel.org/all/175824455687.45175.3734166065458520748.stgit@devnote2/ Fixes: 17911ff38aa5 ("tracing: Add locked_down checks to the open calls of files created for tracefs") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: stable@vger.kernel.org
2025-09-24	tracing: fprobe: Fix to remove recorded module addresses from filter	Masami Hiramatsu (Google)	1	-3/+4
	Even if there is a memory allocation failure in fprobe_addr_list_add(), there is a partial list of module addresses. So remove the recorded addresses from filter if exists. This also removes the redundant ret local variable. Fixes: a3dc2983ca7b ("tracing: fprobe: Cleanup fprobe hash when module unloading") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: Menglong Dong <menglong8.dong@gmail.com>
2025-09-24	kbuild: Disable CC_HAS_ASM_GOTO_OUTPUT on clang < 17	Thomas Gleixner	1	-0/+3
	clang < 17 fails to use scope local labels with CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y: { __label__ local_lbl; ... unsafe_get_user(uval, uaddr, local_lbl); ... return 0; local_lbl: return -EFAULT; } when two such scopes exist in the same function: error: cannot jump from this asm goto statement to one of its possible targets There are other failure scenarios. Shuffling code around slightly makes it worse and fail even with one instance. That issue prevents using local labels for a cleanup based user access mechanism. After failed attempts to provide a simple enough test case for the 'depends on' test in Kconfig, the initial cure was to mark ASM goto broken on clang versions < 17 to get this road block out of the way. But Nathan pointed out that this is a known clang issue and indeed affects clang < version 17 in combination with cleanup(). It's not even required to use local labels for that. The clang issue tracker has a small enough test case, which can be used as a test in the 'depends on' section of CC_HAS_ASM_GOTO_OUTPUT: void bar(void *); void baz(void); int foo (void) { { asm goto("jmp %l0"::::l0); return 0; l0: return 1; } void *x __attribute__((cleanup(bar))) = baz(); { asm goto("jmp %l0"::::l1); return 42; l1: return 0xff; } } Add another dependency to config CC_HAS_ASM_GOTO_OUTPUT for it and use the clang issue tracker test case for detection by condensing it to obfuscated C-code contest format. This reliably catches the problem on clang < 17 and did not show any issues on the non broken GCC versions. That test might be sufficient to catch all issues and therefore could replace the existing test, but keeping that around does no harm either. Thanks to Nathan for pointing to the relevant clang issue! Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://github.com/ClangBuiltLinux/linux/issues/1886 Link: https://github.com/llvm/llvm-project/commit/f023f5cdb2e6c19026f04a15b5a935c041835d14
2025-09-24	futex: Use correct exit on failure from futex_hash_allocate_default()	Sebastian Andrzej Siewior	1	-1/+1
	copy_process() uses the wrong error exit path from futex_hash_allocate_default(). After exiting from futex_hash_allocate_default(), neither tasklist_lock nor siglock has been acquired. The exit label bad_fork_core_free unlocks both of these locks which is wrong. The next exit label, bad_fork_cancel_cgroup, is the correct exit. sched_cgroup_fork() did not allocate any resources that need to freed. Use bad_fork_cancel_cgroup on error exit from futex_hash_allocate_default(). Fixes: 7c4f75a21f636 ("futex: Allow automatic allocation of process wide futex hash") Reported-by: syzbot+80cb3cc5c14fad191a10@syzkaller.appspotmail.com Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Closes: https://lore.kernel.org/all/68cb1cbd.050a0220.2ff435.0599.GAE@google.com
2025-09-24	MAINTAINERS: Update Paul Walmsley's E-mail address	Paul Walmsley	1	-4/+4
	My experiment with using corporate Gmail for Linux kernel list interaction has come to an end. For my MAINTAINERS entries that use that E-mail address, let's switch those to use the k.org E-mail forwarding. Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-24	riscv: Use an atomic xchg in pudp_huge_get_and_clear()	Alexandre Ghiti	1	-0/+17
	Make sure we return the right pud value and not a value that could have been overwritten in between by a different core. Fixes: c3cc2a4a3a23 ("riscv: Add support for PUD THP") Cc: stable@vger.kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250814-dev-alex-thp_pud_xchg-v1-1-b4704dfae206@rivosinc.com [pjw@kernel.org: use xchg rather than atomic_long_xchg; avoid atomic op for !CONFIG_SMP like x86] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-09-24	Merge branch 'mlx5-misc-fixes-2025-09-22'	Jakub Kicinski	9	-17/+40
	Tariq Toukan says: ==================== mlx5 misc fixes 2025-09-22 This patchset provides misc bug fixes from the team to the mlx5 Eth and core drivers. ==================== Link: https://patch.msgid.link/1758525094-816583-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-24	net/mlx5e: Fix missing FEC RS stats for RS_544_514_INTERLEAVED_QUAD	Carolina Jubran	1	-0/+1
	Include MLX5E_FEC_RS_544_514_INTERLEAVED_QUAD in the FEC RS stats handling. This addresses a gap introduced when adding support for 200G/lane link modes. Fixes: 4e343c11efbb ("net/mlx5e: Support FEC settings for 200G per lane link modes") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Yael Chemla <ychemla@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1758525094-816583-4-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-24	net/mlx5: HWS, ignore flow level for multi-dest table	Yevgeny Kliteynik	3	-12/+6
	When HWS creates multi-dest FW table and adds rules to forward to other tables, ignore the flow level enforcement in FW, because HWS is responsible for table levels. This fixes the following error: mlx5_core 0000:08:00.0: mlx5_cmd_out_err:818:(pid 192306): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x6ae84c), err(-22) Fixes: 504e536d9010 ("net/mlx5: HWS, added actions handling") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1758525094-816583-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-24	net/mlx5: fs, fix UAF in flow counter release	Moshe Shemesh	5	-5/+33
	Fix a kernel trace [1] caused by releasing an HWS action of a local flow counter in mlx5_cmd_hws_delete_fte(), where the HWS action refcount and mutex were not initialized and the counter struct could already be freed when deleting the rule. Fix it by adding the missing initializations and adding refcount for the local flow counter struct. [1] Kernel log: Call Trace: <TASK> dump_stack_lvl+0x34/0x48 mlx5_fs_put_hws_action.part.0.cold+0x21/0x94 [mlx5_core] mlx5_fc_put_hws_action+0x96/0xad [mlx5_core] mlx5_fs_destroy_fs_actions+0x8b/0x152 [mlx5_core] mlx5_cmd_hws_delete_fte+0x5a/0xa0 [mlx5_core] del_hw_fte+0x1ce/0x260 [mlx5_core] mlx5_del_flow_rules+0x12d/0x240 [mlx5_core] ? ttwu_queue_wakelist+0xf4/0x110 mlx5_ib_destroy_flow+0x103/0x1b0 [mlx5_ib] uverbs_free_flow+0x20/0x50 [ib_uverbs] destroy_hw_idr_uobject+0x1b/0x50 [ib_uverbs] uverbs_destroy_uobject+0x34/0x1a0 [ib_uverbs] uobj_destroy+0x3c/0x80 [ib_uverbs] ib_uverbs_run_method+0x23e/0x360 [ib_uverbs] ? uverbs_finalize_object+0x60/0x60 [ib_uverbs] ib_uverbs_cmd_verbs+0x14f/0x2c0 [ib_uverbs] ? do_tty_write+0x1a9/0x270 ? file_tty_write.constprop.0+0x98/0xc0 ? new_sync_write+0xfc/0x190 ib_uverbs_ioctl+0xd7/0x160 [ib_uverbs] __x64_sys_ioctl+0x87/0xc0 do_syscall_64+0x59/0x90 Fixes: b581f4266928 ("net/mlx5: fs, manage flow counters HWS action sharing by refcount") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1758525094-816583-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-24	Merge branch 'nexthop-various-fixes'	Jakub Kicinski	2	-6/+53
	Ido Schimmel says: ==================== nexthop: Various fixes Patch #1 fixes a NPD that was recently reported by syzbot. Patch #2 fixes an issue in the existing FIB nexthop selftest. Patch #3 extends the selftest with test cases for the bug that was fixed in the first patch. ==================== Link: https://patch.msgid.link/20250921150824.149157-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-24	selftests: fib_nexthops: Add test cases for FDB status change	Ido Schimmel	1	-0/+40
	Add the following test cases for both IPv4 and IPv6: * Can change from FDB nexthop to non-FDB nexthop and vice versa. * Can change FDB nexthop address while in a group. * Cannot change from FDB nexthop to non-FDB nexthop and vice versa while in a group. Output without "nexthop: Forbid FDB status change while nexthop is in a group": # ./fib_nexthops.sh -t "ipv6_fdb_grp_fcnal ipv4_fdb_grp_fcnal" IPv6 fdb groups functional -------------------------- [...] TEST: Replace FDB nexthop to non-FDB nexthop [ OK ] TEST: Replace non-FDB nexthop to FDB nexthop [ OK ] TEST: Replace FDB nexthop address while in a group [ OK ] TEST: Replace FDB nexthop to non-FDB nexthop while in a group [FAIL] TEST: Replace non-FDB nexthop to FDB nexthop while in a group [FAIL] [...] IPv4 fdb groups functional -------------------------- [...] TEST: Replace FDB nexthop to non-FDB nexthop [ OK ] TEST: Replace non-FDB nexthop to FDB nexthop [ OK ] TEST: Replace FDB nexthop address while in a group [ OK ] TEST: Replace FDB nexthop to non-FDB nexthop while in a group [FAIL] TEST: Replace non-FDB nexthop to FDB nexthop while in a group [FAIL] [...] Tests passed: 36 Tests failed: 4 Tests skipped: 0 Output with "nexthop: Forbid FDB status change while nexthop is in a group": # ./fib_nexthops.sh -t "ipv6_fdb_grp_fcnal ipv4_fdb_grp_fcnal" IPv6 fdb groups functional -------------------------- [...] TEST: Replace FDB nexthop to non-FDB nexthop [ OK ] TEST: Replace non-FDB nexthop to FDB nexthop [ OK ] TEST: Replace FDB nexthop address while in a group [ OK ] TEST: Replace FDB nexthop to non-FDB nexthop while in a group [ OK ] TEST: Replace non-FDB nexthop to FDB nexthop while in a group [ OK ] [...] IPv4 fdb groups functional -------------------------- [...] TEST: Replace FDB nexthop to non-FDB nexthop [ OK ] TEST: Replace non-FDB nexthop to FDB nexthop [ OK ] TEST: Replace FDB nexthop address while in a group [ OK ] TEST: Replace FDB nexthop to non-FDB nexthop while in a group [ OK ] TEST: Replace non-FDB nexthop to FDB nexthop while in a group [ OK ] [...] Tests passed: 40 Tests failed: 0 Tests skipped: 0 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250921150824.149157-4-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>