summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
43 hoursLinux 6.1.161v6.1.161linux-6.1.yGreg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20260115164143.482647486@linuxfoundation.org Tested-by: Brett A C Sheffield <bacs@librecast.net> Tested-by: Slade Watkins <sr@sladewatkins.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Hardik Garg <hargar@linux.microsoft.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursbpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error pathShardul Bankar1-1/+1
commit 7f9ee5fc97e14682e36fe22ae2654c07e4998b82 upstream. Fix a memory leak in bpf_prog_test_run_xdp() where the context buffer allocated by bpf_ctx_init() is not freed when the function returns early due to a data size check. On the failing path: ctx = bpf_ctx_init(...); if (kattr->test.data_size_in - meta_sz < ETH_HLEN) return -EINVAL; The early return bypasses the cleanup label that kfree()s ctx, leading to a leak detectable by kmemleak under fuzzing. Change the return to jump to the existing free_ctx label. Fixes: fe9544ed1a2e ("bpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUN") Reported-by: BPF Runtime Fuzzer (BRF) Signed-off-by: Shardul Bankar <shardulsb08@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20251014120037.1981316-1-shardulsb08@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursscsi: sg: Fix occasional bogus elapsed time that exceeds timeoutMichal Rábek1-7/+13
[ Upstream commit 0e1677654259a2f3ccf728de1edde922a3c4ba57 ] A race condition was found in sg_proc_debug_helper(). It was observed on a system using an IBM LTO-9 SAS Tape Drive (ULTRIUM-TD9) and monitoring /proc/scsi/sg/debug every second. A very large elapsed time would sometimes appear. This is caused by two race conditions. We reproduced the issue with an IBM ULTRIUM-HH9 tape drive on an x86_64 architecture. A patched kernel was built, and the race condition could not be observed anymore after the application of this patch. A reproducer C program utilising the scsi_debug module was also built by Changhui Zhong and can be viewed here: https://github.com/MichaelRabek/linux-tests/blob/master/drivers/scsi/sg/sg_race_trigger.c The first race happens between the reading of hp->duration in sg_proc_debug_helper() and request completion in sg_rq_end_io(). The hp->duration member variable may hold either of two types of information: #1 - The start time of the request. This value is present while the request is not yet finished. #2 - The total execution time of the request (end_time - start_time). If sg_proc_debug_helper() executes *after* the value of hp->duration was changed from #1 to #2, but *before* srp->done is set to 1 in sg_rq_end_io(), a fresh timestamp is taken in the else branch, and the elapsed time (value type #2) is subtracted from a timestamp, which cannot yield a valid elapsed time (which is a type #2 value as well). To fix this issue, the value of hp->duration must change under the protection of the sfp->rq_list_lock in sg_rq_end_io(). Since sg_proc_debug_helper() takes this read lock, the change to srp->done and srp->header.duration will happen atomically from the perspective of sg_proc_debug_helper() and the race condition is thus eliminated. The second race condition happens between sg_proc_debug_helper() and sg_new_write(). Even though hp->duration is set to the current time stamp in sg_add_request() under the write lock's protection, it gets overwritten by a call to get_sg_io_hdr(), which calls copy_from_user() to copy struct sg_io_hdr from userspace into kernel space. hp->duration is set to the start time again in sg_common_write(). If sg_proc_debug_helper() is called between these two calls, an arbitrary value set by userspace (usually zero) is used to compute the elapsed time. To fix this issue, hp->duration must be set to the current timestamp again after get_sg_io_hdr() returns successfully. A small race window still exists between get_sg_io_hdr() and setting hp->duration, but this window is only a few instructions wide and does not result in observable issues in practice, as confirmed by testing. Additionally, we fix the format specifier from %d to %u for printing unsigned int values in sg_proc_debug_helper(). Signed-off-by: Michal Rábek <mrabek@redhat.com> Suggested-by: Tomas Henzl <thenzl@redhat.com> Tested-by: Changhui Zhong <czhong@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Link: https://patch.msgid.link/20251212160900.64924-1-mrabek@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursASoC: fsl_sai: Add missing registers to cache defaultAlexander Stein1-0/+3
[ Upstream commit 90ed688792a6b7012b3e8a2f858bc3fe7454d0eb ] Drivers does cache sync during runtime resume, setting all writable registers. Not all writable registers are set in cache default, resulting in the erorr message: fsl-sai 30c30000.sai: using zero-initialized flat cache, this may cause unexpected behavior Fix this by adding missing writable register defaults. Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Link: https://patch.msgid.link/20251216102246.676181-1-alexander.stein@ew.tq-group.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursASoC: amd: yc: Add quirk for Honor MagicBook X16 2025Andrew Elantsev1-0/+7
[ Upstream commit e2cb8ef0372665854fca6fa7b30b20dd35acffeb ] Add a DMI quirk for the Honor MagicBook X16 2025 laptop fixing the issue where the internal microphone was not detected. Signed-off-by: Andrew Elantsev <elantsew.andrew@gmail.com> Link: https://patch.msgid.link/20251210203800.142822-1-elantsew.andrew@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hourscan: j1939: make j1939_session_activate() fail if device is no longer registeredTetsuo Handa1-0/+2
[ Upstream commit 5d5602236f5db19e8b337a2cd87a90ace5ea776d ] syzbot is still reporting unregister_netdevice: waiting for vcan0 to become free. Usage count = 2 even after commit 93a27b5891b8 ("can: j1939: add missing calls in NETDEV_UNREGISTER notification handler") was added. A debug printk() patch found that j1939_session_activate() can succeed even after j1939_cancel_active_session() from j1939_netdev_notify(NETDEV_UNREGISTER) has completed. Since j1939_cancel_active_session() is processed with the session list lock held, checking ndev->reg_state in j1939_session_activate() with the session list lock held can reliably close the race window. Reported-by: syzbot <syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/b9653191-d479-4c8b-8536-1326d028db5c@I-love.SAKURA.ne.jp Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hourspowercap: fix sscanf() error return value handlingSumeet Pawnikar1-3/+3
[ Upstream commit efc4c35b741af973de90f6826bf35d3b3ac36bf1 ] Fix inconsistent error handling for sscanf() return value check. Implicit boolean conversion is used instead of explicit return value checks. The code checks if (!sscanf(...)) which is incorrect because: 1. sscanf returns the number of successfully parsed items 2. On success, it returns 1 (one item passed) 3. On failure, it returns 0 or EOF 4. The check 'if (!sscanf(...))' is wrong because it treats success (1) as failure All occurrences of sscanf() now uses explicit return value check. With this behavior it returns '-EINVAL' when parsing fails (returns 0 or EOF), and continues when parsing succeeds (returns 1). Signed-off-by: Sumeet Pawnikar <sumeet4linux@gmail.com> [ rjw: Subject and changelog edits ] Link: https://patch.msgid.link/20251207151549.202452-1-sumeet4linux@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hourspowercap: fix race condition in register_control_type()Sumeet Pawnikar1-5/+11
[ Upstream commit 7bda1910c4bccd4b8d4726620bb3d6bbfb62286e ] The device becomes visible to userspace via device_register() even before it fully initialized by idr_init(). If userspace or another thread tries to register a zone immediately after device_register(), the control_type_valid() will fail because the control_type is not yet in the list. The IDR is not yet initialized, so this race condition causes zone registration failure. Move idr_init() and list addition before device_register() fix the race condition. Signed-off-by: Sumeet Pawnikar <sumeet4linux@gmail.com> [ rjw: Subject adjustment, empty line added ] Link: https://patch.msgid.link/20251205190216.5032-1-sumeet4linux@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursbpf: Fix reference count leak in bpf_prog_test_run_xdp()Tetsuo Handa1-3/+4
[ Upstream commit ec69daabe45256f98ac86c651b8ad1b2574489a7 ] syzbot is reporting unregister_netdevice: waiting for sit0 to become free. Usage count = 2 problem. A debug printk() patch found that a refcount is obtained at xdp_convert_md_to_buff() from bpf_prog_test_run_xdp(). According to commit ec94670fcb3b ("bpf: Support specifying ingress via xdp_md context in BPF_PROG_TEST_RUN"), the refcount obtained by xdp_convert_md_to_buff() will be released by xdp_convert_buff_to_md(). Therefore, we can consider that the error handling path introduced by commit 1c1949982524 ("bpf: introduce frags support to bpf_prog_test_run_xdp()") forgot to call xdp_convert_buff_to_md(). Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Fixes: 1c1949982524 ("bpf: introduce frags support to bpf_prog_test_run_xdp()") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/af090e53-9d9b-4412-8acb-957733b3975c@I-love.SAKURA.ne.jp Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursbpf, test_run: Subtract size of xdp_frame from allowed metadata sizeToke Høiland-Jørgensen1-5/+13
[ Upstream commit e558cca217790286e799a8baacd1610bda31b261 ] The xdp_frame structure takes up part of the XDP frame headroom, limiting the size of the metadata. However, in bpf_test_run, we don't take this into account, which makes it possible for userspace to supply a metadata size that is too large (taking up the entire headroom). If userspace supplies such a large metadata size in live packet mode, the xdp_update_frame_from_buff() call in xdp_test_run_init_page() call will fail, after which packet transmission proceeds with an uninitialised frame structure, leading to the usual Bad Stuff. The commit in the Fixes tag fixed a related bug where the second check in xdp_update_frame_from_buff() could fail, but did not add any additional constraints on the metadata size. Complete the fix by adding an additional check on the metadata size. Reorder the checks slightly to make the logic clearer and add a comment. Link: https://lore.kernel.org/r/fa2be179-bad7-4ee3-8668-4903d1853461@hust.edu.cn Fixes: b6f1f780b393 ("bpf, test_run: Fix packet size check for live packet mode") Reported-by: Yinhao Hu <dddddd@hust.edu.cn> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260105114747.1358750-1-toke@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursbpf: Support specifying linear xdp packet data size for BPF_PROG_TEST_RUNAmery Hung2-6/+13
[ Upstream commit fe9544ed1a2e9217b2c5285c3a4ac0dc5a38bd7b ] To test bpf_xdp_pull_data(), an xdp packet containing fragments as well as free linear data area after xdp->data_end needs to be created. However, bpf_prog_test_run_xdp() always fills the linear area with data_in before creating fragments, leaving no space to pull data. This patch will allow users to specify the linear data size through ctx->data_end. Currently, ctx_in->data_end must match data_size_in and will not be the final ctx->data_end seen by xdp programs. This is because ctx->data_end is populated according to the xdp_buff passed to test_run. The linear data area available in an xdp_buff, max_linear_sz, is alawys filled up before copying data_in into fragments. This patch will allow users to specify the size of data that goes into the linear area. When ctx_in->data_end is different from data_size_in, only ctx_in->data_end bytes of data will be put into the linear area when creating the xdp_buff. While ctx_in->data_end will be allowed to be different from data_size_in, it cannot be larger than the data_size_in as there will be no data to copy from user space. If it is larger than the maximum linear data area size, the layout suggested by the user will not be honored. Data beyond max_linear_sz bytes will still be copied into fragments. Finally, since it is possible for a NIC to produce a xdp_buff with empty linear data area, allow it when calling bpf_test_init() from bpf_prog_test_run_xdp() so that we can test XDP kfuncs with such xdp_buff. This is done by moving lower-bound check to callers as most of them already do except bpf_prog_test_run_skb(). The change also fixes a bug that allows passing an xdp_buff with data < ETH_HLEN. This can happen when ctx is used and metadata is at least ETH_HLEN. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250922233356.3356453-7-ameryhung@gmail.com Stable-dep-of: e558cca21779 ("bpf, test_run: Subtract size of xdp_frame from allowed metadata size") Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursbpf: Make variables in bpf_prog_test_run_xdp less confusingAmery Hung1-13/+13
[ Upstream commit 7eb83bff02ad5e82e8c456c58717ef181c220870 ] Change the variable naming in bpf_prog_test_run_xdp() to make the overall logic less confusing. As different modes were added to the function over the time, some variables got overloaded, making it hard to understand and changing the code becomes error-prone. Replace "size" with "linear_sz" where it refers to the size of metadata and data. If "size" refers to input data size, use test.data_size_in directly. Replace "max_data_sz" with "max_linear_sz" to better reflect the fact that it is the maximum size of metadata and data (i.e., linear_sz). Also, xdp_rxq.frags_size is always PAGE_SIZE, so just set it directly instead of subtracting headroom and tailroom and adding them back. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250922233356.3356453-6-ameryhung@gmail.com Stable-dep-of: e558cca21779 ("bpf, test_run: Subtract size of xdp_frame from allowed metadata size") Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursbpf: Fix an issue in bpf_prog_test_run_xdp when page size greater than 4KYonghong Song3-9/+97
[ Upstream commit 4fc012daf9c074772421c904357abf586336b1ca ] The bpf selftest xdp_adjust_tail/xdp_adjust_frags_tail_grow failed on arm64 with 64KB page: xdp_adjust_tail/xdp_adjust_frags_tail_grow:FAIL In bpf_prog_test_run_xdp(), the xdp->frame_sz is set to 4K, but later on when constructing frags, with 64K page size, the frag data_len could be more than 4K. This will cause problems in bpf_xdp_frags_increase_tail(). To fix the failure, the xdp->frame_sz is set to be PAGE_SIZE so kernel can test different page size properly. With the kernel change, the user space and bpf prog needs adjustment. Currently, the MAX_SKB_FRAGS default value is 17, so for 4K page, the maximum packet size will be less than 68K. To test 64K page, a bigger maximum packet size than 68K is desired. So two different functions are implemented for subtest xdp_adjust_frags_tail_grow. Depending on different page size, different data input/output sizes are used to adapt with different page size. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250612035032.2207498-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: e558cca21779 ("bpf, test_run: Subtract size of xdp_frame from allowed metadata size") Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursNFSD: Remove NFSERR_EAGAINChuck Lever5-6/+1
[ Upstream commit c6c209ceb87f64a6ceebe61761951dcbbf4a0baa ] I haven't found an NFSERR_EAGAIN in RFCs 1094, 1813, 7530, or 8881. None of these RFCs have an NFS status code that match the numeric value "11". Based on the meaning of the EAGAIN errno, I presume the use of this status in NFSD means NFS4ERR_DELAY. So replace the one usage of nfserr_eagain, and remove it from NFSD's NFS status conversion tables. As far as I can tell, NFSERR_EAGAIN has existed since the pre-git era, but was not actually used by any code until commit f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed."), at which time it become possible for NFSD to return a status code of 11 (which is not valid NFS protocol). Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Cc: stable@vger.kernel.org Reviewed-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursnfs_common: factor out nfs_errtbl and nfs_stat_to_errnoMike Snitzer8-160/+109
[ Upstream commit 4806ded4c14c5e8fdc6ce885d83221a78c06a428 ] Common nfs_stat_to_errno() is used by both fs/nfs/nfs2xdr.c and fs/nfs/nfs3xdr.c Will also be used by fs/nfsd/localio.c Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Stable-dep-of: c6c209ceb87f ("NFSD: Remove NFSERR_EAGAIN") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursNFS: trace: show TIMEDOUT instead of 0x6eChen Hanxiao1-0/+1
[ Upstream commit cef48236dfe55fa266d505e8a497963a7bc5ef2a ] __nfs_revalidate_inode may return ETIMEDOUT. print symbol of ETIMEDOUT in nfs trace: before: cat-5191 [005] 119.331127: nfs_revalidate_inode_exit: error=-110 (0x6e) after: cat-1738 [004] 44.365509: nfs_revalidate_inode_exit: error=-110 (TIMEDOUT) Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Stable-dep-of: c6c209ceb87f ("NFSD: Remove NFSERR_EAGAIN") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursnfsd: provide locking for v4_end_graceNeilBrown4-5/+44
[ Upstream commit 2857bd59feb63fcf40fe4baf55401baea6b4feb4 ] Writing to v4_end_grace can race with server shutdown and result in memory being accessed after it was freed - reclaim_str_hashtbl in particularly. We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is held while client_tracking_op->init() is called and that can wait for an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a deadlock. nfsd4_end_grace() is also called by the landromat work queue and this doesn't require locking as server shutdown will stop the work and wait for it before freeing anything that nfsd4_end_grace() might access. However, we must be sure that writing to v4_end_grace doesn't restart the work item after shutdown has already waited for it. For this we add a new flag protected with nn->client_lock. It is set only while it is safe to make client tracking calls, and v4_end_grace only schedules work while the flag is set with the spinlock held. So this patch adds a nfsd_net field "client_tracking_active" which is set as described. Another field "grace_end_forced", is set when v4_end_grace is written. After this is set, and providing client_tracking_active is set, the laundromat is scheduled. This "grace_end_forced" field bypasses other checks for whether the grace period has finished. This resolves a race which can result in use-after-free. Reported-by: Li Lingfeng <lilingfeng3@huawei.com> Closes: https://lore.kernel.org/linux-nfs/20250623030015.2353515-1-neil@brown.name/T/#t Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neil@brown.name> Tested-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [ Adjust context ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursALSA: ac97: fix a double free in snd_ac97_controller_register()Haoxiang Li1-5/+5
[ Upstream commit 830988b6cf197e6dcffdfe2008c5738e6c6c3c0f ] If ac97_add_adapter() fails, put_device() is the correct way to drop the device reference. kfree() is not required. Add kfree() if idr_alloc() fails and in ac97_adapter_release() to do the cleanup. Found by code review. Fixes: 74426fbff66e ("ALSA: ac97: add an ac97 bus") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Link: https://patch.msgid.link/20251219162845.657525-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursALSA: ac97bus: Use guard() for mutex locksTakashi Iwai1-13/+9
[ Upstream commit c07824a14d99c10edd4ec4c389d219af336ecf20 ] Replace the manual mutex lock/unlock pairs with guard() for code simplification. Only code refactoring, and no behavior change. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250829151335.7342-18-tiwai@suse.de Stable-dep-of: 830988b6cf19 ("ALSA: ac97: fix a double free in snd_ac97_controller_register()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursksm: use range-walk function to jump over holes in scan_get_next_rmap_itemPedro Demarchi Gomes1-13/+113
[ Upstream commit f5548c318d6520d4fa3c5ed6003eeb710763cbc5 ] Currently, scan_get_next_rmap_item() walks every page address in a VMA to locate mergeable pages. This becomes highly inefficient when scanning large virtual memory areas that contain mostly unmapped regions, causing ksmd to use large amount of cpu without deduplicating much pages. This patch replaces the per-address lookup with a range walk using walk_page_range(). The range walker allows KSM to skip over entire unmapped holes in a VMA, avoiding unnecessary lookups. This problem was previously discussed in [1]. Consider the following test program which creates a 32 TiB mapping in the virtual address space but only populates a single page: /* 32 TiB */ const size_t size = 32ul * 1024 * 1024 * 1024 * 1024; int main() { char *area = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_NORESERVE | MAP_PRIVATE | MAP_ANON, -1, 0); if (area == MAP_FAILED) { perror("mmap() failed\n"); return -1; } /* Populate a single page such that we get an anon_vma. */ *area = 0; /* Enable KSM. */ madvise(area, size, MADV_MERGEABLE); pause(); return 0; } $ ./ksm-sparse & $ echo 1 > /sys/kernel/mm/ksm/run Without this patch ksmd uses 100% of the cpu for a long time (more then 1 hour in my test machine) scanning all the 32 TiB virtual address space that contain only one mapped page. This makes ksmd essentially deadlocked not able to deduplicate anything of value. With this patch ksmd walks only the one mapped page and skips the rest of the 32 TiB virtual address space, making the scan fast using little cpu. Link: https://lkml.kernel.org/r/20251023035841.41406-1-pedrodemargomes@gmail.com Link: https://lkml.kernel.org/r/20251022153059.22763-1-pedrodemargomes@gmail.com Link: https://lore.kernel.org/linux-mm/423de7a3-1c62-4e72-8e79-19a6413e420c@redhat.com/ [1] Fixes: 31dbd01f3143 ("ksm: Kernel SamePage Merging") Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Co-developed-by: David Hildenbrand <david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: craftfever <craftfever@airmail.cc> Closes: https://lkml.kernel.org/r/020cf8de6e773bb78ba7614ef250129f11a63781@murena.io Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: xu xin <xu.xin16@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ replace pmdp_get_lockless with pmd_read_atomic and pmdp_get with READ_ONCE(*pmdp) ] Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursmm/pagewalk: add walk_page_range_vma()David Hildenbrand2-0/+23
[ Upstream commit e07cda5f232fac4de0925d8a4c92e51e41fa2f6e ] Let's add walk_page_range_vma(), which is similar to walk_page_vma(), however, is only interested in a subset of the VMA range. To be used in KSM code to stop using follow_page() next. Link: https://lkml.kernel.org/r/20221021101141.84170-8-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: f5548c318d6 ("ksm: use range-walk function to jump over holes in scan_get_next_rmap_item") Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hourspinctrl: qcom: lpass-lpi: mark the GPIO controller as sleepingBartosz Golaszewski1-1/+1
commit ebc18e9854e5a2b62a041fb57b216a903af45b85 upstream. The gpio_chip settings in this driver say the controller can't sleep but it actually uses a mutex for synchronization. This triggers the following BUG(): [ 9.233659] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 [ 9.233665] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 554, name: (udev-worker) [ 9.233669] preempt_count: 1, expected: 0 [ 9.233673] RCU nest depth: 0, expected: 0 [ 9.233688] Tainted: [W]=WARN [ 9.233690] Hardware name: Dell Inc. Latitude 7455/0FK7MX, BIOS 2.10.1 05/20/2025 [ 9.233694] Call trace: [ 9.233696] show_stack+0x24/0x38 (C) [ 9.233709] dump_stack_lvl+0x40/0x88 [ 9.233716] dump_stack+0x18/0x24 [ 9.233722] __might_resched+0x148/0x160 [ 9.233731] __might_sleep+0x38/0x98 [ 9.233736] mutex_lock+0x30/0xd8 [ 9.233749] lpi_config_set+0x2e8/0x3c8 [pinctrl_lpass_lpi] [ 9.233757] lpi_gpio_direction_output+0x58/0x90 [pinctrl_lpass_lpi] [ 9.233761] gpiod_direction_output_raw_commit+0x110/0x428 [ 9.233772] gpiod_direction_output_nonotify+0x234/0x358 [ 9.233779] gpiod_direction_output+0x38/0xd0 [ 9.233786] gpio_shared_proxy_direction_output+0xb8/0x2a8 [gpio_shared_proxy] [ 9.233792] gpiod_direction_output_raw_commit+0x110/0x428 [ 9.233799] gpiod_direction_output_nonotify+0x234/0x358 [ 9.233806] gpiod_configure_flags+0x2c0/0x580 [ 9.233812] gpiod_find_and_request+0x358/0x4f8 [ 9.233819] gpiod_get_index+0x7c/0x98 [ 9.233826] devm_gpiod_get+0x34/0xb0 [ 9.233829] reset_gpio_probe+0x58/0x128 [reset_gpio] [ 9.233836] auxiliary_bus_probe+0xb0/0xf0 [ 9.233845] really_probe+0x14c/0x450 [ 9.233853] __driver_probe_device+0xb0/0x188 [ 9.233858] driver_probe_device+0x4c/0x250 [ 9.233863] __driver_attach+0xf8/0x2a0 [ 9.233868] bus_for_each_dev+0xf8/0x158 [ 9.233872] driver_attach+0x30/0x48 [ 9.233876] bus_add_driver+0x158/0x2b8 [ 9.233880] driver_register+0x74/0x118 [ 9.233886] __auxiliary_driver_register+0x94/0xe8 [ 9.233893] init_module+0x34/0xfd0 [reset_gpio] [ 9.233898] do_one_initcall+0xec/0x300 [ 9.233903] do_init_module+0x64/0x260 [ 9.233910] load_module+0x16c4/0x1900 [ 9.233915] __arm64_sys_finit_module+0x24c/0x378 [ 9.233919] invoke_syscall+0x4c/0xe8 [ 9.233925] el0_svc_common+0x8c/0xf0 [ 9.233929] do_el0_svc+0x28/0x40 [ 9.233934] el0_svc+0x38/0x100 [ 9.233938] el0t_64_sync_handler+0x84/0x130 [ 9.233943] el0t_64_sync+0x17c/0x180 Mark the controller as sleeping. Fixes: 6e261d1090d6 ("pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver") Cc: stable@vger.kernel.org Reported-by: Val Packett <val@packett.cool> Closes: https://lore.kernel.org/all/98c0f185-b0e0-49ea-896c-f3972dd011ca@packett.cool/ Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Linus Walleij <linusw@kernel.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
43 hoursarp: do not assume dev_hard_header() does not change skb->headEric Dumazet1-3/+4
[ Upstream commit c92510f5e3f82ba11c95991824a41e59a9c5ed81 ] arp_create() is the only dev_hard_header() caller making assumption about skb->head being unchanged. A recent commit broke this assumption. Initialize @arp pointer after dev_hard_header() call. Fixes: db5b4e39c4e6 ("ip6_gre: make ip6gre_header() robust") Reported-by: syzbot+58b44a770a1585795351@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260107212250.384552-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet: enetc: fix build warning when PAGE_SIZE is greater than 128KWei Fang1-2/+2
[ Upstream commit 4b5bdabb5449b652122e43f507f73789041d4abe ] The max buffer size of ENETC RX BD is 0xFFFF bytes, so if the PAGE_SIZE is greater than 128K, ENETC_RXB_DMA_SIZE and ENETC_RXB_DMA_SIZE_XDP will be greater than 0xFFFF, thus causing a build warning. This will not cause any practical issues because ENETC is currently only used on the ARM64 platform, and the max PAGE_SIZE is 64K. So this patch is only for fixing the build warning that occurs when compiling ENETC drivers for other platforms. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202601050637.kHEKKOG7-lkp@intel.com/ Fixes: e59bc32df2e9 ("net: enetc: correct the value of ENETC_RXB_TRUESIZE") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260107091204.1980222-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet: usb: pegasus: fix memory leak in update_eth_regs_async()Petko Manolov1-0/+2
[ Upstream commit afa27621a28af317523e0836dad430bec551eb54 ] When asynchronously writing to the device registers and if usb_submit_urb() fail, the code fail to release allocated to this point resources. Fixes: 323b34963d11 ("drivers: net: usb: pegasus: fix control urb submission") Signed-off-by: Petko Manolov <petkan@nucleusys.com> Link: https://patch.msgid.link/20260106084821.3746677-1-petko.manolov@konsulko.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet/sched: sch_qfq: Fix NULL deref when deactivating inactive aggregate in ↵Xiang Mei1-1/+1
qfq_reset [ Upstream commit c1d73b1480235731e35c81df70b08f4714a7d095 ] `qfq_class->leaf_qdisc->q.qlen > 0` does not imply that the class itself is active. Two qfq_class objects may point to the same leaf_qdisc. This happens when: 1. one QFQ qdisc is attached to the dev as the root qdisc, and 2. another QFQ qdisc is temporarily referenced (e.g., via qdisc_get() / qdisc_put()) and is pending to be destroyed, as in function tc_new_tfilter. When packets are enqueued through the root QFQ qdisc, the shared leaf_qdisc->q.qlen increases. At the same time, the second QFQ qdisc triggers qdisc_put and qdisc_destroy: the qdisc enters qfq_reset() with its own q->q.qlen == 0, but its class's leaf qdisc->q.qlen > 0. Therefore, the qfq_reset would wrongly deactivate an inactive aggregate and trigger a null-deref in qfq_deactivate_agg: [ 0.903172] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 0.903571] #PF: supervisor write access in kernel mode [ 0.903860] #PF: error_code(0x0002) - not-present page [ 0.904177] PGD 10299b067 P4D 10299b067 PUD 10299c067 PMD 0 [ 0.904502] Oops: Oops: 0002 [#1] SMP NOPTI [ 0.904737] CPU: 0 UID: 0 PID: 135 Comm: exploit Not tainted 6.19.0-rc3+ #2 NONE [ 0.905157] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 [ 0.905754] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:992 (discriminator 2) include/linux/list.h:1006 (discriminator 2) net/sched/sch_qfq.c:1367 (discriminator 2) net/sched/sch_qfq.c:1393 (discriminator 2)) [ 0.906046] Code: 0f 84 4d 01 00 00 48 89 70 18 8b 4b 10 48 c7 c2 ff ff ff ff 48 8b 78 08 48 d3 e2 48 21 f2 48 2b 13 48 8b 30 48 d3 ea 8b 4b 18 0 Code starting with the faulting instruction =========================================== 0: 0f 84 4d 01 00 00 je 0x153 6: 48 89 70 18 mov %rsi,0x18(%rax) a: 8b 4b 10 mov 0x10(%rbx),%ecx d: 48 c7 c2 ff ff ff ff mov $0xffffffffffffffff,%rdx 14: 48 8b 78 08 mov 0x8(%rax),%rdi 18: 48 d3 e2 shl %cl,%rdx 1b: 48 21 f2 and %rsi,%rdx 1e: 48 2b 13 sub (%rbx),%rdx 21: 48 8b 30 mov (%rax),%rsi 24: 48 d3 ea shr %cl,%rdx 27: 8b 4b 18 mov 0x18(%rbx),%ecx ... [ 0.907095] RSP: 0018:ffffc900004a39a0 EFLAGS: 00010246 [ 0.907368] RAX: ffff8881043a0880 RBX: ffff888102953340 RCX: 0000000000000000 [ 0.907723] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 0.908100] RBP: ffff888102952180 R08: 0000000000000000 R09: 0000000000000000 [ 0.908451] R10: ffff8881043a0000 R11: 0000000000000000 R12: ffff888102952000 [ 0.908804] R13: ffff888102952180 R14: ffff8881043a0ad8 R15: ffff8881043a0880 [ 0.909179] FS: 000000002a1a0380(0000) GS:ffff888196d8d000(0000) knlGS:0000000000000000 [ 0.909572] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.909857] CR2: 0000000000000000 CR3: 0000000102993002 CR4: 0000000000772ef0 [ 0.910247] PKRU: 55555554 [ 0.910391] Call Trace: [ 0.910527] <TASK> [ 0.910638] qfq_reset_qdisc (net/sched/sch_qfq.c:357 net/sched/sch_qfq.c:1485) [ 0.910826] qdisc_reset (include/linux/skbuff.h:2195 include/linux/skbuff.h:2501 include/linux/skbuff.h:3424 include/linux/skbuff.h:3430 net/sched/sch_generic.c:1036) [ 0.911040] __qdisc_destroy (net/sched/sch_generic.c:1076) [ 0.911236] tc_new_tfilter (net/sched/cls_api.c:2447) [ 0.911447] rtnetlink_rcv_msg (net/core/rtnetlink.c:6958) [ 0.911663] ? __pfx_rtnetlink_rcv_msg (net/core/rtnetlink.c:6861) [ 0.911894] netlink_rcv_skb (net/netlink/af_netlink.c:2550) [ 0.912100] netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344) [ 0.912296] ? __alloc_skb (net/core/skbuff.c:706) [ 0.912484] netlink_sendmsg (net/netlink/af_netlink.c:1894) [ 0.912682] sock_write_iter (net/socket.c:727 (discriminator 1) net/socket.c:742 (discriminator 1) net/socket.c:1195 (discriminator 1)) [ 0.912880] vfs_write (fs/read_write.c:593 fs/read_write.c:686) [ 0.913077] ksys_write (fs/read_write.c:738) [ 0.913252] do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1)) [ 0.913438] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131) [ 0.913687] RIP: 0033:0x424c34 [ 0.913844] Code: 89 02 48 c7 c0 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d 2d 44 09 00 00 74 13 b8 01 00 00 00 0f 05 9 Code starting with the faulting instruction =========================================== 0: 89 02 mov %eax,(%rdx) 2: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax 9: eb bd jmp 0xffffffffffffffc8 b: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 12: 00 00 00 15: 90 nop 16: f3 0f 1e fa endbr64 1a: 80 3d 2d 44 09 00 00 cmpb $0x0,0x9442d(%rip) # 0x9444e 21: 74 13 je 0x36 23: b8 01 00 00 00 mov $0x1,%eax 28: 0f 05 syscall 2a: 09 .byte 0x9 [ 0.914807] RSP: 002b:00007ffea1938b78 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ 0.915197] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000000424c34 [ 0.915556] RDX: 000000000000003c RSI: 000000002af378c0 RDI: 0000000000000003 [ 0.915912] RBP: 00007ffea1938bc0 R08: 00000000004b8820 R09: 0000000000000000 [ 0.916297] R10: 0000000000000001 R11: 0000000000000202 R12: 00007ffea1938d28 [ 0.916652] R13: 00007ffea1938d38 R14: 00000000004b3828 R15: 0000000000000001 [ 0.917039] </TASK> [ 0.917158] Modules linked in: [ 0.917316] CR2: 0000000000000000 [ 0.917484] ---[ end trace 0000000000000000 ]--- [ 0.917717] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:992 (discriminator 2) include/linux/list.h:1006 (discriminator 2) net/sched/sch_qfq.c:1367 (discriminator 2) net/sched/sch_qfq.c:1393 (discriminator 2)) [ 0.917978] Code: 0f 84 4d 01 00 00 48 89 70 18 8b 4b 10 48 c7 c2 ff ff ff ff 48 8b 78 08 48 d3 e2 48 21 f2 48 2b 13 48 8b 30 48 d3 ea 8b 4b 18 0 Code starting with the faulting instruction =========================================== 0: 0f 84 4d 01 00 00 je 0x153 6: 48 89 70 18 mov %rsi,0x18(%rax) a: 8b 4b 10 mov 0x10(%rbx),%ecx d: 48 c7 c2 ff ff ff ff mov $0xffffffffffffffff,%rdx 14: 48 8b 78 08 mov 0x8(%rax),%rdi 18: 48 d3 e2 shl %cl,%rdx 1b: 48 21 f2 and %rsi,%rdx 1e: 48 2b 13 sub (%rbx),%rdx 21: 48 8b 30 mov (%rax),%rsi 24: 48 d3 ea shr %cl,%rdx 27: 8b 4b 18 mov 0x18(%rbx),%ecx ... [ 0.918902] RSP: 0018:ffffc900004a39a0 EFLAGS: 00010246 [ 0.919198] RAX: ffff8881043a0880 RBX: ffff888102953340 RCX: 0000000000000000 [ 0.919559] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 0.919908] RBP: ffff888102952180 R08: 0000000000000000 R09: 0000000000000000 [ 0.920289] R10: ffff8881043a0000 R11: 0000000000000000 R12: ffff888102952000 [ 0.920648] R13: ffff888102952180 R14: ffff8881043a0ad8 R15: ffff8881043a0880 [ 0.921014] FS: 000000002a1a0380(0000) GS:ffff888196d8d000(0000) knlGS:0000000000000000 [ 0.921424] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.921710] CR2: 0000000000000000 CR3: 0000000102993002 CR4: 0000000000772ef0 [ 0.922097] PKRU: 55555554 [ 0.922240] Kernel panic - not syncing: Fatal exception [ 0.922590] Kernel Offset: disabled Fixes: 0545a3037773 ("pkt_sched: QFQ - quick fair queue scheduler") Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260106034100.1780779-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursHID: quirks: work around VID/PID conflict for appledisplayRené Rebe1-0/+9
[ Upstream commit c7fabe4ad9219866c203164a214c474c95b36bf2 ] For years I wondered why the Apple Cinema Display driver would not just work for me. Turns out the hidraw driver instantly takes it over. Fix by adding appledisplay VID/PIDs to hid_have_special_driver. Fixes: 069e8a65cd79 ("Driver for Apple Cinema Display") Signed-off-by: René Rebe <rene@exactco.de> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet: fix memory leak in skb_segment_list for GRO packetsMohammad Heib1-3/+5
[ Upstream commit 238e03d0466239410b72294b79494e43d4fabe77 ] When skb_segment_list() is called during packet forwarding, it handles packets that were aggregated by the GRO engine. Historically, the segmentation logic in skb_segment_list assumes that individual segments are split from a parent SKB and may need to carry their own socket memory accounting. Accordingly, the code transfers truesize from the parent to the newly created segments. Prior to commit ed4cccef64c1 ("gro: fix ownership transfer"), this truesize subtraction in skb_segment_list() was valid because fragments still carry a reference to the original socket. However, commit ed4cccef64c1 ("gro: fix ownership transfer") changed this behavior by ensuring that fraglist entries are explicitly orphaned (skb->sk = NULL) to prevent illegal orphaning later in the stack. This change meant that the entire socket memory charge remained with the head SKB, but the corresponding accounting logic in skb_segment_list() was never updated. As a result, the current code unconditionally adds each fragment's truesize to delta_truesize and subtracts it from the parent SKB. Since the fragments are no longer charged to the socket, this subtraction results in an effective under-count of memory when the head is freed. This causes sk_wmem_alloc to remain non-zero, preventing socket destruction and leading to a persistent memory leak. The leak can be observed via KMEMLEAK when tearing down the networking environment: unreferenced object 0xffff8881e6eb9100 (size 2048): comm "ping", pid 6720, jiffies 4295492526 backtrace: kmem_cache_alloc_noprof+0x5c6/0x800 sk_prot_alloc+0x5b/0x220 sk_alloc+0x35/0xa00 inet6_create.part.0+0x303/0x10d0 __sock_create+0x248/0x640 __sys_socket+0x11b/0x1d0 Since skb_segment_list() is exclusively used for SKB_GSO_FRAGLIST packets constructed by GRO, the truesize adjustment is removed. The call to skb_release_head_state() must be preserved. As documented in commit cf673ed0e057 ("net: fix fraglist segmentation reference count leak"), it is still required to correctly drop references to SKB extensions that may be overwritten during __copy_skb_header(). Fixes: ed4cccef64c1 ("gro: fix ownership transfer") Signed-off-by: Mohammad Heib <mheib@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260104213101.352887-1-mheib@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursbnxt_en: Fix potential data corruption with HW GRO/LROSrijit Bose2-6/+13
[ Upstream commit ffeafa65b2b26df2f5b5a6118d3174f17bd12ec5 ] Fix the max number of bits passed to find_first_zero_bit() in bnxt_alloc_agg_idx(). We were incorrectly passing the number of long words. find_first_zero_bit() may fail to find a zero bit and cause a wrong ID to be used. If the wrong ID is already in use, this can cause data corruption. Sometimes an error like this can also be seen: bnxt_en 0000:83:00.0 enp131s0np0: TPA end agg_buf 2 != expected agg_bufs 1 Fix it by passing the correct number of bits MAX_TPA_P5. Use DECLARE_BITMAP() to more cleanly define the bitmap. Add a sanity check to warn if a bit cannot be found and reset the ring [MChan]. Fixes: ec4d8e7cf024 ("bnxt_en: Add TPA ID mapping logic for 57500 chips.") Reviewed-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Srijit Bose <srijit.bose@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251231083625.3911652-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hourseth: bnxt: move and rename reset helpersJakub Kicinski1-36/+36
[ Upstream commit fea2993aecd74d5d11ede1ebbd60e478ebfed996 ] Move the reset helpers, subsequent patches will need some of them on the Tx path. While at it rename bnxt_sched_reset(), on more recent chips it schedules a queue reset, instead of a fuller reset. Link: https://lore.kernel.org/r/20230720010440.1967136-2-kuba@kernel.org Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: ffeafa65b2b2 ("bnxt_en: Fix potential data corruption with HW GRO/LRO") Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet: wwan: iosm: Fix memory leak in ipc_mux_deinit()Zilin Guan1-0/+6
[ Upstream commit 92e6e0a87f6860a4710f9494f8c704d498ae60f8 ] Commit 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") allocated memory for pp_qlt in ipc_mux_init() but did not free it in ipc_mux_deinit(). This results in a memory leak when the driver is unloaded. Free the allocated memory in ipc_mux_deinit() to fix the leak. Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Co-developed-by: Jianhao Xu <jianhao.xu@seu.edu.cn> Signed-off-by: Jianhao Xu <jianhao.xu@seu.edu.cn> Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Link: https://patch.msgid.link/20251230071853.1062223-1-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet/mlx5e: Don't print error message due to invalid moduleGal Pressman1-1/+2
[ Upstream commit 144297e2a24e3e54aee1180ec21120ea38822b97 ] Dumping module EEPROM on newer modules is supported through the netlink interface only. Querying with old userspace ethtool (or other tools, such as 'lshw') which still uses the ioctl interface results in an error message that could flood dmesg (in addition to the expected error return value). The original message was added under the assumption that the driver should be able to handle all module types, but now that such flows are easily triggered from userspace, it doesn't serve its purpose. Change the log level of the print in mlx5_query_module_eeprom() to debug. Fixes: bb64143eee8c ("net/mlx5e: Add ethtool support for dump module EEPROM") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20251225132717.358820-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnetdev: preserve NETIF_F_ALL_FOR_ALL across TSO updatesDi Zhu1-1/+2
[ Upstream commit 02d1e1a3f9239cdb3ecf2c6d365fb959d1bf39df ] Directly increment the TSO features incurs a side effect: it will also directly clear the flags in NETIF_F_ALL_FOR_ALL on the master device, which can cause issues such as the inability to enable the nocache copy feature on the bonding driver. The fix is to include NETIF_F_ALL_FOR_ALL in the update mask, thereby preventing it from being cleared. Fixes: b0ce3508b25e ("bonding: allow TSO being set on bonding master") Signed-off-by: Di Zhu <zhud@hygon.cn> Link: https://patch.msgid.link/20251224012224.56185-1-zhud@hygon.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet: sock: fix hardened usercopy panic in sock_recv_errqueueWeiming Shi1-3/+4
[ Upstream commit 2a71a1a8d0ed718b1c7a9ac61f07e5755c47ae20 ] skbuff_fclone_cache was created without defining a usercopy region, [1] unlike skbuff_head_cache which properly whitelists the cb[] field. [2] This causes a usercopy BUG() when CONFIG_HARDENED_USERCOPY is enabled and the kernel attempts to copy sk_buff.cb data to userspace via sock_recv_errqueue() -> put_cmsg(). The crash occurs when: 1. TCP allocates an skb using alloc_skb_fclone() (from skbuff_fclone_cache) [1] 2. The skb is cloned via skb_clone() using the pre-allocated fclone [3] 3. The cloned skb is queued to sk_error_queue for timestamp reporting 4. Userspace reads the error queue via recvmsg(MSG_ERRQUEUE) 5. sock_recv_errqueue() calls put_cmsg() to copy serr->ee from skb->cb [4] 6. __check_heap_object() fails because skbuff_fclone_cache has no usercopy whitelist [5] When cloned skbs allocated from skbuff_fclone_cache are used in the socket error queue, accessing the sock_exterr_skb structure in skb->cb via put_cmsg() triggers a usercopy hardening violation: [ 5.379589] usercopy: Kernel memory exposure attempt detected from SLUB object 'skbuff_fclone_cache' (offset 296, size 16)! [ 5.382796] kernel BUG at mm/usercopy.c:102! [ 5.383923] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI [ 5.384903] CPU: 1 UID: 0 PID: 138 Comm: poc_put_cmsg Not tainted 6.12.57 #7 [ 5.384903] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 5.384903] RIP: 0010:usercopy_abort+0x6c/0x80 [ 5.384903] Code: 1a 86 51 48 c7 c2 40 15 1a 86 41 52 48 c7 c7 c0 15 1a 86 48 0f 45 d6 48 c7 c6 80 15 1a 86 48 89 c1 49 0f 45 f3 e8 84 27 88 ff <0f> 0b 490 [ 5.384903] RSP: 0018:ffffc900006f77a8 EFLAGS: 00010246 [ 5.384903] RAX: 000000000000006f RBX: ffff88800f0ad2a8 RCX: 1ffffffff0f72e74 [ 5.384903] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff87b973a0 [ 5.384903] RBP: 0000000000000010 R08: 0000000000000000 R09: fffffbfff0f72e74 [ 5.384903] R10: 0000000000000003 R11: 79706f6372657375 R12: 0000000000000001 [ 5.384903] R13: ffff88800f0ad2b8 R14: ffffea00003c2b40 R15: ffffea00003c2b00 [ 5.384903] FS: 0000000011bc4380(0000) GS:ffff8880bf100000(0000) knlGS:0000000000000000 [ 5.384903] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5.384903] CR2: 000056aa3b8e5fe4 CR3: 000000000ea26004 CR4: 0000000000770ef0 [ 5.384903] PKRU: 55555554 [ 5.384903] Call Trace: [ 5.384903] <TASK> [ 5.384903] __check_heap_object+0x9a/0xd0 [ 5.384903] __check_object_size+0x46c/0x690 [ 5.384903] put_cmsg+0x129/0x5e0 [ 5.384903] sock_recv_errqueue+0x22f/0x380 [ 5.384903] tls_sw_recvmsg+0x7ed/0x1960 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 [ 5.384903] ? schedule+0x6d/0x270 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 [ 5.384903] ? mutex_unlock+0x81/0xd0 [ 5.384903] ? __pfx_mutex_unlock+0x10/0x10 [ 5.384903] ? __pfx_tls_sw_recvmsg+0x10/0x10 [ 5.384903] ? _raw_spin_lock_irqsave+0x8f/0xf0 [ 5.384903] ? _raw_read_unlock_irqrestore+0x20/0x40 [ 5.384903] ? srso_alias_return_thunk+0x5/0xfbef5 The crash offset 296 corresponds to skb2->cb within skbuff_fclones: - sizeof(struct sk_buff) = 232 - offsetof(struct sk_buff, cb) = 40 - offset of skb2.cb in fclones = 232 + 40 = 272 - crash offset 296 = 272 + 24 (inside sock_exterr_skb.ee) This patch uses a local stack variable as a bounce buffer to avoid the hardened usercopy check failure. [1] https://elixir.bootlin.com/linux/v6.12.62/source/net/ipv4/tcp.c#L885 [2] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5104 [3] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5566 [4] https://elixir.bootlin.com/linux/v6.12.62/source/net/core/skbuff.c#L5491 [5] https://elixir.bootlin.com/linux/v6.12.62/source/mm/slub.c#L5719 Fixes: 6d07d1cd300f ("usercopy: Restrict non-usercopy caches to size 0") Reported-by: Xiang Mei <xmei5@asu.edu> Signed-off-by: Weiming Shi <bestswngs@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20251223203534.1392218-2-bestswngs@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursinet: ping: Fix icmp out countingyuan.gao1-3/+1
[ Upstream commit 4c0856c225b39b1def6c9a6bc56faca79550da13 ] When the ping program uses an IPPROTO_ICMP socket to send ICMP_ECHO messages, ICMP_MIB_OUTMSGS is counted twice. ping_v4_sendmsg ping_v4_push_pending_frames ip_push_pending_frames ip_finish_skb __ip_make_skb icmp_out_count(net, icmp_type); // first count icmp_out_count(sock_net(sk), user_icmph.type); // second count However, when the ping program uses an IPPROTO_RAW socket, ICMP_MIB_OUTMSGS is counted correctly only once. Therefore, the first count should be removed. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Signed-off-by: yuan.gao <yuan.gao@ucloud.cn> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251224063145.3615282-1-yuan.gao@ucloud.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet: mscc: ocelot: Fix crash when adding interface under a lagJerry Wu1-2/+4
[ Upstream commit 34f3ff52cb9fa7dbf04f5c734fcc4cb6ed5d1a95 ] Commit 15faa1f67ab4 ("lan966x: Fix crash when adding interface under a lag") fixed a similar issue in the lan966x driver caused by a NULL pointer dereference. The ocelot_set_aggr_pgids() function in the ocelot driver has similar logic and is susceptible to the same crash. This issue specifically affects the ocelot_vsc7514.c frontend, which leaves unused ports as NULL pointers. The felix_vsc9959.c frontend is unaffected as it uses the DSA framework which registers all ports. Fix this by checking if the port pointer is valid before accessing it. Fixes: 528d3f190c98 ("net: mscc: ocelot: drop the use of the "lags" array") Signed-off-by: Jerry Wu <w.7erry@foxmail.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/tencent_75EF812B305E26B0869C673DD1160866C90A@qq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursbridge: fix C-VLAN preservation in 802.1ad vlan_tunnel egressAlexandre Knecht1-4/+7
[ Upstream commit 3128df6be147768fe536986fbb85db1d37806a9f ] When using an 802.1ad bridge with vlan_tunnel, the C-VLAN tag is incorrectly stripped from frames during egress processing. br_handle_egress_vlan_tunnel() uses skb_vlan_pop() to remove the S-VLAN from hwaccel before VXLAN encapsulation. However, skb_vlan_pop() also moves any "next" VLAN from the payload into hwaccel: /* move next vlan tag to hw accel tag */ __skb_vlan_pop(skb, &vlan_tci); __vlan_hwaccel_put_tag(skb, vlan_proto, vlan_tci); For QinQ frames where the C-VLAN sits in the payload, this moves it to hwaccel where it gets lost during VXLAN encapsulation. Fix by calling __vlan_hwaccel_clear_tag() directly, which clears only the hwaccel S-VLAN and leaves the payload untouched. This path is only taken when vlan_tunnel is enabled and tunnel_info is configured, so 802.1Q bridges are unaffected. Tested with 802.1ad bridge + VXLAN vlan_tunnel, verified C-VLAN preserved in VXLAN payload via tcpdump. Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Signed-off-by: Alexandre Knecht <knecht.alexandre@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20251228020057.2788865-1-knecht.alexandre@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnet: marvell: prestera: fix NULL dereference on devlink_alloc() failureAlok Tiwari1-0/+2
[ Upstream commit a428e0da1248c353557970848994f35fd3f005e2 ] devlink_alloc() may return NULL on allocation failure, but prestera_devlink_alloc() unconditionally calls devlink_priv() on the returned pointer. This leads to a NULL pointer dereference if devlink allocation fails. Add a check for a NULL devlink pointer and return NULL early to avoid the crash. Fixes: 34dd1710f5a3 ("net: marvell: prestera: Add basic devlink support") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Acked-by: Elad Nachman <enachman@marvell.com> Link: https://patch.msgid.link/20251230052124.897012-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnetfilter: nf_conncount: update last_gc only when GC has been performedFernando Fernandez Mancera1-1/+1
[ Upstream commit 7811ba452402d58628e68faedf38745b3d485e3c ] Currently last_gc is being updated everytime a new connection is tracked, that means that it is updated even if a GC wasn't performed. With a sufficiently high packet rate, it is possible to always bypass the GC, causing the list to grow infinitely. Update the last_gc value only when a GC has been actually performed. Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnetfilter: nf_tables: fix memory leak in nf_tables_newrule()Zilin Guan1-1/+2
[ Upstream commit d077e8119ddbb4fca67540f1a52453631a47f221 ] In nf_tables_newrule(), if nft_use_inc() fails, the function jumps to the err_release_rule label without freeing the allocated flow, leading to a memory leak. Fix this by adding a new label err_destroy_flow and jumping to it when nft_use_inc() fails. This ensures that the flow is properly released in this error case. Fixes: 1689f25924ada ("netfilter: nf_tables: report use refcount overflow") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursnetfilter: nft_synproxy: avoid possible data-race on update operationFernando Fernandez Mancera1-3/+3
[ Upstream commit 36a3200575642846a96436d503d46544533bb943 ] During nft_synproxy eval we are reading nf_synproxy_info struct which can be modified on update operation concurrently. As nf_synproxy_info struct fits in 32 bits, use READ_ONCE/WRITE_ONCE annotations. Fixes: ee394f96ad75 ("netfilter: nft_synproxy: add synproxy stateful object support") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursarm64: dts: imx8mp: Fix LAN8740Ai PHY reference clock on DH electronics ↵Marek Vasut1-0/+1
i.MX8M Plus DHCOM [ Upstream commit c63749a7ddc59ac6ec0b05abfa0a21af9f2c1d38 ] Add missing 'clocks' property to LAN8740Ai PHY node, to allow the PHY driver to manage LAN8740Ai CLKIN reference clock supply. This fixes sporadic link bouncing caused by interruptions on the PHY reference clock, by letting the PHY driver manage the reference clock and assure there are no interruptions. This follows the matching PHY driver recommendation described in commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in support") Fixes: 8d6712695bc8 ("arm64: dts: imx8mp: Add support for DH electronics i.MX8M Plus DHCOM and PDK2") Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Tested-by: Christoph Niedermaier <cniedermaier@dh-electronics.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursARM: dts: imx6q-ba16: fix RTC interrupt levelIan Ray1-1/+1
[ Upstream commit e6a4eedd49ce27c16a80506c66a04707e0ee0116 ] RTC interrupt level should be set to "LOW". This was revealed by the introduction of commit: f181987ef477 ("rtc: m41t80: use IRQ flags obtained from fwnode") which changed the way IRQ type is obtained. Fixes: 56c27310c1b4 ("ARM: dts: imx: Add Advantech BA-16 Qseven module") Signed-off-by: Ian Ray <ian.ray@gehealthcare.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursarm64: dts: add off-on-delay-us for usdhc2 regulatorHaibo Chen1-0/+1
[ Upstream commit ca643894a37a25713029b36cfe7d1bae515cac08 ] For SD card, according to the spec requirement, for sd card power reset operation, it need sd card supply voltage to be lower than 0.5v and keep over 1ms, otherwise, next time power back the sd card supply voltage to 3.3v, sd card can't support SD3.0 mode again. To match such requirement on imx8qm-mek board, add 4.8ms delay between sd power off and power on. Fixes: 307fd14d4b14 ("arm64: dts: imx: add imx8qm mek support") Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursscsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure ↵Xingui Yang1-14/+0
scanned in again after probe failed" [ Upstream commit 278712d20bc8ec29d1ad6ef9bdae9000ef2c220c ] This reverts commit ab2068a6fb84751836a84c26ca72b3beb349619d. When probing the exp-attached sata device, libsas/libata will issue a hard reset in sas_probe_sata() -> ata_sas_async_probe(), then a broadcast event will be received after the disk probe fails, and this commit causes the probe will be re-executed on the disk, and a faulty disk may get into an indefinite loop of probe. Therefore, revert this commit, although it can fix some temporary issues with disk probe failure. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251202065627.140361-1-yangxingui@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursscsi: ufs: core: Fix EH failure after W-LUN resume errorBrian Kao1-8/+28
[ Upstream commit b4bb6daf4ac4d4560044ecdd81e93aa2f6acbb06 ] When a W-LUN resume fails, its parent devices in the SCSI hierarchy, including the scsi_target, may be runtime suspended. Subsequently, the error handler in ufshcd_recover_pm_error() fails to set the W-LUN device back to active because the parent target is not active. This results in the following errors: google-ufshcd 3c2d0000.ufs: ufshcd_err_handler started; HBA state eh_fatal; ... ufs_device_wlun 0:0:0:49488: START_STOP failed for power mode: 1, result 40000 ufs_device_wlun 0:0:0:49488: ufshcd_wl_runtime_resume failed: -5 ... ufs_device_wlun 0:0:0:49488: runtime PM trying to activate child device 0:0:0:49488 but parent (target0:0:0) is not active Address this by: 1. Ensuring the W-LUN's parent scsi_target is runtime resumed before attempting to set the W-LUN to active within ufshcd_recover_pm_error(). 2. Explicitly checking for power.runtime_error on the HBA and W-LUN devices before calling pm_runtime_set_active() to clear the error state. 3. Adding pm_runtime_get_sync(hba->dev) in ufshcd_err_handling_prepare() to ensure the HBA itself is active during error recovery, even if a child device resume failed. These changes ensure the device power states are managed correctly during error recovery. Signed-off-by: Brian Kao <powenkao@google.com> Tested-by: Brian Kao <powenkao@google.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251112063214.1195761-1-powenkao@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hoursscsi: ipr: Enable/disable IRQD_NO_BALANCING during resetWen Xiong1-1/+27
[ Upstream commit 6ac3484fb13b2fc7f31cfc7f56093e7d0ce646a5 ] A dynamic remove/add storage adapter test hits EEH on PowerPC: EEH: [c00000000004f75c] __eeh_send_failure_event+0x7c/0x160 EEH: [c000000000048444] eeh_dev_check_failure.part.0+0x254/0x650 EEH: [c008000001650678] eeh_readl+0x60/0x90 [ipr] EEH: [c00800000166746c] ipr_cancel_op+0x2b8/0x524 [ipr] EEH: [c008000001656524] ipr_eh_abort+0x6c/0x130 [ipr] EEH: [c000000000ab0d20] scmd_eh_abort_handler+0x140/0x440 EEH: [c00000000017e558] process_one_work+0x298/0x590 EEH: [c00000000017eef8] worker_thread+0xa8/0x620 EEH: [c00000000018be34] kthread+0x124/0x130 EEH: [c00000000000cd64] ret_from_kernel_thread+0x5c/0x64 A PCIe bus trace reveals that a vector of MSI-X is cleared to 0 by irqbalance daemon. If we disable irqbalance daemon, we won't see the issue. With debug enabled in ipr driver: [ 44.103071] ipr: Entering __ipr_remove [ 44.103083] ipr: Entering ipr_initiate_ioa_bringdown [ 44.103091] ipr: Entering ipr_reset_shutdown_ioa [ 44.103099] ipr: Leaving ipr_reset_shutdown_ioa [ 44.103105] ipr: Leaving ipr_initiate_ioa_bringdown [ 44.149918] ipr: Entering ipr_reset_ucode_download [ 44.149935] ipr: Entering ipr_reset_alert [ 44.150032] ipr: Entering ipr_reset_start_timer [ 44.150038] ipr: Leaving ipr_reset_alert [ 44.244343] scsi 1:2:3:0: alua: Detached [ 44.254300] ipr: Entering ipr_reset_start_bist [ 44.254320] ipr: Entering ipr_reset_start_timer [ 44.254325] ipr: Leaving ipr_reset_start_bist [ 44.364329] scsi 1:2:4:0: alua: Detached [ 45.134341] scsi 1:2:5:0: alua: Detached [ 45.860949] ipr: Entering ipr_reset_shutdown_ioa [ 45.860962] ipr: Leaving ipr_reset_shutdown_ioa [ 45.860966] ipr: Entering ipr_reset_alert [ 45.861028] ipr: Entering ipr_reset_start_timer [ 45.861035] ipr: Leaving ipr_reset_alert [ 45.964302] ipr: Entering ipr_reset_start_bist [ 45.964309] ipr: Entering ipr_reset_start_timer [ 45.964313] ipr: Leaving ipr_reset_start_bist [ 46.264301] ipr: Entering ipr_reset_bist_done [ 46.264309] ipr: Leaving ipr_reset_bist_done During adapter reset, ipr device driver blocks config space access but can't block MMIO access for MSI-X entries. There is very small window: irqbalance daemon kicks in during adapter reset before ipr driver calls pci_restore_state(pdev) to restore MSI-X table. irqbalance daemon reads back all 0 for that MSI-X vector in __pci_read_msi_msg(). irqbalance daemon: msi_domain_set_affinity() ->irq_chip_set_affinity_patent() ->xive_irq_set_affinity() ->irq_chip_compose_msi_msg() ->pseries_msi_compose_msg() ->__pci_read_msi_msg(): read all 0 since didn't call pci_restore_state ->irq_chip_write_msi_msg() -> pci_write_msg_msi(): write 0 to the msix vector entry When ipr driver calls pci_restore_state(pdev) in ipr_reset_restore_cfg_space(), the MSI-X vector entry has been cleared by irqbalance daemon in pci_write_msg_msix(). pci_restore_state() ->__pci_restore_msix_state() Below is the MSI-X table for ipr adapter after irqbalance daemon kicked in during adapter reset: Dump MSIx table: index=0 address_lo=c800 address_hi=10000000 msg_data=0 Dump MSIx table: index=1 address_lo=c810 address_hi=10000000 msg_data=0 Dump MSIx table: index=2 address_lo=c820 address_hi=10000000 msg_data=0 Dump MSIx table: index=3 address_lo=c830 address_hi=10000000 msg_data=0 Dump MSIx table: index=4 address_lo=c840 address_hi=10000000 msg_data=0 Dump MSIx table: index=5 address_lo=c850 address_hi=10000000 msg_data=0 Dump MSIx table: index=6 address_lo=c860 address_hi=10000000 msg_data=0 Dump MSIx table: index=7 address_lo=c870 address_hi=10000000 msg_data=0 Dump MSIx table: index=8 address_lo=0 address_hi=0 msg_data=0 ---------> Hit EEH since msix vector of index=8 are 0 Dump MSIx table: index=9 address_lo=c890 address_hi=10000000 msg_data=0 Dump MSIx table: index=10 address_lo=c8a0 address_hi=10000000 msg_data=0 Dump MSIx table: index=11 address_lo=c8b0 address_hi=10000000 msg_data=0 Dump MSIx table: index=12 address_lo=c8c0 address_hi=10000000 msg_data=0 Dump MSIx table: index=13 address_lo=c8d0 address_hi=10000000 msg_data=0 Dump MSIx table: index=14 address_lo=c8e0 address_hi=10000000 msg_data=0 Dump MSIx table: index=15 address_lo=c8f0 address_hi=10000000 msg_data=0 [ 46.264312] ipr: Entering ipr_reset_restore_cfg_space [ 46.267439] ipr: Entering ipr_fail_all_ops [ 46.267447] ipr: Leaving ipr_fail_all_ops [ 46.267451] ipr: Leaving ipr_reset_restore_cfg_space [ 46.267454] ipr: Entering ipr_ioa_bringdown_done [ 46.267458] ipr: Leaving ipr_ioa_bringdown_done [ 46.267467] ipr: Entering ipr_worker_thread [ 46.267470] ipr: Leaving ipr_worker_thread IRQ balancing is not required during adapter reset. Enable "IRQ_NO_BALANCING" flag before starting adapter reset and disable it after calling pci_restore_state(). The irqbalance daemon is disabled for this short period of time (~2s). Co-developed-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com> Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com> Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com> Link: https://patch.msgid.link/20251028142427.3969819-2-wenxiong@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hourssmb/client: fix NT_STATUS_NO_DATA_DETECTED valueChenXiaoSong1-1/+1
[ Upstream commit a1237c203f1757480dc2f3b930608ee00072d3cc ] This was reported by the KUnit tests in the later patches. See MS-ERREF 2.3.1 STATUS_NO_DATA_DETECTED. Keep it consistent with the value in the documentation. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hourssmb/client: fix NT_STATUS_DEVICE_DOOR_OPEN valueChenXiaoSong1-1/+1
[ Upstream commit b2b50fca34da5ec231008edba798ddf92986bd7f ] This was reported by the KUnit tests in the later patches. See MS-ERREF 2.3.1 STATUS_DEVICE_DOOR_OPEN. Keep it consistent with the value in the documentation. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
43 hourssmb/client: fix NT_STATUS_UNABLE_TO_FREE_VM valueChenXiaoSong1-1/+1
[ Upstream commit 9f99caa8950a76f560a90074e3a4b93cfa8b3d84 ] This was reported by the KUnit tests in the later patches. See MS-ERREF 2.3.1 STATUS_UNABLE_TO_FREE_VM. Keep it consistent with the value in the documentation. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>