summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
7 daysefi/cper: align ARM CPER type with UEFI 2.9A/2.10 specsMauro Carvalho Chehab1-5/+5
[ Upstream commit 96b010536ee020e716d28d9b359a4bcd18800aeb ] Up to UEFI spec 2.9, the type byte of CPER struct for ARM processor was defined simply as: Type at byte offset 4: - Cache error - TLB Error - Bus Error - Micro-architectural Error All other values are reserved Yet, there was no information about how this would be encoded. Spec 2.9A errata corrected it by defining: - Bit 1 - Cache Error - Bit 2 - TLB Error - Bit 3 - Bus Error - Bit 4 - Micro-architectural Error All other values are reserved That actually aligns with the values already defined on older versions at N.2.4.1. Generic Processor Error Section. Spec 2.10 also preserve the same encoding as 2.9A. Adjust CPER and GHES handling code for both generic and ARM processors to properly handle UEFI 2.9A and 2.10 encoding. Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-information Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysefi/cper: Add a new helper function to print bitmasksMauro Carvalho Chehab1-0/+2
[ Upstream commit a976d790f49499ccaa0f991788ad8ebf92e7fd5c ] Add a helper function to print a string with names associated to each bit field. A typical example is: const char * const bits[] = { "bit 3 name", "bit 4 name", "bit 5 name", }; char str[120]; unsigned int bitmask = BIT(3) | BIT(5); #define MASK GENMASK(5,3) cper_bits_to_str(str, sizeof(str), FIELD_GET(MASK, bitmask), bits, ARRAY_SIZE(bits)); The above code fills string "str" with "bit 3 name|bit 5 name". Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysblock: return unsigned int from queue_dma_alignmentChristoph Hellwig1-1/+1
[ Upstream commit ed5db174cf39374215934f21b04639a7a1513023 ] The underlying limit is defined as an unsigned int, so return that from queue_dma_alignment as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241119160932.1327864-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: 2c38ec934ddf ("block: fix cached zone reports on devices with native zone append") Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysblock: fix comment for op_is_zone_mgmt() to include RESET_ALLshechenglong1-4/+1
[ Upstream commit 8a32282175c964eb15638e8dfe199fc13c060f67 ] REQ_OP_ZONE_RESET_ALL is a zone management request, and op_is_zone_mgmt() has returned true for it. Update the comment to remove the misleading exception note so the documentation matches the implementation. Fixes: 12a1c9353c47 ("block: fix op_is_zone_mgmt() to handle REQ_OP_ZONE_RESET_ALL") Signed-off-by: shechenglong <shechenglong@xfusion.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysNFS: Fix inheritance of the block sizes when automountingTrond Myklebust1-0/+5
[ Upstream commit 2b092175f5e301cdaa935093edfef2be9defb6df ] Only inherit the block sizes that were actually specified as mount parameters for the parent mount. Fixes: 62a55d088cd8 ("NFS: Additional refactoring for fs_context conversion") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysExpand the type of nfs_fattr->validTrond Myklebust2-28/+28
[ Upstream commit ce60ab3964782df9ba34f0a64c0bc766dd508bde ] We need to be able to track more than 32 attributes per inode. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/1e3405fca54efd0be7c91c1da77917b94f5dfcc4.1748515333.git.bcodding@redhat.com Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Stable-dep-of: 2b092175f5e3 ("NFS: Fix inheritance of the block sizes when automounting") Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnfs/vfs: discard d_exact_alias()NeilBrown1-1/+0
[ Upstream commit 3ff6c8707c9a0116d00982851ec1216a42053ace ] d_exact_alias() is a descendent of d_add_unique() which was introduced 20 years ago mostly likely to work around problems with NFS servers of the time. It is now not used in several situations were it was originally needed and there have been no reports of problems - presumably the old NFS servers have been improved. This only place it is now use is in NFSv4 code and the old problematic servers are thought to have been v2/v3 only. There is no clear benefit in reusing a unhashed() dentry which happens to have the same name as the dentry we are adding. So this patch removes d_exact_alias() and the one place that it is used. Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250226062135.2043651-2-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org> Stable-dep-of: 0f900f11002f ("NFS: Initialise verifiers for visible dentries in _nfs4_open_and_get_state") Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: hsr: create an API to get hsr port typeXiaoliang Yang1-0/+9
[ Upstream commit a0244e76213980f3b9bb5d40b0b6705fcf24230d ] Since the introduction of HSR_PT_INTERLINK in commit 5055cccfc2d1 ("net: hsr: Provide RedBox support (HSR-SAN)"), we see that different port types require different settings for hardware offload, which was not the case before when we only had HSR_PT_SLAVE_A and HSR_PT_SLAVE_B. But there is currently no way to know which port is which type, so create the hsr_get_port_type() API function and export it. When hsr_get_port_type() is called from the device driver, the port can must be found in the HSR port list. An important use case is for this function to work from offloading drivers' NETDEV_CHANGEUPPER handler, which is triggered by hsr_portdev_setup() -> netdev_master_upper_dev_link(). Therefore, we need to move the addition of the hsr_port to the HSR port list prior to calling hsr_portdev_setup(). This makes the error restoration path also more similar to hsr_del_port(), where kfree_rcu(port) is already used. Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Lukasz Majewski <lukma@denx.de> Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Łukasz Majewski <lukma@nabladev.com> Link: https://patch.msgid.link/20251130131657.65080-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 30296ac76426 ("net: dsa: xrs700x: reject unsupported HSR configurations") Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: hsr: Create and export hsr_get_port_ndev()MD Danish Anwar1-0/+17
[ Upstream commit 9c10dd8eed74de9e8adeb820939f8745cd566d4a ] Create an API to get the net_device to the slave port of HSR device. The API will take hsr net_device and enum hsr_port_type for which we want the net_device as arguments. This API can be used by client drivers who support HSR and want to get the net_devcie of slave ports from the hsr device. Export this API for the same. This API needs the enum hsr_port_type to be accessible by the drivers using hsr. Move the enum hsr_port_type from net/hsr/hsr_main.h to include/linux/if_hsr.h for the same. Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Stable-dep-of: 30296ac76426 ("net: dsa: xrs700x: reject unsupported HSR configurations") Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysvfio/pci: Use RCU for error/request triggers to avoid circular lockingAlex Williamson1-2/+8
[ Upstream commit 98693e0897f754e3f51ce6626ed5f785f625ba2b ] Thanks to a device generating an ACS violation during bus reset, lockdep reported the following circular locking issue: CPU0: SET_IRQS (MSI/X): holds igate, acquires memory_lock CPU1: HOT_RESET: holds memory_lock, acquires pci_bus_sem CPU2: AER: holds pci_bus_sem, acquires igate This results in a potential 3-way deadlock. Remove the pci_bus_sem->igate leg of the triangle by using RCU to peek at the eventfd rather than locking it with igate. Fixes: 3be3a074cf5b ("vfio-pci: Don't use device_lock around AER interrupt setup") Signed-off-by: Alex Williamson <alex.williamson@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20251124223623.2770706-1-alex@shazbot.org Signed-off-by: Alex Williamson <alex@shazbot.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysvirtio: fix virtqueue_set_affinity() docsMichael S. Tsirkin1-1/+1
[ Upstream commit 43236d8bbafff94b423afecc4a692dd90602d426 ] Rewrite the comment for better grammar and clarity. Fixes: 75a0a52be3c2 ("virtio: introduce an API to set affinity for a virtqueue") Message-Id: <e317e91bd43b070e5eaec0ebbe60c5749d02e2dd.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysvirtio: fix grammar in virtio_queue_info docsMichael S. Tsirkin1-1/+1
[ Upstream commit 63598fba55ab9d384818fed48dc04006cecf7be4 ] Fix grammar in the description of @ctx Fixes: c502eb85c34e ("virtio: introduce virtio_queue_info struct and find_vqs_info() config op") Message-Id: <a5cf2b92573200bdb1c1927e559d3930d61a4af2.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysvirtio: fix whitespace in virtio_config_opsMichael S. Tsirkin1-1/+1
[ Upstream commit 7831791e77a1cd29528d4dc336ce14466aef5ba6 ] The finalize_features documentation uses a tab between words. Use space instead. Fixes: d16c0cd27331 ("docs: driver-api: virtio: virtio on Linux") Message-Id: <39d7685c82848dc6a876d175e33a1407f6ab3fc1.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysvirtio: fix typo in virtio_device_ready() commentMichael S. Tsirkin1-1/+1
[ Upstream commit 361173f95ae4b726ebbbf0bd594274f5576c4abc ] "coherenct" -> "coherent" Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ") Message-Id: <db286e9a65449347f6584e68c9960fd5ded2b4b0.1763026134.git.mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysbacklight: lp855x: Fix lp855x.h kernel-doc warningsRandy Dunlap1-2/+2
[ Upstream commit 2d45db63260c6ae3cf007361e04a1c41bd265084 ] Add a missing struct short description and a missing leading " *" to lp855x.h to avoid kernel-doc warnings: Warning: include/linux/platform_data/lp855x.h:126 missing initial short description on line: * struct lp855x_platform_data Warning: include/linux/platform_data/lp855x.h:131 bad line: Only valid when mode is PWM_BASED. Fixes: 7be865ab8634 ("backlight: new backlight driver for LP855x devices") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Daniel Thompson (RISCstar) <danielt@kernel.org> Link: https://patch.msgid.link/20251111060916.1995920-1-rdunlap@infradead.org Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 dayswifi: ieee80211: correct FILS status codesRia Thomas1-2/+2
[ Upstream commit 24d4da5c2565313c2ad3c43449937a9351a64407 ] The FILS status codes are set to 108/109, but the IEEE 802.11-2020 spec defines them as 112/113. Update the enum so it matches the specification and keeps the kernel consistent with standard values. Fixes: a3caf7440ded ("cfg80211: Add support for FILS shared key authentication offload") Signed-off-by: Ria Thomas <ria.thomas@morsemicro.com> Reviewed-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Link: https://patch.msgid.link/20251124125637.3936154-1-ria.thomas@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysRAS: Report all ARM processor CPER information to userspaceJason Tian1-3/+13
[ Upstream commit 05954511b73e748d0370549ad9dd9cd95297d97a ] The ARM processor CPER record was added in UEFI v2.6 and remained unchanged up to v2.10. Yet, the original arm_event trace code added by e9279e83ad1f ("trace, ras: add ARM processor error trace event") is incomplete, as it only traces some fields of UAPI 2.6 table N.16, not exporting any information from tables N.17 to N.29 of the record. This is not enough for the user to be able to figure out what has exactly happened or to take appropriate action. According to the UEFI v2.9 specification chapter N2.4.4, the ARM processor error section includes: - several (ERR_INFO_NUM) ARM processor error information structures (Tables N.17 to N.20); - several (CONTEXT_INFO_NUM) ARM processor context information structures (Tables N.21 to N.29); - several vendor specific error information structures. The size is given by Section Length minus the size of the other fields. In addition, it also exports two fields that are parsed by the GHES driver when firmware reports it, e.g.: - error severity - CPU logical index Report all of these information to userspace via a the ARM tracepoint so that userspace can properly record the error and take decisions related to CPU core isolation according to error severity and other info. The updated ARM trace event now contains the following fields: ====================================== ============================= UEFI field on table N.16 ARM Processor trace fields ====================================== ============================= Validation handled when filling data for affinity MPIDR and running state. ERR_INFO_NUM pei_len CONTEXT_INFO_NUM ctx_len Section Length indirectly reported by pei_len, ctx_len and oem_len Error affinity level affinity MPIDR_EL1 mpidr MIDR_EL1 midr Running State running_state PSCI State psci_state Processor Error Information Structure pei_err - count at pei_len Processor Context ctx_err- count at ctx_len Vendor Specific Error Info oem - count at oem_len ====================================== ============================= It should be noted that decoding of tables N.17 to N.29, if needed, will be handled in userspace. That gives more flexibility, as there won't be any need to flood the kernel with micro-architecture specific error decoding. Also, decoding the other fields require a complex logic, and should be done for each of the several values inside the record field. So, let userspace daemons like rasdaemon decode them, parsing such tables and having vendor-specific micro-architecture-specific decoders. [mchehab: modified description, solved merge conflicts and fixed coding style] Signed-off-by: Jason Tian <jason@os.amperecomputing.com> Co-developed-by: Shengwei Luo <luoshengwei@huawei.com> Signed-off-by: Shengwei Luo <luoshengwei@huawei.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Daniel Ferguson <danielf@os.amperecomputing.com> # rebased Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Shiju Jose <shiju.jose@huawei.com> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event") Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysbpf: Fix invalid prog->stats access when update_effective_progs failsPu Lehui1-5/+7
[ Upstream commit 7dc211c1159d991db609bdf4b0fb9033c04adcbc ] Syzkaller triggers an invalid memory access issue following fault injection in update_effective_progs. The issue can be described as follows: __cgroup_bpf_detach update_effective_progs compute_effective_progs bpf_prog_array_alloc <-- fault inject purge_effective_progs /* change to dummy_bpf_prog */ array->items[index] = &dummy_bpf_prog.prog ---softirq start--- __do_softirq ... __cgroup_bpf_run_filter_skb __bpf_prog_run_save_cb bpf_prog_run stats = this_cpu_ptr(prog->stats) /* invalid memory access */ flags = u64_stats_update_begin_irqsave(&stats->syncp) ---softirq end--- static_branch_dec(&cgroup_bpf_enabled_key[atype]) The reason is that fault injection caused update_effective_progs to fail and then changed the original prog into dummy_bpf_prog.prog in purge_effective_progs. Then a softirq came, and accessing the members of dummy_bpf_prog.prog in the softirq triggers invalid mem access. To fix it, skip updating stats when stats is NULL. Fixes: 492ecee892c2 ("bpf: enable program stats") Signed-off-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/r/20251115102343.2200727-1-pulehui@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 dayscoresight: Change device mode to atomic typeLeo Yan1-14/+11
[ Upstream commit 693d1eaca940f277af24c74873ef2313816ff444 ] The device mode is defined as local type. This type cannot promise SMP-safe access. Change to atomic type and impose relax ordering, which ensures the SMP-safe synchronisation and the ordering between the mode setting and relevant operations. Fixes: 22fd532eaa0c ("coresight: etm3x: adding operation mode for etm_enable()") Reviewed-by: Mike Leach <mike.leach@linaro.org> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20251111-arm_coresight_power_management_fix-v6-1-f55553b6c8b3@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysperf: Remove get_perf_callchain() init_nr argumentJosh Poimboeuf1-1/+1
[ Upstream commit e649bcda25b5ae1a30a182cc450f928a0b282c93 ] The 'init_nr' argument has double duty: it's used to initialize both the number of contexts and the number of stack entries. That's confusing and the callers always pass zero anyway. Hard code the zero. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Namhyung Kim <Namhyung@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20250820180428.259565081@kernel.org Stable-dep-of: 23f852daa4ba ("bpf: Fix stackmap overflow check in __bpf_get_stackid()") Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysfirmware: qcom: tzmem: fix qcom_tzmem_policy kernel-docRandy Dunlap1-3/+12
[ Upstream commit edd548dc64a699d71ea4f537f815044e763d01e1 ] Fix kernel-doc warnings by using correct kernel-doc syntax and formatting to prevent warnings: Warning: include/linux/firmware/qcom/qcom_tzmem.h:25 Enum value 'QCOM_TZMEM_POLICY_STATIC' not described in enum 'qcom_tzmem_policy' Warning: ../include/linux/firmware/qcom/qcom_tzmem.h:25 Enum value 'QCOM_TZMEM_POLICY_MULTIPLIER' not described in enum 'qcom_tzmem_policy' Warning: ../include/linux/firmware/qcom/qcom_tzmem.h:25 Enum value 'QCOM_TZMEM_POLICY_ON_DEMAND' not described in enum 'qcom_tzmem_policy' Fixes: 84f5a7b67b61 ("firmware: qcom: add a dedicated TrustZone buffer allocator") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20251017191323.1820167-1-rdunlap@infradead.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysrculist: Add hlist_nulls_replace_rcu() and hlist_nulls_replace_init_rcu()Xuanqiang Luo1-0/+59
[ Upstream commit 9c4609225ec1cb551006d6a03c7c4ad8cb5584c0 ] Add two functions to atomically replace RCU-protected hlist_nulls entries. Keep using WRITE_ONCE() to assign values to ->next and ->pprev, as mentioned in the patch below: commit efd04f8a8b45 ("rcu: Use WRITE_ONCE() for assignments to ->next for rculist_nulls") commit 860c8802ace1 ("rcu: Use WRITE_ONCE() for assignments to ->pprev for hlist_nulls") Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Xuanqiang Luo <luoxuanqiang@kylinos.cn> Link: https://patch.msgid.link/20251015020236.431822-2-xuanqiang.luo@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 1532ed0d0753 ("inet: Avoid ehash lookup race in inet_ehash_insert()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07usb: gadget: udc: fix use-after-free in usb_gadget_state_workJimmy Hu1-0/+5
[ Upstream commit baeb66fbd4201d1c4325074e78b1f557dff89b5b ] A race condition during gadget teardown can lead to a use-after-free in usb_gadget_state_work(), as reported by KASAN: BUG: KASAN: invalid-access in sysfs_notify+0x2c/0xd0 Workqueue: events usb_gadget_state_work The fundamental race occurs because a concurrent event (e.g., an interrupt) can call usb_gadget_set_state() and schedule gadget->work at any time during the cleanup process in usb_del_gadget(). Commit 399a45e5237c ("usb: gadget: core: flush gadget workqueue after device removal") attempted to fix this by moving flush_work() to after device_del(). However, this does not fully solve the race, as a new work item can still be scheduled *after* flush_work() completes but before the gadget's memory is freed, leading to the same use-after-free. This patch fixes the race condition robustly by introducing a 'teardown' flag and a 'state_lock' spinlock to the usb_gadget struct. The flag is set during cleanup in usb_del_gadget() *before* calling flush_work() to prevent any new work from being scheduled once cleanup has commenced. The scheduling site, usb_gadget_set_state(), now checks this flag under the lock before queueing the work, thus safely closing the race window. Fixes: 5702f75375aa9 ("usb: gadget: udc-core: move sysfs_notify() to a workqueue") Cc: stable <stable@kernel.org> Signed-off-by: Jimmy Hu <hhhuuu@google.com> Link: https://patch.msgid.link/20251023054945.233861-1-hhhuuu@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07iio: buffer: support getting dma channel from the bufferNuno Sá1-0/+2
commit a514bb109eada64f798f1c86c17182229cc20fe7 upstream. Add a new buffer accessor .get_dma_dev() in order to get the struct device responsible for actually providing the dma channel. We cannot assume that we can use the parent of the IIO device for mapping the DMA buffer. This becomes important on systems (like the Xilinx/AMD zynqMP Ultrascale) where memory (or part of it) is mapped above the 32 bit range. On such systems and given that a device by default has a dma mask of 32 bits we would then need to rely on bounce buffers (to swiotlb) for mapping memory above the dma mask limit. In the process, add an iio_buffer_get_dma_dev() helper function to get the proper DMA device. Cc: stable@vger.kernel.org Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07iio: buffer-dma: support getting the DMA channelNuno Sá1-0/+1
commit f9c198c3ccaf90a1a265fb2ffa8d4b093c3b0784 upstream. Implement the .get_dma_dev() callback for DMA buffers by returning the device that owns the DMA channel. This allows the core DMABUF infrastructure to properly map DMA buffers using the correct device, avoiding the need for bounce buffers on systems where memory is mapped above the 32-bit range. The function returns the DMA queue's device, which is the actual device responsible for DMA operations in buffer-dma implementations. Cc: stable@vger.kernel.org Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07spi: spi-mem: Add a new controller capabilityMiquel Raynal1-0/+2
[ Upstream commit 1248c9b8d54120950fda10fbeb98fb8932b4d45c ] There are spi devices with multiple frequency limitations depending on the invoked command. We probably do not want to afford running at the lowest supported frequency all the time, so if we want to get the most of our hardware, we need to allow per-operation frequency limitations. Among all the SPI memory controllers, I believe all are capable of changing the spi frequency on the fly. Some of the drivers do not make any frequency setup though. And some others will derive a per chip prescaler value which will be used forever. Actually changing the frequency on the fly is something new in Linux, so we need to carefully flag the drivers which do and do not support it. A controller capability is created for that, and the presence for this capability will always be checked before accepting such pattern. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://patch.msgid.link/20241224-winbond-6-11-rc1-quad-support-v2-2-ad218dbc406f@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org> Stable-dep-of: 40ad64ac25bb ("spi: nxp-fspi: Propagate fwnode in ACPI case as well") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07spi: spi-mem: Extend spi-mem operations with a per-operation maximum frequencyMiquel Raynal1-1/+11
[ Upstream commit 0fefeade90e74bc8f40ab0e460f483565c492e28 ] In the spi subsystem, the bus frequency is derived as follows: - the controller may expose a minimum and maximum operating frequency - the hardware description, through the spi peripheral properties, advise what is the maximum acceptable frequency from a device/wiring point of view. Transfers must be observed at a frequency which fits both (so in practice, the lowest maximum). Actually, this second point mixes two information and already takes the lowest frequency among: - what the spi device is capable of (what is written in the component datasheet) - what the wiring allows (electromagnetic sensibility, crossovers, terminations, antenna effect, etc). This logic works until spi devices are no longer capable of sustaining their highest frequency regardless of the operation. Spi memories are typically subject to such variation. Some devices are capable of spitting their internally stored data (essentially in read mode) at a very fast rate, typically up to 166MHz on Winbond SPI-NAND chips, using "fast" commands. However, some of the low-end operations, such as regular page read-from-cache commands, are more limited and can only be executed at 54MHz at most. This is currently a problem in the SPI-NAND subsystem. Another situation, even if not yet supported, will be with DTR commands, when the data is latched on both edges of the clock. The same chips as mentioned previously are in this case limited to 80MHz. Yet another example might be continuous reads, which, under certain circumstances, can also run at most at 104 or 120MHz. As a matter of fact, the "one frequency per chip" policy is outdated and more fine grain configuration is needed: we need to allow per-operation frequency limitations. So far, all datasheets I encountered advertise a maximum default frequency, which need to be lowered for certain specific operations. So based on the current infrastructure, we can still expect firmware (device trees in general) to continued advertising the same maximum speed which is a mix between the PCB limitations and the chip maximum capability, and expect per-operation lower frequencies when this is relevant. Add a `struct spi_mem_op` member to carry this information. Not providing this field explicitly from upper layers means that there is no further constraint and the default spi device maximum speed will be carried instead. The SPI_MEM_OP() macro is also expanded with an optional frequency argument, because virtually all operations can be subject to such a limitation, and this will allow for a smooth and discrete transition. For controller drivers which do not implement the spi-mem interface, the per-transfer speed is also set acordingly to a lower (than the maximum default) speed when relevant. Acked-by: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://patch.msgid.link/20241224-winbond-6-11-rc1-quad-support-v2-1-ad218dbc406f@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org> Stable-dep-of: 40ad64ac25bb ("spi: nxp-fspi: Propagate fwnode in ACPI case as well") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07spi: spi-mem: Allow specifying the byte order in Octal DTR modeTudor Ambarus1-1/+7
[ Upstream commit 030ace430afcf847f537227afceb22dfe8fb8fc8 ] There are NOR flashes (Macronix) that swap the bytes on a 16-bit boundary when configured in Octal DTR mode. The byte order of 16-bit words is swapped when read or written in Octal Double Transfer Rate (DTR) mode compared to Single Transfer Rate (STR) modes. If one writes D0 D1 D2 D3 bytes using 1-1-1 mode, and uses 8D-8D-8D SPI mode for reading, it will read back D1 D0 D3 D2. Swapping the bytes may introduce some endianness problems. It can affect the boot sequence if the entire boot sequence is not handled in either 8D-8D-8D mode or 1-1-1 mode. Therefore, it is necessary to swap the bytes back to ensure the same byte order as in STR modes. Fortunately there are controllers that could swap the bytes back at runtime, addressing the flash's endianness requirements. Provide a way for the upper layers to specify the byte order in Octal DTR mode. Merge Tudor's patch and add modifications for suiting newer version of Linux kernel. Suggested-by: Michael Walle <mwalle@kernel.org> Signed-off-by: JaimeLiao <jaimeliao@mxic.com.tw> Signed-off-by: AlvinZhou <alvinzhou@mxic.com.tw> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240926141956.2386374-3-alvinzhou.tw@gmail.com Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Stable-dep-of: 40ad64ac25bb ("spi: nxp-fspi: Propagate fwnode in ACPI case as well") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07mailbox: mtk-cmdq: Refine DMA address handling for the command bufferJason-JH Lin1-0/+10
[ Upstream commit a195c7ccfb7a21b8118139835e25936ec8722596 ] GCE can only fetch the command buffer address from a 32-bit register. Some SoCs support a 35-bit command buffer address for GCE, which requires a right shift of 3 bits before setting the address into the 32-bit register. A comment has been added to the header of cmdq_get_shift_pa() to explain this requirement. To prevent the GCE command buffer address from being DMA mapped beyond its supported bit range, the DMA bit mask for the device is set during initialization. Additionally, to ensure the correct shift is applied when setting or reading the register that stores the GCE command buffer address, new APIs, cmdq_convert_gce_addr() and cmdq_revert_gce_addr(), have been introduced for consistent operations on this register. The variable type for the command buffer address has been standardized to dma_addr_t to prevent handling issues caused by type mismatches. Fixes: 0858fde496f8 ("mailbox: cmdq: variablize address shift in platform") Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-01ata: libata-scsi: Fix system suspend for a security locked driveNiklas Cassel1-0/+1
commit b11890683380a36b8488229f818d5e76e8204587 upstream. Commit cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error() status handling") fixed ata_to_sense_error() to properly generate sense key ABORTED COMMAND (without any additional sense code), instead of the previous bogus sense key ILLEGAL REQUEST with the additional sense code UNALIGNED WRITE COMMAND, for a failed command. However, this broke suspend for Security locked drives (drives that have Security enabled, and have not been Security unlocked by boot firmware). The reason for this is that the SCSI disk driver, for the Synchronize Cache command only, treats any sense data with sense key ILLEGAL REQUEST as a successful command (regardless of ASC / ASCQ). After commit cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error() status handling") the code that treats any sense data with sense key ILLEGAL REQUEST as a successful command is no longer applicable, so the command fails, which causes the system suspend to be aborted: sd 1:0:0:0: PM: dpm_run_callback(): scsi_bus_suspend returns -5 sd 1:0:0:0: PM: failed to suspend async: error -5 PM: Some devices failed to suspend, or early wake event detected To make suspend work once again, for a Security locked device only, return sense data LOGICAL UNIT ACCESS NOT AUTHORIZED, the actual sense data which a real SCSI device would have returned if locked. The SCSI disk driver treats this sense data as a successful command. Cc: stable@vger.kernel.org Reported-by: Ilia Baryshnikov <qwelias@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220704 Fixes: cf3fc037623c ("ata: libata-scsi: Fix ata_to_sense_error() status handling") Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24mm/huge_memory: do not change split_huge_page*() target order silentlyZi Yan1-14/+7
commit 77008e1b2ef73249bceb078a321a3ff6bc087afb upstream. Page cache folios from a file system that support large block size (LBS) can have minimal folio order greater than 0, thus a high order folio might not be able to be split down to order-0. Commit e220917fa507 ("mm: split a folio in minimum folio order chunks") bumps the target order of split_huge_page*() to the minimum allowed order when splitting a LBS folio. This causes confusion for some split_huge_page*() callers like memory failure handling code, since they expect after-split folios all have order-0 when split succeeds but in reality get min_order_for_split() order folios and give warnings. Fix it by failing a split if the folio cannot be split to the target order. Rename try_folio_split() to try_folio_split_to_order() to reflect the added new_order parameter. Remove its unused list parameter. [The test poisons LBS folios, which cannot be split to order-0 folios, and also tries to poison all memory. The non split LBS folios take more memory than the test anticipated, leading to OOM. The patch fixed the kernel warning and the test needs some change to avoid OOM.] Link: https://lkml.kernel.org/r/20251017013630.139907-1-ziy@nvidia.com Fixes: e220917fa507 ("mm: split a folio in minimum folio order chunks") Signed-off-by: Zi Yan <ziy@nvidia.com> Reported-by: syzbot+e6367ea2fdab6ed46056@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68d2c943.a70a0220.1b52b.02b3.GAE@google.com/ Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mariano Pache <npache@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24net: netpoll: Individualize the skb poolBreno Leitao1-0/+1
[ Upstream commit 221a9c1df790fa711d65daf5ba05d0addc279153 ] The current implementation of the netpoll system uses a global skb pool, which can lead to inefficient memory usage and waste when targets are disabled or no longer in use. This can result in a significant amount of memory being unnecessarily allocated and retained, potentially causing performance issues and limiting the availability of resources for other system components. Modify the netpoll system to assign a skb pool to each target instead of using a global one. This approach allows for more fine-grained control over memory allocation and deallocation, ensuring that resources are only allocated and retained as needed. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20241114-skb_buffers_v2-v3-1-9be9f52a8b69@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 49c8d2c1f94c ("net: netpoll: fix incorrect refcount handling causing incorrect cleanup") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24KVM: guest_memfd: Remove RCU-protected attribute from slot->gmem.fileYan Zhao1-1/+6
[ Upstream commit 67b43038ce14d6b0673bdffb2052d879065c94ae ] Remove the RCU-protected attribute from slot->gmem.file. No need to use RCU primitives rcu_assign_pointer()/synchronize_rcu() to update this pointer. - slot->gmem.file is updated in 3 places: kvm_gmem_bind(), kvm_gmem_unbind(), kvm_gmem_release(). All of them are protected by kvm->slots_lock. - slot->gmem.file is read in 2 paths: (1) kvm_gmem_populate kvm_gmem_get_file __kvm_gmem_get_pfn (2) kvm_gmem_get_pfn kvm_gmem_get_file __kvm_gmem_get_pfn Path (1) kvm_gmem_populate() requires holding kvm->slots_lock, so slot->gmem.file is protected by the kvm->slots_lock in this path. Path (2) kvm_gmem_get_pfn() does not require holding kvm->slots_lock. However, it's also not guarded by rcu_read_lock() and rcu_read_unlock(). So synchronize_rcu() in kvm_gmem_unbind()/kvm_gmem_release() actually will not wait for the readers in kvm_gmem_get_pfn() due to lack of RCU read-side critical section. The path (2) kvm_gmem_get_pfn() is safe without RCU protection because: a) kvm_gmem_bind() is called on a new memslot, before the memslot is visible to kvm_gmem_get_pfn(). b) kvm->srcu ensures that kvm_gmem_unbind() and freeing of a memslot occur after the memslot is no longer visible to kvm_gmem_get_pfn(). c) get_file_active() ensures that kvm_gmem_get_pfn() will not access the stale file if kvm_gmem_release() sets it to NULL. This is because if kvm_gmem_release() occurs before kvm_gmem_get_pfn(), get_file_active() will return NULL; if get_file_active() does not return NULL, kvm_gmem_release() should not occur until after kvm_gmem_get_pfn() releases the file reference. Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Message-ID: <20241104084303.29909-1-yan.y.zhao@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Stable-dep-of: ae431059e75d ("KVM: guest_memfd: Remove bindings on memslot deletion when gmem is dying") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24dma-mapping: benchmark: Restore padding to ensure uABI remained consistentQinxin Xia1-0/+1
commit 23ee8a2563a0f24cf4964685ced23c32be444ab8 upstream. The padding field in the structure was previously reserved to maintain a stable interface for potential new fields, ensuring compatibility with user-space shared data structures. However,it was accidentally removed by tiantao in a prior commit, which may lead to incompatibility between user space and the kernel. This patch reinstates the padding to restore the original structure layout and preserve compatibility. Fixes: 8ddde07a3d28 ("dma-mapping: benchmark: extract a common header file for map_benchmark definition") Cc: stable@vger.kernel.org Acked-by: Barry Song <baohua@kernel.org> Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com> Reported-by: Barry Song <baohua@kernel.org> Closes: https://lore.kernel.org/lkml/CAGsJ_4waiZ2+NBJG+SCnbNk+nQ_ZF13_Q5FHJqZyxyJTcEop2A@mail.gmail.com/ Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20251028120900.2265511-2-xiaqinxin@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-24bpf: Add bpf_prog_run_data_pointers()Eric Dumazet1-0/+20
[ Upstream commit 4ef92743625818932b9c320152b58274c05e5053 ] syzbot found that cls_bpf_classify() is able to change tc_skb_cb(skb)->drop_reason triggering a warning in sk_skb_reason_drop(). WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 __sk_skb_reason_drop net/core/skbuff.c:1189 [inline] WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 sk_skb_reason_drop+0x76/0x170 net/core/skbuff.c:1214 struct tc_skb_cb has been added in commit ec624fe740b4 ("net/sched: Extend qdisc control block with tc control block"), which added a wrong interaction with db58ba459202 ("bpf: wire in data and data_end for cls_act_bpf"). drop_reason was added later. Add bpf_prog_run_data_pointers() helper to save/restore the net_sched storage colliding with BPF data_meta/data_end. Fixes: ec624fe740b4 ("net/sched: Extend qdisc control block with tc control block") Reported-by: syzbot <syzkaller@googlegroups.com> Closes: https://lore.kernel.org/netdev/6913437c.a70a0220.22f260.013b.GAE@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20251112125516.1563021-1-edumazet@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-24NFS4: Apply delay_retrans to async operationsJoshua Watt1-0/+1
[ Upstream commit 7a84394f02ab1985ebbe0a8d6f6d69bd040de4b3 ] The setting of delay_retrans is applied to synchronous RPC operations because the retransmit count is stored in same struct nfs4_exception that is passed each time an error is checked. However, for asynchronous operations (READ, WRITE, LOCKU, CLOSE, DELEGRETURN), a new struct nfs4_exception is made on the stack each time the task callback is invoked. This means that the retransmit count is always zero and thus delay_retrans never takes effect. Apply delay_retrans to these operations by tracking and updating their retransmit count. Change-Id: Ieb33e046c2b277cb979caa3faca7f52faf0568c9 Signed-off-by: Joshua Watt <jpewhacker@gmail.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-24compiler_types: Move unused static inline functions warning to W=2Peter Zijlstra1-3/+2
[ Upstream commit 9818af18db4bfefd320d0fef41390a616365e6f7 ] Per Nathan, clang catches unused "static inline" functions in C files since commit 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build"). Linus said: > So I entirely ignore W=1 issues, because I think so many of the extra > warnings are bogus. > > But if this one in particular is causing more problems than most - > some teams do seem to use W=1 as part of their test builds - it's fine > to send me a patch that just moves bad warnings to W=2. > > And if anybody uses W=2 for their test builds, that's THEIR problem.. Here is the change to bump the warning from W=1 to W=2. Fixes: 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build") Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20251106105000.2103276-1-andriy.shevchenko@linux.intel.com [nathan: Adjust comment as well] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13dmaengine: sh: setup_xref error handlingThomas Andreatta1-1/+1
[ Upstream commit d9a3e9929452780df16f3414f0d59b5f69d058cf ] This patch modifies the type of setup_xref from void to int and handles errors since the function can fail. `setup_xref` now returns the (eventual) error from `dmae_set_dmars`|`dmae_set_chcr`, while `shdma_tx_submit` handles the result, removing the chunks from the queue and marking PM as idle in case of an error. Signed-off-by: Thomas Andreatta <thomas.andreatta2000@gmail.com> Link: https://lore.kernel.org/r/20250827152442.90962-1-thomas.andreatta2000@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13f2fs: fix to detect potential corrupted nid in free_nid_listChao Yu1-0/+1
[ Upstream commit 8fc6056dcf79937c46c97fa4996cda65956437a9 ] As reported, on-disk footer.ino and footer.nid is the same and out-of-range, let's add sanity check on f2fs_alloc_nid() to detect any potential corruption in free_nid_list. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13s390/pci: Use pci_uevent_ers() in PCI recoveryNiklas Schnelle1-1/+1
[ Upstream commit dab32f2576a39d5f54f3dbbbc718d92fa5109ce9 ] Issue uevents on s390 during PCI recovery using pci_uevent_ers() as done by EEH and AER PCIe recovery routines. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Link: https://patch.msgid.link/20250807-add_err_uevents-v5-2-adf85b0620b0@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13bpf: Do not limit bpf_cgroup_from_id to current's namespaceKumar Kartikeya Dwivedi1-0/+1
[ Upstream commit 2c895133950646f45e5cf3900b168c952c8dbee8 ] The bpf_cgroup_from_id kfunc relies on cgroup_get_from_id to obtain the cgroup corresponding to a given cgroup ID. This helper can be called in a lot of contexts where the current thread can be random. A recent example was its use in sched_ext's ops.tick(), to obtain the root cgroup pointer. Since the current task can be whatever random user space task preempted by the timer tick, this makes the behavior of the helper unreliable. Refactor out __cgroup_get_from_id as the non-namespace aware version of cgroup_get_from_id, and change bpf_cgroup_from_id to make use of it. There is no compatibility breakage here, since changing the namespace against which the lookup is being done to the root cgroup namespace only permits a wider set of lookups to succeed now. The cgroup IDs across namespaces are globally unique, and thus don't need to be retranslated. Reported-by: Dan Schatzberg <dschatzberg@meta.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250915032618.1551762-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13bpf: Use tnums for JEQ/JNE is_branch_taken logicPaul Chaignon1-0/+3
[ Upstream commit f41345f47fb267a9c95ca710c33448f8d0d81d83 ] In the following toy program (reg states minimized for readability), R0 and R1 always have different values at instruction 6. This is obvious when reading the program but cannot be guessed from ranges alone as they overlap (R0 in [0; 0xc0000000], R1 in [1024; 0xc0000400]). 0: call bpf_get_prandom_u32#7 ; R0_w=scalar() 1: w0 = w0 ; R0_w=scalar(var_off=(0x0; 0xffffffff)) 2: r0 >>= 30 ; R0_w=scalar(var_off=(0x0; 0x3)) 3: r0 <<= 30 ; R0_w=scalar(var_off=(0x0; 0xc0000000)) 4: r1 = r0 ; R1_w=scalar(var_off=(0x0; 0xc0000000)) 5: r1 += 1024 ; R1_w=scalar(var_off=(0x400; 0xc0000000)) 6: if r1 != r0 goto pc+1 Looking at tnums however, we can deduce that R1 is always different from R0 because their tnums don't agree on known bits. This patch uses this logic to improve is_scalar_branch_taken in case of BPF_JEQ and BPF_JNE. This change has a tiny impact on complexity, which was measured with the Cilium complexity CI test. That test covers 72 programs with various build and load time configurations for a total of 970 test cases. For 80% of test cases, the patch has no impact. On the other test cases, the patch decreases complexity by only 0.08% on average. In the best case, the verifier needs to walk 3% less instructions and, in the worst case, 1.5% more. Overall, the patch has a small positive impact, especially for our largest programs. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/be3ee70b6e489c49881cb1646114b1d861b5c334.1755694147.git.paul.chaignon@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13bpf: Don't use %pK through printkThomas Weißschuh1-1/+1
[ Upstream commit 2caa6b88e0ba0231fb4ff0ba8e73cedd5fb81fc8 ] In the past %pK was preferable to %p as it would not leak raw pointer values into the kernel log. Since commit ad67b74d2469 ("printk: hash addresses printed with %p") the regular %p has been improved to avoid this issue. Furthermore, restricted pointers ("%pK") were never meant to be used through printk(). They can still unintentionally leak raw pointers or acquire sleeping locks in atomic contexts. Switch to the regular pointer formatting which is safer and easier to reason about. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250811-restricted-pointers-bpf-v1-1-a1d7cc3cb9e7@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13block: make REQ_OP_ZONE_OPEN a write operationDamien Le Moal1-5/+5
commit 19de03b312d69a7e9bacb51c806c6e3f4207376c upstream. A REQ_OP_OPEN_ZONE request changes the condition of a sequential zone of a zoned block device to the explicitly open condition (BLK_ZONE_COND_EXP_OPEN). As such, it should be considered a write operation. Change this operation code to be an odd number to reflect this. The following operation numbers are changed to keep the numbering compact. No problems were reported without this change as this operation has no data. However, this unifies the zone operation to reflect that they modify the device state and also allows strengthening checks in the block layer, e.g. checking if this operation is not issued against a read-only device. Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-13block: fix op_is_zone_mgmt() to handle REQ_OP_ZONE_RESET_ALLDamien Le Moal1-0/+1
commit 12a1c9353c47c0fb3464eba2d78cdf649dee1cf7 upstream. REQ_OP_ZONE_RESET_ALL is a zone management request. Fix op_is_zone_mgmt() to return true for that operation, like it already does for REQ_OP_ZONE_RESET. While no problems were reported without this fix, this change allows strengthening checks in various block device drivers (scsi sd, virtioblk, DM) where op_is_zone_mgmt() is used to verify that a zone management command is not being issued to a regular block device. Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-13bpf: Find eligible subprogs for private stack supportYonghong Song2-0/+8
[ Upstream commit a76ab5731e32d50ff5b1ae97e9dc4b23f41c23f5 ] Private stack will be allocated with percpu allocator in jit time. To avoid complexity at runtime, only one copy of private stack is available per cpu per prog. So runtime recursion check is necessary to avoid stack corruption. Current private stack only supports kprobe/perf_event/tp/raw_tp which has recursion check in the kernel, and prog types that use bpf trampoline recursion check. For trampoline related prog types, currently only tracing progs have recursion checking. To avoid complexity, all async_cb subprogs use normal kernel stack including those subprogs used by both main prog subtree and async_cb subtree. Any prog having tail call also uses kernel stack. To avoid jit penalty with private stack support, a subprog stack size threshold is set such that only if the stack size is no less than the threshold, private stack is supported. The current threshold is 64 bytes. This avoids jit penality if the stack usage is small. A useless 'continue' is also removed from a loop in func check_max_stack_depth(). Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163907.2223839-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: 881a9c9cb785 ("bpf: Do not audit capability check in do_jit()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13fbcon: Set fb_display[i]->mode to NULL when the mode is releasedQuanmin Yan1-0/+2
commit a1f3058930745d2b938b6b4f5bd9630dc74b26b7 upstream. Recently, we discovered the following issue through syzkaller: BUG: KASAN: slab-use-after-free in fb_mode_is_equal+0x285/0x2f0 Read of size 4 at addr ff11000001b3c69c by task syz.xxx ... Call Trace: <TASK> dump_stack_lvl+0xab/0xe0 print_address_description.constprop.0+0x2c/0x390 print_report+0xb9/0x280 kasan_report+0xb8/0xf0 fb_mode_is_equal+0x285/0x2f0 fbcon_mode_deleted+0x129/0x180 fb_set_var+0xe7f/0x11d0 do_fb_ioctl+0x6a0/0x750 fb_ioctl+0xe0/0x140 __x64_sys_ioctl+0x193/0x210 do_syscall_64+0x5f/0x9c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Based on experimentation and analysis, during framebuffer unregistration, only the memory of fb_info->modelist is freed, without setting the corresponding fb_display[i]->mode to NULL for the freed modes. This leads to UAF issues during subsequent accesses. Here's an example of reproduction steps: 1. With /dev/fb0 already registered in the system, load a kernel module to register a new device /dev/fb1; 2. Set fb1's mode to the global fb_display[] array (via FBIOPUT_CON2FBMAP); 3. Switch console from fb to VGA (to allow normal rmmod of the ko); 4. Unload the kernel module, at this point fb1's modelist is freed, leaving a wild pointer in fb_display[]; 5. Trigger the bug via system calls through fb0 attempting to delete a mode from fb0. Add a check in do_unregister_framebuffer(): if the mode to be freed exists in fb_display[], set the corresponding mode pointer to NULL. Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-02gpio: regmap: add the .fixed_direction_output configuration parameterIoana Ciornei1-0/+5
[ Upstream commit 00aaae60faf554c27c95e93d47f200a93ff266ef ] There are GPIO controllers such as the one present in the LX2160ARDB QIXIS FPGA which have fixed-direction input and output GPIO lines mixed together in a single register. This cannot be modeled using the gpio-regmap as-is since there is no way to present the true direction of a GPIO line. In order to make this use case possible, add a new configuration parameter - fixed_direction_output - into the gpio_regmap_config structure. This will enable user drivers to provide a bitmap that represents the fixed direction of the GPIO lines. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Michael Walle <mwalle@kernel.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Stable-dep-of: 2ba5772e530f ("gpio: idio-16: Define fixed direction of the GPIO lines") Signed-off-by: William Breathitt Gray <wbg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-02gpio: regmap: Allow to allocate regmap-irq deviceMathieu Dubois-Briand1-0/+11
[ Upstream commit 553b75d4bfe9264f631d459fe9996744e0672b0e ] GPIO controller often have support for IRQ: allow to easily allocate both gpio-regmap and regmap-irq in one operation. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Mathieu Dubois-Briand <mathieu.dubois-briand@bootlin.com> Link: https://lore.kernel.org/r/20250824-mdb-max7360-support-v14-5-435cfda2b1ea@bootlin.com Signed-off-by: Lee Jones <lee@kernel.org> Stable-dep-of: 2ba5772e530f ("gpio: idio-16: Define fixed direction of the GPIO lines") Signed-off-by: William Breathitt Gray <wbg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-11-02bits: introduce fixed-type GENMASK_U*()Vincent Mailhol2-1/+30
[ Upstream commit 19408200c094858d952a90bf4977733dc89a4df5 ] Add GENMASK_TYPE() which generalizes __GENMASK() to support different types, and implement fixed-types versions of GENMASK() based on it. The fixed-type version allows more strict checks to the min/max values accepted, which is useful for defining registers like implemented by i915 and xe drivers with their REG_GENMASK*() macros. The strict checks rely on shift-count-overflow compiler check to fail the build if a number outside of the range allowed is passed. Example: #define FOO_MASK GENMASK_U32(33, 4) will generate a warning like: include/linux/bits.h:51:27: error: right shift count >= width of type [-Werror=shift-count-overflow] 51 | type_max(t) >> (BITS_PER_TYPE(t) - 1 - (h))))) | ^~ The result is casted to the corresponding fixed width type. For example, GENMASK_U8() returns an u8. Note that because of the C promotion rules, GENMASK_U8() and GENMASK_U16() will immediately be promoted to int if used in an expression. Regardless, the main goal is not to get the correct type, but rather to enforce more checks at compile time. While GENMASK_TYPE() is crafted to cover all variants, including the already existing GENMASK(), GENMASK_ULL() and GENMASK_U128(), for the moment, only use it for the newly introduced GENMASK_U*(). The consolidation will be done in a separate change. Co-developed-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Stable-dep-of: 2ba5772e530f ("gpio: idio-16: Define fixed direction of the GPIO lines") Signed-off-by: William Breathitt Gray <wbg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>