summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2025-08-13cifs: update internal version numberSteve French1-2/+2
to 2.56 Signed-off-by: Steve French <stfrench@microsoft.com>
2025-08-13smb: client: don't wait for info->send_pending == 0 on errorStefan Metzmacher1-5/+5
We already called ib_drain_qp() before and that makes sure send_done() was called with IB_WC_WR_FLUSH_ERR, but didn't called atomic_dec_and_test(&sc->send_io.pending.count) So we may never reach the info->send_pending == 0 condition. Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Fixes: 5349ae5e05fa ("smb: client: let send_done() cleanup before calling smbd_disconnect_rdma_connection()") Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-08-13smb: client: fix mid_q_entry memleak leak with per-mid lockingWang Zhaolong6-20/+40
This is step 4/4 of a patch series to fix mid_q_entry memory leaks caused by race conditions in callback execution. In compound_send_recv(), when wait_for_response() is interrupted by signals, the code attempts to cancel pending requests by changing their callbacks to cifs_cancelled_callback. However, there's a race condition between signal interruption and network response processing that causes both mid_q_entry and server buffer leaks: ``` User foreground process cifsd cifs_readdir open_cached_dir cifs_send_recv compound_send_recv smb2_setup_request smb2_mid_entry_alloc smb2_get_mid_entry smb2_mid_entry_alloc mempool_alloc // alloc mid kref_init(&temp->refcount); // refcount = 1 mid[0]->callback = cifs_compound_callback; mid[1]->callback = cifs_compound_last_callback; smb_send_rqst rc = wait_for_response wait_event_state TASK_KILLABLE cifs_demultiplex_thread allocate_buffers server->bigbuf = cifs_buf_get() standard_receive3 ->find_mid() smb2_find_mid __smb2_find_mid kref_get(&mid->refcount) // +1 cifs_handle_standard handle_mid /* bigbuf will also leak */ mid->resp_buf = server->bigbuf server->bigbuf = NULL; dequeue_mid /* in for loop */ mids[0]->callback cifs_compound_callback /* Signal interrupts wait: rc = -ERESTARTSYS */ /* if (... || midQ[i]->mid_state == MID_RESPONSE_RECEIVED) *? midQ[0]->callback = cifs_cancelled_callback; cancelled_mid[i] = true; /* The change comes too late */ mid->mid_state = MID_RESPONSE_READY release_mid // -1 /* cancelled_mid[i] == true causes mid won't be released in compound_send_recv cleanup */ /* cifs_cancelled_callback won't executed to release mid */ ``` The root cause is that there's a race between callback assignment and execution. Fix this by introducing per-mid locking: - Add spinlock_t mid_lock to struct mid_q_entry - Add mid_execute_callback() for atomic callback execution - Use mid_lock in cancellation paths to ensure atomicity This ensures that either the original callback or the cancellation callback executes atomically, preventing reference count leaks when requests are interrupted by signals. Link: https://bugzilla.kernel.org/show_bug.cgi?id=220404 Fixes: ee258d79159a ("CIFS: Move credit processing to mid callbacks for SMB3") Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-08-13smb3: fix for slab out of bounds on mount to ksmbdSteve French1-1/+10
With KASAN enabled, it is possible to get a slab out of bounds during mount to ksmbd due to missing check in parse_server_interfaces() (see below): BUG: KASAN: slab-out-of-bounds in parse_server_interfaces+0x14ee/0x1880 [cifs] Read of size 4 at addr ffff8881433dba98 by task mount/9827 CPU: 5 UID: 0 PID: 9827 Comm: mount Tainted: G OE 6.16.0-rc2-kasan #2 PREEMPT(voluntary) Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Dell Inc. Precision Tower 3620/0MWYPT, BIOS 2.13.1 06/14/2019 Call Trace: <TASK> dump_stack_lvl+0x9f/0xf0 print_report+0xd1/0x670 __virt_addr_valid+0x22c/0x430 ? parse_server_interfaces+0x14ee/0x1880 [cifs] ? kasan_complete_mode_report_info+0x2a/0x1f0 ? parse_server_interfaces+0x14ee/0x1880 [cifs] kasan_report+0xd6/0x110 parse_server_interfaces+0x14ee/0x1880 [cifs] __asan_report_load_n_noabort+0x13/0x20 parse_server_interfaces+0x14ee/0x1880 [cifs] ? __pfx_parse_server_interfaces+0x10/0x10 [cifs] ? trace_hardirqs_on+0x51/0x60 SMB3_request_interfaces+0x1ad/0x3f0 [cifs] ? __pfx_SMB3_request_interfaces+0x10/0x10 [cifs] ? SMB2_tcon+0x23c/0x15d0 [cifs] smb3_qfs_tcon+0x173/0x2b0 [cifs] ? __pfx_smb3_qfs_tcon+0x10/0x10 [cifs] ? cifs_get_tcon+0x105d/0x2120 [cifs] ? do_raw_spin_unlock+0x5d/0x200 ? cifs_get_tcon+0x105d/0x2120 [cifs] ? __pfx_smb3_qfs_tcon+0x10/0x10 [cifs] cifs_mount_get_tcon+0x369/0xb90 [cifs] ? dfs_cache_find+0xe7/0x150 [cifs] dfs_mount_share+0x985/0x2970 [cifs] ? check_path.constprop.0+0x28/0x50 ? save_trace+0x54/0x370 ? __pfx_dfs_mount_share+0x10/0x10 [cifs] ? __lock_acquire+0xb82/0x2ba0 ? __kasan_check_write+0x18/0x20 cifs_mount+0xbc/0x9e0 [cifs] ? __pfx_cifs_mount+0x10/0x10 [cifs] ? do_raw_spin_unlock+0x5d/0x200 ? cifs_setup_cifs_sb+0x29d/0x810 [cifs] cifs_smb3_do_mount+0x263/0x1990 [cifs] Reported-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Namjae Jeon <linkinjeon@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-08-13Revert "ALSA: hda: Add ASRock X670E Taichi to denylist"Mario Limonciello (AMD)1-1/+0
On a motherboard with an AMD Granite Ridge CPU there is a report that 3.5mm microphone and headphones aren't working. In the log it's observed: snd_hda_intel 0000:02:00.6: Skipping the device on the denylist This was because of commit df42ee7e22f03 ("ALSA: hda: Add ASRock X670E Taichi to denylist"). Reverting this commit allows the microphone and headphones to work again. As at least some combinations of this motherboard do have applicable devices, revert so that they can be probed. Cc: Richard Gong <richard.gong@amd.com> Cc: Juan Martinez <juan.martinez@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Link: https://patch.msgid.link/20250813140427.1577172-1-superm1@kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-13ALSA: azt3328: Put __maybe_unused for inline functions for gameportTakashi Iwai1-4/+4
Some inline functions are unused depending on kconfig, and the recent change for clang builds made those handled as errors with W=1. For avoiding pitfalls, mark those with __maybe_unused attributes. Link: https://patch.msgid.link/20250813153628.12303-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-13rust: faux: fix C header linkMiguel Ojeda1-1/+1
Starting with Rust 1.91.0 (expected 2025-10-30), `rustdoc` has improved some false negatives around intra-doc links [1], and it found a broken intra-doc link we currently have: error: unresolved link to `include/linux/device/faux.h` --> rust/kernel/faux.rs:7:17 | 7 | //! C header: [`include/linux/device/faux.h`] | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ no item named `include/linux/device/faux.h` in scope | = help: to escape `[` and `]` characters, add '\' before them like `\[` or `\]` = note: `-D rustdoc::broken-intra-doc-links` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(rustdoc::broken_intra_doc_links)]` Our `srctree/` C header links are not intra-doc links, thus they need the link destination. Thus fix it. Cc: stable <stable@kernel.org> Link: https://github.com/rust-lang/rust/pull/132748 [1] Fixes: 78418f300d39 ("rust/kernel: Add faux device bindings") Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Benno Lossin <lossin@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20250804171311.1186538-1-ojeda@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13Merge tag 'mm-hotfixes-stable-2025-08-12-20-50' of ↵Linus Torvalds9-32/+57
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "12 hotfixes. 5 are cc:stable and the remainder address post-6.16 issues or aren't considered necessary for -stable kernels. 10 of these fixes are for MM" * tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: proc: proc_maps_open allow proc_mem_open to return NULL mm/mremap: avoid expensive folio lookup on mremap folio pte batch userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry mm: pass page directly instead of using folio_page selftests/proc: fix string literal warning in proc-maps-race.c fs/proc/task_mmu: hold PTL in pagemap_hugetlb_range and gather_hugetlb_stats mm/smaps: fix race between smaps_hugetlb_range and migration mm: fix the race between collapse and PT_RECLAIM under per-vma lock mm/kmemleak: avoid soft lockup in __kmemleak_do_cleanup() MAINTAINERS: add Masami as a reviewer of hung task detector mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock kasan/test: fix protection against compiler elision
2025-08-13usb: storage: realtek_cr: Use correct byte order for bcs->ResidueThorsten Blum1-1/+1
Since 'bcs->Residue' has the data type '__le32', convert it to the correct byte order of the CPU using this driver when assigning it to the local variable 'residue'. Cc: stable <stable@kernel.org> Fixes: 50a6cb932d5c ("USB: usb_storage: add ums-realtek driver") Suggested-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20250813145247.184717-3-thorsten.blum@linux.dev Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13usb: chipidea: imx: improve usbmisc_imx7d_pullup()Xu Yang2-8/+18
When add workaround for ERR051725, the usbmisc will put PHY to Non-driving mode (OPMODE = 01) after stopping the device controller and put PHY back to Normal mode (OPMODE = 00) after starting the device controller. However, this will bring issue for host controller. Because the PHY may stay in Non-driving mode after switching the role from device to host. Then the port will not work if USB device is attached. To fix this issue, improving the workaround by putting PHY to Non-driving mode for a certain period and back to Normal mode finally. To make host detect a disconnect signal, the period should be at least 125us (a micro-frame time) for high-speed link. And only working as high-speed mode will need workaround for ERR051725. So this will also filter the pullup event for high-speed. Fixes: 11992b410083 ("usb: chipidea: imx: implement workaround for ERR051725") Reviewed-by: Jun Li <jun.li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20250811100833.862876-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13kcov, usb: Don't disable interrupts in kcov_remote_start_usb_softirq()Sebastian Andrzej Siewior2-45/+14
kcov_remote_start_usb_softirq() the begin of urb's completion callback. HCDs marked HCD_BH will invoke this function from the softirq and in_serving_softirq() will detect this properly. Root-HUB (RH) requests will not be delayed to softirq but complete immediately in IRQ context. This will confuse kcov because in_serving_softirq() will report true if the softirq is served after the hardirq and if the softirq got interrupted by the hardirq in which currently runs. This was addressed by simply disabling interrupts in kcov_remote_start_usb_softirq() which avoided the interruption by the RH while a regular completion callback was invoked. This not only changes the behaviour while kconv is enabled but also breaks PREEMPT_RT because now sleeping locks can no longer be acquired. Revert the previous fix. Address the issue by invoking kcov_remote_start_usb() only if the context is just "serving softirqs" which is identified by checking in_serving_softirq() and in_hardirq() must be false. Fixes: f85d39dd7ed89 ("kcov, usb: disable interrupts in kcov_remote_start_usb_softirq") Cc: stable <stable@kernel.org> Reported-by: Yunseong Kim <ysk@kzalloc.com> Closes: https://lore.kernel.org/all/20250725201400.1078395-2-ysk@kzalloc.com/ Tested-by: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250811082745.ycJqBXMs@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13usb: dwc3: pci: add support for the Intel Wildcat LakeHeikki Krogerus1-0/+2
This patch adds the necessary PCI ID for Intel Wildcat Lake devices. Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20250812131101.2930199-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13usb: dwc3: Ignore late xferNotReady event to prevent halt timeoutKuen-Han Tsai1-0/+9
During a device-initiated disconnect, the End Transfer command resets the event filter, allowing a new xferNotReady event to be generated before the controller is fully halted. Processing this late event incorrectly triggers a Start Transfer, which prevents the controller from halting and results in a DSTS.DEVCTLHLT bit polling timeout. Ignore the late xferNotReady event if the controller is already in a disconnected state. Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") Cc: stable <stable@kernel.org> Signed-off-by: Kuen-Han Tsai <khtsai@google.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20250807090700.2397190-1-khtsai@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13USB: storage: Add unusual-devs entry for Novatek NTK96550-based cameraMael GUERIN1-0/+7
Add the US_FL_BULK_IGNORE_TAG quirk for Novatek NTK96550-based camera to fix USB resets after sending SCSI vendor commands due to CBW and CSW tags difference, leading to undesired slowness while communicating with the device. Please find below the copy of /sys/kernel/debug/usb/devices with my device plugged in (listed as TechSys USB mass storage here, the underlying chipset being the Novatek NTK96550-based camera): T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 3 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0603 ProdID=8611 Rev= 0.01 S: Manufacturer=TechSys S: Product=USB Mass Storage S: SerialNumber=966110000000100 C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=100mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Mael GUERIN <mael.guerin@murena.io> Cc: stable <stable@kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20250806164406.43450-1-mael.guerin@murena.io Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13usb: core: hcd: fix accessing unmapped memory in SINGLE_STEP_SET_FEATURE testXu Yang1-1/+7
The USB core will unmap urb->transfer_dma after SETUP stage completes. Then the USB controller will access unmapped memory when it received device descriptor. If iommu is equipped, the entire test can't be completed due to the memory accessing is blocked. Fix it by calling map_urb_for_dma() again for IN stage. To reduce redundant map for urb->transfer_buffer, this will also set URB_NO_TRANSFER_DMA_MAP flag before first map_urb_for_dma() to skip dma map for urb->transfer_buffer and clear URB_NO_TRANSFER_DMA_MAP flag before second map_urb_for_dma(). Fixes: 216e0e563d81 ("usb: core: hcd: use map_urb_for_dma for single step set feature urb") Cc: stable <stable@kernel.org> Reviewed-by: Jun Li <jun.li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20250806083955.3325299-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13usb: renesas-xhci: Fix External ROM access timeoutsMarek Vasut1-3/+4
Increase the External ROM access timeouts to prevent failures during programming of External SPI EEPROM chips. The current timeouts are too short for some SPI EEPROMs used with uPD720201 controllers. The current timeout for Chip Erase in renesas_rom_erase() is 100 ms , the current timeout for Sector Erase issued by the controller before Page Program in renesas_fw_download_image() is also 100 ms. Neither timeout is sufficient for e.g. the Macronix MX25L5121E or MX25V5126F. MX25L5121E reference manual [1] page 35 section "ERASE AND PROGRAMMING PERFORMANCE" and page 23 section "Table 8. AC CHARACTERISTICS (Temperature = 0°C to 70°C for Commercial grade, VCC = 2.7V ~ 3.6V)" row "tCE" indicate that the maximum time required for Chip Erase opcode to complete is 2 s, and for Sector Erase it is 300 ms . MX25V5126F reference manual [2] page 47 section "13. ERASE AND PROGRAMMING PERFORMANCE (2.3V - 3.6V)" and page 42 section "Table 8. AC CHARACTERISTICS (Temperature = -40°C to 85°C for Industrial grade, VCC = 2.3V - 3.6V)" row "tCE" indicate that the maximum time required for Chip Erase opcode to complete is 3.2 s, and for Sector Erase it is 400 ms . Update the timeouts such, that Chip Erase timeout is set to 5 seconds, and Sector Erase timeout is set to 500 ms. Such lengthy timeouts ought to be sufficient for majority of SPI EEPROM chips. [1] https://www.macronix.com/Lists/Datasheet/Attachments/8634/MX25L5121E,%203V,%20512Kb,%20v1.3.pdf [2] https://www.macronix.com/Lists/Datasheet/Attachments/8750/MX25V5126F,%202.5V,%20512Kb,%20v1.1.pdf Fixes: 2478be82de44 ("usb: renesas-xhci: Add ROM loader for uPD720201") Cc: stable <stable@kernel.org> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Link: https://lore.kernel.org/r/20250802225526.25431-1-marek.vasut+renesas@mailbox.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13usb: gadget: tegra-xudc: fix PM use count underflowRussell King (Oracle)1-2/+7
Upon resume from system suspend, the PM runtime core issues the following warning: tegra-xudc 3550000.usb: Runtime PM usage count underflow! This is because tegra_xudc_resume() unconditionally calls schedule_work(&xudc->usb_role_sw_work) whether or not anything has changed, which causes tegra_xudc_device_mode_off() to be called even when we're already in that mode. Keep track of the current state of "device_mode", and only schedule this work if it has changed from the hardware state on resume. Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1uhtkH-007KDZ-JT@rmk-PC.armlinux.org.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13usb: quirks: Add DELAY_INIT quick for another SanDisk 3.2Gen1 Flash DriveMiao Li1-0/+1
Another SanDisk 3.2Gen1 Flash Drive also need DELAY_INIT quick, or it will randomly work incorrectly on Huawei hisi platforms when doing reboot test. Signed-off-by: Miao Li <limiao@kylinos.cn> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20250801082728.469406-1-limiao870622@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-13ASoC: tas2781: Normalize the volume kcontrol nameBaojun Xu1-2/+2
Change the name of the kcontrol from "Gain" to "Volume". Signed-off-by: Baojun Xu <baojun.xu@ti.com> Link: https://patch.msgid.link/20250813100708.12197-1-baojun.xu@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-08-13io_uring/io-wq: add check free worker before create new workerFengnan Chang1-0/+8
After commit 0b2b066f8a85 ("io_uring/io-wq: only create a new worker if it can make progress"), in our produce environment, we still observe that part of io_worker threads keeps creating and destroying. After analysis, it was confirmed that this was due to a more complex scenario involving a large number of fsync operations, which can be abstracted as frequent write + fsync operations on multiple files in a single uring instance. Since write is a hash operation while fsync is not, and fsync is likely to be suspended during execution, the action of checking the hash value in io_wqe_dec_running cannot handle such scenarios. Similarly, if hash-based work and non-hash-based work are sent at the same time, similar issues are likely to occur. Returning to the starting point of the issue, when a new work arrives, io_wq_enqueue may wake up free worker A, while io_wq_dec_running may create worker B. Ultimately, only one of A and B can obtain and process the task, leaving the other in an idle state. In the end, the issue is caused by inconsistent logic in the checks performed by io_wq_enqueue and io_wq_dec_running. Therefore, the problem can be resolved by checking for available workers in io_wq_dec_running. Signed-off-by: Fengnan Chang <changfengnan@bytedance.com> Reviewed-by: Diangang Li <lidiangang@bytedance.com> Link: https://lore.kernel.org/r/20250813120214.18729-1-changfengnan@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-13btrfs: fix printing of mount info messages for NODATACOW/NODATASUMKyoji Ogasawara1-1/+2
The NODATASUM message was printed twice by mistake and the NODATACOW was missing from the 'unset' part. Fix the duplication and make the output look the same. Fixes: eddb1a433f26 ("btrfs: add reconfigure callback for fs_context") CC: stable@vger.kernel.org # 6.8+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: restore mount option info messages during mountKyoji Ogasawara1-3/+5
After the fsconfig migration in 6.8, mount option info messages are no longer displayed during mount operations because btrfs_emit_options() is only called during remount, not during initial mount. Fix this by calling btrfs_emit_options() in btrfs_fill_super() after open_ctree() succeeds. Additionally, prevent log duplication by ensuring btrfs_check_options() handles validation with warn-level and err-level messages, while btrfs_emit_options() provides info-level messages. Fixes: eddb1a433f26 ("btrfs: add reconfigure callback for fs_context") CC: stable@vger.kernel.org # 6.8+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: fix incorrect log message for nobarrier mount optionKyoji Ogasawara1-1/+1
Fix a wrong log message that appears when the "nobarrier" mount option is unset. When "nobarrier" is unset, barrier is actually enabled. However, the log incorrectly stated "turning off barriers". Fixes: eddb1a433f26 ("btrfs: add reconfigure callback for fs_context") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: fix buffer index in wait_eb_writebacks()Naohiro Aota1-1/+1
The commit f2cb97ee964a ("btrfs: index buffer_tree using node size") changed the index of buffer_tree from "start >> sectorsize_bits" to "start >> nodesize_bits". However, the change is not applied for wait_eb_writebacks() and caused IO failures by writing in a full zone. Use the index properly. Fixes: f2cb97ee964a ("btrfs: index buffer_tree using node size") Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: subpage: keep TOWRITE tag until folio is cleanedNaohiro Aota1-1/+18
btrfs_subpage_set_writeback() calls folio_start_writeback() the first time a folio is written back, and it also clears the PAGECACHE_TAG_TOWRITE tag even if there are still dirty blocks in the folio. This can break ordering guarantees, such as those required by btrfs_wait_ordered_extents(). That ordering breakage leads to a real failure. For example, running generic/464 on a zoned setup will hit the following ASSERT. This happens because the broken ordering fails to flush existing dirty pages before the file size is truncated. assertion failed: !list_empty(&ordered->list) :: 0, in fs/btrfs/zoned.c:1899 ------------[ cut here ]------------ kernel BUG at fs/btrfs/zoned.c:1899! Oops: invalid opcode: 0000 [#1] SMP NOPTI CPU: 2 UID: 0 PID: 1906169 Comm: kworker/u130:2 Kdump: loaded Not tainted 6.16.0-rc6-BTRFS-ZNS+ #554 PREEMPT(voluntary) Hardware name: Supermicro Super Server/H12SSL-NT, BIOS 2.0 02/22/2021 Workqueue: btrfs-endio-write btrfs_work_helper [btrfs] RIP: 0010:btrfs_finish_ordered_zoned.cold+0x50/0x52 [btrfs] RSP: 0018:ffffc9002efdbd60 EFLAGS: 00010246 RAX: 000000000000004c RBX: ffff88811923c4e0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff827e38b1 RDI: 00000000ffffffff RBP: ffff88810005d000 R08: 00000000ffffdfff R09: ffffffff831051c8 R10: ffffffff83055220 R11: 0000000000000000 R12: ffff8881c2458c00 R13: ffff88811923c540 R14: ffff88811923c5e8 R15: ffff8881c1bd9680 FS: 0000000000000000(0000) GS:ffff88a04acd0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f907c7a918c CR3: 0000000004024000 CR4: 0000000000350ef0 Call Trace: <TASK> ? srso_return_thunk+0x5/0x5f btrfs_finish_ordered_io+0x4a/0x60 [btrfs] btrfs_work_helper+0xf9/0x490 [btrfs] process_one_work+0x204/0x590 ? srso_return_thunk+0x5/0x5f worker_thread+0x1d6/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0x118/0x230 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x205/0x260 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Consider process A calling writepages() with WB_SYNC_NONE. In zoned mode or for compressed writes, it locks several folios for delalloc and starts writing them out. Let's call the last locked folio folio X. Suppose the write range only partially covers folio X, leaving some pages dirty. Process A calls btrfs_subpage_set_writeback() when building a bio. This function call clears the TOWRITE tag of folio X, whose size = 8K and the block size = 4K. It is following state. 0 4K 8K |/////|/////| (flag: DIRTY, tag: DIRTY) <-----> Process A will write this range. Now suppose process B concurrently calls writepages() with WB_SYNC_ALL. It calls tag_pages_for_writeback() to tag dirty folios with PAGECACHE_TAG_TOWRITE. Since folio X is still dirty, it gets tagged. Then, B collects tagged folios using filemap_get_folios_tag() and must wait for folio X to be written before returning from writepages(). 0 4K 8K |/////|/////| (flag: DIRTY, tag: DIRTY|TOWRITE) However, between tagging and collecting, process A may call btrfs_subpage_set_writeback() and clear folio X's TOWRITE tag. 0 4K 8K | |/////| (flag: DIRTY|WRITEBACK, tag: DIRTY) As a result, process B won't see folio X in its batch, and returns without waiting for it. This breaks the WB_SYNC_ALL ordering requirement. Fix this by using btrfs_subpage_set_writeback_keepwrite(), which retains the TOWRITE tag. We now manually clear the tag only after the folio becomes clean, via the xas operation. Fixes: 3470da3b7d87 ("btrfs: subpage: introduce helpers for writeback status") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: clear TAG_TOWRITE from buffer tree when submitting a tree blockQu Wenruo1-0/+1
[POSSIBLE BUG] After commit 5e121ae687b8 ("btrfs: use buffer xarray for extent buffer writeback operations"), we have a dedicated xarray for extent buffers, and a lot of tags are migrated to that buffer tree, like PAGECACHE_TAG_TOWRITE/DIRTY/WRITEBACK. This frees us from the limits of page flags, but there is a new asymmetric behavior, we call buffer_tree_tag_for_writeback() to set PAGECACHE_TAG_TOWRITE for the involved ranges, but there is no one to clear that tag. Before that rework, we relied on the page cache tag which was cleared when folio_start_writeback() was called. Although this has its own problems (e.g. the first one calling folio_start_writeback() will clear the tag for the whole page), it at least cleared the tag. But now our real tags are stored in the buffer tree, no one is really clearing the PAGECACHE_TAG_TOWRITE tag now. [FIX] Thankfully this is not going to cause any real bug, but just some inefficiency iterating the extent buffers. As if we hit an extent buffer which is not dirty but still has the PAGECACHE_TAG_TOWRITE tag, lock_extent_buffer_for_io() will skip it so we won't writeback the extent buffer again. To properly fix the inefficiency, just clear the PAGECACHE_TAG_TOWRITE inside lock_extent_buffer_for_io(). There is no error path between lock_extent_buffer_for_io() and write_one_eb(), so we're safe to clear the tag there. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: do not set mtime/ctime to current time when unlinking for log replayFilipe Manana1-10/+19
If we are doing an unlink for log replay, we are updating the directory's mtime and ctime to the current time, and this is incorrect since it should stay with the mtime and ctime that were set when the directory was logged. This is the same as when adding a link to an inode during log replay (with btrfs_add_link()), where we want the mtime and ctime to be the values that were in place when the inode was logged. This was found with generic/547 using LOAD_FACTOR=20 and TIME_FACTOR=20, where due to large log trees we have longer log replay times and fssum could detect a mismatch of the mtime and ctime of a directory. Fix this by skipping the mtime and ctime update at __btrfs_unlink_inode() if we are in log replay context (just like btrfs_add_link()). Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: clear block dirty if btrfs_writepage_cow_fixup() failedQu Wenruo1-1/+5
[BUG] If btrfs_writepage_cow_fixup() failed (returning value -EUCLEAN), the block will be kept dirty, but with its corresponding range finished in the ordered extent. Currently that error pattern is only possible for experimental builds, which places extra check to ensure we shouldn't hit a dirty block without a corresponding ordered extent. This means if later a writeback happens again, we can hit the following problems: - ASSERT(block_start != EXTENT_MAP_HOLE) in submit_one_sector() If the original extent map is a hole, then we can hit this case, as the new ordered extent failed, we will drop the new extent map and re-read one from the disk. - DEBUG_WARN() in btrfs_writepage_cow_fixup() This is because we no longer have an ordered extent for those dirty blocks. The original for them is already finished with error. [CAUSE] The function btrfs_writepage_cow_fixup() is not following the regular error handling of writeback. The common practice is to clear the folio dirty, start and finish the writeback for the block. This is normally done by extent_clear_unlock_delalloc() with PAGE_START_WRITEBACK | PAGE_END_WRITEBACK flags during run_delalloc_range(). So if we keep those failed blocks dirty, they will stay in the page cache and wait for the next writeback. And since the original ordered extent is already finished and removed, depending on the original extent map, we either hit the ASSERT() inside submit_one_sector(), or hit the DEBUG_WARN() in btrfs_writepage_cow_fixup() again (and very ironic). [FIX] Follow the regular error handling to clear the dirty flag for the block range, start and finish writeback for that block range instead. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: clear block dirty if submit_one_sector() failedQu Wenruo1-4/+13
[BUG] If submit_one_sector() failed, the block will be kept dirty, but with their corresponding range finished in the ordered extent. This means if a writeback happens later again, we can hit the following problems: - ASSERT(block_start != EXTENT_MAP_HOLE) in submit_one_sector() If the original extent map is a hole, then we can hit this case, as the new ordered extent failed, we will drop the new extent map and re-read one from the disk. - DEBUG_WARN() in btrfs_writepage_cow_fixup() This is because we no longer have an ordered extent for those dirty blocks. The original for them is already finished with error. [CAUSE] The function submit_one_sector() is not following the regular error handling of writeback. The common practice is to clear the folio dirty, start and finish the writeback for the block. This is normally done by extent_clear_unlock_delalloc() with PAGE_START_WRITEBACK | PAGE_END_WRITEBACK flags during run_delalloc_range(). So if we keep those failed blocks dirty, they will stay in the page cache and wait for the next writeback. And since the original ordered extent is already finished and removed, depending on the original extent map, we either hit the ASSERT() inside submit_one_sector(), or hit the DEBUG_WARN() in btrfs_writepage_cow_fixup(). [FIX] Follow the regular error handling to clear the dirty flag for the block, start and finish writeback for that block instead. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13HID: intel-thc-hid: Intel-quicki2c: Enhance driver re-install flowEven Xu1-0/+1
After driver module is removed and during re-install stage, if there is continueous user touching on the screen, it is a risk impacting THC hardware initialization which causes driver installation failure. This patch enhances this flow by quiescing the external touch interrupt after driver is removed which keeps THC hardware ignore external interrupt during this remove and re-install stage. Signed-off-by: Even Xu <even.xu@intel.com> Tested-by: Rui Zhang <rui1.zhang@intel.com> Fixes: 66b59bfce6d9 ("HID: intel-thc-hid: intel-quicki2c: Complete THC QuickI2C driver") Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-08-13md: add legacy_async_del_gendisk modeXiao Ni1-14/+42
commit 9e59d609763f ("md: call del_gendisk in control path") changes the async way to sync way of calling del_gendisk. But it breaks mdadm --assemble command. The assemble command runs like this: 1. create the array 2. stop the array 3. access the sysfs files after stopping The sync way calls del_gendisk in step 2, so all sysfs files are removed. Now to avoid breaking mdadm assemble command, this patch adds the parameter legacy_async_del_gendisk that can be used to choose which way. The default is async way. In future, we plan to change default to sync way in kernel 7.0. Then users need to upgrade to mdadm 4.5+ which removes step 2. Fixes: 9e59d609763f ("md: call del_gendisk in control path") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Xiao Ni <xni@redhat.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/linux-raid/20250813032929.54978-1-xni@redhat.com Signed-off-by: Yu Kuai <yukuai3@huawei.com>
2025-08-13block: restore default wbt enablementJulian Sun1-1/+1
The commit 245618f8e45f ("block: protect wbt_lat_usec using q->elevator_lock") protected wbt_enable_default() with q->elevator_lock; however, it also placed wbt_enable_default() before blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);, resulting in wbt failing to be enabled. Moreover, the protection of wbt_enable_default() by q->elevator_lock was removed in commit 78c271344b6f ("block: move wbt_enable_default() out of queue freezing from sched ->exit()"), so we can directly fix this issue by placing wbt_enable_default() after blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);. Additionally, this issue also causes the inability to read the wbt_lat_usec file, and the scenario is as follows: root@q:/sys/block/sda/queue# cat wbt_lat_usec cat: wbt_lat_usec: Invalid argument root@q:/data00/sjc/linux# ls /sys/kernel/debug/block/sda/rqos cannot access '/sys/kernel/debug/block/sda/rqos': No such file or directory root@q:/data00/sjc/linux# find /sys -name wbt /sys/kernel/debug/tracing/events/wbt After testing with this patch, wbt can be enabled normally. Signed-off-by: Julian Sun <sunjunchao@bytedance.com> Cc: stable@vger.kernel.org Fixes: 245618f8e45f ("block: protect wbt_lat_usec using q->elevator_lock") Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250812154257.57540-1-sunjunchao@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-13Docs: admin-guide: Correct spelling mistakeErick Karanja1-1/+1
Fix spelling mistake directoy to directory Reported-by: codespell Signed-off-by: Erick Karanja <karanja99erick@gmail.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250813071837.668613-1-karanja99erick@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-13RDMA/hns: Fix dip entries leak on devices newer than hip09Junxian Huang1-1/+1
DIP algorithm is also supported on devices newer than hip09, so free dip entries too. Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20250812122602.3524602-1-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()Minjong Kim1-0/+3
in ntrig_report_version(), hdev parameter passed from hid_probe(). sending descriptor to /dev/uhid can make hdev->dev.parent->parent to null if hdev->dev.parent->parent is null, usb_dev has invalid address(0xffffffffffffff58) that hid_to_usb_dev(hdev) returned when usb_rcvctrlpipe() use usb_dev,it trigger page fault error for address(0xffffffffffffff58) add null check logic to ntrig_report_version() before calling hid_to_usb_dev() Signed-off-by: Minjong Kim <minbell.kim@samsung.com> Link: https://patch.msgid.link/20250813-hid-ntrig-page-fault-fix-v2-1-f98581f35106@samsung.com Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2025-08-13RDMA/core: Free pfn_list with appropriate kvfree callAkhilesh Patil1-2/+2
Ensure that pfn_list allocated by kvcalloc() is freed using corresponding kvfree() function. Match memory allocation and free routines kvcalloc -> kvfree. Fixes: 259e9bd07c57 ("RDMA/core: Avoid hmm_dma_map_alloc() for virtual DMA devices") Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in> Link: https://patch.msgid.link/aJjcPjL1BVh8QrMN@bhairav-test.ee.iitb.ac.in Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13MAINTAINERS: Remove bouncing irdma maintainerDave Hansen1-1/+0
This maintainer's email no longer works. Remove it from MAINTAINERS. This still leaves one maintainer for the driver. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Cc: linux-rdma@vger.kernel.org Link: https://patch.msgid.link/20250808175601.EF0AF767@davehans-spike.ostc.intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13RDMA/bnxt_re: Fix to initialize the PBL arrayAnantha Prabhu1-0/+2
memset the PBL page pointer and page map arrays before populating the SGL addresses of the HWQ. Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation") Signed-off-by: Anantha Prabhu <anantha.prabhu@broadcom.com> Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250805101000.233310-5-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13RDMA/bnxt_re: Fix a possible memory leak in the driverKalesh AP1-0/+23
The GID context reuse logic requires the context memory to be not freed if and when DEL_GID firmware command fails. But, if there's no subsequent ADD_GID to reuse it, the context memory must be freed when the driver is unloaded. Otherwise it leads to a memory leak. Below is the kmemleak trace reported: unreferenced object 0xffff88817a4f34d0 (size 8): comm "insmod", pid 1072504, jiffies 4402561550 hex dump (first 8 bytes): 01 00 00 00 00 00 00 00 ........ backtrace (crc ccaa009e): __kmalloc_cache_noprof+0x33e/0x400 0xffffffffc2db9d48 add_modify_gid+0x5e0/0xb60 [ib_core] __ib_cache_gid_add+0x213/0x350 [ib_core] update_gid+0xf2/0x180 [ib_core] enum_netdev_ipv4_ips+0x3f3/0x690 [ib_core] enum_all_gids_of_dev_cb+0x125/0x1b0 [ib_core] ib_enum_roce_netdev+0x14b/0x250 [ib_core] ib_cache_setup_one+0x2e5/0x540 [ib_core] ib_register_device+0x82c/0xf10 [ib_core] 0xffffffffc2df5ad9 0xffffffffc2da8b07 0xffffffffc2db174d auxiliary_bus_probe+0xa5/0x120 really_probe+0x1e4/0x850 __driver_probe_device+0x18f/0x3d0 Fixes: 4a62c5e9e2e1 ("RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250805101000.233310-4-kalesh-anakkur.purayil@broadcom.com Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13RDMA/bnxt_re: Fix to remove workload check in SRQ limit pathKashyap Desai3-35/+2
There should not be any checks of current workload to set srq_limit value to SRQ hw context. Remove all such workload checks and make a direct call to set srq_limit via doorbell SRQ_ARM. Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250805101000.233310-3-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13RDMA/bnxt_re: Fix to do SRQ armena by defaultKashyap Desai1-2/+1
Whenever SRQ is created, make sure SRQ arm enable is always set. Driver is always ready to receive SRQ ASYNC event. Additional note - There is no need to do srq arm enable conditionally. See bnxt_qplib_armen_db in bnxt_qplib_create_cq(). Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Link: https://patch.msgid.link/20250805101000.233310-2-kalesh-anakkur.purayil@broadcom.com Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13RDMA/hns: Fix querying wrong SCC context for DIP algorithmwenglianfa2-3/+10
When using DIP algorithm, all QPs establishing connections with the same destination IP share the same SCC, which is indexed by dip_idx, but dip_idx isn't necessarily equal to qpn. Therefore, dip_idx should be used to query SCC context instead of qpn. Fixes: 124a9fbe43aa ("RDMA/hns: Append SCC context to the raw dump of QPC") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20250726075345.846957-1-huangjunxian6@hisilicon.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13btrfs: zoned: limit active zones to max_open_zonesNaohiro Aota1-1/+7
When there is no active zone limit, we can technically write into any number of zones at the same time. However, exceeding the max open zones can degrade performance. To prevent this, set the max_active_zones to bdev_max_open_zones() if there is no active zone limit. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: zoned: fix write time activation failure for metadata block groupNaohiro Aota1-4/+9
Since commit 13bb483d32ab ("btrfs: zoned: activate metadata block group on write time"), we activate a metadata block group at the write time. If the zone capacity is small enough, we can allocate the entire region before the first write. Then, we hit the btrfs_zoned_bg_is_full() in btrfs_zone_activate() and the activation fails. For a data block group, we activate it at the allocation time and we should check the fullness condition in the caller side. Add, a WARN to check the fullness condition. For a metadata block group, we don't need the fullness check because we activate it at the write time. Instead, activating it once it is written should be invalid. Catch that with a WARN too. Fixes: 13bb483d32ab ("btrfs: zoned: activate metadata block group on write time") CC: stable@vger.kernel.org # 6.6+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: zoned: fix data relocation block group reservationNaohiro Aota1-8/+47
btrfs_zoned_reserve_data_reloc_bg() is called on mount and at that point, all data block groups belong to the primary data space_info. So, we don't find anything in the data relocation space_info. Also, the condition "bg->used > 0" can select a block group with full of zone_unusable bytes for the candidate. As we cannot allocate from the block group, it is useless to reserve it as the data relocation block group. Furthermore, because of the space_info separation, we need to migrate the selected block group to the data relocation space_info. If not, the extent allocator cannot use the block group to do the allocation. This commit fixes these three issues. Fixes: e606ff985ec7 ("btrfs: zoned: reserve data_reloc block group on mount") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13btrfs: zoned: skip ZONE FINISH of conventional zonesJohannes Thumshirn1-20/+35
Don't call ZONE FINISH for conventional zones as this will result in I/O errors. Instead check if the zone that needs finishing is a conventional zone and if yes skip it. Also factor out the actual handling of finishing a single zone into a helper function, as do_zone_finish() is growing ever bigger and the indentations levels are getting higher. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-08-13RDMA/erdma: Fix unset QPN of GSI QPBoshi Yu1-0/+2
The QPN of the GSI QP was not set, which may cause issues. Set the QPN to 1 when creating the GSI QP. Fixes: 999a0a2e9b87 ("RDMA/erdma: Support UD QPs and UD WRs") Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com> Link: https://patch.msgid.link/20250725055410.67520-4-boshiyu@linux.alibaba.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13RDMA/erdma: Fix ignored return value of init_kernel_qpBoshi Yu1-1/+3
The init_kernel_qp interface may fail. Check its return value and free related resources properly when it does. Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation") Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com> Link: https://patch.msgid.link/20250725055410.67520-3-boshiyu@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13RDMA/rxe: Flush delayed SKBs while releasing RXE resourcesZhu Yanjun2-22/+9
When skb packets are sent out, these skb packets still depends on the rxe resources, for example, QP, sk, when these packets are destroyed. If these rxe resources are released when the skb packets are destroyed, the call traces will appear. To avoid skb packets hang too long time in some network devices, a timestamp is added when these skb packets are created. If these skb packets hang too long time in network devices, these network devices can free these skb packets to release rxe resources. Reported-by: syzbot+8425ccfb599521edb153@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8425ccfb599521edb153 Tested-by: syzbot+8425ccfb599521edb153@syzkaller.appspotmail.com Fixes: 1a633bdc8fd9 ("RDMA/rxe: Let destroy qp succeed with stuck packet") Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20250726013104.463570-1-yanjun.zhu@linux.dev Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-08-13Revert "drm/amdgpu: Use dma_buf from GEM object instance"Thomas Zimmermann3-3/+4
This reverts commit 515986100d176663d0a03219a3056e4252f729e6. The dma_buf field in struct drm_gem_object is not stable over the object instance's lifetime. The field becomes NULL when user space releases the final GEM handle on the buffer object. This resulted in a NULL-pointer deref. Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer: Acquire internal references on GEM handles") only solved the problem partially. They especially don't work for buffer objects without a DRM framebuffer associated. Hence, this revert to going back to using .import_attach->dmabuf. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250715082635.34974-1-tzimmermann@suse.de