|
to 2.56
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
We already called ib_drain_qp() before, which makes sure send_done()
was called with IB_WC_WR_FLUSH_ERR, but it didn't call
atomic_dec_and_test(&sc->send_io.pending.count), so we may never reach
the info->send_pending == 0 condition.
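As an illustration only, one way to keep the counter balanced is to also
decrement it in the flush-error path of send_done(); the fields come from the
text above, while the wait-queue name and surrounding code are assumptions and
not necessarily what the actual fix does:
```
/* Illustrative fragment inside send_done(): balance the pending-send
 * counter even when the WR completes with IB_WC_WR_FLUSH_ERR, so a
 * waiter on send_io.pending.count == 0 can make progress. */
if (wc->status == IB_WC_WR_FLUSH_ERR) {
	if (atomic_dec_and_test(&sc->send_io.pending.count))
		wake_up(&sc->send_io.pending.zero_wait_queue); /* assumed name */
	return;
}
```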
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Fixes: 5349ae5e05fa ("smb: client: let send_done() cleanup before calling smbd_disconnect_rdma_connection()")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This is step 4/4 of a patch series to fix mid_q_entry memory leaks
caused by race conditions in callback execution.
In compound_send_recv(), when wait_for_response() is interrupted by
signals, the code attempts to cancel pending requests by changing
their callbacks to cifs_cancelled_callback. However, there's a race
condition between signal interruption and network response processing
that causes both mid_q_entry and server buffer leaks:
```
User foreground process cifsd
cifs_readdir
open_cached_dir
cifs_send_recv
compound_send_recv
smb2_setup_request
smb2_mid_entry_alloc
smb2_get_mid_entry
smb2_mid_entry_alloc
mempool_alloc // alloc mid
kref_init(&temp->refcount); // refcount = 1
mid[0]->callback = cifs_compound_callback;
mid[1]->callback = cifs_compound_last_callback;
smb_send_rqst
rc = wait_for_response
wait_event_state TASK_KILLABLE
cifs_demultiplex_thread
allocate_buffers
server->bigbuf = cifs_buf_get()
standard_receive3
->find_mid()
smb2_find_mid
__smb2_find_mid
kref_get(&mid->refcount) // +1
cifs_handle_standard
handle_mid
/* bigbuf will also leak */
mid->resp_buf = server->bigbuf
server->bigbuf = NULL;
dequeue_mid
/* in for loop */
mids[0]->callback
cifs_compound_callback
/* Signal interrupts wait: rc = -ERESTARTSYS */
/* if (... || midQ[i]->mid_state == MID_RESPONSE_RECEIVED) */
midQ[0]->callback = cifs_cancelled_callback;
cancelled_mid[i] = true;
/* The change comes too late */
mid->mid_state = MID_RESPONSE_READY
release_mid // -1
/* cancelled_mid[i] == true means the mid won't be released
in the compound_send_recv cleanup */
/* cifs_cancelled_callback won't be executed to release the mid */
```
The root cause is that there's a race between callback assignment and
execution.
Fix this by introducing per-mid locking:
- Add spinlock_t mid_lock to struct mid_q_entry
- Add mid_execute_callback() for atomic callback execution
- Use mid_lock in cancellation paths to ensure atomicity
This ensures that either the original callback or the cancellation
callback executes atomically, preventing reference count leaks when
requests are interrupted by signals.
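A minimal sketch of how such per-mid locking can make the two paths mutually
exclusive; the struct layout, the "NULL means consumed" convention, and the
caller details are assumptions based on the description above, not the exact
patch:
```
/* Sketch: the execution path takes the callback under mid_lock and clears
 * it, so the cancellation path can tell whether the original callback has
 * already run and only then install cifs_cancelled_callback. */
static void mid_execute_callback(struct mid_q_entry *mid)
{
	void (*callback)(struct mid_q_entry *mid);

	spin_lock(&mid->mid_lock);
	callback = mid->callback;
	mid->callback = NULL;		/* mark the callback as consumed */
	spin_unlock(&mid->mid_lock);

	if (callback)
		callback(mid);
}

/* Cancellation path in compound_send_recv(), under the same lock: */
spin_lock(&midQ[i]->mid_lock);
if (midQ[i]->callback) {		/* original callback not executed yet */
	midQ[i]->callback = cifs_cancelled_callback;
	cancelled_mid[i] = true;
}
spin_unlock(&midQ[i]->mid_lock);
```
Exactly one of the two callbacks runs, so the reference taken by the
demultiplex thread is always dropped by whichever callback wins.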
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220404
Fixes: ee258d79159a ("CIFS: Move credit processing to mid callbacks for SMB3")
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
With KASAN enabled, it is possible to get a slab-out-of-bounds
during mount to ksmbd due to a missing check in parse_server_interfaces()
(see below):
BUG: KASAN: slab-out-of-bounds in
parse_server_interfaces+0x14ee/0x1880 [cifs]
Read of size 4 at addr ffff8881433dba98 by task mount/9827
CPU: 5 UID: 0 PID: 9827 Comm: mount Tainted: G
OE 6.16.0-rc2-kasan #2 PREEMPT(voluntary)
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: Dell Inc. Precision Tower 3620/0MWYPT,
BIOS 2.13.1 06/14/2019
Call Trace:
<TASK>
dump_stack_lvl+0x9f/0xf0
print_report+0xd1/0x670
__virt_addr_valid+0x22c/0x430
? parse_server_interfaces+0x14ee/0x1880 [cifs]
? kasan_complete_mode_report_info+0x2a/0x1f0
? parse_server_interfaces+0x14ee/0x1880 [cifs]
kasan_report+0xd6/0x110
parse_server_interfaces+0x14ee/0x1880 [cifs]
__asan_report_load_n_noabort+0x13/0x20
parse_server_interfaces+0x14ee/0x1880 [cifs]
? __pfx_parse_server_interfaces+0x10/0x10 [cifs]
? trace_hardirqs_on+0x51/0x60
SMB3_request_interfaces+0x1ad/0x3f0 [cifs]
? __pfx_SMB3_request_interfaces+0x10/0x10 [cifs]
? SMB2_tcon+0x23c/0x15d0 [cifs]
smb3_qfs_tcon+0x173/0x2b0 [cifs]
? __pfx_smb3_qfs_tcon+0x10/0x10 [cifs]
? cifs_get_tcon+0x105d/0x2120 [cifs]
? do_raw_spin_unlock+0x5d/0x200
? cifs_get_tcon+0x105d/0x2120 [cifs]
? __pfx_smb3_qfs_tcon+0x10/0x10 [cifs]
cifs_mount_get_tcon+0x369/0xb90 [cifs]
? dfs_cache_find+0xe7/0x150 [cifs]
dfs_mount_share+0x985/0x2970 [cifs]
? check_path.constprop.0+0x28/0x50
? save_trace+0x54/0x370
? __pfx_dfs_mount_share+0x10/0x10 [cifs]
? __lock_acquire+0xb82/0x2ba0
? __kasan_check_write+0x18/0x20
cifs_mount+0xbc/0x9e0 [cifs]
? __pfx_cifs_mount+0x10/0x10 [cifs]
? do_raw_spin_unlock+0x5d/0x200
? cifs_setup_cifs_sb+0x29d/0x810 [cifs]
cifs_smb3_do_mount+0x263/0x1990 [cifs]
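For illustration, the missing validation is of this general shape when walking
the variable-length interface entries returned by the server; the variable
names below are illustrative, not the actual cifs structures:
```
/* Sketch: validate each entry's Next offset against the remaining buffer
 * before following it, so a malformed response cannot push the cursor
 * past the end of the allocation. */
while (bytes_left >= (ssize_t)sizeof(*p)) {
	/* ... parse one interface entry at p ... */
	next = le32_to_cpu(p->Next);
	if (!next)
		break;				/* last entry */
	if (next < sizeof(*p) || next > bytes_left)
		return -EINVAL;			/* would read out of bounds */
	bytes_left -= next;
	p = (void *)p + next;
}
```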
Reported-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
On a motherboard with an AMD Granite Ridge CPU there is a report
that 3.5mm microphone and headphones aren't working. In the
log it's observed:
snd_hda_intel 0000:02:00.6: Skipping the device on the denylist
This was because of commit df42ee7e22f03 ("ALSA: hda: Add ASRock
X670E Taichi to denylist"). Reverting this commit allows the
microphone and headphones to work again. As at least some combinations
of this motherboard do have applicable devices, revert so that they
can be probed.
Cc: Richard Gong <richard.gong@amd.com>
Cc: Juan Martinez <juan.martinez@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/20250813140427.1577172-1-superm1@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Some inline functions are unused depending on kconfig, and the recent
change for clang builds made those be treated as errors with W=1.
To avoid pitfalls, mark those with the __maybe_unused attribute.
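For illustration, the annotation pattern looks like this (the helper and the
config symbol are hypothetical):
```
#include <linux/compiler.h>

/* Only referenced when CONFIG_SND_FOO is enabled; mark it so clang W=1
 * builds don't treat the unused inline function as an error elsewhere. */
static inline int __maybe_unused snd_foo_frames_to_ms(int frames, int rate)
{
	return frames * 1000 / rate;
}
```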
Link: https://patch.msgid.link/20250813153628.12303-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Starting with Rust 1.91.0 (expected 2025-10-30), `rustdoc` has improved
some false negatives around intra-doc links [1], and it found a broken
intra-doc link we currently have:
error: unresolved link to `include/linux/device/faux.h`
--> rust/kernel/faux.rs:7:17
|
7 | //! C header: [`include/linux/device/faux.h`]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ no item named `include/linux/device/faux.h` in scope
|
= help: to escape `[` and `]` characters, add '\' before them like `\[` or `\]`
= note: `-D rustdoc::broken-intra-doc-links` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(rustdoc::broken_intra_doc_links)]`
Our `srctree/` C header links are not intra-doc links, thus they need
an explicit link destination. Fix it accordingly.
Cc: stable <stable@kernel.org>
Link: https://github.com/rust-lang/rust/pull/132748 [1]
Fixes: 78418f300d39 ("rust/kernel: Add faux device bindings")
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Benno Lossin <lossin@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250804171311.1186538-1-ojeda@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"12 hotfixes. 5 are cc:stable and the remainder address post-6.16
issues or aren't considered necessary for -stable kernels.
10 of these fixes are for MM"
* tag 'mm-hotfixes-stable-2025-08-12-20-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
proc: proc_maps_open allow proc_mem_open to return NULL
mm/mremap: avoid expensive folio lookup on mremap folio pte batch
userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry
mm: pass page directly instead of using folio_page
selftests/proc: fix string literal warning in proc-maps-race.c
fs/proc/task_mmu: hold PTL in pagemap_hugetlb_range and gather_hugetlb_stats
mm/smaps: fix race between smaps_hugetlb_range and migration
mm: fix the race between collapse and PT_RECLAIM under per-vma lock
mm/kmemleak: avoid soft lockup in __kmemleak_do_cleanup()
MAINTAINERS: add Masami as a reviewer of hung task detector
mm/kmemleak: avoid deadlock by moving pr_warn() outside kmemleak_lock
kasan/test: fix protection against compiler elision
|
|
Since 'bcs->Residue' has the data type '__le32', convert it to the
correct byte order of the CPU using this driver when assigning it to
the local variable 'residue'.
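A small sketch of the conversion; the wrapper function is hypothetical, while
struct bulk_cs_wrap and its __le32 Residue field come from
<linux/usb/storage.h>:
```
#include <linux/types.h>
#include <linux/usb/storage.h>
#include <asm/byteorder.h>

/* Sketch: Residue is little-endian on the wire, so convert it to CPU
 * byte order before using it. */
static u32 csw_residue(const struct bulk_cs_wrap *bcs)
{
	return le32_to_cpu(bcs->Residue);
}
```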
Cc: stable <stable@kernel.org>
Fixes: 50a6cb932d5c ("USB: usb_storage: add ums-realtek driver")
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://lore.kernel.org/r/20250813145247.184717-3-thorsten.blum@linux.dev
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When the workaround for ERR051725 was added, the usbmisc driver put the
PHY into Non-driving mode (OPMODE = 01) after stopping the device
controller and back into Normal mode (OPMODE = 00) after starting the
device controller.
However, this causes an issue for the host controller, because the PHY
may stay in Non-driving mode after switching the role from device to
host. The port will then not work when a USB device is attached. To fix
this, improve the workaround by putting the PHY into Non-driving mode
only for a certain period and then back into Normal mode. To let the
host detect a disconnect signal, the period should be at least 125us
(a micro-frame time) for a high-speed link.
Only high-speed operation needs the workaround for ERR051725, so also
filter the pullup event for high-speed.
Fixes: 11992b410083 ("usb: chipidea: imx: implement workaround for ERR051725")
Reviewed-by: Jun Li <jun.li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Acked-by: Peter Chen <peter.chen@kernel.org>
Link: https://lore.kernel.org/r/20250811100833.862876-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
kcov_remote_start_usb_softirq() is invoked at the beginning of the urb's
completion callback.
HCDs marked HCD_BH will invoke this function from the softirq and
in_serving_softirq() will detect this properly.
Root-HUB (RH) requests will not be delayed to softirq but complete
immediately in IRQ context.
This will confuse kcov because in_serving_softirq() also reports true
while a softirq is being served, including when that softirq got
interrupted by the hardirq which is currently running.
This was addressed by simply disabling interrupts in
kcov_remote_start_usb_softirq() which avoided the interruption by the RH
while a regular completion callback was invoked.
This not only changes the behaviour while kcov is enabled but also
breaks PREEMPT_RT because sleeping locks can no longer be acquired.
Revert the previous fix. Address the issue by invoking
kcov_remote_start_usb() only if the context is genuinely "serving
softirqs", i.e. in_serving_softirq() is true while in_hardirq() is
false.
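A sketch of that context check; the helper name is hypothetical, while
in_serving_softirq() and in_hardirq() are the existing predicates from
<linux/preempt.h>:
```
#include <linux/preempt.h>

/* Sketch: treat the completion as softirq-served coverage only when we
 * really are serving a softirq and not running in hardirq context
 * (e.g. a root-hub completion running directly from the interrupt). */
static bool urb_completion_in_softirq(void)
{
	return in_serving_softirq() && !in_hardirq();
}
```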
Fixes: f85d39dd7ed89 ("kcov, usb: disable interrupts in kcov_remote_start_usb_softirq")
Cc: stable <stable@kernel.org>
Reported-by: Yunseong Kim <ysk@kzalloc.com>
Closes: https://lore.kernel.org/all/20250725201400.1078395-2-ysk@kzalloc.com/
Tested-by: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250811082745.ycJqBXMs@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch adds the necessary PCI ID for Intel Wildcat Lake
devices.
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20250812131101.2930199-1-heikki.krogerus@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
During a device-initiated disconnect, the End Transfer command resets
the event filter, allowing a new xferNotReady event to be generated
before the controller is fully halted. Processing this late event
incorrectly triggers a Start Transfer, which prevents the controller
from halting and results in a DSTS.DEVCTLHLT bit polling timeout.
Ignore the late xferNotReady event if the controller is already in a
disconnected state.
Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/20250807090700.2397190-1-khtsai@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add the US_FL_BULK_IGNORE_TAG quirk for Novatek NTK96550-based camera
to fix USB resets after sending SCSI vendor commands due to CBW and
CSW tags difference, leading to undesired slowness while communicating
with the device.
Please find below the copy of /sys/kernel/debug/usb/devices with my
device plugged in (listed as TechSys USB mass storage here, the
underlying chipset being the Novatek NTK96550-based camera):
T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 3 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=0603 ProdID=8611 Rev= 0.01
S: Manufacturer=TechSys
S: Product=USB Mass Storage
S: SerialNumber=966110000000100
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
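For reference, an unusual_devs.h entry for this device could look roughly like
the following; the bcdDevice range here is a guess, and the actual entry in the
patch may differ:
```
/* Sketch of an unusual_devs.h entry for the device listed above. */
UNUSUAL_DEV(0x0603, 0x8611, 0x0000, 0x9999,
		"Novatek",
		"NTK96550-based camera",
		USB_SC_DEVICE, USB_PR_DEVICE, NULL,
		US_FL_BULK_IGNORE_TAG),
```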
Signed-off-by: Mael GUERIN <mael.guerin@murena.io>
Cc: stable <stable@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20250806164406.43450-1-mael.guerin@murena.io
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The USB core will unmap urb->transfer_dma after the SETUP stage completes.
The USB controller will then access unmapped memory when it receives the
device descriptor. If an IOMMU is present, the entire test can't be
completed because the memory access is blocked.
Fix it by calling map_urb_for_dma() again for the IN stage. To avoid a
redundant mapping of urb->transfer_buffer, also set the
URB_NO_TRANSFER_DMA_MAP flag before the first map_urb_for_dma() to skip
the DMA map of urb->transfer_buffer, and clear the flag before the
second map_urb_for_dma().
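A rough sketch of that two-stage mapping, with the flag toggled between the
stages; error handling is omitted and the GFP flag is an assumption:
```
/* SETUP stage: only the setup packet needs DMA mapping, so skip the
 * transfer buffer for now. */
urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
ret = map_urb_for_dma(hcd, urb, GFP_KERNEL);

/* ... SETUP stage completes; the core unmaps urb->transfer_dma ... */

/* IN stage: map the transfer buffer again before the data phase. */
urb->transfer_flags &= ~URB_NO_TRANSFER_DMA_MAP;
ret = map_urb_for_dma(hcd, urb, GFP_KERNEL);
```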
Fixes: 216e0e563d81 ("usb: core: hcd: use map_urb_for_dma for single step set feature urb")
Cc: stable <stable@kernel.org>
Reviewed-by: Jun Li <jun.li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/20250806083955.3325299-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Increase the External ROM access timeouts to prevent failures during
programming of External SPI EEPROM chips. The current timeouts are
too short for some SPI EEPROMs used with uPD720201 controllers.
The current timeout for Chip Erase in renesas_rom_erase() is 100 ms, and
the current timeout for Sector Erase issued by the controller before
Page Program in renesas_fw_download_image() is also 100 ms. Neither
timeout is sufficient for e.g. the Macronix MX25L5121E or MX25V5126F.
MX25L5121E reference manual [1] page 35 section "ERASE AND PROGRAMMING
PERFORMANCE" and page 23 section "Table 8. AC CHARACTERISTICS (Temperature
= 0°C to 70°C for Commercial grade, VCC = 2.7V ~ 3.6V)" row "tCE" indicate
that the maximum time required for Chip Erase opcode to complete is 2 s,
and for Sector Erase it is 300 ms.
MX25V5126F reference manual [2] page 47 section "13. ERASE AND PROGRAMMING
PERFORMANCE (2.3V - 3.6V)" and page 42 section "Table 8. AC CHARACTERISTICS
(Temperature = -40°C to 85°C for Industrial grade, VCC = 2.3V - 3.6V)" row
"tCE" indicate that the maximum time required for Chip Erase opcode to
complete is 3.2 s, and for Sector Erase it is 400 ms.
Update the timeouts such that the Chip Erase timeout is set to 5 seconds
and the Sector Erase timeout is set to 500 ms. Such lengthy timeouts
ought to be sufficient for the majority of SPI EEPROM chips.
[1] https://www.macronix.com/Lists/Datasheet/Attachments/8634/MX25L5121E,%203V,%20512Kb,%20v1.3.pdf
[2] https://www.macronix.com/Lists/Datasheet/Attachments/8750/MX25V5126F,%202.5V,%20512Kb,%20v1.1.pdf
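For illustration, the new values could be expressed as constants like these;
the macro names are assumptions, the figures come from the text above:
```
/* Sketch: erase timeouts sized for slower SPI EEPROMs. */
#define RENESAS_ROM_CHIP_ERASE_TIMEOUT_MS	5000	/* worst case seen: 3.2 s */
#define RENESAS_ROM_SECTOR_ERASE_TIMEOUT_MS	 500	/* worst case seen: 400 ms */
```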
Fixes: 2478be82de44 ("usb: renesas-xhci: Add ROM loader for uPD720201")
Cc: stable <stable@kernel.org>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Link: https://lore.kernel.org/r/20250802225526.25431-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Upon resume from system suspend, the PM runtime core issues the
following warning:
tegra-xudc 3550000.usb: Runtime PM usage count underflow!
This is because tegra_xudc_resume() unconditionally calls
schedule_work(&xudc->usb_role_sw_work) whether or not anything has
changed, which causes tegra_xudc_device_mode_off() to be called
even when we're already in that mode.
Keep track of the current state of "device_mode", and only schedule
this work if it has changed from the hardware state on resume.
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1uhtkH-007KDZ-JT@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Another SanDisk 3.2Gen1 Flash Drive also needs the DELAY_INIT quirk,
or it will randomly work incorrectly on Huawei HiSilicon platforms
when doing reboot tests.
Signed-off-by: Miao Li <limiao@kylinos.cn>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20250801082728.469406-1-limiao870622@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Change the name of the kcontrol from "Gain" to "Volume".
Signed-off-by: Baojun Xu <baojun.xu@ti.com>
Link: https://patch.msgid.link/20250813100708.12197-1-baojun.xu@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
After commit 0b2b066f8a85 ("io_uring/io-wq: only create a new worker
if it can make progress"), in our production environment, we still
observe that some io_worker threads keep being created and destroyed.
After analysis, it was confirmed that this was due to a more complex
scenario involving a large number of fsync operations, which can be
abstracted as frequent write + fsync operations on multiple files in
a single uring instance. Since write is a hash operation while fsync
is not, and fsync is likely to be suspended during execution, the
action of checking the hash value in
io_wqe_dec_running cannot handle such scenarios.
Similarly, if hash-based work and non-hash-based work are sent at the
same time, similar issues are likely to occur.
Returning to the starting point of the issue, when a new work
arrives, io_wq_enqueue may wake up free worker A, while
io_wq_dec_running may create worker B. Ultimately, only one of A and
B can obtain and process the task, leaving the other in an idle
state. In the end, the issue is caused by inconsistent logic in the
checks performed by io_wq_enqueue and io_wq_dec_running.
Therefore, the problem can be resolved by checking for available
workers in io_wq_dec_running.
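A simplified sketch of that idea: before deciding to create a new worker,
check whether a free worker already exists, mirroring the enqueue-side test.
The structure and field names below are illustrative only, not the real io-wq
layout:
```
#include <linux/list.h>
#include <linux/spinlock.h>

/* Illustrative only: the real io-wq types differ. The point is that the
 * "do we need a new worker?" decision in io_wq_dec_running() should use
 * the same "is there a free worker?" test that io_wq_enqueue() uses. */
struct acct_sketch {
	raw_spinlock_t lock;
	struct list_head free_list;	/* idle workers */
};

static bool acct_has_free_worker(struct acct_sketch *acct)
{
	bool has_free;

	raw_spin_lock(&acct->lock);
	has_free = !list_empty(&acct->free_list);
	raw_spin_unlock(&acct->lock);

	return has_free;
}
```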
Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
Reviewed-by: Diangang Li <lidiangang@bytedance.com>
Link: https://lore.kernel.org/r/20250813120214.18729-1-changfengnan@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The NODATASUM message was printed twice by mistake and the NODATACOW was
missing from the 'unset' part. Fix the duplication and make the output
look the same.
Fixes: eddb1a433f26 ("btrfs: add reconfigure callback for fs_context")
CC: stable@vger.kernel.org # 6.8+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
After the fsconfig migration in 6.8, mount option info messages are no
longer displayed during mount operations because btrfs_emit_options() is
only called during remount, not during initial mount.
Fix this by calling btrfs_emit_options() in btrfs_fill_super() after
open_ctree() succeeds. Additionally, prevent log duplication by ensuring
btrfs_check_options() handles validation with warn-level and err-level
messages, while btrfs_emit_options() provides info-level messages.
Fixes: eddb1a433f26 ("btrfs: add reconfigure callback for fs_context")
CC: stable@vger.kernel.org # 6.8+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Fix a wrong log message that appears when the "nobarrier" mount option
is unset. When "nobarrier" is unset, barrier is actually enabled.
However, the log incorrectly stated "turning off barriers".
Fixes: eddb1a433f26 ("btrfs: add reconfigure callback for fs_context")
CC: stable@vger.kernel.org # 6.12+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The commit f2cb97ee964a ("btrfs: index buffer_tree using node size")
changed the index of buffer_tree from "start >> sectorsize_bits" to "start
>> nodesize_bits". However, the change is not applied for
wait_eb_writebacks() and caused IO failures by writing in a full zone. Use
the index properly.
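The change amounts to using the node-size shift for the index, roughly as
follows (field names assumed from the commit text):
```
/* Sketch: wait_eb_writebacks() must index buffer_tree by node size. */
unsigned long index = eb->start >> eb->fs_info->nodesize_bits;
/* previously: eb->start >> eb->fs_info->sectorsize_bits */
```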
Fixes: f2cb97ee964a ("btrfs: index buffer_tree using node size")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_subpage_set_writeback() calls folio_start_writeback() the first time
a folio is written back, and it also clears the PAGECACHE_TAG_TOWRITE tag
even if there are still dirty blocks in the folio. This can break ordering
guarantees, such as those required by btrfs_wait_ordered_extents().
That ordering breakage leads to a real failure. For example, running
generic/464 on a zoned setup will hit the following ASSERT. This happens
because the broken ordering fails to flush existing dirty pages before the
file size is truncated.
assertion failed: !list_empty(&ordered->list) :: 0, in fs/btrfs/zoned.c:1899
------------[ cut here ]------------
kernel BUG at fs/btrfs/zoned.c:1899!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 2 UID: 0 PID: 1906169 Comm: kworker/u130:2 Kdump: loaded Not tainted 6.16.0-rc6-BTRFS-ZNS+ #554 PREEMPT(voluntary)
Hardware name: Supermicro Super Server/H12SSL-NT, BIOS 2.0 02/22/2021
Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
RIP: 0010:btrfs_finish_ordered_zoned.cold+0x50/0x52 [btrfs]
RSP: 0018:ffffc9002efdbd60 EFLAGS: 00010246
RAX: 000000000000004c RBX: ffff88811923c4e0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff827e38b1 RDI: 00000000ffffffff
RBP: ffff88810005d000 R08: 00000000ffffdfff R09: ffffffff831051c8
R10: ffffffff83055220 R11: 0000000000000000 R12: ffff8881c2458c00
R13: ffff88811923c540 R14: ffff88811923c5e8 R15: ffff8881c1bd9680
FS: 0000000000000000(0000) GS:ffff88a04acd0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f907c7a918c CR3: 0000000004024000 CR4: 0000000000350ef0
Call Trace:
<TASK>
? srso_return_thunk+0x5/0x5f
btrfs_finish_ordered_io+0x4a/0x60 [btrfs]
btrfs_work_helper+0xf9/0x490 [btrfs]
process_one_work+0x204/0x590
? srso_return_thunk+0x5/0x5f
worker_thread+0x1d6/0x3d0
? __pfx_worker_thread+0x10/0x10
kthread+0x118/0x230
? __pfx_kthread+0x10/0x10
ret_from_fork+0x205/0x260
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Consider process A calling writepages() with WB_SYNC_NONE. In zoned mode or
for compressed writes, it locks several folios for delalloc and starts
writing them out. Let's call the last locked folio folio X. Suppose the
write range only partially covers folio X, leaving some pages dirty.
Process A calls btrfs_subpage_set_writeback() when building a bio. This
function call clears the TOWRITE tag of folio X, whose size = 8K and
the block size = 4K. It is in the following state.
0 4K 8K
|/////|/////| (flag: DIRTY, tag: DIRTY)
<-----> Process A will write this range.
Now suppose process B concurrently calls writepages() with WB_SYNC_ALL. It
calls tag_pages_for_writeback() to tag dirty folios with
PAGECACHE_TAG_TOWRITE. Since folio X is still dirty, it gets tagged. Then,
B collects tagged folios using filemap_get_folios_tag() and must wait for
folio X to be written before returning from writepages().
0 4K 8K
|/////|/////| (flag: DIRTY, tag: DIRTY|TOWRITE)
However, between tagging and collecting, process A may call
btrfs_subpage_set_writeback() and clear folio X's TOWRITE tag.
0 4K 8K
| |/////| (flag: DIRTY|WRITEBACK, tag: DIRTY)
As a result, process B won't see folio X in its batch, and returns without
waiting for it. This breaks the WB_SYNC_ALL ordering requirement.
Fix this by using btrfs_subpage_set_writeback_keepwrite(), which retains
the TOWRITE tag. We now manually clear the tag only after the folio becomes
clean, via the xas operation.
Fixes: 3470da3b7d87 ("btrfs: subpage: introduce helpers for writeback status")
CC: stable@vger.kernel.org # 6.12+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[POSSIBLE BUG]
After commit 5e121ae687b8 ("btrfs: use buffer xarray for extent buffer
writeback operations"), we have a dedicated xarray for extent buffers,
and a lot of tags are migrated to that buffer tree, like
PAGECACHE_TAG_TOWRITE/DIRTY/WRITEBACK.
This frees us from the limits of page flags, but there is a new
asymmetric behavior, we call buffer_tree_tag_for_writeback() to set
PAGECACHE_TAG_TOWRITE for the involved ranges, but there is no one to
clear that tag.
Before that rework, we relied on the page cache tag which was cleared
when folio_start_writeback() was called.
Although this has its own problems (e.g. the first one calling
folio_start_writeback() will clear the tag for the whole page), it at
least cleared the tag.
But now our real tags are stored in the buffer tree, no one is really
clearing the PAGECACHE_TAG_TOWRITE tag now.
[FIX]
Thankfully this is not going to cause any real bug, but just some
inefficiency iterating the extent buffers.
If we hit an extent buffer which is not dirty but still has the
PAGECACHE_TAG_TOWRITE tag, lock_extent_buffer_for_io() will skip it, so
we won't write back the extent buffer again.
To properly fix the inefficiency, just clear the PAGECACHE_TAG_TOWRITE
inside lock_extent_buffer_for_io().
There is no error path between lock_extent_buffer_for_io() and
write_one_eb(), so we're safe to clear the tag there.
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
If we are doing an unlink for log replay, we are updating the directory's
mtime and ctime to the current time, and this is incorrect since it should
stay with the mtime and ctime that were set when the directory was logged.
This is the same as when adding a link to an inode during log replay (with
btrfs_add_link()), where we want the mtime and ctime to be the values that
were in place when the inode was logged.
This was found with generic/547 using LOAD_FACTOR=20 and TIME_FACTOR=20,
where due to large log trees we have longer log replay times and fssum
could detect a mismatch of the mtime and ctime of a directory.
Fix this by skipping the mtime and ctime update at __btrfs_unlink_inode()
if we are in log replay context (just like btrfs_add_link()).
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[BUG]
If btrfs_writepage_cow_fixup() failed (returning value -EUCLEAN),
the block will be kept dirty, but with its corresponding range finished
in the ordered extent.
Currently that error pattern is only possible for experimental builds,
which place an extra check to ensure we don't hit a dirty block
without a corresponding ordered extent.
This means if later a writeback happens again, we can hit the following
problems:
- ASSERT(block_start != EXTENT_MAP_HOLE) in submit_one_sector()
If the original extent map is a hole, then we can hit this case, as
the new ordered extent failed, we will drop the new extent map and
re-read one from the disk.
- DEBUG_WARN() in btrfs_writepage_cow_fixup()
This is because we no longer have an ordered extent for those dirty
blocks. The original for them is already finished with error.
[CAUSE]
The function btrfs_writepage_cow_fixup() is not following the regular
error handling of writeback. The common practice is to clear the folio
dirty, start and finish the writeback for the block.
This is normally done by extent_clear_unlock_delalloc() with
PAGE_START_WRITEBACK | PAGE_END_WRITEBACK flags during
run_delalloc_range().
So if we keep those failed blocks dirty, they will stay in the page
cache and wait for the next writeback.
And since the original ordered extent is already finished and removed,
depending on the original extent map, we either hit the ASSERT() inside
submit_one_sector(), or hit the DEBUG_WARN() in
btrfs_writepage_cow_fixup() again (which is rather ironic).
[FIX]
Follow the regular error handling to clear the dirty flag for the block
range, start and finish writeback for that block range instead.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
[BUG]
If submit_one_sector() failed, the block will be kept dirty, but with
their corresponding range finished in the ordered extent.
This means if a writeback happens later again, we can hit the following
problems:
- ASSERT(block_start != EXTENT_MAP_HOLE) in submit_one_sector()
If the original extent map is a hole, then we can hit this case, as
the new ordered extent failed, we will drop the new extent map and
re-read one from the disk.
- DEBUG_WARN() in btrfs_writepage_cow_fixup()
This is because we no longer have an ordered extent for those dirty
blocks. The original for them is already finished with error.
[CAUSE]
The function submit_one_sector() is not following the regular error
handling of writeback. The common practice is to clear the folio dirty,
start and finish the writeback for the block.
This is normally done by extent_clear_unlock_delalloc() with
PAGE_START_WRITEBACK | PAGE_END_WRITEBACK flags during
run_delalloc_range().
So if we keep those failed blocks dirty, they will stay in the page
cache and wait for the next writeback.
And since the original ordered extent is already finished and removed,
depending on the original extent map, we either hit the ASSERT() inside
submit_one_sector(), or hit the DEBUG_WARN() in
btrfs_writepage_cow_fixup().
[FIX]
Follow the regular error handling to clear the dirty flag for the block,
start and finish writeback for that block instead.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
After the driver module is removed and during the re-install stage,
continuous user touches on the screen risk impacting THC hardware
initialization, which causes driver installation failure.
This patch enhances this flow by quiescing the external touch
interrupt after the driver is removed, which makes the THC hardware
ignore external interrupts during the remove and re-install stage.
Signed-off-by: Even Xu <even.xu@intel.com>
Tested-by: Rui Zhang <rui1.zhang@intel.com>
Fixes: 66b59bfce6d9 ("HID: intel-thc-hid: intel-quicki2c: Complete THC QuickI2C driver")
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
commit 9e59d609763f ("md: call del_gendisk in control path") changes the
way del_gendisk is called from asynchronous to synchronous. But it breaks
the mdadm --assemble command. The assemble command runs like this:
1. create the array
2. stop the array
3. access the sysfs files after stopping
The sync way calls del_gendisk in step 2, so all sysfs files are removed.
To avoid breaking the mdadm assemble command, this patch adds the parameter
legacy_async_del_gendisk that can be used to choose which way. The default
is the async way. In the future, we plan to change the default to the sync
way in kernel 7.0. Then users need to upgrade to mdadm 4.5+, which removes
step 2.
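A sketch of what such a module parameter typically looks like; the permission
bits and description text here are assumptions:
```
#include <linux/module.h>

/* Sketch: switch between the legacy asynchronous del_gendisk path
 * (default) and the new synchronous one. */
static bool legacy_async_del_gendisk = true;
module_param(legacy_async_del_gendisk, bool, 0444);
MODULE_PARM_DESC(legacy_async_del_gendisk,
		 "Use the legacy asynchronous del_gendisk path (default: true)");
```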
Fixes: 9e59d609763f ("md: call del_gendisk in control path")
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Closes: https://lore.kernel.org/linux-raid/CAMw=ZnQ=ET2St-+hnhsuq34rRPnebqcXqP1QqaHW5Bh4aaaZ4g@mail.gmail.com/T/#t
Suggested-and-reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/linux-raid/20250813032929.54978-1-xni@redhat.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
|
|
The commit 245618f8e45f ("block: protect wbt_lat_usec using
q->elevator_lock") protected wbt_enable_default() with
q->elevator_lock; however, it also placed wbt_enable_default()
before blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);, resulting
in wbt failing to be enabled.
Moreover, the protection of wbt_enable_default() by q->elevator_lock
was removed in commit 78c271344b6f ("block: move wbt_enable_default()
out of queue freezing from sched ->exit()"), so we can directly fix
this issue by placing wbt_enable_default() after
blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q);.
Additionally, this issue also causes the inability to read the
wbt_lat_usec file, and the scenario is as follows:
root@q:/sys/block/sda/queue# cat wbt_lat_usec
cat: wbt_lat_usec: Invalid argument
root@q:/data00/sjc/linux# ls /sys/kernel/debug/block/sda/rqos
cannot access '/sys/kernel/debug/block/sda/rqos': No such file or directory
root@q:/data00/sjc/linux# find /sys -name wbt
/sys/kernel/debug/tracing/events/wbt
After testing with this patch, wbt can be enabled normally.
Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
Cc: stable@vger.kernel.org
Fixes: 245618f8e45f ("block: protect wbt_lat_usec using q->elevator_lock")
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250812154257.57540-1-sunjunchao@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fix spelling mistake directoy to directory
Reported-by: codespell
Signed-off-by: Erick Karanja <karanja99erick@gmail.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250813071837.668613-1-karanja99erick@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
DIP algorithm is also supported on devices newer than hip09, so free
dip entries too.
Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250812122602.3524602-1-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
In ntrig_report_version(), the hdev parameter is passed from hid_probe().
Sending a descriptor to /dev/uhid can make hdev->dev.parent->parent NULL.
If hdev->dev.parent->parent is NULL, the usb_dev returned by
hid_to_usb_dev(hdev) has an invalid address (0xffffffffffffff58). When
usb_rcvctrlpipe() uses that usb_dev, it triggers a page fault for the
address (0xffffffffffffff58).
Add a NULL check to ntrig_report_version() before calling
hid_to_usb_dev().
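A simplified sketch of the check; the real function's signature and body
differ:
```
#include <linux/hid.h>
#include <linux/usb.h>

/* Sketch: refuse to look up the USB parent when the HID device was not
 * created on top of a USB interface (e.g. it came from /dev/uhid). */
static void ntrig_report_version(struct hid_device *hdev)
{
	struct usb_device *usb_dev;

	if (!hdev->dev.parent || !hdev->dev.parent->parent)
		return;

	usb_dev = hid_to_usb_dev(hdev);
	/* ... existing control transfer via usb_rcvctrlpipe(usb_dev, 0) ... */
}
```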
Signed-off-by: Minjong Kim <minbell.kim@samsung.com>
Link: https://patch.msgid.link/20250813-hid-ntrig-page-fault-fix-v2-1-f98581f35106@samsung.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
|
|
Ensure that the pfn_list allocated by kvcalloc() is freed using the
corresponding kvfree() function. Match the memory allocation and free
routines: kvcalloc -> kvfree.
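The pattern in question, as a small sketch with illustrative variable names:
```
#include <linux/mm.h>
#include <linux/slab.h>

/* Sketch: kvcalloc() may fall back to vmalloc, so it must be paired
 * with kvfree(), never plain kfree(). */
unsigned long *pfn_list = kvcalloc(npfns, sizeof(*pfn_list), GFP_KERNEL);
if (!pfn_list)
	return -ENOMEM;
/* ... fill and use pfn_list ... */
kvfree(pfn_list);
```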
Fixes: 259e9bd07c57 ("RDMA/core: Avoid hmm_dma_map_alloc() for virtual DMA devices")
Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in>
Link: https://patch.msgid.link/aJjcPjL1BVh8QrMN@bhairav-test.ee.iitb.ac.in
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
This maintainer's email no longer works. Remove it from MAINTAINERS.
This still leaves one maintainer for the driver.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Cc: linux-rdma@vger.kernel.org
Link: https://patch.msgid.link/20250808175601.EF0AF767@davehans-spike.ostc.intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
memset the PBL page pointer and page map arrays before
populating the SGL addresses of the HWQ.
Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Signed-off-by: Anantha Prabhu <anantha.prabhu@broadcom.com>
Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20250805101000.233310-5-kalesh-anakkur.purayil@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
The GID context reuse logic requires that the context memory is not
freed if and when the DEL_GID firmware command fails. But if there's
no subsequent ADD_GID to reuse it, the context memory must be freed
when the driver is unloaded. Otherwise it leads to a memory leak.
Below is the kmemleak trace reported:
unreferenced object 0xffff88817a4f34d0 (size 8):
comm "insmod", pid 1072504, jiffies 4402561550
hex dump (first 8 bytes):
01 00 00 00 00 00 00 00 ........
backtrace (crc ccaa009e):
__kmalloc_cache_noprof+0x33e/0x400
0xffffffffc2db9d48
add_modify_gid+0x5e0/0xb60 [ib_core]
__ib_cache_gid_add+0x213/0x350 [ib_core]
update_gid+0xf2/0x180 [ib_core]
enum_netdev_ipv4_ips+0x3f3/0x690 [ib_core]
enum_all_gids_of_dev_cb+0x125/0x1b0 [ib_core]
ib_enum_roce_netdev+0x14b/0x250 [ib_core]
ib_cache_setup_one+0x2e5/0x540 [ib_core]
ib_register_device+0x82c/0xf10 [ib_core]
0xffffffffc2df5ad9
0xffffffffc2da8b07
0xffffffffc2db174d
auxiliary_bus_probe+0xa5/0x120
really_probe+0x1e4/0x850
__driver_probe_device+0x18f/0x3d0
Fixes: 4a62c5e9e2e1 ("RDMA/bnxt_re: Do not free the ctx_tbl entry if delete GID fails")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20250805101000.233310-4-kalesh-anakkur.purayil@broadcom.com
Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
There should not be any checks of the current workload to set the
srq_limit value in the SRQ hw context.
Remove all such workload checks and make a direct call to set srq_limit
via the SRQ_ARM doorbell.
Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20250805101000.233310-3-kalesh-anakkur.purayil@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Whenever an SRQ is created, make sure SRQ arm enable is always set.
The driver is always ready to receive the SRQ ASYNC event.
Additional note -
There is no need to do srq arm enable conditionally.
See bnxt_qplib_armen_db in bnxt_qplib_create_cq().
Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Link: https://patch.msgid.link/20250805101000.233310-2-kalesh-anakkur.purayil@broadcom.com
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
When using DIP algorithm, all QPs establishing connections with
the same destination IP share the same SCC, which is indexed by
dip_idx, but dip_idx isn't necessarily equal to qpn. Therefore,
dip_idx should be used to query SCC context instead of qpn.
Fixes: 124a9fbe43aa ("RDMA/hns: Append SCC context to the raw dump of QPC")
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250726075345.846957-1-huangjunxian6@hisilicon.com
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
When there is no active zone limit, we can technically write into any
number of zones at the same time. However, exceeding the max open zones can
degrade performance. To prevent this, set the max_active_zones to
bdev_max_open_zones() if there is no active zone limit.
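A sketch of that fallback; bdev_max_active_zones() and bdev_max_open_zones()
are the existing block layer accessors, and the surrounding code is omitted:
```
/* Sketch: when the device reports no active-zone limit (0), cap activity
 * at the open-zone limit to avoid degraded performance. */
unsigned int max_active_zones = bdev_max_active_zones(bdev);

if (!max_active_zones)
	max_active_zones = bdev_max_open_zones(bdev);
```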
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Since commit 13bb483d32ab ("btrfs: zoned: activate metadata block group on
write time"), we activate a metadata block group at the write time. If the
zone capacity is small enough, we can allocate the entire region before the
first write. Then, we hit the btrfs_zoned_bg_is_full() in
btrfs_zone_activate() and the activation fails.
For a data block group, we activate it at allocation time and we should
check the fullness condition on the caller side. Add a WARN to check
the fullness condition.
For a metadata block group, we don't need the fullness check because we
activate it at the write time. Instead, activating it once it is written
should be invalid. Catch that with a WARN too.
Fixes: 13bb483d32ab ("btrfs: zoned: activate metadata block group on write time")
CC: stable@vger.kernel.org # 6.6+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
btrfs_zoned_reserve_data_reloc_bg() is called on mount and at that point,
all data block groups belong to the primary data space_info. So, we don't
find anything in the data relocation space_info.
Also, the condition "bg->used > 0" can select a block group with full of
zone_unusable bytes for the candidate. As we cannot allocate from the block
group, it is useless to reserve it as the data relocation block group.
Furthermore, because of the space_info separation, we need to migrate the
selected block group to the data relocation space_info. If not, the extent
allocator cannot use the block group to do the allocation.
This commit fixes these three issues.
Fixes: e606ff985ec7 ("btrfs: zoned: reserve data_reloc block group on mount")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Don't call ZONE FINISH for conventional zones as this will result in I/O
errors. Instead check if the zone that needs finishing is a conventional
zone and if yes skip it.
Also factor out the actual handling of finishing a single zone into a
helper function, as do_zone_finish() is growing ever bigger and the
indentations levels are getting higher.
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The QPN of the GSI QP was not set, which may cause issues.
Set the QPN to 1 when creating the GSI QP.
Fixes: 999a0a2e9b87 ("RDMA/erdma: Support UD QPs and UD WRs")
Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com>
Link: https://patch.msgid.link/20250725055410.67520-4-boshiyu@linux.alibaba.com
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
The init_kernel_qp interface may fail. Check its return value and free
related resources properly when it does.
Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation")
Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com>
Link: https://patch.msgid.link/20250725055410.67520-3-boshiyu@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
When skb packets are sent out, they still depend on rxe resources,
for example the QP and sk, until the packets are destroyed.
If these rxe resources are released before the skb packets are
destroyed, call traces will appear.
To avoid skb packets hanging too long in some network devices,
a timestamp is added when these skb packets are created. If the
skb packets hang too long in the network devices, these skb packets
can be freed to release the rxe resources.
Reported-by: syzbot+8425ccfb599521edb153@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8425ccfb599521edb153
Tested-by: syzbot+8425ccfb599521edb153@syzkaller.appspotmail.com
Fixes: 1a633bdc8fd9 ("RDMA/rxe: Let destroy qp succeed with stuck packet")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://patch.msgid.link/20250726013104.463570-1-yanjun.zhu@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
This reverts commit 515986100d176663d0a03219a3056e4252f729e6.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence this revert, going back to using .import_attach->dmabuf.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250715082635.34974-1-tzimmermann@suse.de
|