Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 6e64d6b3a3c39655de56682ec83e894978d23412 upstream.
In commit e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL
after job completion"), we introduced a change to assign the job pointer
to NULL after completing a job, indicating job completion.
However, this approach created a race condition between the DRM
scheduler workqueue and the IRQ execution thread. As soon as the fence is
signaled in the IRQ execution thread, a new job starts to be executed.
This results in a race condition where the IRQ execution thread sets the
job pointer to NULL simultaneously as the `run_job()` function assigns
a new job to the pointer.
This race condition can lead to a NULL pointer dereference if the IRQ
execution thread sets the job pointer to NULL after `run_job()` assigns
it to the new job. When the new job completes and the GPU emits an
interrupt, `v3d_irq()` is triggered, potentially causing a crash.
[ 466.310099] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0
[ 466.318928] Mem abort info:
[ 466.321723] ESR = 0x0000000096000005
[ 466.325479] EC = 0x25: DABT (current EL), IL = 32 bits
[ 466.330807] SET = 0, FnV = 0
[ 466.333864] EA = 0, S1PTW = 0
[ 466.337010] FSC = 0x05: level 1 translation fault
[ 466.341900] Data abort info:
[ 466.344783] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 466.350285] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 466.355350] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 466.360677] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000089772000
[ 466.367140] [00000000000000c0] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 466.375875] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[ 466.382163] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device algif_hash algif_skcipher af_alg bnep binfmt_misc vc4 snd_soc_hdmi_codec drm_display_helper cec brcmfmac_wcc spidev rpivid_hevc(C) drm_client_lib brcmfmac hci_uart drm_dma_helper pisp_be btbcm brcmutil snd_soc_core aes_ce_blk v4l2_mem2mem bluetooth aes_ce_cipher snd_compress videobuf2_dma_contig ghash_ce cfg80211 gf128mul snd_pcm_dmaengine videobuf2_memops ecdh_generic sha2_ce ecc videobuf2_v4l2 snd_pcm v3d sha256_arm64 rfkill videodev snd_timer sha1_ce libaes gpu_sched snd videobuf2_common sha1_generic drm_shmem_helper mc rp1_pio drm_kms_helper raspberrypi_hwmon spi_bcm2835 gpio_keys i2c_brcmstb rp1 raspberrypi_gpiomem rp1_mailbox rp1_adc nvmem_rmem uio_pdrv_genirq uio i2c_dev drm ledtrig_pattern drm_panel_orientation_quirks backlight fuse dm_mod ip_tables x_tables ipv6
[ 466.458429] CPU: 0 UID: 1000 PID: 2008 Comm: chromium Tainted: G C 6.13.0-v8+ #18
[ 466.467336] Tainted: [C]=CRAP
[ 466.470306] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
[ 466.476157] pstate: 404000c9 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 466.483143] pc : v3d_irq+0x118/0x2e0 [v3d]
[ 466.487258] lr : __handle_irq_event_percpu+0x60/0x228
[ 466.492327] sp : ffffffc080003ea0
[ 466.495646] x29: ffffffc080003ea0 x28: ffffff80c0c94200 x27: 0000000000000000
[ 466.502807] x26: ffffffd08dd81d7b x25: ffffff80c0c94200 x24: ffffff8003bdc200
[ 466.509969] x23: 0000000000000001 x22: 00000000000000a7 x21: 0000000000000000
[ 466.517130] x20: ffffff8041bb0000 x19: 0000000000000001 x18: 0000000000000000
[ 466.524291] x17: ffffffafadfb0000 x16: ffffffc080000000 x15: 0000000000000000
[ 466.531452] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 466.538613] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffffd08c527eb0
[ 466.545777] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[ 466.552941] x5 : ffffffd08c4100d0 x4 : ffffffafadfb0000 x3 : ffffffc080003f70
[ 466.560102] x2 : ffffffc0829e8058 x1 : 0000000000000001 x0 : 0000000000000000
[ 466.567263] Call trace:
[ 466.569711] v3d_irq+0x118/0x2e0 [v3d] (P)
[ 466.573826] __handle_irq_event_percpu+0x60/0x228
[ 466.578546] handle_irq_event+0x54/0xb8
[ 466.582391] handle_fasteoi_irq+0xac/0x240
[ 466.586498] generic_handle_domain_irq+0x34/0x58
[ 466.591128] gic_handle_irq+0x48/0xd8
[ 466.594798] call_on_irq_stack+0x24/0x58
[ 466.598730] do_interrupt_handler+0x88/0x98
[ 466.602923] el0_interrupt+0x44/0xc0
[ 466.606508] __el0_irq_handler_common+0x18/0x28
[ 466.611050] el0t_64_irq_handler+0x10/0x20
[ 466.615156] el0t_64_irq+0x198/0x1a0
[ 466.618740] Code: 52800035 3607faf3 f9442e80 52800021 (f9406018)
[ 466.624853] ---[ end trace 0000000000000000 ]---
[ 466.629483] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[ 466.636384] SMP: stopping secondary CPUs
[ 466.640320] Kernel Offset: 0x100c400000 from 0xffffffc080000000
[ 466.646259] PHYS_OFFSET: 0x0
[ 466.649141] CPU features: 0x100,00000170,00901250,0200720b
[ 466.654644] Memory Limit: none
[ 466.657706] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
Fix the crash by assigning the job pointer to NULL before signaling the
fence. This ensures that the job pointer is cleared before any new job
starts execution, preventing the race condition and the NULL pointer
dereference crash.
Cc: stable@vger.kernel.org
Fixes: e4b5ccd392b9 ("drm/v3d: Ensure job pointer is set to NULL after job completion")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Phil Elwell <phil@raspberrypi.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250123012403.20447-1-mcanal@igalia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 222f3390c15c4452a9f7e26f5b7d9138e75d00d5 upstream.
Add Wooting Two HE (ARM) to the list of supported devices.
Signed-off-by: Jack Greiner <jack@emoss.org>
Signed-off-by: Pavel Rojtberg <rojtberg@gmail.com>
Link: https://lore.kernel.org/r/20250107192830.414709-3-rojtberg@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e4940fe6322c851659c17852b671c6e7b1aa9f56 upstream.
Although it mimics the Microsoft's VendorID, it is in fact a clone.
Taking into account that the original Microsoft Receiver is not being
manufactured anymore, this drive can solve dpad issues encontered by
those who still use the original 360 Wireless controller
but are using a receiver clone.
Signed-off-by: Nilton Perim Neto <niltonperimneto@gmail.com>
Signed-off-by: Pavel Rojtberg <rojtberg@gmail.com>
Link: https://lore.kernel.org/r/20250107192830.414709-12-rojtberg@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 907bc9268a5a9f823ffa751957a5c1dd59f83f42 upstream.
Microsoft defined Meta+Shift+F23 as the Copilot shortcut instead of a
dedicated keycode, and multiple vendors have their keyboards emit this
sequence in response to users pressing a dedicated "Copilot" key.
Unfortunately the default keymap table in atkbd does not map scancode
0x6e (F23) and so the key combination does not work even if userspace
is ready to handle it.
Because this behavior is common between multiple vendors and the
scancode is currently unused map 0x6e to keycode 193 (KEY_F23) so that
key sequence is generated properly.
MS documentation for the scan code:
https://learn.microsoft.com/en-us/windows/win32/inputdev/about-keyboard-input#scan-codes
Confirmed on Lenovo, HP and Dell machines by Canonical.
Tested on Lenovo T14s G6 AMD.
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20250107034554.25843-1-mpearson-lenovo@squebb.ca
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
the crash caused by port being null"
commit 086fd062bc3883ae1ce4166cff5355db315ad879 upstream.
This reverts commit 13014969cbf07f18d62ceea40bd8ca8ec9d36cec.
It is reported to cause crashes on Tegra systems, so revert it for now.
Link: https://lore.kernel.org/r/1037c1ad-9230-4181-b9c3-167dbaa47644@nvidia.com
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Cc: stable <stable@kernel.org>
Cc: Lianqin Hu <hulianqin@vivo.com>
Link: https://lore.kernel.org/r/2025011711-yippee-fever-a737@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 575a5adf48b06a2980c9eeffedf699ed5534fade upstream.
This patch addresses a null-ptr-deref in qt2_process_read_urb() due to
an incorrect bounds check in the following:
if (newport > serial->num_ports) {
dev_err(&port->dev,
"%s - port change to invalid port: %i\n",
__func__, newport);
break;
}
The condition doesn't account for the valid range of the serial->port
buffer, which is from 0 to serial->num_ports - 1. When newport is equal
to serial->num_ports, the assignment of "port" in the
following code is out-of-bounds and NULL:
serial_priv->current_port = newport;
port = serial->port[serial_priv->current_port];
The fix checks if newport is greater than or equal to serial->num_ports
indicating it is out-of-bounds.
Reported-by: syzbot <syzbot+506479ebf12fe435d01a@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=506479ebf12fe435d01a
Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver")
Cc: <stable@vger.kernel.org> # 3.5
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit efbe8f81952fe469d38655744627d860879dcde8 upstream.
Validate index before access iwl_rate_mcs to keep rate->index
inside the valid boundaries. Use MCS_0_INDEX if index is less
than MCS_0_INDEX and MCS_9_INDEX if index is greater then
MCS_9_INDEX.
Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230614123447.79f16b3aef32.If1137f894775d6d07b78cbf3a6163ffce6399507@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d2138eab8cde61e0e6f62d0713e45202e8457d6d upstream.
If there's a persistent error in the hypervisor, the SCSI warning for
failed I/O can flood the kernel log and max out CPU utilization,
preventing troubleshooting from the VM side. Ratelimit the warning so
it doesn't DoS the VM.
Closes: https://github.com/microsoft/WSL/issues/9173
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250107-eahariha-ratelimit-storvsc-v1-1-7fc193d1f2b0@linux.microsoft.com
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ce9ff21ea89d191e477a02ad7eabf4f996b80a69 upstream.
count and offset are passed from user space and not checked, only
offset is capped to 40 bits, which can be used to read/write out of
bounds of the device.
Fixes: 6e3f26456009 (“vfio/platform: read and write support for the device fd”)
Cc: stable@vger.kernel.org
Reported-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3d88ba86ba6f35a0467f25a88c38aa5639190d04 upstream.
This reverts commit 251efae73bd46b097deec4f9986d926813aed744.
Quoting Wang Yuli:
"The 27C6:01E0 touchpad doesn't require the workaround and applying it
would actually break functionality.
The initial report came from a BBS forum, but we suspect the
information provided by the forum user may be incorrect which could
happen sometimes. [1]
Further investigation showed that the Lenovo Y9000P 2024 doesn't even
use a Goodix touchpad. [2]
For the broader issue of 27c6:01e0 being unusable on some devices, it
just need to address it with a libinput quirk.
In conclusion, we should revert this commit, which is the best
solution."
Reported-by: Ulrich Müller <ulm@gentoo.org>
Reported-by: WangYuli <wangyuli@uniontech.com>
Link: https://lore.kernel.org/all/uikt4wwpw@gentoo.org/
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3061e170381af96d1e66799d34264e6414d428a7 upstream.
At the end of __regmap_init(), if dev is not NULL, regmap_attach_dev()
is called, which adds a devres reference to the regmap, to be able to
retrieve a dev's regmap by name using dev_get_regmap().
When calling regmap_exit, the opposite does not happen, and the
reference is kept until the dev is detached.
Add a regmap_detach_dev() function and call it in regmap_exit() to make
sure that the devres reference is not kept.
Cc: stable@vger.kernel.org
Fixes: 72b39f6f2b5a ("regmap: Implement dev_get_regmap()")
Signed-off-by: Cosmin Tanislav <demonsingur@gmail.com>
Rule: add
Link: https://lore.kernel.org/stable/20241128130554.362486-1-demonsingur%40gmail.com
Link: https://patch.msgid.link/20241128131625.363835-1-demonsingur@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250115033244.2540522-1-tzungbi@kernel.org
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 3a748d483d80f066ca4b26abe45cdc0c367d13e9 ]
Some boards with Allwinner SoCs connect the PMIC's IRQ pin to the SoC's NMI
pin instead of a normal GPIO. Since the power key is connected to the PMIC,
and people expect to wake up a suspended system via this key, the NMI IRQ
controller must stay alive when the system goes into suspend.
Add the SKIP_WAKE flag to prevent the sunxi NMI controller from going to
sleep, so that the power key can wake up those systems.
[ tglx: Fixed up coding style ]
Signed-off-by: Philippe Simons <simons.philippe@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250112123402.388520-1-simons.philippe@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b5c764d6ed556c4e81fbe3fd976da77ec450c08e ]
[Why]
Without the dmub hw lock, it may cause the lock timeout issue
while do modeset on PSR1 eDP panel.
[How]
Allow dmub hw lock for PSR1.
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a2b5a9956269f4c1a09537177f18ab0229fe79f7)
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 63ca02221cc5aa0731fe2b0cc28158aaa4b84982 ]
The ISCSI_UEVENT_GET_HOST_STATS request is already handled in
iscsi_get_host_stats(). This fix ensures that redundant responses are
skipped in iscsi_if_rx().
- On success: send reply and stats from iscsi_get_host_stats()
within if_recv_msg().
- On error: fall through.
Signed-off-by: Xiang Zhang <hawkxiang.cpp@gmail.com>
Link: https://lore.kernel.org/r/20250107022432.65390-1-hawkxiang.cpp@gmail.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 95c38953cb1ecf40399a676a1f85dfe2b5780a9a upstream.
When running 'rmmod ath10k', ath10k_sdio_remove() will free sdio
workqueue by destroy_workqueue(). But if CONFIG_INIT_ON_FREE_DEFAULT_ON
is set to yes, kernel panic will happen:
Call trace:
destroy_workqueue+0x1c/0x258
ath10k_sdio_remove+0x84/0x94
sdio_bus_remove+0x50/0x16c
device_release_driver_internal+0x188/0x25c
device_driver_detach+0x20/0x2c
This is because during 'rmmod ath10k', ath10k_sdio_remove() will call
ath10k_core_destroy() before destroy_workqueue(). wiphy_dev_release()
will finally be called in ath10k_core_destroy(). This function will free
struct cfg80211_registered_device *rdev and all its members, including
wiphy, dev and the pointer of sdio workqueue. Then the pointer of sdio
workqueue will be set to NULL due to CONFIG_INIT_ON_FREE_DEFAULT_ON.
After device release, destroy_workqueue() will use NULL pointer then the
kernel panic happen.
Call trace:
ath10k_sdio_remove
->ath10k_core_unregister
……
->ath10k_core_stop
->ath10k_hif_stop
->ath10k_sdio_irq_disable
->ath10k_hif_power_down
->del_timer_sync(&ar_sdio->sleep_timer)
->ath10k_core_destroy
->ath10k_mac_destroy
->ieee80211_free_hw
->wiphy_free
……
->wiphy_dev_release
->destroy_workqueue
Need to call destroy_workqueue() before ath10k_core_destroy(), free
the work queue buffer first and then free pointer of work queue by
ath10k_core_destroy(). This order matches the error path order in
ath10k_sdio_probe().
No work will be queued on sdio workqueue between it is destroyed and
ath10k_core_destroy() is called. Based on the call_stack above, the
reason is:
Only ath10k_sdio_sleep_timer_handler(), ath10k_sdio_hif_tx_sg() and
ath10k_sdio_irq_disable() will queue work on sdio workqueue.
Sleep timer will be deleted before ath10k_core_destroy() in
ath10k_hif_power_down().
ath10k_sdio_irq_disable() only be called in ath10k_hif_stop().
ath10k_core_unregister() will call ath10k_hif_power_down() to stop hif
bus, so ath10k_sdio_hif_tx_sg() won't be called anymore.
Tested-on: QCA6174 hw3.2 SDIO WLAN.RMH.4.4.1-00189
Signed-off-by: Kang Yang <quic_kangyang@quicinc.com>
Tested-by: David Ruth <druth@chromium.org>
Reviewed-by: David Ruth <druth@chromium.org>
Link: https://patch.msgid.link/20241008022246.1010-1-quic_kangyang@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 48dc44f3c1afa29390cb2fbc8badad1b1111cea4 which is
commit 3061e170381af96d1e66799d34264e6414d428a7 upstream.
It was backported incorrectly, a fixed version will be applied later.
Cc: Cosmin Tanislav <demonsingur@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20250115033244.2540522-1-tzungbi@kernel.org
Reported-by: Tzung-Bi Shih <tzungbi@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f10593ad9bc36921f623361c9e3dd96bd52d85ee upstream.
Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN:
BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30
kernel/locking/lockdep.c:5838
__mutex_unlock_slowpath+0xe2/0x750 kernel/locking/mutex.c:912
sg_release+0x1f4/0x2e0 drivers/scsi/sg.c:407
In sg_release(), the function kref_put(&sfp->f_ref, sg_remove_sfp) is
called before releasing the open_rel_lock mutex. The kref_put() call may
decrement the reference count of sfp to zero, triggering its cleanup
through sg_remove_sfp(). This cleanup includes scheduling deferred work
via sg_remove_sfp_usercontext(), which ultimately frees sfp.
After kref_put(), sg_release() continues to unlock open_rel_lock and may
reference sfp or sdp. If sfp has already been freed, this results in a
slab-use-after-free error.
Move the kref_put(&sfp->f_ref, sg_remove_sfp) call after unlocking the
open_rel_lock mutex. This ensures:
- No references to sfp or sdp occur after the reference count is
decremented.
- Cleanup functions such as sg_remove_sfp() and
sg_remove_sfp_usercontext() can safely execute without impacting the
mutex handling in sg_release().
The fix has been tested and validated by syzbot. This patch closes the
bug reported at the following syzkaller link and ensures proper
sequencing of resource cleanup and mutex operations, eliminating the
risk of use-after-free errors in sg_release().
Reported-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7efb5850a17ba6ce098b
Tested-by: syzbot+7efb5850a17ba6ce098b@syzkaller.appspotmail.com
Fixes: cc833acbee9d ("sg: O_EXCL and other lock handling")
Signed-off-by: Suraj Sonawane <surajsonawane0215@gmail.com>
Link: https://lore.kernel.org/r/20241120125944.88095-1-surajsonawane0215@gmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Alva Lan <alvalan9@foxmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ea4c990fa9e19ffef0648e40c566b94ba5ab31be upstream.
When the qp is in error state, the status of WQEs in the queue should be
set to error. Or else the following will appear.
[ 920.617269] WARNING: CPU: 1 PID: 21 at drivers/infiniband/sw/rxe/rxe_comp.c:756 rxe_completer+0x989/0xcc0 [rdma_rxe]
[ 920.617744] Modules linked in: rnbd_client(O) rtrs_client(O) rtrs_core(O) rdma_ucm rdma_cm iw_cm ib_cm crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel ib_uverbs ib_core loop brd null_blk ipv6
[ 920.618516] CPU: 1 PID: 21 Comm: ksoftirqd/1 Tainted: G O 6.1.113-storage+ #65
[ 920.618986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 920.619396] RIP: 0010:rxe_completer+0x989/0xcc0 [rdma_rxe]
[ 920.619658] Code: 0f b6 84 24 3a 02 00 00 41 89 84 24 44 04 00 00 e9 2a f7 ff ff 39 ca bb 03 00 00 00 b8 0e 00 00 00 48 0f 45 d8 e9 15 f7 ff ff <0f> 0b e9 cb f8 ff ff 41 bf f5 ff ff ff e9 08 f8 ff ff 49 8d bc 24
[ 920.620482] RSP: 0018:ffff97b7c00bbc38 EFLAGS: 00010246
[ 920.620817] RAX: 0000000000000000 RBX: 000000000000000c RCX: 0000000000000008
[ 920.621183] RDX: ffff960dc396ebc0 RSI: 0000000000005400 RDI: ffff960dc4e2fbac
[ 920.621548] RBP: 0000000000000000 R08: 0000000000000001 R09: ffffffffac406450
[ 920.621884] R10: ffffffffac4060c0 R11: 0000000000000001 R12: ffff960dc4e2f800
[ 920.622254] R13: ffff960dc4e2f928 R14: ffff97b7c029c580 R15: 0000000000000000
[ 920.622609] FS: 0000000000000000(0000) GS:ffff960ef7d00000(0000) knlGS:0000000000000000
[ 920.622979] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 920.623245] CR2: 00007fa056965e90 CR3: 00000001107f1000 CR4: 00000000000006e0
[ 920.623680] Call Trace:
[ 920.623815] <TASK>
[ 920.623933] ? __warn+0x79/0xc0
[ 920.624116] ? rxe_completer+0x989/0xcc0 [rdma_rxe]
[ 920.624356] ? report_bug+0xfb/0x150
[ 920.624594] ? handle_bug+0x3c/0x60
[ 920.624796] ? exc_invalid_op+0x14/0x70
[ 920.624976] ? asm_exc_invalid_op+0x16/0x20
[ 920.625203] ? rxe_completer+0x989/0xcc0 [rdma_rxe]
[ 920.625474] ? rxe_completer+0x329/0xcc0 [rdma_rxe]
[ 920.625749] rxe_do_task+0x80/0x110 [rdma_rxe]
[ 920.626037] rxe_requester+0x625/0xde0 [rdma_rxe]
[ 920.626310] ? rxe_cq_post+0xe2/0x180 [rdma_rxe]
[ 920.626583] ? do_complete+0x18d/0x220 [rdma_rxe]
[ 920.626812] ? rxe_completer+0x1a3/0xcc0 [rdma_rxe]
[ 920.627050] rxe_do_task+0x80/0x110 [rdma_rxe]
[ 920.627285] tasklet_action_common.constprop.0+0xa4/0x120
[ 920.627522] handle_softirqs+0xc2/0x250
[ 920.627728] ? sort_range+0x20/0x20
[ 920.627942] run_ksoftirqd+0x1f/0x30
[ 920.628158] smpboot_thread_fn+0xc7/0x1b0
[ 920.628334] kthread+0xd6/0x100
[ 920.628504] ? kthread_complete_and_exit+0x20/0x20
[ 920.628709] ret_from_fork+0x1f/0x30
[ 920.628892] </TASK>
Fixes: ae720bdb703b ("RDMA/rxe: Generate error completion for error requester QP state")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://patch.msgid.link/20241025152036.121417-1-yanjun.zhu@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Bin Lan <lanbincn@qq.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit c807ab3a861f656bc39471a20a16b36632ac6b04 which is
commit 73dae652dcac776296890da215ee7dec357a1032 upstream.
The original patch 73dae652dcac (drm/amdgpu: rework resume handling for
display (v2)), was only targeted at kernels 6.11 and newer. It did not
apply cleanly to 6.12 so I backported it and it backport landed as
99a02eab8251 ("drm/amdgpu: rework resume handling for display (v2)"),
however there was a bug in the backport that was subsequently fixed in
063d380ca28e ("drm/amdgpu: fix backport of commit 73dae652dcac"). None
of this was intended for kernels older than 6.11, however the original
backport eventually landed in 6.6, 6.1, and 5.15.
Please revert the change from kernels 6.6, 6.1, and 5.15.
Link: https://lore.kernel.org/r/BL1PR12MB5144D5363FCE6F2FD3502534F7E72@BL1PR12MB5144.namprd12.prod.outlook.com
Link: https://lore.kernel.org/r/BL1PR12MB51449ADCFBF2314431F8BCFDF7132@BL1PR12MB5144.namprd12.prod.outlook.com
Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Reported-by: Christian König <christian.koenig@amd.com>
Reported-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b61badd20b443eabe132314669bb51a263982e5c upstream.
[ +0.000021] BUG: KASAN: slab-use-after-free in drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[ +0.000027] Read of size 8 at addr ffff8881b8605f88 by task amd_pci_unplug/2147
[ +0.000023] CPU: 6 PID: 2147 Comm: amd_pci_unplug Not tainted 6.10.0+ #1
[ +0.000016] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[ +0.000016] Call Trace:
[ +0.000008] <TASK>
[ +0.000009] dump_stack_lvl+0x76/0xa0
[ +0.000017] print_report+0xce/0x5f0
[ +0.000017] ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[ +0.000019] ? srso_return_thunk+0x5/0x5f
[ +0.000015] ? kasan_complete_mode_report_info+0x72/0x200
[ +0.000016] ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[ +0.000019] kasan_report+0xbe/0x110
[ +0.000015] ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[ +0.000023] __asan_report_load8_noabort+0x14/0x30
[ +0.000014] drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[ +0.000020] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_write+0x14/0x30
[ +0.000016] ? __pfx_drm_sched_entity_flush+0x10/0x10 [gpu_sched]
[ +0.000020] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_write+0x14/0x30
[ +0.000013] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? enable_work+0x124/0x220
[ +0.000015] ? __pfx_enable_work+0x10/0x10
[ +0.000013] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? free_large_kmalloc+0x85/0xf0
[ +0.000016] drm_sched_entity_destroy+0x18/0x30 [gpu_sched]
[ +0.000020] amdgpu_vce_sw_fini+0x55/0x170 [amdgpu]
[ +0.000735] ? __kasan_check_read+0x11/0x20
[ +0.000016] vce_v4_0_sw_fini+0x80/0x110 [amdgpu]
[ +0.000726] amdgpu_device_fini_sw+0x331/0xfc0 [amdgpu]
[ +0.000679] ? mutex_unlock+0x80/0xe0
[ +0.000017] ? __pfx_amdgpu_device_fini_sw+0x10/0x10 [amdgpu]
[ +0.000662] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? __kasan_check_write+0x14/0x30
[ +0.000013] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? mutex_unlock+0x80/0xe0
[ +0.000016] amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
[ +0.000663] drm_minor_release+0xc9/0x140 [drm]
[ +0.000081] drm_release+0x1fd/0x390 [drm]
[ +0.000082] __fput+0x36c/0xad0
[ +0.000018] __fput_sync+0x3c/0x50
[ +0.000014] __x64_sys_close+0x7d/0xe0
[ +0.000014] x64_sys_call+0x1bc6/0x2680
[ +0.000014] do_syscall_64+0x70/0x130
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? irqentry_exit_to_user_mode+0x60/0x190
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? irqentry_exit+0x43/0x50
[ +0.000012] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? exc_page_fault+0x7c/0x110
[ +0.000015] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000014] RIP: 0033:0x7ffff7b14f67
[ +0.000013] Code: ff e8 0d 16 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff
[ +0.000026] RSP: 002b:00007fffffffe378 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ +0.000019] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7b14f67
[ +0.000014] RDX: 0000000000000000 RSI: 00007ffff7f6f47a RDI: 0000000000000003
[ +0.000014] RBP: 00007fffffffe3a0 R08: 0000555555569890 R09: 0000000000000000
[ +0.000014] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5c8
[ +0.000013] R13: 00005555555552a9 R14: 0000555555557d48 R15: 00007ffff7ffd040
[ +0.000020] </TASK>
[ +0.000016] Allocated by task 383 on cpu 7 at 26.880319s:
[ +0.000014] kasan_save_stack+0x28/0x60
[ +0.000008] kasan_save_track+0x18/0x70
[ +0.000007] kasan_save_alloc_info+0x38/0x60
[ +0.000007] __kasan_kmalloc+0xc1/0xd0
[ +0.000007] kmalloc_trace_noprof+0x180/0x380
[ +0.000007] drm_sched_init+0x411/0xec0 [gpu_sched]
[ +0.000012] amdgpu_device_init+0x695f/0xa610 [amdgpu]
[ +0.000658] amdgpu_driver_load_kms+0x1a/0x120 [amdgpu]
[ +0.000662] amdgpu_pci_probe+0x361/0xf30 [amdgpu]
[ +0.000651] local_pci_probe+0xe7/0x1b0
[ +0.000009] pci_device_probe+0x248/0x890
[ +0.000008] really_probe+0x1fd/0x950
[ +0.000008] __driver_probe_device+0x307/0x410
[ +0.000007] driver_probe_device+0x4e/0x150
[ +0.000007] __driver_attach+0x223/0x510
[ +0.000006] bus_for_each_dev+0x102/0x1a0
[ +0.000007] driver_attach+0x3d/0x60
[ +0.000006] bus_add_driver+0x2ac/0x5f0
[ +0.000006] driver_register+0x13d/0x490
[ +0.000008] __pci_register_driver+0x1ee/0x2b0
[ +0.000007] llc_sap_close+0xb0/0x160 [llc]
[ +0.000009] do_one_initcall+0x9c/0x3e0
[ +0.000008] do_init_module+0x241/0x760
[ +0.000008] load_module+0x51ac/0x6c30
[ +0.000006] __do_sys_init_module+0x234/0x270
[ +0.000007] __x64_sys_init_module+0x73/0xc0
[ +0.000006] x64_sys_call+0xe3/0x2680
[ +0.000006] do_syscall_64+0x70/0x130
[ +0.000007] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000015] Freed by task 2147 on cpu 6 at 160.507651s:
[ +0.000013] kasan_save_stack+0x28/0x60
[ +0.000007] kasan_save_track+0x18/0x70
[ +0.000007] kasan_save_free_info+0x3b/0x60
[ +0.000007] poison_slab_object+0x115/0x1c0
[ +0.000007] __kasan_slab_free+0x34/0x60
[ +0.000007] kfree+0xfa/0x2f0
[ +0.000007] drm_sched_fini+0x19d/0x410 [gpu_sched]
[ +0.000012] amdgpu_fence_driver_sw_fini+0xc4/0x2f0 [amdgpu]
[ +0.000662] amdgpu_device_fini_sw+0x77/0xfc0 [amdgpu]
[ +0.000653] amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
[ +0.000655] drm_minor_release+0xc9/0x140 [drm]
[ +0.000071] drm_release+0x1fd/0x390 [drm]
[ +0.000071] __fput+0x36c/0xad0
[ +0.000008] __fput_sync+0x3c/0x50
[ +0.000007] __x64_sys_close+0x7d/0xe0
[ +0.000007] x64_sys_call+0x1bc6/0x2680
[ +0.000007] do_syscall_64+0x70/0x130
[ +0.000007] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000014] The buggy address belongs to the object at ffff8881b8605f80
which belongs to the cache kmalloc-64 of size 64
[ +0.000020] The buggy address is located 8 bytes inside of
freed 64-byte region [ffff8881b8605f80, ffff8881b8605fc0)
[ +0.000028] The buggy address belongs to the physical page:
[ +0.000011] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1b8605
[ +0.000008] anon flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ +0.000007] page_type: 0xffffefff(slab)
[ +0.000009] raw: 0017ffffc0000000 ffff8881000428c0 0000000000000000 dead000000000001
[ +0.000006] raw: 0000000000000000 0000000000200020 00000001ffffefff 0000000000000000
[ +0.000006] page dumped because: kasan: bad access detected
[ +0.000012] Memory state around the buggy address:
[ +0.000011] ffff8881b8605e80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ +0.000015] ffff8881b8605f00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[ +0.000015] >ffff8881b8605f80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ +0.000013] ^
[ +0.000011] ffff8881b8606000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
[ +0.000014] ffff8881b8606080: fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb
[ +0.000013] ==================================================================
The issue reproduced on VG20 during the IGT pci_unplug test.
The root cause of the issue is that the function drm_sched_fini is called before drm_sched_entity_kill.
In drm_sched_fini, the drm_sched_rq structure is freed, but this structure is later accessed by
each entity within the run queue, leading to invalid memory access.
To resolve this, the order of cleanup calls is updated:
Before:
amdgpu_fence_driver_sw_fini
amdgpu_device_ip_fini
After:
amdgpu_device_ip_fini
amdgpu_fence_driver_sw_fini
This updated order ensures that all entities in the IPs are cleaned up first, followed by proper
cleanup of the schedulers.
Additional Investigation:
During debugging, another issue was identified in the amdgpu_vce_sw_fini function. The vce.vcpu_bo
buffer must be freed only as the final step in the cleanup process to prevent any premature
access during earlier cleanup stages.
v2: Using Christian suggestion call drm_sched_entity_destroy before drm_sched_fini.
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alva Lan <alvalan9@foxmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 63de35a8fcfca59ae8750d469a7eb220c7557baf upstream.
An issue was identified in the dcn21_link_encoder_create function where
an out-of-bounds access could occur when the hpd_source index was used
to reference the link_enc_hpd_regs array. This array has a fixed size
and the index was not being checked against the array's bounds before
accessing it.
This fix adds a conditional check to ensure that the hpd_source index is
within the valid range of the link_enc_hpd_regs array. If the index is
out of bounds, the function now returns NULL to prevent undefined
behavior.
References:
[ 65.920507] ------------[ cut here ]------------
[ 65.920510] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn21/dcn21_resource.c:1312:29
[ 65.920519] index 7 is out of range for type 'dcn10_link_enc_hpd_registers [5]'
[ 65.920523] CPU: 3 PID: 1178 Comm: modprobe Tainted: G OE 6.8.0-cleanershaderfeatureresetasdntipmi200nv2132 #13
[ 65.920525] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS WMJ0429N_Weekly_20_04_2 04/29/2020
[ 65.920527] Call Trace:
[ 65.920529] <TASK>
[ 65.920532] dump_stack_lvl+0x48/0x70
[ 65.920541] dump_stack+0x10/0x20
[ 65.920543] __ubsan_handle_out_of_bounds+0xa2/0xe0
[ 65.920549] dcn21_link_encoder_create+0xd9/0x140 [amdgpu]
[ 65.921009] link_create+0x6d3/0xed0 [amdgpu]
[ 65.921355] create_links+0x18a/0x4e0 [amdgpu]
[ 65.921679] dc_create+0x360/0x720 [amdgpu]
[ 65.921999] ? dmi_matches+0xa0/0x220
[ 65.922004] amdgpu_dm_init+0x2b6/0x2c90 [amdgpu]
[ 65.922342] ? console_unlock+0x77/0x120
[ 65.922348] ? dev_printk_emit+0x86/0xb0
[ 65.922354] dm_hw_init+0x15/0x40 [amdgpu]
[ 65.922686] amdgpu_device_init+0x26a8/0x33a0 [amdgpu]
[ 65.922921] amdgpu_driver_load_kms+0x1b/0xa0 [amdgpu]
[ 65.923087] amdgpu_pci_probe+0x1b7/0x630 [amdgpu]
[ 65.923087] local_pci_probe+0x4b/0xb0
[ 65.923087] pci_device_probe+0xc8/0x280
[ 65.923087] really_probe+0x187/0x300
[ 65.923087] __driver_probe_device+0x85/0x130
[ 65.923087] driver_probe_device+0x24/0x110
[ 65.923087] __driver_attach+0xac/0x1d0
[ 65.923087] ? __pfx___driver_attach+0x10/0x10
[ 65.923087] bus_for_each_dev+0x7d/0xd0
[ 65.923087] driver_attach+0x1e/0x30
[ 65.923087] bus_add_driver+0xf2/0x200
[ 65.923087] driver_register+0x64/0x130
[ 65.923087] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[ 65.923087] __pci_register_driver+0x61/0x70
[ 65.923087] amdgpu_init+0x7d/0xff0 [amdgpu]
[ 65.923087] do_one_initcall+0x49/0x310
[ 65.923087] ? kmalloc_trace+0x136/0x360
[ 65.923087] do_init_module+0x6a/0x270
[ 65.923087] load_module+0x1fce/0x23a0
[ 65.923087] init_module_from_file+0x9c/0xe0
[ 65.923087] ? init_module_from_file+0x9c/0xe0
[ 65.923087] idempotent_init_module+0x179/0x230
[ 65.923087] __x64_sys_finit_module+0x5d/0xa0
[ 65.923087] do_syscall_64+0x76/0x120
[ 65.923087] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 65.923087] RIP: 0033:0x7f2d80f1e88d
[ 65.923087] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
[ 65.923087] RSP: 002b:00007ffc7bc1aa78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 65.923087] RAX: ffffffffffffffda RBX: 0000564c9c1db130 RCX: 00007f2d80f1e88d
[ 65.923087] RDX: 0000000000000000 RSI: 0000564c9c1e5480 RDI: 000000000000000f
[ 65.923087] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000002
[ 65.923087] R10: 000000000000000f R11: 0000000000000246 R12: 0000564c9c1e5480
[ 65.923087] R13: 0000564c9c1db260 R14: 0000000000000000 R15: 0000564c9c1e54b0
[ 65.923087] </TASK>
[ 65.923927] ---[ end trace ]---
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bin Lan <lanbincn@qq.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 38724591364e1e3b278b4053f102b49ea06ee17c upstream.
The 'data' local struct is used to push data to user space from a
triggered buffer, but it does not set values for inactive channels, as
it only uses iio_for_each_active_channel() to assign new values.
Initialize the struct to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable@vger.kernel.org
Fixes: 4e130dc7b413 ("iio: adc: rockchip_saradc: Add support iio buffers")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Bin Lan <lanbincn@qq.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 65a60a590142c54a3f3be11ff162db2d5b0e1e06 upstream.
Currently suspending while sensors are one will result in timestamping
continuing without gap at resume. It can work with monotonic clock but
not with other clocks. Fix that by resetting timestamping.
Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping")
Cc: stable@vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend-v1-1-dfc77c394173@tdk.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c0f866de4ce447bca3191b9cefac60c4b36a7922 upstream.
Burst write with SPI is not working for all icm42600 chips. It was
only used for setting user offsets with regmap_bulk_write.
Add specific SPI regmap config for using only single write with SPI.
Fixes: 9f9ff91b775b ("iio: imu: inv_icm42600: add SPI driver for inv_icm42600 driver")
Cc: stable@vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Link: https://patch.msgid.link/20241112-inv-icm42600-fix-spi-burst-write-not-supported-v2-1-97690dc03607@tdk.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit f858b0fab28d8bc2d0f0e8cd4afc3216f347cfcc which is
commit 7246a4520b4bf1494d7d030166a11b5226f6d508 upstream.
This patch causes a regression in cuttlefish/crossvm boot on arm64.
The patch was part of a series that when applied will not cause a regression
but this patch was backported to the 6.1 branch by itself.
The other patches do not apply cleanly to the 6.1 branch.
Signed-off-by: Terry Tritton <terry.tritton@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1a5401ec3018c101c456cdbda2eaef9482db6786 upstream.
Mesa changed its clear color alignment from 4k to 64 bytes
without informing the kernel side about the change. This
is now likely to cause framebuffer creation to fail.
The only thing we do with the clear color buffer in i915 is:
1. map a single page
2. read out bytes 16-23 from said page
3. unmap the page
So the only requirement we really have is that those 8 bytes
are all contained within one page. Thus we can deal with the
Mesa regression by reducing the alignment requiment from 4k
to the same 64 bytes in the kernel. We could even go as low as
32 bytes, but IIRC 64 bytes is the hardware requirement on
the 3D engine side so matching that seems sensible.
Note that the Mesa alignment chages were partially undone
so the regression itself was already fixed on userspace
side.
Cc: stable@vger.kernel.org
Cc: Sagar Ghuge <sagar.ghuge@intel.com>
Cc: Nanley Chery <nanley.g.chery@intel.com>
Reported-by: Xi Ruoyao <xry111@xry111.site>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13057
Closes: https://lore.kernel.org/all/45a5bba8de009347262d86a4acb27169d9ae0d9f.camel@xry111.site/
Link: https://gitlab.freedesktop.org/mesa/mesa/-/commit/17f97a69c13832a6c1b0b3aad45b06f07d4b852f
Link: https://gitlab.freedesktop.org/mesa/mesa/-/commit/888f63cf1baf34bc95e847a30a041dc7798edddb
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241129065014.8363-2-ville.syrjala@linux.intel.com
Tested-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit ed3a892e5e3d6b3f6eeb76db7c92a968aeb52f3d)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 35cb2c6ce7da545f3b5cb1e6473ad7c3a6f08310 upstream.
The following call-chain leads to enabling interrupts in a nested interrupt
disabled section:
irq_set_vcpu_affinity()
irq_get_desc_lock()
raw_spin_lock_irqsave() <--- Disable interrupts
its_irq_set_vcpu_affinity()
guard(raw_spinlock_irq) <--- Enables interrupts when leaving the guard()
irq_put_desc_unlock() <--- Warns because interrupts are enabled
This was broken in commit b97e8a2f7130, which replaced the original
raw_spin_[un]lock() pair with guard(raw_spinlock_irq).
Fix the issue by using guard(raw_spinlock).
[ tglx: Massaged change log ]
Fixes: b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()")
Signed-off-by: Tomas Krcka <krckatom@amazon.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241230150825.62894-1-krckatom@amazon.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0d62a49ab55c99e8deb4593b8d9f923de1ab5c18 upstream.
When a CPU attempts to enter low power mode, it disables the redistributor
and Group 1 interrupts and reinitializes the system registers upon wakeup.
If the transition into low power mode fails, then the CPU_PM framework
invokes the PM notifier callback with CPU_PM_ENTER_FAILED to allow the
drivers to undo the state changes.
The GIC V3 driver ignores CPU_PM_ENTER_FAILED, which leaves the GIC in
disabled state.
Handle CPU_PM_ENTER_FAILED in the same way as CPU_PM_EXIT to restore normal
operation.
[ tglx: Massage change log, add Fixes tag ]
Fixes: 3708d52fc6bb ("irqchip: gic-v3: Implement CPU PM notifier")
Signed-off-by: Yogesh Lal <quic_ylal@quicinc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241220093907.2747601-1-quic_ylal@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9322d1915f9d976ee48c09d800fbd5169bc2ddcc upstream.
platform_irqchip_probe() leaks a OF node when irq_init_cb() fails. Fix it
by declaring par_np with the __free(device_node) cleanup construct.
This bug was found by an experimental static analysis tool that I am
developing.
Fixes: f8410e626569 ("irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241215033945.3414223-1-joe@pf.is.s.u-tokyo.ac.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 726efa92e02b460811e8bc6990dd742f03b645ea upstream.
Currently imx8mp_blk_ctrl_remove() will continue the for loop
until an out-of-bounds exception occurs.
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : dev_pm_domain_detach+0x8/0x48
lr : imx8mp_blk_ctrl_shutdown+0x58/0x90
sp : ffffffc084f8bbf0
x29: ffffffc084f8bbf0 x28: ffffff80daf32ac0 x27: 0000000000000000
x26: ffffffc081658d78 x25: 0000000000000001 x24: ffffffc08201b028
x23: ffffff80d0db9490 x22: ffffffc082340a78 x21: 00000000000005b0
x20: ffffff80d19bc180 x19: 000000000000000a x18: ffffffffffffffff
x17: ffffffc080a39e08 x16: ffffffc080a39c98 x15: 4f435f464f006c72
x14: 0000000000000004 x13: ffffff80d0172110 x12: 0000000000000000
x11: ffffff80d0537740 x10: ffffff80d05376c0 x9 : ffffffc0808ed2d8
x8 : ffffffc084f8bab0 x7 : 0000000000000000 x6 : 0000000000000000
x5 : ffffff80d19b9420 x4 : fffffffe03466e60 x3 : 0000000080800077
x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000000000
Call trace:
dev_pm_domain_detach+0x8/0x48
platform_shutdown+0x2c/0x48
device_shutdown+0x158/0x268
kernel_restart_prepare+0x40/0x58
kernel_kexec+0x58/0xe8
__do_sys_reboot+0x198/0x258
__arm64_sys_reboot+0x2c/0x40
invoke_syscall+0x5c/0x138
el0_svc_common.constprop.0+0x48/0xf0
do_el0_svc+0x24/0x38
el0_svc+0x38/0xc8
el0t_64_sync_handler+0x120/0x130
el0t_64_sync+0x190/0x198
Code: 8128c2d0 ffffffc0 aa1e03e9 d503201f
Fixes: 556f5cf9568a ("soc: imx: add i.MX8MP HSIO blk-ctrl")
Cc: stable@vger.kernel.org
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250115014118.4086729-1-xiaolei.wang@windriver.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 02f6b0e1ec7e0e7d059dddc893645816552039da upstream.
The use-after-free issue occurs as follows: when the GPIO chip device file
is being closed by invoking gpio_chrdev_release(), watched_lines is freed
by bitmap_free(), but the unregistration of lineinfo_changed_nb notifier
chain failed due to waiting write rwsem. Additionally, one of the GPIO
chip's lines is also in the release process and holds the notifier chain's
read rwsem. Consequently, a race condition leads to the use-after-free of
watched_lines.
Here is the typical stack when issue happened:
[free]
gpio_chrdev_release()
--> bitmap_free(cdev->watched_lines) <-- freed
--> blocking_notifier_chain_unregister()
--> down_write(&nh->rwsem) <-- waiting rwsem
--> __down_write_common()
--> rwsem_down_write_slowpath()
--> schedule_preempt_disabled()
--> schedule()
[use]
st54spi_gpio_dev_release()
--> gpio_free()
--> gpiod_free()
--> gpiod_free_commit()
--> gpiod_line_state_notify()
--> blocking_notifier_call_chain()
--> down_read(&nh->rwsem); <-- held rwsem
--> notifier_call_chain()
--> lineinfo_changed_notify()
--> test_bit(xxxx, cdev->watched_lines) <-- use after free
The side effect of the use-after-free issue is that a GPIO line event is
being generated for userspace where it shouldn't. However, since the chrdev
is being closed, userspace won't have the chance to read that event anyway.
To fix the issue, call the bitmap_free() function after the unregistration
of lineinfo_changed_nb notifier chain.
Fixes: 51c1064e82e7 ("gpiolib: add new ioctl() for monitoring changes in line info")
Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
Link: https://lore.kernel.org/r/20240505141156.2944912-1-quic_zhonhan@quicinc.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Bruno VERNAY <bruno.vernay@se.com>
Signed-off-by: Hugo SIMELIERE <hsimeliere.opensource@witekio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6be7aca91009865d8c2b73589270224a6b6e67ab upstream.
In 4.19, before the switch to linkmode bitmaps, PHY_GBIT_FEATURES
included feature bits for aneg and TP/MII ports.
SUPPORTED_TP | \
SUPPORTED_MII)
SUPPORTED_10baseT_Full)
SUPPORTED_100baseT_Full)
SUPPORTED_1000baseT_Full)
PHY_100BT_FEATURES | \
PHY_DEFAULT_FEATURES)
PHY_1000BT_FEATURES)
Referenced commit expanded PHY_GBIT_FEATURES, silently removing
PHY_DEFAULT_FEATURES. The removed part can be re-added by using
the new PHY_GBIT_FEATURES definition.
Not clear to me is why nobody seems to have noticed this issue.
I stumbled across this when checking what it takes to make
phy_10_100_features_array et al private to phylib.
Fixes: d0939c26c53a ("net: ethernet: xgbe: expand PHY_GBIT_FEAUTRES")
Cc: stable@vger.kernel.org
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/46521973-7738-4157-9f5e-0bb6f694acba@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 212fe1c0df4a150fb6298db2cfff267ceaba5402 upstream.
If zram_meta_alloc failed early, it frees allocated zram->table without
setting it NULL. Which will potentially cause zram_meta_free to access
the table if user reset an failed and uninitialized device.
Link: https://lkml.kernel.org/r/20250107065446.86928-1-ryncsn@gmail.com
Fixes: 74363ec674cb ("zram: fix uninitialized ZRAM not releasing backing device")
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit cd4a7b2e6a2437a5502910c08128ea3bad55a80b ]
acpi_dev_irq_override() gets called approx. 30 times during boot (15 legacy
IRQs * 2 override_table entries). Of these 30 calls at max 1 will match
the non DMI checks done by acpi_dev_irq_override(). The dmi_check_system()
check is by far the most expensive check done by acpi_dev_irq_override(),
make this call the last check done by acpi_dev_irq_override() so that it
will be called at max 1 time instead of 30 times.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20241228165253.42584-1-hdegoede@redhat.com
[ rjw: Subject edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
drivers
[ Upstream commit bb9850704c043e48c86cc9df90ee102e8a338229 ]
Otherwise, the default levels will override the levels set by the host
controller drivers.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-2-63c4b95a70b9@linaro.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b579d6fdc3a9149bb4d2b3133cc0767130ed13e6 ]
Ensure we propagate npwg to the target as well instead
of assuming its the same logical blocks per physical block.
This ensures devices with large IUs information properly
propagated on the target.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 093f70c134f70e4632b295240f07d2b50b74e247 ]
When this controller is a target, the NACK handling had two issues.
First, the return value from the backend was not checked on the initial
WRITE_REQUESTED. So, the driver missed to send a NACK in this case.
Also, the NACK always arrives one byte late on the bus, even in the
WRITE_RECEIVED case. This seems to be a HW issue. We should then not
rely on the backend to correctly NACK the superfluous byte as well. Fix
both issues by introducing a flag which gets set whenever the backend
requests a NACK and keep sending it until we get a STOP condition.
Fixes: de20d1857dd6 ("i2c: rcar: add slave support")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ca89f73394daf92779ddaa37b42956f4953f3941 ]
When misconfigured, the initial setup of the current mux channel can
fail, too. It must be checked as well.
Fixes: 50a5ba876908 ("i2c: mux: demux-pinctrl: add driver")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d15638bf76ad47874ecb5dc386f0945fc0b2a875 ]
This reverts commit 98d1fb94ce75f39febd456d6d3cbbe58b6678795.
The commit uses data nbits instead of addr nbits for dummy phase. This
causes a regression for all boards where spi-tx-bus-width is smaller
than spi-rx-bus-width. It is a common pattern for boards to have
spi-tx-bus-width == 1 and spi-rx-bus-width > 1. The regression causes
all reads with a dummy phase to become unavailable for such boards,
leading to a usually slower 0-dummy-cycle read being selected.
Most controllers' supports_op hooks call spi_mem_default_supports_op().
In spi_mem_default_supports_op(), spi_mem_check_buswidth() is called to
check if the buswidths for the op can actually be supported by the
board's wiring. This wiring information comes from (among other things)
the spi-{tx,rx}-bus-width DT properties. Based on these properties,
SPI_TX_* or SPI_RX_* flags are set by of_spi_parse_dt().
spi_mem_check_buswidth() then uses these flags to make the decision
whether an op can be supported by the board's wiring (in a way,
indirectly checking against spi-{rx,tx}-bus-width).
Now the tricky bit here is that spi_mem_check_buswidth() does:
if (op->dummy.nbytes &&
spi_check_buswidth_req(mem, op->dummy.buswidth, true))
return false;
The true argument to spi_check_buswidth_req() means the op is treated as
a TX op. For a board that has say 1-bit TX and 4-bit RX, a 4-bit dummy
TX is considered as unsupported, and the op gets rejected.
The commit being reverted uses the data buswidth for dummy buswidth. So
for reads, the RX buswidth gets used for the dummy phase, uncovering
this issue. In reality, a dummy phase is neither RX nor TX. As the name
suggests, these are just dummy cycles that send or receive no data, and
thus don't really need to have any buswidth at all.
Ideally, dummy phases should not be checked against the board's wiring
capabilities at all, and should only be sanity-checked for having a sane
buswidth value. Since we are now at rc7 and such a change might
introduce many unexpected bugs, revert the commit for now. It can be
sent out later along with the spi_mem_check_buswidth() fix.
Fixes: 98d1fb94ce75 ("mtd: spi-nor: core: replace dummy buswidth from addr to data")
Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Closes: https://lore.kernel.org/linux-mtd/3342163.44csPzL39Z@steina-w/
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e2c68cea431d65292b592c9f8446c918d45fcf78 ]
Fix several issues with division of negative numbers in the tmp513
driver.
The docs on the DIV_ROUND_CLOSEST macro explain that dividing a negative
value by an unsigned type is undefined behavior. The driver was doing
this in several places, i.e. data->shunt_uohms has type of u32. The
actual "undefined" behavior is that it converts both values to unsigned
before doing the division, for example:
int ret = DIV_ROUND_CLOSEST(-100, 3U);
results in ret == 1431655732 instead of -33.
Furthermore the MILLI macro has a type of unsigned long. Multiplying a
signed long by an unsigned long results in an unsigned long.
So, we need to cast both MILLI and data data->shunt_uohms to long when
using the DIV_ROUND_CLOSEST macro.
Fixes: f07f9d2467f4 ("hwmon: (tmp513) Use SI constants from units.h")
Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://lore.kernel.org/r/20250114-fix-si-prefix-macro-sign-bugs-v1-1-696fd8d10f00@baylibre.com
[groeck: Drop some continuation lines]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e4b5ccd392b92300a2b341705cc4805681094e49 ]
After a job completes, the corresponding pointer in the device must
be set to NULL. Failing to do so triggers a warning when unloading
the driver, as it appears the job is still active. To prevent this,
assign the job pointer to NULL after completing the job, indicating
the job has finished.
Fixes: 14d1d1908696 ("drm/v3d: Remove the bad signaled() implementation.")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250113154741.67520-1-mcanal@igalia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5641e82cb55b4ecbc6366a499300917d2f3e6790 ]
Clear the port select structure on error so no stale values left after
definers are destroyed. That's because the mlx5_lag_destroy_definers()
always try to destroy all lag definers in the tt_map, so in the flow
below lag definers get double-destroyed and cause kernel crash:
mlx5_lag_port_sel_create()
mlx5_lag_create_definers()
mlx5_lag_create_definer() <- Failed on tt 1
mlx5_lag_destroy_definers() <- definers[tt=0] gets destroyed
mlx5_lag_port_sel_create()
mlx5_lag_create_definers()
mlx5_lag_create_definer() <- Failed on tt 0
mlx5_lag_destroy_definers() <- definers[tt=0] gets double-destroyed
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 64k pages, 48-bit VAs, pgdp=0000000112ce2e00
[0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Modules linked in: iptable_raw bonding ip_gre ip6_gre gre ip6_tunnel tunnel6 geneve ip6_udp_tunnel udp_tunnel ipip tunnel4 ip_tunnel rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_fwctl(OE) fwctl(OE) mlx5_core(OE) mlxdevm(OE) ib_core(OE) mlxfw(OE) memtrack(OE) mlx_compat(OE) openvswitch nsh nf_conncount psample xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc netconsole overlay efi_pstore sch_fq_codel zram ip_tables crct10dif_ce qemu_fw_cfg fuse ipv6 crc_ccitt [last unloaded: mlx_compat(OE)]
CPU: 3 UID: 0 PID: 217 Comm: kworker/u53:2 Tainted: G OE 6.11.0+ #2
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
lr : mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
sp : ffff800085fafb00
x29: ffff800085fafb00 x28: ffff0000da0c8000 x27: 0000000000000000
x26: ffff0000da0c8000 x25: ffff0000da0c8000 x24: ffff0000da0c8000
x23: ffff0000c31f81a0 x22: 0400000000000000 x21: ffff0000da0c8000
x20: 0000000000000000 x19: 0000000000000001 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff8b0c9350
x14: 0000000000000000 x13: ffff800081390d18 x12: ffff800081dc3cc0
x11: 0000000000000001 x10: 0000000000000b10 x9 : ffff80007ab7304c
x8 : ffff0000d00711f0 x7 : 0000000000000004 x6 : 0000000000000190
x5 : ffff00027edb3010 x4 : 0000000000000000 x3 : 0000000000000000
x2 : ffff0000d39b8000 x1 : ffff0000d39b8000 x0 : 0400000000000000
Call trace:
mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
mlx5_lag_destroy_definers+0xa0/0x108 [mlx5_core]
mlx5_lag_port_sel_create+0x2d4/0x6f8 [mlx5_core]
mlx5_activate_lag+0x60c/0x6f8 [mlx5_core]
mlx5_do_bond_work+0x284/0x5c8 [mlx5_core]
process_one_work+0x170/0x3e0
worker_thread+0x2d8/0x3e0
kthread+0x11c/0x128
ret_from_fork+0x10/0x20
Code: a9025bf5 aa0003f6 a90363f7 f90023f9 (f9400400)
---[ end trace 0000000000000000 ]---
Fixes: dc48516ec7d3 ("net/mlx5: Lag, add support to create definers for LAG")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c08d3e62b2e73e14da318a1d20b52d0486a28ee0 ]
User added steering rules at RDMA_TX were being added to the first prio,
which is the counters prio.
Fix that so that they are correctly added to the BYPASS_PRIO instead.
Fixes: 24670b1a3166 ("net/mlx5: Add support for RDMA TX steering")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c17ff476f53afb30f90bb3c2af77de069c81a622 ]
If coalesce_count is greater than 255 it will not fit in the register and
will overflow. This can be reproduced by running
# ethtool -C ethX rx-frames 256
which will result in a timeout of 0us instead. Fix this by checking for
invalid values and reporting an error.
Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://patch.msgid.link/20250113163001.2335235-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 16ebb6f5b6295c9688749862a39a4889c56227f8 ]
The "sizeof(struct cmsg_bpf_event) + pkt_size + data_size" math could
potentially have an integer wrapping bug on 32bit systems. Check for
this and return an error.
Fixes: 9816dd35ecec ("nfp: bpf: perf event output helpers support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/6074805b-e78d-4b8a-bf05-e929b5377c28@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit eb28fd76c0a08a47b470677c6cef9dd1c60e92d1 ]
gtp_newlink() links the device to a list in dev_net(dev) instead of
src_net, where a udp tunnel socket is created.
Even when src_net is removed, the device stays alive on dev_net(dev).
Then, removing src_net triggers the splat below. [0]
In this example, gtp0 is created in ns2, and the udp socket is created
in ns1.
ip netns add ns1
ip netns add ns2
ip -n ns1 link add netns ns2 name gtp0 type gtp role sgsn
ip netns del ns1
Let's link the device to the socket's netns instead.
Now, gtp_net_exit_batch_rtnl() needs another netdev iteration to remove
all gtp devices in the netns.
[0]:
ref_tracker: net notrefcnt@000000003d6e7d05 has 1/2 users at
sk_alloc (./include/net/net_namespace.h:345 net/core/sock.c:2236)
inet_create (net/ipv4/af_inet.c:326 net/ipv4/af_inet.c:252)
__sock_create (net/socket.c:1558)
udp_sock_create4 (net/ipv4/udp_tunnel_core.c:18)
gtp_create_sock (./include/net/udp_tunnel.h:59 drivers/net/gtp.c:1423)
gtp_create_sockets (drivers/net/gtp.c:1447)
gtp_newlink (drivers/net/gtp.c:1507)
rtnl_newlink (net/core/rtnetlink.c:3786 net/core/rtnetlink.c:3897 net/core/rtnetlink.c:4012)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6922)
netlink_rcv_skb (net/netlink/af_netlink.c:2542)
netlink_unicast (net/netlink/af_netlink.c:1321 net/netlink/af_netlink.c:1347)
netlink_sendmsg (net/netlink/af_netlink.c:1891)
____sys_sendmsg (net/socket.c:711 net/socket.c:726 net/socket.c:2583)
___sys_sendmsg (net/socket.c:2639)
__sys_sendmsg (net/socket.c:2669)
do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
WARNING: CPU: 1 PID: 60 at lib/ref_tracker.c:179 ref_tracker_dir_exit (lib/ref_tracker.c:179)
Modules linked in:
CPU: 1 UID: 0 PID: 60 Comm: kworker/u16:2 Not tainted 6.13.0-rc5-00147-g4c1224501e9d #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: netns cleanup_net
RIP: 0010:ref_tracker_dir_exit (lib/ref_tracker.c:179)
Code: 00 00 00 fc ff df 4d 8b 26 49 bd 00 01 00 00 00 00 ad de 4c 39 f5 0f 85 df 00 00 00 48 8b 74 24 08 48 89 df e8 a5 cc 12 02 90 <0f> 0b 90 48 8d 6b 44 be 04 00 00 00 48 89 ef e8 80 de 67 ff 48 89
RSP: 0018:ff11000009a07b60 EFLAGS: 00010286
RAX: 0000000000002bd3 RBX: ff1100000f4e1aa0 RCX: 1ffffffff0e40ac6
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8423ee3c
RBP: ff1100000f4e1af0 R08: 0000000000000001 R09: fffffbfff0e395ae
R10: 0000000000000001 R11: 0000000000036001 R12: ff1100000f4e1af0
R13: dead000000000100 R14: ff1100000f4e1af0 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ff1100006ce80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9b2464bd98 CR3: 0000000005286005 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
? __warn (kernel/panic.c:748)
? ref_tracker_dir_exit (lib/ref_tracker.c:179)
? report_bug (lib/bug.c:201 lib/bug.c:219)
? handle_bug (arch/x86/kernel/traps.c:285)
? exc_invalid_op (arch/x86/kernel/traps.c:309 (discriminator 1))
? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:97 ./arch/x86/include/asm/irqflags.h:155 ./include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
? ref_tracker_dir_exit (lib/ref_tracker.c:179)
? __pfx_ref_tracker_dir_exit (lib/ref_tracker.c:158)
? kfree (mm/slub.c:4613 mm/slub.c:4761)
net_free (net/core/net_namespace.c:476 net/core/net_namespace.c:467)
cleanup_net (net/core/net_namespace.c:664 (discriminator 3))
process_one_work (kernel/workqueue.c:3229)
worker_thread (kernel/workqueue.c:3304 kernel/workqueue.c:3391)
kthread (kernel/kthread.c:389)
ret_from_fork (arch/x86/kernel/process.c:147)
ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
</TASK>
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Reported-by: Xiao Liang <shaw.leon@gmail.com>
Closes: https://lore.kernel.org/netdev/20250104125732.17335-1-shaw.leon@gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 46841c7053e6d25fb33e0534ef023833bf03e382 ]
gtp_newlink() links the gtp device to a list in dev_net(dev).
However, even after the gtp device is moved to another netns,
it stays on the list but should be invisible.
Let's use for_each_netdev_rcu() for netdev traversal in
gtp_genl_dump_pdp().
Note that gtp_dev_list is no longer used under RCU, so list
helpers are converted to the non-RCU variant.
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Reported-by: Xiao Liang <shaw.leon@gmail.com>
Closes: https://lore.kernel.org/netdev/CABAhCOQdBL6h9M2C+kd+bGivRJ9Q72JUxW+-gur0nub_=PmFPA@mail.gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6eedda01b2bfdcf427b37759e053dc27232f3af1 ]
exit_batch_rtnl() is called while RTNL is held,
and devices to be unregistered can be queued in the dev_kill_list.
This saves one rtnl_lock()/rtnl_unlock() pair per netns
and one unregister_netdevice_many() call per netns.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20240206144313.2050392-8-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 46841c7053e6 ("gtp: Use for_each_netdev_rcu() in gtp_genl_dump_pdp().")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 03d120f27d050336f7e7d21879891542c4741f81 ]
CPSW ALE has 75-bit ALE entries stored across three 32-bit words.
The cpsw_ale_get_field() and cpsw_ale_set_field() functions support
ALE field entries spanning up to two words at the most.
The cpsw_ale_get_field() and cpsw_ale_set_field() functions work as
expected when ALE field spanned across word1 and word2, but fails when
ALE field spanned across word2 and word3.
For example, while reading the ALE field spanned across word2 and word3
(i.e. bits 62 to 64), the word3 data shifted to an incorrect position
due to the index becoming zero while flipping.
The same issue occurred when setting an ALE entry.
This issue has not been seen in practice but will be an issue in the future
if the driver supports accessing ALE fields spanning word2 and word3
Fix the methods to handle getting/setting fields spanning up to two words.
Fixes: b685f1a58956 ("net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field()")
Signed-off-by: Sudheer Kumar Doredla <s-doredla@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/20250108172433.311694-1-s-doredla@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 9734fd7a2777 ("xhci: use pm_ptr() instead of #ifdef for CONFIG_PM
conditionals") did not quite work properly in the 6.1.y branch where it was
applied to fix a build error when CONFIG_PM was set as it left the following
build errors still present:
ERROR: modpost: "xhci_suspend" [drivers/usb/host/xhci-pci.ko] undefined!
ERROR: modpost: "xhci_resume" [drivers/usb/host/xhci-pci.ko] undefined!
Fix this up by properly placing the #ifdef CONFIG_PM in the xhci-pci.c and
hcd.h files to handle this correctly.
Link: https://lore.kernel.org/r/133dbfa0-4a37-4ae0-bb95-1a35f668ec11@w6rz.net
Signed-off-by: Ron Economos <re@w6rz.net>
Link: https://lore.kernel.org/r/d0919169-ee06-4bdd-b2e3-2f776db90971@roeck-us.net
Reported-by: Guenter Roeck <linux@roeck-us.net>
[ Trimmed the partial revert down to an even smaller bit to only be what
is required to fix the build error - gregkh]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|