Link: https://lore.kernel.org/r/20260209142320.474120190@linuxfoundation.org
Tested-by: Ronald Warsow <rwarsow@gmx.de>
Tested-by: Brett A C Sheffield <bacs@librecast.net>
Tested-by: Luna Jernberg <droidbittin@gmail.com>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Hardik Garg <hargar@linux.microsoft.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Mark Brown <broonie@kernel.org>
Tested-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Dileep Malepu <dileep.debian@gmail.com>
Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Barry K. Nathan <barryn@pobox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 841e47d56cef9b96fd2314220e3d0f1d92c719f4 upstream.
After commit bdce162f2e57 ("riscv: Use 64-bit variable for output in
__get_user_asm"), there is a warning when building for 32-bit RISC-V:
In file included from include/linux/uaccess.h:13,
from include/linux/sched/task.h:13,
from include/linux/sched/signal.h:9,
from include/linux/rcuwait.h:6,
from include/linux/mm.h:36,
from include/linux/migrate.h:5,
from mm/migrate.c:16:
mm/migrate.c: In function 'do_pages_move':
arch/riscv/include/asm/uaccess.h:115:15: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
115 | (x) = (__typeof__(x))__tmp; \
| ^
arch/riscv/include/asm/uaccess.h:198:17: note: in expansion of macro '__get_user_asm'
198 | __get_user_asm("lb", (x), __gu_ptr, label); \
| ^~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:218:9: note: in expansion of macro '__get_user_nocheck'
218 | __get_user_nocheck(x, ptr, __gu_failed); \
| ^~~~~~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:255:9: note: in expansion of macro '__get_user_error'
255 | __get_user_error(__gu_val, __gu_ptr, __gu_err); \
| ^~~~~~~~~~~~~~~~
arch/riscv/include/asm/uaccess.h:285:17: note: in expansion of macro '__get_user'
285 | __get_user((x), __p) : \
| ^~~~~~~~~~
mm/migrate.c:2358:29: note: in expansion of macro 'get_user'
2358 | if (get_user(p, pages + i))
| ^~~~~~~~
Add an intermediate cast to 'unsigned long', which is guaranteed to be the same
width as a pointer, before the cast to the type of the output variable to clear
up the warning.
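For illustration, a minimal sketch of the change described, applied to the macro line quoted in the warning (the rest of __get_user_asm is unchanged and omitted here):
	(x) = (__typeof__(x))(unsigned long)__tmp;	\
The intermediate (unsigned long) cast matches the pointer width on both 32-bit and 64-bit RISC-V, so the final cast to __typeof__(x) no longer converts an integer of a different size into a pointer.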
Fixes: bdce162f2e57 ("riscv: Use 64-bit variable for output in __get_user_asm")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601210526.OT45dlOZ-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260121-riscv-fix-int-to-pointer-cast-v1-1-b83eebe57c76@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 70b4db7d258118a7464f039112a74ddb49a95b06 upstream.
The recent fix commit for addressing the OOB access of PCM URB data
buffer caused a regression on Behringer UMC2020HD device, resulting in
choppy sound. The fix used ep->max_urb_frames for the upper limit
check, but this is not the right value to refer to.
Use the actual buffer size (ctx->buffer_size) as the upper limit
instead, which also avoids the regression on the device above.
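As a hedged illustration of the new bound, a minimal sketch; apart from ctx->buffer_size, the variable name used here is an assumption, not the driver's:
	/* never prepare more data than the URB context buffer can hold */
	if (bytes > ctx->buffer_size)
		bytes = ctx->buffer_size;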
Fixes: ef5749ef8b30 ("ALSA: usb-audio: Prevent excessive number of frames")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220997
Link: https://patch.msgid.link/20260121082025.718748-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1aaedafb21f38cb872d44f7608b4828a1e14e795 upstream.
Add a PCI quirk to enable microphone detection on the headphone jack of
TongFang X6AR55xU devices.
The former quirk entry did not accomplish this and is removed.
Fixes: b48fe9af1e60 ("ALSA: hda/realtek: Fix headset mic for TongFang X6AR55xU")
Signed-off-by: Tim Guttzeit <t.guttzeit@tuxedocomputers.com>
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Link: https://patch.msgid.link/20260123221233.28273-1-wse@tuxedocomputers.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a0a75b40c919b9f6d3a0b6c978e6ccf344c1be5a ]
The COMMAND1 register bits [29:28] set the SPI mode, which controls
the clock idle level. When a transfer ends, tegra_spi_transfer_end()
writes def_command1_reg back to restore the default state, but this
register value currently lacks the mode bits. This results in the
clock always being configured as idle low, breaking devices that
need it high.
Fix this by storing the mode bits in def_command1_reg during setup,
to prevent this field from always being cleared.
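A minimal sketch of the setup-time change described; the mask and mode-bit names are illustrative, only def_command1_reg is taken from the text above:
	/* keep the SPI mode (clock idle level) bits in the default COMMAND1 value */
	tspi->def_command1_reg &= ~SPI_MODE_MASK;
	tspi->def_command1_reg |= mode_bits;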
Fixes: f333a331adfa ("spi/tegra114: add spi driver")
Signed-off-by: Vishwaroop A <va@nvidia.com>
Link: https://patch.msgid.link/20260204141212.1540382-1-va@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 41d9a6795b95d6ea28439ac1e9ce8c95bbca20fc ]
In tegra_slink_probe(), when platform_get_irq() fails, it directly
returns from the function with an error code, which causes a memory leak.
Replace it with a goto label to ensure proper cleanup.
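A minimal sketch of the error path described; the label name is an assumption, not necessarily the one used in the driver:
	ret = platform_get_irq(pdev, 0);
	if (ret < 0)
		goto exit_free_master;	/* was: return ret, leaking the controller */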
Fixes: eb9913b511f1 ("spi: tegra: Fix missing IRQ check in tegra_slink_probe()")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260202-slink-v1-1-eac50433a6f9@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit edf9088b6e1d6d88982db7eb5e736a0e4fbcc09e ]
Now that all other accesses to curr_xfer are done under the lock,
protect the curr_xfer NULL check in tegra_qspi_isr_thread() with the
spinlock. Without this protection, the following race can occur:
CPU0 (ISR thread) CPU1 (timeout path)
---------------- -------------------
if (!tqspi->curr_xfer)
// sees non-NULL
spin_lock()
tqspi->curr_xfer = NULL
spin_unlock()
handle_*_xfer()
spin_lock()
t = tqspi->curr_xfer // NULL!
... t->len ... // NULL dereference!
With this patch, all curr_xfer accesses are now properly synchronized.
Although all accesses to curr_xfer are done under the lock, in
tegra_qspi_isr_thread() it checks for NULL, releases the lock and
reacquires it later in handle_cpu_based_xfer()/handle_dma_based_xfer().
There is a potential for an update in between, which could cause a NULL
pointer dereference.
To handle this, add a NULL check inside the handlers after acquiring
the lock. This ensures that if the timeout path has already cleared
curr_xfer, the handler will safely return without dereferencing the
NULL pointer.
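A minimal sketch of the added re-check inside the handlers; the lock and field names follow the text above, the bail-out value is an assumption:
	spin_lock_irqsave(&tqspi->lock, flags);
	if (!tqspi->curr_xfer) {
		/* the timeout path already completed and cleared the transfer */
		spin_unlock_irqrestore(&tqspi->lock, flags);
		return IRQ_HANDLED;
	}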
Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260126-tegra_xfer-v2-6-6d2115e4f387@debian.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
tegra_qspi_non_combined_seq_xfer
[ Upstream commit 6d7723e8161f3c3f14125557e19dd080e9d882be ]
Protect the curr_xfer clearing in tegra_qspi_non_combined_seq_xfer()
with the spinlock to prevent a race with the interrupt handler that
reads this field to check if a transfer is in progress.
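A minimal sketch of the protected clearing (lock and field names as in the text above):
	spin_lock_irqsave(&tqspi->lock, flags);
	tqspi->curr_xfer = NULL;
	spin_unlock_irqrestore(&tqspi->lock, flags);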
Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260126-tegra_xfer-v2-5-6d2115e4f387@debian.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bf4528ab28e2bf112c3a2cdef44fd13f007781cd ]
The curr_xfer field is read by the IRQ handler without holding the lock
to check if a transfer is in progress. When clearing curr_xfer in the
combined sequence transfer loop, protect it with the spinlock to prevent
a race with the interrupt handler.
Protect the curr_xfer clearing at the exit path of
tegra_qspi_combined_seq_xfer() with the spinlock to prevent a race
with the interrupt handler that reads this field.
Without this protection, the IRQ handler could read a partially updated
curr_xfer value, leading to NULL pointer dereference or use-after-free.
Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260126-tegra_xfer-v2-4-6d2115e4f387@debian.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
tegra_qspi_setup_transfer_one
[ Upstream commit f5a4d7f5e32ba163cff893493ec1cbb0fd2fb0d5 ]
When the timeout handler processes a completed transfer and signals
completion, the transfer thread can immediately set up the next transfer
and assign curr_xfer to point to it.
If a delayed ISR from the previous transfer then runs, it checks if
(!tqspi->curr_xfer) (currently without the lock also -- to be fixed
soon) to detect stale interrupts, but this check passes because
curr_xfer now points to the new transfer. The ISR then incorrectly
processes the new transfer's context.
Protect the curr_xfer assignment with the spinlock to ensure the ISR
either sees NULL (and bails out) or sees the new value only after the
assignment is complete.
Fixes: 921fc1838fb0 ("spi: tegra210-quad: Add support for Tegra210 QSPI controller")
Signed-off-by: Breno Leitao <leitao@debian.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260126-tegra_xfer-v2-3-6d2115e4f387@debian.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ef13ba357656451d6371940d8414e3e271df97e3 ]
Move the assignment of the transfer pointer from curr_xfer inside the
spinlock critical section in both handle_cpu_based_xfer() and
handle_dma_based_xfer().
Previously, curr_xfer was read before acquiring the lock, creating a
window where the timeout path could clear curr_xfer between reading it
and using it. By moving the read inside the lock, the handlers are
guaranteed to see a consistent value that cannot be modified by the
timeout path.
Fixes: 921fc1838fb0 ("spi: tegra210-quad: Add support for Tegra210 QSPI controller")
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260126-tegra_xfer-v2-2-6d2115e4f387@debian.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit aabd8ea0aa253d40cf5f20a609fc3d6f61e38299 ]
When the ISR thread wakes up late and finds that the timeout handler
has already processed the transfer (curr_xfer is NULL), return
IRQ_HANDLED instead of IRQ_NONE.
Use a similar approach to tegra_qspi_handle_timeout() by reading
QSPI_TRANS_STATUS and checking the QSPI_RDY bit to determine if the
hardware actually completed the transfer. If QSPI_RDY is set, the
interrupt was legitimate and triggered by real hardware activity.
The fact that the timeout path handled it first doesn't make it
spurious. Returning IRQ_NONE incorrectly suggests the interrupt
wasn't for this device, which can cause issues with shared interrupt
lines and interrupt accounting.
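A hedged sketch of the described return-value logic; the register accessor name is an assumption, the register and bit names follow the text above:
	if (!tqspi->curr_xfer) {
		u32 status = tegra_qspi_readl(tqspi, QSPI_TRANS_STATUS);

		/* QSPI_RDY set means the hardware really finished a transfer,
		 * so the interrupt was ours even if the timeout path won. */
		return (status & QSPI_RDY) ? IRQ_HANDLED : IRQ_NONE;
	}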
Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20260126-tegra_xfer-v2-1-6d2115e4f387@debian.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 41399c5d476156635c9a58de870d39318e22fa09 ]
Higher voltage settings were unusable due to incorrect n_voltages values
causing registration failures. For example, setting aldo4 to 3.3V failed
with -EINVAL because the required selector (123) exceeded the allowed
range (n_voltages=117).
Fix by aligning n_voltages with the hardware register widths per the P1
datasheet [1]:
- BUCK: 255 (was 254), allows selectors 0-254, selector 255 is reserved
- LDO: 128 (was 117), allows selectors 0-127, selectors 0-10 are for
suspend mode, valid operational range is 11-127
This enables the full voltage range supported by the hardware.
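A hedged fragment illustrating the corrected values in the regulator descriptors; all other fields are omitted and the descriptor names are assumptions:
	static const struct regulator_desc p1_buck_desc = {
		.n_voltages = 255,	/* selectors 0-254; 255 is reserved */
	};

	static const struct regulator_desc p1_ldo_desc = {
		.n_voltages = 128,	/* selectors 0-127; 0-10 are for suspend mode */
	};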
Fixes: 8b84d712ad84 ("regulator: spacemit: support SpacemiT P1 regulators")
Link: https://developer.spacemit.com/documentation [1]
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Link: https://patch.msgid.link/20260122-spacemit-p1-v1-1-309be27fbff9@riscstar.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit b126097b0327437048bd045a0e4d273dea2910dd upstream.
When a block read returns an invalid length, zero or >I2C_SMBUS_BLOCK_MAX,
the length handler sets the state to IMX_I2C_STATE_FAILED. However,
i2c_imx_master_isr() unconditionally overwrites this with
IMX_I2C_STATE_READ_CONTINUE, causing an endless read loop that overruns
buffers and crashes the system.
Guard the state transition to preserve error states set by the length
handler.
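A minimal sketch of the guarded transition (state names as in the text above, surrounding ISR code omitted):
	/* do not clobber an error state set by the length handler */
	if (i2c_imx->state != IMX_I2C_STATE_FAILED)
		i2c_imx->state = IMX_I2C_STATE_READ_CONTINUE;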
Fixes: 5f5c2d4579ca ("i2c: imx: prevent rescheduling in non dma mode")
Signed-off-by: LI Qingwu <Qing-wu.Li@leica-geosystems.com.cn>
Cc: <stable@vger.kernel.org> # v6.13+
Reviewed-by: Stefan Eichenberger <eichest@gmail.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20260116111906.3413346-2-Qing-wu.Li@leica-geosystems.com.cn
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e34f77b09080c86c929153e2a72da26b4f8947ff ]
Fix incorrect NULL check in loongson_gpio_init_irqchip().
The function checks chip->parent instead of chip->irq.parents.
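A minimal sketch of the corrected check; the error code returned is an assumption:
	/* was: if (!chip->parent) */
	if (!chip->irq.parents)
		return -ENOMEM;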
Fixes: 03c146cb6cd1 ("gpio: loongson-64bit: Add support for Loongson-2K0300 SoC")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Link: https://patch.msgid.link/20260205072649.3271158-1-nichen@iscas.ac.cn
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7f67ba5413f98d93116a756e7f17cd2c1d6c2bd6 ]
Fixes: 4a767b1d039a8 ("ASoC: amd: add acp3x pdm driver dma ops")
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Link: https://patch.msgid.link/20260202205034.7697-1-chris.bainbridge@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 124bdc6eccc8c5cba68fee00e01c084c116c4360 ]
When the support for the Sound Blaster X-Fi Surround 5.1 Pro was added,
the existing logic for the X-Fi Surround 5.1 in snd_audigy2nx_led_put()
was broken due to missing *else* before the added *if*: snd_usb_ctl_msg()
became incorrectly called twice and an error from first snd_usb_ctl_msg()
call ignored. As the added snd_usb_ctl_msg() call was totally identical
to the existing one for the "plain" X-Fi Surround 5.1, just merge those
two *if* statements while fixing the broken logic...
Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.
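The bug pattern and its fix as a hedged, generic sketch; the condition and helper names are placeholders, not the driver's actual code:
	/* before: without "else" both branches ran and err from the first call was lost */
	if (is_device_a)
		err = send_ctl_msg(dev);
	if (is_device_b)		/* should have been "else if" or a merged test */
		err = send_ctl_msg(dev);

	/* after: one call behind the merged condition */
	if (is_device_a || is_device_b)
		err = send_ctl_msg(dev);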
Fixes: 7cdd8d73139e ("ALSA: usb-audio - Add support for USB X-Fi S51 Pro")
Signed-off-by: Sergey Shtylyov <s.shtylyov@auroraos.dev>
Link: https://patch.msgid.link/20260203161558.18680-1-s.shtylyov@auroraos.dev
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 10db9f6899dd3a2dfd26efd40afd308891dc44a8 ]
Use the dev_*_ratelimit() macros if the cs_dsp KUnit tests are enabled
in the build, and allow the KUnit tests to disable message output.
Some of the KUnit tests cause a very large number of log messages from
cs_dsp, because the tests perform many different test cases. This could
cause some lines to be dropped from the kernel log. Dropped lines can
prevent the KUnit wrappers from parsing the ktap output in the dmesg log.
The KUnit builds of cs_dsp export three bools that the KUnit tests can
use to entirely disable log output of err, warn and info messages. Some
tests have been updated to use this, replacing the previous fudge of a
usleep() in the exit handler of each test. We don't necessarily want to
disable all log messages if they aren't expected to be excessive,
so the rate-limiting allows leaving some logging enabled.
The rate-limited macros are not used in normal builds because it is not
appropriate to rate-limit every message. That could cause important
messages to be dropped, and there wouldn't be such a high rate of
messages in normal operation.
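A hedged sketch of the kind of wrapper described; the config symbol and the bool name are assumptions, not the identifiers actually used by the driver:
	#if IS_ENABLED(CONFIG_FW_CS_DSP_KUNIT_TEST)
	extern bool cs_dsp_test_suppress_err;	/* set by the KUnit tests */
	#define cs_dsp_err(dsp, fmt, ...)					\
		do {								\
			if (!cs_dsp_test_suppress_err)				\
				dev_err_ratelimited((dsp)->dev, fmt, ##__VA_ARGS__); \
		} while (0)
	#else
	#define cs_dsp_err(dsp, fmt, ...) dev_err((dsp)->dev, fmt, ##__VA_ARGS__)
	#endif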
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/linux-sound/af393f08-facb-4c44-a054-1f61254803ec@opensource.cirrus.com/T/#t
Fixes: cd8c058499b6 ("firmware: cs_dsp: Add KUnit testing of bin error cases")
Link: https://patch.msgid.link/20260130171256.863152-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 78cfd833bc04c0398ca4cfc64704350aebe4d4c2 ]
cs_dsp_debugfs_wmfw_read() and cs_dsp_debugfs_bin_read() were identical
except for which struct member they printed. Move all this duplicated
code into a common function cs_dsp_debugfs_string_read().
The check for dsp->booted has been removed because this is redundant.
The two strings are set when the DSP is booted and cleared when the
DSP is powered-down.
Access to the string char * must be protected by the pwr_lock mutex. The
string is passed into cs_dsp_debugfs_string_read() as a pointer to the
char * so that the mutex lock can also be factored out into
cs_dsp_debugfs_string_read().
wmfw_file_name and bin_file_name members of struct cs_dsp have been
changed to const char *. It makes for a better API to pass a const
pointer into cs_dsp_debugfs_string_read().
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20251120130640.1169780-2-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 10db9f6899dd ("firmware: cs_dsp: rate-limit log messages in KUnit builds")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bbf4a17ad9ffc4e3d7ec13d73ecd59dea149ed25 ]
syzbot reported a kernel BUG in fib6_add_rt2node() when adding an IPv6
route. [0]
Commit f72514b3c569 ("ipv6: clear RA flags when adding a static
route") introduced logic to clear RTF_ADDRCONF from existing routes
when a static route with the same nexthop is added. However, this
causes a problem when the existing route has a gateway.
When RTF_ADDRCONF is cleared from a route that has a gateway, that
route becomes eligible for ECMP, i.e. rt6_qualify_for_ecmp() returns
true. The issue is that this route was never added to the
fib6_siblings list.
This leads to a mismatch between the following counts:
- The sibling count computed by iterating fib6_next chain, which
includes the newly ECMP-eligible route
- The actual siblings in fib6_siblings list, which does not include
that route
When a subsequent ECMP route is added, fib6_add_rt2node() hits
BUG_ON(sibling->fib6_nsiblings != rt->fib6_nsiblings) because the
counts don't match.
Fix this by only clearing RTF_ADDRCONF when the existing route does
not have a gateway. Routes without a gateway cannot qualify for ECMP
anyway (rt6_qualify_for_ecmp() requires fib_nh_gw_family), so clearing
RTF_ADDRCONF on them is safe and matches the original intent of the
commit.
[0]:
kernel BUG at net/ipv6/ip6_fib.c:1217!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 6010 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:fib6_add_rt2node+0x3433/0x3470 net/ipv6/ip6_fib.c:1217
[...]
Call Trace:
<TASK>
fib6_add+0x8da/0x18a0 net/ipv6/ip6_fib.c:1532
__ip6_ins_rt net/ipv6/route.c:1351 [inline]
ip6_route_add+0xde/0x1b0 net/ipv6/route.c:3946
ipv6_route_ioctl+0x35c/0x480 net/ipv6/route.c:4571
inet6_ioctl+0x219/0x280 net/ipv6/af_inet6.c:577
sock_do_ioctl+0xdc/0x300 net/socket.c:1245
sock_ioctl+0x576/0x790 net/socket.c:1366
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
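A hedged sketch of the narrowed condition; the route variable name is illustrative, and whether the gateway test is done via RTF_GATEWAY or via the nexthop's fib_nh_gw_family is an assumption here:
	/* only clear RTF_ADDRCONF for routes without a gateway; a gateway route
	 * would become ECMP-eligible without ever joining fib6_siblings */
	if (!(iter->fib6_flags & RTF_GATEWAY))
		iter->fib6_flags &= ~RTF_ADDRCONF;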
Fixes: f72514b3c569 ("ipv6: clear RA flags when adding a static route")
Reported-by: syzbot+cb809def1baaac68ab92@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cb809def1baaac68ab92
Tested-by: syzbot+cb809def1baaac68ab92@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260204095837.1285552-1-syoshida@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 071be3b0b6575d45be9df9c5b612f5882bfc5e88 ]
The initial state of dma_needs_unmap may be false, but change to true
while mapping the data iterator. Enabling swiotlb is one such case that
can change the result. The nvme driver needs to save the mapped dma
vectors to be unmapped later, so allocate as needed during iteration
rather than assume it was always allocated at the beginning. This fixes
a NULL dereference from accessing an uninitialized dma_vecs when the
device dma unmapping requirements change mid-iteration.
Fixes: b8b7570a7ec8 ("nvme-pci: fix dma unmapping when using PRPs and not using the IOVA mapping")
Link: https://lore.kernel.org/linux-nvme/20260202125738.1194899-1-pradeep.pragallapati@oss.qualcomm.com/
Reported-by: Pradeep P V K <pradeep.pragallapati@oss.qualcomm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4cb1b327135dddf3d0ec2544ea36ed05ba2252bc ]
xe_guc_print_info is void-returning, but the function pointer it is
assigned to expects an int-returning function, leading to the following
CFI error:
[ 206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe]
(target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a)
Fix this by updating xe_guc_print_info to return an integer.
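A minimal sketch of the signature change (the body is elided and the parameter list is an assumption):
	int xe_guc_print_info(struct xe_guc *guc, struct drm_printer *p)
	{
		/* ... existing printing unchanged ... */
		return 0;	/* int return now matches the CFI-checked callback type */
	}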
Fixes: e15826bb3c2c ("drm/xe/guc: Refactor GuC debugfs initialization")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: George D Sworo <george.d.sworo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com
(cherry picked from commit dd8ea2f2ab71b98887fdc426b0651dbb1d1ea760)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f41c5d151078c5348271ffaf8e7410d96f2d82f8 ]
nft_map_catchall_activate() has an inverted element activity check
compared to its non-catchall counterpart nft_mapelem_activate() and
compared to what is logically required.
nft_map_catchall_activate() is called from the abort path to re-activate
catchall map elements that were deactivated during a failed transaction.
It should skip elements that are already active (they don't need
re-activation) and process elements that are inactive (they need to be
restored). Instead, the current code does the opposite: it skips inactive
elements and processes active ones.
Compare the non-catchall activate callback, which is correct:
nft_mapelem_activate():
if (nft_set_elem_active(ext, iter->genmask))
return 0; /* skip active, process inactive */
With the buggy catchall version:
nft_map_catchall_activate():
if (!nft_set_elem_active(ext, genmask))
continue; /* skip inactive, process active */
The consequence is that when a DELSET operation is aborted,
nft_setelem_data_activate() is never called for the catchall element.
For NFT_GOTO verdict elements, this means nft_data_hold() is never
called to restore the chain->use reference count. Each abort cycle
permanently decrements chain->use. Once chain->use reaches zero,
DELCHAIN succeeds and frees the chain while catchall verdict elements
still reference it, resulting in a use-after-free.
This is exploitable for local privilege escalation from an unprivileged
user via user namespaces + nftables on distributions that enable
CONFIG_USER_NS and CONFIG_NF_TABLES.
Fix by removing the negation so the check matches nft_mapelem_activate():
skip active elements, process inactive ones.
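The corrected check, mirroring the nft_mapelem_activate() snippet quoted above:
	nft_map_catchall_activate():
		if (nft_set_elem_active(ext, genmask))
			continue;	/* skip active, process inactive */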
Fixes: 628bd3e49cba ("netfilter: nf_tables: drop map element references from preparation phase")
Signed-off-by: Andrew Fasano <andrew.fasano@nist.gov>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 831a2b27914cc880130ffe8fb8d1e65a5324d07f ]
This is a printf-style function, which gcc -Werror=suggest-attribute=format
correctly points out:
drivers/hwmon/occ/common.c: In function 'occ_init_attribute':
drivers/hwmon/occ/common.c:761:9: error: function 'occ_init_attribute' might be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format]
Add the attribute to avoid this warning and ensure any incorrect
format strings are detected here.
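A hedged sketch of the annotation; the parameter list and format-argument positions shown are assumptions for illustration only:
	static void __printf(3, 4)
	occ_init_attribute(struct occ *occ, umode_t mode, const char *fmt, ...);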
Fixes: 744c2fe950e9 ("hwmon: (occ) Rework attribute registration for stack usage")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20260203163440.2674340-1-arnd@kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bb36170d959fad7f663f91eb9c32a84dd86bef2b ]
Restrict D3Cold disablement for BMG to unsupported NUC platforms,
instead of disabling it on all platforms.
Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: 3e331a6715ee ("drm/xe/pm: Temporarily disable D3Cold on BMG")
Link: https://patch.msgid.link/20260123173238.1642383-1-karthik.poosa@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 39125eaf8863ab09d70c4b493f58639b08d5a897)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7ee9b3e091c63da71e15c72003f1f07e467f5158 ]
The topology query helper advanced the user pointer by the size
of the pointer, not the size of the structure. This can misalign
the output blob and corrupt the following mask. Fix the increment
to use sizeof(*topo).
There is no issue currently, as sizeof(*topo) happens to be equal
to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes).
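A one-line sketch of the corrected increment; the pointer variable name is an assumption:
	query_ptr += sizeof(*topo);	/* struct size, not sizeof(topo) == pointer size */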
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit c2a6859138e7f73ad904be17dd7d1da6cc7f06b3)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0e0c8f4d16de92520623aa1ea485cadbf64e6929 ]
The mgag200_bmc_stop_scanout() function is called by the .atomic_disable()
handler for the MGA G200 VGA BMC encoder. This function performs a few
register writes to inform the BMC of an upcoming mode change, and then
polls to wait until the BMC actually stops.
The polling is implemented using a busy loop with udelay() and an iteration
timeout of 300, resulting in the function blocking for 300 milliseconds.
The function gets called ultimately by the output_poll_execute work thread
for the DRM output change polling thread of the mgag200 driver:
kworker/0:0-mm_ 3528 [000] 4555.315364:
ffffffffaa0e25b3 delay_halt.part.0+0x33
ffffffffc03f6188 mgag200_bmc_stop_scanout+0x178
ffffffffc087ae7a disable_outputs+0x12a
ffffffffc087c12a drm_atomic_helper_commit_tail+0x1a
ffffffffc03fa7b6 mgag200_mode_config_helper_atomic_commit_tail+0x26
ffffffffc087c9c1 commit_tail+0x91
ffffffffc087d51b drm_atomic_helper_commit+0x11b
ffffffffc0509694 drm_atomic_commit+0xa4
ffffffffc05105e8 drm_client_modeset_commit_atomic+0x1e8
ffffffffc0510ce6 drm_client_modeset_commit_locked+0x56
ffffffffc0510e24 drm_client_modeset_commit+0x24
ffffffffc088a743 __drm_fb_helper_restore_fbdev_mode_unlocked+0x93
ffffffffc088a683 drm_fb_helper_hotplug_event+0xe3
ffffffffc050f8aa drm_client_dev_hotplug+0x9a
ffffffffc088555a output_poll_execute+0x29a
ffffffffa9b35924 process_one_work+0x194
ffffffffa9b364ee worker_thread+0x2fe
ffffffffa9b3ecad kthread+0xdd
ffffffffa9a08549 ret_from_fork+0x29
On a server running ptp4l with the mgag200 driver loaded, we found that
ptp4l would sometimes get blocked from execution because of this busy
waiting loop.
Every so often, approximately once every 20 minutes -- though with large
variance -- the output_poll_execute() thread would detect some sort of
change that required performing a hotplug event which results in attempting
to stop the BMC scanout, resulting in a 300msec delay on one CPU.
On this system, ptp4l was pinned to a single CPU. When the
output_poll_execute() thread ran on that CPU, it blocked ptp4l from
executing for its 300 millisecond duration.
This resulted in PTP service disruptions such as failure to send a SYNC
message on time, failure to handle ANNOUNCE messages on time, and clock
check warnings from the application. All of this despite the application
being configured with FIFO_RT and a higher priority than the background
workqueue tasks. (However, note that the kernel did not use
CONFIG_PREEMPT...)
It is unclear if the event is due to a faulty VGA connection, another bug,
or actual events causing a change in the connection. At least on the system
under test it is not a one-time event and consistently causes disruption to
the time sensitive applications.
The function has some helpful comments explaining what steps it is
attempting to take. In particular, step 3a and 3b are explained as such:
3a - The third step is to verify if there is an active scan. We are
waiting on a 0 on remhsyncsts (<XSPAREREG<0>.
3b - This step occurs only if the remove is actually scanning. We are
waiting for the end of the frame which is a 1 on remvsyncsts
(<XSPAREREG<1>).
The actual steps 3a and 3b are implemented as while loops with a
non-sleeping udelay(). The first step iterates while the tmp value at
position 0 is *not* set. That is, it keeps iterating as long as the bit is
zero. If the bit is already 0 (because there is no active scan), it will
iterate the entire 300 attempts which wastes 300 milliseconds in total.
This is opposite of what the description claims.
The step 3b logic only executes if we do not iterate over the entire 300
attempts in the first loop. If it does trigger, it is trying to check and
wait for a 1 on the remvsyncsts. However, again the condition is actually
inverted and it will loop as long as the bit is 1, stopping once it hits
zero (rather than the explained attempt to wait until we see a 1).
Worse, both loops are implemented using non-sleeping waits which spin
instead of allowing the scheduler to run other processes. If the kernel is
not configured to allow arbitrary preemption, it will waste valuable CPU
time doing nothing.
There does not appear to be any documentation for the BMC register
interface, beyond what is in the comments here. It seems more probable that
the comment here is correct and the implementation accidentally got
inverted from the intended logic.
Reading through other DRM driver implementations, it does not appear that
the .atomic_enable or .atomic_disable handlers need to delay instead of
sleep. For example, the ast_astdp_encoder_helper_atomic_disable() function
calls ast_dp_set_phy_sleep() which uses msleep(). The "atomic" in the name
is referring to the atomic modesetting support, which is the support to
enable atomic configuration from userspace, and not to the "atomic context"
of the kernel. There is no reason to use udelay() here if a sleep would be
sufficient.
Replace the while loops with a read_poll_timeout() based implementation
that will sleep between iterations, and which stops polling once the
condition is met (instead of looping as long as the condition is met). This
aligns with the commented behavior and avoids blocking on the CPU while
doing nothing.
Note the RREG_DAC is implemented using a statement expression to allow
working properly with the read_poll_timeout family of functions. The other
RREG_<TYPE> macros ought to be cleaned up to have better semantics, and
several places in the mgag200 driver could make use of RREG_DAC or similar
RREG_* macros should likely be cleaned up for better semantics as well, but
that task has been left as a future cleanup for a non-bugfix.
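A hedged sketch of the conversion, using read_poll_timeout() from linux/iopoll.h; the register index, bit masks and poll intervals are illustrative, not taken from the driver:
	/* step 3a: wait (sleeping between reads) for remhsyncsts (bit 0) to read 0 */
	ret = read_poll_timeout(RREG_DAC, tmp, !(tmp & BIT(0)),
				5000, 300000, false, MGA1064_SPAREREG);
	if (ret)
		return;

	/* step 3b: wait (sleeping between reads) for remvsyncsts (bit 1) to read 1 */
	read_poll_timeout(RREG_DAC, tmp, tmp & BIT(1),
			  5000, 300000, false, MGA1064_SPAREREG);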
Fixes: 414c45310625 ("mgag200: initial g200se driver (v2)")
Suggested-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260202-jk-mgag200-fix-bad-udelay-v2-1-ce1e9665987d@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5c2c3c38be396257a6a2e55bd601a12bb9781507 ]
The UDP GRO complete stage assumes that all the packets entering the RX
path have the `encapsulation` flag zeroed. Such an assumption is not true:
a few H/W NICs can set that flag when offloading the checksum for
UDP encapsulated traffic, the tun driver can inject GSO packets with
UDP encapsulation and the problematic layout can also be created via
a veth based setup.
Due to the above, in the problematic scenarios, udp4_gro_complete() uses
the wrong network offset (inner instead of outer) to compute the outer
UDP header pseudo checksum, leading to csum validation errors later on
in packet processing.
Address the issue by always clearing the encapsulation flag at GRO completion
time. The flag will be set again as needed for encapsulated packets by
udp_gro_complete().
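A one-line sketch of the described clearing at GRO complete time:
	skb->encapsulation = 0;	/* set again by udp_gro_complete() where needed */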
Fixes: 5ef31ea5d053 ("net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/562638dbebb3b15424220e26a180274b387e2a88.1770032084.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f613e8b4afea0cd17c7168e8b00e25bc8d33175d ]
Yin Fengwei reported an RCU stall in ptype_seq_show() and provided
a patch.
Real issue is that ptype_seq_next() and ptype_seq_show() violate
RCU rules.
ptype_seq_show() runs under rcu_read_lock(), and reads pt->dev
to get device name without any barrier.
At the same time, concurrent writers can remove a packet_type structure
(which is correctly freed after an RCU grace period) and clear pt->dev
without an RCU grace period.
Define ptype_iter_state to carry a dev pointer alongside seq_net_private:
struct ptype_iter_state {
struct seq_net_private p;
struct net_device *dev; // added in this patch
};
We need to record the device pointer in ptype_get_idx() and
ptype_seq_next() so that ptype_seq_show() is safe against
concurrent pt->dev changes.
We also need to add full RCU protection in ptype_seq_next().
(Missing READ_ONCE() when reading list.next values)
Many thanks to Dong Chenchen for providing a repro.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: 1d10f8a1f40b ("net-procfs: show net devices bound packet types")
Fixes: c353e8983e0d ("net: introduce per netns packet chains")
Reported-by: Yin Fengwei <fengwei_yin@linux.alibaba.com>
Reported-by: Dong Chenchen <dongchenchen2@huawei.com>
Closes: https://lore.kernel.org/netdev/CANn89iKRRKPnWjJmb-_3a=sq+9h6DvTQM4DBZHT5ZRGPMzQaiA@mail.gmail.com/T/#m7b80b9fc9b9267f90e0b7aad557595f686f9c50d
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Tested-by: Yin Fengwei <fengwei_yin@linux.alibaba.com>
Link: https://patch.msgid.link/20260202205217.2881198-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
devm_gpiod_get_optional() in adin1110_check_spi()
[ Upstream commit 78211543d2e44f84093049b4ef5f5bfa535f4645 ]
The devm_gpiod_get_optional() function may return an ERR_PTR in case of
genuine GPIO acquisition errors, not just NULL which indicates the
legitimate absence of an optional GPIO.
Add an IS_ERR() check after the call in adin1110_check_spi(). On error,
return the error code to ensure proper failure handling rather than
proceeding with invalid pointers.
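A minimal sketch of the added check; the device pointer, con_id and flags passed to devm_gpiod_get_optional() are assumptions:
	reset_gpio = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
	if (IS_ERR(reset_gpio))
		return PTR_ERR(reset_gpio);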
Fixes: 36934cac7aaf ("net: ethernet: adi: adin1110: add reset GPIO")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20260202040228.4129097-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8f959d37c1f2efec6dac55915ee82302e98101fb ]
Some shimmer/colorful points appears when using the steamOS color
pipeline for HDR on gaming with DCN32. These points look like black
values being wrongly mapped to red/blue/green values. It was caused
because the number of hw points in regular LUTs and in a shaper LUT was
treated as the same.
DCN3+ regular LUTs have 257 bases and implicit deltas (i.e. HW
calculates them), but shaper LUT is a special case: it has 256 bases and
256 deltas, as in DCN1-2 regular LUTs, and outputs 14-bit values.
Fix that by decreasing by 1 the number of HW points computed in the LUT
segmentation, so that the shaper LUT (i.e. fixpoint == true) keeps the
same DCN10 CM logic and regular LUTs go with `hw_points + 1`.
CC: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Fixes: 4d5fd3d08ea9 ("drm/amd/display: PQ tail accuracy")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5006505b19a2119e71c008044d59f6d753c858b9)
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fb7f54aa2a99b07945911152c5d3d4a6eb39f797 ]
Not pausing it means that we can have the TCM work queued into a
non-freezable workqueue, which, in resume, is re-activated before the
driver's resume is called.
The TCM work might send commands to the FW before we resumed the device,
leading to an assert.
Closes: https://lore.kernel.org/linux-wireless/aTDoDiD55qlUZ0pn@debian.local/
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Fixes: e8bb19c1d590 ("wifi: iwlwifi: support fast resume")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260129212650.05621f3faedb.I44df9cf9183b5143df8078131e0d87c0fd7e1763@changeid
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5ff641011ab7fb63ea101251087745d9826e8ef5 ]
mlo_scan_start_wk is not canceled on disconnection. In fact, it is not
canceled anywhere except in the restart cleanup, where we don't really
have to.
This can cause an init-after-queue issue: if, for example, the work was
queued and then drv_change_interface got executed.
This can also cause use-after-free: if the work is executed after the
vif is freed.
Fixes: 9748ad82a9d9 ("wifi: iwlwifi: defer MLO scan after link activation")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260129212650.a36482a60719.I5bf64a108ca39dacb5ca0dcd8b7258a3ce8db74c@changeid
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c28d765ec5da160d3a48d0928528084cef97bf19 ]
It is not recommended to access the 32‑bit registers of this hardware IP
using lower‑width accessors (i.e. 16‑bit), and the only exception to
this rule was introduced in the initial ENETC v1 driver for the PMAR1
register, which holds the lower 16 bits of the primary MAC address of
an SI. Meanwhile, this exception has been replicated in the v4 driver
code as well.
Since LS1028 (the only SoC with ENETC v1) is not affected by this issue,
the current patch converts the 16‑bit reads from PMAR1 starting with
ENETC v4.
Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260130141035.272471-5-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 21d0fc95b5920ae8e69a2c0394bef82b8392bcc9 ]
For ENETC v4, which is integrated into more complex SoCs (compared to v1),
16‑bit register writes are blocked in the SoC interconnect on some chips.
To be fair, it is not recommended to access 32‑bit registers of this IP
using lower‑width accessors (i.e. 16‑bit), and the only exception to
this rule was introduced by me in the initial ENETC v1 driver for the
PMAR1 register, which holds the lower 16 bits of the primary MAC address
of an SI. Meanwhile, this exception has been replicated for v4 as well.
Since LS1028 (the only SoC with ENETC v1) is not affected by this issue,
the current patch fixes the 16‑bit writes to PMAR1 starting with ENETC
v4.
Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260130141035.272471-4-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9ae13b2e64fcd2ca00a76b7d60fc4641a6b9209d ]
For ENETC v4 these settings are controlled by the global ENETC
command cache attribute registers (EnCAR), from the IERB register
block.
The hardcoded CDBR cacheability settings were inherited from LS1028A,
and should be removed from the ENETC v4 driver as they conflict
with the global IERB settings.
Fixes: e3f4a0a8ddb4 ("net: enetc: add command BD ring support for i.MX95 ENETC")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260130141035.272471-3-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a69c17230cab07bd156f894fdc82bd78b43ea72f ]
For ENETC v4 these settings are controlled by the global ENETC
message and buffer cache attribute registers (EnBCAR and EnMCAR),
from the IERB register block.
The hardcoded cacheability settings were inherited from LS1028A,
and should be removed from the ENETC v4 driver as they conflict
with the global IERB settings.
Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260130141035.272471-2-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 74d9391e8849e70ded5309222d09b0ed0edbd039 ]
The rx->skey field contains a struct tipc_aead_key with GCM-AES
encryption keys used for TIPC cluster communication. Using plain
kfree() leaves this sensitive key material in freed memory pages
where it could potentially be recovered.
Switch to kfree_sensitive() to ensure the key material is zeroed
before the memory is freed.
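A one-line sketch of the switch described:
	kfree_sensitive(rx->skey);	/* zeroes the GCM-AES key material before freeing */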
Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange")
Signed-off-by: Daniel Hodges <hodgesd@meta.com>
Link: https://patch.msgid.link/20260131180114.2121438-1-hodgesd@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1c172febdf065375359b2b95156e476bfee30b60 ]
Initializing input_xfrm to RXH_XFRM_NO_CHANGE in RSS contexts is
problematic. I think I did this to make it clear that the context
does not have its own settings applied. But unlike ETH_RSS_HASH_NO_CHANGE
which is zero, RXH_XFRM_NO_CHANGE is 0xff. We need to be careful
when reading the value back, and remember to treat 0xff as 0.
Remove the initialization and switch to storing 0. This lets us
also remove the workaround in ethnl_rss_set(). Get side does not
need any adjustments and context get no longer reports:
RSS input transformation:
symmetric-xor: on
symmetric-or-xor: on
Unknown bits in RSS input transformation: 0xfc
for NICs which don't support input_xfrm.
Remove the init of hfunc to ETH_RSS_HASH_NO_CHANGE while at it.
As already mentioned this is a noop since ETH_RSS_HASH_NO_CHANGE
is 0 and struct is zalloc'd. But as this fix exemplifies storing
NO_CHANGE as state is fragile.
This issue is implicitly caught by running our selftests because
YNL in selftests errors out on unknown bits.
Fixes: d3e2c7bab124 ("ethtool: rss: support setting input-xfrm via Netlink")
Link: https://patch.msgid.link/20260130190311.811129-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 83b67cc9be9223183caf91826d9c194d7fb128fa ]
After linkwatch_do_dev() calls __dev_put() to release the linkwatch
reference, the device refcount may drop to 1. At this point,
netdev_run_todo() can proceed (since linkwatch_sync_dev() sees an
empty list and returns without blocking), wait for the refcount to
become 1 via netdev_wait_allrefs_any(), and then free the device
via kobject_put().
This creates a use-after-free when __linkwatch_run_queue() tries to
call netdev_unlock_ops() on the already-freed device.
Note that adding netdev_lock_ops()/netdev_unlock_ops() pair in
netdev_run_todo() before kobject_put() would not work, because
netdev_lock_ops() is conditional - it only locks when
netdev_need_ops_lock() returns true. If the device doesn't require
ops_lock, linkwatch won't hold any lock, and netdev_run_todo()
acquiring the lock won't provide synchronization.
Fix this by moving __dev_put() from linkwatch_do_dev() to its
callers. The device reference logically pairs with de-listing the
device, so it's reasonable for the caller that did the de-listing
to release it. This allows placing __dev_put() after all device
accesses are complete, preventing UAF.
The bug can be reproduced by adding mdelay(2000) after
linkwatch_do_dev() in __linkwatch_run_queue(), then running:
ip tuntap add mode tun name tun_test
ip link set tun_test up
ip link set tun_test carrier off
ip link set tun_test carrier on
sleep 0.5
ip tuntap del mode tun name tun_test
KASAN report:
==================================================================
BUG: KASAN: use-after-free in netdev_need_ops_lock include/net/netdev_lock.h:33 [inline]
BUG: KASAN: use-after-free in netdev_unlock_ops include/net/netdev_lock.h:47 [inline]
BUG: KASAN: use-after-free in __linkwatch_run_queue+0x865/0x8a0 net/core/link_watch.c:245
Read of size 8 at addr ffff88804de5c008 by task kworker/u32:10/8123
CPU: 0 UID: 0 PID: 8123 Comm: kworker/u32:10 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Workqueue: events_unbound linkwatch_event
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0x156/0x4c9 mm/kasan/report.c:482
kasan_report+0xdf/0x1a0 mm/kasan/report.c:595
netdev_need_ops_lock include/net/netdev_lock.h:33 [inline]
netdev_unlock_ops include/net/netdev_lock.h:47 [inline]
__linkwatch_run_queue+0x865/0x8a0 net/core/link_watch.c:245
linkwatch_event+0x8f/0xc0 net/core/link_watch.c:304
process_one_work+0x9c2/0x1840 kernel/workqueue.c:3257
process_scheduled_works kernel/workqueue.c:3340 [inline]
worker_thread+0x5da/0xe40 kernel/workqueue.c:3421
kthread+0x3b3/0x730 kernel/kthread.c:463
ret_from_fork+0x754/0xaf0 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
</TASK>
==================================================================
Fixes: 04efcee6ef8d ("net: hold instance lock during NETDEV_CHANGE")
Reported-by: syzbot+1ec2f6a450f0b54af8c8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6824d064.a70a0220.3e9d8.001a.GAE@google.com/T/
Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260201135915.393451-1-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0ae91d8ab70922fb74c22c20bedcb69459579b1c ]
d9f595b9a65e ("io_uring/zcrx: fix leaking pages on sg init fail") fixed
a page leakage but didn't free the page array, release it as well.
Fixes: b84621d96ee02 ("io_uring/zcrx: allocate sgtable for umem areas")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fdf3f6800be36377e045e2448087f12132b88d2f ]
Gal reports that BPF redirect increments dev->stats.tx_errors
on failure. This is not correct, most modern drivers completely
ignore dev->stats so these drops will be invisible to the user.
Core code should use the dedicated core stats which are folded
into device stats in dev_get_stats().
Note that we're switching from tx_errors to tx_dropped.
Core only has tx_dropped, hence presumably users already expect
that counter to increment for "stack" Tx issues.
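A one-line sketch of the counter switch; dev_core_stats_tx_dropped_inc() is the per-CPU core-stats helper that dev_get_stats() folds into the device stats:
	dev_core_stats_tx_dropped_inc(dev);	/* instead of dev->stats.tx_errors++ */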
Reported-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/c5df3b60-246a-4030-9c9a-0a35cd1ca924@nvidia.com
Fixes: b4ab31414970 ("bpf: Add redirect_neigh helper as redirect drop-in")
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260130033827.698841-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 615901b57b7ef8eb655f71358f7e956e42bcd16b ]
The acpi_power_meter driver's .notify() callback function,
acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock
that is also acquired by callbacks in sysfs attributes of the device
being unregistered which is prone to deadlocks between sysfs access and
device removal.
Address this by moving the hwmon device removal in
acpi_power_meter_notify() outside the lock in question, but notice
that doing it alone is not sufficient because two concurrent
METER_NOTIFY_CONFIG notifications may be attempting to remove the
same device at the same time. To prevent that from happening, add a
new lock serializing the execution of the switch () statement in
acpi_power_meter_notify(). For simplicity, it is a static mutex
which should not be a problem from the performance perspective.
The new lock also allows the hwmon_device_register_with_info()
in acpi_power_meter_notify() to be called outside the inner lock
because it prevents the other notifications handled by that function
from manipulating the "resource" object while the hwmon device based
on it is being registered. The sending of ACPI netlink messages from
acpi_power_meter_notify() is serialized by the new lock too which
generally helps to ensure that the order of handling firmware
notifications is the same as the order of sending netlink messages
related to them.
In addition, notice that hwmon_device_register_with_info() may fail
in which case resource->hwmon_dev will become an error pointer,
so add checks to acpi_power_meter_notify() and acpi_power_meter_remove()
to avoid attempting to unregister the hwmon device pointed to by it in
that case.
Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register")
Closes: https://lore.kernel.org/linux-hwmon/CAK8fFZ58fidGUCHi5WFX0uoTPzveUUDzT=k=AAm4yWo3bAuCFg@mail.gmail.com/
Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6d06bc83a5ae8777a5f7a81c32dd75b8d9b2fe04 ]
rtl8152 can trigger a device reset during resume, which can potentially
result in a deadlock:
**** DPM device timeout after 10 seconds; 15 seconds until panic ****
Call Trace:
<TASK>
schedule+0x483/0x1370
schedule_preempt_disabled+0x15/0x30
__mutex_lock_common+0x1fd/0x470
__rtl8152_set_mac_address+0x80/0x1f0
dev_set_mac_address+0x7f/0x150
rtl8152_post_reset+0x72/0x150
usb_reset_device+0x1d0/0x220
rtl8152_resume+0x99/0xc0
usb_resume_interface+0x3e/0xc0
usb_resume_both+0x104/0x150
usb_resume+0x22/0x110
The problem is that rtl8152 resume calls reset under
tp->control mutex while reset basically re-enters rtl8152
and attempts to acquire the same tp->control lock once
again.
Reset INACCESSIBLE device outside of tp->control mutex
scope to avoid recursive mutex_lock() deadlock.
Fixes: 4933b066fefb ("r8152: If inaccessible at resume time, issue a reset")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Link: https://patch.msgid.link/20260129031106.3805887-1-senozhatsky@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f8db6475a83649689c087a8f52486fcc53e627e9 ]
valis provided a nice repro to crash the kernel:
ip link add p1 type veth peer p2
ip link set address 00:00:00:00:00:20 dev p1
ip link set up dev p1
ip link set up dev p2
ip link add mv0 link p2 type macvlan mode source
ip link add invalid% link p2 type macvlan mode source macaddr add 00:00:00:00:00:20
ping -c1 -I p1 1.2.3.4
He also gave a very detailed analysis:
<quote valis>
The issue is triggered when a new macvlan link is created with
MACVLAN_MODE_SOURCE mode and MACVLAN_MACADDR_ADD (or
MACVLAN_MACADDR_SET) parameter, lower device already has a macvlan
port and register_netdevice() called from macvlan_common_newlink()
fails (e.g. because of the invalid link name).
In this case macvlan_hash_add_source is called from
macvlan_change_sources() / macvlan_common_newlink():
This adds a reference to vlan to the port's vlan_source_hash using
macvlan_source_entry.
vlan is a pointer to the priv data of the link that is being created.
When register_netdevice() fails, the error is returned from
macvlan_newlink() to rtnl_newlink_create():
if (ops->newlink)
err = ops->newlink(dev, &params, extack);
else
err = register_netdevice(dev);
if (err < 0) {
free_netdev(dev);
goto out;
}
and free_netdev() is called, causing a kvfree() on the struct
net_device that is still referenced in the source entry attached to
the lower device's macvlan port.
Now all packets sent on the macvlan port with a matching source mac
address will trigger a use-after-free in macvlan_forward_source().
</quote valis>
With all that, my fix is to make sure we call macvlan_flush_sources()
regardless of @create value whenever "goto destroy_macvlan_port;"
path is taken.
Many thanks to valis for following up on this issue.
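A hedged sketch of the error-path ordering described; the exact labels and cleanup calls in macvlan_common_newlink() are assumptions:
	destroy_macvlan_port:
		/* drop source entries referencing this vlan regardless of @create */
		macvlan_flush_sources(port, vlan);
		if (create)
			macvlan_port_destroy(port->dev);
		return err;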
Fixes: aa5fd0fb7748 ("driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: valis <sec@valis.email>
Reported-by: syzbot+7182fbe91e58602ec1fe@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695fb1e8.050a0220.1c677c.039f.GAE@google.com/T/#u
Cc: Boudewijn van der Heide <boudewijn@delta-utec.com>
Link: https://patch.msgid.link/20260129204359.632556-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit adcbadfd8e05d3558c9cfaa783f17c645181165f ]
Commit fd580c9830316eda ("net: sfp: augment SFP parsing with
phy_interface_t bitmap") did not add augumentation for the interface
bitmap in the quirk for Ubiquiti U-Fiber Instant.
The subsequent commit f81fa96d8a6c7a77 ("net: phylink: use
phy_interface_t bitmaps for optical modules") then changed phylink code
for selection of SFP interface: instead of using link mode bitmap, the
interface bitmap is used, and the fastest interface mode supported by
both SFP module and MAC is chosen.
Since the interface bitmap contains also modes faster than 1000base-x,
this caused a regression wherein this module stopped working
out-of-the-box.
Fix this.
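A sketch of what the quirk has to provide (the helper names are from the
existing sfp/phylink quirk infrastructure; the exact patch may differ):
  static void sfp_quirk_ubnt_uf_instant_sketch(const struct sfp_eeprom_id *id,
                                               unsigned long *modes,
                                               unsigned long *interfaces)
  {
      /* The module only does 1000base-x, regardless of what its EEPROM
       * claims: restrict both the ethtool link modes and the
       * phy_interface_t bitmap so phylink cannot pick a faster mode.
       */
      linkmode_zero(modes);
      phylink_set(modes, 1000baseX_Full);
      bitmap_zero(interfaces, PHY_INTERFACE_MODE_MAX);
      __set_bit(PHY_INTERFACE_MODE_1000BASEX, interfaces);
  }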
Fixes: fd580c9830316eda ("net: sfp: augment SFP parsing with phy_interface_t bitmap")
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20260129082227.17443-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 40857194956dcaf3d2b66d6bd113d844c93bef54 ]
The i40e driver calls udp_tunnel_get_rx_info() during i40e_open().
This is redundant because UDP tunnel RX offload state is preserved
across device down/up cycles. The udp_tunnel core handles
synchronization automatically when required.
Furthermore, recent changes in the udp_tunnel infrastructure require
querying RX info while holding the udp_tunnel lock. Calling it
directly from the ndo_open path violates this requirement,
triggering the following lockdep warning:
Call Trace:
<TASK>
? __udp_tunnel_nic_assert_locked+0x39/0x40 [udp_tunnel]
i40e_open+0x135/0x14f [i40e]
__dev_open+0x121/0x2e0
__dev_change_flags+0x227/0x270
dev_change_flags+0x3d/0xb0
devinet_ioctl+0x56f/0x860
sock_do_ioctl+0x7b/0x130
__x64_sys_ioctl+0x91/0xd0
do_syscall_64+0x90/0x170
...
</TASK>
Remove the redundant and unsafe call to udp_tunnel_get_rx_info() from
i40e_open() to resolve the locking violation.
Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency")
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 234e615bfece9e3e91c50fe49ab9e68ee37c791a ]
The ice driver calls udp_tunnel_get_rx_info() during ice_open_internal().
This is redundant because UDP tunnel RX offload state is preserved
across device down/up cycles. The udp_tunnel core handles
synchronization automatically when required.
Furthermore, recent changes in the udp_tunnel infrastructure require
querying RX info while holding the udp_tunnel lock. Calling it
directly from the ndo_open path violates this requirement,
triggering the following lockdep warning:
Call Trace:
<TASK>
ice_open_internal+0x253/0x350 [ice]
__udp_tunnel_nic_assert_locked+0x86/0xb0 [udp_tunnel]
__dev_open+0x2f5/0x880
__dev_change_flags+0x44c/0x660
netif_change_flags+0x80/0x160
devinet_ioctl+0xd21/0x15f0
inet_ioctl+0x311/0x350
sock_ioctl+0x114/0x220
__x64_sys_ioctl+0x131/0x1a0
...
</TASK>
Remove the redundant and unsafe call to udp_tunnel_get_rx_info() from
ice_open_internal() to resolve the locking violation.
Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency")
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fc6f36eaaedcf4b81af6fe1a568f018ffd530660 ]
Fix race condition where PTP periodic work runs while VSI is being
rebuilt, accessing NULL vsi->rx_rings.
The sequence was:
1. ice_ptp_prepare_for_reset() cancels PTP work
2. ice_ptp_rebuild() immediately queues PTP work
3. VSI rebuild happens AFTER ice_ptp_rebuild()
4. PTP work runs and accesses NULL vsi->rx_rings
Fix: keep the PTP work cancelled during the rebuild and queue it only
after the VSI rebuild completes in ice_rebuild().
Added ice_ptp_queue_work() helper function to encapsulate the logic
for queuing PTP work, ensuring it's only queued when PTP is supported
and the state is ICE_PTP_READY.
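A minimal sketch of such a helper, assuming the flag and field names used
elsewhere in the ice PTP code (the actual patch may differ):
  static void ice_ptp_queue_work(struct ice_pf *pf)
  {
      struct ice_ptp *ptp = &pf->ptp;
      /* Only (re)arm the periodic work once PTP is supported and fully
       * initialized; during a rebuild the state is not ICE_PTP_READY yet.
       */
      if (!test_bit(ICE_FLAG_PTP_SUPPORTED, pf->flags))
          return;
      if (ptp->state != ICE_PTP_READY)
          return;
      kthread_queue_delayed_work(ptp->kworker, &ptp->work, 0);
  }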
Error log:
[ 121.392544] ice 0000:60:00.1: PTP reset successful
[ 121.392692] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 121.392712] #PF: supervisor read access in kernel mode
[ 121.392720] #PF: error_code(0x0000) - not-present page
[ 121.392727] PGD 0
[ 121.392734] Oops: Oops: 0000 [#1] SMP NOPTI
[ 121.392746] CPU: 8 UID: 0 PID: 1005 Comm: ice-ptp-0000:60 Tainted: G S 6.19.0-rc6+ #4 PREEMPT(voluntary)
[ 121.392761] Tainted: [S]=CPU_OUT_OF_SPEC
[ 121.392773] RIP: 0010:ice_ptp_update_cached_phctime+0xbf/0x150 [ice]
[ 121.393042] Call Trace:
[ 121.393047] <TASK>
[ 121.393055] ice_ptp_periodic_work+0x69/0x180 [ice]
[ 121.393202] kthread_worker_fn+0xa2/0x260
[ 121.393216] ? __pfx_ice_ptp_periodic_work+0x10/0x10 [ice]
[ 121.393359] ? __pfx_kthread_worker_fn+0x10/0x10
[ 121.393371] kthread+0x10d/0x230
[ 121.393382] ? __pfx_kthread+0x10/0x10
[ 121.393393] ret_from_fork+0x273/0x2b0
[ 121.393407] ? __pfx_kthread+0x10/0x10
[ 121.393417] ret_from_fork_asm+0x1a/0x30
[ 121.393432] </TASK>
Fixes: 803bef817807d ("ice: factor out ice_ptp_rebuild_owner()")
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 88b68f35eb43ad5ac77ac1107059040b04e6f477 ]
The E825 hardware currently has each PF handle the PFINT_TSYN_TX cause of
the miscellaneous OICR interrupt vector. The actual interrupt cause
underlying this is shared by all ports on the same quad:
┌─────────────────────────────────┐
│ │
│ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │
│ │PF 0│ │PF 1│ │PF 2│ │PF 3│ │
│ └────┘ └────┘ └────┘ └────┘ │
│ │
└────────────────▲────────────────┘
│
│
┌────────────────┼────────────────┐
│ PHY QUAD │
└───▲────────▲────────▲────────▲──┘
│ │ │ │
┌───┼──┐ ┌───┴──┐ ┌───┼──┐ ┌───┼──┐
│Port 0│ │Port 1│ │Port 2│ │Port 3│
└──────┘ └──────┘ └──────┘ └──────┘
If multiple PFs issue Tx timestamp requests at nearly the same time, it is
possible that the correct PF will not be interrupted and will miss its
timestamp. Understanding why is somewhat complex.
Consider the following sequence of events:
CPU 0:
  Send Tx packet on PF 0
  ...
  PF 0 enqueues packet with Tx request        CPU 1, PF1:
  ...                                           Send Tx packet on PF1
  ...                                           PF 1 enqueues packet with Tx request
HW:
  PHY Port 0 sends packet
  PHY raises Tx timestamp event interrupt
  MAC raises each PF interrupt
CPU 0, PF0:                                   CPU 1, PF1:
  ice_misc_intr() checks for Tx timestamps      ice_misc_intr() checks for Tx timestamps
  Sees packet ready bit set                     Sees nothing available
  ...                                           Exits
  ...
  ...
HW:
  PHY port 1 sends packet
  PHY interrupt ignored because not all packet timestamps read yet.
  ...
  Read timestamp, report to stack
Because the interrupt event is shared for all ports on the same quad, the
PHY will not raise a new interrupt for any PF until all timestamps are
read.
In the example above, the second timestamp comes in for port 1 before the
timestamp from port 0 is read. At this point, there is no longer an
interrupt thread running that will read the timestamps, because each PF has
checked and found that there was no work to do. Applications such as ptp4l
will timeout after waiting a few milliseconds. Eventually, the watchdog
service task will re-check for all quads and notice that there are
outstanding timestamps, and issue a software interrupt to recover. However,
by this point it is far too late, and applications have already failed.
All of this occurs because of the underlying hardware behavior. The PHY
cannot raise a new interrupt signal until all outstanding timestamps have
been read.
As a first step to fix this, switch the E825C hardware to the
ICE_PTP_TX_INTERRUPT_ALL mode. In this mode, only the clock owner PF will
respond to the PFINT_TSYN_TX cause. Other PFs disable this cause and will
not wake. In this mode, the clock owner will iterate over all ports and
handle timestamps for each connected port.
This matches the E822 behavior, and is a necessary but insufficient step to
resolve the missing timestamps.
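Schematically, the mode selection becomes something like the following
(helper and field names are assumptions based on the existing ice code,
not the exact patch):
  /* E825C now behaves like E822: the clock owner services PFINT_TSYN_TX
   * for every port on the quad, the other PFs keep the cause disabled.
   */
  if (ice_is_e825c(&pf->hw) || ice_is_e822(&pf->hw))
      pf->ptp.tx_interrupt_mode = ICE_PTP_TX_INTERRUPT_ALL;
  else
      pf->ptp.tx_interrupt_mode = ICE_PTP_TX_INTERRUPT_SELF;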
Even with the ICE_PTP_TX_INTERRUPT_ALL mode in use, we still sometimes miss
a timestamp event. ice_ptp_tx_tstamp_owner() does re-check the ready
bitmap, but it does so before re-enabling the OICR interrupt vector, and
it checks only the ready bitmap, not the software Tx timestamp tracker.
To avoid risk of losing a timestamp, refactor the logic to check both the
software Tx timestamp tracker bitmap *and* the hardware ready bitmap.
Additionally, do this outside of ice_ptp_process_ts() after we have already
re-enabled the OICR interrupt.
Remove the checks from the ice_ptp_tx_tstamp(), ice_ptp_tx_tstamp_owner(),
and the ice_ptp_process_ts() functions. This results in ice_ptp_tx_tstamp()
being nothing more than a wrapper around ice_ptp_process_tx_tstamp() so we
can remove it.
Add the ice_ptp_tx_tstamps_pending() function which returns a boolean
indicating whether there are any pending Tx timestamps. First, check the
software timestamp tracker bitmap. In ICE_PTP_TX_INTERRUPT_ALL mode, check
*all* ports' software trackers. If a tracker has outstanding timestamp
requests, return true. Additionally, check the PHY ready bitmap to confirm
if the PHY indicates any outstanding timestamps.
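A simplified sketch of that check, restricted to a single port for brevity
(field and helper names are assumptions based on the existing ice PTP code):
  static bool ice_ptp_tx_tstamps_pending(struct ice_pf *pf)
  {
      struct ice_ptp_tx *tx = &pf->ptp.port.tx;
      u64 ready_map = 0;
      /* Software side: Tx skbs still waiting for a timestamp?  (The real
       * change walks every port's tracker in ICE_PTP_TX_INTERRUPT_ALL mode.)
       */
      if (!bitmap_empty(tx->in_use, tx->len))
          return true;
      /* Hardware side: does the PHY report captured but unread timestamps? */
      if (!ice_get_phy_tx_tstamp_ready(&pf->hw, tx->block, &ready_map) &&
          ready_map)
          return true;
      return false;
  }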
In ice_misc_intr_thread_fn(), call ice_ptp_tx_tstamps_pending() just before
returning from the IRQ thread handler. If it returns true, write to
PFINT_OICR to trigger a PFINT_OICR_TSYN_TX_M software interrupt. This will
force the handler to interrupt again and complete the work even if the PHY
hardware did not interrupt for any reason.
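In code form, the tail of the IRQ thread then looks roughly like this
(register names from the existing driver; an illustration, not the exact
patch):
  /* All known work has been processed and the OICR cause re-enabled.
   * If anything slipped in meanwhile, kick ourselves with a software
   * interrupt instead of waiting for the watchdog service task.
   */
  if (ice_ptp_tx_tstamps_pending(pf))
      wr32(&pf->hw, PFINT_OICR, PFINT_OICR_TSYN_TX_M);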
This results in the following new flow for handling Tx timestamps:
1) send Tx packet
2) PHY captures timestamp
3) PHY triggers MAC interrupt
4) clock owner executes ice_misc_intr() with PFINT_OICR_TSYN_TX flag set
5) ice_ptp_ts_irq() returns IRQ_WAKE_THREAD
6) The interrupt thread wakes up and the kernel calls ice_misc_intr_thread_fn()
7) ice_ptp_process_ts() is called to handle any outstanding timestamps
8) ice_irq_dynamic_ena() is called to re-enable the OICR hardware interrupt
cause
9) ice_ptp_tx_tstamps_pending() is called to check if we missed any more
outstanding timestamps, checking both software and hardware indicators.
With this change, it should no longer be possible for new timestamps to
come in such a way that we lose an interrupt. If a timestamp comes in
before the ice_ptp_tx_tstamps_pending() call, it will be noticed by at
least one of the software bitmap check or the hardware bitmap check. If the
timestamp comes in *after* this check, it should cause a timestamp
interrupt as we have already read all timestamps from the PHY and the OICR
vector has been re-enabled.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Przemyslaw Korba <przemyslaw.korba@intel.com>
Tested-by: Vitaly Grinberg <vgrinber@redhat.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|