| Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 8ee784fdf006cbe8739cfa093f54d326cbf54037 ]
The virtio transports derives its TX credit directly from peer_buf_alloc,
which is set from the remote endpoint's SO_VM_SOCKETS_BUFFER_SIZE value.
On the host side this means that the amount of data we are willing to
queue for a connection is scaled by a guest-chosen buffer size, rather
than the host's own vsock configuration. A malicious guest can advertise
a large buffer and read slowly, causing the host to allocate a
correspondingly large amount of sk_buff memory.
The same thing would happen in the guest with a malicious host, since
virtio transports share the same code base.
Introduce a small helper, virtio_transport_tx_buf_size(), that
returns min(peer_buf_alloc, buf_alloc), and use it wherever we consume
peer_buf_alloc.
This ensures the effective TX window is bounded by both the peer's
advertised buffer and our own buf_alloc (already clamped to
buffer_max_size via SO_VM_SOCKETS_BUFFER_MAX_SIZE), so a remote peer
cannot force the other to queue more data than allowed by its own
vsock settings.
On an unpatched Ubuntu 22.04 host (~64 GiB RAM), running a PoC with
32 guest vsock connections advertising 2 GiB each and reading slowly
drove Slab/SUnreclaim from ~0.5 GiB to ~57 GiB; the system only
recovered after killing the QEMU process. That said, if QEMU memory is
limited with cgroups, the maximum memory used will be limited.
With this patch applied:
Before:
MemFree: ~61.6 GiB
Slab: ~142 MiB
SUnreclaim: ~117 MiB
After 32 high-credit connections:
MemFree: ~61.5 GiB
Slab: ~178 MiB
SUnreclaim: ~152 MiB
Only ~35 MiB increase in Slab/SUnreclaim, no host OOM, and the guest
remains responsive.
Compatibility with non-virtio transports:
- VMCI uses the AF_VSOCK buffer knobs to size its queue pairs per
socket based on the local vsk->buffer_* values; the remote side
cannot enlarge those queues beyond what the local endpoint
configured.
- Hyper-V's vsock transport uses fixed-size VMBus ring buffers and
an MTU bound; there is no peer-controlled credit field comparable
to peer_buf_alloc, and the remote endpoint cannot drive in-flight
kernel memory above those ring sizes.
- The loopback path reuses virtio_transport_common.c, so it
naturally follows the same semantics as the virtio transport.
This change is limited to virtio_transport_common.c and thus affects
virtio-vsock, vhost-vsock, and loopback, bringing them in line with the
"remote window intersected with local policy" behaviour that VMCI and
Hyper-V already effectively have.
Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Melbin K Mathew <mlbnkm1@gmail.com>
[Stefano: small adjustments after changing the previous patch]
[Stefano: tweak the commit message]
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Link: https://patch.msgid.link/20260121093628.9941-4-sgarzare@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3ef3d52a1a9860d094395c7a3e593f3aa26ff012 ]
The credit calculation in virtio_transport_get_credit() uses unsigned
arithmetic:
ret = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
If the peer shrinks its advertised buffer (peer_buf_alloc) while bytes
are in flight, the subtraction can underflow and produce a large
positive value, potentially allowing more data to be queued than the
peer can handle.
Reuse virtio_transport_has_space() which already handles this case and
add a comment to make it clear why we are doing that.
Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Melbin K Mathew <mlbnkm1@gmail.com>
[Stefano: use virtio_transport_has_space() instead of duplicating the code]
[Stefano: tweak the commit message]
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Luigi Leonardi <leonardi@redhat.com>
Link: https://patch.msgid.link/20260121093628.9941-2-sgarzare@redhat.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 19e4175e997a5b85eab97d522f00cc99abd1873c ]
This commit adds error handling and rollback logic to
rvu_mbox_handler_attach_resources() to properly clean up partially
attached resources when rvu_attach_block() fails.
Fixes: 746ea74241fa0 ("octeontx2-af: Add RVU block LF provisioning support")
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://patch.msgid.link/20260121033934.1900761-1-rkannoth@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5f9b329096596b7e53e07d041d7fca4cbe1be752 ]
After 3cbf4ffba5ee ("net: plumb network namespace into __skb_flow_dissect")
we have to provide a net pointer to __skb_flow_dissect(),
either via skb->dev, skb->sk, or a user provided pointer.
In the following case, syzbot was able to cook a bare skb.
WARNING: net/core/flow_dissector.c:1131 at __skb_flow_dissect+0xb57/0x68b0 net/core/flow_dissector.c:1131, CPU#1: syz.2.1418/11053
Call Trace:
<TASK>
bond_flow_dissect drivers/net/bonding/bond_main.c:4093 [inline]
__bond_xmit_hash+0x2d7/0xba0 drivers/net/bonding/bond_main.c:4157
bond_xmit_hash_xdp drivers/net/bonding/bond_main.c:4208 [inline]
bond_xdp_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5139 [inline]
bond_xdp_get_xmit_slave+0x1fd/0x710 drivers/net/bonding/bond_main.c:5515
xdp_master_redirect+0x13f/0x2c0 net/core/filter.c:4388
bpf_prog_run_xdp include/net/xdp.h:700 [inline]
bpf_test_run+0x6b2/0x7d0 net/bpf/test_run.c:421
bpf_prog_test_run_xdp+0x795/0x10e0 net/bpf/test_run.c:1390
bpf_prog_test_run+0x2c7/0x340 kernel/bpf/syscall.c:4703
__sys_bpf+0x562/0x860 kernel/bpf/syscall.c:6182
__do_sys_bpf kernel/bpf/syscall.c:6274 [inline]
__se_sys_bpf kernel/bpf/syscall.c:6272 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6272
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94
Fixes: 58deb77cc52d ("bonding: balance ICMP echoes in layer3+4 mode")
Reported-by: syzbot+c46409299c70a221415e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/696faa23.050a0220.4cb9c.001f.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Matteo Croce <mcroce@redhat.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260120161744.1893263-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 04708606fd7bdc34b69089a4ff848ff36d7088f9 ]
Both send_mcast4() and send_mcast6() use sleep 2 to wait for the tunnel
connection between the gateway and the relay, and for the listener
socket to be created in the LISTENER namespace.
However, tests sometimes fail because packets are sent before the
connection is fully established.
Increase the waiting time to make the tests more reliable, and use
wait_local_port_listen() to explicitly wait for the listener socket.
Fixes: c08e8baea78e ("selftests: add amt interface selftest script")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20260120133930.863845-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8215794403d264739cc676668087512950b2ff31 ]
When the parameter pmac_id_valid argument of be_cmd_get_mac_from_list() is
set to false, the driver may request the PMAC_ID from the firmware of the
network card, and this function will store that PMAC_ID at the provided
address pmac_id. This is the contract of this function.
However, there is a location within the driver where both
pmac_id_valid == false and pmac_id == NULL are being passed. This could
result in dereferencing a NULL pointer.
To resolve this issue, it is necessary to pass the address of a stub
variable to the function.
Fixes: 95046b927a54 ("be2net: refactor MAC-addr setup code")
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Link: https://patch.msgid.link/20260120113734.20193-1-a.vatoropin@crpt.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 764a90eb02268a23b1bb98be5f4a13671346804a ]
Radeon 430 and 520 are OEM GPUs from 2016~2017
They have the same device id: 0x6611 and revision: 0x87
On the Radeon 430, powertune is buggy and throttles the GPU,
never allowing it to reach its maximum SCLK. Work around this
bug by raising the TDP limits we program to the SMC from
24W (specified by the VBIOS on Radeon 430) to 32W.
Disabling powertune entirely is not a viable workaround,
because it causes the Radeon 520 to heat up above 100 C,
which I prefer to avoid.
Additionally, revise the maximum SCLK limit. Considering the
above issue, these GPUs never reached a high SCLK on Linux,
and the workarounds were added before the GPUs were released,
so the workaround likely didn't target these specifically.
Use 780 MHz (the maximum SCLK according to the VBIOS on the
Radeon 430). Note that the Radeon 520 VBIOS has a higher
maximum SCLK: 905 MHz, but in practice it doesn't seem to
perform better with the higher clock, only heats up more.
v2:
Move the workaround to si_populate_smc_tdp_limits.
Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 966d70f1e160bdfdecaf7ff2b3f22ad088516e9f)
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d5077426e1a76d269e518e048bde2e9fc49b32ad ]
There is no reason to clear the SMC table.
We also don't need to recalculate the power limit then.
Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e214d626253f5b180db10dedab161b7caa41f5e9)
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c7159e960f1472a5493ac99aff0086ab1d683594 ]
The usbnet driver initializes net->max_mtu to ETH_MAX_MTU before calling
the device's bind() callback. When the bind() callback sets
dev->hard_mtu based the device's actual capability (from CDC Ethernet's
wMaxSegmentSize descriptor), max_mtu is never updated to reflect this
hardware limitation).
This allows userspace (DHCP or IPv6 RA) to configure MTU larger than the
device can handle, leading to silent packet drops when the backend sends
packet exceeding the device's buffer size.
Fix this by limiting net->max_mtu to the device's hard_mtu after the
bind callback returns.
See https://gitlab.com/qemu-project/qemu/-/issues/3268 and
https://bugs.passt.top/attachment.cgi?bugid=189
Fixes: f77f0aee4da4 ("net: use core MTU range checking in USB NIC drivers")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Link: https://bugs.passt.top/show_bug.cgi?id=189
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Link: https://patch.msgid.link/20260119075518.2774373-1-lvivier@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9a063f96d87efc3a6cc667f8de096a3d38d74bb5 ]
syzbot found that ndisc_router_discovery() could read and write
in6_dev->ra_mtu without holding a lock [1]
This looks fine, IFLA_INET6_RA_MTU is best effort.
Add READ_ONCE()/WRITE_ONCE() to document the race.
Note that we might also reject illegal MTU values
(mtu < IPV6_MIN_MTU || mtu > skb->dev->mtu) in a future patch.
[1]
BUG: KCSAN: data-race in ndisc_router_discovery / ndisc_router_discovery
read to 0xffff888119809c20 of 4 bytes by task 25817 on cpu 1:
ndisc_router_discovery+0x151d/0x1c90 net/ipv6/ndisc.c:1558
ndisc_rcv+0x2ad/0x3d0 net/ipv6/ndisc.c:1841
icmpv6_rcv+0xe5a/0x12f0 net/ipv6/icmp.c:989
ip6_protocol_deliver_rcu+0xb2a/0x10d0 net/ipv6/ip6_input.c:438
ip6_input_finish+0xf0/0x1d0 net/ipv6/ip6_input.c:489
NF_HOOK include/linux/netfilter.h:318 [inline]
ip6_input+0x5e/0x140 net/ipv6/ip6_input.c:500
ip6_mc_input+0x27c/0x470 net/ipv6/ip6_input.c:590
dst_input include/net/dst.h:474 [inline]
ip6_rcv_finish+0x336/0x340 net/ipv6/ip6_input.c:79
...
write to 0xffff888119809c20 of 4 bytes by task 25816 on cpu 0:
ndisc_router_discovery+0x155a/0x1c90 net/ipv6/ndisc.c:1559
ndisc_rcv+0x2ad/0x3d0 net/ipv6/ndisc.c:1841
icmpv6_rcv+0xe5a/0x12f0 net/ipv6/icmp.c:989
ip6_protocol_deliver_rcu+0xb2a/0x10d0 net/ipv6/ip6_input.c:438
ip6_input_finish+0xf0/0x1d0 net/ipv6/ip6_input.c:489
NF_HOOK include/linux/netfilter.h:318 [inline]
ip6_input+0x5e/0x140 net/ipv6/ip6_input.c:500
ip6_mc_input+0x27c/0x470 net/ipv6/ip6_input.c:590
dst_input include/net/dst.h:474 [inline]
ip6_rcv_finish+0x336/0x340 net/ipv6/ip6_input.c:79
...
value changed: 0x00000000 -> 0xe5400659
Fixes: 49b99da2c9ce ("ipv6: add IFLA_INET6_RA_MTU to expose mtu value")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rocco Yue <rocco.yue@mediatek.com>
Link: https://patch.msgid.link/20260118152941.2563857-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8175dbf174d487afab81e936a862a8d9b8a1ccb6 ]
dev->work can re read locklessly in mISDN_read()
and mISDN_poll(). Add READ_ONCE()/WRITE_ONCE() annotations.
BUG: KCSAN: data-race in mISDN_ioctl / mISDN_read
write to 0xffff88812d848280 of 4 bytes by task 10864 on cpu 1:
misdn_add_timer drivers/isdn/mISDN/timerdev.c:175 [inline]
mISDN_ioctl+0x2fb/0x550 drivers/isdn/mISDN/timerdev.c:233
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xce/0x140 fs/ioctl.c:583
__x64_sys_ioctl+0x43/0x50 fs/ioctl.c:583
x64_sys_call+0x14b0/0x3000 arch/x86/include/generated/asm/syscalls_64.h:17
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd8/0x2c0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
read to 0xffff88812d848280 of 4 bytes by task 10857 on cpu 0:
mISDN_read+0x1f2/0x470 drivers/isdn/mISDN/timerdev.c:112
do_loop_readv_writev fs/read_write.c:847 [inline]
vfs_readv+0x3fb/0x690 fs/read_write.c:1020
do_readv+0xe7/0x210 fs/read_write.c:1080
__do_sys_readv fs/read_write.c:1165 [inline]
__se_sys_readv fs/read_write.c:1162 [inline]
__x64_sys_readv+0x45/0x50 fs/read_write.c:1162
x64_sys_call+0x2831/0x3000 arch/x86/include/generated/asm/syscalls_64.h:20
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd8/0x2c0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
value changed: 0x00000000 -> 0x00000001
Fixes: 1b2b03f8e514 ("Add mISDN core files")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260118132528.2349573-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f87e034d16e43af984380a95c32c25201b7759a7 ]
Use next_input_key instead of counter_id to set HCLGE_FD_AD_NXT_KEY.
Fixes: 117328680288 ("net: hns3: Add input key and action config support for flow director")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20260119132840.410513-3-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d57c67c956a1bad15115eba6e59d77a6dfeba01d ]
HCLGE_FD_AD_COUNTER_NUM_M should be at GENMASK(19, 13),
rather than at GENMASK(20, 13), because bit 20 is
HCLGE_FD_AD_NXT_STEP_B.
This patch corrects the wrong definition.
Fixes: 117328680288 ("net: hns3: Add input key and action config support for flow director")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20260119132840.410513-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b97d5eedf4976cc94321243be83b39efe81a0e15 ]
The netdevsim driver lacks a protection mechanism for operations on the
bpf_bound_progs list. When the nsim_bpf_create_prog() performs
list_add_tail, it is possible that nsim_bpf_destroy_prog() is
simultaneously performs list_del. Concurrent operations on the list may
lead to list corruption and trigger a kernel crash as follows:
[ 417.290971] kernel BUG at lib/list_debug.c:62!
[ 417.290983] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 417.290992] CPU: 10 PID: 168 Comm: kworker/10:1 Kdump: loaded Not tainted 6.19.0-rc5 #1
[ 417.291003] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 417.291007] Workqueue: events bpf_prog_free_deferred
[ 417.291021] RIP: 0010:__list_del_entry_valid_or_report+0xa7/0xc0
[ 417.291034] Code: a8 ff 0f 0b 48 89 fe 48 89 ca 48 c7 c7 48 a1 eb ae e8 ed fb a8 ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 80 a1 eb ae e8 d9 fb a8 ff <0f> 0b 48 89 d1 48 c7 c7 d0 a1 eb ae 48 89 f2 48 89 c6 e8 c2 fb a8
[ 417.291040] RSP: 0018:ffffb16a40807df8 EFLAGS: 00010246
[ 417.291046] RAX: 000000000000006d RBX: ffff8e589866f500 RCX: 0000000000000000
[ 417.291051] RDX: 0000000000000000 RSI: ffff8e59f7b23180 RDI: ffff8e59f7b23180
[ 417.291055] RBP: ffffb16a412c9000 R08: 0000000000000000 R09: 0000000000000003
[ 417.291059] R10: ffffb16a40807c80 R11: ffffffffaf9edce8 R12: ffff8e594427ac20
[ 417.291063] R13: ffff8e59f7b44780 R14: ffff8e58800b7a05 R15: 0000000000000000
[ 417.291074] FS: 0000000000000000(0000) GS:ffff8e59f7b00000(0000) knlGS:0000000000000000
[ 417.291079] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 417.291083] CR2: 00007fc4083efe08 CR3: 00000001c3626006 CR4: 0000000000770ee0
[ 417.291088] PKRU: 55555554
[ 417.291091] Call Trace:
[ 417.291096] <TASK>
[ 417.291103] nsim_bpf_destroy_prog+0x31/0x80 [netdevsim]
[ 417.291154] __bpf_prog_offload_destroy+0x2a/0x80
[ 417.291163] bpf_prog_dev_bound_destroy+0x6f/0xb0
[ 417.291171] bpf_prog_free_deferred+0x18e/0x1a0
[ 417.291178] process_one_work+0x18a/0x3a0
[ 417.291188] worker_thread+0x27b/0x3a0
[ 417.291197] ? __pfx_worker_thread+0x10/0x10
[ 417.291207] kthread+0xe5/0x120
[ 417.291214] ? __pfx_kthread+0x10/0x10
[ 417.291221] ret_from_fork+0x31/0x50
[ 417.291230] ? __pfx_kthread+0x10/0x10
[ 417.291236] ret_from_fork_asm+0x1a/0x30
[ 417.291246] </TASK>
Add a mutex lock, to prevent simultaneous addition and deletion operations
on the list.
Fixes: 31d3ad832948 ("netdevsim: add bpf offload support")
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Signed-off-by: Yun Lu <luyun@kylinos.cn>
Link: https://patch.msgid.link/20260116095308.11441-1-luyun_611@163.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6b971191fcfc9e3c2c0143eea22534f1f48dbb62 ]
On at least the HyperX Cloud III, the range is 18944 (-18944 -> 0 in
steps of 1), so the original check for 255 steps is definitely obsolete.
Let's give ourselves a little more headroom before we emit a warning.
Fixes: 80acefff3bc7 ("ALSA: usb-audio - Add volume range check and warn if it too big")
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: linux-sound@vger.kernel.org
Signed-off-by: Arun Raghavan <arunr@valvesoftware.com>
Link: https://patch.msgid.link/20260116225804.3845935-1-arunr@valvesoftware.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
each other
[ Upstream commit fe2f8ad6f0999db3b318359a01ee0108c703a8c3 ]
The fragile ordering between marking commands completed or failed so
that the error handler only wakes when the last running command
completes or times out has race conditions. These race conditions can
cause the SCSI layer to fail to wake the error handler, leaving I/O
through the SCSI host stuck as the error state cannot advance.
First, there is an memory ordering issue within scsi_dec_host_busy().
The write which clears SCMD_STATE_INFLIGHT may be reordered with reads
counting in scsi_host_busy(). While the local CPU will see its own
write, reordering can allow other CPUs in scsi_dec_host_busy() or
scsi_eh_inc_host_failed() to see a raised busy count, causing no CPU to
see a host busy equal to the host_failed count.
This race condition can be prevented with a memory barrier on the error
path to force the write to be visible before counting host busy
commands.
Second, there is a general ordering issue with scsi_eh_inc_host_failed(). By
counting busy commands before incrementing host_failed, it can race with a
final command in scsi_dec_host_busy(), such that scsi_dec_host_busy() does
not see host_failed incremented but scsi_eh_inc_host_failed() counts busy
commands before SCMD_STATE_INFLIGHT is cleared by scsi_dec_host_busy(),
resulting in neither waking the error handler task.
This needs the call to scsi_host_busy() to be moved after host_failed is
incremented to close the race condition.
Fixes: 6eb045e092ef ("scsi: core: avoid host-wide host_busy counter for scsi_mq")
Signed-off-by: David Jeffery <djeffery@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260113161036.6730-1-djeffery@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit eaa9bb1d39d59e7c17b06cec12622b7c586ab629 ]
On RV32, updating the 64-bit stimecmp (or vstimecmp) CSR requires two
separate 32-bit writes. A race condition exists if the timer triggers
during these two writes.
The RISC-V Privileged Specification (e.g., Section 3.2.1 for mtimecmp)
recommends a specific 3-step sequence to avoid spurious interrupts
when updating 64-bit comparison registers on 32-bit systems:
1. Set the low-order bits (stimecmp) to all ones (ULONG_MAX).
2. Set the high-order bits (stimecmph) to the desired value.
3. Set the low-order bits (stimecmp) to the desired value.
Current implementation writes the LSB first without ensuring a future
value, which may lead to a transient state where the 64-bit comparison
is incorrectly evaluated as "expired" by the hardware. This results in
spurious timer interrupts.
This patch adopts the spec-recommended 3-step sequence to ensure the
intermediate 64-bit state is never smaller than the current time.
Fixes: 9f7a8ff6391f ("RISC-V: Prefer sstc extension if available")
Signed-off-by: Naohiko Shimizu <naohiko.shimizu@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://patch.msgid.link/20260104135938.524-2-naohiko.shimizu@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4b58aac989c1e3fafb1c68a733811859df388250 ]
Previously, the address of the shared member '&map->spinlock_flags' was
passed directly to 'hwspin_lock_timeout_irqsave'. This creates a race
condition where multiple contexts contending for the lock could overwrite
the shared flags variable, potentially corrupting the state for the
current lock owner.
Fix this by using a local stack variable 'flags' to store the IRQ state
temporarily.
Fixes: 8698b9364710 ("regmap: Add hardware spinlock support")
Signed-off-by: Cheng-Yu Lee <cylee12@realtek.com>
Co-developed-by: Yu-Chun Lin <eleanor.lin@realtek.com>
Signed-off-by: Yu-Chun Lin <eleanor.lin@realtek.com>
Link: https://patch.msgid.link/20260109032633.8732-1-eleanor.lin@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 383d4f5cffcc8df930d95b06518a9d25a6d74aac ]
The driver currently uses spi_alloc_host() to allocate the controller
but registers it using devm_spi_register_controller().
If devm_register_restart_handler() fails, the code jumps to the
put_ctlr label and calls spi_controller_put(). However, since the
controller was registered via a devm function, the device core will
automatically call spi_controller_put() again when the probe fails.
This results in a double-free of the spi_controller structure.
Fix this by switching to devm_spi_alloc_host() and removing the
manual spi_controller_put() call.
Fixes: ac17750 ("spi: sprd: Add the support of restarting the system")
Signed-off-by: Felix Gu <gu_0233@qq.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://patch.msgid.link/tencent_AC7D389CE7E24318445E226F7CDCCC2F0D07@qq.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0a3d087d09a8f52c02d0014bad63be99c53c4812 ]
Switch to use modern name function spi_alloc_host().
No functional changed.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://msgid.link/r/20231128093031.3707034-2-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 383d4f5cffcc ("spi: spi-sprd-adi: Fix double free in probe error path")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8e6a43961f24cf841d3c0d199521d0b284d948b9 ]
Use device life-cycle managed register function to simplify probe error
path and eliminate need for explicit remove function.
Signed-off-by: Andrew Davis <afd@ti.com>
Link: https://lore.kernel.org/r/20231117161006.87734-5-afd@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 383d4f5cffcc ("spi: spi-sprd-adi: Fix double free in probe error path")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8499d4b5970f5fd135ee8860075768562a5efe70 ]
According to commit 890cc39a8799 ("drivers: provide
devm_platform_get_and_ioremap_resource()"), convert
platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230327060516.93509-1-yang.lee@linux.alibaba.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 383d4f5cffcc ("spi: spi-sprd-adi: Fix double free in probe error path")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f7f785f125d03360d3766d96d04cf08b8472ce8f ]
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is (mostly) ignored
and this typically results in resource leaks. To improve here there is a
quest to make the remove callback return void. In the first step of this
quest all drivers are converted to .remove_new() which already returns
void.
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230303172041.2103336-71-u.kleine-koenig@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 383d4f5cffcc ("spi: spi-sprd-adi: Fix double free in probe error path")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6b39824ac4c15783787e6434449772bfb2e31214 ]
The probe() function ignored the return value of spi_setup(), leaving SPI
configuration failures undetected. If spi_setup() fails, the driver should
stop initialization and propagate the error to the caller.
Add proper error handling: check the return value of spi_setup() and return
it on failure.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 2051f25d2a26 ("iio: adc: New driver for AD7280A Lithium Ion Battery Monitoring System")
Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com>
Reviewed-by: Marcelo Schmitt <marcelo.schmitt@analog.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 10d28cffb3f6ec7ad67f0a4cd32c2afa92909452 upstream.
The `COMEDI_RANGEINFO` ioctl does not work properly for subdevice
indices above 15. Currently, the only in-tree COMEDI drivers that
support more than 16 subdevices are the "8255" driver and the
"comedi_bond" driver. Making the ioctl work for subdevice indices up to
255 is achievable. It needs minor changes to the handling of the
`COMEDI_RANGEINFO` and `COMEDI_CHANINFO` ioctls that should be mostly
harmless to user-space, apart from making them less broken. Details
follow...
The `COMEDI_RANGEINFO` ioctl command gets the list of supported ranges
(usually with units of volts or milliamps) for a COMEDI subdevice or
channel. (Only some subdevices have per-channel range tables, indicated
by the `SDF_RANGETYPE` flag in the subdevice information.) It uses a
`range_type` value and a user-space pointer, both supplied by
user-space, but the `range_type` value should match what was obtained
using the `COMEDI_CHANINFO` ioctl (if the subdevice has per-channel
range tables) or `COMEDI_SUBDINFO` ioctl (if the subdevice uses a
single range table for all channels). Bits 15 to 0 of the `range_type`
value contain the length of the range table, which is the only part that
user-space should care about (so it can use a suitably sized buffer to
fetch the range table). Bits 23 to 16 store the channel index, which is
assumed to be no more than 255 if the subdevice has per-channel range
tables, and is set to 0 if the subdevice has a single range table. For
`range_type` values produced by the `COMEDI_SUBDINFO` ioctl, bits 31 to
24 contain the subdevice index, which is assumed to be no more than 255.
But for `range_type` values produced by the `COMEDI_CHANINFO` ioctl,
bits 27 to 24 contain the subdevice index, which is assumed to be no
more than 15, and bits 31 to 28 contain the COMEDI device's minor device
number for some unknown reason lost in the mists of time. The
`COMEDI_RANGEINFO` ioctl extract the length from bits 15 to 0 of the
user-supplied `range_type` value, extracts the channel index from bits
23 to 16 (only used if the subdevice has per-channel range tables),
extracts the subdevice index from bits 27 to 24, and ignores bits 31 to
28. So for subdevice indices 16 to 255, the `COMEDI_SUBDINFO` or
`COMEDI_CHANINFO` ioctl will report a `range_type` value that doesn't
work with the `COMEDI_RANGEINFO` ioctl. It will either get the range
table for the subdevice index modulo 16, or will fail with `-EINVAL`.
To fix this, always use bits 31 to 24 of the `range_type` value to hold
the subdevice index (assumed to be no more than 255). This affects the
`COMEDI_CHANINFO` and `COMEDI_RANGEINFO` ioctls. There should not be
anything in user-space that depends on the old, broken usage, although
it may now see different values in bits 31 to 28 of the `range_type`
values reported by the `COMEDI_CHANINFO` ioctl for subdevices that have
per-channel subdevices. User-space should not be trying to decode bits
31 to 16 of the `range_type` values anyway.
Fixes: ed9eccbe8970 ("Staging: add comedi core")
Cc: stable@vger.kernel.org #5.17+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://patch.msgid.link/20251203162438.176841-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b505f1944535f83d369ae68813e7634d11b990d3 upstream.
For native, the choice of PTE is fine. There's real memory backing the
non-present PTE. However, for XenPV, Xen complains:
(XEN) d1 L1TF-vulnerable L1e 8010000018200066 - Shadowing
To explain, some background on XenPV pagetables:
Xen PV guests are control their own pagetables; they choose the new
PTE value, and use hypercalls to make changes so Xen can audit for
safety.
In addition to a regular reference count, Xen also maintains a type
reference count. e.g. SegDesc (referenced by vGDT/vLDT), Writable
(referenced with _PAGE_RW) or L{1..4} (referenced by vCR3 or a lower
pagetable level). This is in order to prevent e.g. a page being
inserted into the pagetables for which the guest has a writable mapping.
For non-present mappings, all other bits become software accessible,
and typically contain metadata rather a real frame address. There is
nothing that a reference count could sensibly be tied to. As such, even
if Xen could recognise the address as currently safe, nothing would
prevent that frame from changing owner to another VM in the future.
When Xen detects a PV guest writing a L1TF-PTE, it responds by
activating shadow paging. This is normally only used for the live phase
of migration, and comes with a reasonable overhead.
KFENCE only cares about getting #PF to catch wild accesses; it doesn't
care about the value for non-present mappings. Use a fully inverted PTE,
to avoid hitting the slow path when running under Xen.
While adjusting the logic, take the opportunity to skip all actions if the
PTE is already in the right state, half the number PVOps callouts, and
skip TLB maintenance on a !P -> P transition which benefits non-Xen cases
too.
Link: https://lkml.kernel.org/r/20260106180426.710013-1-andrew.cooper3@citrix.com
Fixes: 1dc0da6e9ec0 ("x86, kfence: enable KFENCE for x86")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jann Horn <jannh@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0368e4afcf20f377c81fa77b1c7d0dee4a625a44 upstream.
Shawn Lin from Rockchip strongly discourages attempts to use their
RK3399 PCIe core at 5.0 GT/s speed, citing concerns about catastrophic
failures that may happen. Even if the odds are low, drop from last user
of this non-default property for the RK3399 platform, helios64 board
dts.
Fixes: 755fff528b1b ("arm64: dts: rockchip: add variables for pcie completion to helios64")
Link: https://lore.kernel.org/all/e8524bf8-a90c-423f-8a58-9ef05a3db1dd@rock-chips.com/
Cc: stable@vger.kernel.org
Reported-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Geraldo Nascimento <geraldogabriel@gmail.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://patch.msgid.link/43bb639c120f599106fca2deee6c6599b2692c5c.1763415706.git.geraldogabriel@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9eacec5d18f98f89be520eeeef4b377acee3e4b8 upstream.
The Hyper-V host does not support MODE_SENSE_10 and MODE_SENSE. The
driver handles MODE_SENSE as unsupported command, but not for
MODE_SENSE_10. Add MODE_SENSE_10 to the same handling logic and return
correct code to SCSI layer.
Fixes: 89ae7d709357 ("Staging: hv: storvsc: Move the storage driver out of the staging area")
Cc: stable@kernel.org
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20260117010302.294068-1-longli@linux.microsoft.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2934325f56150ad8dab8ab92cbe2997242831396 upstream.
The ASUS Zenbook UX425QA_UM425QA fails to initialize the keyboard after
a cold boot.
A quirk already exists for "ZenBook UX425", but some Zenbooks report
"Zenbook" with a lowercase 'b'. Since DMI matching is case-sensitive,
the existing quirk is not applied to these "extra special" Zenbooks.
Testing confirms that this model needs the same quirks as the ZenBook
UX425 variants.
Signed-off-by: feng <alec.jiang@gmail.com>
Link: https://patch.msgid.link/20260122013957.11184-1-alec.jiang@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 19a5d9ba6208e9006a2a9d5962aea4d6e427d8ab upstream.
The MECHREVO Wujie 15X Pro requires several i8042 quirks to function
correctly. Specifically, NOMUX, RESET_ALWAYS, NOLOOP, and NOPNP are
needed to ensure the keyboard and touchpad work reliably.
Signed-off-by: gongqi <550230171hxy@gmail.com>
Link: https://patch.msgid.link/20260122155501.376199-3-550230171hxy@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
and count"
commit f40ddcc0c0ca1a0122a7f4440b429f97d5832bdf upstream.
This reverts commit 068648aab72c9ba7b0597354ef4d81ffaac7b979.
NFC packets may have NUL-bytes. Checking for string length is not a correct
assumption here. As long as there is a check for the length copied from
copy_from_user, all should be fine.
The fix only prevented the syzbot reproducer from triggering the bug
because the packet is not enqueued anymore and the code that triggers the
bug is not exercised.
The fix even broke
testing/selftests/nci/nci_dev, making all tests there fail. After the
revert, 6 out of 8 tests pass.
Fixes: 068648aab72c ("nfc/nci: Add the inconsistency check between the input data length and count")
Cc: stable@vger.kernel.org
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Link: https://patch.msgid.link/20260113202458.449455-1-cascardo@igalia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cc8f92e41eb76f450f05234fef2054afc3633100 upstream.
In w1_attach_slave_device(), if __w1_attach_slave_device() fails,
put_device() -> w1_slave_release() is called to do the cleanup job.
In w1_slave_release(), sl->family->refcnt and sl->master->slave_count
have already been decremented. There is no need to decrement twice
in w1_attach_slave_device().
Fixes: 2c927c0c73fd ("w1: Fix slave count on 1-Wire bus (resend)")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Link: https://patch.msgid.link/20251218111414.564403-1-lihaoxiang@isrc.iscas.ac.cn
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 761fcf46a1bd797bd32d23f3ea0141ffd437668a upstream.
The sysfs buffer passed to alarms_store() is allocated with 'size + 1'
bytes and a NUL terminator is appended. However, the 'size' argument
does not account for this extra byte. The original code then allocated
'size' bytes and used strcpy() to copy 'buf', which always writes one
byte past the allocated buffer since strcpy() copies until the NUL
terminator at index 'size'.
Fix this by parsing the 'buf' parameter directly using simple_strtoll()
without allocating any intermediate memory or string copying. This
removes the overflow while simplifying the code.
Cc: stable@vger.kernel.org
Fixes: e2c94d6f5720 ("w1_therm: adding alarm sysfs entry")
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20251216145007.44328-2-thorsten.blum@linux.dev
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e03b29b55f2b7c345a919a6ee36633b06bf3fb56 upstream.
Some of the hardware registers of the DMM-32-AT board are multiplexed,
using the least significant two bits of the Miscellaneous Control
register to select the function of registers at offsets 12 to 15:
00 => 8254 timer/counter registers are accessible
01 => 8255 digital I/O registers are accessible
10 => Reserved
11 => Calibration registers are accessible
The interrupt service routine (`dmm32at_isr()`) clobbers the bottom two
bits of the register with value 00, which would interfere with access to
the 8255 registers by the `dm32at_8255_io()` function (used for Comedi
instruction handling on the digital I/O subdevice).
Make use of the generic Comedi device spin-lock `dev->spinlock` (which
is otherwise unused by this driver) to serialize access to the
miscellaneous control register and paged registers.
Fixes: 3c501880ac44 ("Staging: comedi: add dmm32at driver")
Cc: stable@vger.kernel.org
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://patch.msgid.link/20260112162835.91688-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 27aff0a56b3c77ea1a73641c9b3c4172a8f7238f upstream.
Fintek F81504/508/512 can support both RTS_ON_SEND and RTS_AFTER_SEND,
but pci_fintek_rs485_supported only announces the former.
This makes it impossible to unset SER_RS485_RTS_ON_SEND from
userspace because of uart_sanitize_serial_rs485(). Some devices
with these chips need RTS low on TX, so they are effectively broken.
Fix this by announcing the support for SER_RS485_RTS_AFTER_SEND,
similar to commit 068d35a7be65 ("serial: sc16is7xx: announce support
for SER_RS485_RTS_ON_SEND").
Fixes: 4afeced55baa ("serial: core: fix sanitizing check for RTS settings")
Cc: stable <stable@kernel.org>
Signed-off-by: Marnix Rijnart <marnix.rijnart@iwell.eu>
Link: https://patch.msgid.link/20260112000931.61703-1-marnix.rijnart@iwell.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2397e9264676be7794f8f7f1e9763d90bd3c7335 ]
authencesn assumes an ESP/ESN-formatted AAD. When assoclen is shorter than
the minimum expected length, crypto_authenc_esn_decrypt() can advance past
the end of the destination scatterlist and trigger a NULL pointer dereference
in scatterwalk_map_and_copy(), leading to a kernel panic (DoS).
Add a minimum AAD length check to fail fast on invalid inputs.
Fixes: 104880a6b470 ("crypto: authencesn - Convert to new AEAD interface")
Reported-By: Taeyang Lee <0wn@theori.io>
Signed-off-by: Taeyang Lee <0wn@theori.io>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
qfq_rm_from_ag
[ Upstream commit d837fbee92453fbb829f950c8e7cf76207d73f33 ]
This is more of a preventive patch to make the code more consistent and
to prevent possible exploits that employ child qlen manipulations on qfq.
use cl_is_active instead of relying on the child qdisc's qlen to determine
class activation.
Fixes: 462dbc9101acd ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260114160243.913069-3-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 50da4b9d07a7a463e2cfb738f3ad4cff6b2c9c3b ]
Design intent of teql is that it is only supposed to be used as root qdisc.
We need to check for that constraint.
Although not important, I will describe the scenario that unearthed this
issue for the curious.
GangMin Kim <km.kim1503@gmail.com> managed to concot a scenario as follows:
ROOT qdisc 1:0 (QFQ)
├── class 1:1 (weight=15, lmax=16384) netem with delay 6.4s
└── class 1:2 (weight=1, lmax=1514) teql
GangMin sends a packet which is enqueued to 1:1 (netem).
Any invocation of dequeue by QFQ from this class will not return a packet
until after 6.4s. In the meantime, a second packet is sent and it lands on
1:2. teql's enqueue will return success and this will activate class 1:2.
Main issue is that teql only updates the parent visible qlen (sch->q.qlen)
at dequeue. Since QFQ will only call dequeue if peek succeeds (and teql's
peek always returns NULL), dequeue will never be called and thus the qlen
will remain as 0. With that in mind, when GangMin updates 1:2's lmax value,
the qfq_change_class calls qfq_deact_rm_from_agg. Since the child qdisc's
qlen was not incremented, qfq fails to deactivate the class, but still
frees its pointers from the aggregate. So when the first packet is
rescheduled after 6.4 seconds (netem's delay), a dangling pointer is
accessed causing GangMin's causing a UAF.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: GangMin Kim <km.kim1503@gmail.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260114160243.913069-2-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ab9b218a1521133a4410722907fa7189566be9bc ]
The RX flowid programming initializes the TCAM mask to all ones, but
then overwrites it when clearing the MAC DA mask bits. This results
in losing the intended initialization and may affect other match fields.
Update the code to clear the MAC DA bits using an AND operation, making
the handling of mask[0] consistent with mask[1], where the field-specific
bits are cleared after initializing the mask to ~0ULL.
Fixes: 57d00d4364f3 ("octeontx2-pf: mcs: Match macsec ethertype along with DMAC")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/20260116164724.2733511-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d3ba32162488283c0a4c5bedd8817aec91748802 ]
Make the addrs_lock be per port, not per ipvlan dev.
Initial code seems to be written in the assumption,
that any address change must occur under RTNL.
But it is not so for the case of IPv6. So
1) Introduce per-port addrs_lock.
2) It was needed to fix places where it was forgotten
to take lock (ipvlan_open/ipvlan_close)
This appears to be a very minor problem though.
Since it's highly unlikely that ipvlan_add_addr() will
be called on 2 CPU simultaneously. But nevertheless,
this could cause:
1) False-negative of ipvlan_addr_busy(): one interface
iterated through all port->ipvlans + ipvlan->addrs
under some ipvlan spinlock, and another added IP
under its own lock. Though this is only possible
for IPv6, since looks like only ipvlan_addr6_event() can be
called without rtnl_lock.
2) Race since ipvlan_ht_addr_add(port) is called under
different ipvlan->addrs_lock locks
This should not affect performance, since add/remove IP
is a rare situation and spinlock is not taken on fast
paths.
Fixes: 8230819494b3 ("ipvlan: use per device spinlock to protect addrs list updates")
Signed-off-by: Dmitry Skorodumov <skorodumov.dmitry@huawei.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20260112142417.4039566-2-skorodumov.dmitry@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7a29f6bf60f2590fe5e9c4decb451e19afad2bcf ]
We should read sk->sk_socket only when dealing with kernel sockets.
syzbot reported the following data-race:
BUG: KCSAN: data-race in l2tp_tunnel_del_work / sk_common_release
write to 0xffff88811c182b20 of 8 bytes by task 5365 on cpu 0:
sk_set_socket include/net/sock.h:2092 [inline]
sock_orphan include/net/sock.h:2118 [inline]
sk_common_release+0xae/0x230 net/core/sock.c:4003
udp_lib_close+0x15/0x20 include/net/udp.h:325
inet_release+0xce/0xf0 net/ipv4/af_inet.c:437
__sock_release net/socket.c:662 [inline]
sock_close+0x6b/0x150 net/socket.c:1455
__fput+0x29b/0x650 fs/file_table.c:468
____fput+0x1c/0x30 fs/file_table.c:496
task_work_run+0x131/0x1a0 kernel/task_work.c:233
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
__exit_to_user_mode_loop kernel/entry/common.c:44 [inline]
exit_to_user_mode_loop+0x1fe/0x740 kernel/entry/common.c:75
__exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
do_syscall_64+0x1e1/0x2b0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
read to 0xffff88811c182b20 of 8 bytes by task 827 on cpu 1:
l2tp_tunnel_del_work+0x2f/0x1a0 net/l2tp/l2tp_core.c:1418
process_one_work kernel/workqueue.c:3257 [inline]
process_scheduled_works+0x4ce/0x9d0 kernel/workqueue.c:3340
worker_thread+0x582/0x770 kernel/workqueue.c:3421
kthread+0x489/0x510 kernel/kthread.c:463
ret_from_fork+0x149/0x290 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
value changed: 0xffff88811b818000 -> 0x0000000000000000
Fixes: d00fa9adc528 ("l2tp: fix races with tunnel socket close")
Reported-by: syzbot+7312e82745f7fa2526db@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6968b029.050a0220.58bed.0016.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20260115092139.3066180-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7a9bc9e3f42391e4c187e099263cf7a1c4b69ff5 ]
fou_udp_recv() has the same problem mentioned in the previous
patch.
If FOU_ATTR_IPPROTO is set to 0, skb is not freed by
fou_udp_recv() nor "resubmit"-ted in ip_protocol_deliver_rcu().
Let's forbid 0 for FOU_ATTR_IPPROTO.
Fixes: 23461551c0062 ("fou: Support for foo-over-udp RX path")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260115172533.693652-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1d562c32e4392cc091c940918ee1ffd7bfcb9e96 ]
Generate and plug in the spec-based tables.
A little bit of renaming is needed in the FOU code.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 7a9bc9e3f423 ("fou: Don't allow 0 for FOU_ATTR_IPPROTO.")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 08d323234d10eab077cbf0093eeb5991478a261a ]
We'll need to link two objects together to form the fou module.
This means the source can't be called fou, the build system expects
fou.o to be the combined object.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 7a9bc9e3f423 ("fou: Don't allow 0 for FOU_ATTR_IPPROTO.")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4eb77b4ecd3c5eaab83adf76e67e0a7ed2a24418 ]
FOU has a reasonably modern Genetlink family. Add a spec.
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 7a9bc9e3f423 ("fou: Don't allow 0 for FOU_ATTR_IPPROTO.")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9a56796ad258786d3624eef5aefba394fc9bdded ]
syzbot reported skb memleak below. [0]
The repro generated a GUE packet with its inner protocol 0.
gue_udp_recv() returns -guehdr->proto_ctype for "resubmit"
in ip_protocol_deliver_rcu(), but this only works with
non-zero protocol number.
Let's drop such packets.
Note that 0 is a valid number (IPv6 Hop-by-Hop Option).
I think it is not practical to encap HOPOPT in GUE, so once
someone starts to complain, we could pass down a resubmit
flag pointer to distinguish two zeros from the upper layer:
* no error
* resubmit HOPOPT
[0]
BUG: memory leak
unreferenced object 0xffff888109695a00 (size 240):
comm "syz.0.17", pid 6088, jiffies 4294943096
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 40 c2 10 81 88 ff ff 00 00 00 00 00 00 00 00 .@..............
backtrace (crc a84b336f):
kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline]
slab_post_alloc_hook mm/slub.c:4958 [inline]
slab_alloc_node mm/slub.c:5263 [inline]
kmem_cache_alloc_noprof+0x3b4/0x590 mm/slub.c:5270
__build_skb+0x23/0x60 net/core/skbuff.c:474
build_skb+0x20/0x190 net/core/skbuff.c:490
__tun_build_skb drivers/net/tun.c:1541 [inline]
tun_build_skb+0x4a1/0xa40 drivers/net/tun.c:1636
tun_get_user+0xc12/0x2030 drivers/net/tun.c:1770
tun_chr_write_iter+0x71/0x120 drivers/net/tun.c:1999
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x45d/0x710 fs/read_write.c:686
ksys_write+0xa7/0x170 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xa4/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 37dd0247797b1 ("gue: Receive side for Generic UDP Encapsulation")
Reported-by: syzbot+4d8c7d16b0e95c0d0f0d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6965534b.050a0220.38aacd.0001.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260115172533.693652-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c158f985cf6c2c36c99c4f67af2ff3f5ebe09f8f ]
On the receive path, packet can be damaged because of buffer
overflow in Rx FIFO. Avoid misleading per-packet error log when
packet->errors is set, this can flood the log. Instead, rely on the
standard rtnl_link_stats64 stats.
Fixes: c5aa9e3b8156 ("amd-xgbe: Initial AMD 10GbE platform driver")
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Link: https://patch.msgid.link/20260114163037.2062606-1-Raju.Rangoju@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a80c9d945aef55b23b54838334345f20251dad83 ]
A null-ptr-deref was reported in the SCTP transmit path when SCTP-AUTH key
initialization fails:
==================================================================
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
CPU: 0 PID: 16 Comm: ksoftirqd/0 Tainted: G W 6.6.0 #2
RIP: 0010:sctp_packet_bundle_auth net/sctp/output.c:264 [inline]
RIP: 0010:sctp_packet_append_chunk+0xb36/0x1260 net/sctp/output.c:401
Call Trace:
sctp_packet_transmit_chunk+0x31/0x250 net/sctp/output.c:189
sctp_outq_flush_data+0xa29/0x26d0 net/sctp/outqueue.c:1111
sctp_outq_flush+0xc80/0x1240 net/sctp/outqueue.c:1217
sctp_cmd_interpreter.isra.0+0x19a5/0x62c0 net/sctp/sm_sideeffect.c:1787
sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
sctp_do_sm+0x1a3/0x670 net/sctp/sm_sideeffect.c:1169
sctp_assoc_bh_rcv+0x33e/0x640 net/sctp/associola.c:1052
sctp_inq_push+0x1dd/0x280 net/sctp/inqueue.c:88
sctp_rcv+0x11ae/0x3100 net/sctp/input.c:243
sctp6_rcv+0x3d/0x60 net/sctp/ipv6.c:1127
The issue is triggered when sctp_auth_asoc_init_active_key() fails in
sctp_sf_do_5_1C_ack() while processing an INIT_ACK. In this case, the
command sequence is currently:
- SCTP_CMD_PEER_INIT
- SCTP_CMD_TIMER_STOP (T1_INIT)
- SCTP_CMD_TIMER_START (T1_COOKIE)
- SCTP_CMD_NEW_STATE (COOKIE_ECHOED)
- SCTP_CMD_ASSOC_SHKEY
- SCTP_CMD_GEN_COOKIE_ECHO
If SCTP_CMD_ASSOC_SHKEY fails, asoc->shkey remains NULL, while
asoc->peer.auth_capable and asoc->peer.peer_chunks have already been set by
SCTP_CMD_PEER_INIT. This allows a DATA chunk with auth = 1 and shkey = NULL
to be queued by sctp_datamsg_from_user().
Since command interpretation stops on failure, no COOKIE_ECHO should been
sent via SCTP_CMD_GEN_COOKIE_ECHO. However, the T1_COOKIE timer has already
been started, and it may enqueue a COOKIE_ECHO into the outqueue later. As
a result, the DATA chunk can be transmitted together with the COOKIE_ECHO
in sctp_outq_flush_data(), leading to the observed issue.
Similar to the other places where it calls sctp_auth_asoc_init_active_key()
right after sctp_process_init(), this patch moves the SCTP_CMD_ASSOC_SHKEY
immediately after SCTP_CMD_PEER_INIT, before stopping T1_INIT and starting
T1_COOKIE. This ensures that if shared key generation fails, authenticated
DATA cannot be sent. It also allows the T1_INIT timer to retransmit INIT,
giving the client another chance to process INIT_ACK and retry key setup.
Fixes: 730fc3d05cd4 ("[SCTP]: Implete SCTP-AUTH parameter processing")
Reported-by: Zhen Chen <chenzhen126@huawei.com>
Tested-by: Zhen Chen <chenzhen126@huawei.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/44881224b375aa8853f5e19b4055a1a56d895813.1768324226.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
usb_submit_urb() error
[ Upstream commit 79a6d1bfe1148bc921b8d7f3371a7fbce44e30f7 ]
In commit 7352e1d5932a ("can: gs_usb: gs_usb_receive_bulk_callback(): fix
URB memory leak"), the URB was re-anchored before usb_submit_urb() in
gs_usb_receive_bulk_callback() to prevent a leak of this URB during
cleanup.
However, this patch did not take into account that usb_submit_urb() could
fail. The URB remains anchored and
usb_kill_anchored_urbs(&parent->rx_submitted) in gs_can_close() loops
infinitely since the anchor list never becomes empty.
To fix the bug, unanchor the URB when an usb_submit_urb() error occurs,
also print an info message.
Fixes: 7352e1d5932a ("can: gs_usb: gs_usb_receive_bulk_callback(): fix URB memory leak")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/all/20260110223836.3890248-1-kuba@kernel.org/
Link: https://patch.msgid.link/20260116-can_usb-fix-reanchor-v1-1-9d74e7289225@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c84fcb79e5dbde0b8d5aeeaf04282d2149aebcf6 ]
BOND_MODE_8023AD makes sense for ARPHRD_ETHER only.
syzbot reported:
BUG: KASAN: global-out-of-bounds in __hw_addr_create net/core/dev_addr_lists.c:63 [inline]
BUG: KASAN: global-out-of-bounds in __hw_addr_add_ex+0x25d/0x760 net/core/dev_addr_lists.c:118
Read of size 16 at addr ffffffff8bf94040 by task syz.1.3580/19497
CPU: 1 UID: 0 PID: 19497 Comm: syz.1.3580 Tainted: G L syzkaller #0 PREEMPT(full)
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Call Trace:
<TASK>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
check_region_inline mm/kasan/generic.c:-1 [inline]
kasan_check_range+0x2b0/0x2c0 mm/kasan/generic.c:200
__asan_memcpy+0x29/0x70 mm/kasan/shadow.c:105
__hw_addr_create net/core/dev_addr_lists.c:63 [inline]
__hw_addr_add_ex+0x25d/0x760 net/core/dev_addr_lists.c:118
__dev_mc_add net/core/dev_addr_lists.c:868 [inline]
dev_mc_add+0xa1/0x120 net/core/dev_addr_lists.c:886
bond_enslave+0x2b8b/0x3ac0 drivers/net/bonding/bond_main.c:2180
do_set_master+0x533/0x6d0 net/core/rtnetlink.c:2963
do_setlink+0xcf0/0x41c0 net/core/rtnetlink.c:3165
rtnl_changelink net/core/rtnetlink.c:3776 [inline]
__rtnl_newlink net/core/rtnetlink.c:3935 [inline]
rtnl_newlink+0x161c/0x1c90 net/core/rtnetlink.c:4072
rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6958
netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2550
netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
netlink_unicast+0x82f/0x9e0 net/netlink/af_netlink.c:1344
netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1894
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg+0x21c/0x270 net/socket.c:742
____sys_sendmsg+0x505/0x820 net/socket.c:2592
___sys_sendmsg+0x21f/0x2a0 net/socket.c:2646
__sys_sendmsg+0x164/0x220 net/socket.c:2678
do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline]
__do_fast_syscall_32+0x1dc/0x560 arch/x86/entry/syscall_32.c:307
do_fast_syscall_32+0x34/0x80 arch/x86/entry/syscall_32.c:332
entry_SYSENTER_compat_after_hwframe+0x84/0x8e
</TASK>
The buggy address belongs to the variable:
lacpdu_mcast_addr+0x0/0x40
Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER")
Reported-by: syzbot+9c081b17773615f24672@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6966946b.a70a0220.245e30.0002.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Link: https://patch.msgid.link/20260113191201.3970737-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|