Age | Commit message (Collapse) | Author | Files | Lines |
|
Prior to calling bind() a program may call connect() on a socket to
restrict to a remote peer address.
Using connect() is the normal mechanism to specify a remote network
peer, so we use that here. In MCTP connect() is only used for bound
sockets - send() is not available for MCTP since a tag must be provided
for each message.
The smctp_type must match between connect() and bind() calls.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20250710-mctp-bind-v4-6-8ec2f6460c56@codeconstruct.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Ensure that a specific EID (remote or local) bind will match in
preference to a MCTP_ADDR_ANY bind.
This adds infrastructure for binding a socket to receive messages from a
specific remote peer address, a future commit will expose an API for
this.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20250710-mctp-bind-v4-5-8ec2f6460c56@codeconstruct.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Test pairwise combinations of bind addresses and types.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20250710-mctp-bind-v4-4-8ec2f6460c56@codeconstruct.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When a specific EID is passed as a bind address, it only makes sense to
interpret with an actual network ID, so resolve that to the default
network at bind time.
For bind address of MCTP_ADDR_ANY, we want to be able to capture traffic
to any network and address, so keep the current behaviour of matching
traffic from any network interface (don't interpret MCTP_NET_ANY as
the default network ID).
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20250710-mctp-bind-v4-3-8ec2f6460c56@codeconstruct.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Disallow bind() calls that have the same arguments as existing bound
sockets. Previously multiple sockets could bind() to the same
type/local address, with an arbitrary socket receiving matched messages.
This is only a partial fix, a future commit will define precedence order
for MCTP_ADDR_ANY versus specific EID bind(), which are allowed to exist
together.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20250710-mctp-bind-v4-2-8ec2f6460c56@codeconstruct.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The sock was not being released. Other than leaking, the stale socket
will conflict with subsequent bind() calls in unrelated MCTP tests.
Fixes: 46ee16462fed ("net: mctp: test: Add extaddr routing output test")
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://patch.msgid.link/20250710-mctp-bind-v4-1-8ec2f6460c56@codeconstruct.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, the link_sinfo structure is being freed twice in
nl80211_dump_station(), once after the send_station() call and again
in the error handling path. This results in a double free of both
link_sinfo and link_sinfo->pertid, which might lead to undefined
behavior or kernel crashes.
Hence, fix by ensuring cfg80211_sinfo_release_content() is only
invoked once during execution of nl80211_station_dump().
Fixes: 49e47223ecc4 ("wifi: cfg80211: allocate memory for link_station info structure")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/81f30515-a83d-4b05-a9d1-e349969df9e9@sabinyo.mountain/
Reported-by: syzbot+4ba6272678aa468132c8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68655325.a70a0220.5d25f.0316.GAE@google.com
Signed-off-by: Sarika Sharma <quic_sarishar@quicinc.com>
Link: https://patch.msgid.link/20250714084405.178066-1-quic_sarishar@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In cfg80211_off_channel_oper_allowed(), the current logic disallows
off-channel operations if any link operates on a radar channel,
assuming such channels cannot be vacated. This assumption holds for
non-MLO interfaces but not for MLO.
With MLO and multi-radio devices, different links may operate on
separate radios. This allows one link to scan off-channel while
another remains on a radar channel. For example, in a 5 GHz
split-phy setup, the lower band can scan while the upper band
stays on a radar channel.
Off-channel operations can be allowed if the radio/link onto which the
input channel falls is different from the radio/link which has an active
radar channel. Therefore, fix cfg80211_off_channel_oper_allowed() by
returning false only if the requested channel maps to the same radio as
an active radar channel. Allow off-channel operations when the requested
channel is on a different radio, as in MLO with multi-radio setups.
Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com>
Signed-off-by: Amith A <quic_amitajit@quicinc.com>
Link: https://patch.msgid.link/20250714040742.538550-1-quic_amitajit@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The ieee80211_csa_finish() function currently uses for_each_sdata_link()
to iterate over links of sdata. However, this macro internally uses
wiphy_dereference(), which expects the wiphy->mtx lock to be held.
When ieee80211_csa_finish() is invoked under an RCU read-side critical
section (e.g., under rcu_read_lock()), this leads to a warning from the
RCU debugging framework.
WARNING: suspicious RCU usage
net/mac80211/cfg.c:3830 suspicious rcu_dereference_protected() usage!
This warning is triggered because wiphy_dereference() is not safe to use
without holding the wiphy mutex, and it is being used in an RCU context
without the required locking.
Fix this by introducing and using a new macro, for_each_sdata_link_rcu(),
which performs RCU-safe iteration over sdata links using
list_for_each_entry_rcu() and rcu_dereference(). This ensures that the
link pointers are accessed safely under RCU and eliminates the warning.
Fixes: f600832794c9 ("wifi: mac80211: restructure tx profile retrieval for MLO MBSSID")
Signed-off-by: Maharaja Kennadyrajan <maharaja.kennadyrajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250711033846.40455-1-maharaja.kennadyrajan@oss.qualcomm.com
[unindent like the non-RCU macro]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Avoid duplicate non-null pointer check for pmc in mld_del_delrec().
No functional changes.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250714081949.3109947-1-yuehaibing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
During commands like channel switch and color change, the updated
unsolicited broadcast probe response template may be provided. However,
this data is not parsed or acted upon in mac80211.
Add support to parse it and set the BSS changed flag
BSS_CHANGED_UNSOL_BCAST_PROBE_RESP so that drivers could take further
action.
Signed-off-by: Yuvarani V <quic_yuvarani@quicinc.com>
Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com>
Link: https://patch.msgid.link/20250710-update_unsol_bcast_probe_resp-v2-2-31aca39d3b30@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
At present, the updated unsolicited broadcast probe response template is
not processed during userspace commands such as channel switch or color
change. This leads to an issue where older incorrect unsolicited probe
response is still used during these events.
Add support to parse the netlink attribute and store it so that
mac80211/drivers can use it to set the BSS_CHANGED_UNSOL_BCAST_PROBE_RESP
flag in order to send the updated unsolicited broadcast probe response
templates during these events.
Signed-off-by: Yuvarani V <quic_yuvarani@quicinc.com>
Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com>
Link: https://patch.msgid.link/20250710-update_unsol_bcast_probe_resp-v2-1-31aca39d3b30@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Since there's no TPE element in the (re)assoc response, trying
to use the data from it just leads to using the defaults, even
though the real values had been set during authentication from
the discovered BSS information.
Fix this by simply not handling the TPE data in assoc response
since it's not intended to be present, if it changes later the
necessary changes will be made by tracking beacons later.
As a side effect, by passing the real frame subtype, now print
a correct value for ML reconfiguration responses.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.caa1ca853f5a.I588271f386731978163aa9d84ae75d6f79633e16@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If this action frame, with the value of IEEE80211_HT_CHANWIDTH_ANY,
arrives right after a beacon that changed the operational bandwidth from
20 MHz to 40 MHz, then updating the rate control bandwidth to 40 can
race with updating the chanctx width (that happens in the beacon
proccesing) back to 40 MHz:
cpu0 cpu1
ieee80211_rx_mgmt_beacon
ieee80211_config_bw
ieee80211_link_change_chanreq
(*)ieee80211_link_update_chanreq
ieee80211_rx_h_action
(**)ieee80211_sta_cur_vht_bw
(***) ieee80211_recalc_chanctx_chantype
in (**), the maximum between the capability width and the bss width is
returned. But the bss width was just updated to 40 in (*),
so the action frame handling code will increase the width of the rate
control before the chanctx was increased (in ***), leading to a FW error
(at least in iwlwifi driver. But this is wrong regardless).
Fix this by simply handling the action frame async, so it won't race
with the beacon proccessing.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218632
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.bb9dc6f36c35.I39782d6077424e075974c3bee4277761494a1527@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The loop handling individual subframes can be simplified to
not use a somewhat confusing goto inside the loop.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.a217a1e8c667.I5283df9627912c06c8327b5786d6b715c6f3a4e1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
During resume, the driver can call ieee80211_add_gtk_rekey for keys that
are not programmed into the device, e.g. keys of inactive links.
Don't mark such a key as uploaded to avoid removing it later from the
driver/device.
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.655094412b0b.Iacae31af3ba2a705da0a9baea976c2f799d65dc4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
At the end of reconfig we are activating all the links that were active
before the error.
During the activation, _ieee80211_link_use_channel will unassign and
re-assign the chanctx from/to the link.
But we only need to do the assign, as we are re-building the state as it
was before the reconfig.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.6245c3ae7031.Ia5f68992c7c112bea8a426c9339f50c88be3a9ca@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Under the previous commit's assumption that FIPS isn't
supported by hardware, we don't need to modify the
cipher suite list, but just need to use the software
one instead of the driver's in this case, so clean up
the code.
Also fix it to exclude TKIP in this case, since that's
also dependent on RC4.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.cff427e8f8a5.I744d1ea6a37e3ea55ae8bc3e770acee734eff268@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
When fips_enabled is set, don't send any keys to the driver
(including possibly WoWLAN KEK/KCK material), assuming that
no device exists with the necessary certifications. If this
turns out to be false in the future, we can add a HW flag.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.e5eebc2b19d8.I968ef8c9ffb48d464ada78685bd25d22349fb063@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
All the paths that could return an error are considered
misuses of the function and WARN already, and none of
the callers ever check the return value. Remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.5b436ee3c20c.Ieff61ec510939adb5fe6da4840557b649b3aa820@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
If a link has no chanctx, indicating it is an inactive link
that we tracked CSA for, then attempting to unreserve the
reserved chanctx will throw a warning and fail, since there
never was a reserved chanctx. Skip the unreserve.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20250709233537.022192f4b1ae.Ib58156ac13e674a9f4d714735be0764a244c0aae@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
There's no need to always print it, it's only useful for
debugging specific client issues. Make it a debug message.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://lore.kernel.org/linux-wireless/20250529070922.3467-1-pmenzel@molgen.mpg.de/
Link: https://patch.msgid.link/20250708105849.22448-2-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Currently, TCP stack accepts incoming packet if sizes of receive queues
are below sk->sk_rcvbuf limit.
This can cause memory overshoot if the packet is big, like an 1/2 MB
BIG TCP one.
Refine the check to take into account the incoming skb truesize.
Note that we still accept the packet if the receive queue is empty,
to not completely freeze TCP flows in pathological conditions.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250711114006.480026-8-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These functions to not modify the skb, add a const qualifier.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250711114006.480026-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tcp_measure_rcv_mss() is used to update icsk->icsk_ack.rcv_mss
(tcpi_rcv_mss in tcp_info) and tp->scaling_ratio.
Calling it from tcp_data_queue_ofo() makes sure these
fields are updated, and permits a better tuning
of sk->sk_rcvbuf, in the case a new flow receives many ooo
packets.
Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250711114006.480026-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new SNMP MIB : LINUX_MIB_BEYOND_WINDOW
Incremented when an incoming packet is received beyond the
receiver window.
nstat -az | grep TcpExtBeyondWindow
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250711114006.480026-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, TCP accepts incoming packets which might go beyond the
offered RWIN.
Add to tcp_sequence() the validation of packet end sequence.
Add the corresponding check in the fast path.
We relax this new constraint if the receive queue is empty,
to not freeze flows from buggy peers.
Add a new drop reason : SKB_DROP_REASON_TCP_INVALID_END_SEQUENCE.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250711114006.480026-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A net device has a threaded sysctl that can be used to enable threaded
NAPI polling on all of the NAPI contexts under that device. Allow
enabling threaded NAPI polling at individual NAPI level using netlink.
Extend the netlink operation `napi-set` and allow setting the threaded
attribute of a NAPI. This will enable the threaded polling on a NAPI
context.
Add a test in `nl_netdev.py` that verifies various cases of threaded
NAPI being set at NAPI and at device level.
Tested
./tools/testing/selftests/net/nl_netdev.py
TAP version 13
1..7
ok 1 nl_netdev.empty_check
ok 2 nl_netdev.lo_check
ok 3 nl_netdev.page_pool_check
ok 4 nl_netdev.napi_list_check
ok 5 nl_netdev.dev_set_threaded
ok 6 nl_netdev.napi_set_threaded
ok 7 nl_netdev.nsim_rxq_reset_down
# Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250710211203.3979655-1-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, __mkroute_output overrules the MTU value configured for
broadcast routes.
This buggy behaviour can be reproduced with:
ip link set dev eth1 mtu 9000
ip route del broadcast 192.168.0.255 dev eth1 proto kernel scope link src 192.168.0.2
ip route add broadcast 192.168.0.255 dev eth1 proto kernel scope link src 192.168.0.2 mtu 1500
The maximum packet size should be 1500, but it is actually 8000:
ping -b 192.168.0.255 -s 8000
Fix __mkroute_output to allow MTU values to be configured for
for broadcast routes (to support a mixed-MTU local-area-network).
Signed-off-by: Oscar Maes <oscmaes92@gmail.com>
Link: https://patch.msgid.link/20250710142714.12986-1-oscmaes92@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
x25_terminate_link() has been unused since the last use was removed
in 2020 by:
commit 7eed751b3b2a ("net/x25: handle additional netdev events")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Link: https://patch.msgid.link/20250712205759.278777-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported weird splats [0][1] in cipso_v4_sock_setattr() while
freeing inet_sk(sk)->inet_opt.
The address was freed multiple times even though it was read-only memory.
cipso_v4_sock_setattr() did nothing wrong, and the root cause was type
confusion.
The cited commit made it possible to create smc_sock as an INET socket.
The issue is that struct smc_sock does not have struct inet_sock as the
first member but hijacks AF_INET and AF_INET6 sk_family, which confuses
various places.
In this case, inet_sock.inet_opt was actually smc_sock.clcsk_data_ready(),
which is an address of a function in the text segment.
$ pahole -C inet_sock vmlinux
struct inet_sock {
...
struct ip_options_rcu * inet_opt; /* 784 8 */
$ pahole -C smc_sock vmlinux
struct smc_sock {
...
void (*clcsk_data_ready)(struct sock *); /* 784 8 */
The same issue for another field was reported before. [2][3]
At that time, an ugly hack was suggested [4], but it makes both INET
and SMC code error-prone and hard to change.
Also, yet another variant was fixed by a hacky commit 98d4435efcbf3
("net/smc: prevent NULL pointer dereference in txopt_get").
Instead of papering over the root cause by such hacks, we should not
allow non-INET socket to reuse the INET infra.
Let's add inet_sock as the first member of smc_sock.
[0]:
kvfree_call_rcu(): Double-freed call. rcu_head 000000006921da73
WARNING: CPU: 0 PID: 6718 at mm/slab_common.c:1956 kvfree_call_rcu+0x94/0x3f0 mm/slab_common.c:1955
Modules linked in:
CPU: 0 UID: 0 PID: 6718 Comm: syz.0.17 Tainted: G W 6.16.0-rc4-syzkaller-g7482bb149b9f #0 PREEMPT
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : kvfree_call_rcu+0x94/0x3f0 mm/slab_common.c:1955
lr : kvfree_call_rcu+0x94/0x3f0 mm/slab_common.c:1955
sp : ffff8000a03a7730
x29: ffff8000a03a7730 x28: 00000000fffffff5 x27: 1fffe000184823d3
x26: dfff800000000000 x25: ffff0000c2411e9e x24: ffff0000dd88da00
x23: ffff8000891ac9a0 x22: 00000000ffffffea x21: ffff8000891ac9a0
x20: ffff8000891ac9a0 x19: ffff80008afc2480 x18: 00000000ffffffff
x17: 0000000000000000 x16: ffff80008ae642c8 x15: ffff700011ede14c
x14: 1ffff00011ede14c x13: 0000000000000004 x12: ffffffffffffffff
x11: ffff700011ede14c x10: 0000000000ff0100 x9 : 5fa3c1ffaf0ff000
x8 : 5fa3c1ffaf0ff000 x7 : 0000000000000001 x6 : 0000000000000001
x5 : ffff8000a03a7078 x4 : ffff80008f766c20 x3 : ffff80008054d360
x2 : 0000000000000000 x1 : 0000000000000201 x0 : 0000000000000000
Call trace:
kvfree_call_rcu+0x94/0x3f0 mm/slab_common.c:1955 (P)
cipso_v4_sock_setattr+0x2f0/0x3f4 net/ipv4/cipso_ipv4.c:1914
netlbl_sock_setattr+0x240/0x334 net/netlabel/netlabel_kapi.c:1000
smack_netlbl_add+0xa8/0x158 security/smack/smack_lsm.c:2581
smack_inode_setsecurity+0x378/0x430 security/smack/smack_lsm.c:2912
security_inode_setsecurity+0x118/0x3c0 security/security.c:2706
__vfs_setxattr_noperm+0x174/0x5c4 fs/xattr.c:251
__vfs_setxattr_locked+0x1ec/0x218 fs/xattr.c:295
vfs_setxattr+0x158/0x2ac fs/xattr.c:321
do_setxattr fs/xattr.c:636 [inline]
file_setxattr+0x1b8/0x294 fs/xattr.c:646
path_setxattrat+0x2ac/0x320 fs/xattr.c:711
__do_sys_fsetxattr fs/xattr.c:761 [inline]
__se_sys_fsetxattr fs/xattr.c:758 [inline]
__arm64_sys_fsetxattr+0xc0/0xdc fs/xattr.c:758
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x58/0x180 arch/arm64/kernel/entry-common.c:879
el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:898
el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
[1]:
Unable to handle kernel write to read-only memory at virtual address ffff8000891ac9a8
KASAN: probably user-memory-access in range [0x0000000448d64d40-0x0000000448d64d47]
Mem abort info:
ESR = 0x000000009600004e
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x0e: level 2 permission fault
Data abort info:
ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000
CM = 0, WnR = 1, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000207144000
[ffff8000891ac9a8] pgd=0000000000000000, p4d=100000020f950003, pud=100000020f951003, pmd=0040000201000781
Internal error: Oops: 000000009600004e [#1] SMP
Modules linked in:
CPU: 0 UID: 0 PID: 6946 Comm: syz.0.69 Not tainted 6.16.0-rc4-syzkaller-g7482bb149b9f #0 PREEMPT
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : kvfree_call_rcu+0x31c/0x3f0 mm/slab_common.c:1971
lr : add_ptr_to_bulk_krc_lock mm/slab_common.c:1838 [inline]
lr : kvfree_call_rcu+0xfc/0x3f0 mm/slab_common.c:1963
sp : ffff8000a28a7730
x29: ffff8000a28a7730 x28: 00000000fffffff5 x27: 1fffe00018b09bb3
x26: 0000000000000001 x25: ffff80008f66e000 x24: ffff00019beaf498
x23: ffff00019beaf4c0 x22: 0000000000000000 x21: ffff8000891ac9a0
x20: ffff8000891ac9a0 x19: 0000000000000000 x18: 00000000ffffffff
x17: ffff800093363000 x16: ffff80008052c6e4 x15: ffff700014514ecc
x14: 1ffff00014514ecc x13: 0000000000000004 x12: ffffffffffffffff
x11: ffff700014514ecc x10: 0000000000000001 x9 : 0000000000000001
x8 : ffff00019beaf7b4 x7 : ffff800080a94154 x6 : 0000000000000000
x5 : ffff8000935efa60 x4 : 0000000000000008 x3 : ffff80008052c7fc
x2 : 0000000000000001 x1 : ffff8000891ac9a0 x0 : 0000000000000001
Call trace:
kvfree_call_rcu+0x31c/0x3f0 mm/slab_common.c:1967 (P)
cipso_v4_sock_setattr+0x2f0/0x3f4 net/ipv4/cipso_ipv4.c:1914
netlbl_sock_setattr+0x240/0x334 net/netlabel/netlabel_kapi.c:1000
smack_netlbl_add+0xa8/0x158 security/smack/smack_lsm.c:2581
smack_inode_setsecurity+0x378/0x430 security/smack/smack_lsm.c:2912
security_inode_setsecurity+0x118/0x3c0 security/security.c:2706
__vfs_setxattr_noperm+0x174/0x5c4 fs/xattr.c:251
__vfs_setxattr_locked+0x1ec/0x218 fs/xattr.c:295
vfs_setxattr+0x158/0x2ac fs/xattr.c:321
do_setxattr fs/xattr.c:636 [inline]
file_setxattr+0x1b8/0x294 fs/xattr.c:646
path_setxattrat+0x2ac/0x320 fs/xattr.c:711
__do_sys_fsetxattr fs/xattr.c:761 [inline]
__se_sys_fsetxattr fs/xattr.c:758 [inline]
__arm64_sys_fsetxattr+0xc0/0xdc fs/xattr.c:758
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x58/0x180 arch/arm64/kernel/entry-common.c:879
el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:898
el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:600
Code: aa1f03e2 52800023 97ee1e8d b4000195 (f90006b4)
Fixes: d25a92ccae6b ("net/smc: Introduce IPPROTO_SMC")
Reported-by: syzbot+40bf00346c3fe40f90f2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/686d9b50.050a0220.1ffab7.0020.GAE@google.com/
Tested-by: syzbot+40bf00346c3fe40f90f2@syzkaller.appspotmail.com
Reported-by: syzbot+f22031fad6cbe52c70e7@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/686da0f3.050a0220.1ffab7.0022.GAE@google.com/
Reported-by: syzbot+271fed3ed6f24600c364@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=271fed3ed6f24600c364 # [2]
Link: https://lore.kernel.org/netdev/99f284be-bf1d-4bc4-a629-77b268522fff@huawei.com/ # [3]
Link: https://lore.kernel.org/netdev/20250331081003.1503211-1-wangliang74@huawei.com/ # [4]
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20250711060808.2977529-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is a follow-up for commit eb1ac9ff6c4a5 ("ipv6: anycast: Don't
hold RTNL for IPV6_JOIN_ANYCAST.").
We should not add a new device lookup API without netdevice_tracker.
Let's pass netdevice_tracker to dev_get_by_flags_rcu() and rename it
with netdev_ prefix to match other newer APIs.
Note that we always use GFP_ATOMIC for netdev_hold() as it's expected
to be called under RCU.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/netdev/20250708184053.102109f6@kernel.org/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250711051120.2866855-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove a bunch of unused xdr_*decode* functions:
The last use of xdr_decode_netobj() was removed in 2021 by:
commit 7cf96b6d0104 ("lockd: Update the NLMv4 SHARE arguments decoder to
use struct xdr_stream")
The last use of xdr_decode_string_inplace() was removed in 2021 by:
commit 3049e974a7c7 ("lockd: Update the NLMv4 FREE_ALL arguments decoder
to use struct xdr_stream")
The last use of xdr_stream_decode_opaque() was removed in 2024 by:
commit fed8a17c61ff ("xdrgen: typedefs should use the built-in string and
opaque functions")
The functions xdr_stream_decode_string() and
xdr_stream_decode_opaque_dup() were both added in 2018 by the
commit 0e779aa70308 ("SUNRPC: Add helpers for decoding opaque and string
types")
but never used.
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://lore.kernel.org/r/20250712233006.403226-1-linux@treblig.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
Replace the offset-based approach for tracking progress through a bucket
in the TCP table with one based on socket cookies. Remember the cookies
of unprocessed sockets from the last batch and use this list to
pick up where we left off or, in the case that the next socket
disappears between reads, find the first socket after that point that
still exists in the bucket and resume from there.
This approach guarantees that all sockets that existed when iteration
began and continue to exist throughout will be visited exactly once.
Sockets that are added to the table during iteration may or may not be
seen, but if they are they will be seen exactly once.
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
|
|
Prepare for the next patch that tracks cookies between iterations by
converting struct sock **batch to union bpf_tcp_iter_batch_item *batch
inside struct bpf_tcp_iter_state.
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
|
|
Get rid of the st_bucket_done field to simplify TCP iterator state and
logic. Before, st_bucket_done could be false if bpf_iter_tcp_batch
returned a partial batch; however, with the last patch ("bpf: tcp: Make
sure iter->batch always contains a full bucket snapshot"),
st_bucket_done == true is equivalent to iter->cur_sk == iter->end_sk.
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
|
|
Require that iter->batch always contains a full bucket snapshot. This
invariant is important to avoid skipping or repeating sockets during
iteration when combined with the next few patches. Before, there were
two cases where a call to bpf_iter_tcp_batch may only capture part of a
bucket:
1. When bpf_iter_tcp_realloc_batch() returns -ENOMEM.
2. When more sockets are added to the bucket while calling
bpf_iter_tcp_realloc_batch(), making the updated batch size
insufficient.
In cases where the batch size only covers part of a bucket, it is
possible to forget which sockets were already visited, especially if we
have to process a bucket in more than two batches. This forces us to
choose between repeating or skipping sockets, so don't allow this:
1. Stop iteration and propagate -ENOMEM up to userspace if reallocation
fails instead of continuing with a partial batch.
2. Try bpf_iter_tcp_realloc_batch() with GFP_USER just as before, but if
we still aren't able to capture the full bucket, call
bpf_iter_tcp_realloc_batch() again while holding the bucket lock to
guarantee the bucket does not change. On the second attempt use
GFP_NOWAIT since we hold onto the spin lock.
I did some manual testing to exercise the code paths where GFP_NOWAIT is
used and where ERR_PTR(err) is returned. I used the realloc test cases
included later in this series to trigger a scenario where a realloc
happens inside bpf_iter_tcp_batch and made a small code tweak to force
the first realloc attempt to allocate a too-small batch, thus requiring
another attempt with GFP_NOWAIT. Some printks showed both reallocs with
the tests passing:
Jun 27 00:00:53 crow kernel: again GFP_USER
Jun 27 00:00:53 crow kernel: again GFP_NOWAIT
Jun 27 00:00:53 crow kernel: again GFP_USER
Jun 27 00:00:53 crow kernel: again GFP_NOWAIT
With this setup, I also forced each of the bpf_iter_tcp_realloc_batch
calls to return -ENOMEM to ensure that iteration ends and that the
read() in userspace fails.
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
|
|
Prepare for the next patch which needs to be able to choose either
GFP_USER or GFP_NOWAIT for calls to bpf_iter_tcp_realloc_batch.
Signed-off-by: Jordan Rife <jordan@jrife.io>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
|
|
The return value of sock_sendmsg() is signed, and svc_tcp_sendto() wants
a signed value to return.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
This ends up returning AUTH_BADCRED when memory allocation fails today.
Fix it to return AUTH_FAILED, which better indicates a failure on the
server.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
rq_accept_statp should point to the location of the accept_status in the
reply. This field is not reset between RPCs so if svc_authenticate or
pg_authenticate return SVC_DENIED without setting the pointer, it could
result in the status being written to the wrong place.
This pointer starts its lifetime as NULL. Reset it on every iteration
so we get consistent behavior if this happens.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Nothing returns this error code.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
In the case of an unknown error code from svc_authenticate or
pg_authenticate, return AUTH_ERROR with a status of AUTH_FAILED. Also
add the other auth_stat value from RFC 5531, and document all the status
codes.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Convert the svc_wake_up tracepoint into svc_pool_thread_event class.
Have it also record the pool id, and add new tracepoints for when the
thread is already running and for when there are no idle threads.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
csum_partial_copy_to_xdr is only used inside the sunrpc module, so
remove the export.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
csum_partial_copy_to_xdr can handle a checksumming and non-checksumming
case and implements this using a callback, which leads to a lot of
boilerplate code and indirect calls in the fast path.
Switch to storing a need_checksum flag in struct xdr_skb_reader instead
to remove the indirect call and simplify the code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The rqst argument to xdr_init_encode_pages is set to NULL by all callers,
and pages is always set to buf->pages. Remove the two arguments and
hardcode the assignments.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Rename the existing sha1_init() to sha1_init_raw(), since it conflicts
with the upcoming library function. This will later be removed, but
this keeps the kernel building for the introduction of the library.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
This reverts commit 465b9ee0ee7bc268d7f261356afd6c4262e48d82.
Such notifications fit better into core or nfnetlink_hook code,
following the NFNL_MSG_HOOK_GET message format.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Its a kernel implementation detail, at least at this time:
We can later decide to revert this patch if there is a compelling
reason, but then we should also remove the ifdef that prevents exposure
of ip_conntrack_status enum IPS_NAT_CLASH value in the uapi header.
Clash entries are not included in dumps (true for both old /proc
and ctnetlink) either. So for now exclude the clash bit when dumping.
Fixes: 7e5c6aa67e6f ("netfilter: nf_tables: add packets conntrack state to debug trace info")
Link: https://lore.kernel.org/netfilter-devel/aGwf3dCggwBlRKKC@strlen.de/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|