| Age | Commit message (Collapse) | Author | Files | Lines |
|
Pretty self-explanatory, nobody needs these.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260406212158.721806-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This has accumulated some fields which are no longer parsed by the core
or set by any driver. Remove them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260406212158.721806-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is not used anywhere in the kernel.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260406212158.721806-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The PERSISTENT RESERVE OUT command's PREEMPT service action provides two
different functions: 1. preempting persistent reservations and 2.
removing registrations. In the latter case the spec says:
b) ignore the contents of the SCOPE field and the TYPE field;
The code currently validates the SCOPE and TYPE fields even when PREEMPT
is called to remove registrations.
This patch achieves spec compliance by validating the SCOPE and TYPE
fields only when they will actually be used.
To confirm my interpretation of the specification I tested against HPE
3PAR storage and found the TYPE field is indeed ignored in this case.
Cc: Maurizio Lombardi <mlombard@redhat.com>
Cc: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://patch.msgid.link/20260402180342.126583-1-stefanha@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The devlink_fmsg_dump_skb function was incorrectly using the socket
type (sk->sk_type) instead of the socket family (sk->sk_family)
when filling the "family" field in the fast message dump.
This patch fixes this to properly display the socket family.
Fixes: 3dbfde7f6bc7b8 ("devlink: add devlink_fmsg_dump_skb() function")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Link: https://patch.msgid.link/20260407022730.2393-1-lirongqing@baidu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Exact UNIX diag lookups hold a reference to the socket, but not to
u->path. Meanwhile, unix_release_sock() clears u->path under
unix_state_lock() and drops the path reference after unlocking.
Read the inode and device numbers for UNIX_DIAG_VFS while holding
unix_state_lock(), then emit the netlink attribute after dropping the
lock.
This keeps the VFS data stable while the reply is being built.
Fixes: 5f7b0569460b ("unix_diag: Unix inode info NLA")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Co-developed-by: Yuan Tan <yuantan098@gmail.com>
Signed-off-by: Yuan Tan <yuantan098@gmail.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Tested-by: Ren Wei <enjou1224z@gmail.com>
Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260407080015.1744197-1-n05ec@lzu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change the memory allocation for qp_cpu_map to use the actual number of
CPUs ('nr_cpu_ids') instead of the maximum possible CPUs ('NR_CPUS').
This saves memory on systems where the maximum CPU limit is much higher
than the active CPU count.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Link: https://patch.msgid.link/20260331053245.1839-1-lirongqing@baidu.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Matthieu Baerts says:
====================
mptcp: autotune related improvement
Here are two patches from Paolo that have been crafted a couple of
months ago, but needed more validation because they were indirectly
causing instabilities in the sefltests. The root cause has been fixed in
'net' recently in commit 8c09412e584d ("selftests: mptcp: more stable
simult_flows tests").
These patches refactor the receive space and RTT estimator, overall
making DRS more correct while avoiding receive buffer drifting to
tcp_rmem[2], which in turn makes the throughput more stable and less
bursty, especially with high bandwidth and low delay environments.
Note that the first patch addresses a very old issue. 'net-next' is
targeted because the change is quite invasive and based on a recent
backlog refactor. The 'Fixes' tag is then there more as a FYI, because
backporting this patch will quickly be blocked due to large conflicts.
====================
Link: https://patch.msgid.link/20260407-net-next-mptcp-reduce-rbuf-v2-0-0d1d135bf6f6@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is the MPTCP counter-part of commit ea33537d8292 ("tcp: add receive
queue awareness in tcp_rcv_space_adjust()").
Prior to this commit:
ESTAB 33165568 0 192.168.255.2:5201 192.168.255.1:53380 \
skmem:(r33076416,rb33554432,t0,tb91136,f448,w0,o0,bl0,d0)
After:
ESTAB 3279168 0 192.168.255.2:5201 192.168.255.1]:53042 \
skmem:(r3190912,rb3719956,t0,tb91136,f1536,w0,o0,bl0,d0)
Same throughput.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260407-net-next-mptcp-reduce-rbuf-v2-2-0d1d135bf6f6@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The current MPTCP-level RTT estimator has several issues. On high speed
links, the MPTCP-level receive buffer auto-tuning happens with a
frequency well above the TCP-level's one. That in turn can cause
excessive/unneeded receive buffer increase.
On such links, the initial rtt_us value is considerably higher than the
actual delay, and the current mptcp_rcv_space_adjust() updates
msk->rcvq_space.rtt_us with a period equal to the such field previous
value. If the initial rtt_us is 40ms, its first update will happen after
40ms, even if the subflows see actual RTT orders of magnitude lower.
Additionally:
- setting the msk RTT to the maximum among all the subflows RTTs makes
DRS constantly overshooting the rcvbuf size when a subflow has
considerable higher latency than the other(s).
- during unidirectional bulk transfers with multiple active subflows,
the TCP-level RTT estimator occasionally sees considerably higher
value than the real link delay, i.e. when the packet scheduler reacts
to an incoming ACK on given subflow pushing data on a different
subflow.
- currently inactive but still open subflows (i.e. switched to backup
mode) are always considered when computing the msk-level RTT.
Address the all the issues above with a more accurate RTT estimation
strategy: the MPTCP-level RTT is set to the minimum of all the subflows
actually feeding data into the MPTCP receive buffer, using a small
sliding window.
While at it, also use EWMA to compute the msk-level scaling_ratio, to
that MPTCP can avoid traversing the subflow list is
mptcp_rcv_space_adjust().
Use some care to avoid updating msk and ssk level fields too often.
Fixes: a6b118febbab ("mptcp: add receive buffer auto-tuning")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260407-net-next-mptcp-reduce-rbuf-v2-1-0d1d135bf6f6@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This commit was originally adding the ability to add MPTCP endpoints
with ID 0 by accident. The in-kernel PM, handling MPTCP endpoints at the
net namespace level, is not supposed to handle endpoints with such ID,
because this ID 0 is reserved to the initial subflow, as mentioned in
the MPTCPv1 protocol [1], a per-connection setting.
Note that 'ip mptcp endpoint add id 0' stops early with an error, but
other tools might still request the in-kernel PM to create MPTCP
endpoints with this restricted ID 0.
In other words, it was wrong to call the mptcp_pm_has_addr_attr_id
helper to check whether the address ID attribute is set: if it was set
to 0, a new MPTCP endpoint would be created with ID 0, which is not
expected, and might cause various issues later.
Fixes: 584f38942626 ("mptcp: add needs_id for netlink appending addr")
Cc: stable@vger.kernel.org
Link: https://datatracker.ietf.org/doc/html/rfc8684#section-3.2-9 [1]
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260407-net-mptcp-revert-pm-needs-id-v2-1-7a25cbc324f8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ehash table lookups are lockless and rely on
SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
during RCU read-side critical sections. Both tcp_prot and
tcpv6_prot have their slab caches created with this flag
via proto_register().
However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
tcpv6_prot_override during inet_init() (fs_initcall, level 5),
before inet6_init() (module_init/device_initcall, level 6) has
called proto_register(&tcpv6_prot). At that point,
tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
remains NULL permanently.
This causes MPTCP v6 subflow child sockets to be allocated via
kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab
cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so
when these sockets are freed without SOCK_RCU_FREE (which is
cleared for child sockets by design), the memory can be
immediately reused. Concurrent ehash lookups under
rcu_read_lock can then access freed memory, triggering a
slab-use-after-free in __inet_lookup_established.
Fix this by splitting the IPv6-specific initialization out of
mptcp_subflow_init() into a new mptcp_subflow_v6_init(), called
from mptcp_proto_v6_init() before protocol registration. This
ensures tcpv6_prot_override.slab correctly inherits the
SLAB_TYPESAFE_BY_RCU slab cache.
Fixes: b19bc2945b40 ("mptcp: implement delegated actions")
Cc: stable@vger.kernel.org
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260406031512.189159-1-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk_clone() initializes sk_tx_queue_mapping via sk_tx_queue_clear()
but does not initialize sk_rx_queue_mapping. Since this field is in
the sk_dontcopy region, it is neither copied from the parent socket
by sock_copy() nor zeroed by sk_prot_alloc() (called without
__GFP_ZERO from sk_clone).
Commit 03cfda4fa6ea ("tcp: fix another uninit-value
(sk_rx_queue_mapping)") attempted to fix this by introducing
sk_mark_napi_id_set() with force_set=true in tcp_child_process().
However, sk_mark_napi_id_set() -> sk_rx_queue_set() only writes
when skb_rx_queue_recorded(skb) is true. If the 3-way handshake
ACK arrives through a device that does not record rx_queue (e.g.
loopback or veth), sk_rx_queue_mapping remains uninitialized.
When a subsequent data packet arrives with a recorded rx_queue,
sk_mark_napi_id() -> sk_rx_queue_update() reads the uninitialized
field for comparison (force_set=false path), triggering KMSAN.
This was reproduced by establishing a TCP connection over loopback
(which does not call skb_record_rx_queue), then attaching a BPF TC
program on lo ingress to set skb->queue_mapping on data packets:
BUG: KMSAN: uninit-value in tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1875)
tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1875)
tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2287)
ip_protocol_deliver_rcu (net/ipv4/ip_input.c:207)
ip_local_deliver_finish (net/ipv4/ip_input.c:242)
ip_local_deliver (net/ipv4/ip_input.c:262)
ip_rcv (net/ipv4/ip_input.c:573)
__netif_receive_skb (net/core/dev.c:6294)
process_backlog (net/core/dev.c:6646)
__napi_poll (net/core/dev.c:7710)
net_rx_action (net/core/dev.c:7929)
handle_softirqs (kernel/softirq.c:623)
do_softirq (kernel/softirq.c:523)
__local_bh_enable_ip (kernel/softirq.c:?)
__dev_queue_xmit (net/core/dev.c:?)
ip_finish_output2 (net/ipv4/ip_output.c:237)
ip_output (net/ipv4/ip_output.c:438)
__ip_queue_xmit (net/ipv4/ip_output.c:534)
__tcp_transmit_skb (net/ipv4/tcp_output.c:1693)
tcp_write_xmit (net/ipv4/tcp_output.c:3064)
tcp_sendmsg_locked (net/ipv4/tcp.c:?)
tcp_sendmsg (net/ipv4/tcp.c:1465)
inet_sendmsg (net/ipv4/af_inet.c:865)
sock_write_iter (net/socket.c:1195)
vfs_write (fs/read_write.c:688)
...
Uninit was created at:
kmem_cache_alloc_noprof (mm/slub.c:4873)
sk_prot_alloc (net/core/sock.c:2239)
sk_alloc (net/core/sock.c:2301)
inet_create (net/ipv4/af_inet.c:334)
__sock_create (net/socket.c:1605)
__sys_socket (net/socket.c:1747)
Fix this at the root by adding sk_rx_queue_clear() alongside
sk_tx_queue_clear() in sk_clone().
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260407084219.95718-1-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Kioxia has another product that does not support the qTimestamp
attribute.
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260403-thgjfjt0e25baip-no-timestamp-v1-1-1ddb34225133@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The piece of code which processes the command line arguments and
populates NETIFS based on them is really unobvious. Rewrite it so that
the intention is clear and the code is easy to follow.
Suggested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260407102058.867279-1-ioana.ciornei@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix typo in "synchronize".
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20260403133109.2744351-1-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This shows an improvement of 1.9% in reducing the CPU cycles and data
cache misses.
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260408001813.635679-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
use_stdio was associated with struct perf_data and not perf_data_file
meaning there was implicit use of fd rather than fptr that may not be
safe. For example, in perf_data_file__write. Reorganize perf_data_file
to better abstract use_stdio, add kernel-doc and more consistently use
perf_data__ accessors so that use_stdio is better respected.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
As noticed in a sashiko review for a patch adding a missing libgen.h
in a file using basename():
https://sashiko.dev/#/patchset/20260402001740.2220481-1-acme%40kernel.org
So avoid these subtleties and instead reuse the gnu_basename() function
we had in srcline.c, renaming it to perf_basename() and replace
basename() calls with it, simplifying several cases by removing now
needless strdups.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Instead of using zalloc(nr_entries * sizeof_entry) that is what calloc()
does.
In some places where linux/zalloc.h isn't needed, remove it, add when
needed and was getting it indirectly.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
As suggested in an unrelated sashiko review:
https://sashiko.dev/#/patchset/20260407195145.2372104-1-acme%40kernel.org
"
Could a malformed perf.data file provide out-of-bounds values for cpu and
domain?
These variables are read directly from the file and used as indices for
cd_map and cd_map[cpu]->domains without any validation against
env->nr_cpus_avail or max_sched_domains.
Similar to the issue above, this is an existing lack of validation that
becomes apparent when looking at the allocation boundaries.
"
Validate it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Sashiko suggests we use some reasonable max number of args to avoid
overflows when reading perf.data files, do it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Those tables and variables don't change, better capture this by
explicitely using 'const'.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
`make check` will run sparse on the perf code base. A frequent warning
is "warning: symbol '...' was not declared. Should it be static?" Go
through and make global definitions without declarations static.
In some cases it is deliberate due to dlsym accessing the symbol, this
change doesn't clean up the missing declarations for perf test suites.
Sometimes things can opportunistically be made const.
Making somethings static exposed unused functions warnings, so
restructuring of ifdefs was necessary for that.
These changes reduce the size of the perf binary by 568 bytes.
Committer notes:
Refreshed the patch, the original one fell thru the cracks, updated the
size reduction.
Remove the trace-event-scripting.c changes, break the build, noticed
with container builds and with sashiko:
https://sashiko.dev/#/patchset/20260401215306.2152898-1-acme%40kernel.org
Also make two variables static to address another sashiko review
comment:
https://sashiko.dev/#/patchset/20260402001740.2220481-1-acme%40kernel.org
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Ankur Arora <ankur.a.arora@oracle.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yujie Liu <yujie.liu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In fef2a735167a827a ("perf tools: Kill die()") the die() function was
removed, but not the prototype in util.h, now when building with
LIBPERL=1, during a 'make -C tools/perf build-test' routine test, it is
failing as perl likes die() calls and then this clashes with this
remnant, remove it.
Fixes: fef2a735167a827a ("perf tools: Kill die()")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Fixing:
util/symbol.c: In function ‘symbol__config_symfs’:
util/symbol.c:2499:20: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
2499 | layout_str = strrchr(dir, ',');
|
With recent gcc/glibc.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
ipvlan and macvlan use queues to process broadcast/multicast packets
from a work queue.
Under attack these queues can drop packets.
Add MACVLAN_BROADCAST_BACKLOG drop_reason for macvlan broadcast queue.
Add IPVLAN_MULTICAST_BACKLOG drop_reason for ipvlan multicast queue.
Use different reasons as some deployments use both ipvlan and macvlan.
Also change ipvlan_rcv_frame() to use SKB_DROP_REASON_DEV_READY
when the device is not UP.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260407150710.1640747-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
codel_dump_stats() only runs with RTNL held,
reading fields that can be changed in qdisc fast path.
Add READ_ONCE()/WRITE_ONCE() annotations.
Alternative would be to acquire the qdisc spinlock, but our long-term
goal is to make qdisc dump operations lockless as much as we can.
tc_codel_xstats fields don't need to be latched atomically,
otherwise this bug would have been caught earlier.
No change in kernel size:
$ scripts/bloat-o-meter -t vmlinux.0 vmlinux
add/remove: 0/0 grow/shrink: 1/1 up/down: 3/-1 (2)
Function old new delta
codel_qdisc_dequeue 2462 2465 +3
codel_dump_stats 250 249 -1
Total: Before=29739919, After=29739921, chg +0.00%
Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260407143053.1570620-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace the magic numbers with defines. Register names were obtained from
publicly available documentation[1]. This should make it clear what's going
on in the code.
1. RTL8201F/RTL8201FL/RTL8201FN Rev. 1.4 Datasheet
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Nicolai Buchwitz nb@tipi-net.de
Link: https://patch.msgid.link/20260406201222.1043396-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Lists of struct property_entry are supposed to be terminated with an
empty property, this driver currently seems to be allocating exactly the
amount of entry used.
Change the struct definition to leave an extra element for all
property_entry.
Fixes: c3e382ad6d15 ("net: txgbe: Add software nodes to support phylink")
Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com>
Tested-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/20260405222013.5347-1-fabio.baltieri@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If device_add(&sdkp->disk_dev) fails, put_device() runs
scsi_disk_release(), which frees the scsi_disk but leaves the gendisk
referenced. The device_add_disk() error path in sd_probe() calls
put_disk(gd); call put_disk(gd) here to mirror that cleanup.
Fixes: 265dfe8ebbab ("scsi: sd: Free scsi_disk device via put_device()")
Cc: stable@vger.kernel.org
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
Link: https://patch.msgid.link/20260330014952.152776-1-yangxiuwei@kylinos.cn
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When trace->type.bit6 is set:
if (trace->type.bit6) {
...
queue = skb_get_tx_queue(dev, skb);
qdisc = rcu_dereference(queue->qdisc);
This code can lead to an out-of-bounds access of the dev->_tx[] array
when is_input is true. In such a case, the packet is on the RX path and
skb->queue_mapping contains the RX queue index of the ingress device. If
the ingress device has more RX queues than the egress device (dev) has
TX queues, skb_get_queue_mapping(skb) will exceed dev->num_tx_queues.
Add a check to avoid this situation since skb_get_tx_queue() does not
clamp the index. This issue has also revealed that per queue visibility
cannot be accurate and will be replaced later as a new feature.
While at it, add missing lock around qdisc_qstats_qlen_backlog(). The
function __ioam6_fill_trace_data() is called from both softirq and
process contexts, hence the use of spin_lock_bh() here.
Fixes: b63c5478e9cb ("ipv6: ioam: Support for Queue depth data field")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20260403214418.2233266-2-kuba@kernel.org/
Signed-off-by: Justin Iurman <justin.iurman@gmail.com>
Link: https://patch.msgid.link/20260404134137.24553-1-justin.iurman@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit 2884bf72fb8f ("net: bonding: fix use-after-free in
bond_xmit_broadcast()"), bond_is_last_slave() was only used in
bond_xmit_broadcast(). After the recent fix replaced that usage with
a simple index comparison, bond_is_last_slave() has no remaining
callers. bond_is_first_slave() likewise has no callers.
Remove both unused macros.
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Link: https://patch.msgid.link/20260404220412.444753-1-xmei5@asu.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Reword the reviewer guidance based on behavior we see on the list.
Steer folks:
- towards sending tags
- away from process issues.
Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260406175334.3153451-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The global clock controller on the Nord SoC is partitioned into
GCC, SE_GCC, NE_GCC, and NW_GCC. Introduce driver support for each
of these controllers.
Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
[Shawn: Drop include of <linux/of.h> as the driver doesn't use any OF APIs]
Co-developed-by: Shawn Guo <shengchao.guo@oss.qualcomm.com>
Signed-off-by: Shawn Guo <shengchao.guo@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260403-nord-clks-v1-6-018af14979fd@oss.qualcomm.com
[bjorn: Added missing .use_rpm to gcc_nord_desc]
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:
====================
netfilter: updates for net-next
1) Fix ancient sparse warnings in nf conntrack nat modules, from
Sun Jian.
2) Fix typo in enum description, from Jelle van der Waa.
3) remove redundant refetch of netns pointer in nf_conntrack_sip.
4) add a deprecation warning for dccp match.
We can extend the deadline later if needed, but plan atm is to
remove the feature.
5) remove nf_conntrack_h323 debug code that can read out-of-bounds
with malformed messages. This code was commented out, but better
remove this.
6+7) add more netlink policy validations in netfilter.
This could theoretically cause issues when a client sends e.g.
unsupported feature flags that were previously ignored, so we
may have to relax some changes. For now, try to be stricter and
reject upfront.
8+9) minor code cleanup in nft_set_pipapo (an nftables set backend).
10) Add nftables matching support fro double-tagged vlan and pppoe
frames, from Pablo Neira Ayuso.
11) Fix up indentation of debug messages in nf_conntrack_h323 conntrack
helper, from David Laight.
12) Add a helper to iterate to next flow action and bail out if the
maximum number of actions is reached, also from Pablo.
* tag 'nf-next-26-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_tables_offload: add nft_flow_action_entry_next() and use it
netfilter: nf_conntrack_h323: Correct indentation when H323_TRACE defined
netfilter: nft_meta: add double-tagged vlan and pppoe support
netfilter: nft_set_pipapo_avx2: remove redundant loop in lookup_slow
netfilter: nft_set_pipapo: increment data in one step
netfilter: nf_tables: add netlink policy based cap on registers
netfilter: add more netlink-based policy range checks
netfilter: nf_conntrack_h323: remove unreliable debug code in decode_octstr
netfilter: add deprecation warning for dccp support
netfilter: nf_conntrack_sip: remove net variable shadowing
netfilter: nf_tables: Fix typo in enum description
netfilter: use function typedefs for __rcu NAT helper hook pointers
====================
Link: https://patch.msgid.link/20260408060419.25258-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add RPMH clock support for the Nord SoC to allow enable/disable of the
clocks.
Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260403-nord-clks-v1-5-018af14979fd@oss.qualcomm.com
[bjorn: sorted clk_rpmh_match_table[] addition]
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Add a clock driver for the TCSR clock controller found on Nord SoC,
which provides refclks for PCIE, USB, SGMII, UFS subsystems.
[Shawn:
- Use compatible qcom,nord-tcsrcc
- Drop include of <linux/of.h> as the driver doesn't use any OF APIs]
Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
Co-developed-by: Shawn Guo <shengchao.guo@oss.qualcomm.com>
Signed-off-by: Shawn Guo <shengchao.guo@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260403-nord-clks-v1-4-018af14979fd@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Add device tree bindings for the global clock controller on Qualcomm
Nord platform. The global clock controller on Nord SoC is divided into
multiple clock controllers (GCC,SE_GCC,NE_GCC and NW_GCC). Add each of
the bindings to define the clock controllers.
Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260403-nord-clks-v1-3-018af14979fd@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
Add bindings and update documentation compatible for RPMh clock
controller on Nord SoC.
Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260403-nord-clks-v1-2-018af14979fd@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
The Nord SoC TCSR block provides CLKREF clocks for DP, PCIe, UFS, SGMII
and USB.
Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com>
[Shawn: Use compatible qcom,nord-tcsrcc rather than qcom,nord-tcsr]
Signed-off-by: Shawn Guo <shengchao.guo@oss.qualcomm.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260403-nord-clks-v1-1-018af14979fd@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
A few last-minute fixes:
- rfkill: prevent boundless event list
- rt2x00: fix USB resource management
- brcmfmac: validate firmware IDs
- brcmsmac: fix DMA free size
* tag 'wireless-2026-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
net: rfkill: prevent unlimited numbers of rfkill events from being created
wifi: rt2x00usb: fix devres lifetime
wifi: brcmfmac: validate bsscfg indices in IF events
wifi: brcmsmac: Fix dma_free_coherent() size
====================
Link: https://patch.msgid.link/20260408081802.111623-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These macros are unused and to_dev_attr() will conflict with an upcoming
centralization of general attribute macros.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Link: https://patch.msgid.link/20260408-libsas-cleanup-v1-1-826325bbc0ba@weissschuh.net
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2026-04-08
1) Clear trailing padding in build_polexpire() to prevent
leaking unititialized memory. From Yasuaki Torimaru.
2) Fix aevent size calculation when XFRMA_IF_ID is used.
From Keenan Dong.
3) Wait for RCU readers during policy netns exit before
freeing the policy hash tables.
4) Fix dome too eaerly dropped references on the netdev
when uding transport mode. From Qi Tang.
5) Fix refcount leak in xfrm_migrate_policy_find().
From Kotlyarov Mihail.
6) Fix two fix info leaks in build_report() and
in build_mapping(). From Greg Kroah-Hartman.
7) Zero aligned sockaddr tail in PF_KEY exports.
From Zhengchuan Liang.
* tag 'ipsec-2026-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
net: af_key: zero aligned sockaddr tail in PF_KEY exports
xfrm_user: fix info leak in build_report()
xfrm_user: fix info leak in build_mapping()
xfrm: fix refcount leak in xfrm_migrate_policy_find
xfrm: hold dev ref until after transport_finish NF_HOOK
xfrm: Wait for RCU readers during policy netns exit
xfrm: account XFRMA_IF_ID in aevent size calculation
xfrm: clear trailing padding in build_polexpire()
====================
Link: https://patch.msgid.link/20260408095925.253681-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The storvsc driver has become stricter in handling SRB status codes
returned by the Hyper-V host. When using Virtual Fibre Channel (vFC)
passthrough, the host may return SRB_STATUS_DATA_OVERRUN for
PERSISTENT_RESERVE_IN commands if the allocation length in the CDB does
not match the host's expected response size.
Currently, this status is treated as a fatal error, propagating
Host_status=0x07 [DID_ERROR] to the SCSI mid-layer. This causes
userspace storage utilities (such as sg_persist) to fail with transport
errors, even when the host has actually returned the requested
reservation data in the buffer.
Refactor the existing command-specific workarounds into a new helper
function, storvsc_host_mishandles_cmd(), and add PERSISTENT_RESERVE_IN
to the list of commands where SRB status errors should be suppressed for
vFC devices. This ensures that the SCSI mid-layer processes the returned
data buffer instead of terminating the command.
Signed-off-by: Li Tian <litian@redhat.com>
Reviewed-by: Long Li <longli@microsoft.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Link: https://patch.msgid.link/20260406015344.12566-1-litian@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2026-04-08
1) Update outdated comment in xfrm_dst_check().
From kexinsun.
2) Drop support for HMAC-RIPEMD-160 from IPsec.
From Eric Biggers.
* tag 'ipsec-next-2026-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
xfrm: Drop support for HMAC-RIPEMD-160
xfrm: update outdated comment
====================
Link: https://patch.msgid.link/20260408094258.148555-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are two batman-adv bugfixes:
- reject oversized global TT response buffers, by Ruide Cao
- hold claim backbone gateways by reference, by Haoze Xie
* tag 'batadv-net-pullrequest-20260408' of https://git.open-mesh.org/linux-merge:
batman-adv: hold claim backbone gateways by reference
batman-adv: reject oversized global TT response buffers
====================
Link: https://patch.msgid.link/20260408110255.976389-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter updates for net
I only included crash fixes, as we're closer to a release, rest will
be handled via -next.
1) Fix a NULL pointer dereference in ip_vs_add_service error path, from
Weiming Shi, bug added in 6.2 development cycle.
2) Don't leak kernel data bytes from allocator to userspace: nfnetlink_log
needs to init the trailing NLMSG_DONE terminator. From Xiang Mei.
3) xt_multiport match lacks range validation, bogus userspace request will
cause out-of-bounds read. From Ren Wei.
4) ip6t_eui64 match must reject packets with invalid mac header before
calling eth_hdr. Make existing check unconditional. From Zhengchuan
Liang.
5) nft_ct timeout policies are free'd via kfree() while they may still
be reachable by other cpus that process a conntrack object that
uses such a timeout policy. Existing reaping of entries is not
sufficient because it doesn't wait for a grace period. Use kfree_rcu().
From Tuan Do.
6/7) Make nfnetlink_queue hash table per queue. As-is we can hit a page
fault in case underlying page of removed element was free'd. Per-queue
hash prevents parallel lookups. This comes with a test case that
demonstrates the bug, from Fernando Fernandez Mancera.
* tag 'nf-26-04-08' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
selftests: nft_queue.sh: add a parallel stress test
netfilter: nfnetlink_queue: make hash table per queue
netfilter: nft_ct: fix use-after-free in timeout object destroy
netfilter: ip6t_eui64: reject invalid MAC header for all packets
netfilter: xt_multiport: validate range encoding in checkentry
netfilter: nfnetlink_log: initialize nfgenmsg in NLMSG_DONE terminator
ipvs: fix NULL deref in ip_vs_add_service error path
====================
Link: https://patch.msgid.link/20260408163512.30537-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
David Howells says:
====================
rxrpc: Miscellaneous fixes
Here are some fixes for rxrpc:
(1) Fix key quota calculation.
(2) Fix a memory leak.
(3) Fix rxrpc_new_client_call_for_sendmsg() to substitute NULL for an
empty key.
Might want to remove this substitution entirely or handle it in
rxrpc_init_client_call_security() instead.
(4) Fix deletion of call->link to be RCU safe.
(5) Fix missing bounds checks when parsing RxGK tickets.
(6) Fix use of wrong skbuff to get challenge serial number. Also actually
substitute the newer response skbuff and release the older one.
(7) Fix unexpected RACK timer warning to report old mode.
(8) Fix call key refcount leak.
(9) Fix the interaction of jumbograms with Tx window space, setting the
request-ack flag when the window space is getting low, typically
because each jumbogram take a big bite out of the window and fewer UDP
packets get traded.
(10) Don't call rxrpc_put_call() with a NULL pointer.
(11) Reject undecryptable rxkad response tickets by checking result of
decryption.
(12) Fix buffer bounds calculation in the RESPONSE authenticator parser.
(13) Fix oversized response length check.
(14) Fix refcount leak on multiple setting of server keyring.
(15) Fix checks made by RXRPC_SECURITY_KEY and RXRPC_SECURITY_KEYRING (both
should be allowed).
(16) Fix lack of result checking on calls to crypto_skcipher_en/decrypt().
(17) Fix token_len limit check in rxgk_verify_response().
(18) Fix rxgk context leak in rxgk_verify_response().
(19) Fix read beyond end of buffer in rxgk_do_verify_authenticator().
(20) Fix parsing of RESPONSE packet on a connection that has already been set
from a prior response.
(21) Fix size of buffers used for rendering addresses into for procfiles.
====================
Link: https://patch.msgid.link/20260408121252.2249051-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The AF_RXRPC procfs helpers format local and remote socket addresses into
fixed 50-byte stack buffers with "%pISpc".
That is too small for the longest current-tree IPv6-with-port form the
formatter can produce. In lib/vsprintf.c, the compressed IPv6 path uses a
dotted-quad tail not only for v4mapped addresses, but also for ISATAP
addresses via ipv6_addr_is_isatap().
As a result, a case such as
[ffff:ffff:ffff:ffff:0:5efe:255.255.255.255]:65535
is possible with the current formatter. That is 50 visible characters, so
51 bytes including the trailing NUL, which does not fit in the existing
char[50] buffers used by net/rxrpc/proc.c.
Size the buffers from the formatter's maximum textual form and switch the
call sites to scnprintf().
Changes since v1:
- correct the changelog to cite the actual maximum current-tree case
explicitly
- frame the proof around the ISATAP formatting path instead of the earlier
mapped-v4 example
Fixes: 75b54cb57ca3 ("rxrpc: Add IPv6 support")
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Anderson Nascimento <anderson@allelesecurity.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org
Link: https://patch.msgid.link/20260408121252.2249051-22-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|