Age | Commit message (Collapse) | Author | Files | Lines |
|
This change adjusts bpftool's type marking logic, as used in conjunction
with TYPE_EXISTS relocations, to correctly recognize and handle the
RESTRICT BTF kind.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220623212205.2805002-1-deso@posteo.net/T/#m4c75205145701762a4b398e0cdb911d5b5305ffc
Link: https://lore.kernel.org/bpf/20220706212855.1700615-2-deso@posteo.net
|
|
Lukas reported that after commit f36600634282 ("libbpf: move xsk.{c,h}
into selftests/bpf") MAINTAINERS file needed an update.
In the meantime, Magnus removed AF_XDP samples in commit cfb5a2dbf141
("bpf, samples: Remove AF_XDP samples"), but selftests part still misses
its entry in MAINTAINERS.
Now that xdpxceiver became xskxceiver, tools/testing/selftests/bpf/*xsk*
will match all of the files related to AF_XDP testing (test_xsk.sh,
xskxceiver, xsk_prereqs.sh, xsk.{c,h}).
Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220707111613.49031-3-maciej.fijalkowski@intel.com
|
|
Recently, xsk part of libbpf was moved to selftests/bpf directory and
lives on its own because there is an AF_XDP testing application that
needs it called xdpxceiver. That name makes it a bit hard to indicate
who maintains it as there are other XDP samples in there, whereas this
one is strictly about AF_XDP.
Do s/xdpxceiver/xskxceiver so that it will be easier to figure out who
maintains it. A follow-up patch will correct MAINTAINERS file.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220707111613.49031-2-maciej.fijalkowski@intel.com
|
|
Since xsk APIs has been removed from libbpf, let's clean up the
BPF docs simutaneously.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20220708042736.669132-1-pulehui@huawei.com
|
|
When building with Clang we encounter the following warnings:
| net/l2tp/l2tp_debugfs.c:187:40: error: format specifies type 'unsigned
| short' but the argument has type 'u32' (aka 'unsigned int')
| [-Werror,-Wformat] seq_printf(m, " nr %hu, ns %hu\n", session->nr,
| session->ns);
-
| net/l2tp/l2tp_debugfs.c:196:32: error: format specifies type 'unsigned
| short' but the argument has type 'int' [-Werror,-Wformat]
| session->l2specific_type, l2tp_get_l2specific_len(session));
-
| net/l2tp/l2tp_debugfs.c:219:6: error: format specifies type 'unsigned
| short' but the argument has type 'u32' (aka 'unsigned int')
| [-Werror,-Wformat] session->nr, session->ns,
Both session->nr and ->nc are of type `u32`. The currently used format
specifier is `%hu` which describes a `u16`. My proposed fix is to listen
to Clang and use the correct format specifier `%u`.
For the warning at line 196, l2tp_get_l2specific_len() returns an int
and should therefore be using the `%d` format specifier.
Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Justin Stitt <justinstitt@google.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The Tx NAPI should use netif_napi_add_tx().
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Wells Lu <wellslutw@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
netif_napi_add_tx() does not require the weight argument.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Kuniyuki Iwashima says:
====================
sysctl: Fix data-races around ipv4_table.
A sysctl variable is accessed concurrently, and there is always a chance
of data-race. So, all readers and writers need some basic protection to
avoid load/store-tearing.
The first half of this series changes some proc handlers used in ipv4_table
to use READ_ONCE() and WRITE_ONCE() internally to fix data-races on the
sysctl side. Then, the second half adds READ_ONCE() to the other readers
of ipv4_table.
Changes:
v2:
* Drop some changes that makes backporting difficult
* First cleanup patch
* Lockless helpers and .proc_handler changes
* Drop the tracing part for .sysctl_mem
* Steve already posted a fix
* Drop int-to-bool change for cipso
* Should be posted to net-next later
* Drop proc_dobool() change
* Can be included in another series
v1: https://lore.kernel.org/netdev/20220706052130.16368-1-kuniyu@amazon.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While reading sysctl_fib_sync_mem, it can be changed concurrently.
So, we need to add READ_ONCE() to avoid a data-race.
Fixes: 9ab948a91b2c ("ipv4: Allow amount of dirty memory from fib resizing to be controllable")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While reading icmp sysctl variables, they can be changed concurrently.
So, we need to add READ_ONCE() to avoid data-races.
Fixes: 4cdf507d5452 ("icmp: add a global rate limitation")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While reading cipso sysctl variables, they can be changed concurrently.
So, we need to add READ_ONCE() to avoid data-races.
Fixes: 446fda4f2682 ("[NetLabel]: CIPSOv4 engine")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While reading .sysctl_mem, it can be changed concurrently.
So, we need to add READ_ONCE() to avoid data-races.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While reading inetpeer sysctl variables, they can be changed
concurrently. So, we need to add READ_ONCE() to avoid data-races.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While reading sysctl_tcp_max_orphans, it can be changed concurrently.
So, we need to add READ_ONCE() to avoid a data-race.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A sysctl variable is accessed concurrently, and there is always a chance
of data-race. So, all readers and writers need some basic protection to
avoid load/store-tearing.
This patch changes proc_dointvec_jiffies() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side. For now,
proc_dointvec_jiffies() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A sysctl variable is accessed concurrently, and there is always a chance
of data-race. So, all readers and writers need some basic protection to
avoid load/store-tearing.
This patch changes proc_doulongvec_minmax() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side. For now,
proc_doulongvec_minmax() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A sysctl variable is accessed concurrently, and there is always a chance
of data-race. So, all readers and writers need some basic protection to
avoid load/store-tearing.
This patch changes proc_douintvec_minmax() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side. For now,
proc_douintvec_minmax() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.
Fixes: 61d9b56a8920 ("sysctl: add unsigned int range support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A sysctl variable is accessed concurrently, and there is always a chance
of data-race. So, all readers and writers need some basic protection to
avoid load/store-tearing.
This patch changes proc_dointvec_minmax() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side. For now,
proc_dointvec_minmax() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A sysctl variable is accessed concurrently, and there is always a chance
of data-race. So, all readers and writers need some basic protection to
avoid load/store-tearing.
This patch changes proc_douintvec() to use READ_ONCE() and WRITE_ONCE()
internally to fix data-races on the sysctl side. For now, proc_douintvec()
itself is tolerant to a data-race, but we still need to add annotations on
the other subsystem's side.
Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A sysctl variable is accessed concurrently, and there is always a chance
of data-race. So, all readers and writers need some basic protection to
avoid load/store-tearing.
This patch changes proc_dointvec() to use READ_ONCE() and WRITE_ONCE()
internally to fix data-races on the sysctl side. For now, proc_dointvec()
itself is tolerant to a data-race, but we still need to add annotations on
the other subsystem's side.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The trace event sock_exceed_buf_limit saves the prot->sysctl_mem pointer
and then dereferences it in the TP_printk() portion. This is unsafe as the
TP_printk() portion is executed at the time the buffer is read. That is,
it can be seconds, minutes, days, months, even years later. If the proto
is freed, then this dereference will can also lead to a kernel crash.
Instead, save the sysctl_mem array into the ring buffer and have the
TP_printk() reference that instead. This is the proper and safe way to
read pointers in trace events.
Link: https://lore.kernel.org/all/20220706052130.16368-12-kuniyu@amazon.com/
Cc: stable@vger.kernel.org
Fixes: 3847ce32aea9f ("core: add tracepoints for queueing skb to rcvbuf")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are some VM configurations which have Skylake model but do not
support IBPB. In those cases, when using retbleed=ibpb, userspace is going
to be killed and kernel is going to panic.
If the CPU does not support IBPB, warn and proceed with the auto option. Also,
do not fallback to IBPB on AMD/Hygon systems if it is not supported.
Fixes: 3ebc17006888 ("x86/bugs: Add retbleed=ibpb")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Commit 13bbbfbea759 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write")
added the bpf_dynptr_write() and bpf_dynptr_read() APIs.
However, it will be needed for some dynptr types to pass in flags as
well (e.g. when writing to a skb, the user may like to invalidate the
hash or recompute the checksum).
This patch adds a "u64 flags" arg to the bpf_dynptr_read() and
bpf_dynptr_write() APIs before their UAPI signature freezes where
we then cannot change them anymore with a 5.19.x released kernel.
Fixes: 13bbbfbea759 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20220706232547.4016651-1-joannelkoong@gmail.com
|
|
The IOMMU mailing list has moved to iommu@lists.linux.dev
and the old list should bounce by now. Remove it from the
MAINTAINERS file.
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220706103331.10215-1-joro@8bytes.org
|
|
Conor Dooley says:
====================
PolarFire SoC macb reset support
The Cadence MACBs on PolarFire SoC (MPFS) have reset capability and are
compatible with the zynqmp's init function. I have removed the zynqmp
specific comments from that function & renamed it to reflect what it
does, since it is no longer zynqmp only.
MPFS's MACB had previously used the generic binding, so I also added
the required specific binding.
For v2, I noticed some low hanging cleanup fruit so there are extra
patches added for that:
moving the init function out of the config structs, aligning the
alignment of the zynqmp & default config structs with the other dozen
or so structs & simplifing the error paths to use dev_err_probe().
Feel free to apply as many or as few of those as you like.
====================
Link: https://lore.kernel.org/r/20220706095129.828253-1-conor.dooley@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
init_reset_optional() is somewhat oddly placed amidst the macb_config
struct definitions. Move it to a more reasonable location alongside
the fu540 init functions.
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The error handling paths in init_reset_optional() can all be
simplified to return dev_err_probe(). Do so.
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The various macb_config structs have taken different approaches to
alignment when broken over newlines. Pick one style and make them
match.
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To date, the Microchip PolarFire SoC (MPFS) has been using the
cdns,macb compatible, however the generic device does not have reset
support. Add a new compatible & .data for MPFS to hook into the reset
functionality added for zynqmp support (and make the zynqmp init
function generic in the process).
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Until now the PolarFire SoC (MPFS) has been using the generic
"cdns,macb" compatible but has optional reset support. Add a specific
compatible which falls back to the currently used generic binding.
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When building with clang we encounter this warning:
| net/l2tp/l2tp_ppp.c:1557:6: error: format specifies type 'unsigned
| short' but the argument has type 'u32' (aka 'unsigned int')
| [-Werror,-Wformat] session->nr, session->ns,
Both session->nr and session->ns are of type u32. The format specifier
previously used is `%hu` which would truncate our unsigned integer from
32 to 16 bits. This doesn't seem like intended behavior, if it is then
perhaps we need to consider suppressing the warning with pragma clauses.
This patch should get us closer to the goal of enabling the -Wformat
flag for Clang builds.
Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Justin Stitt <justinstitt@google.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/20220706230833.535238-1-justinstitt@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Accidentally noticed, that this driver is the only user of
while (time_after(jiffies...)).
It looks like typo, because likely this while loop will finish after 1st
iteration, because time_after() returns true when 1st argument _is after_
2nd one.
There is one possible problem with this poll loop: the scheduler could put
the thread to sleep, and it does not get woken up for
OCELOT_FDMA_CH_SAFE_TIMEOUT_US. During that time, the hardware has done
its thing, but you exit the while loop and return -ETIMEDOUT.
Fix it by using sane poll API that avoids all problems described above
Fixes: 753a026cfec1 ("net: ocelot: add FDMA support")
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220706132845.27968-1-paskripkin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2022-07-06
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2022-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5e: Ring the TX doorbell on DMA errors
net/mlx5e: Fix capability check for updating vnic env counters
net/mlx5e: CT: Use own workqueue instead of mlx5e priv
net/mlx5: Lag, correct get the port select mode str
net/mlx5e: Fix enabling sriov while tc nic rules are offloaded
net/mlx5e: kTLS, Fix build time constant test in RX
net/mlx5e: kTLS, Fix build time constant test in TX
net/mlx5: Lag, decouple FDB selection and shared FDB
net/mlx5: TC, allow offload from uplink to other PF's VF
====================
Link: https://lore.kernel.org/r/20220706231309.38579-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Renaming interfaces using udevd depends on the interface being registered
before its netdev is registered. Otherwise, udevd reads an empty
phys_port_name value, resulting in the interface not being renamed.
Fix this by registering the interface before registering its netdev
by invoking am65_cpsw_nuss_register_devlink() before invoking
register_netdev() for the interface.
Move the function call to devlink_port_type_eth_set(), invoking it after
register_netdev() is invoked, to ensure that netlink notification for the
port state change is generated after the netdev is completely initialized.
Fixes: 58356eb31d60 ("net: ti: am65-cpsw-nuss: Add devlink support")
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://lore.kernel.org/r/20220706070208.12207-1-s-vadapalli@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a long-standing issue with the Synopsys DWC Ethernet driver
for Tegra194 where random system crashes have been observed [0]. The
problem occurs when the split header feature is enabled in the stmmac
driver. In the bad case, a larger than expected buffer length is
received and causes the calculation of the total buffer length to
overflow. This results in a very large buffer length that causes the
kernel to crash. Why this larger buffer length is received is not clear,
however, the feedback from the NVIDIA design team is that the split
header feature is not supported for Tegra194. Therefore, disable split
header support for Tegra194 to prevent these random crashes from
occurring.
[0] https://lore.kernel.org/linux-tegra/b0b17697-f23e-8fa5-3757-604a86f3a095@nvidia.com/
Fixes: 67afd6d1cfdf ("net: stmmac: Add Split Header support and enable it in XGMAC cores")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20220706083913.13750-1-jonathanh@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently NIC packet receiving performance based on page pool deteriorates
occasionally. To analysis the causes of this problem page allocation stats
are collected. Here are the stats when NIC rx performance deteriorates:
bandwidth(Gbits/s) 16.8 6.91
rx_pp_alloc_fast 13794308 21141869
rx_pp_alloc_slow 108625 166481
rx_pp_alloc_slow_h 0 0
rx_pp_alloc_empty 8192 8192
rx_pp_alloc_refill 0 0
rx_pp_alloc_waive 100433 158289
rx_pp_recycle_cached 0 0
rx_pp_recycle_cache_full 0 0
rx_pp_recycle_ring 362400 420281
rx_pp_recycle_ring_full 6064893 9709724
rx_pp_recycle_released_ref 0 0
The rx_pp_alloc_waive count indicates that a large number of pages' numa
node are inconsistent with the NIC device numa node. Therefore these pages
can't be reused by the page pool. As a result, many new pages would be
allocated by __page_pool_alloc_pages_slow which is time consuming. This
causes the NIC rx performance fluctuations.
The main reason of huge numa mismatch pages in page pool is that page pool
uses alloc_pages_bulk_array to allocate original pages. This function is
not suitable for page allocation in NUMA scenario. So this patch uses
alloc_pages_bulk_array_node which has a NUMA id input parameter to ensure
the NUMA consistent between NIC device and allocated pages.
Repeated NIC rx performance tests are performed 40 times. NIC rx bandwidth
is higher and more stable compared to the datas above. Here are three test
stats, the rx_pp_alloc_waive count is zero and rx_pp_alloc_slow which
indicates pages allocated from slow patch is relatively low.
bandwidth(Gbits/s) 93 93.9 93.8
rx_pp_alloc_fast 60066264 61266386 60938254
rx_pp_alloc_slow 16512 16517 16539
rx_pp_alloc_slow_ho 0 0 0
rx_pp_alloc_empty 16512 16517 16539
rx_pp_alloc_refill 473841 481910 481585
rx_pp_alloc_waive 0 0 0
rx_pp_recycle_cached 0 0 0
rx_pp_recycle_cache_full 0 0 0
rx_pp_recycle_ring 29754145 30358243 30194023
rx_pp_recycle_ring_full 0 0 0
rx_pp_recycle_released_ref 0 0 0
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/20220705113515.54342-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.19
- another bogus identifier quirk (Keith Busch)
- use struct group in the tracer to avoid a gcc warning (Keith Busch)"
* tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme:
nvme: use struct group for generic command dwords
nvme-pci: phison e16 has bogus namespace ids
|
|
32 bit sqe->cmd_op is an union with 64 bit values. It's always a good
idea to do padding explicitly. Also zero check it in prep, so it can be
used in the future if needed without compatibility concerns.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/e6b95a05e970af79000435166185e85b196b2ba2.1657202417.git.asml.silence@gmail.com
[axboe: turn bitwise OR into logical variant]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This patch ensures that the clock notifier is unregistered
when driver probe is returning error.
Fixes: df8eb5691c48 ("i2c: Add driver for Cadence I2C controller")
Signed-off-by: Satish Nagireddy <satish.nagireddy@getcruise.com>
Tested-by: Lars-Peter Clausen <lars@metafoo.de>
Reviewed-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Pull a devfreq fix for 5.19-rc6 from Chanwoo Choi:
"- Fix exynos-bus NULL pointer dereference by correctly using the local
generated freq_table to output the debug values instead of using the
profile freq_table that is not used in the driver."
* tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: exynos-bus: Fix NULL pointer dereference
|
|
Fix exynos-bus NULL pointer dereference by correctly using the local
generated freq_table to output the debug values instead of using the
profile freq_table that is not used in the driver.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: b5d281f6c16d ("PM / devfreq: Rework freq_table to be local to devfreq struct")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Kajetan Puchalski reports crash on ARM, with backtrace of:
__nf_ct_delete_from_lists
nf_ct_delete
early_drop
__nf_conntrack_alloc
Unlike atomic_inc_not_zero, refcount_inc_not_zero is not a full barrier.
conntrack uses SLAB_TYPESAFE_BY_RCU, i.e. it is possible that a 'newly'
allocated object is still in use on another CPU:
CPU1 CPU2
encounter 'ct' during hlist walk
delete_from_lists
refcount drops to 0
kmem_cache_free(ct);
__nf_conntrack_alloc() // returns same object
refcount_inc_not_zero(ct); /* might fail */
/* If set, ct is public/in the hash table */
test_bit(IPS_CONFIRMED_BIT, &ct->status);
In case CPU1 already set refcount back to 1, refcount_inc_not_zero()
will succeed.
The expected possibilities for a CPU that obtained the object 'ct'
(but no reference so far) are:
1. refcount_inc_not_zero() fails. CPU2 ignores the object and moves to
the next entry in the list. This happens for objects that are about
to be free'd, that have been free'd, or that have been reallocated
by __nf_conntrack_alloc(), but where the refcount has not been
increased back to 1 yet.
2. refcount_inc_not_zero() succeeds. CPU2 checks the CONFIRMED bit
in ct->status. If set, the object is public/in the table.
If not, the object must be skipped; CPU2 calls nf_ct_put() to
un-do the refcount increment and moves to the next object.
Parallel deletion from the hlists is prevented by a
'test_and_set_bit(IPS_DYING_BIT, &ct->status);' check, i.e. only one
cpu will do the unlink, the other one will only drop its reference count.
Because refcount_inc_not_zero is not a full barrier, CPU2 may try to
delete an object that is not on any list:
1. refcount_inc_not_zero() successful (refcount inited to 1 on other CPU)
2. CONFIRMED test also successful (load was reordered or zeroing
of ct->status not yet visible)
3. delete_from_lists unlinks entry not on the hlist, because
IPS_DYING_BIT is 0 (already cleared).
2) is already wrong: CPU2 will handle a partially initited object
that is supposed to be private to CPU1.
Add needed barriers when refcount_inc_not_zero() is successful.
It also inserts a smp_wmb() before the refcount is set to 1 during
allocation.
Because other CPU might still see the object, refcount_set(1)
"resurrects" it, so we need to make sure that other CPUs will also observe
the right content. In particular, the CONFIRMED bit test must only pass
once the object is fully initialised and either in the hash or about to be
inserted (with locks held to delay possible unlink from early_drop or
gc worker).
I did not change flow_offload_alloc(), as far as I can see it should call
refcount_inc(), not refcount_inc_not_zero(): the ct object is attached to
the skb so its refcount should be >= 1 in all cases.
v2: prefer smp_acquire__after_ctrl_dep to smp_rmb (Will Deacon).
v3: keep smp_acquire__after_ctrl_dep close to refcount_inc_not_zero call
add comment in nf_conntrack_netlink, no control dependency there
due to locks.
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/all/Yr7WTfd6AVTQkLjI@e126311.manchester.arm.com/
Reported-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Diagnosed-by: Will Deacon <will@kernel.org>
Fixes: 719774377622 ("netfilter: conntrack: convert to refcount_t api")
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Will Deacon <will@kernel.org>
|
|
Classic BPF has a way to load bytes starting from the mac header.
Some skbs do not have a mac header, and skb_mac_header()
in this case is returning a pointer that 65535 bytes after
skb->head.
Existing range check in bpf_internal_load_pointer_neg_helper()
was properly kicking and no illegal access was happening.
New sanity check in skb_mac_header() is firing, so we need
to avoid it.
WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 skb_mac_header include/linux/skbuff.h:2785 [inline]
WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74
Modules linked in:
CPU: 1 PID: 28990 Comm: syz-executor.0 Not tainted 5.19.0-rc4-syzkaller-00865-g4874fb9484be #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
RIP: 0010:skb_mac_header include/linux/skbuff.h:2785 [inline]
RIP: 0010:bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74
Code: ff ff 45 31 f6 e9 5a ff ff ff e8 aa 27 40 00 e9 3b ff ff ff e8 90 27 40 00 e9 df fe ff ff e8 86 27 40 00 eb 9e e8 2f 2c f3 ff <0f> 0b eb b1 e8 96 27 40 00 e9 79 fe ff ff 90 41 57 41 56 41 55 41
RSP: 0018:ffffc9000309f668 EFLAGS: 00010216
RAX: 0000000000000118 RBX: ffffffffffeff00c RCX: ffffc9000e417000
RDX: 0000000000040000 RSI: ffffffff81873f21 RDI: 0000000000000003
RBP: ffff8880842878c0 R08: 0000000000000003 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000001 R12: 0000000000000004
R13: ffff88803ac56c00 R14: 000000000000ffff R15: dffffc0000000000
FS: 00007f5c88a16700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fdaa9f6c058 CR3: 000000003a82c000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
____bpf_skb_load_helper_32 net/core/filter.c:276 [inline]
bpf_skb_load_helper_32+0x191/0x220 net/core/filter.c:264
Fixes: f9aefd6b2aa3 ("net: warn if mac header was not set")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220707123900.945305-1-edumazet@google.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"A fix for tinyconfig build error, a fix for section mismatch warning,
and two cleanups of obsolete code"
* tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Fix section mismatch warning
LoongArch: Fix build errors for tinyconfig
LoongArch: Remove obsolete mentions of vcsr
LoongArch: Drop these obsolete selects in Kconfig
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf, netfilter, can, and bluetooth.
Current release - regressions:
- bluetooth: fix deadlock on hci_power_on_sync
Previous releases - regressions:
- sched: act_police: allow 'continue' action offload
- eth: usbnet: fix memory leak in error case
- eth: ibmvnic: properly dispose of all skbs during a failover
Previous releases - always broken:
- bpf:
- fix insufficient bounds propagation from
adjust_scalar_min_max_vals
- clear page contiguity bit when unmapping pool
- netfilter: nft_set_pipapo: release elements in clone from
abort path
- mptcp: netlink: issue MP_PRIO signals from userspace PMs
- can:
- rcar_canfd: fix data transmission failed on R-Car V3U
- gs_usb: gs_usb_open/close(): fix memory leak
Misc:
- add Wenjia as SMC maintainer"
* tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
wireguard: Kconfig: select CRYPTO_CHACHA_S390
crypto: s390 - do not depend on CRYPTO_HW for SIMD implementations
wireguard: selftests: use microvm on x86
wireguard: selftests: always call kernel makefile
wireguard: selftests: use virt machine on m68k
wireguard: selftests: set fake real time in init
r8169: fix accessing unset transport header
net: rose: fix UAF bug caused by rose_t0timer_expiry
usbnet: fix memory leak in error case
Revert "tls: rx: move counting TlsDecryptErrors for sync"
mptcp: update MIB_RMSUBFLOW in cmd_sf_destroy
mptcp: fix local endpoint accounting
selftests: mptcp: userspace PM support for MP_PRIO signals
mptcp: netlink: issue MP_PRIO signals from userspace PMs
mptcp: Acquire the subflow socket lock before modifying MP_PRIO flags
mptcp: Avoid acquiring PM lock for subflow priority changes
mptcp: fix locking in mptcp_nl_cmd_sf_destroy()
net/mlx5e: Fix matchall police parameters validation
net/sched: act_police: allow 'continue' action offload
net: lan966x: hardcode the number of external ports
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Tag Intel pin control as supported in MAINTAINERS
- Fix a NULL pointer exception in the Aspeed driver
- Correct some NAND functions in the Sunxi A83T driver
- Use the right offset for some Sunxi pins
- Fix a zero base offset in the Freescale (NXP) i.MX93
- Fix the IRQ support in the STM32 driver
* tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: stm32: fix optional IRQ support to gpios
pinctrl: imx: Add the zero base flag for imx93
pinctrl: sunxi: sunxi_pconf_set: use correct offset
pinctrl: sunxi: a83t: Fix NAND function name for some pins
pinctrl: aspeed: Fix potential NULL dereference in aspeed_pinmux_set_mux()
MAINTAINERS: Update Intel pin control to Supported
|
|
These are indeed "should not happen" situations, but it turns out recent
changes made the 'task_is_stopped_or_trace()' case trigger (fix for that
exists, is pending more testing), and the BUG_ON() makes it
unnecessarily hard to actually debug for no good reason.
It's been that way for a long time, but let's make it clear: BUG_ON() is
not good for debugging, and should never be used in situations where you
could just say "this shouldn't happen, but we can continue".
Use WARN_ON_ONCE() instead to make sure it gets logged, and then just
continue running. Instead of making the system basically unusuable
because you crashed the machine while potentially holding some very core
locks (eg this function is commonly called while holding 'tasklist_lock'
for writing).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The struct nhlt_format's fmt_config is a flexible array, it must not be
used as normal array.
When moving to the next nhlt_fmt_cfg we need to take into account the data
behind the ->config.caps (indicated by ->config.size).
The logic of the code also changed: it is no longer saves the _last_
fmt_cfg for all found rates.
Fixes: bc2bd45b1f7f3 ("ASoC: Intel: Skylake: Parse nhlt and register clock device")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lore.kernel.org/r/20220630065638.11183-3-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The present flag is only set once when one rate has been found to be saved.
This will effectively going to ignore any rate discovered at later time and
based on the code, this is not the intention.
Fixes: bc2bd45b1f7f3 ("ASoC: Intel: Skylake: Parse nhlt and register clock device")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lore.kernel.org/r/20220630065638.11183-2-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|