Age | Commit message (Collapse) | Author | Files | Lines |
|
Presently, RDMA devices are always registered within the init network
namespace, even if the associated devlink device's namespace was
changed via a devlink reload. This mismatch leads to discrepancies
between the network namespace of the devlink device and that of the
RDMA device.
Therefore, extend the RDMA device allocation API to optionally take
the net namespace. This isn't limited to devices that support devlink
but allows all users to provide the network namespace if they need to
do so.
If a network namespace is provided during device allocation, it's up
to the caller to make sure the namespace stays valid until
ib_register_device() is called.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Add a little inline helper function
crypto_ahash_tested()
to the internal/hash.h header file to retrieve the tested
status (that is the CRYPTO_ALG_TESTED bit in the cra_flags).
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Make the hash walk functions
crypto_hash_walk_done()
crypto_hash_walk_first()
crypto_hash_walk_last()
public again.
These functions had been removed from the header file
include/crypto/internal/hash.h with commit 7fa481734016
("crypto: ahash - make hash walk functions private to ahash.c")
as there was no crypto algorithm code using them.
With the upcoming crypto implementation for s390 phmac
these functions will be exploited and thus need to be
public within the kernel again.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When user wants to clear a range in cpumask, the only option the API
provides now is a for-loop, like:
for_each_cpu_from(cpu, mask) {
if (cpu >= ncpus)
break;
__cpumask_clear_cpu(cpu, mask);
}
In the bitmap API we have bitmap_clear() for that, which is
significantly faster than a for-loop. Propagate it to cpumasks.
Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Link: https://patch.msgid.link/20250604193947.11834-2-yury.norov@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Pull mount fixes from Al Viro:
"Several mount-related fixes"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
userns and mnt_idmap leak in open_tree_attr(2)
attach_recursive_mnt(): do not lock the covering tree when sliding something under it
replace collect_mounts()/drop_collected_mounts() with a safer variant
|
|
sched_ext performed a kfunc renaming pass in 6.13 and kept the old names
around for compatibility with old binaries. These were scheduled for
cleanup in 6.15 but were missed. Submitting for cleanup in for-next.
Removed the kfuncs, their flags, and any references I could find to them
in doc comments. Left the entries in include/scx/compat.bpf.h as they're
still useful to make new binaries compatible with old kernels.
Tested by applying to my kernel. It builds and a modern version of
scx_lavd loads fine.
Signed-off-by: Jake Hillion <jake@hillion.co.uk>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
percpu variable tag->counters
When loading a module, as long as the module has memory allocation
operations, kmemleak produces a false positive report that resembles the
following:
unreferenced object (percpu) 0x7dfd232a1650 (size 16):
comm "modprobe", pid 1301, jiffies 4294940249
hex dump (first 16 bytes on cpu 2):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace (crc 0):
kmemleak_alloc_percpu+0xb4/0xd0
pcpu_alloc_noprof+0x700/0x1098
load_module+0xd4/0x348
codetag_module_init+0x20c/0x450
codetag_load_module+0x70/0xb8
load_module+0xef8/0x1608
init_module_from_file+0xec/0x158
idempotent_init_module+0x354/0x608
__arm64_sys_finit_module+0xbc/0x150
invoke_syscall+0xd4/0x258
el0_svc_common.constprop.0+0xb4/0x240
do_el0_svc+0x48/0x68
el0_svc+0x40/0xf8
el0t_64_sync_handler+0x10c/0x138
el0t_64_sync+0x1ac/0x1b0
This is because the module can only indirectly reference
alloc_tag_counters through the alloc_tag section, which misleads kmemleak.
However, we don't have a kmemleak ignore interface for percpu allocations
yet. So let's create one and invoke it for tag->counters.
[gehao@kylinos.cn: fix build error when CONFIG_DEBUG_KMEMLEAK=n, s/igonore/ignore/]
Link: https://lkml.kernel.org/r/20250620093102.2416767-1-hao.ge@linux.dev
Link: https://lkml.kernel.org/r/20250619183154.2122608-1-hao.ge@linux.dev
Fixes: 12ca42c23775 ("alloc_tag: allocate percpu counters for module tags dynamically")
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Suren Baghdasaryan <surenb@google.com> [lib/alloc_tag.c]
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We're trying to add a strict regexp for the name format in the spec.
Underscores will not be allowed, dashes should be used instead.
This makes no difference to C (codegen, if used, replaces special
chars in names) but it gives more uniform naming in Python.
Fixes: bc8aeb2045e2 ("Documentation: netlink: add a YAML spec for mptcp")
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250624211002.3475021-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 4ea7e38696c7 ("dropmon: add ability to detect when hardware
drops rx packets") introduced is_drop_point_hw, but the symbol was
never referenced anywhere in the kernel tree and is currently not used
by dropwatch. I could not find, to the best of my abilities, a current
out-of-tree user of this macro.
The definition also contains a syntax error in its for-loop, so any
project that tried to compile against it would fail. Removing the
macro therefore eliminates dead code without breaking existing
users.
Signed-off-by: RubenKelevra <rubenkelevra@gmail.com>
Link: https://patch.msgid.link/20250624165711.1188691-1-rubenkelevra@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
packing.h uses ARRAY_SIZE(), BUILD_BUG_ON_MSG(), min(), max(), and
sizeof_field() without including the headers where they are defined,
potentially causing build failures.
Fix this in packing.h and sort the result.
Signed-off-by: Nathan Lynch <nathan.lynch@amd.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20250624-packing-includes-v1-1-c23c81fab508@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit c54e1d920f04 ("flow_offload: add ops to tc_action_ops for
flow action setup") these are unused.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Link: https://patch.msgid.link/20250624014327.3686873-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for RSS_SET handling in ethnl introduce Netlink
notifications for RSS. Only cover modifications, not creation
and not removal of a context, because the latter may deserve
a different notification type. We should cross that bridge
when we add the support for context add / remove via Netlink.
Link: https://patch.msgid.link/20250623231720.3124717-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ethtool_notify() takes a const void *data argument, which presumably
was intended to pass information from the call site to the subcommand
handler. This argument currently has no users.
Expecting the data to be subcommand-specific has two complications.
Complication #1 is that its not plumbed thru any of the standardized
callbacks. It gets propagated to ethnl_default_notify() where it
remains unused. Coming from the ethnl_default_set_doit() side we pass
in NULL, because how could we have a command specific attribute in
a generic handler.
Complication #2 is that we expect the ethtool_notify() callers to
know what attribute type to pass in. Again, the data pointer is
untyped.
RSS will need to pass the context ID to the notifications.
I think it's a better design if the "subcommand" exports its own
typed interface and constructs the appropriate argument struct
(which will be req_info). Remove the unused data argument from
ethtool_notify() but retain it in a new internal helper which
subcommands can use to build a typed interface.
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250623231720.3124717-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the multicast group's name to the YAML spec.
Without it YNL doesn't know how to subscribe to notifications.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250623231720.3124717-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syszbot reports various ordering issues for lower instance locks and
team lock. Switch to using rtnl lock for protecting team device,
similar to bonding. Based on the patch by Tetsuo Handa.
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot+705c61d60b091ef42c04@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=705c61d60b091ef42c04
Reported-by: syzbot+71fd22ae4b81631e22fd@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=71fd22ae4b81631e22fd
Fixes: 6b1d3c5f675c ("team: grab team lock during team_change_rx_flags")
Link: https://lkml.kernel.org/r/ZoZ2RH9BcahEB9Sb@nanopsycho.orion
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250623153147.3413631-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are places in BPF code which check if a BTF type is an integer
of particular size. This code can be made simpler by using helpers.
Add new btf_type_is_i{32,64} helpers, and simplify code in a few
files. (Suggested by Eduard for a patch which copy-pasted such a
check [1].)
v1 -> v2:
* export less generic helpers (Eduard)
* make subject less generic than in [v1] (Eduard)
[1] https://lore.kernel.org/bpf/7edb47e73baa46705119a23c6bf4af26517a640f.camel@gmail.com/
[v1] https://lore.kernel.org/bpf/20250624193655.733050-1-a.s.protopopov@gmail.com/
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250625151621.1000584-1-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add range tracking for instruction BPF_NEG. Without this logic, a trivial
program like the following will fail
volatile bool found_value_b;
SEC("lsm.s/socket_connect")
int BPF_PROG(test_socket_connect)
{
if (!found_value_b)
return -1;
return 0;
}
with verifier log:
"At program exit the register R0 has smin=0 smax=4294967295 should have
been in [-4095, 0]".
This is because range information is lost in BPF_NEG:
0: R1=ctx() R10=fp0
; if (!found_value_b) @ xxxx.c:24
0: (18) r1 = 0xffa00000011e7048 ; R1_w=map_value(...)
2: (71) r0 = *(u8 *)(r1 +0) ; R0_w=scalar(smin32=0,smax=255)
3: (a4) w0 ^= 1 ; R0_w=scalar(smin32=0,smax=255)
4: (84) w0 = -w0 ; R0_w=scalar(range info lost)
Note that, the log above is manually modified to highlight relevant bits.
Fix this by maintaining proper range information with BPF_NEG, so that
the verifier will know:
4: (84) w0 = -w0 ; R0_w=scalar(smin32=-255,smax=0)
Also updated selftests based on the expected behavior.
Signed-off-by: Song Liu <song@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250625164025.3310203-2-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Initialize unique name for amdisp i2c adapter, which is used
in the platform driver to detect the matching adapter for
i2c_client creation.
Add definition of amdisp i2c adapter name in a new header file
(include/linux/soc/amd/isp4_misc.h) as it is referred in different
driver modules.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250609155601.1477055-3-pratap.nirujogi@amd.com
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
The usual features/cleanups/etc., notably:
- rtw88: IBSS mode for SDIO devices
- rtw89:
- BT coex for MLO/WiFi7
- work on station + P2P concurrency
- ath: fix W=2 export.h warnings
- ath12k: fix scan on multi-radio devices
- cfg80211/mac80211: MLO statistics
- mac80211: S1G aggregation
- cfg80211/mac80211: per-radio RTS threshold
* tag 'wireless-next-2025-06-25' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (171 commits)
wifi: iwlwifi: dvm: fix potential overflow in rs_fill_link_cmd()
iwlwifi: Add missing check for alloc_ordered_workqueue
wifi: iwlwifi: Fix memory leak in iwl_mvm_init()
iwlwifi: api: delete repeated words
iwlwifi: remove unused no_sleep_autoadjust declaration
iwlwifi: Fix comment typo
iwlwifi: use DECLARE_BITMAP macro
iwlwifi: fw: simplify the iwl_fw_dbg_collect_trig()
wifi: iwlwifi: mld: ftm: fix switch end indentation
MAINTAINERS: update iwlwifi git link
wifi: iwlwifi: pcie: fix non-MSIX handshake register
wifi: iwlwifi: mld: don't exit EMLSR when we shouldn't
wifi: iwlwifi: move _iwl_trans_set_bits_mask utilities
wifi: iwlwifi: mld: make iwl_mld_add_all_rekeys void
wifi: iwlwifi: move iwl_trans_pcie_write_mem to iwl-trans.c
wifi: iwlwifi: pcie: move iwl_trans_pcie_dump_regs() to utils.c
wifi: iwlwifi: mld: advertise support for TTLM changes
wifi: iwlwifi: mld: Block EMLSR when scanning on P2P Device
wifi: iwlwifi: mld: use the correct struct size for tracing
wifi: iwlwifi: support RZL platform device ID
...
====================
Link: https://patch.msgid.link/20250625120135.41933-55-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drop kvm_arch_{start,end}_assignment() and all associated code now that
KVM x86 no longer consumes assigned_device_count. Tracking whether or not
a VFIO-assigned device is formally associated with a VM is fundamentally
flawed, as such an association is optional for general usage, i.e. is prone
to false negatives. E.g. prior to commit 2edd9cb79fb3 ("kvm: detect
assigned device via irqbypass manager"), device passthrough via VFIO would
fail to enable IRQ bypass if userspace omitted the formal VFIO<=>KVM
binding.
And device drivers that *need* the VFIO<=>KVM connection, e.g. KVM-GT,
shouldn't be relying on generic x86 tracking infrastructure.
Cc: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20250523011756.3243624-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>:
The series adds support for combined speaker components with one "spk:"
tag in the card->components string. This is a UCM request.
|
|
Merge series from Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>:
Current Kconfig menu at [ALSA for SoC audio support] has no rules.
So, some venders are using menu style, some venders are listed each drivers
on top page, etc. It is difficult to find target vender and/or drivers
because it is very random.
Let's standardize ASoC menu, like below
--- ALSA for SoC audio support
Analog Devices --->
AMD --->
Apple --->
Atmel --->
Au1x ----
Broadcom --->
Cirrus Logic --->
DesignWare --->
Freescale --->
Google --->
Hisilicon --->
...
One concern is *vender folder* alphabetical order vs *vender name*
alphabetical order were different. For example "sunxi" menu is
"Allwinner".
Link: https://lore.kernel.org/r/8734c8bf3l.wl-kuninori.morimoto.gx@renesas.com
|
|
The sev_data_snp_launch_start structure should include a 4-byte
desired_tsc_khz field before the gosvw field, which was missed in the
initial implementation. As a result, the structure is 4 bytes shorter than
expected by the firmware, causing the gosvw field to start 4 bytes early.
Fix this by adding the missing 4-byte member for the desired TSC frequency.
Fixes: 3a45dc2b419e ("crypto: ccp: Define the SEV-SNP commands")
Cc: stable@vger.kernel.org
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Vaishali Thakkar <vaishali.thakkar@suse.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Link: https://lore.kernel.org/r/20250408093213.57962-3-nikunj@amd.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
gs101 differs to other currently supported SoCs in that it has 3 wakeup
mask registers for the 67 external wakeup interrupt pins in alive and
far_alive.
EINT_WAKEUP_MASK 0x3A80 EINT[31:0]
EINT_WAKEUP_MASK2 0x3A84 EINT[63:32]
EINT_WAKEUP_MASK3 0x3A88 EINT[66:64]
Add gs101 specific callbacks and a dedicated gs101_wkup_irq_chip struct to
handle these differences.
The current wakeup mask with upstream is programmed as
WAKEUP_MASK0[0x3A80] value[0xFFFFFFFF]
WAKEUP_MASK1[0x3A84] value[0xF2FFEFFF]
WAKEUP_MASK2[0x3A88] value[0xFFFFFFFF]
Which corresponds to the following wakeup sources:
gpa7-3 vol down
gpa8-1 vol up
gpa10-1 power
gpa8-2 typec-int
Signed-off-by: Peter Griffin <peter.griffin@linaro.org>
Link: https://lore.kernel.org/r/20250619-gs101-eint-mask-v1-2-89438cfd7499@linaro.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
|
|
Commit 37d078e51b4c ("drm/xe/uapi: Split xe_sync types from flags") renamed some DRM_XE_SYNC_*
defines but later commits kept using the old names. Correct them with the new definition.
v2: correct fixes tag and update commit message to explain why (Lucas)
Fixes: 9329f0667215 ("drm/xe/uapi: Use LR abbrev for long-running vms")
Fixes: 4b437893a826 ("drm/xe/uapi: More uAPI documentation additions and cosmetic updates")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Zongyao Bai <zongyao.bai@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://lore.kernel.org/r/20250608230133.1250849-1-shuicheng.lin@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Currently the sdw machine driver uses different way to get the
component name from the DAI name for different codecs in the rtd_init
callback. It means that we need to rely on the rtd_init callback to get
the component name. Add an optional component string to the
asoc_sdw_dai_info struct allows the machine driver to get the component
name directly.
The commit adds the component names for the AMP dais for the preparation
to set card->components string for combined speaker configs.
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Shuming Fan <shumingf@realtek.com>
Link: https://patch.msgid.link/20250625140430.311865-2-yung-chuan.liao@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
There is a PPU0 reset control bit in the same register as the PPU1
reset control. This missing reset control is for the PCK-600 unit
in the SoC. Manual tests show that the reset control indeed exists,
and if not configured, the system will hang when the PCK-600 registers
are accessed.
Add a reset entry for it at the end of the existing ones.
Fixes: 52dbf84857f0 ("dt-bindings: clk: sunxi-ng: document two Allwinner A523 CCUs")
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Link: https://patch.msgid.link/20250619171025.3359384-2-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
|
|
Add definitions for the PCIe Congestion Event object
and the relevant FW command structures.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250619113721.60201-3-mbloch@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Make enum for capability bits of general object types depend on
the type definitions themselves.
Make sure that capabilities in the [64,127] bit range are
properly calculated (type id - 64).
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250619113721.60201-2-mbloch@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
From Patrisious:
This short series from Patrisious extends mlx5 flow steering logic to
allow creation rule creation with priorities in RDMA TRANSPORT tables.
Thanks
Link: https://lore.kernel.org/all/cover.1750148083.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
* mlx5-next: (200 commits)
net/mlx5: fs, add multiple prios to RDMA TRANSPORT steering domain
Linux 6.16-rc2
...
|
|
RDMA TRANSPORT domains were initially limited to a single priority.
This change allows the domains to have multiple priorities, making
it possible to add several rules and control the order in which
they're evaluated.
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/b299cbb4c8678a33da6e6b6988b5bf6145c54b88.1750148083.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Currently in ib_free_cq, it disables IRQ or cancel the CQ work before
driver destroy_cq. This isn't good as a new IRQ or a CQ work can be
submitted immediately after disabling IRQ or canceling CQ work, which
may run concurrently with destroy_cq and cause crashes.
The right flow should be:
1. Driver disables CQ to make sure no new CQ event will be submitted;
2. Disables IRQ or Cancels CQ work in core layer, to make sure no CQ
polling work is running;
3. Free all resources to destroy the CQ.
This patch adds 2 driver APIs:
- pre_destroy_cq(): Disable a CQ to prevent it from generating any new
work completions, but not free any kernel resources;
- post_destroy_cq(): Free all kernel resources.
In ib_free_cq, the IRQ is disabled or CQ work is canceled after
pre_destroy_cq, and before post_destroy_cq.
Fixes: 14d3a3b2498e ("IB: add a proper completion queue abstraction")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Link: https://patch.msgid.link/b5f7ae3d75f44a3e15ff3f4eb2bbdea13e06b97f.1750062328.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Currently, the in-kernel kunit test case timeout is 300 seconds. (There
is a separate timeout mechanism for the whole test execution in
kunit.py, but that's unrelated.) However, tests marked 'slow' or 'very
slow' may timeout, particularly on slower machines.
Implement a multiplier to the test-case timeout, so that slower tests
have longer to complete:
- DEFAULT -> 1x default timeout
- KUNIT_SPEED_SLOW -> 3x default timeout
- KUNIT_SPEED_VERY_SLOW -> 12x default timeout
A further change is planned to allow user configuration of the
default/base timeout to allow people with faster or slower machines to
adjust these to their use-cases.
Link: https://lore.kernel.org/r/20250614084711.2654593-2-davidgow@google.com
Signed-off-by: Ujwal Jain <ujwaljain@google.com>
Co-developed-by: David Gow <davidgow@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
UBLK_F_SUPPORT_ZERO_COPY has a very old comment describing the initial
idea for how zero-copy would be implemented. The actual implementation
added in commit 1f6540e2aabb ("ublk: zc register/unregister bvec") uses
io_uring registered buffers rather than shared memory mapping.
Remove the inaccurate remarks about mapping ublk request memory into the
ublk server's address space and requiring 4K block size. Replace them
with a description of the current zero-copy mechanism.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250621171015.354932-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When a C++ file compiled with -Wc++11-narrowing includes the UAPI header
linux/ublk_cmd.h, ublk_sqe_addr_to_auto_buf_reg()'s assignments of u64
values to u8, u16, and u32 fields result in compiler warnings. Add
explicit casts to the intended types to avoid these warnings. Drop the
unnecessary bitmasks.
Reported-by: Uday Shankar <ushankar@purestorage.com>
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Fixes: 99c1e4eb6a3f ("ublk: register buffer to local io_uring with provided buf index via UBLK_F_AUTO_BUF_REG")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250621162842.337452-1-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The only reason why alpha can't do what sparc et.al. are doing
is that include/asm-generic/param.h relies upon the value of HZ
set for userland header in uapi/asm/param.h being 100.
We need that value to define USER_HZ and we need that definition
to outlive the redefinition of HZ kernel-side. And alpha needs
it to be 1024, not 100 like everybody else.
So let's add __USER_HZ to uapi/asm-generic/param.h, defaulting to
100 and used to define HZ. That way include/asm-generic/param.h
can use that thing instead of open-coding it - it won't be affected
by undefining and redefining HZ.
That done, alpha asm/param.h can be removed and uapi/asm/param.h
switched to defining __USER_HZ and EXEC_PAGESIZE and then including
<asm-generic/param.h> - asm/param.h will resolve to uapi/asm/param.h,
which pulls <asm-generic/param.h>, which will do the right thing
both in the kernel and userland contexts.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
If a userspace application just include <linux/vm_sockets.h> will fail
to build with the following errors:
/usr/include/linux/vm_sockets.h:182:39: error: invalid application of ‘sizeof’ to incomplete type ‘struct sockaddr’
182 | unsigned char svm_zero[sizeof(struct sockaddr) -
| ^~~~~~
/usr/include/linux/vm_sockets.h:183:39: error: ‘sa_family_t’ undeclared here (not in a function)
183 | sizeof(sa_family_t) -
|
Include <sys/socket.h> for userspace (guarded by ifndef __KERNEL__)
where `struct sockaddr` and `sa_family_t` are defined.
We already do something similar in <linux/mptcp.h> and <linux/if.h>.
Fixes: d021c344051a ("VSOCK: Introduce VM Sockets")
Reported-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250623100053.40979-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 3f4f1f8a1ab7 ("capabilities: remove cap_mmap_file()")
removed the implementation but leave declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
While configuring a vxlan tunnel in a system with a i40e NIC driver, I
observe the following deadlock:
WARNING: possible recursive locking detected
6.16.0-rc2.net-next-6.16_92d87230d899+ #13 Tainted: G E
--------------------------------------------
kworker/u256:4/1125 is trying to acquire lock:
ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e
but task is already holding lock:
ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:739) udp_tunnel
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&utn->lock);
lock(&utn->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by kworker/u256:4/1125:
#0: ffff8892910ca158 ((wq_completion)udp_tunnel_nic){+.+.}-{0:0}, at: process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3213)
#1: ffffc900244efd30 ((work_completion)(&utn->work)){+.+.}-{0:0}, at: process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3214)
#2: ffffffff9a14e290 (rtnl_mutex){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:737) udp_tunnel
#3: ffff88921ab9c8c8 (&utn->lock){+.+.}-{4:4}, at: udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:739) udp_tunnel
stack backtrace:
Hardware name: Dell Inc. PowerEdge R7525/0YHMCJ, BIOS 2.2.5 04/08/2021
i
Call Trace:
<TASK>
dump_stack_lvl (/home/pabeni/net-next/lib/dump_stack.c:123)
print_deadlock_bug (/home/pabeni/net-next/kernel/locking/lockdep.c:3047)
validate_chain (/home/pabeni/net-next/kernel/locking/lockdep.c:3901)
__lock_acquire (/home/pabeni/net-next/kernel/locking/lockdep.c:5240)
lock_acquire.part.0 (/home/pabeni/net-next/kernel/locking/lockdep.c:473 /home/pabeni/net-next/kernel/locking/lockdep.c:5873)
__mutex_lock (/home/pabeni/net-next/kernel/locking/mutex.c:604 /home/pabeni/net-next/kernel/locking/mutex.c:747)
i40e_udp_tunnel_set_port (/home/pabeni/net-next/include/net/udp_tunnel.h:343 /home/pabeni/net-next/drivers/net/ethernet/intel/i40e/i40e_main.c:13013) i40e
udp_tunnel_nic_device_sync_by_port (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:230 /home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:249) udp_tunnel
__udp_tunnel_nic_device_sync.part.0 (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:292) udp_tunnel
udp_tunnel_nic_device_sync_work (/home/pabeni/net-next/net/ipv4/udp_tunnel_nic.c:742) udp_tunnel
process_one_work (/home/pabeni/net-next/kernel/workqueue.c:3243)
worker_thread (/home/pabeni/net-next/kernel/workqueue.c:3315 /home/pabeni/net-next/kernel/workqueue.c:3402)
kthread (/home/pabeni/net-next/kernel/kthread.c:464)
AFAICS all the existing callsites of udp_tunnel_nic_set_port_priv() are
already under the utn lock scope, avoid (re-)acquiring it in such a
function.
Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/95a827621ec78c12d1564ec3209e549774f9657d.1750675978.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For systems using a sched_ext scheduler and has panic_on_rcu_stall
enabled, try kicking out the current scheduler before issuing a panic.
While there are numerous reasons for RCU CPU stalls that are not
directly attributed to the scheduler, deferring the panic gives
sched_ext an opportunity to provide additional debug info when ejecting
the current scheduler. Also, handling the event more gracefully allows
us to potentially recover the system instead of incurring additional
down time.
Suggested-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: David Dai <david.dai@linux.dev>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Header miscdevice.h includes linux/device.h which has definations for
below two forward declarations directly or indirectly:
struct device;
struct attribute_group;
Remove these redundant forward declarations from miscdevice.h
Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250620-fix_mischar-v1-1-6c2716bbf1fa@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
vmci_qpair_dequeue(), vmci_qpair_enqueue() and vmci_qpair_peek()
were added in 2013 by
commit 06164d2b72aa ("VMCI: queue pairs implementation.")
but have remained unused.
Remove them.
(The iov version of those functions is used)
Signed-off-by: "Dr. David Alan Gilbert" <linux@treblig.org>
Link: https://lore.kernel.org/r/20250614010344.636076-4-linux@treblig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
vmci_doorbell_notify() was added in 2013 by
commit 83e2ec765be0 ("VMCI: doorbell implementation.")
but has remained unused.
Remove it.
Signed-off-by: "Dr. David Alan Gilbert" <linux@treblig.org>
Link: https://lore.kernel.org/r/20250614010344.636076-3-linux@treblig.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Like s390 and the jailhouse hypervisor, LoongArch's PCI architecture allows
passing isolated PCI functions to a guest OS instance. So it is possible
that there is a multi-function device without function 0 for the host or
guest.
Allow probing such functions by adding a IS_ENABLED(CONFIG_LOONGARCH) case
in the hypervisor_isolated_pci_functions() helper.
This is similar to commit 189c6c33ff42 ("PCI: Extend isolated function
probing to s390").
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250624062927.4037734-1-chenhuacai@loongson.cn
|
|
Add a special file descriptor indicating the root of the pidfs
filesystem.
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
A poorly implemented DisplayPort Alt Mode port partner can indicate
that its pin assignment capabilities are greater than the maximum
value, DP_PIN_ASSIGN_F. In this case, calls to pin_assignment_show
will cause a BRK exception due to an out of bounds array access.
Prevent for loop in pin_assignment_show from accessing
invalid values in pin_assignments by adding DP_PIN_ASSIGN_MAX
value in typec_dp.h and using i < DP_PIN_ASSIGN_MAX as a loop
condition.
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable <stable@kernel.org>
Signed-off-by: RD Babiera <rdbabiera@google.com>
Reviewed-by: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20250618224943.3263103-2-rdbabiera@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The commit below added a new helper, but omitted to move (and add) the
corressponding kernel-doc. Do it now.
Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org>
Fixes: 2b5eac0f8c6e ("tty: introduce and use tty_port_tty_vhangup() helper")
Link: https://lore.kernel.org/all/b23d566c-09dc-7374-cc87-0ad4660e8b2e@linux.intel.com/
Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Link: https://lore.kernel.org/r/20250624080641.509959-6-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
block_write_end() looks like it can be used as a ->write_end()
implementation. However, it can't as it does not unlock nor put
the folio. Since it does not use the 'file', 'mapping' nor 'fsdata'
arguments, remove them.
Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Link: https://lore.kernel.org/20250624132130.1590285-1-willy@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a marker for an invalid file descriptor.
Link: https://lore.kernel.org/20250624-work-pidfs-fhandle-v2-7-d02a04858fe3@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Don't jump somewhere into the middle of the reserved range. We're still
able to change that value it won't be that widely used yet. If not, we
can revert.
Signed-off-by: Christian Brauner <brauner@kernel.org>
|