| Age | Commit message (Collapse) | Author | Files | Lines |
|
To end an ongoing game of whack-a-mole between KVM and syzkaller, WARN on
illegally cancelling a pending nested VM-Enter if and only if userspace
has NOT gained control of the vCPU since the nested run was initiated. As
proven time and time again by syzkaller, userspace can clobber vCPU state
so as to force a VM-Exit that violates KVM's architectural modelling of
VMRUN/VMLAUNCH/VMRESUME.
To detect that userspace has gained control, while minimizing the risk of
operating on stale data, convert nested_run_pending from a pure boolean to
a tri-state of sorts, where '0' is still "not pending", '1' is "pending",
and '2' is "pending but untrusted". Then on KVM_RUN, if the flag is in
the "trusted pending" state, move it to "untrusted pending".
Note, moving the state to "untrusted" even if KVM_RUN is ultimately
rejected is a-ok, because for the "untrusted" state to matter, KVM must
get past kvm_x86_vcpu_pre_run() at some point for the vCPU.
Reviewed-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260312234823.3120658-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix kerneldocs for gpio-timberdale and gpio-nomadik
- clear the "requested" flag in error path in gpiod_request_commit()
- call of_xlate() if provided when setting up shared GPIOs
- handle pins shared by child firmware nodes of consumer devices
- fix return value check in gpio-qixis-fpga
- fix suspend on gpio-mxc
- fix gpio-microchip DT bindings
* tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
dt-bindings: gpio: fix microchip #interrupt-cells
gpio: shared: shorten the critical section in gpiochip_setup_shared()
gpio: mxc: map Both Edge pad wakeup to Rising Edge
gpio: qixis-fpga: Fix error handling for devm_regmap_init_mmio()
gpio: shared: handle pins shared by child nodes of devices
gpio: shared: call gpio_chip::of_xlate() if set
gpiolib: clear requested flag if line is invalid
gpio: nomadik: repair some kernel-doc comments
gpio: timberdale: repair kernel-doc comments
gpio: Fix resource leaks on errors in gpiochip_add_data_with_key()
|
|
Move nested_run_pending field present in both svm_nested_state and
nested_vmx to the common kvm_vcpu_arch. This allows for common code to
use without plumbing it through per-vendor helpers.
nested_run_pending remains zero-initialized, as the entire kvm_vcpu
struct is, and all further accesses are done through vcpu->arch instead
of svm->nested or vmx->nested.
No functional change intended.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
[sean: expand the commend in the field declaration]
Link: https://patch.msgid.link/20260312234823.3120658-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
- Implement a basic static call trampoline to fix CFI failures with the
generic implementation
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Use static call trampolines when kCFI is enabled
|
|
This complements the commit 18f7686a1ce6 ("selftests/seccomp:
Add hard-coded __NR_uretprobe for x86_64").
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://patch.msgid.link/ac_BAMSggw-_ABPE@redhat.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Alexei Starovoitov says:
====================
bpf: Prep patches for static stack liveness.
v4->v5:
- minor test fixup
v3->v4:
- fixed invalid recursion detection when calback is called multiple times
v3: https://lore.kernel.org/bpf/20260402212856.86606-1-alexei.starovoitov@gmail.com/
v2->v3:
- added recursive call detection
- fixed ubsan warning
- removed double declaration in the header
- added Acks
v2: https://lore.kernel.org/bpf/20260402061744.10885-1-alexei.starovoitov@gmail.com/
v1->v2:
. fixed bugs spotted by Eduard, Mykyta, claude and gemini
. fixed selftests that were failing in unpriv
. gemini(sashiko) found several precision improvements in patch 6,
but they made no difference in real programs.
v1: https://lore.kernel.org/bpf/20260401021635.34636-1-alexei.starovoitov@gmail.com/
First 6 prep patches for static stack liveness.
. do src/dst_reg validation early and remove defensive checks
. sort subprog in topo order. We wanted to do this long ago
to process global subprogs this way and in other cases.
. Add constant folding pass that computes map_ptr, subprog_idx,
loads from readonly maps, and other constants that fit into 32-bit
. Use these constants to eliminate dead code. Replace predicted
conditional branches with "jmp always". That reduces JIT prog size.
. Add two helpers that return access size from their arguments.
====================
Link: https://patch.msgid.link/20260403024422.87231-1-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The static stack liveness analysis needs to know how many bytes a
helper or kfunc accesses through a stack pointer argument, so it can
precisely mark the affected stack slots as stack 'def' or 'use'.
Add bpf_helper_stack_access_bytes() and bpf_kfunc_stack_access_bytes()
which resolve the access size for a given call argument.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-7-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Move several helpers to header as preparation for
the subsequent stack liveness patches.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-6-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add two passes before the main verifier pass:
bpf_compute_const_regs() is a forward dataflow analysis that tracks
register values in R0-R9 across the program using fixed-point
iteration in reverse postorder. Each register is tracked with
a six-state lattice:
UNVISITED -> CONST(val) / MAP_PTR(map_index) /
MAP_VALUE(map_index, offset) / SUBPROG(num) -> UNKNOWN
At merge points, if two paths produce the same state and value for
a register, it stays; otherwise it becomes UNKNOWN.
The analysis handles:
- MOV, ADD, SUB, AND with immediate or register operands
- LD_IMM64 for plain constants, map FDs, map values, and subprogs
- LDX from read-only maps: constant-folds the load by reading the
map value directly via bpf_map_direct_read()
Results that fit in 32 bits are stored per-instruction in
insn_aux_data and bitmasks.
bpf_prune_dead_branches() uses the computed constants to evaluate
conditional branches. When both operands of a conditional jump are
known constants, the branch outcome is determined statically and the
instruction is rewritten to an unconditional jump.
The CFG postorder is then recomputed to reflect new control flow.
This eliminates dead edges so that subsequent liveness analysis
doesn't propagate through dead code.
Also add runtime sanity check to validate that precomputed
constants match the verifier's tracked state.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-5-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add few tests for topo sort:
- linear chain: main -> A -> B
- diamond: main -> A, main -> B, A -> C, B -> C
- mixed global/static: main -> global -> static leaf
- shared callee: main -> leaf, main -> global -> leaf
- duplicate calls: main calls same subprog twice
- no calls: single subprog
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-4-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a pass that sorts subprogs in topological order so that iterating
subprog_topo_order[] walks leaf subprogs first, then their callers.
This is computed as a DFS post-order traversal of the CFG.
The pass runs after check_cfg() to ensure the CFG has been validated
before traversing and after postorder has been computed to avoid
walking dead code.
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-3-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Instead of checking src/dst range multiple times during
the main verifier pass do them once.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260403024422.87231-2-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Pull drm fixes from Dave Airlie:
"Hopefully no Easter eggs in this bunch of fixes. Usual stuff across
the amd/intel with some misc bits. Thanks to Thorsten and Alex for
making sure a regression fix that was hanging around in process land
finally made it in, that is probably the biggest change in here.
core:
- revert unplug/framebuffer fix as it caused problems
- compat ioctl speculation fix
bridge:
- refcounting fix
sysfb:
- error handling fix
amdgpu:
- fix renoir audio regression
- UserQ fixes
- PASID handling fix
- S4 fix for smu11 chips
- Misc small fixes
amdkfd:
- Non-4K page fixes
i915:
- Fix for #12045: Huawei Matebook E (DRR-WXX): Persistent Black
Screen on Boot with i915 and Gen11: Modesetting and Backlight
Control Malfunction
- Fix for #15826: i915: Raptor Lake-P [UHD Graphics] display
flicker/corruption on eDP panel
- Use crtc_state->enhanced_framing properly on ivb/hsw CPU eDP
xe:
- uapi: Accept canonical GPU addresses in xe_vm_madvise_ioctl
- Disallow writes to read-only VMAs
- PXP fixes
- Disable garbage collector work item on SVM close
- void memory allocations in xe_device_declare_wedged
qaic:
- hang fix
ast:
- initialisation fix"
* tag 'drm-fixes-2026-04-03' of https://gitlab.freedesktop.org/drm/kernel: (28 commits)
drm/amd/display: Wire up dcn10_dio_construct() for all pre-DCN401 generations
drm/ioc32: stop speculation on the drm_compat_ioctl path
drm/sysfb: Fix efidrm error handling and memory type mismatch
drm/i915/dp: Use crtc_state->enhanced_framing properly on ivb/hsw CPU eDP
drm/i915/cdclk: Do the full CDCLK dance for min_voltage_level changes
drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack size to GPU page size
drm/amdgpu: Fix wait after reset sequence in S4
drm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()
drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KB
drm/amdgpu/userq: fix memory leak in MQD creation error paths
drm/amd: Fix MQD and control stack alignment for non-4K
drm/amdkfd: Align expected_queue_size to PAGE_SIZE
drm/amdgpu: fix the idr allocation flags
drm/amdgpu: validate doorbell_offset in user queue creation
drm/amdgpu/pm: drop SMU driver if version not matched messages
drm/xe: Avoid memory allocations in xe_device_declare_wedged()
drm/xe: Disable garbage collector work item on SVM close
drm/xe/pxp: Don't allow PXP on older PTL GSC FWs
drm/xe/pxp: Clear restart flag in pxp_start after jumping back
drm/xe/pxp: Remove incorrect handling of impossible state during suspend
...
|
|
Add some unit test for state machine when in SSIF_ABORTING state.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403143939.434017-1-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
|
|
Cross-merge BPF and other fixes after downstream PR.
Minor conflict in kernel/bpf/verifier.c
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add the OLPC XO-1 RTC compatible string to the trivial-rtc schema
instead of creating a standalone binding file, as it only requires
a compatible property with no additional configuration.
Signed-off-by: Anushka Badhe <anushkabadhe@gmail.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260325093003.44051-1-anushkabadhe@gmail.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
The RTC block found in the SC2730 PMIC is compatible with the one found
in the SC2731 PMIC.
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Otto Pflüger <otto.pflueger@abscue.de>
Link: https://patch.msgid.link/20260329-sc27xx-mfd-cells-v3-1-9158dee41f74@abscue.de
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
On SMP systems alternative_instructions() frees memory occupied by
smp_locks section immediately after patching the lock instructions.
The memory is freed using free_init_pages() that calls free_reserved_area()
that essentially does __free_page() for every page in the range.
Up until recently it didn't update memblock state so in cases when
CONFIG_ARCH_KEEP_MEMBLOCK is enabled (on x86 it is selected by
INTEL_TDX_HOST), the state of memblock and the memory map would be
inconsistent.
Additionally, with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled, freeing of
smp_locks happens before the memory map is fully initialized and freeing
reserved memory may cause an access to not-yet-initialized struct page when
__free_page() searches for a buddy page.
Following the discussion in [1], implementation of memblock_free_late() and
free_reserved_area() was unified to ensure that reserved memory that's
freed after memblock transfers the pages to the buddy allocator is actually
freed and that the memblock and the memory map are consistent. As a part of
these changes, free_reserved_area() now WARN()s when it is called before
the initialization of the memory map is complete.
The memory map is fully initialized in page_alloc_init_late() that
completes before initcalls are executed, so it is safe to free reserved
memory in any initcall except early_initcall().
Move freeing of smp_locks section to an initcall to ensure it will happen
after the memory map is fully initialized. Since it does not matter which
exactly initcall to use and the code lives in arch/, pick arch_initcall.
[1] https://lore.kernel.org/all/ec2aaef14783869b3be6e3c253b2dcbf67dbc12a.camel@kernel.crashing.org
Reported-By: Bert Karwatzki <spasswolf@web.de>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202603302154.b50adaf1-lkp@intel.com
Tested-By: Bert Karwatzki <spasswolf@web.de>
Link: https://lore.kernel.org/r/20260327140109.7561-1-spasswolf@web.de
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Fixes: b2129a39511b ("memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y")
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
|
|
aravindanilraj0702@gmail.com <aravindanilraj0702@gmail.com> says:
From: Aravind Anilraj <aravindanilraj0702@gmail.com>
This series fixes MCLK resource leaks in the platform_clock_control()
implementations for bytcr_rt5640, bytcr_rt5651, and cht_bsw_rt5672.
In the SND_SOC_DAPM_EVENT_ON() path, clk_prepare_enable() is called to
enable MCLK, but subsequent failures in codec clock configuration (eg:
*_prepare_and_enable_pll1() or snd_soc_dai_set_sysclk()) return without
disabling the clock, leaking a reference.
Patches 1-3 fix this by adding the missing clk_disable_unprepare() calls
in the relevant error paths, ensuring proper symmetry between enable and
disable operations within the EVENT_ON scope.
Patch 4 moves unrelated logging changes into a separate patch and
standardizes error messages.
|
|
Standardize the error logging in platform_clock_control() by adding
missing newline characters to dev_err() strings. Additionally, include
the return code in the error messages to assist with debugging.
Signed-off-by: Aravind Anilraj <aravindanilraj0702@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260401220507.23557-5-aravindanilraj0702@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
If snd_soc_dai_set_pll() or snd_soc_dai_set_sysclk() fail inside the
EVENT_ON path, the function returns without calling
clk_disable_unprepare() on ctx->mclk, which was already enabled earlier
in the same code path. Add the missing clk_disable_unprepare() calls
before returning the error.
Signed-off-by: Aravind Anilraj <aravindanilraj0702@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260401220507.23557-4-aravindanilraj0702@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
If byt_rt5651_prepare_and_enable_pll1() fails, the function returns
without calling clk_disable_unprepare() on priv->mclk, which was
already enabled earlier in the same code path. Add the missing
cleanup call to prevent the clock from leaking.
Signed-off-by: Aravind Anilraj <aravindanilraj0702@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260401220507.23557-3-aravindanilraj0702@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
If byt_rt5640_prepare_and_enable_pll1() fails, the function returns
without calling clk_disable_unprepare() on priv->mclk, which was
already enabled earlier in the same code path. Add the missing
cleanup call to prevent the clock from leaking.
Signed-off-by: Aravind Anilraj <aravindanilraj0702@gmail.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20260401220507.23557-2-aravindanilraj0702@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add hw_params callback to dynamically switch DAI format between I2S
and PDM based on audio stream format. When DSD formats are detected,
the DAI format is switched to PDM mode.
Signed-off-by: Chancel Liu <chancel.liu@nxp.com>
Link: https://patch.msgid.link/20260326055614.3614104-1-chancel.liu@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Export the Q7.8 volume control helpers to allow reuse
by other ASoC drivers. These functions handle 16-bit
signed Q7.8 fixed-point format values for volume controls.
Changes include:
- Rename q78_get_volsw to sdca_asoc_q78_get_volsw
- Rename q78_put_volsw to sdca_asoc_q78_put_volsw
- Add a convenience macro SDCA_SINGLE_Q78_TLV and
SDCA_DOUBLE_Q78_TLV for creating mixer controls
This allows other ASoC drivers to easily implement controls
using the Q7.8 fixed-point format without duplicating code.
Signed-off-by: Niranjan H Y <niranjan.hy@ti.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260401132148.2367-1-niranjan.hy@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Use a flexible array member and struct_size to use one allocation.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20260402025040.93569-1-rosenp@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add the retaskable jack Function to the list of Functions supported by
the class driver, it shouldn't require anything that isn't already
supported.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260327162732.877257-1-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Replace open-coded cluster chain walking logic with exfat_chain_advance()
across exfat_readdir, exfat_find_dir_entry, exfat_count_dir_entries,
exfat_search_empty_slot and exfat_check_dir_empty.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Introduce exfat_chain_advance() to walk a exfat_chain structure by a
given step, updating both ->dir and ->size fields atomically. This
helper handles both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN modes with
proper boundary checking.
Suggested-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Since exfat_get_next_cluster has been updated, no callers pass a NULL
pointer to exfat_ent_get, so remove the handling logic for this case.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Replace the custom exfat_walk_fat_chain() function and open-coded
FAT chain walking logic with the exfat_cluster_walk() helper across
exfat_find_location, __exfat_get_dentry_set, and exfat_map_cluster.
Suggested-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Introduce exfat_cluster_walk() to walk the FAT chain by a given step,
handling both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN modes. Also
redefine exfat_get_next_cluster as a thin wrapper around it for
backward compatibility.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
When renaming a file in-place to a shorter name, exfat_remove_entries
marks excess entries as DELETED, but es->num_entries is not updated
accordingly. As a result, exfat_update_dir_chksum iterates over the
deleted entries and computes an incorrect checksum.
This does not lead to persistent corruption because mark_inode_dirty()
is called afterward, and __exfat_write_inode later recomputes the
checksum using the correct num_entries value.
Fix by setting es->num_entries = num_entries in exfat_init_ext_entry.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Make various changes to the documentation formatting to avoid docs
build errors and otherwise improve the produced output format:
- use bullets for lists
- don't use a '.' at the end of echo commands
- fix indentation
Documentation/admin-guide/nfs/pnfs-block-server.rst:55: ERROR: Unexpected indentation. [docutils]
Documentation/admin-guide/nfs/pnfs-scsi-server.rst:37: ERROR: Unexpected indentation. [docutils]
Fixes: 6a97f70b45e7 ("NFSD: Enforce timeout on layout recall and integrate lease manager fencing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The file contains a spelling error in a source comment (occured).
Typos in comments reduce readability and make text searches less reliable
for developers and maintainers.
Replace 'occured' with 'occurred' in the affected comment. This is a
comment-only cleanup and does not change behavior.
Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The file contains a spelling error in a source comment (occured).
Typos in comments reduce readability and make text searches less reliable
for developers and maintainers.
Replace 'occured' with 'occurred' in the affected comment. This is a
comment-only cleanup and does not change behavior.
Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The callback channel's rpc_program, rpc_version, rpc_stat,
and per-procedure counts are declared as file-scope statics in
nfs4callback.c, shared across all network namespaces.
Forechannel RPC statistics are already maintained per-netns
(via nfsd_svcstats in struct nfsd_net); the backchannel
has no such separation. When backchannel statistics are
eventually surfaced to userspace, the global counters would
expose cross-namespace data.
Allocate per-netns copies of these structures through a new
opaque struct nfsd_net_cb, managed by nfsd_net_cb_init()
and nfsd_net_cb_shutdown(). The struct definition is private
to nfs4callback.c; struct nfsd_net holds only a pointer.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The callback RPC procedure table uses NFSPROC4_CB_##call for
p_statidx, which maps CB_NULL to index 0 and every
compound-based callback (CB_RECALL, CB_LAYOUT, CB_OFFLOAD,
etc.) to index 1. All compound callback operations therefore
share a single statistics counter, making per-operation
accounting impossible.
Assign p_statidx from the NFSPROC4_CLNT_##proc enum instead,
giving each callback operation its own counter slot. The
counts array is already sized by ARRAY_SIZE(nfs4_cb_procedures),
so no allocation change is needed.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
svc_rdma_build_read_segment() constructs RDMA Read sink
buffers by consuming pages one-at-a-time from rq_pages[]
and building one bvec per page. A 64KB NFS READ payload
produces 16 separate bvecs, 16 DMA mappings, and
potentially multiple RDMA Read WRs (on platforms with
4KB pages).
A single higher-order allocation followed by split_page()
yields physically contiguous memory while preserving
per-page refcounts. A single bvec spanning the contiguous
range causes rdma_rw_ctx_init_bvec() to take the
rdma_rw_init_single_wr_bvec() fast path: one DMA mapping,
one SGE, one WR.
The split sub-pages replace the original rq_pages[] entries,
so all downstream page tracking, completion handling, and
xdr_buf assembly remain unchanged.
Allocation uses __GFP_NORETRY | __GFP_NOWARN and falls back
through decreasing orders. If even order-1 fails, the
existing per-page path handles the segment.
When nr_pages is not a power of two, get_order() rounds up
and the allocation yields more pages than needed. The extra
split pages replace existing rq_pages[] entries (freed via
put_page() first), so there is no net increase in per-
request page consumption. Successive segments reuse the
same padding slots, preventing accumulation. The
rq_maxpages guard rejects any allocation that would
overrun the array, falling back to the per-page path.
Under memory pressure, __GFP_NORETRY causes the higher-
order allocation to fail without stalling.
The contiguous path is attempted when the segment starts
page-aligned (rc_pageoff == 0) and spans at least two
pages. NFS WRITE segments carry application-modified byte
ranges of arbitrary length, so the optimization is not
restricted to power-of-two page counts.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
svc_rqst_replace_page() releases displaced pages through a
per-rqst folio batch, but exposes the add-or-flush sequence
directly. svc_tcp_restore_pages() releases displaced pages
individually with put_page().
Introduce svc_rqst_page_release() to encapsulate the
batched release mechanism. Convert svc_rqst_replace_page()
and svc_tcp_restore_pages() to use it. The latter now
benefits from the same batched release that
svc_rqst_replace_page() already uses.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
Long-running tests indicate that this logging can occasionally disrupt
timing and lead to request/response corruption.
Irq handler need to be executed as fast as possible,
most I2C slave IRQ implementations are byte-level, logging here
can significantly affect transfer behavior and timing. It is recommended
to use dev_dbg() for these messages.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-4-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
|
|
A truncated response, caused by host power-off, or other conditions,
can lead to message desynchronization.
Raw trace data (STOP loss scenario, add state transition comment):
1. T-1: Read response phase (SSIF_RES_SENDING)
8271.955342 WR_RCV [03] <- Read polling cmd
8271.955348 RD_REQ [04] <== SSIF_RES_SENDING <- start sending response
8271.955436 RD_PRO [b4]
8271.955527 RD_PRO [00]
8271.955618 RD_PRO [c1]
8271.955707 RD_PRO [00]
8271.955814 RD_PRO [ad] <== SSIF_RES_SENDING <- last byte
<- !! STOP lost (truncated response)
2. T: New Write request arrives, BMC still in SSIF_RES_SENDING
8271.967973 WR_REQ [] <== SSIF_RES_SENDING >> SSIF_ABORTING <- log: unexpected WR_REQ in RES_SENDING
8271.968447 WR_RCV [02] <== SSIF_ABORTING <- do nothing
8271.968452 WR_RCV [02] <== SSIF_ABORTING <- do nothing
8271.968454 WR_RCV [18] <== SSIF_ABORTING <- do nothing
8271.968456 WR_RCV [01] <== SSIF_ABORTING <- do nothing
8271.968458 WR_RCV [66] <== SSIF_ABORTING <- do nothing
8271.978714 STOP [] <== SSIF_ABORTING >> SSIF_READY <- log: unexpected SLAVE STOP in state=SSIF_ABORTING
3. T+1: Next Read polling, treated as a fresh transaction
8271.979125 WR_REQ [] <== SSIF_READY >> SSIF_START
8271.979326 WR_RCV [03] <== SSIF_START >> SSIF_SMBUS_CMD <- smbus_cmd=0x03
8271.979331 RD_REQ [04] <== SSIF_RES_SENDING <- sending response
8271.979427 RD_PRO [b4] <- !! this is T's stale response -> desynchronization
When in SSIF_ABORTING state, a newly arrived command should still be
handled to avoid dropping the request or causing message
desynchronization.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-3-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
|
|
copy_to_user() returns the number of bytes that could not be copied,
with a non-zero value indicating a partial or complete failure. The
current code only checks for negative return values and treats all
non-negative results as success.
Treating any positive return value from copy_to_user() as
an error and returning -EFAULT.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-2-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
|
|
The response timer can stay armed across device teardown. If it fires after
remove, the callback dereferences the SSIF context and the i2c client after
teardown has started.
Cancel the timer in remove so the callback cannot run after the device is
unregistered.
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-1-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
|
|
component_dais[RSND_MAX_COMPONENT] is initially zero-initialized
and later populated in rsnd_dai_of_node(). However, the existing boundary check:
if (i >= RSND_MAX_COMPONENT)
does not guarantee that the last valid element remains zero. As a result,
the loop can rely on component_dais[RSND_MAX_COMPONENT] being zero,
which may lead to an out-of-bounds access.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 547b02f74e4a ("ASoC: rsnd: enable multi Component support for Audio Graph Card/Card2")
Signed-off-by: Denis Rastyogin <gerben@altlinux.org>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/20260327103311.459239-1-gerben@altlinux.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Currently there is no mechanism to read dmic_num in mach_params
structure. In this scenario mach_params->dmic_num check always
returns 0.
Remove unnecessary condition check for mach_params->dmic_num.
Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com>
Link: https://patch.msgid.link/20260403063452.159800-1-Vijendar.Mukunda@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The MSRs, SMI_COUNT and PPERF, are model-specific MSRs. A very long
CPU ID list is maintained to indicate the supported platforms. With more
and more platforms being introduced, new CPU IDs have to be kept adding.
Also, the old kernel has to be updated to apply the new CPU ID.
The MSRs have been introduced for a long time. There is no plan to
change them in the near future. Furthermore, the current code utilizes
rdmsr_safe() to check the availability of MSRs before using it.
Make them on by default. It should be good enough to only rely on the
rdmsr_safe() to check their availability for both existing and future
platforms.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Co-developed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260327052844.818218-1-dapeng1.mi@linux.intel.com
|
|
Delayed dequeue feature aims to reduce the negative lag of a dequeued
task while sleeping but it can happens that newly enqueued tasks will
move backward the avg vruntime and increase its negative lag.
When the delayed dequeued task wakes up, it has more neg lag compared
to being dequeued immediately or to other tasks that have been
dequeued just before theses new enqueues.
Ensure that the negative lag of a delayed dequeued task doesn't
increase during its delayed dequeued phase while waiting for its neg
lag to diseappear. Similarly, we remove any positive lag that the
delayed dequeued task could have gain during thsi period.
Short slice tasks are particularly impacted in overloaded system.
Test on snapdragon rb5:
hackbench -T -p -l 16000000 -g 2 1> /dev/null &
cyclictest -t 1 -i 2777 -D 333 --policy=fair --mlock -h 20000 -q
The scheduling latency of cyclictest is:
tip/sched/core tip/sched/core +this patch
cyclictest slice (ms) (default)2.8 8 8
hackbench slice (ms) (default)2.8 20 20
Total Samples | 115632 119733 119806
Average (us) | 364 64(-82%) 61(- 5%)
Median (P50) (us) | 60 56(- 7%) 56( 0%)
90th Percentile (us) | 1166 62(-95%) 62( 0%)
99th Percentile (us) | 4192 73(-98%) 72(- 1%)
99.9th Percentile (us) | 8528 2707(-68%) 1300(-52%)
Maximum (us) | 17735 14273(-20%) 13525(- 5%)
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260331162352.551501-1-vincent.guittot@linaro.org
|
|
Use helper sched_energy_enabled() everywhere we want to test if EAS is
enabled instead of mixing sched_energy_enabled() and direct call to
static_branch_unlikely().
No functional change
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260327132013.2800517-1-vincent.guittot@linaro.org
|
|
Add logic to handle migrating a blocked waiter to a remote
cpu where the lock owner is runnable.
Additionally, as the blocked task may not be able to run
on the remote cpu, add logic to handle return migration once
the waiting task is given the mutex.
Because tasks may get migrated to where they cannot run, also
modify the scheduling classes to avoid sched class migrations on
mutex blocked tasks, leaving find_proxy_task() and related logic
to do the migrations and return migrations.
This was split out from the larger proxy patch, and
significantly reworked.
Credits for the original patch go to:
Peter Zijlstra (Intel) <peterz@infradead.org>
Juri Lelli <juri.lelli@redhat.com>
Valentin Schneider <valentin.schneider@arm.com>
Connor O'Brien <connoro@google.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260324191337.1841376-11-jstultz@google.com
|