| Age | Commit message (Collapse) | Author | Files | Lines |
|
Every event group has a private copy of the data of all telemetry event
aggregators (aka "telemetry regions") tracking its feature type. Included
may be regions that have the same feature type but tracking different GUID
from the event group's.
Traverse the event group's telemetry region data and mark all regions that
are not usable by the event group as unusable by clearing those regions'
MMIO addresses. A region is considered unusable if:
1) GUID does not match the GUID of the event group.
2) Package ID is invalid.
3) The enumerated size of the MMIO region does not match the expected
value from the XML description file.
Hereafter any telemetry region with an MMIO address is considered valid for
the event group it is associated with.
Enable all the event group's events as long as there is at least one usable
region from where data for its events can be read. Enabling of an event can
fail if the same event has already been enabled as part of another event
group. It should never happen that the same event is described by different
GUID supported by the same system so just WARN (via resctrl_enable_mon_event())
and skip the event.
Note that it is architecturally possible that some telemetry events are only
supported by a subset of the packages in the system. It is not expected that
systems will ever do this. If they do the user will see event files in resctrl
that always return "Unavailable".
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
The resctrl file system layer passes the domain, RMID, and event id to the
architecture to fetch an event counter.
Fetching a telemetry event counter requires additional information that is
private to the architecture, for example, the offset into MMIO space from
where the counter should be read.
Add mon_evt::arch_priv that architecture can use for any private data related
to the event. The resctrl filesystem initializes mon_evt::arch_priv when the
architecture enables the event and passes it back to architecture when needing
to fetch an event counter.
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
The telemetry event aggregators of the Intel Clearwater Forest CPU support two
RMID-based feature types: "energy" with GUID 0x26696143¹, and "perf" with
GUID 0x26557651².
The event counter offsets in an aggregator's MMIO space are arranged in groups
for each RMID.
E.g., the "energy" counters for GUID 0x26696143 are arranged like this:
MMIO offset:0x0000 Counter for RMID 0 PMT_EVENT_ENERGY
MMIO offset:0x0008 Counter for RMID 0 PMT_EVENT_ACTIVITY
MMIO offset:0x0010 Counter for RMID 1 PMT_EVENT_ENERGY
MMIO offset:0x0018 Counter for RMID 1 PMT_EVENT_ACTIVITY
...
MMIO offset:0x23F0 Counter for RMID 575 PMT_EVENT_ENERGY
MMIO offset:0x23F8 Counter for RMID 575 PMT_EVENT_ACTIVITY
After all counters there are three status registers that provide indications
of how many times an aggregator was unable to process event counts, the time
stamp for the most recent loss of data, and the time stamp of the most recent
successful update.
MMIO offset:0x2400 AGG_DATA_LOSS_COUNT
MMIO offset:0x2408 AGG_DATA_LOSS_TIMESTAMP
MMIO offset:0x2410 LAST_UPDATE_TIMESTAMP
Define event_group structures for both of these aggregator types and define
the events tracked by the aggregators in the file system code.
PMT_EVENT_ENERGY and PMT_EVENT_ACTIVITY are produced in fixed point format.
File system code must output as floating point values.
¹https://github.com/intel/Intel-PMT/blob/main/xml/CWF/OOBMSM/RMID-ENERGY/cwf_aggregator.xml
²https://github.com/intel/Intel-PMT/blob/main/xml/CWF/OOBMSM/RMID-PERF/cwf_aggregator.xml
[ bp: Massage commit message. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
Add a new PERF_PKG resource and introduce package level scope for monitoring
telemetry events so that CPU hotplug notifiers can build domains at the
package granularity.
Use the physical package ID available via topology_physical_package_id()
to identify the monitoring domains with package level scope. This enables
user space to use:
/sys/devices/system/cpu/cpuX/topology/physical_package_id
to identify the monitoring domain a CPU is associated with.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
Enumeration of Intel telemetry events is an asynchronous process involving
several mutually dependent drivers added as auxiliary devices during the
device_initcall() phase of Linux boot. The process finishes after the probe
functions of these drivers completes. But this happens after
resctrl_arch_late_init() is executed.
Tracing the enumeration process shows that it does complete a full seven
seconds before the earliest possible mount of the resctrl file system (when
included in /etc/fstab for automatic mount by systemd).
Add a hook for use by telemetry event enumeration and initialization and
run it once at the beginning of resctrl mount without any locks held.
The architecture is responsible for any required locking.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20260105191711.GBaVwON5nZn-uO6Sqg@fat_crate.local
|
|
resctrl assumes that all monitor events can be displayed as unsigned decimal
integers.
Hardware architecture counters may provide some telemetry events with greater
precision where the event is not a simple count, but is a measurement of some
sort (e.g. Joules for energy consumed).
Add a new argument to resctrl_enable_mon_event() for architecture code to
inform the file system that the value for a counter is a fixed-point value
with a specific number of binary places.
Only allow architecture to use floating point format on events that the file
system has marked with mon_evt::is_floating_point which reflects the contract
with user space on how the event values are displayed.
Display fixed point values with values rounded to ceil(binary_bits * log10(2))
decimal places. Special case for zero binary bits to print "{value}.0".
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
resctrl assumes that monitor events can only be read from a CPU in the
cpumask_t set of each domain. This is true for x86 events accessed with an
MSR interface, but may not be true for other access methods such as MMIO.
Introduce and use flag mon_evt::any_cpu, settable by architecture, that
indicates there are no restrictions on which CPU can read that event. This
flag is not supported by the L3 event reading that requires to be run on a CPU
that belongs to the L3 domain of the event being read.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
The upcoming telemetry event monitoring is not tied to the L3 resource and
will have a new domain structure.
Rename the L3 resource specific domain data structures to include "l3_"
in their names to avoid confusion between the different resource specific
domain structures:
rdt_mon_domain -> rdt_l3_mon_domain
rdt_hw_mon_domain -> rdt_hw_l3_mon_domain
No functional change.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
Convert the whole call sequence from mon_event_read() to resctrl_arch_rmid_read() to
pass resource independent struct rdt_domain_hdr instead of an L3 specific domain
structure to prepare for monitoring events in other resources.
This additional layer of indirection obscures which aspects of event counting depend
on a valid domain. Event initialization, support for assignable counters, and normal
event counting implicitly depend on a valid domain while summing of domains does not.
Split summing domains from the core event counting handling to make their respective
dependencies obvious.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
Up until now, all monitoring events were associated with the L3 resource and it
made sense to use the L3 specific "struct rdt_mon_domain *" argument to functions
operating on domains.
Telemetry events will be tied to a new resource with its instances represented
by a new domain structure that, just like struct rdt_mon_domain, starts with
the generic struct rdt_domain_hdr.
Prepare to support domains belonging to different resources by changing the
calling convention of functions operating on domains. Pass the generic header
and use that to find the domain specific structure where needed.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
Every resctrl resource has a list of domain structures. struct rdt_ctrl_domain
and struct rdt_mon_domain both begin with struct rdt_domain_hdr with
rdt_domain_hdr::type used in validity checks before accessing the domain of
a particular type.
Add the resource id to struct rdt_domain_hdr in preparation for a new monitoring
domain structure that will be associated with a new monitoring resource. Improve
existing domain validity checks with a new helper domain_header_is_valid()
that checks both domain type and resource id. domain_header_is_valid() should
be used before every call to container_of() that accesses a domain structure.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fixes from Danilo Krummrich:
- Introduce DMA Rust helpers to avoid build errors when !CONFIG_HAS_DMA
- Remove unnecessary (and hence incorrect) endian conversion in the
Rust PCI driver sample code
- Fix memory leak in the unwind path of debugfs_change_name()
- Support non-const struct software_node pointers in
SOFTWARE_NODE_REFERENCE(), after introducing _Generic()
- Avoid NULL pointer dereference in the unwind path of
simple_xattrs_free()
* tag 'driver-core-6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
fs/kernfs: null-ptr deref in simple_xattrs_free()
software node: Also support referencing non-constant software nodes
debugfs: Fix memleak in debugfs_change_name().
samples: rust: fix endianness issue in rust_driver_pci
rust: dma: add helpers for architectures without CONFIG_HAS_DMA
|
|
virtio_features.h uses WARN_ON_ONCE and memset so it must
include linux/bug.h and linux/string.h
Message-ID: <579986aa9b8d023844990d2a0e267382f8ad85d5.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
virtio.h uses struct module, add a forward declaration to
make the header self-contained.
Message-ID: <9171b5cac60793eb59ab044c96ee038bf1363bee.1764873799.git.mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Fwnode references are be implemented differently if referenced node is a
software node. _Generic() is used to differentiate between the two cases
but only const software nodes were present in the selection. Also add
non-const software nodes.
Reported-by: Kenneth Crudup <kenny@panix.com>
Closes: https://lore.kernel.org/all/af773b82-bef2-4209-baaf-526d4661b7fc@panix.com/
Fixes: d7cdbbc93c56 ("software node: allow referencing firmware nodes")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-By: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Mehdi Djait <mehdi.djait@linux.intel.com> # Dell XPS 9315
Reviewed-by: Mehdi Djait <mehdi.djait@linux.intel.com>
Link: https://patch.msgid.link/20251219083638.2454138-1-sakari.ailus@linux.intel.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix FPU core dumps on certain CPU models
- Fix htmldocs build warning
- Export TLB tracing event name via header
- Remove unused constant from <linux/mm_types.h>
- Fix comments
- Fix whitespace noise in documentation
- Fix variadic structure's definition to un-confuse UBSAN
- Fix posted MSI interrupts irq_retrigger() bug
- Fix asm build failure with older GCC builds
* tag 'x86-urgent-2025-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bug: Fix old GCC compile fails
x86/msi: Make irq_retrigger() functional for posted MSI
x86/platform/uv: Fix UBSAN array-index-out-of-bounds
mm: Remove tlb_flush_reason::NR_TLB_FLUSH_REASONS from <linux/mm_types.h>
x86/mm/tlb/trace: Export the TLB_REMOTE_WRONG_CPU enum in <trace/events/tlb.h>
x86/sgx: Remove unmatched quote in __sgx_encl_extend function comment
x86/boot/Documentation: Fix whitespace noise in boot.rst
x86/fpu: Fix FPU state core dump truncation on CPUs with no extended xfeatures
x86/boot/Documentation: Fix htmldocs build warning due to malformed table in boot.rst
|
|
Work around clang problems with "=rm" asm constraint.
clang seems to always chose the memory output, while it is almost
always the worst choice.
Add ASM_OUTPUT_RM so that we can replace "=rm" constraint
where it matters for clang, while not penalizing gcc.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- ublk selftests for missing coverage
- two fixes for the block integrity code
- fix for the newly added newly added PR read keys ioctl, limiting the
memory that can be allocated
- work around for a deadlock that can occur with ublk, where partition
scanning ends up recursing back into file closure, which needs the
same mutex grabbed. Not the prettiest thing in the world, but an
acceptable work-around until we can eliminate the reliance on
disk->open_mutex for this
- fix for a race between enabling writeback throttling and new IO
submissions
- move a bit of bio flag handling code. No changes, but needed for a
patchset for a future kernel
- fix for an init time id leak failure in rnbd
- loop/zloop state check fix
* tag 'block-6.19-20251218' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
block: validate interval_exp integrity limit
block: validate pi_offset integrity limit
block: rnbd-clt: Fix leaked ID in init_dev()
ublk: fix deadlock when reading partition table
block: add allocation size check in blkdev_pr_read_keys()
Documentation: admin-guide: blockdev: replace zone_capacity with zone_capacity_mb when creating devices
zloop: use READ_ONCE() to read lo->lo_state in queue_rq path
loop: use READ_ONCE() to read lo->lo_state without locking
block: fix race between wbt_enable_default and IO submission
selftests: ublk: add user copy test cases
selftests: ublk: add support for user copy to kublk
selftests: ublk: forbid multiple data copy modes
selftests: ublk: don't share backing files between ublk servers
selftests: ublk: use auto_zc for PER_IO_DAEMON tests in stress_04
selftests: ublk: fix fio arguments in run_io_and_recover()
selftests: ublk: remove unused ios map in seq_io.bt
selftests: ublk: correct last_rw map type in seq_io.bt
selftests: ublk: fix overflow in ublk_queue_auto_zc_fallback()
block: move around bio flagging helpers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix three issues in the power capping code including one recent
regression and a runtime PM framework regression introduced during the
6.17 development cycle:
- Fix CPU hotplug locking deadlock reported by lockdep after a recent
update of the Intel RAPL power capping driver (Srinivas Pandruvada)
- Fix sscanf() error return value handling in the power capping core
and a race condition in register_control_type() (Sumeet Pawnikar)
- Fix a concurrent bit field update issue in the runtime PM core code
by only updating the bit field in question when runtime PM is
disabled (Rafael Wysocki)"
* tag 'pm-6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
powercap: intel_rapl: Fix possible recursive lock warning
PM: runtime: Do not clear needs_force_resume with enabled runtime PM
powercap: fix sscanf() error return value handling
powercap: fix race condition in register_control_type()
|
|
With the RAPL PMU addition, there is a recursive locking when CPU online
callback function calls rapl_package_add_pmu(). Here cpu_hotplug_lock
is already acquired by cpuhp_thread_fun() and rapl_package_add_pmu()
tries to acquire again.
<4>[ 8.197433] ============================================
<4>[ 8.197437] WARNING: possible recursive locking detected
<4>[ 8.197440] 6.19.0-rc1-lgci-xe-xe-4242-05b7c58b3367dca84+ #1 Not tainted
<4>[ 8.197444] --------------------------------------------
<4>[ 8.197447] cpuhp/0/20 is trying to acquire lock:
<4>[ 8.197450] ffffffff83487870 (cpu_hotplug_lock){++++}-{0:0}, at:
rapl_package_add_pmu+0x37/0x370 [intel_rapl_common]
<4>[ 8.197463]
but task is already holding lock:
<4>[ 8.197466] ffffffff83487870 (cpu_hotplug_lock){++++}-{0:0}, at:
cpuhp_thread_fun+0x6d/0x290
<4>[ 8.197477]
other info that might help us debug this:
<4>[ 8.197480] Possible unsafe locking scenario:
<4>[ 8.197483] CPU0
<4>[ 8.197485] ----
<4>[ 8.197487] lock(cpu_hotplug_lock);
<4>[ 8.197490] lock(cpu_hotplug_lock);
<4>[ 8.197493]
*** DEADLOCK ***
..
..
<4>[ 8.197542] __lock_acquire+0x146e/0x2790
<4>[ 8.197548] lock_acquire+0xc4/0x2c0
<4>[ 8.197550] ? rapl_package_add_pmu+0x37/0x370 [intel_rapl_common]
<4>[ 8.197556] cpus_read_lock+0x41/0x110
<4>[ 8.197558] ? rapl_package_add_pmu+0x37/0x370 [intel_rapl_common]
<4>[ 8.197561] rapl_package_add_pmu+0x37/0x370 [intel_rapl_common]
<4>[ 8.197565] rapl_cpu_online+0x85/0x87 [intel_rapl_msr]
<4>[ 8.197568] ? __pfx_rapl_cpu_online+0x10/0x10 [intel_rapl_msr]
<4>[ 8.197570] cpuhp_invoke_callback+0x41f/0x6c0
<4>[ 8.197573] ? cpuhp_thread_fun+0x6d/0x290
<4>[ 8.197575] cpuhp_thread_fun+0x1e2/0x290
<4>[ 8.197578] ? smpboot_thread_fn+0x26/0x290
<4>[ 8.197581] smpboot_thread_fn+0x12f/0x290
<4>[ 8.197584] ? __pfx_smpboot_thread_fn+0x10/0x10
<4>[ 8.197586] kthread+0x11f/0x250
<4>[ 8.197589] ? __pfx_kthread+0x10/0x10
<4>[ 8.197592] ret_from_fork+0x344/0x3a0
<4>[ 8.197595] ? __pfx_kthread+0x10/0x10
<4>[ 8.197597] ret_from_fork_asm+0x1a/0x30
<4>[ 8.197604] </TASK>
Fix this issue in the same way as rapl powercap package domain is added
from the same CPU online callback by introducing another interface which
doesn't call cpus_read_lock(). Add rapl_package_add_pmu_locked() and
rapl_package_remove_pmu_locked() which don't call cpus_read_lock().
Fixes: 748d6ba43afd ("powercap: intel_rapl: Enable MSR-based RAPL PMU support")
Reported-by: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>
Closes: https://lore.kernel.org/linux-pm/5427ede1-57a0-43d1-99f3-8ca4b0643e82@intel.com/T/#u
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Tested-by: RavitejaX Veesam <ravitejax.veesam@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20251217153455.3560176-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Pull bpf fixes from Alexei Starovoitov:
- Fix BPF builds due to -fms-extensions. selftests (Alexei
Starovoitov), bpftool (Quentin Monnet).
- Fix build of net/smc when CONFIG_BPF_SYSCALL=y, but CONFIG_BPF_JIT=n
(Geert Uytterhoeven)
- Fix livepatch/BPF interaction and support reliable unwinding through
BPF stack frames (Josh Poimboeuf)
- Do not audit capability check in arm64 JIT (Ondrej Mosnacek)
- Fix truncated dmabuf BPF iterator reads (T.J. Mercier)
- Fix verifier assumptions of bpf_d_path's output buffer (Shuran Liu)
- Fix warnings in libbpf when built with -Wdiscarded-qualifiers under
C23 (Mikhail Gavrilov)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: add regression test for bpf_d_path()
bpf: Fix verifier assumptions of bpf_d_path's output buffer
selftests/bpf: Add test for truncated dmabuf_iter reads
bpf: Fix truncated dmabuf iterator reads
x86/unwind/orc: Support reliable unwinding through BPF stack frames
bpf: Add bpf_has_frame_pointer()
bpf, arm64: Do not audit capability check in do_jit()
libbpf: Fix -Wdiscarded-qualifiers under C23
bpftool: Fix build warnings due to MS extensions
net: smc: SMC_HS_CTRL_BPF should depend on BPF_JIT
selftests/bpf: Add -fms-extensions to bpf build flags
|
|
Pull shmem rename fixes from Al Viro:
"A couple of shmem rename fixes - recent regression from tree-in-dcache
series and older breakage from stable directory offsets stuff"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
shmem: fix recovery on rename failures
shmem_whiteout(): fix regression from tree-in-dcache series
|
|
maple_tree insertions can fail if we are seriously short on memory;
simple_offset_rename() does not recover well if it runs into that.
The same goes for simple_offset_rename_exchange().
Moreover, shmem_whiteout() expects that if it succeeds, the caller will
progress to d_move(), i.e. that shmem_rename2() won't fail past the
successful call of shmem_whiteout().
Not hard to fix, fortunately - mtree_store() can't fail if the index we
are trying to store into is already present in the tree as a singleton.
For simple_offset_rename_exchange() that's enough - we just need to be
careful about the order of operations.
For simple_offset_rename() solution is to preinsert the target into the
tree for new_dir; the rest can be done without any potentially failing
operations.
That preinsertion has to be done in shmem_rename2() rather than in
simple_offset_rename() itself - otherwise we'd need to deal with the
possibility of failure after successful shmem_whiteout().
Fixes: a2e459555c5f ("shmem: stable directory offsets")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
- Fix error code in the irqchip/mchp-eic driver
- Fix setup_percpu_irq() affinity assumptions
- Remove the unused irq_domain_add_tree() function
* tag 'irq-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mchp-eic: Fix error code in mchp_eic_domain_alloc()
irqdomain: Delete irq_domain_add_tree()
genirq: Allow NULL affinity for setup_percpu_irq()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc core fixes from Ingo Molnar:
- Improve bug reporting
- Suppress W=1 format warning
- Improve rseq scalability on Clang builds
* tag 'core-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Always inline rseq_debug_syscall_return()
bug: Hush suggest-attribute=format for __warn_printf()
bug: Let report_bug_entry() provide the correct bugaddr
|
|
This has been unused since it was added 11 years ago in:
d17d8f9dedb9 ("x86/mm: Add tracepoints for TLB flushes")
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Rik van Riel <riel@surriel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://patch.msgid.link/20251212-tlb-trace-fix-v2-2-d322e0ad9b69@columbia.edu
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc updates from Andrew Morton:
"There are no significant series in this small merge. Please see the
individual changelogs for details"
[ Editor's note: it's mainly ocfs2 and a couple of random fixes ]
* tag 'mm-nonmm-stable-2025-12-11-11-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: memfd_luo: add CONFIG_SHMEM dependency
mm: shmem: avoid build warning for CONFIG_SHMEM=n
ocfs2: fix memory leak in ocfs2_merge_rec_left()
ocfs2: invalidate inode if i_mode is zero after block read
ocfs2: avoid -Wflex-array-member-not-at-end warning
ocfs2: convert remaining read-only checks to ocfs2_emergency_state
ocfs2: add ocfs2_emergency_state helper and apply to setattr
checkpatch: add uninitialized pointer with __free attribute check
args: fix documentation to reflect the correct numbers
ocfs2: fix kernel BUG in ocfs2_find_victim_chain
liveupdate: luo_core: fix redundant bound check in luo_ioctl()
ocfs2: validate inline xattr size and entry count in ocfs2_xattr_ibody_list
fs/fat: remove unnecessary wrapper fat_max_cache()
ocfs2: replace deprecated strcpy with strscpy
ocfs2: check tl_used after reading it from trancate log inode
liveupdate: luo_file: don't use invalid list iterator
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- "powerpc/pseries/cmm: two smaller fixes" (David Hildenbrand)
fixes a couple of minor things in ppc land
- "Improve folio split related functions" (Zi Yan)
some cleanups and minorish fixes in the folio splitting code
* tag 'mm-stable-2025-12-11-11-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/damon/tests/core-kunit: avoid damos_test_commit stack warning
mm: vmscan: correct nr_requested tracing in scan_folios
MAINTAINERS: add idr core-api doc file to XARRAY
mm/hugetlb: fix incorrect error return from hugetlb_reserve_pages()
mm: fix CONFIG_STACK_GROWSUP typo in mm.h
mm/huge_memory: fix folio split stats counting
mm/huge_memory: make min_order_for_split() always return an order
mm/huge_memory: replace can_split_folio() with direct refcount calculation
mm/huge_memory: change folio_split_supported() to folio_check_splittable()
mm/sparse: fix sparse_vmemmap_init_nid_early definition without CONFIG_SPARSEMEM
powerpc/pseries/cmm: adjust BALLOON_MIGRATE when migrating pages
powerpc/pseries/cmm: call balloon_devinfo_init() also without CONFIG_BALLOON_COMPACTION
|
|
Brown paper bag time. This is a silly oversight where I missed to drop
the error condition checking to ensure we clean up on early error
returns. I have an internal unit testset coming up for this which will
catch all such issues going forward.
Reported-by: Chris Mason <clm@fb.com>
Reported-by: Jeff Layton <jlayton@kernel.org>
Fixes: 011703a9acd7 ("file: add FD_{ADD,PREPARE}()")
Signed-off-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Pull further i3c update from Alexandre Belloni:
"We are removing a legacy API callback and having this sooner rather
than later will help ensuring no one introduces a new driver using it.
I've also added patches removing the "__free(...) = NULL" pattern
because I'm sure we won't avoid people sending those following the
mailing list discussion..."
* tag 'i3c/for-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
i3c: adi: Fix confusing cleanup.h syntax
i3c: master: Fix confusing cleanup.h syntax
i3c: master: cleanup callback .priv_xfers()
i3c: master: switch to use new callback .i3c_xfers() from .priv_xfers()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Subsystem:
- stop setting max_user_freq from the individual drivers as this has
not been hardware related for a while
New drivers:
- Andes ATCRTC100
- Apple SMC
- Nvidia VRS
Drivers:
- renesas-rtca3: add RZ/V2H support
- tegra: add ACPI support"
* tag 'rtc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (34 commits)
rtc: spacemit: MFD_SPACEMIT_P1 as dependencies
rtc: atcrtc100: Fix signedness bug in probe()
rtc: max31335: Fix ignored return value in set_alarm
rtc: gamecube: Check the return value of ioremap()
Documentation: ABI: testing: Fix "upto" typo in rtc-cdev
rtc: Add new rtc-macsmc driver for Apple Silicon Macs
dt-bindings: rtc: Add Apple SMC RTC
MAINTAINERS: drop unneeded file entry in NVIDIA VRS RTC DRIVER
rtc: isl12026: Add id_table
rtc: renesas-rtca3: Add support for multiple reset lines
dt-bindings: rtc: renesas,rz-rtca3: Add RZ/V2H support
rtc: tegra: Replace deprecated SIMPLE_DEV_PM_OPS
rtc: tegra: Add ACPI support
rtc: tegra: Use devm_clk_get_enabled() in probe
rtc: Kconfig: add MC34708 to mc13xxx help text
rtc: s35390a: use u8 instead of char for register buffer
rtc: nvvrs: add NVIDIA VRS RTC device driver
dt-bindings: rtc: Document NVIDIA VRS RTC
rtc: atcrtc100: Add ATCRTC100 RTC driver
MAINTAINERS: Add entry for ATCRTC100 RTC driver
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul:
- Support for multiple sections in a BPT stream
- Align DMA frame with BPT frames
- Qualcomm support for v3.1.0 controllers
* tag 'soundwire-6.19-rc1_updated' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: intel_ace2x: handle multi BPT sections
soundwire: pass sdw_bpt_section to cdns BPT helpers
soundwire: introduce BPT section
soundwire: intel_ace2x: add fake frame to BRA read command
soundwire: cadence_master: add fake_size parameter to sdw_cdns_prepare_read_dma_buffer
ASoC: SOF: Intel: export hda_sdw_bpt_get_buf_size_aligment
soundwire: cadence: export sdw_cdns_bpt_find_bandwidth
soundwire: cadence_master: set data_per_frame as frame capability
soundwire: only compute BPT stream in sdw_compute_dp0_port_params
soundwire: cadence_master: make frame index trace more readable
soundwire: qcom: adding support for v3.1.0
dt-bindings: soundwire: qcom: Document v3.1.0 version of IP block
soundwire: qcom: prepare for v3.x
soundwire: qcom: deprecate qcom,din/out-ports
dt-bindings: soundwire: qcom: deprecate qcom,din/out-ports
soundwire: qcom: remove unused rd_fifo_depth
of: base: Add of_property_read_u8_index
|
|
Remove the .priv_xfers() callback from the framework after all master
controller drivers have switched to use the new .i3c_xfers() callback.
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Tested-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Link: https://patch.msgid.link/20251203-i3c_xfer_cleanup_master-v2-2-7dd94d04ee2d@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
|
|
We'll need bio_flagged() earlier in bio.h for later patches, move it
together with all related helpers, and mark the bio_flagged()'s bio
argument as const.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull NFS client updates from Trond Myklebust:
"Bugfixes:
- Fix 'nlink' attribute update races when unlinking a file
- Add missing initialisers for the directory verifier in various
places
- Don't regress the NFSv4 open state due to misordered racing replies
- Ensure the NFSv4.x callback server uses the correct transport
connection
- Fix potential use-after-free races when shutting down the NFSv4.x
callback server
- Fix a pNFS layout commit crash
- Assorted fixes to ensure correct propagation of mount options when
the client crosses a filesystem boundary and triggers the VFS
automount code
- More localio fixes
Features and cleanups:
- Add initial support for basic directory delegations
- SunRPC back channel code cleanups"
* tag 'nfs-for-6.19-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits)
NFSv4: Handle NFS4ERR_NOTSUPP errors for directory delegations
nfs/localio: remove 61 byte hole from needless ____cacheline_aligned
nfs/localio: remove alignment size checking in nfs_is_local_dio_possible
NFS: Fix up the automount fs_context to use the correct cred
NFS: Fix inheritance of the block sizes when automounting
NFS: Automounted filesystems should inherit ro,noexec,nodev,sync flags
Revert "nfs: ignore SB_RDONLY when mounting nfs"
Revert "nfs: clear SB_RDONLY before getting superblock"
Revert "nfs: ignore SB_RDONLY when remounting nfs"
NFS: Add a module option to disable directory delegations
NFS: Shortcut lookup revalidations if we have a directory delegation
NFS: Request a directory delegation during RENAME
NFS: Request a directory delegation on ACCESS, CREATE, and UNLINK
NFS: Add support for sending GDD_GETATTR
NFSv4/pNFS: Clear NFS_INO_LAYOUTCOMMIT in pnfs_mark_layout_stateid_invalid
NFSv4.1: protect destroying and nullifying bc_serv structure
SUNRPC: new helper function for stopping backchannel server
SUNRPC: cleanup common code in backchannel request
NFSv4.1: pass transport for callback shutdown
NFSv4: ensure the open stateid seqid doesn't go backwards
...
|
|
To get the full benefit of:
eaa9088d568c ("rseq: Use static branch for syscall exit debug when GENERIC_IRQ_ENTRY=y")
clang needs an __always_inline instead of a plain inline qualifier:
$ for i in {1..10}; do taskset -c 4 perf5 bench syscall basic -l 100000000 | grep "ops/sec"; done
Before After
ops/sec 15424491 15872221 +2.9%
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251205100753.4073221-1-edumazet@google.com
|
|
The macro uses up to 15 arguments. Reflect this in the top level comment.
Link: https://lkml.kernel.org/r/20251201201018.765475-1-andriy.shevchenko@linux.intel.com
Fixes: d51e783c17ba ("lsm: count the LSMs enabled at compile time")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:
- A stable fix for performance regression in tests that perform
kmem_cache_destroy() a lot, due to unnecessarily wide scope of
kvfree_rcu_barrier() (Harry Yoo)
* tag 'slab-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- Use the MSI parent domain API instead of the legacy API for setup and
teardown of PCI MSI IRQs
- Select POSIX_CPU_TIMERS_TASK_WORK now that VIRT_XFER_TO_GUEST_WORK
has been implemented for s390
- Fix a KVM bug which can lead to guest memory corruption
- Fix KASAN shadow memory mapping for hotplugged memory
- Minor bug fixes and improvements
* tag 's390-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/bug: Add missing alignment
s390/bug: Add missing CONFIG_BUG ifdef again
KVM: s390: Fix gmap_helper_zap_one_page() again
s390/pci: Migrate s390 IRQ logic to IRQ domain API
genirq: Change hwirq parameter to irq_hw_number_t
s390: Select POSIX_CPU_TIMERS_TASK_WORK
s390: Unmap early KASAN shadow on memory offlining
s390/vmem: Support 2G page splitting for KASAN shadow freeing
s390/boot: Use entire page for PTEs
s390/vmur: Use scnprintf() instead of sprintf()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
- last minute fix for missing parenthesis in recently merged code (Hans
de Goede)
- removal of excessive, non-fatal warnings (Dave Kleikamp)
* tag 'dma-mapping-6.19-2025-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: Fix DMA_BIT_MASK() macro being broken
dma/pool: eliminate alloc_pages warning in atomic_pool_expand
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull futex updates from Ingo Molnar:
- Standardize on ktime_t in restart_block::time as well (Thomas
Weißschuh)
- Futex selftests:
- Add robust list testcases (André Almeida)
- Formatting fixes/cleanups (Carlos Llamas)
* tag 'locking-futex-2025-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Store time as ktime_t in restart block
selftests/futex: Create test for robust list
selftests/futex: Skip tests if shmget unsupported
selftests/futex: Add newline to ksft_exit_fail_msg()
selftests/futex: Remove unused test_futex_mpol()
|
|
Introduce a bpf_has_frame_pointer() helper that unwinders can call to
determine whether a given instruction pointer is within the valid frame
pointer region of a BPF JIT program or trampoline (i.e., after the
prologue, before the epilogue).
This will enable livepatch (with the ORC unwinder) to reliably unwind
through BPF JIT frames.
Acked-by: Song Liu <song@kernel.org>
Acked-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/fd2bc5b4e261a680774b28f6100509fd5ebad2f0.1764818927.git.jpoimboe@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
|
|
No in-tree users anymore.
[ tglx: Remove the reference in the Chinese documentation as well ]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251202202327.1444693-1-andriy.shevchenko@linux.intel.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Ilpo Järvinen:
- acer-wmi: Add PH16-72, PHN16-72, and PT14-51 fan control support
- acpi: platform_profile: Add max-power profile option (power draw
limited by the cooling hardware, may exceed battery power draw limit
when on AC power)
- amd/hsmp: Allow more than one data-fabric per socket
- asus-armoury: Add WMI attributes driver to expose miscellaneous WMI
functions through fw_attributes (deprecates the custom BIOS features
interface through asus-wmi)
- asus-wmi: Use brightness_set_blocking() for kbd led
- ayaneo-ec: Add Ayaneo Embedded Controller driver
- fs/nls:
- Fix utf16 to utf8 string conversion when output size restricted
- Improve error code consistency for utf8 to utf32 conversions
- ideapad-laptop: Fast (Rapid Charge) charge type support
- intel/hid: Add Dell Pro Rugged 10/12 tablet to VGBS DMI quirks
- intel/pmc:
- Arrow Lake telemetry GUID improvements
- Add support for Wildcat Lake PMC information
- intel_pmc_ipc: Fix ACPI buffer memleak
- intel/punit_ipc: Fix memory corruption
- intel/vsec: Wildcat Lake PMT telemetry support
- lenovo-wmi-gamezone: Map "Extreme" performance mode to max-power
- lg-laptop: Add support for the HDAP opregion field
- serial-multi-instantiate: Add IRQ_RESOURCE_OPT for IRQ missing
projects
- thinkpad-t14s-ec: Improve suspend/resume support (lid LEDs, keyboard
backlight)
- uniwill: Add Uniwill laptop driver
- wmi: Move under drivers/platform/wmi as non-x86 WMI support is around
the corner and other WMI features will require adding more C files as
well
- tools/power/x86/intel-speed-select: v1.24
- Check feature status to check if the feature enablement was
successful
- Reset SST-TF bucket structure to display valid bucket info
- Miscellaneous cleanups / refactoring / improvements
* tag 'platform-drivers-x86-v6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (73 commits)
tools/power/x86/intel-speed-select: v1.24 release
tools/power/x86/intel-speed-select: Reset isst_turbo_freq_info for invalid buckets
tools/power/x86/intel-speed-select: Check feature status
platform/x86: asus-wmi: use brightness_set_blocking() for kbd led
fs/nls: Fix inconsistency between utf8_to_utf32() and utf32_to_utf8()
platform/x86: asus-armoury: add support for GA503QR
platform/x86: intel_pmc_ipc: fix ACPI buffer memory leak
platform/x86: hp-wmi: Order DMI board name arrays
platform/x86/intel/hid: Add Dell Pro Rugged 10/12 tablet to VGBS DMI quirks
platform: surface: replace use of system_wq with system_percpu_wq
platform: x86: replace use of system_wq with system_percpu_wq
platform/surface: acpi-notify: add WQ_PERCPU to alloc_workqueue users
platform/x86: wmi-gamezone: Add Legion Go 2 Quirks
platform/x86: lenovo-wmi-gamezone Use max-power rather than balanced-performance
acpi: platform_profile - Add max-power profile option
platform/x86/amd/pmf: Use devm_mutex_init() for mutex initialization
platform/x86/amd/pmf: Add BIOS_INPUTS_MAX macro to replace hardcoded array size
platform/x86: serial-multi-instantiate: Add IRQ_RESOURCE_OPT for IRQ missing projects
platform/x86/amd/pmf: Refactor repetitive BIOS output handling
platform/x86/uniwill: Add TUXEDO devices
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"Fix a runtime PM unit test added during the 6.18 development cycle and
change the pm_runtime_barrier() return type to void (Brian Norris)"
* tag 'pm-6.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
coccinelle: Drop pm_runtime_barrier() error code checks
PM: runtime: Make pm_runtime_barrier() return void
PM: runtime: Stop checking pm_runtime_barrier() return code
|
|
Add a cond_lock annotation for lockref_put_or_lock to make sparse
happy with using it. Note that for this the return value has to be
double-inverted as the return value convention of lockref_put_or_lock
is inverted compared to _trylock conventions expected by __cond_lock,
as lockref_put_or_lock returns true when it did not need to take the
lock.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-auto
Pull __auto_type to auto conversion from Peter Anvin:
"Convert '__auto_type' to 'auto', defining a macro for 'auto' unless
C23+ is in use"
* tag 'auto-type-conversion-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-auto:
tools/virtio: replace "__auto_type" with "auto"
selftests/bpf: replace "__auto_type" with "auto"
arch/x86: replace "__auto_type" with "auto"
arch/nios2: replace "__auto_type" and adjacent equivalent with "auto"
fs/proc: replace "__auto_type" with "const auto"
include/linux: change "__auto_type" to "auto"
compiler_types.h: add "auto" as a macro for "__auto_type"
|
|
Commit 2b6a3f061f11 ("mm: declare VMA flags by bit") significantly
refactors the header file include/linux/mm.h. In that step, it introduces
a typo in an ifdef, referring to a non-existing config option
STACK_GROWS_UP, whereas the actual config option is called STACK_GROWSUP.
Fix this typo in the mm header file.
Link: https://lkml.kernel.org/r/20251201122922.352480-1-lukas.bulwahn@redhat.com
Fixes: 2b6a3f061f11 ("mm: declare VMA flags by bit")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
min_order_for_split() returns -EBUSY when the folio is truncated and
cannot be split. In commit 77008e1b2ef7 ("mm/huge_memory: do not change
split_huge_page*() target order silently"), memory_failure() does not
handle it and pass -EBUSY to try_to_split_thp_page() directly.
try_to_split_thp_page() returns -EINVAL since -EBUSY becomes 0xfffffff0 as
new_order is unsigned int in __folio_split() and this large new_order is
rejected as an invalid input. The code does not cause a bug.
soft_offline_in_use_page() also uses min_order_for_split() but it always
passes 0 as new_order for split.
Fix it by making min_order_for_split() always return an order. When the
given folio is truncated, namely folio->mapping == NULL, return 0 and let
a subsequent split function handle the situation and return -EBUSY.
Add kernel-doc to min_order_for_split() to clarify its use.
Link: https://lkml.kernel.org/r/20251126210618.1971206-4-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
can_split_folio() is just a refcount comparison, making sure only the
split caller holds an extra pin. Open code it with
folio_expected_ref_count() != folio_ref_count() - 1. For the extra_pins
used by folio_ref_freeze(), add folio_cache_ref_count() to calculate it.
Also replace folio_expected_ref_count() with folio_cache_ref_count() used
by folio_ref_unfreeze(), since they are returning the same values when a
folio is frozen and folio_cache_ref_count() does not have unnecessary
folio_mapcount() in its implementation.
Link: https://lkml.kernel.org/r/20251126210618.1971206-3-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Balbir Singh <balbirs@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|