| Age | Commit message (Collapse) | Author | Files | Lines |
|
The struct pyrf_evlist embeds the evlist requiring the copying from
things like parsed events. The copying logic handles the leader being
the event itself, but if the leader group event is a different in the
list it will cause an evsel to point to the evsel in the list that was
copied from which is bad. Fix this by adding another pass over the
evlist rewriting leaders, simplified by the introductin of two evlist
helpers.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Tool PMUs assume that stat's process_counter_values is being used to
read the counters. Specifically they hold onto old values in
evsel->prev_raw_counts and give the cumulative count based off of this
value. Update pyrf_evsel__read to allocate counts and prev_raw_counts,
use evsel__read_counter rather than perf_evsel__read so tool PMUs are
read from not just perf_event_open events, make the returned
pyrf_counts_values contain the delta value rather than the cumulative
value.
Fixes: 739621f65702 ("perf python: Add evsel read method")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The CPU index is incorrectly checked rather than the thread index.
Fixes: 739621f65702 ("perf python: Add evsel read method")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The evsel__pmu_name helper will internally use evsel__find_pmu that
handles legacy events, extended types, etc. in determining a PMU and
will provide a better value than just trying to access the PMU's name
directly as the PMU may not have been computed.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
If the short and long descriptions are the same then save space and
don't store both of them. When storing the desc in the perf_pmu_alias,
don't duplicate the desc into the long_desc.
By avoiding storing the duplicate the size of the events string in the
binary on x86 is reduced by 29,840 bytes.
Fix tests that expect a duplicated description.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Metrics will fill in the context to have mappings from an event to a
count. When counts are added they replace existing mappings which
generally shouldn't exist with aggregation. Switch to accumulating to
better support cases where perf stat's aggregation isn't used and we
may see a counter more than once.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
The rblist of metric_event that then have a list of associated
metric_expr is moved out of the stat_config and into the evlist. This
is done as part of refactoring things for python, having the state
split in two places complicates that implementation. The evlist is
doing the harder work of enabling and disabling events, the metrics
are needed to compute a value and it doesn't seem unreasonable to hang
them from the evlist.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Factor metricgroup__for_each_metric into its own function handling
regular and sys metrics. Make the metric adding and printing code use
it, move the printing code into print-events files.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
If sysfs isn't mounted then we may fail to read a PMU's type. In this
situation resort to lookup of wellknown types. Only applies to
software, tracepoint and breakpoint PMUs.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
scnprintf is declared in linux/kernel.h, directly depend upon it.
Add missing SPDX comments.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Add missing breakpoint and raw types. Avoid a switch, just use a
lookup array. Switch the type to unsigned to avoid checking negative
values.
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Long names like ucsi_source_psy_USBC000:001 when prefixed with hwmon_
exceed the buffer size and the last digit is lost. This causes
confusion with similar names like ucsi_source_psy_USBC000:002. Extend
the buffer size to avoid this.
Fixes: 53cc0b351ec9 ("perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Since the commit e9846f5ead26 ("perf test: In forked mode add check that
fds aren't leaked"), the test "Breakpoint accounting" reports the error:
# perf test -vvv "Breakpoint accounting"
20: Breakpoint accounting:
--- start ---
test child forked, pid 373
failed opening event 0
failed opening event 0
watchpoints count 4, breakpoints count 6, has_ioctl 1, share 0
wp 0 created
wp 1 created
wp 2 created
wp 3 created
wp 0 modified to bp
wp max created
---- end(0) ----
Leak of file descriptor 7 that opened: 'anon_inode:[perf_event]'
A watchpoint's file descriptor was not properly released. This patch
fixes the leak.
Fixes: 032db28e5fa3 ("perf tests: Add breakpoint accounting/modify test")
Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250711-perf_fix_breakpoint_accounting-v1-1-b314393023f9@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.16-rc6-2).
No conflicts.
Adjacent changes:
drivers/net/wireless/mediatek/mt76/mt7925/mcu.c
c701574c5412 ("wifi: mt76: mt7925: fix invalid array index in ssid assignment during hw scan")
b3a431fe2e39 ("wifi: mt76: mt7925: fix off by one in mt7925_mcu_hw_scan()")
drivers/net/wireless/mediatek/mt76/mt7996/mac.c
62da647a2b20 ("wifi: mt76: mt7996: Add MLO support to mt7996_tx_check_aggr()")
dc66a129adf1 ("wifi: mt76: add a wrapper for wcid access with validation")
drivers/net/wireless/mediatek/mt76/mt7996/main.c
3dd6f67c669c ("wifi: mt76: Move RCU section in mt7996_mcu_add_rate_ctrl()")
8989d8e90f5f ("wifi: mt76: mt7996: Do not set wcid.sta to 1 in mt7996_mac_sta_event()")
net/mac80211/cfg.c
58fcb1b4287c ("wifi: mac80211: reject VHT opmode for unsupported channel widths")
037dc18ac3fb ("wifi: mac80211: add support for storing station S1G capabilities")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Document the Top Level Mode Multiplexer on the Milos Platform.
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/20250702-sm7635-pinctrl-v2-1-c138624b9924@fairphone.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The description above vpanic() has the wrong function name. Fix it up.
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/23a7e8add6546b155371b7e0fbb37bb1def13d6e.1752232374.git.namcao@linutronix.de
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/lkml/20250711183802.2d8c124d@canb.auug.org.au/
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
vpanic() does not return. However, objtool doesn't know this and gets
confused:
kernel/trace/rv/reactor_panic.o: warning: objtool: rv_panic_reaction(): unexpected end of section .text
Add vpanic() to the list of noreturn functions.
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/073f826ebec18b2bb59cba88606cd865d8039fd2.1752232374.git.namcao@linutronix.de
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507110826.2ekbVdWZ-lkp@intel.com/
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
PM7550 is a PMIC, featuring 12 GPIOs. Describe it.
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Link: https://lore.kernel.org/20250709-sm7635-pmxr2230-v2-4-09777dab0a95@fairphone.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Update the Qualcomm Technologies, Inc. PMIC GPIO binding documentation
to include the compatible string for the PM7550 PMICs.
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/20250709-sm7635-pmxr2230-v2-3-09777dab0a95@fairphone.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
PMIV0104 is a PMIC, featuring 10 GPIOs. Describe it.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Link: https://lore.kernel.org/20250709-sm7635-pmiv0104-v2-3-ebf18895edd6@fairphone.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Update the Qualcomm Technologies, Inc. PMIC GPIO binding documentation
to include the compatible string for the PMIV0104 PMICs.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Link: https://lore.kernel.org/20250709-sm7635-pmiv0104-v2-2-ebf18895edd6@fairphone.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
commit 5a3e85c3c397 ("pinmux: Use sequential access to access
desc->pinmux data") tried to address the issue when two client of the
same gpio calls pinctrl_select_state() for the same functionality, was
resulting in NULL pointer issue while accessing desc->mux_owner.
However, issue was not completely fixed due to the way it was handled
and it can still result in the same NULL pointer.
The issue occurs due to the following interleaving:
cpu0 (process A) cpu1 (process B)
pin_request() { pin_free() {
mutex_lock()
desc->mux_usecount--; //becomes 0
..
mutex_unlock()
mutex_lock(desc->mux)
desc->mux_usecount++; // becomes 1
desc->mux_owner = owner;
mutex_unlock(desc->mux)
mutex_lock(desc->mux)
desc->mux_owner = NULL;
mutex_unlock(desc->mux)
This sequence leads to a state where the pin appears to be in use
(`mux_usecount == 1`) but has no owner (`mux_owner == NULL`), which can
cause NULL pointer on next pin_request on the same pin.
Ensure that updates to mux_usecount and mux_owner are performed
atomically under the same lock. Only clear mux_owner when mux_usecount
reaches zero and no new owner has been assigned.
Fixes: 5a3e85c3c397 ("pinmux: Use sequential access to access desc->pinmux data")
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Link: https://lore.kernel.org/20250708-pinmux-race-fix-v2-1-8ae9e8a0d1a1@oss.qualcomm.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers into devel
pinctrl: renesas: Updates for v6.17 (take two)
- Sort Kconfig symbols and improve their descriptions,
- Simplify PINCTRL_RZV2M logic.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
nouveau_drm_ioctl() only checks the _IOC_NR() bits in the
DRM_NOUVEAU_NVIF command, but ignores the type and direction bits, so any
command with '7' in the low eight bits gets passed into
nouveau_abi16_ioctl() instead of drm_ioctl().
Check for all the bits except the size that is handled inside of the
handler.
Fixes: 27111a23d01c ("drm/nouveau: expose the full object/event interfaces to userspace")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[ Fix up two checkpatch warnings and a typo. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/r/20250711072458.2665325-1-arnd@kernel.org
|
|
Tao Chen says:
====================
Move attach_type into bpf_link
Andrii suggested moving the attach_type into bpf_link, the previous discussion
is as follows:
https://lore.kernel.org/bpf/CAEf4BzY7TZRjxpCJM-+LYgEqe23YFj5Uv3isb7gat2-HU4OSng@mail.gmail.com
patch1 add attach_type in bpf_link, and pass it to bpf_link_init, which
will init the attach_type field.
patch2-7 remove the attach_type in struct bpf_xx_link, update the info
with bpf_link attach_type.
There are some functions finally call bpf_link_init but do not have bpf_attr
from user or do not need to init attach_type from user like bpf_raw_tracepoint_open,
now use prog->expected_attach_type to init attach_type.
bpf_struct_ops_map_update_elem
bpf_raw_tracepoint_open
bpf_struct_ops_test_run
Feedback of any kind is welcome, thanks.
====================
Link: https://patch.msgid.link/20250710032038.888700-1-chen.dylane@linux.dev
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Use attach_type in bpf_link to replace the location field, and
remove location field in netkit_link.
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250710032038.888700-8-chen.dylane@linux.dev
|
|
Use attach_type in bpf_link, and remove it in bpf_tracing_link.
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250710032038.888700-7-chen.dylane@linux.dev
|
|
Use attach_type in bpf_link, and remove it in bpf_netns_link.
And move netns_type field to the end to fill the byte hole.
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20250710032038.888700-6-chen.dylane@linux.dev
|
|
Use attach_type in bpf_link to replace the location filed, and
remove location field in tcx_link.
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20250710032038.888700-5-chen.dylane@linux.dev
|
|
Use attach_type in bpf_link, and remove it in sockmap_link.
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250710032038.888700-4-chen.dylane@linux.dev
|
|
Use attach_type in bpf_link, and remove it in bpf_cgroup_link.
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250710032038.888700-3-chen.dylane@linux.dev
|
|
Attach_type will be set when a link is created by user. It is better to
record attach_type in bpf_link generically and have it available
universally for all link types. So add the attach_type field in bpf_link
and move the sleepable field to avoid unnecessary gap padding.
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250710032038.888700-2-chen.dylane@linux.dev
|
|
This patch adds coverage for the warning detected by syzkaller and fixed
in the previous patch. Without the previous patch, this test fails with:
verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds
violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0]
var_off=(0x0, 0x0)(1)
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/c7893be1170fdbcf64e0200c110cdbd360ce7086.1752171365.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Syzbot reported a kernel warning due to a range invariant violation on
the following BPF program.
0: call bpf_get_netns_cookie
1: if r0 == 0 goto <exit>
2: if r0 & Oxffffffff goto <exit>
The issue is on the path where we fall through both jumps.
That path is unreachable at runtime: after insn 1, we know r0 != 0, but
with the sign extension on the jset, we would only fallthrough insn 2
if r0 == 0. Unfortunately, is_branch_taken() isn't currently able to
figure this out, so the verifier walks all branches. The verifier then
refines the register bounds using the second condition and we end
up with inconsistent bounds on this unreachable path:
1: if r0 == 0 goto <exit>
r0: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0xffffffffffffffff)
2: if r0 & 0xffffffff goto <exit>
r0 before reg_bounds_sync: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0)
r0 after reg_bounds_sync: u64=[0x1, 0] var_off=(0, 0)
Improving the range refinement for JSET to cover all cases is tricky. We
also don't expect many users to rely on JSET given LLVM doesn't generate
those instructions. So instead of improving the range refinement for
JSETs, Eduard suggested we forget the ranges whenever we're narrowing
tnums after a JSET. This patch implements that approach.
Reported-by: syzbot+c711ce17dd78e5d4fdcf@syzkaller.appspotmail.com
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/9d4fd6432a095d281f815770608fdcd16028ce0b.1752171365.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Emil Tsalapatis says:
====================
bpf/arena: Add kfunc for reserving arena memory
Add a new kfunc for BPF arenas that reserves a region of the mapping
to prevent it from being mapped. These regions serve as guards against
out-of-bounds accesses and are useful for debugging arena-related code.
>From v3 (20250709015712.97099-1-emil@etsalapatis.com)
------------------------------------------------------
- Added Acked-by tags by Yonghong.
- Replace hardcoded error numbers in selftests (Yonghong).
- Fixed selftest for partially freeing a reserved region (Yonghong).
>From v2 (20250702003351.197234-1-emil@etsalapatis.com)
------------------------------------------------------
- Removed -EALREADY and replaced with -EINVAL to bring error handling in
line with the rest of the BPF code (Alexei).
>From v1 (20250620031118.245601-1-emil@etsalapatis.com)
------------------------------------------------------
- Removed the additional guard range tree. Adjusted tests accordingly.
Reserved regions now behave like allocated regions, and can be
unreserved using bpf_arena_free_pages(). They can also be allocated
from userspace through minor faults. It is up to the user to prevent
erroneous frees and/or use the BPF_F_SEGV_ON_FAULT flag to catch
stray userspace accesses (Alexei).
- Changed terminology from guard pages to reserved pages (Alexei,
Kartikeya).
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
====================
Link: https://patch.msgid.link/20250709191312.29840-1-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add selftests for the new bpf_arena_reserve_pages kfunc.
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250709191312.29840-3-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a new BPF arena kfunc for reserving a range of arena virtual
addresses without backing them with pages. This prevents the range from
being populated using bpf_arena_alloc_pages().
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250709191312.29840-2-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The prefetch code was referencing CONFIG_DRM_XE_DEVMEM_MIRROR, which has
been replaced by CONFIG_DRM_XE_PAGEMAP. As a result, prefetches were
limited to SRAM. Update the code to use CONFIG_DRM_XE_PAGEMAP instead of
the deprecated option.
Fixes: f86ad0ed620c ("drm/gpusvm, drm/pagemap: Move migration functionality to drm_pagemap")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250710205413.1105595-1-matthew.brost@intel.com
|
|
We need the topology to determine GT page fault queue size, move page
fault init after topology init.
Cc: stable@vger.kernel.org
Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250710191208.1040215-1-matthew.brost@intel.com
|
|
Pull block fixes from Jens Axboe:
- MD changes via Yu:
- fix UAF due to stack memory used for bio mempool (Jinchao)
- fix raid10/raid1 nowait IO error path (Nigel and Qixing)
- fix kernel crash from reading bitmap sysfs entry (Håkon)
- Fix for a UAF in the nbd connect error path
- Fix for blocksize being bigger than pagesize, if THP isn't enabled
* tag 'block-6.16-20250710' of git://git.kernel.dk/linux:
block: reject bs > ps block devices when THP is disabled
nbd: fix uaf in nbd_genl_connect() error path
md/md-bitmap: fix GPF in bitmap_get_stats()
md/raid1,raid10: strip REQ_NOWAIT from member bios
raid10: cleanup memleak at raid10_make_request
md/raid1: Fix stack memory use after return in raid1_reshape
|
|
Add a new vEVENTQ type for VINTFs that are assigned to the user space.
Simply report the two 64-bit LVCMDQ_ERR_MAPs register values.
Link: https://patch.msgid.link/r/68161a980da41fa5022841209638aeff258557b5.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The CMDQV HW supports a user-space use for virtualization cases. It allows
the VM to issue guest-level TLBI or ATC_INV commands directly to the queue
and executes them without a VMEXIT, as HW will replace the VMID field in a
TLBI command and the SID field in an ATC_INV command with the preset VMID
and SID.
This is built upon the vIOMMU infrastructure by allowing VMM to allocate a
VINTF (as a vIOMMU object) and assign VCMDQs (HW QUEUE objs) to the VINTF.
So firstly, replace the standard vSMMU model with the VINTF implementation
but reuse the standard cache_invalidate op (for unsupported commands) and
the standard alloc_domain_nested op (for standard nested STE).
Each VINTF has two 64KB MMIO pages (128B per logical VCMDQ):
- Page0 (directly accessed by guest) has all the control and status bits.
- Page1 (trapped by VMM) has guest-owned queue memory location/size info.
VMM should trap the emulated VINTF0's page1 of the guest VM for the guest-
level VCMDQ location/size info and forward that to the kernel to translate
to a physical memory location to program the VCMDQ HW during an allocation
call. Then, it should mmap the assigned VINTF's page0 to the VINTF0 page0
of the guest VM. This allows the guest OS to read and write the guest-own
VINTF's page0 for direct control of the VCMDQ HW.
For ATC invalidation commands that hold an SID, it requires all devices to
register their virtual SIDs to the SID_MATCH registers and their physical
SIDs to the pairing SID_REPLACE registers, so that HW can use those as a
lookup table to replace those virtual SIDs with the correct physical SIDs.
Thus, implement the driver-allocated vDEVICE op with a tegra241_vintf_sid
structure to allocate SID_REPLACE and to program the SIDs accordingly.
This enables the HW accelerated feature for NVIDIA Grace CPU. Compared to
the standard SMMUv3 operating in the nested translation mode trapping CMDQ
for TLBI and ATC_INV commands, this gives a huge performance improvement:
70% to 90% reductions of invalidation time were measured by various DMA
unmap tests running in a guest OS.
Link: https://patch.msgid.link/r/fb0eab83f529440b6aa181798912a6f0afa21eb0.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
To simplify the mappings from global VCMDQs to VINTFs' LVCMDQs, the design
chose to do static allocations and mappings in the global reset function.
However, with the user-owned VINTF support, it exposes a security concern:
if user space VM only wants one LVCMDQ for a VINTF, statically mapping two
or more LVCMDQs creates a hidden VCMDQ that user space could DoS attack by
writing random stuff to overwhelm the kernel with unhandleable IRQs.
Thus, to support the user-owned VINTF feature, a LVCMDQ mapping has to be
done dynamically.
HW allows pre-assigning global VCMDQs in the CMDQ_ALLOC registers, without
finalizing the mappings by keeping CMDQV_CMDQ_ALLOCATED=0. So, add a pair
of map/unmap helper that simply sets/clears that bit.
For kernel-owned VINTF0, move LVCMDQ mappings to tegra241_vintf_hw_init(),
and the unmappings to tegra241_vintf_hw_deinit().
For user-owned VINTFs that will be added, the mappings/unmappings will be
on demand upon an LVCMDQ allocation from the user space.
However, the dynamic LVCMDQ mapping/unmapping can complicate the timing of
calling tegra241_vcmdq_hw_init/deinit(), which write LVCMDQ address space,
i.e. requiring LVCMDQ to be mapped. Highlight that with a note to the top
of either of them.
Link: https://patch.msgid.link/r/be115a8f75537632daf5995b3e583d8a76553fba.1752126748.git.nicolinc@nvidia.com
Acked-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The current flow of tegra241_cmdqv_remove_vintf() is:
1. For each LVCMDQ, tegra241_vintf_remove_lvcmdq():
a. Disable the LVCMDQ HW
b. Release the LVCMDQ SW resource
2. For current VINTF, tegra241_vintf_hw_deinit():
c. Disable all LVCMDQ HWs
d. Disable VINTF HW
Obviously, the step 1.a and the step 2.c are redundant.
Since tegra241_vintf_hw_deinit() disables all of its LVCMDQ HWs, it could
simplify the flow in tegra241_cmdqv_remove_vintf() by calling that first:
1. For current VINTF, tegra241_vintf_hw_deinit():
a. Disable all LVCMDQ HWs
b. Disable VINTF HW
2. Release all LVCMDQ SW resources
Drop tegra241_vintf_remove_lvcmdq(), and move tegra241_vintf_free_lvcmdq()
as the new step 2.
Link: https://patch.msgid.link/r/86c97c8c4ee9ca192e7e7fa3007c10399d792ce6.1752126748.git.nicolinc@nvidia.com
Acked-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
A vEVENT can be reported only from a threaded IRQ context. Change to using
request_threaded_irq to support that.
Link: https://patch.msgid.link/r/f160193980e3b273afbd1d9cfc3e360084c05ba6.1752126748.git.nicolinc@nvidia.com
Acked-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This will be used by Tegra241 CMDQV implementation to report a non-default
HW info data.
Link: https://patch.msgid.link/r/8a3bf5709358eb21aed2e8434534c30ecf83917c.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
An impl driver might want to allocate its own type of vIOMMU object or the
standard IOMMU_VIOMMU_TYPE_ARM_SMMUV3 by setting up its own SW/HW bits, as
the tegra241-cmdqv driver will add IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV.
Add vsmmu_size/type and vsmmu_init to struct arm_smmu_impl_ops. Prioritize
them in arm_smmu_get_viommu_size() and arm_vsmmu_init().
Link: https://patch.msgid.link/r/375ac2b056764534bb7c10ecc4f34a0bae82b108.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Test both IOMMU_HW_INFO_TYPE_DEFAULT and IOMMU_HW_INFO_TYPE_SELFTEST, and
add a negative test for an unsupported type.
Also drop the unused mask in test_cmd_get_hw_capabilities() as checkpatch
is complaining.
Link: https://patch.msgid.link/r/f01a1e50cd7366f217cbf192ad0b2b79e0eb89f0.1752126748.git.nicolinc@nvidia.com
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The iommu_hw_info can output via the out_data_type field the vendor data
type from a driver, but this only allows driver to report one data type.
Now, with SMMUv3 having a Tegra241 CMDQV implementation, it has two sets
of types and data structs to report.
One way to support that is to use the same type field bidirectionally.
Reuse the same field by adding an "in_data_type", allowing user space to
request for a specific type and to get the corresponding data.
For backward compatibility, since the ioctl handler has never checked an
input value, add an IOMMU_HW_INFO_FLAG_INPUT_TYPE to switch between the
old output-only field and the new bidirectional field.
Link: https://patch.msgid.link/r/887378a7167e1786d9d13cde0c36263ed61823d7.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The hw_info uAPI will support a bidirectional data_type field that can be
used as an input field for user space to request for a specific info data.
To prepare for the uAPI update, change the iommu layer first:
- Add a new IOMMU_HW_INFO_TYPE_DEFAULT as an input, for which driver can
output its only (or firstly) supported type
- Update the kdoc accordingly
- Roll out the type validation in the existing drivers
Link: https://patch.msgid.link/r/00f4a2d3d930721f61367014717b3ba2d1e82a81.1752126748.git.nicolinc@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|