summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-12-14perf tool: Move pmus list variable to a new fileRavi Bangoria4-1/+16
The 'pmus' list variable is defined as static variable under pmu.c file. Introduce a new pmus.c file and migrate this variable to it. Also make it non static so that it can be accessed from outside. Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: carsten.haitzler@arm.com Link: https://lore.kernel.org/r/20221206043237.12159-2-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf util: Add host_is_bigendian to util.hIan Rogers7-22/+29
Avoid libtraceevent dependency for tep_is_bigendian or trace-event.h dependency for bigendian. Add a new host_is_bigendian to util.h, using the compiler defined __BYTE_ORDER__ when available. Committer notes: Added: #else /* !__BYTE_ORDER__ */ On that nested #ifdef block, as per Namhyung's suggestion. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20221130062935.2219247-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf util: Make header guard consistent with toolIan Rogers1-3/+3
Remove git reference by changing GIT_COMPAT_UTIL_H to __PERF_UTIL_H. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20221130062935.2219247-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf stat: Fix invalid output handleJames Clark1-1/+1
In this context, 'os' is already a pointer so the extra dereference isn't required. This fixes the following test failure on aarch64: $ ./perf test "json output" -vvv 92: perf stat JSON output linter : --- start --- Checking json output: no args Test failed for input: ... Fatal error: glibc detected an invalid stdio handle ---- end ---- perf stat JSON output linter: FAILED! Fixes: e7f4da312259e618 ("perf stat: Pass struct outstate to printout()") Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221130111521.334152-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf stat: Fix multi-line metric output in JSONNamhyung Kim1-1/+1
When a metric produces more than one values, it missed to print the opening bracket. Fixes: ab6baaae27357290 ("perf stat: Fix JSON output in metric-only mode") Reported-by: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Weilin Wang <weilin.wang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221202190447.1588680-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14tools lib symbol: Add dependency test to install_headersIan Rogers1-7/+14
Compute the headers to be installed from their source headers and make each have its own build target to install it. Using dependencies avoids headers being reinstalled and getting a new timestamp which then causes files that depend on the header to be rebuilt. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20221202045743.2639466-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14tools lib subcmd: Add dependency test to install_headersIan Rogers1-10/+13
Compute the headers to be installed from their source headers and make each have its own build target to install it. Using dependencies avoids headers being reinstalled and getting a new timestamp which then causes files that depend on the header to be rebuilt. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20221202045743.2639466-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14tools lib perf: Add dependency test to install_headersIan Rogers1-21/+22
Compute the headers to be installed from their source headers and make each have its own build target to install it. Using dependencies avoids headers being reinstalled and getting a new timestamp which then causes files that depend on the header to be rebuilt. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20221202045743.2639466-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14tools lib api: Add dependency test to install_headersIan Rogers1-12/+26
Compute the headers to be installed from their source headers and make each have its own build target to install it. Using dependencies avoids headers being reinstalled and getting a new timestamp which then causes files that depend on the header to be rebuilt. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20221202045743.2639466-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf stat: Fix printing field separator in CSV metrics outputAthira Rajeev1-12/+1
In 'perf stat' with CSV output option, number of fields in metrics output is not matching with number of fields in other event output lines. Sample output below after applying patch to fix printing os->prefix. # ./perf stat -x, --per-socket -a -C 1 ls S0,1,82.11,msec,cpu-clock,82111626,100.00,1.000,CPUs utilized S0,1,2,,context-switches,82109314,100.00,24.358,/sec ------ ====> S0,1,,,,,,,1.71,stalled cycles per insn The above command line uses field separator as "," via "-x," option and per-socket option displays socket value as first field. But here the last line for "stalled cycles per insn" has more separators. Each csv output line is expected to have 8 field separators (for the 9 fields), where as last line has 9 "," in the result. Patch fixes this issue. The counter stats are displayed by function "perf_stat__print_shadow_stats" in code "util/stat-shadow.c". While printing the stats info for "stalled cycles per insn", function "new_line_csv" is used as new_line callback. The fields printed in each line contains: "Socket_id,aggr nr,Avg,unit,event_name,run,enable_percent,ratio,unit" The metric output prints Socket_id, aggr nr, ratio and unit. It has to skip through remaining five fields ie, Avg,unit,event_name,run,enable_percent. The csv line callback uses "os->nfields" to know the number of fields to skip to match with other lines. Currently it is set as: os.nfields = 3 + aggr_fields[config->aggr_mode] + (counter->cgrp ? 1 : 0); But in case of aggregation modes, csv_sep already gets printed along with each field (Function "aggr_printout" in util/stat-display.c). So aggr_fields can be removed from nfields. And fixed number of fields to skip has to be "4". This is to skip fields for: "avg, unit, event name, run, enable_percent" This needs 4 csv separators. Patch removes aggr_fields and uses 4 as fixed number of os->nfields to skip. After the patch: # ./perf stat -x, --per-socket -a -C 1 ls S0,1,79.08,msec,cpu-clock,79085956,100.00,1.000,CPUs utilized S0,1,7,,context-switches,79084176,100.00,88.514,/sec ------ ====> S0,1,,,,,,0.81,stalled cycles per insn Fixes: 92a61f6412d3a09d ("perf stat: Implement CSV metrics output") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20221205042852.83382-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf record: Add remaining branch filters: "no_cycles", "no_flags" & "hw_index"Anshuman Khandual2-0/+8
This adds all remaining branch filters i.e "no_cycles", "no_flags" and "hw_index". While here, also updates the documentation. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20221205064443.533587-1-anshuman.khandual@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14perf stat: Check existence of os->prefix, fixing a segfaultIan Rogers1-1/+2
We need to check if we have a OS prefix, otherwise we stumble on a metric segv that I'm now seeing in Arnaldo's tree: $ gdb --args perf stat -M Backend true ... Performance counter stats for 'true': 4,712,355 TOPDOWN.SLOTS # 17.3 % tma_core_bound Program received signal SIGSEGV, Segmentation fault. __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77 77 ../sysdeps/x86_64/multiarch/strlen-evex.S: No such file or directory. (gdb) bt #0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77 #1 0x00007ffff74749a5 in __GI__IO_fputs (str=0x0, fp=0x7ffff75f5680 <_IO_2_1_stderr_>) #2 0x0000555555779f28 in do_new_line_std (config=0x555555e077c0 <stat_config>, os=0x7fffffffbf10) at util/stat-display.c:356 #3 0x000055555577a081 in print_metric_std (config=0x555555e077c0 <stat_config>, ctx=0x7fffffffbf10, color=0x0, fmt=0x5555558b77b5 "%8.1f", unit=0x7fffffffbb10 "% tma_memory_bound", val=13.165355724442199) at util/stat-display.c:380 #4 0x00005555557768b6 in generic_metric (config=0x555555e077c0 <stat_config>, metric_expr=0x55555593d5b7 "((CYCLE_ACTIVITY.STALLS_MEM_ANY + EXE_ACTIVITY.BOUND_ON_STORES) / (CYCLE_ACTIVITY.STALLS_TOTAL + (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) + EXE_ACTIVITY.BOUND_ON_STORES))"..., metric_events=0x555555f334e0, metric_refs=0x555555ec81d0, name=0x555555f32e80 "TOPDOWN.SLOTS", metric_name=0x555555f26c80 "tma_memory_bound", metric_unit=0x55555593d5b1 "100%", runtime=0, map_idx=0, out=0x7fffffffbd90, st=0x555555e9e620 <rt_stat>) at util/stat-shadow.c:934 #5 0x0000555555778cac in perf_stat__print_shadow_stats (config=0x555555e077c0 <stat_config>, evsel=0x555555f289d0, avg=4712355, map_idx=0, out=0x7fffffffbd90, metric_events=0x555555e078e8 <stat_config+296>, st=0x555555e9e620 <rt_stat>) at util/stat-shadow.c:1329 #6 0x000055555577b6a0 in printout (config=0x555555e077c0 <stat_config>, os=0x7fffffffbf10, uval=4712355, run=325322, ena=325322, noise=4712355, map_idx=0) at util/stat-display.c:741 #7 0x000055555577bc74 in print_counter_aggrdata (config=0x555555e077c0 <stat_config>, counter=0x555555f289d0, s=0, os=0x7fffffffbf10) at util/stat-display.c:838 #8 0x000055555577c1d8 in print_counter (config=0x555555e077c0 <stat_config>, counter=0x555555f289d0, os=0x7fffffffbf10) at util/stat-display.c:957 #9 0x000055555577dba0 in evlist__print_counters (evlist=0x555555ec3610, config=0x555555e077c0 <stat_config>, _target=0x555555e01c80 <target>, ts=0x0, argc=1, argv=0x7fffffffe450) at util/stat-display.c:1413 #10 0x00005555555fc821 in print_counters (ts=0x0, argc=1, argv=0x7fffffffe450) at builtin-stat.c:1040 #11 0x000055555560091a in cmd_stat (argc=1, argv=0x7fffffffe450) at builtin-stat.c:2665 #12 0x00005555556b1eea in run_builtin (p=0x555555e11f70 <commands+336>, argc=4, argv=0x7fffffffe450) at perf.c:322 #13 0x00005555556b2181 in handle_internal_command (argc=4, argv=0x7fffffffe450) at perf.c:376 #14 0x00005555556b22d7 in run_argv (argcp=0x7fffffffe27c, argv=0x7fffffffe270) at perf.c:420 #15 0x00005555556b26ef in main (argc=4, argv=0x7fffffffe450) at perf.c:550 (gdb) Fixes: f123b2d84ecec9a3 ("perf stat: Remove prefix argument in print_metric_headers()") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/CAP-5=fUOjSM5HajU9TCD6prY39LbX4OQbkEbtKPPGRBPBN=_VQ@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14tracing: Do not synchronize freeing of trigger filter on boot upSteven Rostedt (Google)1-2/+8
If a trigger filter on the kernel command line fails to apply (due to syntax error), it will be freed. The freeing will call tracepoint_synchronize_unregister(), but this is not needed during early boot up, and will even trigger a lockdep splat. Avoid calling the synchronization function when system_state is SYSTEM_BOOTING. Link: https://lore.kernel.org/linux-trace-kernel/20221213172429.7774f4ba@gandalf.local.home Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-12-14thermal: intel: Don't set HFI status bit to 1Srinivas Pandruvada1-1/+4
When CPU doesn't support HFI (Hardware Feedback Interface), don't include BIT 26 in the mask to prevent clearing. otherwise this results in: unchecked MSR access error: WRMSR to 0x1b1 (tried to write 0x0000000004000aa8) at rIP: 0xffffffff8b8559fe (throttle_active_work+0xbe/0x1b0) Fixes: 6fe1e64b6026 ("thermal: intel: Prevent accidental clearing of HFI status") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-14gpio: sim: set a limit on the number of GPIOsBartosz Golaszewski1-0/+4
With the removal of ARCH_NR_GPIOS in commit 7b61212f2a07 ("gpiolib: Get rid of ARCH_NR_GPIOS") the gpiolib core no longer sanitizes the number of GPIOs for us. This causes the gpio-sim selftests to now fail when setting the number of GPIOs to 99999 and expecting the probe() to fail. Set a sane limit of 1024 on the number of simulated GPIOs and bail out of probe if it's exceeded. Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/oe-lkp/202212112236.756f5db9-oliver.sang@intel.com Fixes: 7b61212f2a07 ("gpiolib: Get rid of ARCH_NR_GPIOS") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2022-12-14modpost: Include '.text.*' in TEXT_SECTIONSNathan Chancellor1-2/+2
Commit 6c730bfc894f ("modpost: handle -ffunction-sections") added ".text.*" to the OTHER_TEXT_SECTIONS macro to fix certain section mismatch warnings. Unfortunately, this makes it impossible for modpost to warn about section mismatches with LTO, which implies '-ffunction-sections', as all functions are put in their own '.text.<func_name>' sections, which may still reference functions in sections they are not supposed to, such as __init. Fix this by moving ".text.*" into TEXT_SECTIONS, so that configurations with '-ffunction-sections' will see warnings about mismatched sections. Link: https://lore.kernel.org/Y39kI3MOtVI5BAnV@google.com/ Reported-by: Vincent Donnefort <vdonnefort@google.com> Reviewed-and-tested-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Vincent Donnefort <vdonnefort@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-12-14padata: Mark padata_work_init() as __refNathan Chancellor1-2/+10
When building arm64 allmodconfig + ThinLTO with clang and a proposed modpost update to account for -ffuncton-sections, the following warning appears: WARNING: modpost: vmlinux.o: section mismatch in reference: padata_work_init (section: .text.padata_work_init) -> padata_mt_helper (section: .init.text) WARNING: modpost: vmlinux.o: section mismatch in reference: padata_work_init (section: .text.padata_work_init) -> padata_mt_helper (section: .init.text) LLVM has optimized padata_work_init() to include the address of padata_mt_helper() directly because it inlined the other call to padata_work_init() with padata_parallel_worker(), meaning the remaining uses of padata_work_init() use padata_mt_helper() as the work_fn argument. This optimization causes modpost to complain since padata_work_init() is not __init, whereas padata_mt_helper() is. Since padata_work_init() is only called from __init code when padata_mt_helper() is passed as the work_fn argument, mark padata_work_init() as __ref, which makes it clear to modpost that this scenario is okay. Suggested-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-12-14kbuild: ensure Make >= 3.82 is usedMasahiro Yamada1-0/+4
Documentation/process/changes.rst notes the minimal GNU Make version, but it is not checked anywhere. We could check $(MAKE_VERSION), but another simple way is to check $(.FEATURES) since the feature list always grows. GNU Make 3.81 expands $(.FEATURES) to: target-specific order-only second-expansion else-if archives jobserver check-symlink GNU Make 3.82 expands $(.FEATURES) to: target-specific order-only second-expansion else-if shortest-stem undefine archives jobserver check-symlink To ensure Make >= 3.82, you can check either 'shortest-stem' or 'undefine'. This way is not always possible. For example, Make 4.0 through 4.2 have the same set of $(.FEATURES). At that point, we will need to come up with a different approach. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-12-14kbuild: refactor the prerequisites of the modpost ruleMasahiro Yamada1-14/+22
The prerequisites of modpost are cluttered. The variables *-if-present and *-if-needed are unreadable. It is cleaner to append them into modpost-deps. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-12-14kbuild: change module.order to list *.o instead of *.koMasahiro Yamada9-21/+21
scripts/Makefile.build replaces the suffix .o with .ko, then scripts/Makefile.modpost calls the sed command to change .ko back to the original .o suffix. Instead of converting the suffixes back-and-forth, store the .o paths in modules.order, and replace it with .ko in 'make modules_install'. This avoids the unneeded sed command. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
2022-12-14cifs: Remove duplicated include in cifsglob.hYang Li1-1/+0
./fs/cifs/cifsglob.h: linux/scatterlist.h is included more than once. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3459 Fixes: f7f291e14dde ("cifs: fix oops during encryption") Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2022-12-14Merge tag 'mm-stable-2022-12-13' of ↵Linus Torvalds237-5047/+9281
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - More userfaultfs work from Peter Xu - Several convert-to-folios series from Sidhartha Kumar and Huang Ying - Some filemap cleanups from Vishal Moola - David Hildenbrand added the ability to selftest anon memory COW handling - Some cpuset simplifications from Liu Shixin - Addition of vmalloc tracing support by Uladzislau Rezki - Some pagecache folioifications and simplifications from Matthew Wilcox - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it - Miguel Ojeda contributed some cleanups for our use of the __no_sanitize_thread__ gcc keyword. This series should have been in the non-MM tree, my bad - Naoya Horiguchi improved the interaction between memory poisoning and memory section removal for huge pages - DAMON cleanups and tuneups from SeongJae Park - Tony Luck fixed the handling of COW faults against poisoned pages - Peter Xu utilized the PTE marker code for handling swapin errors - Hugh Dickins reworked compound page mapcount handling, simplifying it and making it more efficient - Removal of the autonuma savedwrite infrastructure from Nadav Amit and David Hildenbrand - zram support for multiple compression streams from Sergey Senozhatsky - David Hildenbrand reworked the GUP code's R/O long-term pinning so that drivers no longer need to use the FOLL_FORCE workaround which didn't work very well anyway - Mel Gorman altered the page allocator so that local IRQs can remnain enabled during per-cpu page allocations - Vishal Moola removed the try_to_release_page() wrapper - Stefan Roesch added some per-BDI sysfs tunables which are used to prevent network block devices from dirtying excessive amounts of pagecache - David Hildenbrand did some cleanup and repair work on KSM COW breaking - Nhat Pham and Johannes Weiner have implemented writeback in zswap's zsmalloc backend - Brian Foster has fixed a longstanding corner-case oddity in file[map]_write_and_wait_range() - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang Chen - Shiyang Ruan has done some work on fsdax, to make its reflink mode work better under xfstests. Better, but still not perfect - Christoph Hellwig has removed the .writepage() method from several filesystems. They only need .writepages() - Yosry Ahmed wrote a series which fixes the memcg reclaim target beancounting - David Hildenbrand has fixed some of our MM selftests for 32-bit machines - Many singleton patches, as usual * tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits) mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio mm: mmu_gather: allow more than one batch of delayed rmaps mm: fix typo in struct pglist_data code comment kmsan: fix memcpy tests mm: add cond_resched() in swapin_walk_pmd_entry() mm: do not show fs mm pc for VM_LOCKONFAULT pages selftests/vm: ksm_functional_tests: fixes for 32bit selftests/vm: cow: fix compile warning on 32bit selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem mm,thp,rmap: fix races between updates of subpages_mapcount mm: memcg: fix swapcached stat accounting mm: add nodes= arg to memory.reclaim mm: disable top-tier fallback to reclaim on proactive reclaim selftests: cgroup: make sure reclaim target memcg is unprotected selftests: cgroup: refactor proactive reclaim code to reclaim_until() mm: memcg: fix stale protection of reclaim target memcg mm/mmap: properly unaccount memory on mas_preallocate() failure omfs: remove ->writepage jfs: remove ->writepage ...
2022-12-14LoongArch: Update Loongson-3 default config fileHuacai Chen1-24/+32
1, Enable suspend (ACPI S3) and hibernation (ACPI S4). 2, Enable some options for FDT-based systems (e.g., SERIAL_OF_PLATFORM). 3, Enable CONFIG_KALLSYMS_ALL and CONFIG_DEBUG_FS to convenient ftrace. 4, Regenerate the whole file to keep the order of options be the same as the latest source code. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: modules/ftrace: Initialize PLT at load timeQing Zhang8-2/+148
This patch implements ftrace trampolines through plt entry. Tested by forcing ftrace_make_call() to use the module PLT, and then loading up a module after setting up ftrace with: | echo ":mod:<module-name>" > set_ftrace_filter; | echo function > current_tracer; | modprobe <module-name> Since FTRACE_ADDR/FTRACE_REGS_ADDR is only defined when CONFIG_DYNAMIC_ FTRACE is selected, we wrap their usage in module_init_ftrace_plt() with ifdeffery rather than using IS_ENABLED(). Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR supportQing Zhang5-6/+19
ftrace_graph_ret_addr() can be called by stack unwinding code to convert a found stack return address ('ret') to its original value, in case the function graph tracer has modified it to be 'return_to_handler'. If the hasn't been modified, the unchanged value of 'ret' is returned. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch/ftrace: Add HAVE_DYNAMIC_FTRACE_WITH_ARGS supportQing Zhang3-0/+29
Allow for arguments to be passed in to ftrace_regs by default. If this is set, then arguments and stack can be found from the pt_regs. 1. HAVE_DYNAMIC_FTRACE_WITH_ARGS don't need special hook for graph tracer entry point, but instead we can use graph_ops::func function to install the return_hooker. 2. Livepatch requires this option in the future. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch/ftrace: Add HAVE_DYNAMIC_FTRACE_WITH_REGS supportQing Zhang4-2/+51
This patch implements CONFIG_DYNAMIC_FTRACE_WITH_REGS on LoongArch, which allows a traced function's arguments (and some other registers) to be captured into a struct pt_regs, allowing these to be inspected and modified. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch/ftrace: Add dynamic function graph tracer supportQing Zhang3-0/+101
Once the function_graph tracer is enabled, a filtered function has the following call sequence: 1) ftracer_caller ==> on/off by ftrace_make_call/ftrace_make_nop 2) ftrace_graph_caller 3) ftrace_graph_call ==> on/off by ftrace_en/disable_ftrace_graph_caller 4) prepare_ftrace_return Considering the following DYNAMIC_FTRACE_WITH_REGS feature, it would be more extendable to have a ftrace_graph_caller function, instead of calling prepare_ftrace_return directly in ftrace_caller. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch/ftrace: Add dynamic function tracer supportQing Zhang10-10/+367
The compiler has inserted 2 NOPs before the regular function prologue. T series registers are available and safe because of LoongArch's psABI. At runtime, we can replace nop with bl to enable ftrace call and replace bl with nop to disable ftrace call. The bl instruction requires us to save the original RA value, so it saves RA at t0 here. Details are: | Compiled | Disabled | Enabled | +------------+------------------------+------------------------+ | nop | move t0, ra | move t0, ra | | nop | nop | bl ftrace_caller | | func_body | func_body | func_body | The RA value will be recovered by ftrace_regs_entry, and restored into RA before returning to the regular function prologue. When a function is not being traced, the "move t0, ra" is not harmful. 1) ftrace_make_call, ftrace_make_nop (in kernel/ftrace.c) The two functions turn each recorded call site of filtered functions into a call to ftrace_caller or nops. 2) ftracce_update_ftrace_func (in kernel/ftrace.c) turns the nops at ftrace_call into a call to a generic entry for function tracers. 3) ftrace_caller (in kernel/mcount_dyn.S) The entry where each _mcount call sites calls to once they are filtered to be traced. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch/ftrace: Add recordmcount supportQing Zhang2-0/+41
Recordmcount utility under scripts is run, after compiling each object, to find out all the locations of calling _mcount() and put them into specific seciton named __mcount_loc. Then the linker collects all such information into a table in the kernel image (between __start_mcount_loc and __stop_mcount_loc) for later use by ftrace. This patch adds LoongArch specific definitions to identify such locations. And on LoongArch, only the C version is used to build the kernel now that CONFIG_HAVE_C_RECORDMCOUNT is on. Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch/ftrace: Add basic supportQing Zhang5-0/+200
This patch contains basic ftrace support for LoongArch. Specifically, function tracer (HAVE_FUNCTION_TRACER), function graph tracer (HAVE_ FUNCTION_GRAPH_TRACER) are implemented following the instructions in Documentation/trace/ftrace-design.txt. Use `-pg` makes stub like a child function `void _mcount(void *ra)`. Thus, it can be seen store RA and alloc stack before `call _mcount`. Find `alloc stack` at first, and then find `store RA`. Note that the functions in both inst.c and time.c should not be hooked with the compiler's -pg option: to prevent infinite self-referencing for the former, and to ignore early setup stuff for the latter. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: module: Use got/plt section indices for relocationsHuacai Chen3-47/+68
Instead of saving a pointer to the .got, .plt and .plt_idx sections to apply {got,plt}-based relocations, save and use their section indices instead. The mod->arch.{core,init}.{got,plt} pointers were problematic for live- patch because they pointed within temporary section headers (provided by the module loader via info->sechdrs) that would be freed after module load. Since livepatch modules may need to apply relocations post-module- load (for example, to patch a module that is loaded later), using section indices to offset into the section headers (instead of accessing them through a saved pointer) allows livepatch modules on LoongArch to pass in their own copy of the section headers to apply_relocate_add() to apply delayed relocations. The method used is same as commit c8ebf64eab743 ("arm64/module: use plt section indices for relocations"). Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Add basic STACKPROTECTOR supportHuacai Chen5-0/+53
Add basic stack protector support similar to other architectures. A constant canary value is set at boot time, and with help of compiler's -fstack-protector we can detect stack corruption. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Add hibernation (ACPI S4) supportHuacai Chen7-0/+154
Add hibernation (Suspend to Disk, aka ACPI S4) support for LoongArch. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Add suspend (ACPI S3) supportHuacai Chen13-3/+260
Add suspend (Suspend To RAM, aka ACPI S3) support for LoongArch. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Add processing ISA Node in DeviceTreeBinbin Zhou1-0/+75
Similar to commit 6d0068ad15e4f771b3 ("MIPS: Loongson64: Process ISA Node in DeviceTree"), we process ISA node in DeviceTree for FDT-based systems. Previously, we are hardcoding reserved ISA I/O Space in, now we are processing it I/O via DeviceTree directly. The ranges property of ISA node is used to determine the size and address of reserved I/O space. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Add FDT booting support from efi system tableBinbin Zhou10-7/+145
Since commit 40cd01a9c324("efi/loongarch: libstub: remove dependency on flattened DT"), we can parse the FDT from efi system table. And now, LoongArch is coming to support booting with FDT, so we add the relevant booting support as well as parameter parsing. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Use alternative to optimize librariesHuacai Chen7-11/+465
Use the alternative to optimize common libraries according whether CPU has UAL (hardware unaligned access support) feature, including memset(), memcopy(), memmove(), copy_user() and clear_user(). We have tested UnixBench on a Loongson-3A5000 quad-core machine (1.6GHz): 1, One copy, before patch: System Benchmarks Index Values BASELINE RESULT INDEX Dhrystone 2 using register variables 116700.0 9566582.0 819.8 Double-Precision Whetstone 55.0 2805.3 510.1 Execl Throughput 43.0 2120.0 493.0 File Copy 1024 bufsize 2000 maxblocks 3960.0 209833.0 529.9 File Copy 256 bufsize 500 maxblocks 1655.0 89400.0 540.2 File Copy 4096 bufsize 8000 maxblocks 5800.0 320036.0 551.8 Pipe Throughput 12440.0 340624.0 273.8 Pipe-based Context Switching 4000.0 109939.1 274.8 Process Creation 126.0 4728.7 375.3 Shell Scripts (1 concurrent) 42.4 2223.1 524.3 Shell Scripts (8 concurrent) 6.0 883.1 1471.9 System Call Overhead 15000.0 518639.1 345.8 ======== System Benchmarks Index Score 500.2 2, One copy, after patch: System Benchmarks Index Values BASELINE RESULT INDEX Dhrystone 2 using register variables 116700.0 9567674.7 819.9 Double-Precision Whetstone 55.0 2805.5 510.1 Execl Throughput 43.0 2392.7 556.4 File Copy 1024 bufsize 2000 maxblocks 3960.0 417804.0 1055.1 File Copy 256 bufsize 500 maxblocks 1655.0 112909.5 682.2 File Copy 4096 bufsize 8000 maxblocks 5800.0 1255207.4 2164.2 Pipe Throughput 12440.0 555712.0 446.7 Pipe-based Context Switching 4000.0 99964.5 249.9 Process Creation 126.0 5192.5 412.1 Shell Scripts (1 concurrent) 42.4 2302.4 543.0 Shell Scripts (8 concurrent) 6.0 919.6 1532.6 System Call Overhead 15000.0 511159.3 340.8 ======== System Benchmarks Index Score 640.1 3, Four copies, before patch: System Benchmarks Index Values BASELINE RESULT INDEX Dhrystone 2 using register variables 116700.0 38268610.5 3279.2 Double-Precision Whetstone 55.0 11222.2 2040.4 Execl Throughput 43.0 7892.0 1835.3 File Copy 1024 bufsize 2000 maxblocks 3960.0 235149.6 593.8 File Copy 256 bufsize 500 maxblocks 1655.0 74959.6 452.9 File Copy 4096 bufsize 8000 maxblocks 5800.0 545048.5 939.7 Pipe Throughput 12440.0 1337359.0 1075.0 Pipe-based Context Switching 4000.0 473663.9 1184.2 Process Creation 126.0 17491.2 1388.2 Shell Scripts (1 concurrent) 42.4 6865.7 1619.3 Shell Scripts (8 concurrent) 6.0 1015.9 1693.1 System Call Overhead 15000.0 1899535.2 1266.4 ======== System Benchmarks Index Score 1278.3 4, Four copies, after patch: System Benchmarks Index Values BASELINE RESULT INDEX Dhrystone 2 using register variables 116700.0 38272815.5 3279.6 Double-Precision Whetstone 55.0 11222.8 2040.5 Execl Throughput 43.0 8839.2 2055.6 File Copy 1024 bufsize 2000 maxblocks 3960.0 313912.9 792.7 File Copy 256 bufsize 500 maxblocks 1655.0 80976.1 489.3 File Copy 4096 bufsize 8000 maxblocks 5800.0 1176594.3 2028.6 Pipe Throughput 12440.0 2100941.9 1688.9 Pipe-based Context Switching 4000.0 476696.4 1191.7 Process Creation 126.0 18394.7 1459.9 Shell Scripts (1 concurrent) 42.4 7172.2 1691.6 Shell Scripts (8 concurrent) 6.0 1058.3 1763.9 System Call Overhead 15000.0 1874714.7 1249.8 ======== System Benchmarks Index Score 1488.8 Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Add alternative runtime patching mechanismHuacai Chen9-1/+507
Introduce the "alternative" mechanism from ARM64 and x86 for LoongArch to apply runtime patching. The main purpose of this patch is to provide a framework. In future we can use this mechanism (i.e., the ALTERNATIVE and ALTERNATIVE_2 macros) to optimize hotspot functions according to cpu features. Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Add unaligned access supportHuacai Chen9-7/+634
Loongson-2 series (Loongson-2K500, Loongson-2K1000) don't support unaligned access in hardware, while Loongson-3 series (Loongson-3A5000, Loongson-3C5000) are configurable whether support unaligned access in hardware. This patch add unaligned access emulation for those LoongArch processors without hardware support. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: BPF: Add BPF exception tablesYouling Tang5-5/+96
Inspired by commit 800834285361("bpf, arm64: Add BPF exception tables"), do similar to LoongArch to add BPF exception tables. When a tracing BPF program attempts to read memory without using the bpf_probe_read() helper, the verifier marks the load instruction with the BPF_PROBE_MEM flag. Since the LoongArch JIT does not currently recognize this flag it falls back to the interpreter. Add support for BPF_PROBE_MEM, by appending an exception table to the BPF program. If the load instruction causes a data abort, the fixup infrastructure finds the exception table and fixes up the fault, by clearing the destination register and jumping over the faulting instruction. To keep the compact exception table entry format, inspect the pc in fixup_exception(). A more generic solution would add a "handler" field to the table entry, like on x86, s390 and arm64, etc. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Remove the .fixup section usageYouling Tang2-19/+11
Use the `.L_xxx` label to improve fixup code and then remove the .fixup section usage. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: extable: Add a dedicated uaccess handlerYouling Tang5-29/+76
Inspired by commit 2e77a62cb3a6("arm64: extable: add a dedicated uaccess handler"), do similar to LoongArch to add a dedicated uaccess exception handler to update registers in exception context and subsequently return back into the function which faulted, so we remove the need for fixups specialized to each faulting instruction. Add gpr-num.h here because we need to map the same GPR names to integer constants, so that we can use this to build meta-data for the exception fixups. The compiler treats gpr 0 as zero rather than $r0, so set it separately to .L__gpr_num_zero, otherwise the following assembly error will occurs: {standard input}: Assembler messages: {standard input}:1074: Error: invalid operands (*UND* and *ABS* sections) for `<<' {standard input}:1160: Error: invalid operands (*UND* and *ABS* sections) for `<<' make[1]: *** [scripts/Makefile.build:249: fs/fcntl.o] Error 1 Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: extable: Add `type` and `data` fieldsYouling Tang5-8/+30
This is a LoongArch port of commit d6e2cc564775 ("arm64: extable: add `type` and `data` fields"). Subsequent patches will add specialized handlers for fixups, in addition to the simple PC fixup we have today. In preparation, this patch adds a new `type` field to struct exception_table_entry, and uses this to distinguish the fixup and other cases. A `data` field is also added so that subsequent patches can associate data specific to each exception site (e.g. register numbers). Handlers are named ex_handler_*() for consistency, following the example of x86. At the same time, get_ex_fixup() is split out into a helper so that it can be used by other ex_handler_*() functions in the subsequent patches. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Switch to relative exception tablesYouling Tang6-18/+69
Similar to other architectures such as arm64, x86, riscv and so on, use offsets relative to the exception table entry values rather than their absolute addresses for both the exception location and the fixup. However, LoongArch label difference because it will actually produce two relocations, a pair of R_LARCH_ADD32 and R_LARCH_SUB32. Take simple code below for example: $ cat test_ex_table.S .section .text 1: nop .section __ex_table,"a" .balign 4 .long (1b - .) .previous $ loongarch64-unknown-linux-gnu-gcc -c test_ex_table.S $ loongarch64-unknown-linux-gnu-readelf -Wr test_ex_table.o Relocation section '.rela__ex_table' at offset 0x100 contains 2 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 0000000600000032 R_LARCH_ADD32 0000000000000000 .L1^B1 + 0 0000000000000000 0000000500000037 R_LARCH_SUB32 0000000000000000 L0^A + 0 The modpost will complain the R_LARCH_SUB32 relocation, so we need to patch modpost.c to skip this relocation for .rela__ex_table section. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Consolidate __ex_table constructionYouling Tang6-23/+49
Consolidate all the __ex_table constuction code with a _ASM_EXTABLE or _asm_extable helper. There should be no functional change as a result of this patch. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14Merge branch 'zstd-next' into zstd-linusNick Terrell40-2622/+6987
2022-12-14Merge branch 'main' into zstd-linusNick Terrell34342-1047106/+4115128
2022-12-14Merge tag 'net-next-6.2' of ↵Linus Torvalds2013-34345/+165926
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Allow live renaming when an interface is up - Add retpoline wrappers for tc, improving considerably the performances of complex queue discipline configurations - Add inet drop monitor support - A few GRO performance improvements - Add infrastructure for atomic dev stats, addressing long standing data races - De-duplicate common code between OVS and conntrack offloading infrastructure - A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements - Netfilter: introduce packet parser for tunneled packets - Replace IPVS timer-based estimators with kthreads to scale up the workload with the number of available CPUs - Add the helper support for connection-tracking OVS offload BPF: - Support for user defined BPF objects: the use case is to allocate own objects, build own object hierarchies and use the building blocks to build own data structures flexibly, for example, linked lists in BPF - Make cgroup local storage available to non-cgroup attached BPF programs - Avoid unnecessary deadlock detection and failures wrt BPF task storage helpers - A relevant bunch of BPF verifier fixes and improvements - Veristat tool improvements to support custom filtering, sorting, and replay of results - Add LLVM disassembler as default library for dumping JITed code - Lots of new BPF documentation for various BPF maps - Add bpf_rcu_read_{,un}lock() support for sleepable programs - Add RCU grace period chaining to BPF to wait for the completion of access from both sleepable and non-sleepable BPF programs - Add support storing struct task_struct objects as kptrs in maps - Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer values - Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions Protocols: - TCP: implement Protective Load Balancing across switch links - TCP: allow dynamically disabling TCP-MD5 static key, reverting back to fast[er]-path - UDP: Introduce optional per-netns hash lookup table - IPv6: simplify and cleanup sockets disposal - Netlink: support different type policies for each generic netlink operation - MPTCP: add MSG_FASTOPEN and FastOpen listener side support - MPTCP: add netlink notification support for listener sockets events - SCTP: add VRF support, allowing sctp sockets binding to VRF devices - Add bridging MAC Authentication Bypass (MAB) support - Extensions for Ethernet VPN bridging implementation to better support multicast scenarios - More work for Wi-Fi 7 support, comprising conversion of all the existing drivers to internal TX queue usage - IPSec: introduce a new offload type (packet offload) allowing complete header processing and crypto offloading - IPSec: extended ack support for more descriptive XFRM error reporting - RXRPC: increase SACK table size and move processing into a per-local endpoint kernel thread, reducing considerably the required locking - IEEE 802154: synchronous send frame and extended filtering support, initial support for scanning available 15.4 networks - Tun: bump the link speed from 10Mbps to 10Gbps - Tun/VirtioNet: implement UDP segmentation offload support Driver API: - PHY/SFP: improve power level switching between standard level 1 and the higher power levels - New API for netdev <-> devlink_port linkage - PTP: convert existing drivers to new frequency adjustment implementation - DSA: add support for rx offloading - Autoload DSA tagging driver when dynamically changing protocol - Add new PCP and APPTRUST attributes to Data Center Bridging - Add configuration support for 800Gbps link speed - Add devlink port function attribute to enable/disable RoCE and migratable - Extend devlink-rate to support strict prioriry and weighted fair queuing - Add devlink support to directly reading from region memory - New device tree helper to fetch MAC address from nvmem - New big TCP helper to simplify temporary header stripping New hardware / drivers: - Ethernet: - Marvel Octeon CNF95N and CN10KB Ethernet Switches - Marvel Prestera AC5X Ethernet Switch - WangXun 10 Gigabit NIC - Motorcomm yt8521 Gigabit Ethernet - Microchip ksz9563 Gigabit Ethernet Switch - Microsoft Azure Network Adapter - Linux Automation 10Base-T1L adapter - PHY: - Aquantia AQR112 and AQR412 - Motorcomm YT8531S - PTP: - Orolia ART-CARD - WiFi: - MediaTek Wi-Fi 7 (802.11be) devices - RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB devices - Bluetooth: - Broadcom BCM4377/4378/4387 Bluetooth chipsets - Realtek RTL8852BE and RTL8723DS - Cypress.CYW4373A0 WiFi + Bluetooth combo device Drivers: - CAN: - gs_usb: bus error reporting support - kvaser_usb: listen only and bus error reporting support - Ethernet NICs: - Intel (100G): - extend action skbedit to RX queue mapping - implement devlink-rate support - support direct read from memory - nVidia/Mellanox (mlx5): - SW steering improvements, increasing rules update rate - Support for enhanced events compression - extend H/W offload packet manipulation capabilities - implement IPSec packet offload mode - nVidia/Mellanox (mlx4): - better big TCP support - Netronome Ethernet NICs (nfp): - IPsec offload support - add support for multicast filter - Broadcom: - RSS and PTP support improvements - AMD/SolarFlare: - netlink extened ack improvements - add basic flower matches to offload, and related stats - Virtual NICs: - ibmvnic: introduce affinity hint support - small / embedded: - FreeScale fec: add initial XDP support - Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood - TI am65-cpsw: add suspend/resume support - Mediatek MT7986: add RX wireless wthernet dispatch support - Realtek 8169: enable GRO software interrupt coalescing per default - Ethernet high-speed switches: - Microchip (sparx5): - add support for Sparx5 TC/flower H/W offload via VCAP - Mellanox mlxsw: - add 802.1X and MAC Authentication Bypass offload support - add ip6gre support - Embedded Ethernet switches: - Mediatek (mtk_eth_soc): - improve PCS implementation, add DSA untag support - enable flow offload support - Renesas: - add rswitch R-Car Gen4 gPTP support - Microchip (lan966x): - add full XDP support - add TC H/W offload via VCAP - enable PTP on bridge interfaces - Microchip (ksz8): - add MTU support for KSZ8 series - Qualcomm 802.11ax WiFi (ath11k): - support configuring channel dwell time during scan - MediaTek WiFi (mt76): - enable Wireless Ethernet Dispatch (WED) offload support - add ack signal support - enable coredump support - remain_on_channel support - Intel WiFi (iwlwifi): - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities - 320 MHz channels support - RealTek WiFi (rtw89): - new dynamic header firmware format support - wake-over-WLAN support" * tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits) ipvs: fix type warning in do_div() on 32 bit net: lan966x: Remove a useless test in lan966x_ptp_add_trap() net: ipa: add IPA v4.7 support dt-bindings: net: qcom,ipa: Add SM6350 compatible bnxt: Use generic HBH removal helper in tx path IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver selftests: forwarding: Add bridge MDB test selftests: forwarding: Rename bridge_mdb test bridge: mcast: Support replacement of MDB port group entries bridge: mcast: Allow user space to specify MDB entry routing protocol bridge: mcast: Allow user space to add (*, G) with a source list and filter mode bridge: mcast: Add support for (*, G) with a source list and filter mode bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source bridge: mcast: Add a flag for user installed source entries bridge: mcast: Expose __br_multicast_del_group_src() bridge: mcast: Expose br_multicast_new_group_src() bridge: mcast: Add a centralized error path bridge: mcast: Place netlink policy before validation functions bridge: mcast: Split (*, G) and (S, G) addition into different functions bridge: mcast: Do not derive entry type from its filter mode ...
2022-12-14Merge tag 'xtensa-20221213' of https://github.com/jcmvbkbc/linux-xtensaLinus Torvalds10-11/+234
Pull Xtensa updates from Max Filippov: - fix kernel build with gcc-13 - various minor fixes * tag 'xtensa-20221213' of https://github.com/jcmvbkbc/linux-xtensa: xtensa: add __umulsidi3 helper xtensa: update config files MAINTAINERS: update the 'T:' entry for xtensa