Age | Commit message (Collapse) | Author | Files | Lines |
|
When shift is nul, operations remain valid so no test needed.
And 'ret' is unnecessary.
And use IS_ALIGNED() to check alignment, that's more clear.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/373ec500f386374bc5735007df3d3869eac47be1.1624618701.git.christophe.leroy@csgroup.eu
|
|
This patch converts powerpc to the generic PTDUMP implementation.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/03166d569526be70214fe9370a7bad219d2f41c8.1625762907.git.christophe.leroy@csgroup.eu
|
|
Do the same as commit f8f0d0b6fa20 ("mm: ptdump: reduce level numbers
by 1 in note_page()") and add missing p4d level.
This will align powerpc to the users of generic ptdump.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d76495c574132b197b445a1f133755cca4b912a4.1625762906.git.christophe.leroy@csgroup.eu
|
|
note_page_update_state() doesn't use page_size. Remove it.
Could also be removed to note_page() but as a following patch
will remove all current users of note_page(), just leave it as
is for now.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e2f80d052001155251bfe009c360d0c5d9242c6b.1625762906.git.christophe.leroy@csgroup.eu
|
|
Use DEFINE_SHOW_ATTRIBUTE() instead of open coding
open() and fops.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b864a92693ca8413ef0b19f0c12065c212899b6e.1625762905.git.christophe.leroy@csgroup.eu
|
|
bl;mflr is used at several places to get code position.
Use bcl 20,31,+4 instead of bl in order to preserve link stack.
See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
in __get_datapage()") for details.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c6eabb4fb6c156f75d56dcbcc6f243e5ac0fba42.1629791763.git.christophe.leroy@csgroup.eu
|
|
Use bcl 20,31,+4 instead of bl in order to preserve link stack.
See commit c974809a26a1 ("powerpc/vdso: Avoid link stack corruption
in __get_datapage()") for details.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e9fbc285eceb720e6c0e032ef47fe8b05f669b48.1629791751.git.christophe.leroy@csgroup.eu
|
|
ret_from_kernel_thread
Copied from commit 89bbe4c798bc ("powerpc/64: indirect function call
use bctrl rather than blrl in ret_from_kernel_thread")
blrl is not recommended to use as an indirect function call, as it may
corrupt the link stack predictor.
This is not a performance critical path but this should be fixed for
consistency.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/91b1d242525307ceceec7ef6e832bfbacdd4501b.1629436472.git.christophe.leroy@csgroup.eu
|
|
The book3s_64_mmu_radix.o object is not part of the KVM builtins and
all the callers of the exported symbols are in the same kvm-hv.ko
module so we should not need to export any symbols.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210805212616.2641017-4-farosas@linux.ibm.com
|
|
Both paths into __kvmhv_copy_tofrom_guest_radix ensure that we arrive
with an effective address that is smaller than our total addressable
space and addresses quadrant 0.
- The H_COPY_TOFROM_GUEST hypercall path rejects the call with
H_PARAMETER if the effective address has any of the twelve most
significant bits set.
- The kvmhv_copy_tofrom_guest_radix path clears the top twelve bits
before calling the internal function.
Although the callers make sure that the effective address is sane, any
future use of the function is exposed to a programming error, so add a
sanity check.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210805212616.2641017-3-farosas@linux.ibm.com
|
|
The __kvmhv_copy_tofrom_guest_radix function was introduced along with
nested HV guest support. It uses the platform's Radix MMU quadrants to
provide a nested hypervisor with fast access to its nested guests
memory (H_COPY_TOFROM_GUEST hypercall). It has also since been added
as a fast path for the kvmppc_ld/st routines which are used during
instruction emulation.
The commit def0bfdbd603 ("powerpc: use probe_user_read() and
probe_user_write()") changed the low level copy function from
raw_copy_from_user to probe_user_read, which adds a check to
access_ok. In powerpc that is:
static inline bool __access_ok(unsigned long addr, unsigned long size)
{
return addr < TASK_SIZE_MAX && size <= TASK_SIZE_MAX - addr;
}
and TASK_SIZE_MAX is 0x0010000000000000UL for 64-bit, which means that
setting the two MSBs of the effective address (which correspond to the
quadrant) now cause access_ok to reject the access.
This was not caught earlier because the most common code path via
kvmppc_ld/st contains a fallback (kvm_read_guest) that is likely to
succeed for L1 guests. For nested guests there is no fallback.
Another issue is that probe_user_read (now __copy_from_user_nofault)
does not return the number of bytes not copied in case of failure, so
the destination memory is not being cleared anymore in
kvmhv_copy_from_guest_radix:
ret = kvmhv_copy_tofrom_guest_radix(vcpu, eaddr, to, NULL, n);
if (ret > 0) <-- always false!
memset(to + (n - ret), 0, ret);
This patch fixes both issues by skipping access_ok and open-coding the
low level __copy_to/from_user_inatomic.
Fixes: def0bfdbd603 ("powerpc: use probe_user_read() and probe_user_write()")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210805212616.2641017-2-farosas@linux.ibm.com
|
|
This fixes a compile error with W=1.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210823090039.166120-3-clg@kaod.org
|
|
This fixes a compile error with W=1.
arch/powerpc/kernel/prom.c: In function ‘early_reserve_mem’:
arch/powerpc/kernel/prom.c:625:10: error: variable ‘reserve_map’ set but not used [-Werror=unused-but-set-variable]
__be64 *reserve_map;
^~~~~~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210823090039.166120-2-clg@kaod.org
|
|
__NR__exit is nowhere used. On most architectures it was removed by
commit 135ab6ec8fda ("[PATCH] remove remaining errno and
__KERNEL_SYSCALLS__ references") but not on powerpc.
powerpc removed __KERNEL_SYSCALLS__ in commit 3db03b4afb3e ("[PATCH]
rename the provided execve functions to kernel_execve"), but __NR__exit
was left over.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6457eb4f327313323ed1f70e540bbb4ddc9178fa.1629701106.git.christophe.leroy@csgroup.eu
|
|
Make use of is_32bit_task() and CONFIG_CPU_LITTLE_ENDIAN
to simplify syscall_get_arch().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4be53b9187a4d8c163968f4d224267e41a7fcc33.1629451479.git.christophe.leroy@csgroup.eu
|
|
Use is_32bit_task() which already handles CONFIG_COMPAT.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ba49cdd574558a0363300c3f6b5b062b397cb071.1629451483.git.christophe.leroy@csgroup.eu
|
|
This fixes a compile error with W=1.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210819125656.14498-5-clg@kaod.org
|
|
This fixes a compile error with W=1.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210819125656.14498-4-clg@kaod.org
|
|
This fixes a compile error with W=1.
Fixes: 6d0aaf5e0de0 ("powerpc/pseries/vas: Setup IRQ and fault handling")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210819125656.14498-3-clg@kaod.org
|
|
H_GetPerformanceCounterInfo (0xF080) hcall returns the counter data in
the result buffer. Result buffer has specific format defined in the PAPR
specification. One of the fields is counter offset and width of the
counter data returned.
Counter data are returned in a unsigned char array in big endian byte
order. To get the final counter data, the values must be left shifted
byte at a time. But commit 220a0c609ad17 ("powerpc/perf: Add support for
the hv gpci (get performance counter info) interface") made the shifting
bitwise and also assumed little endian order. Because of that, hcall
counters values are reported incorrectly.
In particular this can lead to counters go backwards which messes up the
counter prev vs now calculation and leads to huge counter value
reporting:
#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
-C 0 -I 1000
time counts unit events
1.000078854 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
2.000213293 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
3.000320107 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
4.000428392 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
5.000537864 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
6.000649087 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
7.000760312 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
8.000865218 16,448 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
9.000978985 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
10.001088891 16,384 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
11.001201435 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
12.001307937 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
Fix the shifting logic to correct match the format, ie. read bytes in
big endian order.
Fixes: e4f226b1580b ("powerpc/perf/hv-gpci: Increase request buffer size")
Cc: stable@vger.kernel.org # v4.6+
Reported-by: Nageswara R Sastry<rnsastry@linux.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Nageswara R Sastry<rnsastry@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210813082158.429023-1-kjain@linux.ibm.com
|
|
This patch prevents the following sparse warning.
arch/powerpc/kernel/tau_6xx.c:199:1: sparse: sparse: symbol 'tau_work'
was not declared. Should it be static?
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/44ab381741916a51e783c4a50d0b186abdd8f280.1629334014.git.fthain@linux-m68k.org
|
|
Commit a278e7ea608b ("powerpc: Fix compile issue with force DAWR")
selects the non-existing config PPC_DAWR_FORCE_ENABLE for config
KVM_BOOK3S_64_HANDLER. As this commit also introduces a config PPC_DAWR
and this config PPC_DAWR is selected with PPC if PPC64, there is no
need for any further select in the KVM_BOOK3S_64_HANDLER.
Remove an obsolete and unneeded select in config KVM_BOOK3S_64_HANDLER.
The issue was identified with ./scripts/checkkconfigsymbols.py.
Fixes: a278e7ea608b ("powerpc: Fix compile issue with force DAWR")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210819113954.17515-2-lukas.bulwahn@gmail.com
|
|
No need to re-read SPRN_THREAD, we can calculate thread address
from current (r2).
And remove a reload of value 1 into r4 as r4 is already 1.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c04cce578b97a76a9e69a096698b1d89f721768a.1629276437.git.christophe.leroy@csgroup.eu
|
|
Remove duplicated include.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210326064808.3262568-1-zhengyongjun3@huawei.com
|
|
interrupt.c: asm/interrupt.h has been included at line 12, so remove the
duplicate one at line 10.
time.c: linux/sched/clock.h has been included at line 33,so remove the
duplicate one at line 56 and move sched/cputime.h under sched including
segament.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210323062916.295346-1-wanjiabing@vivo.com
|
|
Regenerate atop v5.14-rc6 by doing a make savedefconfig.
The changes a re-ordering except for the following (which are still set
indirectly):
- CONFIG_DEBUG_KERNEL=y selected by EXPERT
- CONFIG_PPC_EARLY_DEBUG_CPM_ADDR=0xff002008 which is the default
setting
Signed-off-by: Joel Stanley <joel@jms.id.au>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210817045407.2445664-4-joel@jms.id.au
|
|
CONFIG_MTD_PHYSMAP_OF is not longer enabled as it depends on
MTD_PHYSMAP which is not enabled.
This is a regression from commit 642b1e8dbed7 ("mtd: maps: Merge
physmap_of.c into physmap-core.c"), which added the extra dependency.
Add CONFIG_MTD_PHYSMAP=y so this stays in the config, as Christophe said
it is useful for build coverage.
Fixes: 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210817045407.2445664-3-joel@jms.id.au
|
|
When building this config there's a warning:
79:warning: override: reassigning to symbol IPV6
Commit 9a1762a4a4ff ("powerpc/8xx: Update mpc885_ads_defconfig to
improve CI") added CONFIG_IPV6=y, but left '# CONFIG_IPV6 is not set'
in.
IPV6 is default y, so remove both to clean up the build.
Fixes: 9a1762a4a4ff ("powerpc/8xx: Update mpc885_ads_defconfig to improve CI")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210817045407.2445664-2-joel@jms.id.au
|
|
Replace "cat file | grep pattern" with "grep pattern file", and quote a
few variables. Together that fixes all shellcheck errors.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210817125154.3369884-1-mpe@ellerman.id.au
|
|
Prefer stderr instead of stdout for error messages.
This is a good practice and can help CI error detecting and
reporting (0day in this case).
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210815222334.9575-1-rdunlap@infradead.org
|
|
As reported by lkp, if NUMA=n we see a build error:
arch/powerpc/platforms/pseries/hotplug-cpu.c: In function 'pseries_cpu_hotplug_init':
arch/powerpc/platforms/pseries/hotplug-cpu.c:1022:8: error: 'node_to_cpumask_map' undeclared
1022 | node_to_cpumask_map[node]);
Use cpumask_of_node() which has an empty stub for NUMA=n, and when
NUMA=y does a lookup from node_to_cpumask_map[].
Fixes: bd1dd4c5f528 ("powerpc/pseries: Prevent free CPU ids being reused on another node")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210816041032.2839343-1-mpe@ellerman.id.au
|
|
Object files used to link .tmp_vmlinux.kallsyms1 have many
R_PPC64_ADDR64 relocations in non-SHF_WRITE sections. There are many
text relocations (e.g. in .rela___ksymtab_gpl+* and .rela__mcount_loc
sections) in a -pie link and are disallowed by LLD:
ld.lld: error: can't create dynamic relocation R_PPC64_ADDR64 against local symbol in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output
>>> defined in arch/powerpc/kernel/head_64.o
>>> referenced by arch/powerpc/kernel/head_64.o:(__restart_table+0x10)
Newer GNU ld configured with "--enable-textrel-check=error" will report
an error as well:
$ ld-new -EL -m elf64lppc -pie ... -o .tmp_vmlinux.kallsyms1 ...
ld-new: read-only segment has dynamic relocations
Add "-z notext" to suppress the errors. Non-CONFIG_RELOCATABLE builds
use the default -no-pie mode and thus R_PPC64_ADDR64 relocations can be
resolved at link-time.
Reported-by: Itaru Kitayama <itaru.kitayama@riken.jp>
Co-developed-by: Bill Wendling <morbo@google.com>
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Bill Wendling <morbo@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210813200511.1905703-1-morbo@google.com
|
|
Using asm goto in __WARN_FLAGS() and WARN_ON() allows more
flexibility to GCC.
For that add an entry to the exception table so that
program_check_exception() knowns where to resume execution
after a WARNING.
Here are two exemples. The first one is done on PPC32 (which
benefits from the previous patch), the second is on PPC64.
unsigned long test(struct pt_regs *regs)
{
int ret;
WARN_ON(regs->msr & MSR_PR);
return regs->gpr[3];
}
unsigned long test9w(unsigned long a, unsigned long b)
{
if (WARN_ON(!b))
return 0;
return a / b;
}
Before the patch:
000003a8 <test>:
3a8: 81 23 00 84 lwz r9,132(r3)
3ac: 71 29 40 00 andi. r9,r9,16384
3b0: 40 82 00 0c bne 3bc <test+0x14>
3b4: 80 63 00 0c lwz r3,12(r3)
3b8: 4e 80 00 20 blr
3bc: 0f e0 00 00 twui r0,0
3c0: 80 63 00 0c lwz r3,12(r3)
3c4: 4e 80 00 20 blr
0000000000000bf0 <.test9w>:
bf0: 7c 89 00 74 cntlzd r9,r4
bf4: 79 29 d1 82 rldicl r9,r9,58,6
bf8: 0b 09 00 00 tdnei r9,0
bfc: 2c 24 00 00 cmpdi r4,0
c00: 41 82 00 0c beq c0c <.test9w+0x1c>
c04: 7c 63 23 92 divdu r3,r3,r4
c08: 4e 80 00 20 blr
c0c: 38 60 00 00 li r3,0
c10: 4e 80 00 20 blr
After the patch:
000003a8 <test>:
3a8: 81 23 00 84 lwz r9,132(r3)
3ac: 71 29 40 00 andi. r9,r9,16384
3b0: 40 82 00 0c bne 3bc <test+0x14>
3b4: 80 63 00 0c lwz r3,12(r3)
3b8: 4e 80 00 20 blr
3bc: 0f e0 00 00 twui r0,0
0000000000000c50 <.test9w>:
c50: 7c 89 00 74 cntlzd r9,r4
c54: 79 29 d1 82 rldicl r9,r9,58,6
c58: 0b 09 00 00 tdnei r9,0
c5c: 7c 63 23 92 divdu r3,r3,r4
c60: 4e 80 00 20 blr
c70: 38 60 00 00 li r3,0
c74: 4e 80 00 20 blr
In the first exemple, we see GCC doesn't need to duplicate what
happens after the trap.
In the second exemple, we see that GCC doesn't need to emit a test
and a branch in the likely path in addition to the trap.
We've got some WARN_ON() in .softirqentry.text section so it needs
to be added in the OTHER_TEXT_SECTIONS in modpost.c
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/389962b1b702e3c78d169e59bcfac56282889173.1618331882.git.christophe.leroy@csgroup.eu
|
|
powerpc BUG_ON() and WARN_ON() are based on using twnei instruction.
For catching simple conditions like a variable having value 0, this
is efficient because it does the test and the trap at the same time.
But most conditions used with BUG_ON or WARN_ON are more complex and
forces GCC to format the condition into a 0 or 1 value in a register.
This will usually require 2 to 3 instructions.
The most efficient solution would be to use __builtin_trap() because
GCC is able to optimise the use of the different trap instructions
based on the requested condition, but this is complex if not
impossible for the following reasons:
- __builtin_trap() is a non-recoverable instruction, so it can't be
used for WARN_ON
- Knowing which line of code generated the trap would require the
analysis of DWARF information. This is not a feature we have today.
As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops")
the way WARN_ON() is implemented is suboptimal. That commit also
mentions an issue with 'long long' condition. It fixed it for
WARN_ON() but the same problem still exists today with BUG_ON() on
PPC32. It will be fixed by using the generic implementation.
By using the generic implementation, gcc will naturally generate a
branch to the unconditional trap generated by BUG().
As modern powerpc implement zero-cycle branch,
that's even more efficient.
And for the functions using WARN_ON() and its return, the test
on return from WARN_ON() is now also used for the WARN_ON() itself.
On PPC64 we don't want it because we want to be able to use CFAR
register to track how we entered the code that trapped. The CFAR
register would be clobbered by the branch.
A simple test function:
unsigned long test9w(unsigned long a, unsigned long b)
{
if (WARN_ON(!b))
return 0;
return a / b;
}
Before the patch:
0000046c <test9w>:
46c: 7c 89 00 34 cntlzw r9,r4
470: 55 29 d9 7e rlwinm r9,r9,27,5,31
474: 0f 09 00 00 twnei r9,0
478: 2c 04 00 00 cmpwi r4,0
47c: 41 82 00 0c beq 488 <test9w+0x1c>
480: 7c 63 23 96 divwu r3,r3,r4
484: 4e 80 00 20 blr
488: 38 60 00 00 li r3,0
48c: 4e 80 00 20 blr
After the patch:
00000468 <test9w>:
468: 2c 04 00 00 cmpwi r4,0
46c: 41 82 00 0c beq 478 <test9w+0x10>
470: 7c 63 23 96 divwu r3,r3,r4
474: 4e 80 00 20 blr
478: 0f e0 00 00 twui r0,0
47c: 38 60 00 00 li r3,0
480: 4e 80 00 20 blr
So we see before the patch we need 3 instructions on the likely path
to handle the WARN_ON(). With the patch the trap goes on the unlikely
path.
See below the difference at the entry of system_call_exception where
we have several BUG_ON(), allthough less impressing.
With the patch:
00000000 <system_call_exception>:
0: 81 6a 00 84 lwz r11,132(r10)
4: 90 6a 00 88 stw r3,136(r10)
8: 71 60 00 02 andi. r0,r11,2
c: 41 82 00 70 beq 7c <system_call_exception+0x7c>
10: 71 60 40 00 andi. r0,r11,16384
14: 41 82 00 6c beq 80 <system_call_exception+0x80>
18: 71 6b 80 00 andi. r11,r11,32768
1c: 41 82 00 68 beq 84 <system_call_exception+0x84>
20: 94 21 ff e0 stwu r1,-32(r1)
24: 93 e1 00 1c stw r31,28(r1)
28: 7d 8c 42 e6 mftb r12
...
7c: 0f e0 00 00 twui r0,0
80: 0f e0 00 00 twui r0,0
84: 0f e0 00 00 twui r0,0
Without the patch:
00000000 <system_call_exception>:
0: 94 21 ff e0 stwu r1,-32(r1)
4: 93 e1 00 1c stw r31,28(r1)
8: 90 6a 00 88 stw r3,136(r10)
c: 81 6a 00 84 lwz r11,132(r10)
10: 69 60 00 02 xori r0,r11,2
14: 54 00 ff fe rlwinm r0,r0,31,31,31
18: 0f 00 00 00 twnei r0,0
1c: 69 60 40 00 xori r0,r11,16384
20: 54 00 97 fe rlwinm r0,r0,18,31,31
24: 0f 00 00 00 twnei r0,0
28: 69 6b 80 00 xori r11,r11,32768
2c: 55 6b 8f fe rlwinm r11,r11,17,31,31
30: 0f 0b 00 00 twnei r11,0
34: 7d 8c 42 e6 mftb r12
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b286e07fb771a664b631cd07a40b09c06f26e64b.1618331881.git.christophe.leroy@csgroup.eu
|
|
PAPR interface currently supports two different ways of communicating resource
grouping details to the OS. These are referred to as Form 0 and Form 1
associativity grouping. Form 0 is the older format and is now considered
deprecated. This patch adds another resource grouping named FORM2.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-6-aneesh.kumar@linux.ibm.com
|
|
This helper is only used with the dispatch trace log collection.
A later patch will add Form2 affinity support and this change helps
in keeping that simpler. Also add a comment explaining we don't expect
the code to be called with FORM0
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-5-aneesh.kumar@linux.ibm.com
|
|
The associativity details of the newly added resourced are collected from
the hypervisor via "ibm,configure-connector" rtas call. Update the numa
distance details of the newly added numa node after the above call.
Instead of updating NUMA distance every time we lookup a node id
from the associativity property, add helpers that can be used
during boot which does this only once. Also remove the distance
update from node id lookup helpers.
Currently, we duplicate parsing code for ibm,associativity and
ibm,associativity-lookup-arrays in the kernel. The associativity array provided
by these device tree properties are very similar and hence can use
a helper to parse the node id and numa distance details.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-4-aneesh.kumar@linux.ibm.com
|
|
Also make related code cleanup that will allow adding FORM2_AFFINITY in
later patches. No functional change in this patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-3-aneesh.kumar@linux.ibm.com
|
|
No functional change in this patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132223.225214-2-aneesh.kumar@linux.ibm.com
|
|
No functional change in this patch. arch_debugfs_dir is the generic kernel
name declared in linux/debugfs.h for arch-specific debugfs directory.
Architectures like x86/s390 already use the name. Rename powerpc
specific powerpc_debugfs_root to arch_debugfs_dir.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132831.233794-2-aneesh.kumar@linux.ibm.com
|
|
Similar to x86/s390 add a debugfs file to tune tlb_single_page_flush_ceiling.
Also add a debugfs entry for tlb_local_single_page_flush_ceiling.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210812132831.233794-1-aneesh.kumar@linux.ibm.com
|
|
In the numa=off kernel command-line configuration init_chip_info() loops
around the number of chips and attempts to copy the cpumask of that node
which is NULL for all iterations after the first chip.
Hence, store the cpu mask for each chip instead of derving cpumask from
node while populating the "chips" struct array and copy that to the
chips[i].mask
Fixes: 053819e0bf84 ("cpufreq: powernv: Handle throttling due to Pmax capping at chip level")
Cc: stable@vger.kernel.org # v4.3+
Reported-by: Shirisha Ganta <shirisha.ganta1@ibm.com>
Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
[mpe: Rename goto label to out_free_chip_cpu_mask]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210728120500.87549-2-psampat@linux.ibm.com
|
|
This selects the nintendo-otp module when building for this platform.
Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210801073822.12452-6-linkmauve@linkmauve.fr
|
|
This can be used by the newly-added nintendo-otp nvmem module.
Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210801073822.12452-5-linkmauve@linkmauve.fr
|
|
This is wrong, but needed in order to avoid overlapping ranges with the
OTP area added in the next commit. A refactor of this part of the
device tree is needed: according to Wiibrew[1], this area starts at
0x0d800000 and spans 0x400 bytes (that is, 0x100 32-bit registers),
encompassing PIC and GPIO registers, amongst the ones already exposed in
this device tree, which should become children of the control@d800000
node.
[1] https://wiibrew.org/wiki/Hardware/Hollywood_Registers
Signed-off-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210801073822.12452-4-linkmauve@linkmauve.fr
|
|
Wherever possible, replace constructs that match either
generic_handle_irq(irq_find_mapping()) or
generic_handle_irq(irq_linear_revmap()) to a single call to
generic_handle_domain_irq().
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210802162630.2219813-13-maz@kernel.org
|
|
On P10, the feature doing an automatic "save & restore" of a VCPU
interrupt context is set by default in OPAL. When a VP context is
pulled out, the state of the interrupt registers are saved by the XIVE
interrupt controller under the internal NVP structure representing the
VP. This saves a costly store/load in guest entries and exits.
If OPAL advertises the "save & restore" feature in the device tree,
it should also have set the 'H' bit in the CAM line. Check that when
vCPUs are connected to their ICP in KVM before going any further.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210720134209.256133-3-clg@kaod.org
|
|
Use it to hold platform specific features. P9 DD2 introduced
single-escalation support. P10 will add others.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210720134209.256133-2-clg@kaod.org
|
|
There is no need to use the lockup detector ("noirqdebug") for IPIs.
The ipistorm benchmark measures a ~10% improvement on high systems
when this flag is set.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210719130614.195886-1-clg@kaod.org
|
|
The default domain of the PCI/MSIs is not the XIVE domain anymore. To
list the IRQ mappings under XMON and debugfs, query the IRQ data from
the low level XIVE domain.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210701132750.1475580-32-clg@kaod.org
|