path: root/arch/powerpc
Age | Commit message | Author | Files | Lines
2023-10-19 | powerpc/nohash: Add _PAGE_WRITE to supplement _PAGE_RW | Christophe Leroy | 10 | -15/+26
In several places, _PAGE_RW maps to write permission and doesn't always imply read. To make this clearer, do as book3s/64 in commit c7d54842deb1 ("powerpc/mm: Use _PAGE_READ to indicate Read access") and use _PAGE_WRITE where more relevant. For the time being _PAGE_WRITE is equivalent to _PAGE_RW, but that will change when _PAGE_READ gets added in following patches. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/1f79b88db54d030ada776dc9845e0e88345bfc28.1695659959.git.christophe.leroy@csgroup.eu
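For illustration, the transitional step can be as small as aliasing the new bit to the old one so pte_write() keeps working while callers are converted (a sketch, not the patch itself):

    /* sketch: _PAGE_WRITE is equivalent to _PAGE_RW until _PAGE_READ arrives */
    #define _PAGE_WRITE	_PAGE_RW

    static inline int pte_write(pte_t pte)
    {
            return pte_val(pte) & _PAGE_WRITE;	/* write permission; read not implied */
    }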
2023-10-19 | powerpc/64s: Use generic permission masks | Christophe Leroy | 1 | -15/+5
book3s/64 needs specific masks because it needs _PAGE_PRIVILEGED for PAGE_NONE. book3s/64 already has _PAGE_RW and _PAGE_RWX. So add _PAGE_NA, _PAGE_RO and _PAGE_ROX and remove the specific permission masks. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/d37a418de52e2c950cad6797e81663b56991366f.1695659959.git.christophe.leroy@csgroup.eu
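A sketch of what the book3s/64 definitions can look like in terms of the existing bits (the exact patch may differ):

    /* sketch: book3s/64 needs _PAGE_PRIVILEGED to express PAGE_NONE */
    #define _PAGE_NA	_PAGE_PRIVILEGED
    #define _PAGE_RO	_PAGE_READ
    #define _PAGE_ROX	(_PAGE_READ | _PAGE_EXEC)
    /* _PAGE_RW and _PAGE_RWX already exist on book3s/64 */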
2023-10-19 | powerpc/8xx: Use generic permission masks | Christophe Leroy | 1 | -8/+5
8xx already has _PAGE_NA and _PAGE_RO. So add _PAGE_ROX, _PAGE_RW and _PAGE_RWX and remove specific permission masks. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/5d0b5ce43485f697313eee4326ddff97157fb219.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc: Refactor permission masks used for __P/__S table and kernel memory flags | Christophe Leroy | 1 | -0/+30
Prepare a common version of the permission masks that will be based on _PAGE_NA, _PAGE_RO, _PAGE_ROX, _PAGE_RW and _PAGE_RWX, which will be defined in platform specific headers in later patches. Put them in a new header pgtable-masks.h which will be included by platforms. And prepare a common version of the flags used for mapping kernel memory, based on _PAGE_RO, _PAGE_ROX, _PAGE_RW and _PAGE_RWX. Include them unless _PAGE_KERNEL_RO is already defined, so that platform specific definitions can be dismantled one by one. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/d31b59efcdb38e675479563307af891aeadf6ec9.1695659959.git.christophe.leroy@csgroup.eu
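A sketch of what such a pgtable-masks.h can look like, derived purely from the description above (the exact file contents may differ):

    /* asm/pgtable-masks.h, sketched */
    #ifndef _PAGE_KERNEL_RO
    #define _PAGE_KERNEL_RO		_PAGE_RO
    #define _PAGE_KERNEL_ROX	_PAGE_ROX
    #define _PAGE_KERNEL_RW		(_PAGE_RW | _PAGE_DIRTY)
    #define _PAGE_KERNEL_RWX	(_PAGE_RWX | _PAGE_DIRTY)
    #endif

    /* permission masks for the __P/__S tables */
    #define PAGE_NONE	__pgprot(_PAGE_BASE | _PAGE_NA)
    #define PAGE_SHARED	__pgprot(_PAGE_BASE | _PAGE_RW)
    #define PAGE_COPY	__pgprot(_PAGE_BASE | _PAGE_RO)
    #define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_RO)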
2023-10-19 | powerpc: Rely on address instead of pte_user() | Christophe Leroy | 6 | -25/+15
pte_user() may return 'false' when a user page is PAGE_NONE. In that case it is still a user page and needs to be handled as such. So use is_kernel_addr() instead. And remove "user" text from ptdump as ptdump only dumps kernel tables. Note: no change done for book3s/64 which still has its 'privileged' bit. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/c778dad89fad07727c31717a9c62f45357c29ebc.1695659959.git.christophe.leroy@csgroup.eu
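The substitution pattern, sketched (handle_user_page() and address are placeholders): a user page is simply one whose address is below the kernel boundary, whatever its PTE bits say.

    /* before: wrongly treats a PAGE_NONE user page as non-user */
    if (pte_user(*ptep))
            handle_user_page();

    /* after: classify by address, which PAGE_NONE cannot change */
    if (!is_kernel_addr(address))
            handle_user_page();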
2023-10-19 | powerpc: Remove pte_mkuser() and pte_mkprivileged() | Christophe Leroy | 5 | -63/+0
pte_mkuser() is never used. Remove it. pte_mkprivileged() is not used anymore. Remove it too. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/1a1dc18816456c637dc8a9c38d532f7598b60ac4.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc: Fail ioremap() instead of silently ignoring flags when PAGE_USER is set | Christophe Leroy | 1 | -1/+2
Calling ioremap() with _PAGE_USER (or _PAGE_PRIVILEGED unset) is wrong. Loudly fail the call to ioremap() instead of blindly clearing the flags. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/b6dd5485ad00d2aafd2bb9b7c2c4eac3ebf2cdaf.1695659959.git.christophe.leroy@csgroup.eu
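A sketch of the new behaviour (shape only; the in-tree check may differ in detail):

    pte_t pte = __pte(flags);

    /* a user mapping must never be requested through ioremap():
     * fail loudly instead of silently clearing the flag
     */
    if (pte_user(pte))
            return NULL;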
2023-10-19 | powerpc: Implement and use pgprot_nx() | Christophe Leroy | 2 | -3/+8
ioremap_page_range() calls pgprot_nx(). vmap() and vmap_pfn() clear execute permission by calling pgprot_nx(). When pgprot_nx() is not defined it falls back to a nop. Implement it for powerpc then use it in early_ioremap_range(). Then the call to pte_exprotect() can be removed from ioremap_prot(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/5993a7a097e989af1c97fc4a6c011fefc67dbe6e.1695659959.git.christophe.leroy@csgroup.eu
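A minimal sketch of such a pgprot_nx(), reusing the existing pte_exprotect() helper mentioned above:

    #define pgprot_nx pgprot_nx
    static inline pgprot_t pgprot_nx(pgprot_t prot)
    {
            /* strip execute permission from the protection bits */
            return pte_pgprot(pte_exprotect(__pte(pgprot_val(prot))));
    }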
2023-10-19 | powerpc/e500: Simplify pte_mkexec() | Christophe Leroy | 1 | -4/+1
Commit b6cb20fdc273 ("powerpc/book3e: Fix set_memory_x() and set_memory_nx()") implemented a more elaborate version of pte_mkexec() suitable for both kernel and user pages. That was needed because set_memory_x() was using pte_mkexec(). But since commit a4c182ecf335 ("powerpc/set_memory: Avoid spinlock recursion in change_page_attr()") pte_mkexec() is not used anymore by set_memory_x(), so pte_mkexec() can be simplified as it is only used for user pages. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/cdc822322fe2ff4b0f5ecfde71d09d950b1c7557.1695659959.git.christophe.leroy@csgroup.eu
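Sketched before/after, assuming _PAGE_BAP_UX/_PAGE_BAP_SX are the e500 user/supervisor execute bits and _PAGE_BAP_UR the user-read bit (names are assumptions): with only user pages left to handle, the conditional collapses to the user branch.

    /* before: had to handle kernel (SX) and user (UX) pages */
    static inline pte_t pte_mkexec(pte_t pte)
    {
            if (pte_val(pte) & _PAGE_BAP_UR)
                    return __pte((pte_val(pte) & ~_PAGE_BAP_SX) | _PAGE_BAP_UX);
            else
                    return __pte((pte_val(pte) & ~_PAGE_BAP_UX) | _PAGE_BAP_SX);
    }

    /* after: only user pages reach here */
    static inline pte_t pte_mkexec(pte_t pte)
    {
            return __pte((pte_val(pte) & ~_PAGE_BAP_SX) | _PAGE_BAP_UX);
    }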
2023-10-19 | powerpc/nohash: Refactor __ptep_set_access_flags() | Christophe Leroy | 3 | -31/+17
The nohash/32 version of __ptep_set_access_flags() does the same as the nohash/64 version; the only difference is that the nohash/32 version is more complete and uses pte_update(). Make it common and remove the nohash/64 version. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/e296885df46289d3e5f4cb51efeefe593f76ef24.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/nohash: Refactor pte_clear() | Christophe Leroy | 3 | -10/+5
pte_clear() does the same thing on nohash/32 and nohash/64. Keep the static inline version from nohash/64, make it common and remove the macro version from nohash/32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/818f7df83d7e9e18f55e274cd3c44f2871ade4dd.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/nohash: Deduplicate ptep_set_wrprotect() and ptep_get_and_clear() | Christophe Leroy | 3 | -31/+16
ptep_set_wrprotect() and ptep_get_and_clear() are identical for nohash/32 and nohash/64. Make them common. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/ffe46edecdabce915e2d1a4b79a3b2ab770f2248.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/nohash: Refactor ptep_test_and_clear_young() | Christophe Leroy | 3 | -29/+12
Remove the ptep_test_and_clear_young() macro, make __ptep_test_and_clear_young() common to nohash/32 and nohash/64, and change it to become ptep_test_and_clear_young(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/de90679ae169104fa3c98d0a8828d7ede228fc52.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/nohash: Deduplicate pte helpers | Christophe Leroy | 3 | -61/+36
Deduplicate the following helpers that are identical on nohash/32 and nohash/64: pte_mkwrite_novma(), pte_mkdirty(), pte_mkyoung(), pte_wrprotect(), pte_mkexec(), pte_young(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/485f3cbafaa948143f92692e17ca652dfed68da2.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/nohash: Deduplicate _PAGE_CHG_MASK | Christophe Leroy | 3 | -13/+7
_PAGE_CHG_MASK is identical between nohash/32 and nohash/64, deduplicate it. While at it, clean the #ifdef for PTE_RPN_MASK in nohash/32 as it is already CONFIG_PPC32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/c1a1893dbba4afb825bfa78b262f0b0e0fc3b490.1695659959.git.christophe.leroy@csgroup.eu
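For reference, _PAGE_CHG_MASK is the set of bits preserved across pte_modify(); a sketch of the shared definition (exact bit list may differ):

    /* bits kept when changing protection */
    #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPECIAL)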
2023-10-19 | powerpc/nohash: Refactor checking of no-change in pte_update() | Christophe Leroy | 2 | -9/+3
On nohash/64, a few callers of pte_update() check if there is really a change in order to avoid an unnecessary write. Refactor that inside pte_update(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/076563e611c2b51036686a8d378bfd5ef1726341.1695659959.git.christophe.leroy@csgroup.eu
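The idea, sketched inside a simplified pte_update() (the real function also handles huge pages and returns pte_basic_t):

    static inline unsigned long pte_update(struct mm_struct *mm, unsigned long addr,
                                           pte_t *p, unsigned long clr,
                                           unsigned long set, int huge)
    {
            unsigned long old = pte_val(*p);
            unsigned long new = (old & ~clr) | set;

            if (new == old)
                    return old;	/* nothing changed: skip the write */

            *p = __pte(new);

            return old;
    }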
2023-10-19 | powerpc/nohash: Refactor pte_update() | Christophe Leroy | 3 | -50/+42
pte_update() is similar on nohash/32 and nohash/64. Take the nohash/32 version, which works on nohash/64, and add the debug call to assert_pte_locked() which is only on nohash/64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/e01cb630cad42f645915ce7702d23985241b71fc.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/nohash: Replace #ifdef CONFIG_44x by IS_ENABLED(CONFIG_44x) in pgtable.h | Christophe Leroy | 1 | -5/+2
No need for an #ifdef, use IS_ENABLED(CONFIG_44x). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/7c7d97322a6a05a9842b1e8c4b41265916f542ca.1695659959.git.christophe.leroy@csgroup.eu
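The pattern, for illustration (the guarded statement here is a placeholder):

    /* before */
    #ifdef CONFIG_44x
            icache_44x_need_flush = 1;
    #endif

    /* after: same object code when CONFIG_44x=n, but always compile-checked */
    if (IS_ENABLED(CONFIG_44x))
            icache_44x_need_flush = 1;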
2023-10-19 | powerpc/nohash: Move 8xx version of pte_update() into pte-8xx.h | Christophe Leroy | 2 | -56/+58
No point in having the 8xx-specific pte_update() in the common header; move it into pte-8xx.h. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/17e209b1a1a43ed219e9e1f2947ec594ed4f9394.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/nohash: Refactor declaration of {map/unmap}_kernel_page() | Christophe Leroy | 4 | -11/+4
map_kernel_page() and unmap_kernel_page() have the same prototypes on nohash/32 and nohash/64; keep only one declaration. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/7fec5f3288cf0d0eac61b1b3f48c3ea54eb80cad.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/nohash: Remove {pte/pmd}_protnone() | Christophe Leroy | 1 | -17/+0
Only book3s/64 selects ARCH_SUPPORTS_NUMA_BALANCING so CONFIG_NUMA_BALANCING can't be selected on nohash targets. Remove pte_protnone() and pmd_protnone(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/a50c1a19828a8eced82cbcc5c61754b667037d21.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc: Untangle fixmap.h and pgtable.h and mmu.h | Christophe Leroy | 11 | -15/+29
fixmap.h needs pgtable.h for [un]map_kernel_page(). pgtable.h needs fixmap.h for FIXADDR_TOP. Untangle the two files by moving FIXADDR_TOP into pgtable.h. Also move VIRT_IMMR_BASE to fixmap.h to avoid including fixmap.h in mmu.h. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/5eba12392a018be28ad0a02ed844767b132589e7.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc: Refactor update_mmu_cache_range() | Christophe Leroy | 4 | -41/+20
On nohash, this function is a no-op except for E500 with hugepages. On book3s, this function is for hash MMUs only. Combine those tests and rename the E500 update_mmu_cache_range() as __update_mmu_cache(), which gets called by update_mmu_cache_range(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/b029842cb6783cbeb43d202e69a90341d65295a4.1695659959.git.christophe.leroy@csgroup.eu
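A sketch of the combined test (the condition shape is an assumption based on the description):

    static inline void update_mmu_cache_range(struct vm_fault *vmf,
                                              struct vm_area_struct *vma,
                                              unsigned long address, pte_t *ptep,
                                              unsigned int nr)
    {
            if ((mmu_has_feature(MMU_FTR_HPTE_TABLE) && !radix_enabled()) ||
                (IS_ENABLED(CONFIG_PPC_E500) && IS_ENABLED(CONFIG_HUGETLB_PAGE)))
                    __update_mmu_cache(vma, address, ptep);
    }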
2023-10-19 | powerpc: Deduplicate prototypes of ptep_set_access_flags() and phys_mem_access_prot() | Christophe Leroy | 3 | -19/+10
Prototypes of ptep_set_access_flags() and phys_mem_access_prot() are identical for book3s and nohash. Deduplicate them. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/b846832cce842a2852615b7356937fb9507e436d.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc: Remove pte_ERROR() | Christophe Leroy | 4 | -10/+0
pte_ERROR() is used neither in powerpc code nor in common mm code. Remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/bec9eb973ecc1cba091e5c9201d877a7797f3242.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc/40x: Remove stale PTE_ATOMIC_UPDATES macro | Christophe Leroy | 1 | -3/+0
40x TLB handlers were reworked by commit 2c74e2586bb9 ("powerpc/40x: Rework 40x PTE access and TLB miss") to not require PTE_ATOMIC_UPDATES anymore. Then commit 4e1df545e2fa ("powerpc/pgtable: Drop PTE_ATOMIC_UPDATES") removed all code related to PTE_ATOMIC_UPDATES. Remove left over PTE_ATOMIC_UPDATES macro. Fixes: 2c74e2586bb9 ("powerpc/40x: Rework 40x PTE access and TLB miss") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/f061db5857fcd748f84a6707aad01754686ce97e.1695659959.git.christophe.leroy@csgroup.eu
2023-10-19 | powerpc: implement arch_xor_unlock_is_negative_byte on 32-bit | Matthew Wilcox (Oracle) | 1 | -4/+0
Simply remove the ifdef. The assembly is identical to that in the non-optimised case of test_and_clear_bits() on PPC32, and it's not clear to me how the PPC32 optimisation works, nor whether it would work for arch_xor_unlock_is_negative_byte(). If that optimisation would work, someone can implement it later, but this is more efficient than the implementation in filemap.c.

Link: https://lkml.kernel.org/r/20231004165317.1061855-12-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-19 | bitops: add xor_unlock_is_negative_byte() | Matthew Wilcox (Oracle) | 1 | -12/+5
Replace clear_bit_and_unlock_is_negative_byte() with xor_unlock_is_negative_byte(). We have a few places that like to lock a folio, set a flag and unlock it again. Allow for the possibility of combining the latter two operations for efficiency. We are guaranteed that the caller holds the lock, so it is safe to unlock it with the xor. The caller must guarantee that nobody else will set the flag without holding the lock; it is not safe to do this with the PG_dirty flag, for example.

Link: https://lkml.kernel.org/r/20231004165317.1061855-8-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
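A generic sketch of the helper (arch-optimised versions exist; this is the fallback shape):

    static inline bool xor_unlock_is_negative_byte(unsigned long mask,
                                                   volatile unsigned long *p)
    {
            long old = atomic_long_fetch_xor_release(mask, (atomic_long_t *)p);

            return !!(old & BIT(7));	/* bit 7 is the "negative byte" flag */
    }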
2023-10-18 | powerpc/bpf: Fixed 'instead' typo in bpf_jit_build_body() | Muhammad Muzammil | 1 | -1/+1
Fixed 'instead' typo. Signed-off-by: Muhammad Muzammil <m.muzzammilashraf@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231013053118.11221-1-m.muzzammilashraf@gmail.com
2023-10-18 | Merge branch fixes into next | Michael Ellerman | 15 | -69/+87
Merge our fixes branch to bring in commits that are prerequisites for further development or would cause conflicts.
2023-10-18 | spufs: convert to new timestamp accessors | Jeff Layton | 1 | -1/+1
Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-1-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-18 | powerpc/qspinlock: Fix stale propagated yield_cpu | Nicholas Piggin | 1 | -0/+3
yield_cpu is a sample of a preempted lock holder that gets propagated back through the queue. Queued waiters use this to yield to the preempted lock holder without continually sampling the lock word (which would defeat the purpose of MCS queueing by bouncing the cache line).

The problem is that yield_cpu can become stale. It can take some time to be passed down the chain, and if any queued waiter gets preempted then it will cease to propagate the yield_cpu to later waiters. This can result in yielding to a CPU that no longer holds the lock, which is bad, but particularly if it is currently in H_CEDE (idle), then it appears to be preempted and some hypervisors (PowerVM) can cause very long H_CONFER latencies waiting for H_CEDE wakeup. This results in latency spikes and hard lockups on oversubscribed partitions with lock contention.

This is a minimal fix. Before yielding to yield_cpu, sample the lock word to confirm yield_cpu is still the owner, and bail out if it is not.

Thanks to a bunch of people who reported this and tracked down the exact problem using tracepoints and dispatch trace logs.

Fixes: 28db61e207ea ("powerpc/qspinlock: allow propagation of yield CPU down the queue")
Cc: stable@vger.kernel.org # v6.2+
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Laurent Dufour <ldufour@linux.ibm.com>
Reported-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Debugged-by: "Nysal Jan K.A" <nysal@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231016124305.139923-2-npiggin@gmail.com
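The minimal fix, sketched with approximate names (get_owner_cpu(), yield_to_preempted() and the relax label are placeholders, not the exact queued-spinlock code): re-sample the lock word immediately before yielding and bail out when the propagated yield_cpu no longer matches the owner.

    /* sketch: yield_cpu may be stale by the time it reaches this waiter */
    u32 val = READ_ONCE(lock->val);

    if (get_owner_cpu(val) != yield_cpu)
            goto relax;	/* owner changed: don't confer to the wrong CPU */

    yield_to_preempted(yield_cpu, yield_count);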
2023-10-18 | powerpc/64s/radix: Don't warn on copros in radix__tlb_flush() | Michael Ellerman | 1 | -8/+1
Sachin reported a warning when running the inject-ra-err selftest:

  # selftests: powerpc/mce: inject-ra-err
  Disabling lock debugging due to kernel taint
  MCE: CPU19: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
  MCE: CPU19: PID: 5254 Comm: inject-ra-err NIP: [0000000010000e48]
  MCE: CPU19: Initiator CPU
  MCE: CPU19: Unknown
  ------------[ cut here ]------------
  WARNING: CPU: 19 PID: 5254 at arch/powerpc/mm/book3s64/radix_tlb.c:1221 radix__tlb_flush+0x160/0x180
  CPU: 19 PID: 5254 Comm: inject-ra-err Kdump: loaded Tainted: G M E 6.6.0-rc3-00055-g9ed22ae6be81 #4
  Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
  ...
  NIP radix__tlb_flush+0x160/0x180
  LR radix__tlb_flush+0x104/0x180
  Call Trace:
    radix__tlb_flush+0xf4/0x180 (unreliable)
    tlb_finish_mmu+0x15c/0x1e0
    exit_mmap+0x1a0/0x510
    __mmput+0x60/0x1e0
    exit_mm+0xdc/0x170
    do_exit+0x2bc/0x5a0
    do_group_exit+0x4c/0xc0
    sys_exit_group+0x28/0x30
    system_call_exception+0x138/0x330
    system_call_vectored_common+0x15c/0x2ec

And bisected it to commit e43c0a0c3c28 ("powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs"), which added a warning in radix__tlb_flush() if mm->context.copros is still elevated.

However it's possible for the copros count to be elevated if a process exits without first closing file descriptors that are associated with a copro, eg. VAS.

If the process exits with a VAS file still open, the release callback is queued up for exit_task_work() via:

  exit_files()
    put_files_struct()
      close_files()
        filp_close()
          fput()

And called via:

  exit_task_work()
    ____fput()
      __fput()
        file->f_op->release(inode, file)
          coproc_release()
            vas_user_win_ops->close_win()
              vas_deallocate_window()
                mm_context_remove_vas_window()
                  mm_context_remove_copro()

But that is after exit_mm() has been called from do_exit() and triggered the warning.

Fix it by dropping the warning, and always calling __flush_all_mm(). In the normal case of no copros, that will result in a call to _tlbiel_pid(mm->context.id, RIC_FLUSH_ALL) just as the current code does. If the copros count is elevated then it will cause a global flush, which should flush translations from any copros. Note that the process table entry was cleared in arch_exit_mmap(), so copros should not be able to fetch any new translations.

Fixes: e43c0a0c3c28 ("powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Closes: https://lore.kernel.org/all/A8E52547-4BF1-47CE-8AEA-BC5A9D7E3567@linux.ibm.com/
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Link: https://msgid.link/20231017121527.1574104-1-mpe@ellerman.id.au
2023-10-17 | powerpc/pseries: PLPKS SED Opal keystore support | Greg Joyce | 3 | -0/+138
Define operations for SED Opal to read/write keys from the POWER LPAR Platform KeyStore (PLPKS). This allows non-volatile storage of SED Opal keys. Signed-off-by: Greg Joyce <gjoyce@linux.vnet.ibm.com> Reviewed-by: Jonathan Derrick <jonathan.derrick@linux.dev> Link: https://lore.kernel.org/r/20231004201957.1451669-4-gjoyce@linux.vnet.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-10-17 | vga16fb: drop powerpc support | Arnd Bergmann | 1 | -16/+0
I noticed that commit 0db5b61e0dc07 ("fbdev/vga16fb: Create EGA/VGA devices in sysfb code") broke vga16fb on non-x86 platforms, because the sysfb code never creates a vga-framebuffer device when screen_info.orig_video_isVGA is set to '1' instead of VIDEO_TYPE_VGAC. However, it turns out that the only architecture that has allowed building vga16fb in the past 20 years is powerpc, and this only worked on two 32-bit platforms and never on 64-bit powerpc. The last machine that actually used this was removed in linux-3.10, so this is all dead code and can be removed. The big-endian support in vga16fb.c could also be removed, but I'd just leave this in place. Fixes: 933ee7119fb14 ("powerpc: remove PReP platform") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20231009211845.3136536-8-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-15 | Merge tag 'powerpc-6.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux | Linus Torvalds | 6 | -12/+17
Pull powerpc fixes from Michael Ellerman:

 - Fix softlockup/crash when using hcall tracing

 - Fix pte_access_permitted() for PAGE_NONE on 8xx

 - Fix inverted pte_young() test in __ptep_test_and_clear_young() on 64-bit BookE

 - Fix unhandled math emulation exception on 85xx

 - Fix kernel crash on syscall return on 476

Thanks to Athira Rajeev, Christophe Leroy, Eddie James, and Naveen N Rao.

* tag 'powerpc-6.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/47x: Fix 47x syscall return crash
  powerpc/85xx: Fix math emulation exception
  powerpc/64e: Fix wrong test in __ptep_test_and_clear_young()
  powerpc/8xx: Fix pte_access_permitted() for PAGE_NONE
  powerpc/pseries: Remove unused r0 in the hcall tracing code
  powerpc/pseries: Fix STK_PARAM access in the hcall tracing code
2023-10-15 | powerpc/mm: Allow ARCH_FORCE_MAX_ORDER up to 12 | Michael Ellerman | 1 | -1/+1
Christophe reported that the change to ARCH_FORCE_MAX_ORDER to limit the range to 10 had broken his ability to configure hugepages:

  # echo 1 > /sys/kernel/mm/hugepages/hugepages-8192kB/nr_hugepages
  sh: write error: Invalid argument

Several of the powerpc defconfigs previously set the ARCH_FORCE_MAX_ORDER value to 12, via the definition in arch/powerpc/configs/fsl-emb-nonhw.config, used by:

  mpc85xx_defconfig
  mpc85xx_smp_defconfig
  corenet32_smp_defconfig
  corenet64_smp_defconfig
  mpc86xx_defconfig
  mpc86xx_smp_defconfig

Fix it by increasing the allowed range to 12 to restore the previous behaviour.

Fixes: 358e526a1648 ("powerpc/mm: Reinstate ARCH_FORCE_MAX_ORDER ranges")
Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Closes: https://lore.kernel.org/all/8011d806-5b30-bf26-2bfe-a08c39d57e20@csgroup.eu/
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230824122849.942072-1-mpe@ellerman.id.au
2023-10-12 | sched/topology: Rename 'DIE' domain to 'PKG' | Peter Zijlstra | 1 | -2/+2
While reworking the x86 topology code Thomas tripped over creating a 'DIE' domain for the package mask. :-) Since these names are CONFIG_SCHED_DEBUG=y only, rename them to make the name less ambiguous. [ Shrikanth Hegde: rename on s390 as well. ] [ Valentin Schneider: also rename it in the comments. ] [ mingo: port to recent kernels & find all remaining occurrences. ] Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Valentin Schneider <vschneid@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20230712141056.GI3100107@hirez.programming.kicks-ass.net
2023-10-12 | fbdev: Replace fb_pgprotect() with pgprot_framebuffer() | Thomas Zimmermann | 1 | -8/+5
Rename the fbdev mmap helper fb_pgprotect() to pgprot_framebuffer(). The helper sets VMA page-access flags for framebuffers in device I/O memory. Also clean up the helper's parameters and return value. Instead of the VMA instance, pass the individual parameters separately: the existing page-access flags, the VMA's start and end addresses, and the offset in the underlying device memory or file, respectively. Return the new page-access flags. These changes align pgprot_framebuffer() with other pgprot_() functions. v4: * fix commit message (Christophe) v3: * rename fb_pgprotect() to pgprot_framebuffer() (Arnd) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230922080636.26762-3-tzimmermann@suse.de
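The resulting interface and a typical fbdev mmap call site, sketched from the description above:

    /* new helper: page-access flags in, page-access flags out */
    pgprot_t pgprot_framebuffer(pgprot_t prot,
                                unsigned long vm_start, unsigned long vm_end,
                                unsigned long offset);

    /* sketch of a caller */
    vma->vm_page_prot = pgprot_framebuffer(vma->vm_page_prot,
                                           vma->vm_start, vma->vm_end,
                                           vma->vm_pgoff << PAGE_SHIFT);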
2023-10-12 | fbdev: Avoid file argument in fb_pgprotect() | Thomas Zimmermann | 1 | -1/+6
Only PowerPC's fb_pgprotect() needs the file argument, although the implementation in either phys_mem_access_prot() or pci_phys_mem_access_prot() does not use it. Pass NULL to the internal helper in preparation of further updates. A later patch will remove the file parameter from fb_pgprotect(). While at it, replace the shift operation with PHYS_PFN(). v5: * state function names in commit description (Javier) Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230922080636.26762-2-tzimmermann@suse.de
2023-10-11 | powerpc/47x: Fix 47x syscall return crash | Michael Ellerman | 1 | -3/+5
Eddie reported that newer kernels were crashing during boot on his 476 FSP2 system:

  kernel tried to execute user page (b7ee2000) - exploit attempt? (uid: 0)
  BUG: Unable to handle kernel instruction fetch
  Faulting instruction address: 0xb7ee2000
  Oops: Kernel access of bad area, sig: 11 [#1]
  BE PAGE_SIZE=4K FSP-2
  Modules linked in:
  CPU: 0 PID: 61 Comm: mount Not tainted 6.1.55-d23900f.ppcnf-fsp2 #1
  Hardware name: ibm,fsp2 476fpe 0x7ff520c0 FSP-2
  NIP: b7ee2000 LR: 8c008000 CTR: 00000000
  REGS: bffebd83 TRAP: 0400 Not tainted (6.1.55-d23900f.ppcnf-fsp2)
  MSR: 00000030 <IR,DR> CR: 00001000 XER: 20000000
  GPR00: c00110ac bffebe63 bffebe7e bffebe88 8c008000 00001000 00000d12 b7ee2000
  GPR08: 00000033 00000000 00000000 c139df10 48224824 1016c314 10160000 00000000
  GPR16: 10160000 10160000 00000008 00000000 10160000 00000000 10160000 1017f5b0
  GPR24: 1017fa50 1017f4f0 1017fa50 1017f740 1017f630 00000000 00000000 1017f4f0
  NIP [b7ee2000] 0xb7ee2000
  LR [8c008000] 0x8c008000
  Call Trace:
  Instruction dump:
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  ---[ end trace 0000000000000000 ]---

The problem is in ret_from_syscall where the check for icache_44x_need_flush is done. When the flush is needed the code jumps out-of-line to do the flush, and then intends to jump back to continue the syscall return. However the branch back to label 1b doesn't return to the correct location, instead branching back just prior to the return to userspace, causing bogus register values to be used by the rfi.

The breakage was introduced by commit 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") which inadvertently removed the "1" label and reused it elsewhere.

Fix it by adding named local labels in the correct locations. Note that the return label needs to be outside the ifdef so that CONFIG_PPC_47x=n compiles.

Fixes: 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32")
Cc: stable@vger.kernel.org # v5.12+
Reported-by: Eddie James <eajames@linux.ibm.com>
Tested-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/linuxppc-dev/fdaadc46-7476-9237-e104-1d2168526e72@linux.ibm.com/
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://msgid.link/20231010114750.847794-1-mpe@ellerman.id.au
2023-10-11 | powerpc: Remove now superfluous sentinel element from ctl_table arrays | Joel Granados | 2 | -2/+0
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels), which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information: https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/). Remove the sentinel from powersave_nap_ctl_table and nmi_wd_lpm_factor_ctl_table. This removal is safe because register_sysctl implicitly uses ARRAY_SIZE() in addition to checking for the sentinel. Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
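For illustration, a sketch of one of the arrays after the change (field values as in the kernel's powersave-nap sysctl, abbreviated):

    static struct ctl_table powersave_nap_ctl_table[] = {
            {
                    .procname	= "powersave-nap",
                    .data		= &powersave_nap,
                    .maxlen		= sizeof(int),
                    .mode		= 0644,
                    .proc_handler	= proc_dointvec,
            },
            /* no {} sentinel: register_sysctl() relies on ARRAY_SIZE() */
    };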
2023-10-10 | docs: move powerpc under arch | Costa Shulyupin | 3 | -5/+5
and fix all in-tree references. Architecture-specific documentation is being moved into Documentation/arch/ as a way of cleaning up the top-level documentation directory and making the docs hierarchy more closely match the source hierarchy. Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20230826165737.2101199-1-costa.shul@redhat.com
2023-10-10 | powerpc: Only define __parse_fpscr() when required | Christophe Leroy | 1 | -0/+2
Clang 17 reports:

  arch/powerpc/kernel/traps.c:1167:19: error: unused function '__parse_fpscr' [-Werror,-Wunused-function]

__parse_fpscr() is called from two sites. The first call is guarded by #ifdef CONFIG_PPC_FPU_REGS. The second call is guarded by CONFIG_MATH_EMULATION, which selects CONFIG_PPC_FPU_REGS. So only define __parse_fpscr() when CONFIG_PPC_FPU_REGS is defined. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202309210327.WkqSd5Bq-lkp@intel.com/ Fixes: b6254ced4da6 ("powerpc/signal: Don't manage floating point regs when no FPU") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/5de2998c57f3983563b27b39228ea9a7229d4110.1695385984.git.christophe.leroy@csgroup.eu
2023-10-10 | powerpc/85xx: Fix math emulation exception | Christophe Leroy | 1 | -1/+1
Booting the mpc85xx_defconfig kernel on QEMU leads to:

  Bad trap at PC: fe9bab0, SR: 2d000, vector=800
  awk[82]: unhandled trap (5) at 0 nip fe9bab0 lr fe9e01c code 5 in libc-2.27.so[fe5a000+17a000]
  awk[82]: code: 3aa00000 3a800010 4bffe03c 9421fff0 7ca62b78 38a00000 93c10008 83c10008
  awk[82]: code: 38210010 4bffdec8 9421ffc0 7c0802a6 <fc00048e> d8010008 4815190d 93810030
  Trace/breakpoint trap
  WARNING: no useful console

This is because although CONFIG_MATH_EMULATION is selected, exception 800 calls unknown_exception(). Call emulation_assist_interrupt() instead.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/066caa6d9480365da9b8ed83692d7101e10ac5f8.1695657339.git.christophe.leroy@csgroup.eu
2023-10-09 | Merge tag 'v6.6-rc5' into locking/core, to pick up fixes | Ingo Molnar | 12 | -53/+77
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-10-09 | powerpc/64e: Fix wrong test in __ptep_test_and_clear_young() | Christophe Leroy | 1 | -1/+1
Commit 45201c879469 ("powerpc/nohash: Remove hash related code from nohash headers.") replaced:

  if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
          return 0;

by:

  if (pte_young(*ptep))
          return 0;

But it should be:

  if (!pte_young(*ptep))
          return 0;

Fix it.

Fixes: 45201c879469 ("powerpc/nohash: Remove hash related code from nohash headers.")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/8bb7f06494e21adada724ede47a4c3d97e879d40.1695659959.git.christophe.leroy@csgroup.eu
2023-10-09 | powerpc/8xx: Fix pte_access_permitted() for PAGE_NONE | Christophe Leroy | 2 | -0/+9
On 8xx, PAGE_NONE is handled by setting _PAGE_NA instead of clearing _PAGE_USER. But then pte_user() returns 1 also for PAGE_NONE. As _PAGE_NA prevents reads, add a specific version of pte_read() that returns 0 when _PAGE_NA is set instead of always returning 1. Fixes: 351750331fc1 ("powerpc/mm: Introduce _PAGE_NA") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/57bcfbe578e43123f9ed73e040229b80f1ad56ec.1695659959.git.christophe.leroy@csgroup.eu
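A sketch of the dedicated helper; the exact bit test is an assumption based on _PAGE_NA overlapping the read-permission field on 8xx (which is why a plain bit test would not do):

    #define pte_read pte_read
    static inline int pte_read(pte_t pte)
    {
            /* PAGE_NONE carries _PAGE_NA: such a PTE must not report readable */
            return (pte_val(pte) & _PAGE_RO) != _PAGE_NA;
    }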
2023-10-06 | arch: Reserve map_shadow_stack() syscall number for all architectures | Sohil Mehta | 1 | -0/+1
commit c35559f94ebc ("x86/shstk: Introduce map_shadow_stack syscall") recently added support for map_shadow_stack() but it is limited to x86 only for now. There is a possibility that other architectures (namely, arm64 and RISC-V), that are implementing equivalent support for shadow stacks, might need to add support for it. Independent of that, reserving arch-specific syscall numbers in the syscall tables of all architectures is good practice and would help avoid future conflicts. map_shadow_stack() is marked as a conditional syscall in sys_ni.c. Adding it to the syscall tables of other architectures is harmless and would return ENOSYS when exercised. Note, map_shadow_stack() was assigned #453 during the merge process since #452 was taken by fchmodat2(). For Powerpc, map it to sys_ni_syscall() as is the norm for Powerpc syscall tables. For Alpha, map_shadow_stack() takes up #563 as Alpha still diverges from the common syscall numbering system in the other architectures. Link: https://lore.kernel.org/lkml/20230515212255.GA562920@debug.ba.rivosinc.com/ Link: https://lore.kernel.org/lkml/b402b80b-a7c6-4ef0-b977-c0f5f582b78a@sirena.org.uk/ Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
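The powerpc side of the reservation is a single syscall-table entry (sketched; the exact ABI column is an assumption):

  453	common	map_shadow_stack	sys_ni_syscall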
2023-10-06 | powerpc/iommu: Do not do platform domain attach actions after probe | Jason Gunthorpe | 1 | -2/+8
POWER throws a splat at boot, it looks like the DMA ops were probably changed while a driver was attached. Something is still weird about how power sequences its bootup. Previously this was hidden since the core iommu code did nothing during probe, now it calls spapr_tce_platform_iommu_attach_dev().

Make spapr_tce_platform_iommu_attach_dev() do nothing on the probe time call like it did before.

  WARNING: CPU: 0 PID: 8 at arch/powerpc/kernel/iommu.c:407 __iommu_free+0x1e4/0x1f0
  Modules linked in: sd_mod t10_pi crc64_rocksoft crc64 sg ibmvfc mlx5_core(+) scsi_transport_fc ibmveth mlxfw psample dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
  CPU: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.6.0-rc3-next-20230929-auto #1
  Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.30 (NH1030_062) hv:phyp pSeries
  Workqueue: events work_for_cpu_fn
  NIP: c00000000005f6d4 LR: c00000000005f6d0 CTR: 00000000005ca81c
  REGS: c000000003a27890 TRAP: 0700 Not tainted (6.6.0-rc3-next-20230929-auto)
  MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48000824 XER: 00000008
  CFAR: c00000000020f738 IRQMASK: 0
  GPR00: c00000000005f6d0 c000000003a27b30 c000000001481800 000000000000017
  GPR04: 00000000ffff7fff c000000003a27950 c000000003a27948 0000000000000027
  GPR08: c000000c18c07c10 0000000000000001 0000000000000027 c000000002ac8a08
  GPR12: 0000000000000000 c000000002ff0000 c00000000019cc88 c000000003042300
  GPR16: 0000000000000000 0000000000000000 0000000000000000 c000000003071ab0
  GPR20: c00000000349f80d c000000003215440 c000000003215480 61c8864680b583eb
  GPR24: 0000000000000000 000000007fffffff 0800000020000000 0000000000000010
  GPR28: 0000000000020000 0000800000020000 c00000000c5dc800 c00000000c5dc880
  NIP [c00000000005f6d4] __iommu_free+0x1e4/0x1f0
  LR [c00000000005f6d0] __iommu_free+0x1e0/0x1f0
  Call Trace:
  [c000000003a27b30] [c00000000005f6d0] __iommu_free+0x1e0/0x1f0 (unreliable)
  [c000000003a27bc0] [c00000000005f848] iommu_free+0x28/0x70
  [c000000003a27bf0] [c000000000061518] iommu_free_coherent+0x68/0xa0
  [c000000003a27c20] [c00000000005e8d4] dma_iommu_free_coherent+0x24/0x40
  [c000000003a27c40] [c00000000024698c] dma_free_attrs+0x10c/0x140
  [c000000003a27c90] [c008000000dcb8d4] mlx5_cmd_cleanup+0x5c/0x90 [mlx5_core]
  [c000000003a27cc0] [c008000000dc45a0] mlx5_mdev_uninit+0xc8/0x100 [mlx5_core]
  [c000000003a27d00] [c008000000dc4ac4] probe_one+0x3ec/0x530 [mlx5_core]
  [c000000003a27d90] [c0000000008c5edc] local_pci_probe+0x6c/0x110
  [c000000003a27e10] [c000000000189c98] work_for_cpu_fn+0x38/0x60
  [c000000003a27e40] [c00000000018d1d0] process_scheduled_works+0x230/0x4f0
  [c000000003a27f10] [c00000000018ff14] worker_thread+0x1e4/0x500
  [c000000003a27f90] [c00000000019cdb8] kthread+0x138/0x140
  [c000000003a27fe0] [c00000000000df98] start_kernel_thread+0x14/0x18
  Code: 481b004d 60000000 e89e0028 3c62ffe0 3863dd20 481b0039 60000000 e89e0038 3c62ffe0 3863dd38 481b0025 60000000 <0fe00000> 4bffff20 60000000 3c4c0142
  ---[ end trace 0000000000000000 ]---
  iommu_free: invalid entry
  entry = 0x8000000203d0
  dma_addr = 0x8000000203d0000
  Table = 0xc00000000c5dc800
  bus# = 0x1
  size = 0x20000
  startOff = 0x800000000000
  index = 0x70200016

Fixes: 2ad56efa80db ("powerpc/iommu: Setup a default domain and remove set_platform_dma_ops")
Reported-by: Tasmiya Nalatwad <tasmiya@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/d06cee81-c47f-9d62-dfc6-4c77b60058db@linux.vnet.ibm.com
Tested-by: Tasmiya Nalatwad <tasmiya@linux.vnet.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v1-2b52423411b9+164fc-iommu_ppc_defdomain_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
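A heavily simplified sketch of the workaround; the real function also hands table-group ownership back to the platform, and the "no previous domain" test plus the helper name below are assumptions, shown only to convey the probe-time early exit:

    static int spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
                                                   struct device *dev)
    {
            struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

            /* at probe time no domain is attached yet: do nothing, as before */
            if (!domain)
                    return 0;

            /* otherwise release ownership back to the platform, as before */
            return spapr_tce_release_ownership(dev);	/* hypothetical helper */
    }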