summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2020-04-01powerpc/64/syscall: Reconcile interruptsNicholas Piggin2-14/+21
This reconciles interrupts in the system call case like all other interrupts. This allows system_call_common to be shared with the scv system call implementation in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-31-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Remove lite interrupt returnNicholas Piggin2-16/+14
Regular interrupt return restores NVGPRS whereas lite returns do not. This is clumsy: most interrupts can return without restoring NVGPRS in most of the time, but there are special cases that require it (when registers have been modified by the kernel). So change interrupt return to not restore NVGPRS, and have interrupt handlers restore them explicitly in the cases that requires it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-30-npiggin@gmail.com
2020-04-01powerpc/64s: Implement interrupt exit logic in CNicholas Piggin11-526/+652
Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The stack store emulation is significantly simplfied, rather than creating a new return frame and switching to that before performing the store, it uses the PACA to keep a scratch register around to perform the store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE handling (including the fast_interrupt_return path), to remove trace_hardirqs_on(), and fixes the interrupt-return part of the MSR_VSX restore bug caught by tm-unavailable selftest. mpe: Incorporate fix from Nick: The return-to-kernel path has to replay any soft-pending interrupts if it is returning to a context that had interrupts soft-enabled. It has to do this carefully and avoid plain enabling interrupts if this is an irq context, which can cause multiple nesting of interrupts on the stack, and other unexpected issues. The code which avoided this case got the soft-mask state wrong, and marked interrupts as enabled before going around again to retry. This seems to be mostly harmless except when PREEMPT=y, this calls preempt_schedule_irq with irqs apparently enabled and runs into a BUG in kernel/sched/core.c Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-04-01powerpc/64: Implement soft interrupt replay in CNicholas Piggin4-115/+130
When local_irq_enable() finds a pending soft-masked interrupt, it "replays" it by setting up registers like the initial interrupt entry, then calls into the low level handler to set up an interrupt stack frame and process the interrupt. This is not necessary, and uses more stack than needed. The high level interrupt handler can be called directly from C, with just pt_regs set up on stack. This should be faster and use less stack. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-28-npiggin@gmail.com
2020-04-01powerpc/64/syscall: Zero volatile registers when returningNicholas Piggin1-0/+13
Kernel addresses and potentially other sensitive data could be leaked in volatile registers after a syscall. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-27-npiggin@gmail.com
2020-04-01powerpc/64/sycall: Implement syscall entry/exit logic in CNicholas Piggin13-306/+326
System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. mpe: Includes subsequent fixes from Nick: This fixes 4 issues caught by TM selftests. First was a tm-syscall bug that hit due to tabort_syscall being called after interrupts were reconciled (in a subsequent patch), which led to interrupts being enabled before tabort_syscall was called. Rather than going through an un-reconciling interrupts for the return, I just go back to putting the test early in asm, the C-ification of that wasn't a big win anyway. Second is the syscall return _TIF_USER_WORK_MASK check would go into an infinite loop if _TIF_RESTORE_TM became set. The asm code uses _TIF_USER_WORK_MASK to brach to slowpath which includes restore_tm_state. Third is system call return was not calling restore_tm_state, I missed this completely (alhtough it's in the return from interrupt C conversion because when the asm syscall code encountered problems it would branch to the interrupt return code. Fourth is MSR_VEC missing from restore_math, which was caught by tm-unavailable selftest taking an unexpected facility unavailable interrupt when testing VSX unavailble exception with MSR.FP=1 MSR.VEC=1. Fourth case also has a fixup in a subsequent patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-26-npiggin@gmail.com
2020-04-01powerpc/64/sstep: Ifdef the deprecated fast endian switch syscallNicholas Piggin1-2/+3
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-25-npiggin@gmail.com
2020-04-01powerpc/64/syscall: Remove non-volatile GPR save optimisationNicholas Piggin2-66/+28
powerpc has an optimisation where interrupts avoid saving the non-volatile (or callee saved) registers to the interrupt stack frame if they are not required. Two problems with this are that an interrupt does not always know whether it will need non-volatiles; and if it does need them, they can only be saved from the entry-scoped asm code (because we don't control what the C compiler does with these registers). system calls are the most difficult: some system calls always require all registers (e.g., fork, to copy regs into the child). Sometimes registers are only required under certain conditions (e.g., tracing, signal delivery). These cases require ugly logic in the call chains (e.g., ppc_fork), and require a lot of logic to be implemented in asm. So remove the optimisation for system calls, and always save NVGPRs on entry. Modern high performance CPUs are not so sensitive, because the stores are dense in cache and can be hidden by other expensive work in the syscall path -- the null syscall selftests benchmark on POWER9 is not slowed (124.40ns before and 123.64ns after, i.e., within the noise). Other interrupts retain the NVGPR optimisation for now. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-24-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Soft NMI interrupt should not use ret_from_exceptNicholas Piggin1-1/+29
The soft NMI handler does not reconcile interrupt state, so it should not return via the normal ret_from_except path. Return like other NMIs, using the EXCEPTION_RESTORE_REGS macro. This becomes important when the scv interrupt is implemented, which must handle soft-masked interrupts that have r13 set to something other than the PACA -- returning to kernel in this case must restore r13. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-23-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Reconcile interrupts in system_resetNicholas Piggin1-4/+10
This adds IRQ_HARD_DIS to irq_happened. Although it doesn't seem to matter much because we're not allowed to enable irqs in an NMI handler, the soft-irq debugging code is becoming more strict about ensuring IRQ_HARD_DIS is in sync with MSR[EE], this may help avoid asserts or other issues. Add a comment explaining why MCE does not have this. Early machine check is generally much smaller and more contained code which will explode if you look at it wrong anyway as it runs in real mode, though there's an argument that we should do similar reconciling for the MCE as well. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-22-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Only test KVM in SRR interrupts when PR KVM is supportedNicholas Piggin1-3/+62
Apart from SRESET, MCE, and syscall (hcall variant), the SRR type interrupts are not escalated to hypervisor mode, so are delivered to the OS. When running PR KVM, the OS is the hypervisor, and the guest runs with MSR[PR]=1 (ie. usermode), so these interrupts must test if a guest was running when interrupted. These tests are required at the real-mode entry points because the PR KVM host runs with LPCR[AIL]=0. In HV KVM and nested HV KVM, the guest always receives these interrupts, so there is no need for the host to make this test. So remove the tests if PR KVM is not configured. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-21-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Add more comments for interrupt handlersNicholas Piggin1-38/+353
A few of the non-standard handlers are left uncommented. Some more description could be added to some. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-20-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Clean up SRR specifiersNicholas Piggin1-36/+32
Remove more magic numbers and replace with nicely named bools. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-19-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Re-inline some handlersNicholas Piggin1-3/+3
The reduction in interrupt entry size allows some handlers to be re-inlined. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-18-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Avoid touching the stack in hdecrementerNicholas Piggin3-15/+20
The hdec interrupt handler is reported to sometimes fire in Linux if KVM leaves it pending after a guest exists. This is harmless, so there is a no-op handler for it. The interrupt handler currently uses the regular kernel stack. Change this to avoid touching the stack entirely. This should be the last place where the regular Linux stack can be accessed with asynchronous interrupts (including PMI) soft-masked. It might be possible to take advantage of this invariant, e.g., to context switch the kernel stack SLB entry without clearing MSR[EE]. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-17-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Trim unused arguments from KVMTEST macroNicholas Piggin1-6/+6
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-16-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Remove the SPR saving patch code macrosNicholas Piggin1-54/+28
These are used infrequently enough they don't provide much help, so inline them. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-15-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Remove confusing IEARLY optionNicholas Piggin1-25/+24
Replace IEARLY=1 and IEARLY=2 with IBRANCH_COMMON, which controls if the entry code branches to a common handler; and IREALMODE_COMMON, which controls whether the common handler should remain in real mode. These special cases no longer avoid loading the SRR registers, there is no point as most of them load the registers immediately anyway. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-14-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Move KVM test to common codeNicholas Piggin3-139/+139
This allows more code to be moved out of unrelocated regions. The system call KVMTEST is changed to be open-coded and remain in the tramp area to avoid having to move it to entry_64.S. The custom nature of the system call entry code means the hcall case can be made more streamlined than regular interrupt handlers. mpe: Incorporate fix from Nick: Moving KVM test to the common entry code missed the case of HMI and MCE, which do not do __GEN_COMMON_ENTRY (because they don't want to switch to virt mode). This means a MCE or HMI exception that is taken while KVM is running a guest context will not be switched out of that context, and KVM won't be notified. Found by running sigfuz in guest with patched host on POWER9 DD2.3, which causes some TM related HMI interrupts (which are expected and supposed to be handled by KVM). This fix adds a __GEN_REALMODE_COMMON_ENTRY for those handlers to add the KVM test. This makes them look a little more like other handlers that all use __GEN_COMMON_ENTRY. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-13-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Move soft-mask test to common codeNicholas Piggin1-57/+49
As well as moving code out of the unrelocated vectors, this allows the masked handlers to be moved to common code, and allows the soft_nmi handler to be generated more like a regular handler. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-12-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Move real to virt switch into the common handlerNicholas Piggin2-153/+116
The real mode interrupt entry points currently use rfid to branch to the common handler in virtual mode. This is a significant amount of code, and forces other code (notably the KVM test) to live in the real mode handler. In the interest of minimising the amount of code that runs unrelocated move the switch to virt mode into the common code, and do it with mtmsrd, which avoids clobbering SRRs (although the post-KVMTEST performance of real-mode interrupt handlers is not a big concern these days). This requires CTR to always be saved (real-mode needs to reach 0xc...) but that's not a huge impact these days. It could be optimized away in future. mpe: Incorporate fix from Nick: It's possible for interrupts to be replayed when TM is enabled and suspended, for example rt_sigreturn, where the mtmsrd MSR_KERNEL in the real-mode entry point to the common handler causes a TM Bad Thing exception (due to attempting to clear suspended). The fix for this is to have replay interrupts go to the _virt entry point and skip the mtmsrd, which matches what happens before this patch. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-11-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Add ISIDE optionNicholas Piggin1-7/+16
Rather than using DAR=2 to select the i-side registers, add an explicit option. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-10-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Remove old INT_KVM_HANDLERNicholas Piggin1-29/+26
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-9-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Remove old INT_COMMON macroNicholas Piggin1-27/+24
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-8-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Remove old INT_ENTRY macroNicholas Piggin1-38/+30
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-7-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Move all interrupt handlers to new style code gen macrosNicholas Piggin1-138/+418
Aside from label names and BUG line numbers, the generated code change is an additional HMI KVM handler added for the "late" KVM handler, because early and late HMI generation is achieved by defining two different interrupt types. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-6-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Expand EXC_COMMON and EXC_COMMON_ASYNC macrosNicholas Piggin1-43/+117
These don't provide a large amount of code sharing. Removing them makes code easier to shuffle around. For example, some of the common instructions will be moved into the common code gen macro. No generated code change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-5-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Add GEN_KVM macro that uses INT_DEFINE parametersNicholas Piggin1-1/+11
No generated code change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-4-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Add GEN_COMMON macro that uses INT_DEFINE parametersNicholas Piggin1-7/+17
No generated code change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-3-npiggin@gmail.com
2020-04-01powerpc/64s/exception: Introduce INT_DEFINE parameter block for code generationNicholas Piggin1-4/+73
The code generation macro arguments are difficult to read, and defaults can't easily be used. This introduces a block where parameters can be set for interrupt handler code generation by the subsequent macros, and adds the first generation macro for interrupt entry. One interrupt handler is converted to the new macros to demonstrate the change, the rest will be coverted all at once. No generated code change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200225173541.1549955-2-npiggin@gmail.com
2020-04-01powerpc/64: mark emergency stacks valid to unwindNicholas Piggin1-1/+30
Before: WARNING: CPU: 0 PID: 494 at arch/powerpc/kernel/irq.c:343 CPU: 0 PID: 494 Comm: a Tainted: G W NIP: c00000000001ed2c LR: c000000000d13190 CTR: c00000000003f910 REGS: c0000001fffd3870 TRAP: 0700 Tainted: G W MSR: 8000000000021003 <SF,ME,RI,LE> CR: 28000488 XER: 00000000 CFAR: c00000000001ec90 IRQMASK: 0 GPR00: c000000000aeb12c c0000001fffd3b00 c0000000012ba300 0000000000000000 GPR04: 0000000000000000 0000000000000000 000000010bd207c8 6b00696e74657272 GPR08: 0000000000000000 0000000000000000 0000000000000000 efbeadde00000000 GPR12: 0000000000000000 c0000000014a0000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 000000010bd207bc GPR28: 0000000000000000 c00000000148a898 0000000000000000 c0000001ffff3f50 NIP [c00000000001ed2c] arch_local_irq_restore.part.0+0xac/0x100 LR [c000000000d13190] _raw_spin_unlock_irqrestore+0x50/0xc0 Call Trace: Instruction dump: 60000000 7d2000a6 71298000 41820068 39200002 7d210164 4bffff9c 60000000 60000000 7d2000a6 71298000 4c820020 <0fe00000> 4e800020 60000000 60000000 After: WARNING: CPU: 0 PID: 499 at arch/powerpc/kernel/irq.c:343 CPU: 0 PID: 499 Comm: a Not tainted NIP: c00000000001ed2c LR: c000000000d13210 CTR: c00000000003f980 REGS: c0000001fffd3870 TRAP: 0700 Not tainted MSR: 8000000000021003 <SF,ME,RI,LE> CR: 28000488 XER: 00000000 CFAR: c00000000001ec90 IRQMASK: 0 GPR00: c000000000aeb1ac c0000001fffd3b00 c0000000012ba300 0000000000000000 GPR04: 0000000000000000 0000000000000000 00000001347607c8 6b00696e74657272 GPR08: 0000000000000000 0000000000000000 0000000000000000 efbeadde00000000 GPR12: 0000000000000000 c0000000014a0000 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 00000001347607bc GPR28: 0000000000000000 c00000000148a898 0000000000000000 c0000001ffff3f50 NIP [c00000000001ed2c] arch_local_irq_restore.part.0+0xac/0x100 LR [c000000000d13210] _raw_spin_unlock_irqrestore+0x50/0xc0 Call Trace: [c0000001fffd3b20] [c000000000aeb1ac] of_find_property+0x6c/0x90 [c0000001fffd3b70] [c000000000aeb1f0] of_get_property+0x20/0x40 [c0000001fffd3b90] [c000000000042cdc] rtas_token+0x3c/0x70 [c0000001fffd3bb0] [c0000000000dc318] fwnmi_release_errinfo+0x28/0x70 [c0000001fffd3c10] [c0000000000dcd8c] pseries_machine_check_realmode+0x1dc/0x540 [c0000001fffd3cd0] [c00000000003fe04] machine_check_early+0x54/0x70 [c0000001fffd3d00] [c000000000008384] machine_check_early_common+0x134/0x1f0 --- interrupt: 200 at 0x1347607c8 LR = 0x7fffafbd8328 Instruction dump: 60000000 7d2000a6 71298000 41820068 39200002 7d210164 4bffff9c 60000000 60000000 7d2000a6 71298000 4c820020 <0fe00000> 4e800020 60000000 60000000 Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200325104144.158362-1-npiggin@gmail.com
2020-04-01powerpc/64/tm: Don't let userspace set regs->trap via sigreturnMichael Ellerman1-1/+3
In restore_tm_sigcontexts() we take the trap value directly from the user sigcontext with no checking: err |= __get_user(regs->trap, &sc->gp_regs[PT_TRAP]); This means we can be in the kernel with an arbitrary regs->trap value. Although that's not immediately problematic, there is a risk we could trigger one of the uses of CHECK_FULL_REGS(): #define CHECK_FULL_REGS(regs) BUG_ON(regs->trap & 1) It can also cause us to unnecessarily save non-volatile GPRs again in save_nvgprs(), which shouldn't be problematic but is still wrong. It's also possible it could trick the syscall restart machinery, which relies on regs->trap not being == 0xc00 (see 9a81c16b5275 ("powerpc: fix double syscall restarts")), though I haven't been able to make that happen. Finally it doesn't match the behaviour of the non-TM case, in restore_sigcontext() which zeroes regs->trap. So change restore_tm_sigcontexts() to zero regs->trap. This was discovered while testing Nick's upcoming rewrite of the syscall entry path. In that series the call to save_nvgprs() prior to signal handling (do_notify_resume()) is removed, which leaves the low-bit of regs->trap uncleared which can then trigger the FULL_REGS() WARNs in setup_tm_sigcontexts(). Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200401023836.3286664-1-mpe@ellerman.id.au
2020-04-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds1-6/+0
Pull networking updates from David Miller: "Highlights: 1) Fix the iwlwifi regression, from Johannes Berg. 2) Support BSS coloring and 802.11 encapsulation offloading in hardware, from John Crispin. 3) Fix some potential Spectre issues in qtnfmac, from Sergey Matyukevich. 4) Add TTL decrement action to openvswitch, from Matteo Croce. 5) Allow paralleization through flow_action setup by not taking the RTNL mutex, from Vlad Buslov. 6) A lot of zero-length array to flexible-array conversions, from Gustavo A. R. Silva. 7) Align XDP statistics names across several drivers for consistency, from Lorenzo Bianconi. 8) Add various pieces of infrastructure for offloading conntrack, and make use of it in mlx5 driver, from Paul Blakey. 9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki. 10) Lots of parallelization improvements during configuration changes in mlxsw driver, from Ido Schimmel. 11) Add support to devlink for generic packet traps, which report packets dropped during ACL processing. And use them in mlxsw driver. From Jiri Pirko. 12) Support bcmgenet on ACPI, from Jeremy Linton. 13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei Starovoitov, and your's truly. 14) Support XDP meta-data in virtio_net, from Yuya Kusakabe. 15) Fix sysfs permissions when network devices change namespaces, from Christian Brauner. 16) Add a flags element to ethtool_ops so that drivers can more simply indicate which coalescing parameters they actually support, and therefore the generic layer can validate the user's ethtool request. Use this in all drivers, from Jakub Kicinski. 17) Offload FIFO qdisc in mlxsw, from Petr Machata. 18) Support UDP sockets in sockmap, from Lorenz Bauer. 19) Fix stretch ACK bugs in several TCP congestion control modules, from Pengcheng Yang. 20) Support virtual functiosn in octeontx2 driver, from Tomasz Duszynski. 21) Add region operations for devlink and use it in ice driver to dump NVM contents, from Jacob Keller. 22) Add support for hw offload of MACSEC, from Antoine Tenart. 23) Add support for BPF programs that can be attached to LSM hooks, from KP Singh. 24) Support for multiple paths, path managers, and counters in MPTCP. From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti, and others. 25) More progress on adding the netlink interface to ethtool, from Michal Kubecek" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits) net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline cxgb4/chcr: nic-tls stats in ethtool net: dsa: fix oops while probing Marvell DSA switches net/bpfilter: remove superfluous testing message net: macb: Fix handling of fixed-link node net: dsa: ksz: Select KSZ protocol tag netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write net: stmmac: add EHL 2.5Gbps PCI info and PCI ID net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID net: stmmac: create dwmac-intel.c to contain all Intel platform net: dsa: bcm_sf2: Support specifying VLAN tag egress rule net: dsa: bcm_sf2: Add support for matching VLAN TCI net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT net: dsa: bcm_sf2: Disable learning for ASP port net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge net: dsa: b53: Prevent tagged VLAN on port 7 for 7278 net: dsa: b53: Restore VLAN entries upon (re)configuration net: dsa: bcm_sf2: Fix overflow checks hv_netvsc: Remove unnecessary round_up for recv_completion_cnt ...
2020-04-01libnvdimm: Update persistence domain value for of_pmem and papr_scm deviceAneesh Kumar K.V1-1/+3
Currently, kernel shows the below values "persistence_domain":"cpu_cache" "persistence_domain":"memory_controller" "persistence_domain":"unknown" "cpu_cache" indicates no extra instructions is needed to ensure the persistence of data in the pmem media on power failure. "memory_controller" indicates cpu cache flush instructions are required to flush the data. Platform provides mechanisms to automatically flush outstanding write data from memory controler to pmem on system power loss. Based on the above use memory_controller for non volatile regions on ppc64. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20200324034821.60869-1-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-31KVM: Pass kvm_init()'s opaque param to additional arch funcsSean Christopherson1-2/+2
Pass @opaque to kvm_arch_hardware_setup() and kvm_arch_check_processor_compat() to allow architecture specific code to reference @opaque without having to stash it away in a temporary global variable. This will enable x86 to separate its vendor specific callback ops, which are passed via @opaque, into "init" and "runtime" ops without having to stash away the "init" ops. No functional change intended. Reviewed-by: Cornelia Huck <cohuck@redhat.com> Tested-by: Cornelia Huck <cohuck@redhat.com> #s390 Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200321202603.19355-2-sean.j.christopherson@intel.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-31Merge tag 'kvm-ppc-next-5.7-1' of ↵Paolo Bonzini23-122/+187
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD KVM PPC update for 5.7 * Add a capability for enabling secure guests under the Protected Execution Framework ultravisor * Various bug fixes and cleanups.
2020-03-31Merge tag 'kvmarm-5.7' of ↵Paolo Bonzini17-99/+308
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for Linux 5.7 - GICv4.1 support - 32bit host removal
2020-03-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-6/+0
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31Merge tag 'timers-nohz-2020-03-30' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull NOHZ update from Thomas Gleixner: "Remove TIF_NOHZ from three architectures These architectures use a static key to decide whether context tracking needs to be invoked and the TIF_NOHZ flag just causes a pointless slowpath execution for nothing" * tag 'timers-nohz-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64: Remove TIF_NOHZ arm: Remove TIF_NOHZ x86: Remove TIF_NOHZ context-tracking: Introduce CONFIG_HAVE_TIF_NOHZ x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY
2020-03-31Merge tag 'smp-core-2020-03-30' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core SMP updates from Thomas Gleixner: "CPU (hotplug) updates: - Support for locked CSD objects in smp_call_function_single_async() which allows to simplify callsites in the scheduler core and MIPS - Treewide consolidation of CPU hotplug functions which ensures the consistency between the sysfs interface and kernel state. The low level functions cpu_up/down() are now confined to the core code and not longer accessible from random code" * tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus() cpu/hotplug: Hide cpu_up/down() cpu/hotplug: Move bringup of secondary CPUs out of smp_init() torture: Replace cpu_up/down() with add/remove_cpu() firmware: psci: Replace cpu_up/down() with add/remove_cpu() xen/cpuhotplug: Replace cpu_up/down() with device_online/offline() parisc: Replace cpu_up/down() with add/remove_cpu() sparc: Replace cpu_up/down() with add/remove_cpu() powerpc: Replace cpu_up/down() with add/remove_cpu() x86/smp: Replace cpu_up/down() with add/remove_cpu() arm64: hibernate: Use bringup_hibernate_cpu() cpu/hotplug: Provide bringup_hibernate_cpu() arm64: Use reboot_cpu instead of hardconding it to 0 arm64: Don't use disable_nonboot_cpus() ARM: Use reboot_cpu instead of hardcoding it to 0 ARM: Don't use disable_nonboot_cpus() ia64: Replace cpu_down() with smp_shutdown_nonboot_cpus() cpu/hotplug: Create a new function to shutdown nonboot cpus cpu/hotplug: Add new {add,remove}_cpu() functions sched/core: Remove rq.hrtick_csd_pending ...
2020-03-31Merge branch 'perf-core-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main changes in this cycle were: Kernel side changes: - A couple of x86/cpu cleanups and changes were grandfathered in due to patch dependencies. These clean up the set of CPU model/family matching macros with a consistent namespace and C99 initializer style. - A bunch of updates to various low level PMU drivers: * AMD Family 19h L3 uncore PMU * Intel Tiger Lake uncore support * misc fixes to LBR TOS sampling - optprobe fixes - perf/cgroup: optimize cgroup event sched-in processing - misc cleanups and fixes Tooling side changes are to: - perf {annotate,expr,record,report,stat,test} - perl scripting - libapi, libperf and libtraceevent - vendor events on Intel and S390, ARM cs-etm - Intel PT updates - Documentation changes and updates to core facilities - misc cleanups, fixes and other enhancements" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (89 commits) cpufreq/intel_pstate: Fix wrong macro conversion x86/cpu: Cleanup the now unused CPU match macros hwrng: via_rng: Convert to new X86 CPU match macros crypto: Convert to new CPU match macros ASoC: Intel: Convert to new X86 CPU match macros powercap/intel_rapl: Convert to new X86 CPU match macros PCI: intel-mid: Convert to new X86 CPU match macros mmc: sdhci-acpi: Convert to new X86 CPU match macros intel_idle: Convert to new X86 CPU match macros extcon: axp288: Convert to new X86 CPU match macros thermal: Convert to new X86 CPU match macros hwmon: Convert to new X86 CPU match macros platform/x86: Convert to new CPU match macros EDAC: Convert to new X86 CPU match macros cpufreq: Convert to new X86 CPU match macros ACPI: Convert to new X86 CPU match macros x86/platform: Convert to new CPU match macros x86/kernel: Convert to new CPU match macros x86/kvm: Convert to new CPU match macros x86/perf/events: Convert to new CPU match macros ...
2020-03-31Merge branch 'locking-core-for-linus' of ↵Linus Torvalds2-12/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle were: - Continued user-access cleanups in the futex code. - percpu-rwsem rewrite that uses its own waitqueue and atomic_t instead of an embedded rwsem. This addresses a couple of weaknesses, but the primary motivation was complications on the -rt kernel. - Introduce raw lock nesting detection on lockdep (CONFIG_PROVE_RAW_LOCK_NESTING=y), document the raw_lock vs. normal lock differences. This too originates from -rt. - Reuse lockdep zapped chain_hlocks entries, to conserve RAM footprint on distro-ish kernels running into the "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!" depletion of the lockdep chain-entries pool. - Misc cleanups, smaller fixes and enhancements - see the changelog for details" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits) fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t Documentation/locking/locktypes: Minor copy editor fixes Documentation/locking/locktypes: Further clarifications and wordsmithing m68knommu: Remove mm.h include from uaccess_no.h x86: get rid of user_atomic_cmpxchg_inatomic() generic arch_futex_atomic_op_inuser() doesn't need access_ok() x86: don't reload after cmpxchg in unsafe_atomic_op2() loop x86: convert arch_futex_atomic_op_inuser() to user_access_begin/user_access_end() objtool: whitelist __sanitizer_cov_trace_switch() [parisc, s390, sparc64] no need for access_ok() in futex handling sh: no need of access_ok() in arch_futex_atomic_op_inuser() futex: arch_futex_atomic_op_inuser() calling conventions change completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all() lockdep: Add posixtimer context tracing bits lockdep: Annotate irq_work lockdep: Add hrtimer context tracing bits lockdep: Introduce wait-type checks completion: Use simple wait queues sched/swait: Prepare usage in completions ...
2020-03-28Merge branch 'uaccess.futex' of ↵Thomas Gleixner1-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into locking/core Pull uaccess futex cleanups for Al Viro: Consolidate access_ok() usage and the futex uaccess function zoo.
2020-03-28futex: arch_futex_atomic_op_inuser() calling conventions changeAl Viro1-3/+2
Move access_ok() in and pagefault_enable()/pagefault_disable() out. Mechanical conversion only - some instances don't really need a separate access_ok() at all (e.g. the ones only using get_user()/put_user(), or architectures where access_ok() is always true); we'll deal with that in followups. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-03-27powerpc/64: Avoid isync in flush_dcache_range()Aneesh Kumar K.V1-5/+1
As per ISA an isync is only needed on instruction cache block invalidate. Remove the same from dcache invalidate. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320103242.229223-1-aneesh.kumar@linux.ibm.com
2020-03-27powerpc/boot: Delete unneeded .globl _zimage_startFangrui Song1-3/+0
.globl sets the symbol binding to STB_GLOBAL while .weak sets the binding to STB_WEAK. GNU as let .weak override .globl since binutils-gdb 5ca547dc2399a0a5d9f20626d4bf5547c3ccfddd (1996). Clang integrated assembler let the last win but it may error in the future. Since it is a convention that only one binding directive is used, just delete .globl. Fixes: ee9d21b3b358 ("powerpc/boot: Ensure _zimage_start is a weak symbol") Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200325164257.170229-1-maskray@google.com
2020-03-27powerpc/pseries: Handle UE event for memcpy_mcsafeGanesh Goudar4-6/+21
memcpy_mcsafe has been implemented for power machines which is used by pmem infrastructure, so that an UE encountered during memcpy from pmem devices would not result in panic instead a right error code is returned. The implementation expects machine check handler to ignore the event and set nip to continue the execution from fixup code. Appropriate changes are already made to powernv machine check handler, make similar changes to pseries machine check handler to ignore the the event and set nip to continue execution at the fixup entry if we hit UE at an instruction with a fixup entry. while we are at it, have a common function which searches the exception table entry and updates nip with fixup address, and any future common changes can be made in this function that are valid for both architectures. powernv changes are made by commit 895e3dceeb97 ("powerpc/mce: Handle UE event for memcpy_mcsafe") Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Santosh S <santosh@fossix.org> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200326184916.31172-1-ganeshgr@linux.ibm.com
2020-03-26mm: handle multiple owners of device private pages in migrate_vmaChristoph Hellwig1-0/+1
Add a new src_owner field to struct migrate_vma. If the field is set, only device private pages with page->pgmap->owner equal to that field are migrated. If the field is not set only "normal" pages are migrated. Fixes: df6ad69838fc ("mm/device-public-memory: device memory cache coherent with CPU") Link: https://lore.kernel.org/r/20200316193216.920734-3-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26memremap: add an owner field to struct dev_pagemapChristoph Hellwig1-0/+2
Add a new opaque owner field to struct dev_pagemap, which will allow the hmm and migrate_vma code to identify who owns ZONE_DEVICE memory, and refuse to work on mappings not owned by the calling entity. Link: https://lore.kernel.org/r/20200316193216.920734-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-26powerpc/smp: Use IS_ENABLED() to avoid #ifdefMichael Ellerman1-3/+2
We can avoid the #ifdef by using IS_ENABLED() in the existing condition check. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200313112020.28235-2-mpe@ellerman.id.au