path: root/arch/x86/kernel/fpu/signal.c
Age | Commit message | Author | Files | Lines
2021-10-16 | x86/fpu: Mask out the invalid MXCSR bits properly | Borislav Petkov | 1 | -1/+1
This is a fix for the fix (yeah, /facepalm). The correct mask to use is not the negation of the MXCSR_MASK but the actual mask which contains the supported bits in the MXCSR register. Reported and debugged by Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: d298b03506d3 ("x86/fpu: Restore the masking out of reserved MXCSR bits") Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ser Olmy <ser.olmy@protonmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/YWgYIYXLriayyezv@intel.com
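The difference between the two masks can be checked with plain integer arithmetic. The following is a minimal sketch, not the kernel code: the helper names are hypothetical, and the sample values are the ones printed in the 2021-10-08 report in this log (mxcsr 0xb7be13b4, mxcsr_feature_mask 0xffbf).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep only the MXCSR bits the CPU supports. */
uint32_t mxcsr_keep_supported(uint32_t mxcsr, uint32_t feature_mask)
{
	return mxcsr & feature_mask;
}

/* The inverted form: this keeps exactly the reserved bits instead. */
uint32_t mxcsr_keep_reserved(uint32_t mxcsr, uint32_t feature_mask)
{
	return mxcsr & ~feature_mask;
}

/* Usage example with the values from the bug report. */
void mxcsr_mask_example(void)
{
	const uint32_t mask = 0xffbf;        /* reported mxcsr_feature_mask */
	const uint32_t mxcsr = 0xb7be13b4;   /* reported fxsave.mxcsr */

	/* Supported bits only: safe to hand back to the hardware. */
	assert(mxcsr_keep_supported(mxcsr, mask) == 0x13b4);
	/* Negated mask: only reserved bits survive. */
	assert(mxcsr_keep_reserved(mxcsr, mask) == 0xb7be0000);
}
```

Under these assumptions the negated mask leaves a value made entirely of reserved bits, which is the opposite of what masking them out is meant to do.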
2021-10-08 | x86/fpu: Restore the masking out of reserved MXCSR bits | Borislav Petkov | 1 | -3/+8
Ser Olmy reported a boot failure:

  init[1] bad frame in sigreturn frame:(ptrval) ip:b7c9fbe6 sp:bf933310 orax:ffffffff \
      in libc-2.33.so[b7bed000+156000]
  Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
  CPU: 0 PID: 1 Comm: init Tainted: G W 5.14.9 #1
  Hardware name: Hewlett-Packard HP PC/HP Board, BIOS JD.00.06 12/06/2001
  Call Trace:
   dump_stack_lvl
   dump_stack
   panic
   do_exit.cold
   do_group_exit
   get_signal
   arch_do_signal_or_restart
   ? force_sig_info_to_task
   ? force_sig
   exit_to_user_mode_prepare
   syscall_exit_to_user_mode
   do_int80_syscall_32
   entry_INT80_32

on an old 32-bit Intel CPU:

  vendor_id : GenuineIntel
  cpu family : 6
  model : 6
  model name : Celeron (Mendocino)
  stepping : 5
  microcode : 0x3

Ser bisected the problem to the commit in Fixes.

tglx suggested reverting the rejection of invalid MXCSR values which this commit introduced and replacing it with what the old code did - simply masking them out to zero.

Further debugging confirmed his suggestion:

  fpu->state.fxsave.mxcsr: 0xb7be13b4, mxcsr_feature_mask: 0xffbf
  WARNING: CPU: 0 PID: 1 at arch/x86/kernel/fpu/signal.c:384 __fpu_restore_sig+0x51f/0x540

so restore the original behavior only for 32-bit kernels where you have ancient machines with buggy hardware. For 32-bit programs on 64-bit kernels, user space which supplies wrong MXCSR values is considered malicious so fail the sigframe restoration there.

Fixes: 6f9866a166cd ("x86/fpu/signal: Let xrstor handle the features to init")
Reported-by: Ser Olmy <ser.olmy@protonmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Ser Olmy <ser.olmy@protonmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YVtA67jImg3KlBTw@zn.tnic
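The policy described in this message (mask on 32-bit kernels, reject on 64-bit kernels) could be sketched roughly as follows. This is an illustration only: `mxcsr_check()` and its `kernel_is_64bit` parameter are hypothetical names, not the kernel's interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical policy helper. Returns 0 when the value is usable
 * (possibly after clearing reserved bits), or -EINVAL when the
 * sigframe should be rejected.
 */
int mxcsr_check(uint32_t *mxcsr, uint32_t feature_mask, bool kernel_is_64bit)
{
	/* Nothing to do when no reserved bit is set. */
	if (!(*mxcsr & ~feature_mask))
		return 0;

	/* 64-bit kernel: treat reserved bits as a malformed frame. */
	if (kernel_is_64bit)
		return -EINVAL;

	/* 32-bit kernel: old behavior, silently clear the reserved bits. */
	*mxcsr &= feature_mask;
	return 0;
}

/* Usage example with the values from the report above. */
void mxcsr_check_example(void)
{
	uint32_t v = 0xb7be13b4;

	assert(mxcsr_check(&v, 0xffbf, false) == 0);
	assert(v == 0x13b4);
}
```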
2021-06-24 | x86/fpu/signal: Let xrstor handle the features to init | Thomas Gleixner | 1 | -58/+31
There is no reason to do an extra XRSTOR from init_fpstate for feature bits which have been cleared by user space in the FX magic xfeatures storage. Just clear them in the task's XSTATE header and do a full restore which will put these cleared features into init state. There is no real difference in performance because the current code already does a full restore when the xfeatures bits are preserved as the signal frame setup has stored them, which is the full UABI feature set. [ bp: Use the negated mxcsr_feature_mask in the MXCSR check. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121457.804115017@linutronix.de
2021-06-23 | x86/fpu/signal: Handle #PF in the direct restore path | Thomas Gleixner | 1 | -34/+33
If *RSTOR raises an exception, then the slow path is taken. That's wrong because if the reason was not #PF then going through the slow path is a waste of time because that will end up with the same conclusion that the data is invalid.

Now that the wrapper around *RSTOR returns a negative error code, which is the negated trap number, it's possible to differentiate.

If the *RSTOR raised #PF then handle it directly in the fast path and if it was some other exception, e.g. #GP, then give up and do not try the fast path.

This removes the legacy frame FRSTOR code from the slow path because FRSTOR is not an ia32_fxstate frame and is therefore handled in the fast path.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.696022863@linutronix.de
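The dispatch on the negated trap number can be sketched as below. The vector numbers 13 (#GP) and 14 (#PF) are the architectural x86 values; the enum and function names are hypothetical and only illustrate the decision, not the kernel's control flow.

```c
#include <assert.h>

#define TRAP_GP 13	/* general protection fault vector */
#define TRAP_PF 14	/* page fault vector */

enum restore_action {
	RESTORE_DONE,			/* registers restored */
	RESTORE_FAULT_IN_AND_RETRY,	/* user page missing: fault it in, retry */
	RESTORE_GIVE_UP			/* data is invalid: retrying cannot help */
};

/* ret is 0 on success or the negated trap number on failure. */
enum restore_action classify_restore_result(int ret)
{
	if (ret == 0)
		return RESTORE_DONE;
	if (ret == -TRAP_PF)
		return RESTORE_FAULT_IN_AND_RETRY;
	return RESTORE_GIVE_UP;
}

/* Usage example. */
void classify_restore_example(void)
{
	assert(classify_restore_result(-TRAP_PF) == RESTORE_FAULT_IN_AND_RETRY);
	assert(classify_restore_result(-TRAP_GP) == RESTORE_GIVE_UP);
}
```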
2021-06-23 | x86/fpu/signal: Split out the direct restore code | Thomas Gleixner | 1 | -54/+58
Prepare for smarter failure handling of the direct restore. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121457.493455414@linutronix.de
2021-06-23 | x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing() | Thomas Gleixner | 1 | -21/+15
Now that user_xfeatures is correctly set when xsave is enabled, remove the duplicated initialization of components. Rename the function while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121457.377341297@linutronix.de
2021-06-23 | x86/fpu/signal: Sanitize the xstate check on sigframe | Thomas Gleixner | 1 | -37/+33
Utilize the check for the extended state magic in the FX software reserved bytes and set the parameters for restoring fx_only in the relevant members of fw_sw_user. This allows further cleanups on top because the data is consistent. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121457.277738268@linutronix.de
2021-06-23 | x86/fpu/signal: Remove the legacy alignment check | Thomas Gleixner | 1 | -3/+0
Checking for the XSTATE buffer being 64-byte aligned, and if not, deciding just to restore the FXSR state is daft. If user space provides an unaligned math frame and has the extended state magic set in the FX software reserved bytes, then it really can keep the pieces. If the frame is unaligned and the FX software magic is not set, then fx_only is already set and the restore will use fxrstor. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121457.184149902@linutronix.de
2021-06-23 | x86/fpu/signal: Move initial checks into fpu__restore_sig() | Thomas Gleixner | 1 | -35/+41
__fpu__restore_sig() is convoluted and some of the basic checks can trivially be done in the calling function as well as the final error handling of clearing user state. [ bp: Fixup typos. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121457.086336154@linutronix.de
2021-06-23 | x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() | Thomas Gleixner | 1 | -5/+5
Rename it so it's clear that this is about user ABI features which can differ from the feature set which the kernel saves and restores because the kernel handles e.g. PKRU differently. But the user ABI (ptrace, signal frame) expects it to be there. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121456.211585137@linutronix.de
2021-06-23 | x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() | Thomas Gleixner | 1 | -1/+1
Rename it so that it becomes entirely clear what this function is about. Its purpose is to restore the FPU registers to the state which was saved in the task's FPU memory state either at context switch or by an in-kernel FPU user. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121456.018867925@linutronix.de
2021-06-23 | x86/fpu: Rename xstate copy functions which are related to UABI | Thomas Gleixner | 1 | -1/+1
Rename them to reflect that these functions deal with user space format XSAVE buffers.

  copy_kernel_to_xstate() -> copy_uabi_from_kernel_to_xstate()
  copy_user_to_xstate() -> copy_sigframe_from_user_to_xstate()

Again a clear statement that these functions deal with user space ABI.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.318485015@linutronix.de
2021-06-23 | x86/fpu: Rename fregs-related copy functions | Thomas Gleixner | 1 | -3/+3
The function names for fnsave/fnrstor operations are horribly named and a permanent source of confusion.

Rename:

  copy_kernel_to_fregs() to frstor()
  copy_fregs_to_user() to fnsave_to_user_sigframe()
  copy_user_to_fregs() to frstor_from_user_sigframe()

so it's clear what these are doing. All these functions are really low level wrappers around the equally named instructions, so mapping to the documentation is just natural.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.223594101@linutronix.de
2021-06-23 | x86/fpu: Rename fxregs-related copy functions | Thomas Gleixner | 1 | -5/+5
The function names for fxsave/fxrstor operations are horribly named and a permanent source of confusion.

Rename:

  copy_fxregs_to_kernel() to fxsave()
  copy_kernel_to_fxregs() to fxrstor()
  copy_fxregs_to_user() to fxsave_to_user_sigframe()
  copy_user_to_fxregs() to fxrstor_from_user_sigframe()

so it's clear what these are doing. All these functions are really low level wrappers around the equally named instructions, so mapping to the documentation is just natural.

While at it, replace the static_cpu_has(X86_FEATURE_FXSR) with use_fxsr() to be consistent with the rest of the code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.017863494@linutronix.de
2021-06-23 | x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() | Thomas Gleixner | 1 | -2/+2
The function names for xsave[s]/xrstor[s] operations are horribly named and a permanent source of confusion.

Rename:

  copy_xregs_to_user() to xsave_to_user_sigframe()
  copy_user_to_xregs() to xrstor_from_user_sigframe()

so it's entirely clear what this is about. This is also a clear indicator of the potentially different storage format because this is user ABI and cannot use compacted format.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.924266705@linutronix.de
2021-06-23 | x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() | Thomas Gleixner | 1 | -10/+11
The function names for xsave[s]/xrstor[s] operations are horribly named and a permanent source of confusion.

Rename:

  copy_xregs_to_kernel() to os_xsave()
  copy_kernel_to_xregs() to os_xrstor()

These are truly low level wrappers around the actual instructions XSAVE[OPT]/XRSTOR and XSAVES/XRSTORS with the twist that the selection based on the available CPU features happens with an alternative to avoid conditionals all over the place and to provide the best performance for hot paths.

The os_ prefix tells that this is the OS selected mechanism.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.830239347@linutronix.de
2021-06-23 | x86/fpu: Get rid of copy_supervisor_to_kernel() | Thomas Gleixner | 1 | -5/+8
If the fast path of restoring the FPU state on sigreturn fails or is not taken and the current task's FPU is active then the FPU has to be deactivated for the slow path to allow a safe update of the task's FPU memory state. With supervisor states enabled, this requires saving the supervisor state in the memory state first. Supervisor states require XSAVES, so saving only the supervisor state requires reshuffling the memory buffer because XSAVES uses the compacted format and therefore stores the supervisor states at the beginning of the memory state. That's just an overengineered optimization. Get rid of it and save the full state for this case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210623121453.734561971@linutronix.de
2021-06-23 | Merge x86/urgent into x86/fpu | Borislav Petkov | 1 | -37/+43
Pick up dependent changes which either went mainline (x86/urgent is based on -rc7 and that contains them) as urgent fixes and the current x86/urgent branch which contains two more urgent fixes, so that the bigger FPU rework can be based on top of them. Signed-off-by: Borislav Petkov <bp@suse.de>
2021-06-22 | x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate() | Thomas Gleixner | 1 | -18/+8
sanitize_restored_user_xstate() preserves the supervisor states only when the fx_only argument is zero, which allows unprivileged user space to put supervisor states back into init state. Preserve them unconditionally. [ bp: Fix a typo or two in the text. ] Fixes: 5d6b6a6f9b5c ("x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210618143444.438635017@linutronix.de
2021-06-10 | x86/fpu: Reset state for all signal restore failures | Thomas Gleixner | 1 | -11/+15
If access_ok() or fpregs_soft_set() fails in __fpu__restore_sig() then the function just returns but does not clear the FPU state as it does for all other fatal failures. Clear the FPU state for these failures as well. Fixes: 72a671ced66d ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87mtryyhhz.ffs@nanos.tec.linutronix.de
2021-06-09 | x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer | Andy Lutomirski | 1 | -0/+19
Both Intel and AMD consider it to be architecturally valid for XRSTOR to fail with #PF but nonetheless change the register state. The actual conditions under which this might occur are unclear [1], but it seems plausible that this might be triggered if one sibling thread unmaps a page and invalidates the shared TLB while another sibling thread is executing XRSTOR on the page in question. __fpu__restore_sig() can execute XRSTOR while the hardware registers are preserved on behalf of a different victim task (using the fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but modify the registers. If this happens, then there is a window in which __fpu__restore_sig() could schedule out and the victim task could schedule back in without reloading its own FPU registers. This would result in part of the FPU state that __fpu__restore_sig() was attempting to load leaking into the victim task's user-visible state. Invalidate preserved FPU registers on XRSTOR failure to prevent this situation from corrupting any state. [1] Frequent readers of the errata lists might imagine "complex microarchitectural conditions". Fixes: 1d731e731c4c ("x86/fpu: Add a fastpath to __fpu__restore_sig()") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210608144345.758116583@linutronix.de
2021-06-09 | x86/fpu: Prevent state corruption in __fpu__restore_sig() | Thomas Gleixner | 1 | -8/+1
The non-compacted slowpath uses __copy_from_user() and copies the entire user buffer into the kernel buffer, verbatim. This means that the kernel buffer may now contain entirely invalid state on which XRSTOR will #GP. validate_user_xstate_header() can detect some of that corruption, but that leaves the onus on callers to clear the buffer.

Prior to XSAVES support, it was possible just to reinitialize the buffer, completely, but with supervisor states that is no longer possible as the buffer clearing code split got it backwards. Fixing that is possible but not corrupting the state in the first place is more robust.

Avoid corruption of the kernel XSAVE buffer by using copy_user_to_xstate() which validates the XSAVE header contents before copying the actual states to the kernel. copy_user_to_xstate() was previously only called for compacted-format kernel buffers, but it works for both compacted and non-compacted forms.

Using it for the non-compacted form is slower because of multiple __copy_from_user() operations, but that cost is less important than robust code in an already slow path.

[ Changelog polished by Dave Hansen ]

Fixes: b860eb8dce59 ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates")
Reported-by: syzbot+2067e764dbcd10721e2e@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144345.611833074@linutronix.de
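The "validate the header first, copy afterwards" idea can be illustrated with a toy buffer. Everything below is an assumption made for illustration: the `toy_` types and functions are hypothetical, the 64-byte header layout (feature bitmap, compaction word, six reserved words) mirrors the XSAVE header, and both buffers are ordinary memory, so the user-access concerns the real code has to handle are ignored.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a 64-byte XSAVE header. */
struct toy_xstate_header {
	uint64_t xfeatures;	/* which components are present */
	uint64_t xcomp_bv;	/* must be 0 in the non-compacted user format */
	uint64_t reserved[6];	/* must be 0 */
};

struct toy_xsave {
	struct toy_xstate_header hdr;
	uint8_t payload[64];	/* toy component data */
};

/* Reject headers that carry disallowed features or nonzero reserved fields. */
int toy_validate_header(const struct toy_xstate_header *h, uint64_t allowed)
{
	if (h->xfeatures & ~allowed)
		return -EINVAL;
	if (h->xcomp_bv)
		return -EINVAL;
	for (size_t i = 0; i < 6; i++)
		if (h->reserved[i])
			return -EINVAL;
	return 0;
}

/* Copy only after validation, so a bad source never touches dst. */
int toy_copy_validated(struct toy_xsave *dst, const struct toy_xsave *src,
		       uint64_t allowed)
{
	int ret = toy_validate_header(&src->hdr, allowed);

	if (ret)
		return ret;	/* dst is left exactly as it was */
	memcpy(dst, src, sizeof(*dst));
	return 0;
}

/* Usage example: an invalid source leaves the destination untouched. */
void toy_copy_example(void)
{
	struct toy_xsave dst, src;

	memset(&dst, 0xaa, sizeof(dst));
	memset(&src, 0, sizeof(src));
	src.hdr.xfeatures = 0x8;	/* bit outside the allowed set 0x7 */

	assert(toy_copy_validated(&dst, &src, 0x7) == -EINVAL);
	assert(dst.payload[0] == 0xaa);
}
```

The point of the design is that the caller no longer has to clean up the destination after a failure, because a failure cannot have modified it.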
2021-05-19 | x86/signal: Introduce helpers to get the maximum signal frame size | Chang S. Bae | 1 | -0/+19
Signal frames do not have a fixed format and can vary in size when a number of things change: supported XSAVE features, 32 vs. 64-bit apps, etc. Add support for a runtime method for userspace to dynamically discover how large a signal stack needs to be. Introduce a new variable, max_frame_size, and helper functions for the calculation to be used in a new user interface. Set max_frame_size to a system-wide worst-case value, instead of storing multiple app-specific values. Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Len Brown <len.brown@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H.J. Lu <hjl.tools@gmail.com> Link: https://lkml.kernel.org/r/20210518200320.17239-3-chang.seok.bae@intel.com
2020-07-27 | x86: switch to ->regset_get() | Al Viro | 1 | -1/+2
All instances of ->get() in arch/x86 switched; that might or might not be worth splitting up. Notes: * for xstateregs_get() the amount we want to store is determined at the boot time; see init_xstate_size() and update_regset_xstate_info() for details. task->thread.fpu.state.xsave ends with a flexible array member and the amount of data in it depends upon the FPU features supported/enabled. * fpregs_get() writes slightly less than full ->thread.fpu.state.fsave (the last word is not copied); we pass the full size of state.fsave and let membuf_write() trim to the amount declared by regset - __regset_get() will make sure that the space in buffer is no more than that. * copy_xstate_to_user() and its helpers are gone now. * fpregs_soft_get() was getting user_regset_copyout() arguments wrong. Since "x86: x86 user_regset math_emu" back in 2008... I really doubt that it's worth splitting out for -stable, though - you need a 486SX box for that to trigger... [Kevin's braino fix for copy_xstate_to_kernel() essentially duplicated here] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-06-26 | x86: copy_fpstate_to_sigframe(): have fpregs_soft_get() use kernel buffer | Al Viro | 1 | -6/+6
... then copy_to_user() the results Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-05-16 | x86/fpu/xstate: Restore supervisor states for signal return | Yu-cheng Yu | 1 | -5/+39
The signal return fast path directly restores user states from the user buffer. Once that succeeds, restore supervisor states (but only when they are not yet restored).

For the slow path, save supervisor states to preserve them across context switches, and restore after the user states are restored.

The previous version has the overhead of an XSAVES in both the fast and the slow paths. It is addressed as follows:

- In the fast path, only do an XRSTORS.
- In the slow path, do a supervisor-state-only XSAVES, and relocate the buffer contents.

Some thoughts in the implementation:

- In the slow path, can any supervisor state become stale between save/restore?

  Answer: set_thread_flag(TIF_NEED_FPU_LOAD) protects the xstate buffer.

- In the slow path, can any code reference a stale supervisor state register between save/restore?

  Answer: In the current lazy-restore scheme, any reference to xstate registers needs fpregs_lock()/fpregs_unlock() and __fpregs_load_activate().

- Are there other options?

  One other option is eagerly restoring all supervisor states.

  Currently, CET user-mode states and ENQCMD's PASID do not need to be eagerly restored. The upcoming CET kernel-mode states (24 bytes) need to be eagerly restored. To me, eagerly restoring all supervisor states adds more overhead than benefit at this point.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20200512145444.15483-11-yu-cheng.yu@intel.com
2020-05-16 | x86/fpu/xstate: Preserve supervisor states for the slow path in __fpu__restore_sig() | Yu-cheng Yu | 1 | -25/+28
The signal return code is responsible for taking an XSAVE buffer present in user memory and loading it into the hardware registers. This operation only affects user XSAVE state and never affects supervisor state. The fast path through this code simply points XRSTOR directly at the user buffer. However, since user memory is not guaranteed to be always mapped, this XRSTOR can fail. If it fails, the signal return code falls back to a slow path which can tolerate page faults. That slow path copies the xfeatures one by one out of the user buffer into the task's fpu state area. However, by being in a context where it can handle page faults, the code can also schedule. The lazy-fpu-load code would think it has an up-to-date fpstate and would fail to save the supervisor state when scheduling the task out. When scheduling back in, it would likely restore stale supervisor state. To fix that, preserve supervisor state before the slow path. Modify copy_user_to_fpregs_zeroing() so that if it fails, fpregs are not zeroed, and there is no need for fpregs_deactivate() and supervisor states are preserved. Move set_thread_flag(TIF_NEED_FPU_LOAD) to the slow path. Without doing this, the fast path also needs supervisor states to be saved first. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200512145444.15483-10-yu-cheng.yu@intel.com
2020-05-13 | x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates | Yu-cheng Yu | 1 | -13/+24
The function sanitize_restored_xstate() sanitizes user xstates of an XSAVE buffer by clearing bits not in the input 'xfeatures' from the buffer's header->xfeatures, effectively resetting those features back to the init state. When supervisor xstates are introduced, it is necessary to make sure only user xstates are sanitized. Ensure supervisor bits in header->xfeatures stay set and supervisor states are not modified. To make names clear, also: - Rename the function to sanitize_restored_user_xstate(). - Rename input parameter 'xfeatures' to 'user_xfeatures'. - In __fpu__restore_sig(), rename 'xfeatures' to 'user_xfeatures'. Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lkml.kernel.org/r/20200512145444.15483-7-yu-cheng.yu@intel.com
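The sanitizing rule in this message comes down to a single bitmask expression. A minimal sketch follows; the function name and the bit positions used in the example are made up for illustration, and only the rule itself (clear user bits that were not requested, never clear supervisor bits) is taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the new header->xfeatures value.
 * A bit survives if the caller requested it (user_xfeatures) or if it
 * belongs to a supervisor state (supervisor_mask).
 */
uint64_t sanitize_user_xfeatures(uint64_t header_xfeatures,
				 uint64_t user_xfeatures,
				 uint64_t supervisor_mask)
{
	return header_xfeatures & (user_xfeatures | supervisor_mask);
}

/* Usage example with illustrative bit positions. */
void sanitize_example(void)
{
	const uint64_t supervisor = 1ULL << 11;		/* one supervisor state */
	const uint64_t header = 0x7 | supervisor;	/* three user states + it */

	/* The caller keeps user bits 0 and 1; bit 2 returns to init state. */
	assert(sanitize_user_xfeatures(header, 0x3, supervisor) ==
	       (0x3 | supervisor));
}
```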
2020-05-13 | x86/fpu/xstate: Define new functions for clearing fpregs and xstates | Fenghua Yu | 1 | -2/+2
Currently, fpu__clear() clears all fpregs and xstates. Once XSAVES supervisor states are introduced, supervisor settings (e.g. CET xstates) must remain active for signals; It is necessary to have separate functions: - Create fpu__clear_user_states(): clear only user settings for signals; - Create fpu__clear_all(): clear both user and supervisor settings in flush_thread(). Also modify copy_init_fpstate_to_fpregs() to take a mask from above two functions. Remove obvious side-comment in fpu__clear(), while at it. [ bp: Make the second argument of fpu__clear() bool after requesting it a bunch of times during review. - Add a comment about copy_init_fpstate_to_fpregs() locking needs. ] Co-developed-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200512145444.15483-6-yu-cheng.yu@intel.com
2020-05-13 | x86/fpu/xstate: Separate user and supervisor xfeatures mask | Yu-cheng Yu | 1 | -5/+11
Before the introduction of XSAVES supervisor states, 'xfeatures_mask' is used at various places to determine XSAVE buffer components and XCR0 bits. It contains only user xstates. To support supervisor xstates, it is necessary to separate user and supervisor xstates: - First, change 'xfeatures_mask' to 'xfeatures_mask_all', which represents the full set of bits that should ever be set in a kernel XSAVE buffer. - Introduce xfeatures_mask_supervisor() and xfeatures_mask_user() to extract relevant xfeatures from xfeatures_mask_all. Co-developed-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200512145444.15483-4-yu-cheng.yu@intel.com
2020-05-12 | x86/fpu/xstate: Rename validate_xstate_header() to validate_user_xstate_header() | Fenghua Yu | 1 | -1/+1
The function validate_xstate_header() validates an xstate header coming from userspace (PTRACE or sigreturn). To make it clear, rename it to validate_user_xstate_header(). Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200512145444.15483-2-yu-cheng.yu@intel.com
2020-01-07 | x86/fpu: Deactivate FPU state after failure during state load | Sebastian Andrzej Siewior | 1 | -0/+3
In __fpu__restore_sig(), fpu_fpregs_owner_ctx needs to be reset if the FPU state was not fully restored. Otherwise the following may happen (on the same CPU):

  Task A                             Task B        fpu_fpregs_owner_ctx
  *active*                                         A.fpu
  __fpu__restore_sig()
                                     ctx switch    load B.fpu
                                     *active*      B.fpu
  fpregs_lock()
  copy_user_to_fpregs_zeroing()
    copy_kernel_to_xregs() *modify*
    copy_user_to_xregs() *fails*
  fpregs_unlock()
                                     ctx switch    skip loading B.fpu,
                                     *active*      B.fpu

In the success case, fpu_fpregs_owner_ctx is set to the current task.

In the failure case, the FPU state might have been modified by loading the init state.

In this case, fpu_fpregs_owner_ctx needs to be reset in order to ensure that the FPU state of the following task is loaded from saved state (and not skipped because it was the previous state).

Reset fpu_fpregs_owner_ctx after a failure during restore occurred, to ensure that the FPU state for the next task is always loaded.

The problem was debugged-by Yu-cheng Yu <yu-cheng.yu@intel.com>.

[ bp: Massage commit message. ]

Fixes: 5f409e20b7945 ("x86/fpu: Defer FPU state load until return to userspace")
Reported-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191220195906.plk6kpmsrikvbcfn@linutronix.de
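The failure scenario above can be replayed in a small user-space model. This is a toy under stated assumptions, not kernel code: a single integer stands in for the hardware registers, `toy_owner_ctx` stands in for fpu_fpregs_owner_ctx, and all `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_fpu {
	int saved;			/* FPU state saved in memory */
};

int toy_hw_regs;			/* stand-in for the hardware registers */
struct toy_fpu *toy_owner_ctx;		/* whose state the registers hold */

/* Switch-in: the load is skipped when the registers are believed current. */
void toy_switch_in(struct toy_fpu *fpu)
{
	if (toy_owner_ctx == fpu)
		return;
	toy_hw_regs = fpu->saved;
	toy_owner_ctx = fpu;
}

/* A restore attempt that loads init state into the registers, then fails. */
int toy_failed_restore(bool invalidate_on_failure)
{
	toy_hw_regs = 0;		/* registers were modified */
	if (invalidate_on_failure)
		toy_owner_ctx = NULL;	/* the fix: forget the owner */
	return -1;
}

/* Usage example: replay the sequence with and without the reset. */
void toy_owner_example(void)
{
	struct toy_fpu b = { .saved = 42 };

	/* Without the reset, B resumes on clobbered registers. */
	toy_owner_ctx = NULL;
	toy_switch_in(&b);
	toy_failed_restore(false);
	toy_switch_in(&b);
	assert(toy_hw_regs != b.saved);

	/* With the reset, B's saved state is loaded again. */
	toy_owner_ctx = NULL;
	toy_switch_in(&b);
	toy_failed_restore(true);
	toy_switch_in(&b);
	assert(toy_hw_regs == b.saved);
}
```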
2019-06-08 | x86/fpu: Update kernel's FPU state before using for the fsave header | Sebastian Andrzej Siewior | 1 | -0/+5
In commit 39388e80f9b0c ("x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()") I removed the statement | if (ia32_fxstate) | copy_fxregs_to_kernel(fpu); and argued that it was wrongly merged because the content was already saved in kernel's state. This was wrong: It is required to write it back because it is only saved on the user-stack and save_fsave_header() reads it from task's FPU-state. I missed that part… Save x87 FPU state unless thread's FPU registers are already up to date. Fixes: 39388e80f9b0c ("x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()") Reported-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Eric Biggers <ebiggers@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190607142915.y52mfmgk5lvhll7n@linutronix.de
2019-06-06 | x86/fpu: Use fault_in_pages_writeable() for pre-faulting | Hugh Dickins | 1 | -9/+2
Since commit d9c9ce34ed5c8 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails") get_user_pages_unlocked() pre-faults user's memory if a write generates a page fault while the handler is disabled. This works in general and uncovered a bug as reported by Mike Rapoport¹. It has been pointed out that this function may be fragile and a simple pre-fault as in fault_in_pages_writeable() would be a better solution. Better as in taste and simplicity: that write (as performed by the alternative function) performs exactly the same faulting of memory as before. This was suggested by Hugh Dickins and Andrew Morton. Use fault_in_pages_writeable() for pre-faulting user's stack. [ bigeasy: Write commit message. ] [ bp: Massage some. ] ¹ https://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com Fixes: d9c9ce34ed5c8 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails") Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: linux-mm <linux-mm@kvack.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190529072540.g46j4kfeae37a3iu@linutronix.de Link: https://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com
2019-05-06x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() failsSebastian Andrzej Siewior1-16/+15
In the compacted form, XSAVES may save only the XMM+SSE state but skip FP (x87 state). This is denoted by header->xfeatures = 6. The fastpath (copy_fpregs_to_sigframe()) does that but _also_ initialises the FP state (cwd to 0x37f, mxcsr as we do, remaining fields to 0). The slowpath (copy_xstate_to_user()) leaves most of the FP state untouched. Only mxcsr and mxcsr_flags are set due to xfeatures_mxcsr_quirk(). Now that XFEATURE_MASK_FP is set unconditionally, see 04944b793e18 ("x86: xsave: set FP, SSE bits in the xsave header in the user sigcontext"), on return from the signal, random garbage is loaded as the FP state. Instead of utilizing copy_xstate_to_user(), fault-in the user memory and retry the fast path. Ideally, the fast path succeeds on the second attempt but may be retried again if the memory is swapped out due to memory pressure. If the user memory can not be faulted-in then get_user_pages() returns an error so we don't loop forever. Fault in memory via get_user_pages_unlocked() so copy_fpregs_to_sigframe() succeeds without a fault. Fixes: 69277c98f5eef ("x86/fpu: Always store the registers in copy_fpstate_to_sigframe()") Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de> Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: "linux-mm@kvack.org" <linux-mm@kvack.org> Cc: Qian Cai <cai@lca.pw> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190502171139.mqtegctsg35cir2e@linutronix.de
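The retry scheme described in this commit message (attempt the fast path, fault in the memory on failure, try again, give up if the fault-in itself fails) can be modelled roughly as follows. All names here (`sketch_frame`, `sketch_fast_save()`, `sketch_fault_in()`, `sketch_copy_to_sigframe()`) are hypothetical stand-ins, and residency is simulated with a flag rather than real page tables.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simulated user stack frame (hypothetical model, not kernel code). */
struct sketch_frame {
	bool resident;     /* is the memory currently writable without a fault? */
	bool can_fault_in; /* can it be made resident at all? */
	int  attempts;     /* how often the fast path was tried */
};

/* Fast path: must not fault, so it fails if the memory is not resident. */
static int sketch_fast_save(struct sketch_frame *f)
{
	f->attempts++;
	return f->resident ? 0 : -EFAULT;
}

/* Fault-in step: may fail for good, which terminates the retry loop. */
static int sketch_fault_in(struct sketch_frame *f)
{
	if (!f->can_fault_in)
		return -EFAULT;
	f->resident = true;
	return 0;
}

/* Retry the fast path until it succeeds or the fault-in reports an error. */
static int sketch_copy_to_sigframe(struct sketch_frame *f)
{
	for (;;) {
		if (sketch_fast_save(f) == 0)
			return 0;
		if (sketch_fault_in(f))
			return -EFAULT;
	}
}
```

The loop cannot spin forever in this model because every iteration either succeeds or hits the error return of the fault-in step.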
2019-04-12x86/fpu: Restore regs in copy_fpstate_to_sigframe() in order to use the fastpathSebastian Andrzej Siewior1-12/+13
If a task is scheduled out and receives a signal then it won't be able to take the fastpath because the registers aren't available. The slowpath is more expensive compared to XRSTOR + XSAVE which usually succeeds. Here are some clock_gettime() numbers from a bigger box with AVX512 during bootup: - __fpregs_load_activate() takes 140ns - 350ns. If it was the most recent FPU context on the CPU then the optimisation in __fpregs_load_activate() will skip the load (which was disabled during the test). - copy_fpregs_to_sigframe() takes 200ns - 450ns if it succeeds. On a pagefault it is 1.8us - 3us usually in the 2.6us area. - The slowpath takes 1.5us - 6us. Usually in the 2.6us area. My testcases (including lat_sig) take the fastpath without __fpregs_load_activate(). I expect this to be the majority. Since the slowpath is in the >1us area it makes sense to load the registers and attempt to save them directly. The direct save may fail but should only happen on the first invocation or after fork() while the page is read-only. [ bp: Massage a bit. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-27-bigeasy@linutronix.de
2019-04-12x86/fpu: Add a fastpath to copy_fpstate_to_sigframe()Sebastian Andrzej Siewior1-12/+22
Try to save the FPU registers directly to the userland stack frame if the CPU holds the FPU registers for the current task. This has to be done with the pagefault disabled because we can't fault (while the FPU registers are locked) and therefore the operation might fail. If it fails try the slowpath which can handle faults. [ bp: Massage a bit. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-26-bigeasy@linutronix.de
2019-04-12x86/fpu: Add a fastpath to __fpu__restore_sig()Sebastian Andrzej Siewior1-2/+21
The previous commits refactor the restoration of the FPU registers so that they can be loaded from in-kernel memory. This overhead can be avoided if the load can be performed without a pagefault. Attempt to restore FPU registers by invoking copy_user_to_fpregs_zeroing(). If it fails try the slowpath which can handle pagefaults. [ bp: Add a comment over the fastpath to be able to find one's way around the function. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-25-bigeasy@linutronix.de
2019-04-12x86/fpu: Defer FPU state load until return to userspaceRik van Riel1-19/+30
Defer loading of FPU state until return to userspace. This gives the kernel the potential to skip loading FPU state for tasks that stay in kernel mode, or for tasks that end up with repeated invocations of kernel_fpu_begin() & kernel_fpu_end(). The fpregs_lock/unlock() section ensures that the registers remain unchanged. Otherwise a context switch or a bottom half could save the registers to its FPU context and the processor's FPU registers would become random if modified at the same time. KVM swaps the host/guest registers on entry/exit path. This flow has been kept as is. First it ensures that the registers are loaded and then saves the current (host) state before it loads the guest's registers. The swap is done at the very end with disabled interrupts so it should not change anymore before the guest is entered. The read/save version seems to be cheaper compared to memcpy() in a micro benchmark. Each thread gets TIF_NEED_FPU_LOAD set as part of fork() / fpu__copy(). For kernel threads, this flag never gets cleared which avoids saving / restoring the FPU state for kernel threads and during in-kernel usage of the FPU registers. [ bp: Correct and update commit message and fix checkpatch warnings. s/register/registers/ where it is used in plural. minor comment corrections. remove unused trace_x86_fpu_activate_state() TP. ] Signed-off-by: Rik van Riel <riel@surriel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aubrey Li <aubrey.li@intel.com> Cc: Babu Moger <Babu.Moger@amd.com> Cc: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: Dmitry Safonov <dima@arista.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Nicolai Stange <nstange@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Waiman Long <longman@redhat.com> Cc: x86-ml <x86@kernel.org> Cc: Yi Wang <wang.yi59@zte.com.cn> Link: https://lkml.kernel.org/r/20190403164156.19645-24-bigeasy@linutronix.de
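The benefit of deferring the load can be seen in a small model. This is a sketch under assumptions, not the kernel implementation: `sketch_task`, `sketch_kernel_fpu_begin()`, `sketch_kernel_fpu_end()` and `sketch_return_to_user()` are hypothetical names, and the `need_fpu_load` flag merely plays the role that the message assigns to TIF_NEED_FPU_LOAD.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task bookkeeping for the sketch. */
struct sketch_task {
	bool need_fpu_load; /* stands in for TIF_NEED_FPU_LOAD */
	int  saves;         /* how often the user registers were saved */
	int  loads;         /* how often they were loaded back */
};

/* In-kernel FPU use: save the user registers only if they are still live. */
static void sketch_kernel_fpu_begin(struct sketch_task *t)
{
	if (!t->need_fpu_load) {
		t->saves++;
		t->need_fpu_load = true; /* reload is postponed, not done here */
	}
}

/* Nothing to restore here: the load happens on return to user space. */
static void sketch_kernel_fpu_end(struct sketch_task *t)
{
	(void)t;
}

/* Only this point pays for loading the registers, and only if needed. */
static void sketch_return_to_user(struct sketch_task *t)
{
	if (t->need_fpu_load) {
		t->loads++;
		t->need_fpu_load = false;
	}
}
```

In this model, any number of begin/end pairs between two returns to user space costs one save and one load in total.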
2019-04-12x86/fpu: Merge the two code paths in __fpu__restore_sig()Sebastian Andrzej Siewior1-85/+54
The ia32_fxstate case (32bit with fxsr) and the other (64bit frames or 32bit frames without fxsr) restore both from kernel memory and sanitize the content. The !ia32_fxstate version restores missing xstates from "init state" while the ia32_fxstate doesn't and skips it. Merge the two code paths and keep the !ia32_fxstate one. Copy only the user_i387_ia32_struct data structure in the ia32_fxstate. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-23-bigeasy@linutronix.de
2019-04-12x86/fpu: Restore from kernel memory on the 64-bit path tooSebastian Andrzej Siewior1-13/+49
The 64-bit case (both 64-bit and 32-bit frames) loads the new state from user memory. However, doing this is not desired if the FPU state is going to be restored on return to userland: it would be required to disable preemption in order to avoid a context switch which would set TIF_NEED_FPU_LOAD. If this happens before the restore operation then the loaded registers would become volatile. Furthermore, disabling preemption while accessing user memory requires to disable the pagefault handler. An error during FXRSTOR would then mean that either a page fault occurred (and it would have to be retried with enabled page fault handler) or a #GP occurred because the xstate is bogus (after all, the signal handler can modify it). In order to avoid that mess, copy the FPU state from userland, validate it and then load it. The copy_kernel_…() helpers are basically just like the old helpers except that they operate on kernel memory and the fault handler just sets the error value and the caller handles it. copy_user_to_fpregs_zeroing() and its helpers remain and will be used later for a fastpath optimisation. [ bp: Clarify commit message. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aubrey Li <aubrey.li@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-22-bigeasy@linutronix.de
2019-04-11x86/fpu: Inline copy_user_to_fpregs_zeroing()Sebastian Andrzej Siewior1-1/+19
Start refactoring __fpu__restore_sig() by inlining copy_user_to_fpregs_zeroing(). The original function remains and will be used to restore from userland memory if possible. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-21-bigeasy@linutronix.de
2019-04-11x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOADRik van Riel1-1/+11
The FPU registers need only to be saved if TIF_NEED_FPU_LOAD is not set. Otherwise this has been already done and can be skipped. [ bp: Massage a bit. ] Signed-off-by: Rik van Riel <riel@surriel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-19-bigeasy@linutronix.de
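A minimal sketch of the rule stated above, with hypothetical names (`sketch_cpu`, `sketch_task`, `sketch_save_if_live()`): the in-memory copy is refreshed from the registers only while the registers are the authoritative copy, that is, while the flag standing in for TIF_NEED_FPU_LOAD is clear.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_cpu  { int regs; };                        /* live register contents */
struct sketch_task { bool need_fpu_load; int fpstate; }; /* in-memory copy */

/*
 * Save the registers into the task's in-memory state unless that state
 * is already current. Returns true if a save was performed.
 */
static bool sketch_save_if_live(struct sketch_task *t, const struct sketch_cpu *cpu)
{
	if (t->need_fpu_load)
		return false; /* memory is already up to date; registers are stale */

	t->fpstate = cpu->regs;
	return true;
}
```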
2019-04-11x86/fpu: Always store the registers in copy_fpstate_to_sigframe()Rik van Riel1-5/+14
copy_fpstate_to_sigframe() stores the registers directly to user space. This is okay because the FPU registers are valid and saving them directly avoids saving them into kernel memory and making a copy. However, this cannot be done anymore if the FPU registers are going to be restored on the return to userland. It is possible that the FPU registers will be invalidated in the middle of the save operation and this should be done with disabled preemption / BH. Save the FPU registers to the task's FPU struct and copy them to the user memory later on. Signed-off-by: Rik van Riel <riel@surriel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-18-bigeasy@linutronix.de
2019-04-10x86/fpu: Remove user_fpu_begin()Sebastian Andrzej Siewior1-1/+0
user_fpu_begin() sets fpu_fpregs_owner_ctx to task's fpu struct. This is always the case since there is no lazy FPU anymore. fpu_fpregs_owner_ctx is used during context switch to decide if it needs to load the saved registers or if the currently loaded registers are valid. It could be skipped during a taskA -> kernel thread -> taskA switch because the switch to the kernel thread would not alter the CPU's FPU state. Since this field is always updated during context switch and never invalidated, setting it manually (in user context) makes no difference. A kernel thread with kernel_fpu_begin() block could set fpu_fpregs_owner_ctx to NULL but a kernel thread does not use user_fpu_begin(). This is a leftover from the lazy-FPU time. Remove user_fpu_begin(), it does not change fpu_fpregs_owner_ctx's content. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aubrey Li <aubrey.li@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Nicolai Stange <nstange@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-9-bigeasy@linutronix.de
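The owner check during a context switch can be illustrated with a simplified model. The names `sketch_fpu`, `sketch_cpu` and `sketch_switch_in()` are hypothetical, a kernel thread is represented by a NULL FPU pointer, and any additional conditions the kernel may check are left out.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_fpu { int id; };

/* Hypothetical per-CPU record of whose registers are currently loaded. */
struct sketch_cpu {
	const struct sketch_fpu *owner;
	int loads; /* number of register loads performed */
};

/*
 * Switch to a task. A kernel thread (fpu == NULL in this model) leaves
 * both the registers and the owner untouched, so switching back to the
 * previous user task finds its registers still valid.
 */
static void sketch_switch_in(struct sketch_cpu *cpu, const struct sketch_fpu *fpu)
{
	if (fpu == NULL)
		return;
	if (cpu->owner == fpu)
		return; /* registers already hold this task's state */

	cpu->loads++;
	cpu->owner = fpu;
}
```

With this model, a taskA -> kernel thread -> taskA sequence performs a single load, which is the case the commit message describes.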
2019-04-10x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()Sebastian Andrzej Siewior1-4/+0
In commit 72a671ced66db ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels") the 32bit and 64bit path of the signal delivery code were merged. The 32bit version: int save_i387_xstate_ia32(void __user *buf) … if (cpu_has_xsave) return save_i387_xsave(fp); if (cpu_has_fxsr) return save_i387_fxsave(fp); The 64bit version: int save_i387_xstate(void __user *buf) … if (user_has_fpu()) { if (use_xsave()) err = xsave_user(buf); else err = fxsave_user(buf); if (unlikely(err)) { __clear_user(buf, xstate_size); return err; The merge: int save_xstate_sig(void __user *buf, void __user *buf_fx, int size) … if (user_has_fpu()) { /* Save the live register state to the user directly. */ if (save_user_xstate(buf_fx)) return -1; /* Update the thread's fxstate to save the fsave header. */ if (ia32_fxstate) fpu_fxsave(&tsk->thread.fpu); I don't think that we needed to save the FPU registers to ->thread.fpu because the registers were stored in buf_fx. Today the state will be restored from buf_fx after the signal was handled (I assume that this was also the case with lazy-FPU). Since commit 66463db4fc560 ("x86, fpu: shift drop_init_fpu() from save_xstate_sig() to handle_signal()") it is ensured that the signal handler starts with clear/fresh set of FPU registers which means that the previous store is futile. Remove the copy_fxregs_to_kernel() call because task's FPU state is cleared later in handle_signal() via fpu__clear(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-7-bigeasy@linutronix.de
2019-04-09x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe()Sebastian Andrzej Siewior1-27/+8
With lazy-FPU support the (now named variable) ->initialized was set to true if the CPU's FPU registers were holding a valid state of the FPU registers for the active process. If it was set to false then the FPU state was saved in fpu->state and the FPU was deactivated. With lazy-FPU gone, ->initialized is always true for user threads and kernel threads never call this function so ->initialized is always true in copy_fpstate_to_sigframe(). The using_compacted_format() check is also a leftover from the lazy-FPU time. In the ->initialized == false case copy_to_user() would copy the compacted buffer while userland would expect the non-compacted format instead. So in order to save the FPU state in the non-compacted form it issues XSAVE to save the *current* FPU state. If the FPU is not enabled, the attempt raises the FPU trap, the trap restores the FPU contents and re-enables the FPU and XSAVE is invoked again and succeeds. *This* no longer works since commit bef8b6da9522 ("x86/fpu: Handle #NM without FPU emulation as an error"). Remove the check for ->initialized because it is always true and remove the false condition. Update the comment to reflect that the state is always live. [ bp: Massage. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-6-bigeasy@linutronix.de
2019-04-09x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig()Sebastian Andrzej Siewior1-25/+15
This is a preparation for the removal of the ->initialized member in the fpu struct. __fpu__restore_sig() is deactivating the FPU via fpu__drop() and then setting manually ->initialized followed by fpu__restore(). The result is that it is possible to manipulate fpu->state and the state of registers won't be saved/restored on a context switch which would overwrite fpu->state: fpu__drop(fpu): ... fpu->initialized = 0; preempt_enable(); <--- context switch Don't access the fpu->state while the content is read from user space and examined/sanitized. Use a temporary kmalloc() buffer for the preparation of the FPU registers and once the state is considered okay, load it. Should something go wrong, return with an error and without altering the original FPU registers. The removal of fpu__initialize() is a nop because fpu->initialized is already set for the user task. [ bp: Massage a bit. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-2-bigeasy@linutronix.de
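The "prepare in a temporary buffer, validate, then load" approach can be sketched in user space as below. Everything here is illustrative: `sketch_state`, `sketch_restore()` and `SKETCH_MXCSR_MASK` are hypothetical, `malloc()`/`memcpy()` stand in for `kmalloc()` and the copy from user space, and the single check on reserved MXCSR bits is only an example of a sanity test, using the 0xffbf feature mask value quoted in the newer log entries on this page.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed mask of supported MXCSR bits, for illustration only. */
#define SKETCH_MXCSR_MASK 0xffbfu

struct sketch_state {
	uint32_t mxcsr;
	uint8_t  regs[64];
};

/*
 * Copy the untrusted state into a scratch buffer, validate it there and
 * only then commit it. On any error the live state is left untouched.
 */
static int sketch_restore(struct sketch_state *live, const struct sketch_state *user)
{
	struct sketch_state *tmp = malloc(sizeof(*tmp));
	int ret = 0;

	if (!tmp)
		return -ENOMEM;

	memcpy(tmp, user, sizeof(*tmp)); /* stands in for the copy from user space */

	if (tmp->mxcsr & ~SKETCH_MXCSR_MASK)
		ret = -EINVAL;           /* reserved bits set: reject */
	else
		*live = *tmp;            /* state considered okay: load it */

	free(tmp);
	return ret;
}
```

Validating the private copy rather than the caller's buffer also means the checked bytes are exactly the bytes that get loaded.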
2019-04-03x86/fpu: Fix __user annotationsJann Horn1-3/+3
In save_xstate_epilog(), use __user when type-casting userspace pointers. In setup_sigcontext() and x32_setup_rt_frame(), cast the userspace pointers to 'unsigned long __user *' before writing into them. These pointers are originally '__u32 __user *' or '__u64 __user *', causing sparse to complain when a userspace pointer is written into them. The casts are okay because the pointers always point to pointer-sized values. Thanks to Luc Van Oostenryck and Al Viro for explaining this to me. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mukesh Ojha <mojha@codeaurora.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190329214652.258477-3-jannh@google.com
2019-01-04Remove 'type' argument from access_ok() functionLinus Torvalds1-2/+2
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted in this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
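What remains once the 'type' argument is gone is a pure range check. The following is a hedged user-space sketch of such a check, not the kernel's access_ok(): `sketch_access_ok()` and its explicit `limit` parameter are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical range check: is [addr, addr + size) entirely at or below
 * 'limit'? Written so that addr + size is never computed, which avoids
 * integer overflow for large inputs. Note that no read/write 'type' is
 * needed to answer the question.
 */
static bool sketch_access_ok(uintptr_t addr, size_t size, uintptr_t limit)
{
	return size <= limit && addr <= limit - size;
}
```

A call such as `sketch_access_ok(ptr, len, limit)` answers the same question for reads and writes alike, which is the point of dropping the argument.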