kernel/linux.git/arch/powerpc/lib, branch linux-5.9.y

powerpc/64s: flush L1D after user accesses

2020-11-22T09:15:32+00:00

commit 9a32a7e78bd0cd9a9b6332cbdc345ee5ffd0c5de upstream. IBM Power9 processors can speculatively operate on data in the L1 cache before it has been completely validated, via a way-prediction mechanism. It is not possible for an attacker to determine the contents of impermissible memory using this method, since these systems implement a combination of hardware and software security measures to prevent scenarios where protected data could be leaked. However these measures don't address the scenario where an attacker induces the operating system to speculatively execute instructions using data that the attacker controls. This can be used for example to speculatively bypass "kernel user access prevention" techniques, as discovered by Anthony Steinhauser of Google's Safeside Project. This is not an attack by itself, but there is a possibility it could be used in conjunction with side-channels or other weaknesses in the privileged code to construct an attack. This issue can be mitigated by flushing the L1 cache between privilege boundaries of concern. This patch flushes the L1 cache after user accesses. This is part of the fix for CVE-2020-4788. Signed-off-by: Nicholas Piggin Signed-off-by: Daniel Axtens Signed-off-by: Greg Kroah-Hartman

powerpc/64s: flush L1D on kernel entry

2020-11-22T09:15:32+00:00

commit f79643787e0a0762d2409b7b8334e83f22d85695 upstream. IBM Power9 processors can speculatively operate on data in the L1 cache before it has been completely validated, via a way-prediction mechanism. It is not possible for an attacker to determine the contents of impermissible memory using this method, since these systems implement a combination of hardware and software security measures to prevent scenarios where protected data could be leaked. However these measures don't address the scenario where an attacker induces the operating system to speculatively execute instructions using data that the attacker controls. This can be used for example to speculatively bypass "kernel user access prevention" techniques, as discovered by Anthony Steinhauser of Google's Safeside Project. This is not an attack by itself, but there is a possibility it could be used in conjunction with side-channels or other weaknesses in the privileged code to construct an attack. This issue can be mitigated by flushing the L1 cache between privilege boundaries of concern. This patch flushes the L1 cache on kernel entry. This is part of the fix for CVE-2020-4788. Signed-off-by: Nicholas Piggin Signed-off-by: Daniel Axtens Signed-off-by: Greg Kroah-Hartman

x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()

2020-11-04T17:09:05+00:00

commit ec6347bb43395cb92126788a1a5b25302543f815 upstream. In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams Signed-off-by: Borislav Petkov Reviewed-by: Tony Luck Acked-by: Michael Ellerman Cc: Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Greg Kroah-Hartman

Revert "x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()"

2020-11-04T17:08:28+00:00

This reverts commit a85748ed9eb70108f9605558f2754ca94ee91401 which is commit ec6347bb43395cb92126788a1a5b25302543f815 upstream. We had a mistake when merging a later patch in this series due to some file movements, so revert this change for now, as we will add it back in a later commit. Signed-off-by: Greg Kroah-Hartman

x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()

2020-11-01T11:47:03+00:00

powerpc/test_emulate_step: Add testcases for divde[.] and divdeu[.] instructions

2020-07-29T13:47:52+00:00

Add testcases for divde, divde., divdeu, divdeu. emulated instructions to cover few scenarios, - with same dividend and divisor to have undefine RT for divdeu[.] - with divide by zero to have undefine RT for both divde[.] and divdeu[.] - with negative dividend to cover -|divisor| < r <= 0 if the dividend is negative for divde[.] - normal case with proper dividend and divisor for both divde[.] and divdeu[.] Signed-off-by: Balamuruhan S Reviewed-by: Sandipan Das Acked-by: Naveen N. Rao Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200728130308.1790982-4-bala24@linux.ibm.com

powerpc/sstep: Add support for divde[.] and divdeu[.] instructions

2020-07-29T13:47:52+00:00

This patch adds emulation support for divde, divdeu instructions, - Divide Doubleword Extended (divde[.]) - Divide Doubleword Extended Unsigned (divdeu[.]) Signed-off-by: Balamuruhan S Reviewed-by: Sandipan Das Acked-by: Naveen N. Rao Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200728130308.1790982-3-bala24@linux.ibm.com

powerpc/lib: remove memcpy_flushcache redundant return

2020-07-26T14:01:31+00:00

Align it with other architectures and none of the callers has been interested its return Signed-off-by: Li RongQing Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/1556278590-14727-1-git-send-email-lirongqing@baidu.com

powerpc/lib: Prepare code-patching for modules allocated outside vmalloc space

2020-07-26T14:01:30+00:00

Use is_vmalloc_or_module_addr() instead of is_vmalloc_addr() Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/7d884db0e5a6f521331639d8c0f13e520d5a4fef.1593428200.git.christophe.leroy@csgroup.eu

powerpc/64s: Implement queued spinlocks and rwlocks

2020-07-26T14:01:23+00:00

These have shown significantly improved performance and fairness when spinlock contention is moderate to high on very large systems. With this series including subsequent patches, on a 16 socket 1536 thread POWER9, a stress test such as same-file open/close from all CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs 384158op/s (33x faster), where the difference in throughput between the fastest and slowest thread goes from 7x to 1.4x. Thanks to the fast path being identical in terms of atomics and barriers (after a subsequent optimisation patch), single threaded performance is not changed (no measurable difference). On smaller systems, performance and fairness seems to be generally improved. Using dbench on tmpfs as a test (that starts to run into kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was tested with bare metal and KVM guest configurations. Results can be found here: https://github.com/linuxppc/issues/issues/305#issuecomment-663487453 Observations are: - Queued spinlocks are equal when contention is insignificant, as expected and as measured with microbenchmarks. - When there is contention, on bare metal queued spinlocks have better throughput and max latency at all points. - When virtualised, queued spinlocks are slightly worse approaching peak throughput, but significantly better throughput and max latency at all points beyond peak, until queued spinlock maximum latency rises when clients are 2x vCPUs. The regressions haven't been analysed very well yet, there are a lot of things that can be tuned, particularly the paravirtualised locking, but the numbers already look like a good net win even on relatively small systems. Signed-off-by: Nicholas Piggin Acked-by: Peter Zijlstra (Intel) Acked-by: Waiman Long Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com