summaryrefslogtreecommitdiff
path: root/arch/powerpc/include
AgeCommit message (Collapse)AuthorFilesLines
2020-11-01x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()Dan Williams2-16/+26
commit ec6347bb43395cb92126788a1a5b25302543f815 upstream. In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <stable@vger.kernel.org> Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-29powerpc/64s/radix: Fix mm_cpumask trimming race vs kthread_use_mmNicholas Piggin1-13/+0
[ Upstream commit a665eec0a22e11cdde708c1c256a465ebe768047 ] Commit 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask") added a mechanism to trim the mm_cpumask of a process under certain conditions. One of the assumptions is that mm_users would not be incremented via a reference outside the process context with mmget_not_zero() then go on to kthread_use_mm() via that reference. That invariant was broken by io_uring code (see previous sparc64 fix), but I'll point Fixes: to the original powerpc commit because we are changing that assumption going forward, so this will make backports match up. Fix this by no longer relying on that assumption, but by having each CPU check the mm is not being used, and clearing their own bit from the mask only if it hasn't been switched-to by the time the IPI is processed. This relies on commit 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB invalidate") and ARCH_WANT_IRQS_OFF_ACTIVATE_MM to disable irqs over mm switch sequences. Fixes: 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> Depends-on: 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB invalidate") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200914045219.3736466-5-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29powerpc/tau: Use appropriate temperature sample intervalFinn Thain1-1/+1
[ Upstream commit 66943005cc41f48e4d05614e8f76c0ca1812f0fd ] According to the MPC750 Users Manual, the SITV value in Thermal Management Register 3 is 13 bits long. The present code calculates the SITV value as 60 * 500 cycles. This would overflow to give 10 us on a 500 MHz CPU rather than the intended 60 us. (But according to the Microprocessor Datasheet, there is also a factor of 266 that has to be applied to this value on certain parts i.e. speed sort above 266 MHz.) Always use the maximum cycle count, as recommended by the Datasheet. Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/896f542e5f0f1d6cf8218524c2b67d79f3d69b3c.1599260540.git.fthain@telegraphics.com.au Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29powerpc/book3s64/hash/4k: Support large linear mapping range with 4KAneesh Kumar K.V1-7/+6
[ Upstream commit 7746406baa3bc9e23fdd7b7da2f04d86e25ab837 ] With commit: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range"), we now split the 64TB address range into 4 contexts each of 16TB. That implies we can do only 16TB linear mapping. On some systems, eg. Power9, memory attached to nodes > 0 will appear above 16TB in the linear mapping. This resulted in kernel crash when we boot such systems in hash translation mode with 4K PAGE_SIZE. This patch updates the kernel mapping such that we now start supporting upto 61TB of memory with 4K. The kernel mapping now looks like below 4K PAGE_SIZE and hash translation. vmalloc start = 0xc0003d0000000000 IO start = 0xc0003e0000000000 vmemmap start = 0xc0003f0000000000 Our MAX_PHYSMEM_BITS for 4K is still 64TB even though we can only map 61TB. We prevent bolt mapping anything outside 61TB range by checking against H_VMALLOC_START. Fixes: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range") Reported-by: Cameron Berkenpas <cam@neo-zeon.de> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200608070904.387440-3-aneesh.kumar@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29powerpc/watchpoint: Fix quadword instruction handling on p10 predecessorsRavi Bangoria1-0/+1
[ Upstream commit 4759c11ed20454b7b36db4ec15f7d5aa1519af4a ] On p10 predecessors, watchpoint with quadword access is compared at quadword length. If the watch range is doubleword or less than that in a first half of quadword aligned 16 bytes, and if there is any unaligned quadword access which will access only the 2nd half, the handler should consider it as extraneous and emulate/single-step it before continuing. Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than one watchpoint") Reported-by: Pedro Miraglia Franco de Carvalho <pedromfc@linux.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200902042945.129369-2-ravi.bangoria@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memoryThiago Jung Bauermann1-0/+4
[ Upstream commit eae9eec476d13fad9af6da1f44a054ee02b7b161 ] POWER secure guests (i.e., guests which use the Protected Execution Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but they don't need the SWIOTLB memory to be in low addresses since the hypervisor doesn't have any addressing limitation. This solves a SWIOTLB initialization problem we are seeing in secure guests with 128 GB of RAM: they are configured with 4 GB of crashkernel reserved memory, which leaves no space for SWIOTLB in low addresses. To do this, we use mostly the same code as swiotlb_init(), but allocate the buffer using memblock_alloc() instead of memblock_alloc_low(). Fixes: 2efbc58f157a ("powerpc/pseries/svm: Force SWIOTLB for secure guests") Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200818221126.391073-1-bauerman@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29pseries/drmem: don't cache node id in drmem_lmb structScott Cheloha1-21/+0
[ Upstream commit e5e179aa3a39c818db8fbc2dce8d2cd24adaf657 ] At memory hot-remove time we can retrieve an LMB's nid from its corresponding memory_block. There is no need to store the nid in multiple locations. Note that lmb_to_memblock() uses find_memory_block() to get the corresponding memory_block. As find_memory_block() runs in sub-linear time this approach is negligibly slower than what we do at present. In exchange for this lookup at hot-remove time we no longer need to call memory_add_physaddr_to_nid() during drmem_init() for each LMB. On powerpc, memory_add_physaddr_to_nid() is a linear search, so this spares us an O(n^2) initialization during boot. On systems with many LMBs that initialization overhead is palpable and disruptive. For example, on a box with 249854 LMBs we're seeing drmem_init() take upwards of 30 seconds to complete: [ 53.721639] drmem: initializing drmem v2 [ 80.604346] watchdog: BUG: soft lockup - CPU#65 stuck for 23s! [swapper/0:1] [ 80.604377] Modules linked in: [ 80.604389] CPU: 65 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc2+ #4 [ 80.604397] NIP: c0000000000a4980 LR: c0000000000a4940 CTR: 0000000000000000 [ 80.604407] REGS: c0002dbff8493830 TRAP: 0901 Not tainted (5.6.0-rc2+) [ 80.604412] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44000248 XER: 0000000d [ 80.604431] CFAR: c0000000000a4a38 IRQMASK: 0 [ 80.604431] GPR00: c0000000000a4940 c0002dbff8493ac0 c000000001904400 c0003cfffffede30 [ 80.604431] GPR04: 0000000000000000 c000000000f4095a 000000000000002f 0000000010000000 [ 80.604431] GPR08: c0000bf7ecdb7fb8 c0000bf7ecc2d3c8 0000000000000008 c00c0002fdfb2001 [ 80.604431] GPR12: 0000000000000000 c00000001e8ec200 [ 80.604477] NIP [c0000000000a4980] hot_add_scn_to_nid+0xa0/0x3e0 [ 80.604486] LR [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 [ 80.604492] Call Trace: [ 80.604498] [c0002dbff8493ac0] [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 (unreliable) [ 80.604509] [c0002dbff8493b20] [c000000000087c10] memory_add_physaddr_to_nid+0x20/0x60 [ 80.604521] [c0002dbff8493b40] [c0000000010d4880] drmem_init+0x25c/0x2f0 [ 80.604530] [c0002dbff8493c10] [c000000000010154] do_one_initcall+0x64/0x2c0 [ 80.604540] [c0002dbff8493ce0] [c0000000010c4aa0] kernel_init_freeable+0x2d8/0x3a0 [ 80.604550] [c0002dbff8493db0] [c000000000010824] kernel_init+0x2c/0x148 [ 80.604560] [c0002dbff8493e20] [c00000000000b648] ret_from_kernel_thread+0x5c/0x74 [ 80.604567] Instruction dump: [ 80.604574] 392918e8 e9490000 e90a000a e92a0000 80ea000c 1d080018 3908ffe8 7d094214 [ 80.604586] 7fa94040 419d00dc e9490010 714a0088 <2faa0008> 409e00ac e9490000 7fbe5040 [ 89.047390] drmem: 249854 LMB(s) With a patched kernel on the same machine we're no longer seeing the soft lockup. drmem_init() now completes in negligible time, even when the LMB count is large. Fixes: b2d3b5ee66f2 ("powerpc/pseries: Track LMB nid instead of using device tree") Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811015115.63677-1-cheloha@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29powerpc/pseries: explicitly reschedule during drmem_lmb list traversalNathan Lynch1-1/+17
[ Upstream commit 9d6792ffe140240ae54c881cc4183f9acc24b4df ] The drmem lmb list can have hundreds of thousands of entries, and unfortunately lookups take the form of linear searches. As long as this is the case, traversals have the potential to monopolize the CPU and provoke lockup reports, workqueue stalls, and the like unless they explicitly yield. Rather than placing cond_resched() calls within various for_each_drmem_lmb() loop blocks in the code, put it in the iteration expression of the loop macro itself so users can't omit it. Introduce a drmem_lmb_next() iteration helper function which calls cond_resched() at a regular interval during array traversal. Each iteration of the loop in DLPAR code paths can involve around ten RTAS calls which can each take up to 250us, so this ensures the check is performed at worst every few milliseconds. Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200813151131.2070161-1-nathanl@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-23powerpc/book3s64/radix: Fix boot failure with large amount of guest memoryAneesh Kumar K.V1-5/+5
[ Upstream commit 103a8542cb35b5130f732d00b0419a594ba1b517 ] If the hypervisor doesn't support hugepages, the kernel ends up allocating a large number of page table pages. The early page table allocation was wrongly setting the max memblock limit to ppc64_rma_size with radix translation which resulted in boot failure as shown below. Kernel panic - not syncing: early_alloc_pgtable: Failed to allocate 16777216 bytes align=0x1000000 nid=-1 from=0x0000000000000000 max_addr=0xffffffffffffffff CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-24.9-default+ #2 Call Trace: [c0000000016f3d00] [c0000000007c6470] dump_stack+0xc4/0x114 (unreliable) [c0000000016f3d40] [c00000000014c78c] panic+0x164/0x418 [c0000000016f3dd0] [c000000000098890] early_alloc_pgtable+0xe0/0xec [c0000000016f3e60] [c0000000010a5440] radix__early_init_mmu+0x360/0x4b4 [c0000000016f3ef0] [c000000001099bac] early_init_mmu+0x1c/0x3c [c0000000016f3f10] [c00000000109a320] early_setup+0x134/0x170 This was because the kernel was checking for the radix feature before we enable the feature via mmu_features. This resulted in the kernel using hash restrictions on radix. Rework the early init code such that the kernel boot with memblock restrictions as imposed by hash. At that point, the kernel still hasn't finalized the translation the kernel will end up using. We have three different ways of detecting radix. 1. dt_cpu_ftrs_scan -> used only in case of PowerNV 2. ibm,pa-features -> Used when we don't use cpu_dt_ftr_scan 3. CAS -> Where we negotiate with hypervisor about the supported translation. We look at 1 or 2 early in the boot and after that, we look at the CAS vector to finalize the translation the kernel will use. We also support a kernel command line option (disable_radix) to switch to hash. Update the memblock limit after mmu_early_init_devtree() if the kernel is going to use radix translation. This forces some of the memblock allocations we do before mmu_early_init_devtree() to be within the RMA limit. Fixes: 2bfd65e45e87 ("powerpc/mm/radix: Add radix callbacks for early init routines") Reported-by: Shirisha Ganta <shiganta@in.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200828100852.426575-1-aneesh.kumar@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()Will Deacon1-1/+2
commit fdfe7cbd58806522e799e2a50a15aee7f2cbb7b6 upstream. The 'flags' field of 'struct mmu_notifier_range' is used to indicate whether invalidate_range_{start,end}() are permitted to block. In the case of kvm_mmu_notifier_invalidate_range_start(), this field is not forwarded on to the architecture-specific implementation of kvm_unmap_hva_range() and therefore the backend cannot sensibly decide whether or not to block. Add an extra 'flags' parameter to kvm_unmap_hva_range() so that architectures are aware as to whether or not they are permitted to block. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20200811102725.7121-2-will@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-26powerpc/fixmap: Fix the size of the early debug areaChristophe Leroy1-1/+1
[ Upstream commit fdc6edbb31fba76fd25d7bd016b675a92908d81e ] Commit ("03fd42d458fb powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when page size is 256k") reworked the setup of the early debug area and mistakenly replaced 128 * 1024 by SZ_128. Change to SZ_128K to restore the original 128 kbytes size of the area. Fixes: 03fd42d458fb ("powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when page size is 256k") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/996184974d674ff984643778cf1cdd7fe58cc065.1597644194.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21powerpc: Fix circular dependency between percpu.h and mmu.hMichael Ellerman1-2/+2
commit 0c83b277ada72b585e6a3e52b067669df15bcedb upstream. Recently random.h started including percpu.h (see commit f227e3ec3b5c ("random32: update the net random state on interrupt and activity")), which broke corenet64_smp_defconfig: In file included from /linux/arch/powerpc/include/asm/paca.h:18, from /linux/arch/powerpc/include/asm/percpu.h:13, from /linux/include/linux/random.h:14, from /linux/lib/uuid.c:14: /linux/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type name 'next_tlbcam_idx' 139 | DECLARE_PER_CPU(int, next_tlbcam_idx); This is due to a circular header dependency: asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which includes asm/mmu.h Which means DECLARE_PER_CPU() isn't defined when mmu.h needs it. We can fix it by moving the include of paca.h below the include of asm-generic/percpu.h. This moves the include of paca.h out of the #ifdef __powerpc64__, but that is OK because paca.h is almost entirely inside #ifdef CONFIG_PPC64 anyway. It also moves the include of paca.h out of the #ifdef CONFIG_SMP, which could possibly break something, but seems to have no ill effects. Fixes: f227e3ec3b5c ("random32: update the net random state on interrupt and activity") Cc: stable@vger.kernel.org # v5.8 Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200804130558.292328-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-08-19powerpc/32s: Fix CONFIG_BOOK3S_601 usesMichael Ellerman2-2/+2
[ Upstream commit df4d4ef22446b3a789a4efd74d34f2ec1e24deb2 ] We have two uses of CONFIG_BOOK3S_601, which doesn't exist. Fix them to use CONFIG_PPC_BOOK3S_601 which is the correct symbol. Fixes: 12c3f1fd87bf ("powerpc/32s: get rid of CPU_FTR_601 feature") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200724131728.1643966-5-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19powerpc/perf: Fix missing is_sier_aviable() during buildMadhavan Srinivasan1-0/+2
[ Upstream commit 3c9450c053f88e525b2db1e6990cdf34d14e7696 ] Compilation error: arch/powerpc/perf/perf_regs.c:80:undefined reference to `.is_sier_available' Currently is_sier_available() is part of core-book3s.c, which is added to build based on CONFIG_PPC_PERF_CTRS. A config with CONFIG_PERF_EVENTS and without CONFIG_PPC_PERF_CTRS will have a build break because of missing is_sier_available(). In practice it only breaks when CONFIG_FSL_EMB_PERF_EVENT=n because that also guards the usage of is_sier_available(). That only happens with CONFIG_PPC_BOOK3E_64=y and CONFIG_FSL_SOC_BOOKE=n. Patch adds is_sier_available() in asm/perf_event.h to fix the build break for configs missing CONFIG_PPC_PERF_CTRS. Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> [mpe: Add detail about CONFIG_FSL_SOC_BOOKE] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200614083604.302611-1-maddy@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19powerpc/rtas: don't online CPUs for partition suspendNathan Lynch1-2/+0
[ Upstream commit ec2fc2a9e9bbad9023aab65bc472ce7a3ca8608f ] Partition suspension, used for hibernation and migration, requires that the OS place all but one of the LPAR's processor threads into one of two states prior to calling the ibm,suspend-me RTAS function: * the architected offline state (via RTAS stop-self); or * the H_JOIN hcall, which does not return until the partition resumes execution Using H_CEDE as the offline mode, introduced by commit 3aa565f53c39 ("powerpc/pseries: Add hooks to put the CPU into an appropriate offline state"), means that any threads which are offline from Linux's point of view must be moved to one of those two states before a partition suspension can proceed. This was eventually addressed in commit 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation"), which added code to temporarily bring up any offline processor threads so they can call H_JOIN. Conceptually this is fine, but the implementation has had multiple races with cpu hotplug operations initiated from user space[1][2][3], the error handling is fragile, and it generates user-visible cpu hotplug events which is a lot of noise for a platform feature that's supposed to minimize disruption to workloads. With commit 3aa565f53c39 ("powerpc/pseries: Add hooks to put the CPU into an appropriate offline state") reverted, this code becomes unnecessary, so remove it. Since any offline CPUs now are truly offline from the platform's point of view, it is no longer necessary to bring up CPUs only to have them call H_JOIN and then go offline again upon resuming. Only active threads are required to call H_JOIN; stopped threads can be left alone. [1] commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization during LPM") [2] commit 9fb603050ffd ("powerpc/rtas: retry when cpu offline races with suspend/migration") [3] commit dfd718a2ed1f ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration") Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200612051238.1007764-3-nathanl@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when page size is 256kChristophe Leroy1-1/+1
[ Upstream commit 03fd42d458fb9cb69e712600bd69ff77ff3a45a8 ] FIX_EARLY_DEBUG_BASE reserves a 128k area for debuging. When page size is 256k, the calculation results in a 0 number of pages, leading to the following failure: CC arch/powerpc/kernel/asm-offsets.s In file included from ./arch/powerpc/include/asm/nohash/32/pgtable.h:77:0, from ./arch/powerpc/include/asm/nohash/pgtable.h:8, from ./arch/powerpc/include/asm/pgtable.h:20, from ./include/linux/pgtable.h:6, from ./arch/powerpc/include/asm/kup.h:42, from ./arch/powerpc/include/asm/uaccess.h:9, from ./include/linux/uaccess.h:11, from ./include/linux/crypto.h:21, from ./include/crypto/hash.h:11, from ./include/linux/uio.h:10, from ./include/linux/socket.h:8, from ./include/linux/compat.h:15, from arch/powerpc/kernel/asm-offsets.c:14: ./arch/powerpc/include/asm/fixmap.h:75:2: error: overflow in enumeration values __end_of_permanent_fixed_addresses, ^ make[2]: *** [arch/powerpc/kernel/asm-offsets.s] Error 1 Ensure the debug area is at least one page. Fixes: b8e8efaa8639 ("powerpc: reserve fixmap entries for early debug") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ca8c9f8249f523b1fab873e67b81b11989d46553.1592207216.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-11Revert "powerpc/kasan: Fix shadow pages allocation failure"Christophe Leroy1-0/+2
commit b506923ee44ae87fc9f4de16b53feb313623e146 upstream. This reverts commit d2a91cef9bbdeb87b7449fdab1a6be6000930210. This commit moved too much work in kasan_init(). The allocation of shadow pages has to be moved for the reason explained in that patch, but the allocation of page tables still need to be done before switching to the final hash table. First revert the incorrect commit, following patch redoes it properly. Fixes: d2a91cef9bbd ("powerpc/kasan: Fix shadow pages allocation failure") Cc: stable@vger.kernel.org Reported-by: Erhard F. <erhard_f@mailbox.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://bugzilla.kernel.org/show_bug.cgi?id=208181 Link: https://lore.kernel.org/r/3667deb0911affbf999b99f87c31c77d5e870cd2.1593690707.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-15powerpc/vas: Report proper error code for address translation failureHaren Myneni1-0/+2
P9 DD2 NX workbook (Table 4-36) says DMA controller uses CC=5 internally for translation fault handling. NX reserves CC=250 for OS to notify user space when NX encounters address translation failure on the request buffer. Not an issue in earlier releases as NX does not get faults on kernel addresses. This patch defines CSB_CC_FAULT_ADDRESS(250) and updates CSB.CC with this proper error code for user space. Fixes: c96c4436aba4 ("powerpc/vas: Update CSB and notify process for fault CRBs") Signed-off-by: Haren Myneni <haren@linux.ibm.com> [mpe: Added Fixes tag and fix typo in comment] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/019fd53e7538c6f8f332d175df74b1815ef5aa8c.camel@linux.ibm.com
2020-06-21Merge tag 'powerpc-5.8-3' of ↵Linus Torvalds1-5/+13
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - One fix for the interrupt rework we did last release which broke KVM-PR - Three commits fixing some fallout from the READ_ONCE() changes interacting badly with our 8xx 16K pages support, which uses a pte_t that is a structure of 4 actual PTEs - A cleanup of the 8xx pte_update() to use the newly added pmd_off() - A fix for a crash when handling an oops if CONFIG_DEBUG_VIRTUAL is enabled - A minor fix for the SPU syscall generation Thanks to Aneesh Kumar K.V, Christian Zigotzky, Christophe Leroy, Mike Rapoport, Nicholas Piggin. * tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/8xx: Provide ptep_get() with 16k pages mm: Allow arches to provide ptep_get() mm/gup: Use huge_ptep_get() in gup_hugepte() powerpc/syscalls: Use the number when building SPU syscall table powerpc/8xx: use pmd_off() to access a PMD entry in pte_update() powerpc/64s: Fix KVM interrupt using wrong save area powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL
2020-06-20Merge tag 'libnvdimm-for-5.8-rc2' of ↵Linus Torvalds1-0/+132
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: "A feature (papr_scm health retrieval) and a fix (sysfs attribute visibility) for v5.8. Vaibhav explains in the merge commit below why missing v5.8 would be painful and I agreed to try a -rc2 pull because only cosmetics kept this out of -rc1 and his initial versions were posted in more than enough time for v5.8 consideration: 'These patches are tied to specific features that were committed to customers in upcoming distros releases (RHEL and SLES) whose time-lines are tied to 5.8 kernel release. Being able to track the health of an nvdimm is critical for our customers that are running workloads leveraging papr-scm nvdimms. Missing the 5.8 kernel would mean missing the distro timelines and shifting forward the availability of this feature in distro kernels by at least 6 months' Summary: - Fix the visibility of the region 'align' attribute. The new unit tests for region alignment handling caught a corner case where the alignment cannot be specified if the region is converted from static to dynamic provisioning at runtime. - Add support for device health retrieval for the persistent memory supported by the papr_scm driver. This includes both the standard sysfs "health flags" that the nfit persistent memory driver publishes and a mechanism for the ndctl tool to retrieve a health-command payload" * tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm/region: always show the 'align' attribute powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl() powerpc/papr_scm: Fetch nvdimm health information from PHYP seq_buf: Export seq_buf_printf powerpc: Document details on H_SCM_HEALTH hcall
2020-06-20powerpc/8xx: Provide ptep_get() with 16k pagesChristophe Leroy1-0/+10
READ_ONCE() now enforces atomic read, which leads to: CC mm/gup.o In file included from ./include/linux/kernel.h:11:0, from mm/gup.c:2: In function 'gup_hugepte.constprop', inlined from 'gup_huge_pd.isra.79' at mm/gup.c:2465:8: ./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_222' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE(). _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert' prefix ## suffix(); \ ^ ./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ ./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type' compiletime_assert_rwonce_type(x); \ ^ mm/gup.c:2428:8: note: in expansion of macro 'READ_ONCE' pte = READ_ONCE(*ptep); ^ In function 'gup_get_pte', inlined from 'gup_pte_range' at mm/gup.c:2228:9, inlined from 'gup_pmd_range' at mm/gup.c:2613:15, inlined from 'gup_pud_range' at mm/gup.c:2641:15, inlined from 'gup_p4d_range' at mm/gup.c:2666:15, inlined from 'gup_pgd_range' at mm/gup.c:2694:15, inlined from 'internal_get_user_pages_fast' at mm/gup.c:2795:3: ./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_219' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE(). _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert' prefix ## suffix(); \ ^ ./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ ./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert' compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ ^ ./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type' compiletime_assert_rwonce_type(x); \ ^ mm/gup.c:2199:9: note: in expansion of macro 'READ_ONCE' return READ_ONCE(*ptep); ^ make[2]: *** [mm/gup.o] Error 1 Define ptep_get() on 8xx when using 16k pages. Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/341688399c1b102756046d19ea6ce39db1ae4742.1592225558.git.christophe.leroy@csgroup.eu
2020-06-18maccess: make get_kernel_nofault() check for minimal type compatibilityLinus Torvalds1-1/+1
Now that we've renamed probe_kernel_address() to get_kernel_nofault() and made it look and behave more in line with get_user(), some of the subtle type behavior differences end up being more obvious and possibly dangerous. When you do get_user(val, user_ptr); the type of the access comes from the "user_ptr" part, and the above basically acts as val = *user_ptr; by design (except, of course, for the fact that the actual dereference is done with a user access). Note how in the above case, the type of the end result comes from the pointer argument, and then the value is cast to the type of 'val' as part of the assignment. So the type of the pointer is ultimately the more important type both for the access itself. But 'get_kernel_nofault()' may now _look_ similar, but it behaves very differently. When you do get_kernel_nofault(val, kernel_ptr); it behaves like val = *(typeof(val) *)kernel_ptr; except, of course, for the fact that the actual dereference is done with exception handling so that a faulting access is suppressed and returned as the error code. But note how different the casting behavior of the two superficially similar accesses are: one does the actual access in the size of the type the pointer points to, while the other does the access in the size of the target, and ignores the pointer type entirely. Actually changing get_kernel_nofault() to act like get_user() is almost certainly the right thing to do eventually, but in the meantime this patch adds logit to at least verify that the pointer type is compatible with the type of the result. In many cases, this involves just casting the pointer to 'void *' to make it obvious that the type of the pointer is not the important part. It's not how 'get_user()' acts, but at least the behavioral difference is now obvious and explicit. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-18maccess: rename probe_kernel_address to get_kernel_nofaultChristoph Hellwig1-1/+1
Better describe what this helper does, and match the naming of copy_from_kernel_nofault. Also switch the argument order around, so that it acts and looks like get_user(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17powerpc/8xx: use pmd_off() to access a PMD entry in pte_update()Mike Rapoport1-5/+3
The pte_update() implementation for PPC_8xx unfolds page table from the PGD level to access a PMD entry. Since 8xx has only 2-level page table this can be simplified with pmd_off() shortcut. Replace explicit unfolding with pmd_off() and drop defines of pgd_index() and pgd_offset() that are no longer needed. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200615092229.23142-1-rppt@kernel.org
2020-06-16powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTHVaibhav Jain1-0/+37
This patch implements support for PDSM request 'PAPR_PDSM_HEALTH' that returns a newly introduced 'struct nd_papr_pdsm_health' instance containing dimm health information back to user space in response to ND_CMD_CALL. This functionality is implemented in newly introduced papr_pdsm_health() that queries the nvdimm health information and then copies this information to the package payload whose layout is defined by 'struct nd_papr_pdsm_health'. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20200615124407.32596-7-vaibhav@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-06-16ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methodsVaibhav Jain1-0/+95
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm module and add the command family NVDIMM_FAMILY_PAPR to the white list of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the nvdimm command mask and implement necessary scaffolding in the module to handle ND_CMD_CALL ioctl and PDSM requests that we receive. The layout of the PDSM request as we expect from libnvdimm/libndctl is described in newly introduced uapi header 'papr_pdsm.h' which defines a 'struct nd_pkg_pdsm' and a maximal union named 'nd_pdsm_payload'. These new structs together with 'struct nd_cmd_pkg' for a pdsm envelop thats sent by libndctl to libnvdimm and serviced by papr_scm in 'papr_scm_service_pdsm()'. The PDSM request is communicated by member 'struct nd_cmd_pkg.nd_command' together with other information on the pdsm payload (size-in, size-out). The patch also introduces 'struct pdsm_cmd_desc' instances of which are stored in an array __pdsm_cmd_descriptors[] indexed with PDSM cmd and corresponding access function pdsm_cmd_desc() is introduced. 'struct pdsm_cdm_desc' holds the service function for a given PDSM and corresponding payload in/out sizes. A new function papr_scm_service_pdsm() is introduced and is called from papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL command from libnvdimm. The function performs validation on the PDSM payload based on info present in corresponding PDSM descriptor and if valid calls the 'struct pdcm_cmd_desc.service' function to service the PDSM. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20200615124407.32596-6-vaibhav@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-06-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-24/+20
Pull more KVM updates from Paolo Bonzini: "The guest side of the asynchronous page fault work has been delayed to 5.9 in order to sync with Thomas's interrupt entry rework, but here's the rest of the KVM updates for this merge window. MIPS: - Loongson port PPC: - Fixes ARM: - Fixes x86: - KVM_SET_USER_MEMORY_REGION optimizations - Fixes - Selftest fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (62 commits) KVM: x86: do not pass poisoned hva to __kvm_set_memory_region KVM: selftests: fix sync_with_host() in smm_test KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injected KVM: async_pf: Cleanup kvm_setup_async_pf() kvm: i8254: remove redundant assignment to pointer s KVM: x86: respect singlestep when emulating instruction KVM: selftests: Don't probe KVM_CAP_HYPERV_ENLIGHTENED_VMCS when nested VMX is unsupported KVM: selftests: do not substitute SVM/VMX check with KVM_CAP_NESTED_STATE check KVM: nVMX: Consult only the "basic" exit reason when routing nested exit KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts KVM: arm64: Remove host_cpu_context member from vcpu structure KVM: arm64: Stop sparse from moaning at __hyp_this_cpu_ptr KVM: arm64: Handle PtrAuth traps early KVM: x86: Unexport x86_fpu_cache and make it static KVM: selftests: Ignore KVM 5-level paging support for VM_MODE_PXXV48_4K KVM: arm64: Save the host's PtrAuth keys in non-preemptible context KVM: arm64: Stop save/restoring ACTLR_EL1 KVM: arm64: Add emulation for 32bit guests accessing ACTLR2 ...
2020-06-09mmap locking API: convert mmap_sem commentsMichel Lespinasse1-1/+1
Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: consolidate pte_index() and pte_offset_*() definitionsMike Rapoport6-101/+12
All architectures define pte_index() as (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1) and all architectures define pte_offset_kernel() as an entry in the array of PTEs indexed by the pte_index(). For the most architectures the pte_offset_kernel() implementation relies on the availability of pmd_page_vaddr() that converts a PMD entry value to the virtual address of the page containing PTEs array. Let's move x86 definitions of the PTE accessors to the generic place in <linux/pgtable.h> and then simply drop the respective definitions from the other architectures. The architectures that didn't provide pmd_page_vaddr() are updated to have that defined. The generic implementation of pte_offset_kernel() can be overridden by an architecture and alpha makes use of this because it has special ordering requirements for its version of pte_offset_kernel(). [rppt@linux.ibm.com: v2] Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org [rppt@linux.ibm.com: update] Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org [rppt@linux.ibm.com: update] Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org [akpm@linux-foundation.org: fix x86 warning] [sfr@canb.auug.org.au: fix powerpc build] Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: pgtable: add shortcuts for accessing kernel PMD and PTEMike Rapoport1-19/+0
The powerpc 32-bit implementation of pgtable has nice shortcuts for accessing kernel PMD and PTE for a given virtual address. Make these helpers available for all architectures. [rppt@linux.ibm.com: microblaze: fix page table traversal in setup_rt_frame()] Link: http://lkml.kernel.org/r/20200518191511.GD1118872@kernel.org [akpm@linux-foundation.org: s/pmd_ptr_k/pmd_off_k/ in various powerpc places] Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-9-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: reorder includes after introduction of linux/pgtable.hMike Rapoport1-1/+1
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport5-6/+4
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: don't include asm/pgtable.h if linux/mm.h is already includedMike Rapoport1-1/+0
Patch series "mm: consolidate definitions of page table accessors", v2. The low level page table accessors (pXY_index(), pXY_offset()) are duplicated across all architectures and sometimes more than once. For instance, we have 31 definition of pgd_offset() for 25 supported architectures. Most of these definitions are actually identical and typically it boils down to, e.g. static inline unsigned long pmd_index(unsigned long address) { return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1); } static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address) { return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address); } These definitions can be shared among 90% of the arches provided XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined. For architectures that really need a custom version there is always possibility to override the generic version with the usual ifdefs magic. These patches introduce include/linux/pgtable.h that replaces include/asm-generic/pgtable.h and add the definitions of the page table accessors to the new header. This patch (of 12): The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the functions involving page table manipulations, e.g. pte_alloc() and pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h> in the files that include <linux/mm.h>. The include statements in such cases are remove with a simple loop: for f in $(git grep -l "include <linux/mm.h>") ; do sed -i -e '/include <asm\/pgtable.h>/ d' $f done Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08mm: rename flush_icache_user_range to flush_icache_user_pageChristoph Hellwig1-2/+2
The function currently known as flush_icache_user_range only operates on a single page. Rename it to flush_icache_user_page as we'll need the name flush_icache_user_range for something else soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-20-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08powerpc: use asm-generic/cacheflush.hChristoph Hellwig1-32/+10
Power needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Also remove the pointless __KERNEL__ ifdef while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20200515143646.3857579-17-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05Merge tag 'powerpc-5.8-1' of ↵Linus Torvalds68-688/+1039
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Support for userspace to send requests directly to the on-chip GZIP accelerator on Power9. - Rework of our lockless page table walking (__find_linux_pte()) to make it safe against parallel page table manipulations without relying on an IPI for serialisation. - A series of fixes & enhancements to make our machine check handling more robust. - Lots of plumbing to add support for "prefixed" (64-bit) instructions on Power10. - Support for using huge pages for the linear mapping on 8xx (32-bit). - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver. - Removal of some obsolete 40x platforms and associated cruft. - Initial support for booting on Power10. - Lots of other small features, cleanups & fixes. Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram Sang, Xiongfeng Wang. * tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (299 commits) powerpc/pseries: Make vio and ibmebus initcalls pseries specific cxl: Remove dead Kconfig options powerpc: Add POWER10 architected mode powerpc/dt_cpu_ftrs: Add MMA feature powerpc/dt_cpu_ftrs: Enable Prefixed Instructions powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected powerpc: Add support for ISA v3.1 powerpc: Add new HWCAP bits powerpc/64s: Don't set FSCR bits in INIT_THREAD powerpc/64s: Save FSCR to init_task.thread.fscr after feature init powerpc/64s: Don't let DT CPU features set FSCR_DSCR powerpc/64s: Don't init FSCR_DSCR in __init_FSCR() powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations powerpc/module_64: Consolidate ftrace code powerpc/32: Disable KASAN with pages bigger than 16k powerpc/uaccess: Don't set KUEP by default on book3s/32 powerpc/uaccess: Don't set KUAP by default on book3s/32 powerpc/8xx: Reduce time spent in allow_user_access() and friends ...
2020-06-05kmap: consolidate kmap_prot definitionsIra Weiny1-1/+0
Most architectures define kmap_prot to be PAGE_KERNEL. Let sparc and xtensa define there own and define PAGE_KERNEL as the default if not overridden. [akpm@linux-foundation.org: coding style fixes] Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-16-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05arch/kmap: define kmap_atomic_prot() for all arch'sIra Weiny1-17/+0
To support kmap_atomic_prot(), all architectures need to support protections passed to their kmap_atomic_high() function. Pass protections into kmap_atomic_high() and change the name to kmap_atomic_high_prot() to match. Then define kmap_atomic_prot() as a core function which calls kmap_atomic_high_prot() when needed. Finally, redefine kmap_atomic() as a wrapper of kmap_atomic_prot() with the default kmap_prot exported by the architectures. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-11-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05arch/kmap: ensure kmap_prot visibilityIra Weiny1-1/+1
We want to support kmap_atomic_prot() on all architectures and it makes sense to define kmap_atomic() to use the default kmap_prot. So we ensure all arch's have a globally available kmap_prot either as a define or exported symbol. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-9-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05arch/kunmap_atomic: consolidate duplicate codeIra Weiny1-1/+0
Every single architecture (including !CONFIG_HIGHMEM) calls... pagefault_enable(); preempt_enable(); ... before returning from __kunmap_atomic(). Lift this code into the kunmap_atomic() macro. While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to be consistent. [ira.weiny@intel.com: don't enable pagefault/preempt twice] Link: http://lkml.kernel.org/r/20200518184843.3029640-1-ira.weiny@intel.com [akpm@linux-foundation.org: coding style fixes] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20200507150004.1423069-8-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05arch/kmap_atomic: consolidate duplicate codeIra Weiny1-2/+2
Every arch has the same code to ensure atomic operations and a check for !HIGHMEM page. Remove the duplicate code by defining a core kmap_atomic() which only calls the arch specific kmap_atomic_high() when the page is high memory. [akpm@linux-foundation.org: coding style fixes] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-7-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05{x86,powerpc,microblaze}/kmap: move preempt disableIra Weiny1-1/+10
During this kmap() conversion series we must maintain bisect-ability. To do this, kmap_atomic_prot() in x86, powerpc, and microblaze need to remain functional. Create a temporary inline version of kmap_atomic_prot within these architectures so we can rework their kmap_atomic() calls and then lift kmap_atomic_prot() to the core. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-6-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05arch/kunmap: remove duplicate kunmap implementationsIra Weiny1-9/+0
All architectures do exactly the same thing for kunmap(); remove all the duplicate definitions and lift the call to the core. This also has the benefit of changing kmap_unmap() on a number of architectures to be an inline call rather than an actual function. [akpm@linux-foundation.org: fix CONFIG_HIGHMEM=n build on various architectures] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-5-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05arch/kmap: remove redundant arch specific kmapsIra Weiny1-9/+0
The kmap code for all the architectures is almost 100% identical. Lift the common code to the core. Use ARCH_HAS_KMAP_FLUSH_TLB to indicate if an arch defines kmap_flush_tlb() and call if if needed. This also has the benefit of changing kmap() on a number of architectures to be an inline call rather than an actual function. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian König <christian.koenig@amd.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200507150004.1423069-4-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05arch/kmap: remove BUG_ON()Ira Weiny1-1/+1
Patch series "Remove duplicated kmap code", v3. The kmap infrastructure has been copied almost verbatim to every architecture. This series consolidates obvious duplicated code by defining core functions which call into the architectures only when needed. Some of the k[un]map_atomic() implementations have some similarities but the similarities were not sufficient to warrant further changes. In addition we remove a duplicate implementation of kmap() in DRM. This patch (of 15): Replace the use of BUG_ON(in_interrupt()) in the kmap() and kunmap() in favor of might_sleep(). Besides the benefits of might_sleep(), this normalizes the implementations such that they can be made generic in subsequent patches. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Link: http://lkml.kernel.org/r/20200507150004.1423069-1-ira.weiny@intel.com Link: http://lkml.kernel.org/r/20200507150004.1423069-2-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05powerpc: add support for folded p4d page tablesMike Rapoport10-61/+65
Implement primitives necessary for the 4th level folding, add walks of p4d level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h. [rppt@linux.ibm.com: powerpc/xmon: drop unused pgdir varialble in show_pte() function] Link: http://lkml.kernel.org/r/20200519181454.GI1059226@linux.ibm.com [rppt@linux.ibm.com; build fix] Link: http://lkml.kernel.org/r/20200423141845.GI13521@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> # 8xx and 83xx Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: James Morse <james.morse@arm.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200414153455.21744-9-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04Merge tag 'kvm-ppc-next-5.8-1' of ↵Paolo Bonzini3-24/+20
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD PPC KVM update for 5.8 - Updates and bug fixes for secure guest support - Other minor bug fixes and cleanups.
2020-06-04Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-8/+1
Merge more updates from Andrew Morton: "More mm/ work, plenty more to come Subsystems affected by this patch series: slub, memcg, gup, kasan, pagealloc, hugetlb, vmscan, tools, mempolicy, memblock, hugetlbfs, thp, mmap, kconfig" * akpm: (131 commits) arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined riscv: support DEBUG_WX mm: add DEBUG_WX support drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid() powerpc/mm: drop platform defined pmd_mknotpresent() mm: thp: don't need to drain lru cache when splitting and mlocking THP hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs sparc32: register memory occupied by kernel as memblock.memory include/linux/memblock.h: fix minor typo and unclear comment mm, mempolicy: fix up gup usage in lookup_node tools/vm/page_owner_sort.c: filter out unneeded line mm: swap: memcg: fix memcg stats for huge pages mm: swap: fix vmstats for huge pages mm: vmscan: limit the range of LRU type balancing mm: vmscan: reclaim writepage is IO cost mm: vmscan: determine anon/file pressure balance at the reclaim root mm: balance LRU lists based on relative thrashing mm: only count actual rotations as LRU reclaim cost ...
2020-06-04powerpc/mm: drop platform defined pmd_mknotpresent()Anshuman Khandual1-4/+0
Patch series "mm/thp: Rename pmd_mknotpresent() as pmd_mknotvalid()", v2. This series renames pmd_mknotpresent() as pmd_mknotvalid(). Before that it drops an existing pmd_mknotpresent() definition from powerpc platform which was never required as it defines it's pmdp_invalidate() through subscribing __HAVE_ARCH_PMDP_INVALIDATE. This does not create any functional change. This rename was suggested by Catalin during a previous discussion while we were trying to change the THP helpers on arm64 platform for migration. https://patchwork.kernel.org/patch/11019637/ This patch (of 2): Platform needs to define pmd_mknotpresent() for generic pmdp_invalidate() only when __HAVE_ARCH_PMDP_INVALIDATE is not subscribed. Otherwise platform specific pmd_mknotpresent() is not required. Hence just drop it. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1587520326-10099-1-git-send-email-anshuman.khandual@arm.com Link: http://lkml.kernel.org/r/1584680057-13753-1-git-send-email-anshuman.khandual@arm.com Link: http://lkml.kernel.org/r/1584680057-13753-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04mm/hugetlb: define a generic fallback for arch_clear_hugepage_flags()Anshuman Khandual1-4/+0
There are multiple similar definitions for arch_clear_hugepage_flags() on various platforms. Lets just add it's generic fallback definition for platforms that do not override. This help reduce code duplication. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: http://lkml.kernel.org/r/1588907271-11920-4-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>