summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-04-14kernel/kexec_file.c: allow archs to set purgatory load addressPhilipp Rudo4-32/+31
For s390 new kernels are loaded to fixed addresses in memory before they are booted. With the current code this is a problem as it assumes the kernel will be loaded to an 'arbitrary' address. In particular, kexec_locate_mem_hole searches for a large enough memory region and sets the load address (kexec_bufer->mem) to it. Luckily there is a simple workaround for this problem. By returning 1 in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off. This allows the architecture to set kbuf->mem by hand. While the trick works fine for the kernel it does not for the purgatory as here the architectures don't have access to its kexec_buffer. Give architectures access to the purgatories kexec_buffer by changing kexec_load_purgatory to take a pointer to it. With this change architectures have access to the buffer and can edit it as they need. A nice side effect of this change is that we can get rid of the purgatory_info->purgatory_load_address field. As now the information stored there can directly be accessed from kbuf->mem. Link: http://lkml.kernel.org/r/20180321112751.22196-11-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory loadPhilipp Rudo2-34/+13
The current code uses the sh_offset field in purgatory_info->sechdrs to store a pointer to the current load address of the section. Depending whether the section will be loaded or not this is either a pointer into purgatory_info->purgatory_buf or kexec_purgatory. This is not only a violation of the ELF standard but also makes the code very hard to understand as you cannot tell if the memory you are using is read-only or not. Remove this misuse and store the offset of the section in pugaroty_info->purgatory_buf in sh_offset. Link: http://lkml.kernel.org/r/20180321112751.22196-10-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrsPhilipp Rudo1-22/+12
The main loop currently uses quite a lot of variables to update the section headers. Some of them are unnecessary. So clean them up a little. Link: http://lkml.kernel.org/r/20180321112751.22196-9-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrsPhilipp Rudo1-46/+30
To update the entry point there is an extra loop over all section headers although this can be done in the main loop. So move it there and eliminate the extra loop and variable to store the 'entry section index'. Also, in the main loop, move the usual case, i.e. non-bss section, out of the extra if-block. Link: http://lkml.kernel.org/r/20180321112751.22196-8-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: split up __kexec_load_puragoryPhilipp Rudo1-97/+103
When inspecting __kexec_load_purgatory you find that it has two tasks 1) setting up the kexec_buffer for the new kernel and, 2) setting up pi->sechdrs for the final load address. The two tasks are independent of each other. To improve readability split up __kexec_load_purgatory into two functions, one for each task, and call them directly from kexec_load_purgatory. Link: http://lkml.kernel.org/r/20180321112751.22196-7-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations*Philipp Rudo3-61/+71
When the relocations are applied to the purgatory only the section the relocations are applied to is writable. The other sections, i.e. the symtab and .rel/.rela, are in read-only kexec_purgatory. Highlight this by marking the corresponding variables as 'const'. While at it also change the signatures of arch_kexec_apply_relocations* to take section pointers instead of just the index of the relocation section. This removes the second lookup and sanity check of the sections in arch code. Link: http://lkml.kernel.org/r/20180321112751.22196-6-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: search symbols in read-only kexec_purgatoryPhilipp Rudo1-16/+22
The stripped purgatory does not contain a symtab. So when looking for symbols this is done in read-only kexec_purgatory. Highlight this by marking the corresponding variables as 'const'. Link: http://lkml.kernel.org/r/20180321112751.22196-5-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: make purgatory_info->ehdr constPhilipp Rudo2-8/+13
The kexec_purgatory buffer is read-only. Thus all pointers into kexec_purgatory are read-only, too. Point this out by explicitly marking purgatory_info->ehdr as 'const' and update the comments in purgatory_info. Link: http://lkml.kernel.org/r/20180321112751.22196-4-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: remove checks in kexec_purgatory_loadPhilipp Rudo1-14/+0
Before the purgatory is loaded several checks are done whether the ELF file in kexec_purgatory is valid or not. These checks are incomplete. For example they don't check for the total size of the sections defined in the section header table or if the entry point actually points into the purgatory. On the other hand the purgatory, although an ELF file on its own, is part of the kernel. Thus not trusting the purgatory means not trusting the kernel build itself. So remove all validity checks on the purgatory and just trust the kernel build. Link: http://lkml.kernel.org/r/20180321112751.22196-3-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14include/linux/kexec.h: silence compile warningsPhilipp Rudo1-0/+2
Patch series "kexec_file: Clean up purgatory load", v2. Following the discussion with Dave and AKASHI, here are the common code patches extracted from my recent patch set (Add kexec_file_load support to s390) [1]. The patches were extracted to allow upstream integration together with AKASHI's common code patches before the arch code gets adjusted to the new base. The reason for this series is to prepare common code for adding kexec_file_load to s390 as well as cleaning up the mis-use of the sh_offset field during purgatory load. In detail this series contains: Patch #1&2: Minor cleanups/fixes. Patch #3-9: Clean up the purgatory load/relocation code. Especially remove the mis-use of the purgatory_info->sechdrs->sh_offset field, currently holding a pointer into either kexec_purgatory (ro) or purgatory_buf (rw) depending on the section. With these patches the section address will be calculated verbosely and sh_offset will contain the offset of the section in the stripped purgatory binary (purgatory_buf). Patch #10: Allows architectures to set the purgatory load address. This patch is important for s390 as the kernel and purgatory have to be loaded to fixed addresses. In current code this is impossible as the purgatory load is opaque to the architecture. Patch #11: Moves x86 purgatories sha implementation to common lib/ directory to allow reuse in other architectures. This patch (of 11) When building the kernel with CONFIG_KEXEC_FILE enabled gcc prints a compile warning multiple times. In file included from <path>/linux/init/initramfs.c:526:0: <path>/include/linux/kexec.h:120:9: warning: `struct kimage' declared inside parameter list [enabled by default] unsigned long cmdline_len); ^ This is because the typedefs for kexec_file_load uses struct kimage before it is declared. Fix this by simply forward declaring struct kimage. Link: http://lkml.kernel.org/r/20180321112751.22196-2-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kexec_file, x86: move re-factored code to generic sideAKASHI Takahiro3-188/+201
In the previous patches, commonly-used routines, exclude_mem_range() and prepare_elf64_headers(), were carved out. Now place them in kexec common code. A prefix "crash_" is given to each of their names to avoid possible name collisions. Link: http://lkml.kernel.org/r/20180306102303.9063-8-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14x86: kexec_file: clean up prepare_elf64_headers()AKASHI Takahiro1-11/+7
Removing bufp variable in prepare_elf64_headers() makes the code simpler and more understandable. Link: http://lkml.kernel.org/r/20180306102303.9063-7-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem bufferAKASHI Takahiro1-51/+31
While CRASH_MAX_RANGES (== 16) seems to be good enough, fixed-number array is not a good idea in general. In this patch, size of crash_mem buffer is calculated as before and the buffer is now dynamically allocated. This change also allows removing crash_elf_data structure. Link: http://lkml.kernel.org/r/20180306102303.9063-6-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers()AKASHI Takahiro1-12/+12
The code guarded by CONFIG_X86_64 is necessary on some architectures which have a dedicated kernel mapping outside of linear memory mapping. (arm64 is among those.) In this patch, an additional argument, kernel_map, is added to enable/ disable the code removing #ifdef. Link: http://lkml.kernel.org/r/20180306102303.9063-5-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14x86: kexec_file: purge system-ram walking from prepare_elf64_headers()AKASHI Takahiro1-63/+58
While prepare_elf64_headers() in x86 looks pretty generic for other architectures' use, it contains some code which tries to list crash memory regions by walking through system resources, which is not always architecture agnostic. To make this function more generic, the related code should be purged. In this patch, prepare_elf64_headers() simply scans crash_mem buffer passed and add all the listed regions to elf header as a PT_LOAD segment. So walk_system_ram_res(prepare_elf64_headers_callback) have been moved forward before prepare_elf64_headers() where the callback, prepare_elf64_headers_callback(), is now responsible for filling up crash_mem buffer. Meanwhile exclude_elf_header_ranges() used to be called every time in this callback it is rather redundant and now called only once in prepare_elf_headers() as well. Link: http://lkml.kernel.org/r/20180306102303.9063-4-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kexec_file,x86,powerpc: factor out kexec_file_ops functionsAKASHI Takahiro8-94/+71
As arch_kexec_kernel_image_{probe,load}(), arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig() are almost duplicated among architectures, they can be commonalized with an architecture-defined kexec_file_ops array. So let's factor them out. Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kexec_file: make use of purgatory optionalAKASHI Takahiro3-0/+11
Patch series "kexec_file, x86, powerpc: refactoring for other architecutres", v2. This is a preparatory patchset for adding kexec_file support on arm64. It was originally included in a arm64 patch set[1], but Philipp is also working on their kexec_file support on s390[2] and some changes are now conflicting. So these common parts were extracted and put into a separate patch set for better integration. What's more, my original patch#4 was split into a few small chunks for easier review after Dave's comment. As such, the resulting code is basically identical with my original, and the only *visible* differences are: - renaming of _kexec_kernel_image_probe() and _kimage_file_post_load_cleanup() - change one of types of arguments at prepare_elf64_headers() Those, unfortunately, require a couple of trivial changes on the rest (#1, #6 to #13) of my arm64 kexec_file patch set[1]. Patch #1 allows making a use of purgatory optional, particularly useful for arm64. Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load, verify_sig}() and arch_kimage_file_post_load_cleanup() across architectures. Patches #3-#7 are also intended to generalize parse_elf64_headers(), along with exclude_mem_range(), to be made best re-use of. [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html [2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html This patch (of 7): On arm64, crash dump kernel's usable memory is protected by *unmapping* it from kernel virtual space unlike other architectures where the region is just made read-only. It is highly unlikely that the region is accidentally corrupted and this observation rationalizes that digest check code can also be dropped from purgatory. The resulting code is so simple as it doesn't require a bit ugly re-linking/relocation stuff, i.e. arch_kexec_apply_relocations_add(). Please see: http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html All that the purgatory does is to shuffle arguments and jump into a new kernel, while we still need to have some space for a hash value (purgatory_sha256_digest) which is never checked against. As such, it doesn't make sense to have trampline code between old kernel and new kernel on arm64. This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and allows related code to be compiled in only if necessary. [takahiro.akashi@linaro.org: fix trivial screwup] Link: http://lkml.kernel.org/r/20180309093346.GF25863@linaro.org Link: http://lkml.kernel.org/r/20180306102303.9063-2-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14proc: revalidate misc dentriesAlexey Dobriyan1-1/+22
If module removes proc directory while another process pins it by chdir'ing to it, then subsequent recreation of proc entry and all entries down the tree will not be visible to any process until pinning process unchdir from directory and unpins everything. Steps to reproduce: proc_mkdir("aaa", NULL); proc_create("aaa/bbb", ...); chdir("/proc/aaa"); remove_proc_entry("aaa/bbb", NULL); remove_proc_entry("aaa", NULL); proc_mkdir("aaa", NULL); # inaccessible because "aaa" dentry still points # to the original "aaa". proc_create("aaa/bbb", ...); Fix is to implement ->d_revalidate and ->d_delete. Link: http://lkml.kernel.org/r/20180312201938.GA4871@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14mm, slab: reschedule cache_reap() on the same CPUVlastimil Babka1-1/+2
cache_reap() is initially scheduled in start_cpu_timer() via schedule_delayed_work_on(). But then the next iterations are scheduled via schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND. Thus since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") there is no guarantee the future iterations will run on the originally intended cpu, although it's still preferred. I was able to demonstrate this with /sys/module/workqueue/parameters/debug_force_rr_cpu. IIUC, it may also happen due to migrating timers in nohz context. As a result, some cpu's would be calling cache_reap() more frequently and others never. This patch uses schedule_delayed_work_on() with the current cpu when scheduling the next iteration. Link: http://lkml.kernel.org/r/20180411070007.32225-1-vbabka@suse.cz Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Pekka Enberg <penberg@kernel.org> Acked-by: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Stephen Boyd <sboyd@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kexec: export PG_swapbacked to VMCOREINFOPetr Tesarik1-0/+1
Since commit 6326fec1122c ("mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked"), PG_swapcache is an alias for PG_owner_priv_1, which may be also used for other purposes. To know whether the bit indeed has the PG_swapcache meaning, it is necessary to check PG_swapbacked, hence this bit must be exported. Link: http://lkml.kernel.org/r/20180410161345.142e142d@ezekiel.suse.cz Signed-off-by: Petr Tesarik <ptesarik@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Young <dyoung@redhat.com> Cc: Xunlei Pang <xlpang@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: "Marc-Andr Lureau" <marcandre.lureau@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14ipc/shm: fix use-after-free of shm file via remap_file_pages()Eric Biggers1-3/+20
syzbot reported a use-after-free of shm_file_data(file)->file->f_op in shm_get_unmapped_area(), called via sys_remap_file_pages(). Unfortunately it couldn't generate a reproducer, but I found a bug which I think caused it. When remap_file_pages() is passed a full System V shared memory segment, the memory is first unmapped, then a new map is created using the ->vm_file. Between these steps, the shm ID can be removed and reused for a new shm segment. But, shm_mmap() only checks whether the ID is currently valid before calling the underlying file's ->mmap(); it doesn't check whether it was reused. Thus it can use the wrong underlying file, one that was already freed. Fix this by making the "outer" shm file (the one that gets put in ->vm_file) hold a reference to the real shm file, and by making __shm_open() require that the file associated with the shm ID matches the one associated with the "outer" file. Taking the reference to the real shm file is needed to fully solve the problem, since otherwise sfd->file could point to a freed file, which then could be reallocated for the reused shm ID, causing the wrong shm segment to be mapped (and without the required permission checks). Commit 1ac0b6dec656 ("ipc/shm: handle removed segments gracefully in shm_mmap()") almost fixed this bug, but it didn't go far enough because it didn't consider the case where the shm ID is reused. The following program usually reproduces this bug: #include <stdlib.h> #include <sys/shm.h> #include <sys/syscall.h> #include <unistd.h> int main() { int is_parent = (fork() != 0); srand(getpid()); for (;;) { int id = shmget(0xF00F, 4096, IPC_CREAT|0700); if (is_parent) { void *addr = shmat(id, NULL, 0); usleep(rand() % 50); while (!syscall(__NR_remap_file_pages, addr, 4096, 0, 0, 0)); } else { usleep(rand() % 50); shmctl(id, IPC_RMID, NULL); } } } It causes the following NULL pointer dereference due to a 'struct file' being used while it's being freed. (I couldn't actually get a KASAN use-after-free splat like in the syzbot report. But I think it's possible with this bug; it would just take a more extraordinary race...) BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 9 PID: 258 Comm: syz_ipc Not tainted 4.16.0-05140-gf8cf2f16a7c95 #189 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 RIP: 0010:d_inode include/linux/dcache.h:519 [inline] RIP: 0010:touch_atime+0x25/0xd0 fs/inode.c:1724 [...] Call Trace: file_accessed include/linux/fs.h:2063 [inline] shmem_mmap+0x25/0x40 mm/shmem.c:2149 call_mmap include/linux/fs.h:1789 [inline] shm_mmap+0x34/0x80 ipc/shm.c:465 call_mmap include/linux/fs.h:1789 [inline] mmap_region+0x309/0x5b0 mm/mmap.c:1712 do_mmap+0x294/0x4a0 mm/mmap.c:1483 do_mmap_pgoff include/linux/mm.h:2235 [inline] SYSC_remap_file_pages mm/mmap.c:2853 [inline] SyS_remap_file_pages+0x232/0x310 mm/mmap.c:2769 do_syscall_64+0x64/0x1a0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ebiggers@google.com: add comment] Link: http://lkml.kernel.org/r/20180410192850.235835-1-ebiggers3@gmail.com Link: http://lkml.kernel.org/r/20180409043039.28915-1-ebiggers3@gmail.com Reported-by: syzbot+d11f321e7f1923157eac80aa990b446596f46439@syzkaller.appspotmail.com Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14mm/filemap.c: provide dummy filemap_page_mkwrite() for NOMMUArnd Bergmann1-1/+5
Building orangefs on MMU-less machines now results in a link error because of the newly introduced use of the filemap_page_mkwrite() function: ERROR: "filemap_page_mkwrite" [fs/orangefs/orangefs.ko] undefined! This adds a dummy version for it, similar to the existing generic_file_mmap and generic_file_readonly_mmap stubs in the same file, to avoid the link error without adding #ifdefs in each file system that uses these. Link: http://lkml.kernel.org/r/20180409105555.2439976-1-arnd@arndb.de Fixes: a5135eeab2e5 ("orangefs: implement vm_ops->fault") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Martin Brandenburg <martin@omnibond.com> Cc: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14mm/gup.c: document return valueMichael S. Tsirkin6-3/+17
__get_user_pages_fast handles errors differently from get_user_pages_fast: the former always returns the number of pages pinned, the later might return a negative error code. Link: http://lkml.kernel.org/r/1522962072-182137-6-git-send-email-mst@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14get_user_pages_fast(): return -EFAULT on access_ok failureMichael S. Tsirkin1-1/+4
get_user_pages_fast is supposed to be a faster drop-in equivalent of get_user_pages. As such, callers expect it to return a negative return code when passed an invalid address, and never expect it to return 0 when passed a positive number of pages, since its documentation says: * Returns number of pages pinned. This may be fewer than the number * requested. If nr_pages is 0 or negative, returns 0. If no pages * were pinned, returns -errno. When get_user_pages_fast fall back on get_user_pages this is exactly what happens. Unfortunately the implementation is inconsistent: it returns 0 if passed a kernel address, confusing callers: for example, the following is pretty common but does not appear to do the right thing with a kernel address: ret = get_user_pages_fast(addr, 1, writeable, &page); if (ret < 0) return ret; Change get_user_pages_fast to return -EFAULT when supplied a kernel address to make it match expectations. All callers have been audited for consistency with the documented semantics. Link: http://lkml.kernel.org/r/1522962072-182137-4-git-send-email-mst@redhat.com Fixes: 5b65c4677a57 ("mm, x86/mm: Fix performance regression in get_user_pages_fast()") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14mm/gup_benchmark: handle gup failuresMichael S. Tsirkin1-1/+3
Patch series "mm/get_user_pages_fast fixes, cleanups", v2. Turns out get_user_pages_fast and __get_user_pages_fast return different values on error when given a single page: __get_user_pages_fast returns 0. get_user_pages_fast returns either 0 or an error. Callers of get_user_pages_fast expect an error so fix it up to return an error consistently. Stress the difference between get_user_pages_fast and __get_user_pages_fast to make sure callers aren't confused. This patch (of 3): __gup_benchmark_ioctl does not handle the case where get_user_pages_fast fails: - a negative return code will cause a buffer overrun - returning with partial success will cause use of uninitialized memory. [akpm@linux-foundation.org: simplification] Link: http://lkml.kernel.org/r/1522962072-182137-3-git-send-email-mst@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14resource: fix integer overflow at reallocationTakashi Iwai1-1/+2
We've got a bug report indicating a kernel panic at booting on an x86-32 system, and it turned out to be the invalid PCI resource assigned after reallocation. __find_resource() first aligns the resource start address and resets the end address with start+size-1 accordingly, then checks whether it's contained. Here the end address may overflow the integer, although resource_contains() still returns true because the function validates only start and end address. So this ends up with returning an invalid resource (start > end). There was already an attempt to cover such a problem in the commit 47ea91b4052d ("Resource: fix wrong resource window calculation"), but this case is an overseen one. This patch adds the validity check of the newly calculated resource for avoiding the integer overflow problem. Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739 Link: http://lkml.kernel.org/r/s5hpo37d5l8.wl-tiwai@suse.de Fixes: 23c570a67448 ("resource: ability to resize an allocated resource") Signed-off-by: Takashi Iwai <tiwai@suse.de> Reported-by: Michael Henders <hendersm@shaw.ca> Tested-by: Michael Henders <hendersm@shaw.ca> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14Merge branch 'overlayfs-linus' of ↵Linus Torvalds12-172/+510
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: "In addition to bug fixes and cleanups there are two new features from Amir: - Consistent inode number support for the case when layers are not all on the same filesystem (feature is dubbed "xino"). - Optimize overlayfs file handle decoding. This one touches the exportfs interface to allow detecting the disconnected directory case" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: update documentation w.r.t "xino" feature ovl: add support for "xino" mount and config options ovl: consistent d_ino for non-samefs with xino ovl: consistent i_ino for non-samefs with xino ovl: constant st_ino for non-samefs with xino ovl: allocate anon bdev per unique lower fs ovl: factor out ovl_map_dev_ino() helper ovl: cleanup ovl_update_time() ovl: add WARN_ON() for non-dir redirect cases ovl: cleanup setting OVL_INDEX ovl: set d->is_dir and d->opaque for last path element ovl: Do not check for redirect if this is last layer ovl: lookup in inode cache first when decoding lower file handle ovl: do not try to reconnect a disconnected origin dentry ovl: disambiguate ovl_encode_fh() ovl: set lower layer st_dev only if setting lower st_ino ovl: fix lookup with middle layer opaque dir and absolute path redirects ovl: Set d->last properly during lookup ovl: set i_ino to the value of st_ino for NFS export
2018-04-14Merge branch 'next' of ↵Linus Torvalds8-5/+283
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management update from Zhang Rui: - Fix race condition in imx_thermal_probe() (Mikhail Lappo) - Add cooling device's statistics in sysfs (Viresh Kumar) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: Add cooling device's statistics in sysfs thermal: imx: Fix race condition in imx_thermal_probe()
2018-04-14Merge branch 'dmi-for-linus' of ↵Linus Torvalds2-4/+13
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging Pull dmi updates from Jean Delvare. * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: firmware: dmi_scan: Use lowercase letters for UUID firmware: dmi_scan: Add DMI_OEM_STRING support to dmi_matches firmware: dmi_scan: Fix UUID length safety check
2018-04-14Merge tag 'chrome-platform-for-linus-4.17' of ↵Linus Torvalds10-639/+789
git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform Pull chrome platform updates from Benson Leung: - a series from Dmitry to remove platform data from chromeos_laptop.c, which was the only user of platform data for the atmel_mxt_ts driver. - a series to clean up sysfs and debugfs for cros_ec - other misc cleanups * tag 'chrome-platform-for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: (22 commits) platform/chrome: mfd/cros_ec_dev: Add sysfs entry to set keyboard wake lid angle platform/chrome: cros_ec_debugfs: Add PD port info to debugfs platform/chrome: cros_ec_debugfs: Use octal permissions '0444' platform/chrome: cros_ec_sysfs: use permission-specific DEVICE_ATTR variants platform/chrome: cros_ec_sysfs: introduce to_cros_ec_dev define. platform/chrome: cros_ec_sysfs: Modify error handling platform/chrome: cros_ec_lpc: Add support for Google devices using custom coreboot firmware platform/chrome: cros_ec_lpc: wake up from s2idle on Chrome EC Input: atmel_mxt_ts - remove platform data support platform/chrome: chromeos_laptop - discard data for unneeded boards platform/chrome: chromeos_laptop - use device properties for Pixel platform/chrome: chromeos_laptop - rely on I2C to set up interrupt trigger platform/chrome: chromeos_laptop - use I2C notifier to create devices platform/chrome: chromeos_laptop - parse DMI IRQ data once platform/chrome: chromeos_laptop - rework i2c peripherals initialization platform/chrome: chromeos_laptop - factor out getting IRQ from DMI platform/chrome: chromeos_laptop - introduce pr_fmt() platform/chrome: chromeos_laptop - stop setting suspend mode for Atmel devices platform/chrome: chromeos_laptop - add SPDX identifier Input: atmel_mxt_ts - switch ChromeOS ACPI devices to generic props ...
2018-04-14Merge tag 'clk-for-linus' of ↵Linus Torvalds198-3950/+16113
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "The large diff this time around is from the addition of a new clk driver for the TI Davinci family of SoCs. So far those clks have been supported with a custom implementation of the clk API in the arch port instead of in the CCF. With this driver merged we're one step closer to having a single clk API implementation. The other large diff is from the Amlogic clk driver that underwent some major surgery to use regmap. Beyond that, the biggest hitter is Samsung which needed some reworks to properly handle clk provider power domains and a bunch of PLL rate updates. The core framework was fairly quiet this round, just getting some cleanups and small fixes for some of the more esoteric features. And the usual set of driver non-critical fixes, cleanups, and minor additions are here as well. Core: - Rejig clk_ops::init() to be a little earlier for phase/accuracy ops - debugfs ops macroized to shave some lines of boilerplate code - Always calculate the phase instead of caching it in clk_get_phase() - More __must_check on bulk clk APIs New Drivers: - TI's Davinci family of SoCs - Intel's Stratix10 SoC - stm32mp157 SoC - Allwinner H6 CCU - Silicon Labs SI544 clock generator chip - Renesas R-Car M3-N and V3H SoCs - i.MX6SLL SoCs Removed Drivers: - ST-Ericsson AB8540/9540 Updates: - Mediatek MT2701 and MT7622 audsys support and MT2712 updates - STM32F469 DSI and STM32F769 sdmmc2 support - GPIO clks can sleep now - Spreadtrum SC9860 RTC clks - Nvidia Tegra MBIST workarounds and various minor fixes - Rockchip phase handling fixes and a memory leak plugged - Renesas drivers switch to readl/writel from clk_readl/clk_writel - Renesas gained CPU (Z/Z2) and watchdog support - Rockchip rk3328 display clks and rk3399 1.6GHz PLL support - Qualcomm PM8921 PMIC XO buffers - Amlogic migrates to regmap APIs - TI Keystone clk latching support - Allwinner H3 and H5 video clk fixes - Broadcom BCM2835 PLLs needed another bit to enable - i.MX6SX CKO mux fix and i.MX7D Video PLL divider fix - i.MX6UL/ULL epdc_podf support - Hi3798CV200 COMBPHY0 and USB2_OTG_UTMI and phase support for eMMC" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (233 commits) clk: davinci: add a reset lookup table for psc0 clk: imx: add clock driver for imx6sll dt-bindings: imx: update clock doc for imx6sll clk: imx: add new gate/gate2 wrapper funtion clk: imx: Add CLK_IS_CRITICAL flag for busy divider and busy mux clk: cs2000: set pm_ops in hibernate-compatible way clk: bcm2835: De-assert/assert PLL reset signal when appropriate clk: imx7d: Move clks_init_on before any clock operations clk: imx7d: Correct ahb clk parent select clk: imx7d: Correct dram pll type clk: imx7d: Add USB clock information clk: socfpga: stratix10: add clock driver for Stratix10 platform dt-bindings: documentation: add clock bindings information for Stratix10 clk: ti: fix flag space conflict with clkctrl clocks clk: uniphier: add additional ethernet clock lines for Pro4 clk: uniphier: add SATA clock control support clk: uniphier: add PCIe clock control support clk: Add driver for the si544 clock generator chip clk: davinci: Remove redundant dev_err calls clk: uniphier: add ethernet clock control support for PXs3 ...
2018-04-14Merge tag 'pwm/for-4.17-rc1' of ↵Linus Torvalds16-63/+205
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "This set of changes adds support for more generations of the RCar controller as well as runtime PM support. The JZ4740 driver gains support for device tree and can now be used on all Ingenic SoCs. Rounding things off is a random assortment of fixes and cleanups all across the board" * tag 'pwm/for-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (29 commits) pwm: rcar: Add suspend/resume support pwm: rcar: Use PM Runtime to control module clock dt-bindings: pwm: rcar: Add bindings for R-Car M3N support pwm: rcar: Fix a condition to prevent mismatch value setting to duty pwm: sysfs: Use put_device() instead of kfree() dt-bindings: pwm: sunxi: Add new compatible strings pwm: sun4i: Simplify controller mapping pwm: sun4i: Drop unused .has_rdy member pwm: sun4i: Properly check current state pwm: Remove depends on AVR32 pwm: stm32: LPTimer: Use 3 cells ->of_xlate() dt-bindings: pwm-stm32-lp: Add #pwm-cells pwm: stm32: Protect common prescaler for all channels pwm: stm32: Remove unused struct device pwm: mediatek: Improve precision in rate calculation pwm: mediatek: Remove redundant MODULE_ALIAS entries pwm: mediatek: Fix up PWM4 and PWM5 malfunction on MT7623 pwm: jz4740: Enable for all Ingenic SoCs pwm: jz4740: Add support for devicetree pwm: jz4740: Implement ->set_polarity() ...
2018-04-14Merge tag 'linux-watchdog-4.17-rc1' of ↵Linus Torvalds55-456/+628
git://www.linux-watchdog.org/linux-watchdog Pull watchdog updates from Wim Van Sebroeck: - Add Nuvoton NPCM watchdog driver - renesas_wdt: Add R-Car Gen2 support - renesas_wdt: add suspend/resume and restart handler support - hpwdt: convert to watchdog core and improve NMI - improve timeout setting/handling in various drivers - coh901327: make license text and module licence match - fix error handling in asm9260_wdt, sprd_wdt and davinci_wdt - aspeed imrovements - dw improvements (for control register & suspend/resume) - add SPDX identifiers for watchdog subsystem * tag 'linux-watchdog-4.17-rc1' of git://www.linux-watchdog.org/linux-watchdog: (35 commits) watchdog: davinci_wdt: fix error handling in davinci_wdt_probe() watchdog: add SPDX identifiers for watchdog subsystem watchdog: aspeed: Allow configuring for alternate boot watchdog: Add Nuvoton NPCM watchdog driver dt-bindings: watchdog: Add Nuvoton NPCM description watchdog: dw: save/restore control and timeout across suspend/resume watchdog: dw: RMW the control register watchdog: sprd_wdt: Fix error handling in sprd_wdt_enable() watchdog: aspeed: Fix translation of reset mode to ctrl register watchdog: renesas_wdt: Add restart handler watchdog: renesas_wdt: Add R-Car Gen2 support watchdog: renesas_wdt: Add suspend/resume support watchdog: f71808e_wdt: Fix WD_EN register read watchdog: hpwdt: Update driver version. watchdog: hpwdt: Add dynamic debug watchdog: hpwdt: Programable Pretimeout NMI watchdog: hpwdt: remove allow_kdump module parameter. watchdog: hpwdt: condition early return of NMI handler on iLO5 watchdog: hpwdt: Modify to use watchdog core. watchdog: hpwdt: Update nmi_panic message. ...
2018-04-14Merge tag 'apparmor-pr-2018-04-10' of ↵Linus Torvalds33-525/+2119
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor Pull apparmor updates from John Johansen: "Features: - add base infrastructure for socket mediation. ABI bump and additional checks to ensure only v8 compliant policy uses socket af mediation. - improve and cleanup dfa verification - improve profile attachment logic - improve overlapping expression handling - add the xattr matching to the attachment logic - improve signal mediation handling with stacked labels - improve handling of no_new_privs in a label stack Cleanups and changes: - use dfa to parse string split - bounded version of label_parse - proper line wrap nulldfa.in - split context out into task and cred naming to better match usage - simplify code in aafs Bug fixes: - fix display of .ns_name for containers - fix resource audit messages when auditing peer - fix logging of the existence test for signals - fix resource audit messages when auditing peer - fix display of .ns_name for containers - fix an error code in verify_table_headers() - fix memory leak on buffer on error exit path - fix error returns checks by making size a ssize_t" * tag 'apparmor-pr-2018-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (36 commits) apparmor: fix memory leak on buffer on error exit path apparmor: fix dangling symlinks to policy rawdata after replacement apparmor: Fix an error code in verify_table_headers() apparmor: fix error returns checks by making size a ssize_t apparmor: update MAINTAINERS file git and wiki locations apparmor: remove POLICY_MEDIATES_SAFE apparmor: add base infastructure for socket mediation apparmor: improve overlapping domain attachment resolution apparmor: convert attaching profiles via xattrs to use dfa matching apparmor: Add support for attaching profiles via xattr, presence and value apparmor: cleanup: simplify code to get ns symlink name apparmor: cleanup create_aafs() error path apparmor: dfa split verification of table headers apparmor: dfa add support for state differential encoding apparmor: dfa move character match into a macro apparmor: update domain transitions that are subsets of confinement at nnp apparmor: move context.h to cred.h apparmor: move task related defines and fns to task.X files apparmor: cleanup, drop unused fn __aa_task_is_confined() apparmor: cleanup fixup description of aa_replace_profiles ...
2018-04-14Merge tag 'for-linus-20180413' of git://git.kernel.dk/linux-blockLinus Torvalds19-226/+245
Pull block fixes from Jens Axboe: "Followup fixes for this merge window. This contains: - Series from Ming, fixing corner cases in our CPU <-> queue mapping. This triggered repeated warnings on especially s390, but I also hit it in cpu hot plug/unplug testing while doing IO on NVMe on x86-64. - Another fix from Ming, ensuring that we always order budget and driver tag identically, avoiding a deadlock on QD=1 devices. - Loop locking regression fix from this merge window, from Omar. - Another loop locking fix, this time missing an unlock, from Tetsuo Handa. - Fix for racing IO submission with device removal from Bart. - sr reference fix from me, fixing a case where disk change or getevents can race with device removal. - Set of nvme fixes by way of Keith, from various contributors" * tag 'for-linus-20180413' of git://git.kernel.dk/linux-block: (28 commits) nvme: expand nvmf_check_if_ready checks nvme: Use admin command effects for admin commands nvmet: fix space padding in serial number nvme: check return value of init_srcu_struct function nvmet: Fix nvmet_execute_write_zeroes sector count nvme-pci: Separate IO and admin queue IRQ vectors nvme-pci: Remove unused queue parameter nvme-pci: Skip queue deletion if there are no queues nvme: target: fix buffer overflow nvme: don't send keep-alives to the discovery controller nvme: unexport nvme_start_keep_alive nvme-loop: fix kernel oops in case of unhandled command nvme: enforce 64bit offset for nvme_get_log_ext fn sr: get/drop reference to device in revalidate and check_events blk-mq: Revert "blk-mq: reimplement blk_mq_hw_queue_mapped" blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash backing: silence compiler warning using __printf blk-mq: remove code for dealing with remapping queue blk-mq: reimplement blk_mq_hw_queue_mapped blk-mq: don't check queue mapped in __blk_mq_delay_run_hw_queue() ...
2018-04-14Merge branch 'i2c/for-current' of ↵Linus Torvalds3-6/+47
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull more i2c updates from Wolfram Sang: - hot bugfix for i801 to make laptops with strange BIOS reboot again when using SMBUS Host notify - change to MAINTAINERS creating a specific fallback entry for I2C host drivers and settings its status to "Odd fixes" - a long overdue param checking for the I2C core * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: add param sanity check to i2c_transfer() MAINTAINERS: add maintainer for Renesas I2C related drivers MAINTAINERS: remove me as maintainer for I2C host drivers i2c: i801: Restore configuration at shutdown i2c: i801: Save register SMBSLVCMD value only once
2018-04-13Merge tag 'sh-for-4.17' of git://git.libc.org/linux-shLinus Torvalds9-38/+84
Pull arch/sh updates from Rich Felker: "Fixes for bugs in futex, device tree, and userspace breakpoint traps, and for PCI issues on SH7786" * tag 'sh-for-4.17' of git://git.libc.org/linux-sh: arch/sh: pcie-sh7786: handle non-zero DMA offset arch/sh: pcie-sh7786: adjust the memory mapping arch/sh: pcie-sh7786: adjust PCI MEM and IO regions arch/sh: pcie-sh7786: exclude unusable PCI MEM areas arch/sh: pcie-sh7786: mark unavailable PCI resource as disabled arch/sh: pci: don't use disabled resources arch/sh: make the DMA mapping operations observe dev->dma_pfn_offset arch/sh: add sh7786_mm_sel() function sh: fix debug trap failure to process signals before return to user sh: fix memory corruption of unflattened device tree sh: fix futex FUTEX_OP_SET op on userspace addresses
2018-04-13Merge tag 'arm64-upstream' of ↵Linus Torvalds10-199/+242
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull more arm64 updates from Will Deacon: "A few late updates to address some issues arising from conflicts with other trees: - Removal of Qualcomm-specific Spectre-v2 mitigation in favour of the generic SMCCC-based firmware call - Fix EL2 hardening capability checking, which was bodged to reduce conflicts with the KVM tree - Add some currently unused assembler macros for managing SIMD registers which will be used by some crypto code in the next merge window" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: assembler: add macros to conditionally yield the NEON under PREEMPT arm64: assembler: add utility macros to push/pop stack frames arm64: Move the content of bpi.S to hyp-entry.S arm64: Get rid of __smccc_workaround_1_hvc_* arm64: capabilities: Rework EL2 vector hardening entry arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
2018-04-13Merge branch 'for-linus' of ↵Linus Torvalds30-1358/+357
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Martin Schwidefsky: "Three notable larger changes next to the usual bug fixing: - update the email addresses in MAINTAINERS for the s390 folks to use the simpler linux.ibm.com domain instead of the old linux.vnet.ibm.com - an update for the zcrypt device driver that removes some old and obsolete interfaces and add support for up to 256 crypto adapters - a rework of the IPL aka boot code" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (23 commits) s390: correct nospec auto detection init order s390/zcrypt: Support up to 256 crypto adapters. s390/zcrypt: Remove deprecated zcrypt proc interface. s390/zcrypt: Remove deprecated ioctls. s390/zcrypt: Make ap init functions static. MAINTAINERS: update s390 maintainers email addresses s390/ipl: remove reipl_method and dump_method s390/ipl: correct kdump reipl block checksum calculation s390/ipl: remove non-existing functions declaration s390: assume diag308 set always works s390/ipl: avoid adding scpdata to cmdline during ftp/dvd boot s390/ipl: correct ipl parmblock valid checks s390/ipl: rely on diag308 store to get ipl info s390/ipl: move ipl_flags to ipl.c s390/ipl: get rid of ipl_ssid and ipl_devno s390/ipl: unite diag308 and scsi boot ipl blocks s390/ipl: ensure loadparm valid flag is set s390/qdio: lock device while installing IRQ handler s390/qdio: clear intparm during shutdown s390/ccwgroup: require at least one ccw device ...
2018-04-13kconfig: extend output of 'listnewconfig'Don Zickus1-2/+12
We at Red Hat/Fedora have generally tried to have a per file breakdown of every config option we set. This makes it easy for us to add new options when they are exposed and keep a changelog of why they were set. A Fedora example is here: https://src.fedoraproject.org/cgit/rpms/kernel.git/tree/configs/fedora/generic Using various merge scripts, we build up a config file and run it through 'make listnewconfig' and 'make oldnoconfig'. The idea is to print out new config options that haven't been manually set and use the default until a patch is posted to set it properly. To speed things up, it would be nice to make it easier to generate a patch to post the default setting. The output of 'make listnewconfig' has two issues that limit us: - it doesn't provide the default value - it doesn't provide the new 'choice' options that get flagged in 'oldconfig' This patch extends 'listnewconfig' to address the above two issues. This allows us to run a script make listnewconfig | rhconfig-tool -o patches; git send-email patches/ The output of 'make listnewconfig': CONFIG_NET_EMATCH_IPT CONFIG_IPVLAN CONFIG_ICE CONFIG_NET_VENDOR_NI CONFIG_IEEE802154_MCR20A CONFIG_IR_IMON_DECODER CONFIG_IR_IMON_RAW The new output of 'make listnewconfig': CONFIG_KERNEL_XZ=n CONFIG_KERNEL_LZO=n CONFIG_NET_EMATCH_IPT=n CONFIG_IPVLAN=n CONFIG_ICE=n CONFIG_NET_VENDOR_NI=y CONFIG_IEEE802154_MCR20A=n CONFIG_IR_IMON_DECODER=n CONFIG_IR_IMON_RAW=n Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-04-13kbuild: rpm-pkg: use kernel-install as a fallback for new-kernel-pkgJavier Martinez Canillas1-0/+2
The new-kernel-pkg script is only present when grubby is installed, but it may not always be the case. So if the script isn't present, attempt to use the kernel-install script as a fallback instead. Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-04-13btrfs: Only check first key for committed tree blocksQu Wenruo1-0/+8
When looping btrfs/074 with many cpus (>= 8), it's possible to trigger kernel warning due to first key verification: [ 4239.523446] WARNING: CPU: 5 PID: 2381 at fs/btrfs/disk-io.c:460 btree_read_extent_buffer_pages+0x1ad/0x210 [ 4239.523830] Modules linked in: [ 4239.524630] RIP: 0010:btree_read_extent_buffer_pages+0x1ad/0x210 [ 4239.527101] Call Trace: [ 4239.527251] read_tree_block+0x42/0x70 [ 4239.527434] read_node_slot+0xd2/0x110 [ 4239.527632] push_leaf_right+0xad/0x1b0 [ 4239.527809] split_leaf+0x4ea/0x700 [ 4239.527988] ? leaf_space_used+0xbc/0xe0 [ 4239.528192] ? btrfs_set_lock_blocking_rw+0x99/0xb0 [ 4239.528416] btrfs_search_slot+0x8cc/0xa40 [ 4239.528605] btrfs_insert_empty_items+0x71/0xc0 [ 4239.528798] __btrfs_run_delayed_refs+0xa98/0x1680 [ 4239.529013] btrfs_run_delayed_refs+0x10b/0x1b0 [ 4239.529205] btrfs_commit_transaction+0x33/0xaf0 [ 4239.529445] ? start_transaction+0xa8/0x4f0 [ 4239.529630] btrfs_alloc_data_chunk_ondemand+0x1b0/0x4e0 [ 4239.529833] btrfs_check_data_free_space+0x54/0xa0 [ 4239.530045] btrfs_delalloc_reserve_space+0x25/0x70 [ 4239.531907] btrfs_direct_IO+0x233/0x3d0 [ 4239.532098] generic_file_direct_write+0xcb/0x170 [ 4239.532296] btrfs_file_write_iter+0x2bb/0x5f4 [ 4239.532491] aio_write+0xe2/0x180 [ 4239.532669] ? lock_acquire+0xac/0x1e0 [ 4239.532839] ? __might_fault+0x3e/0x90 [ 4239.533032] do_io_submit+0x594/0x860 [ 4239.533223] ? do_io_submit+0x594/0x860 [ 4239.533398] SyS_io_submit+0x10/0x20 [ 4239.533560] ? SyS_io_submit+0x10/0x20 [ 4239.533729] do_syscall_64+0x75/0x1d0 [ 4239.533979] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [ 4239.534182] RIP: 0033:0x7f8519741697 The problem here is, at btree_read_extent_buffer_pages() we don't have acquired read/write lock on that extent buffer, only basic info like level/bytenr is reliable. So race condition leads to such false alert. However in current call site, it's impossible to acquire proper lock without race window. To fix the problem, we only verify first key for committed tree blocks (whose generation is no larger than fs_info->last_trans_committed), so the content of such tree blocks will not change and there is no need to get read/write lock. Reported-by: Nikolay Borisov <nborisov@suse.com> Fixes: 581c1760415c ("btrfs: Validate child tree block's level and first key") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-13powerpc/64s: Fix CPU_FTRS_ALWAYS vs DT CPU featuresMichael Ellerman2-15/+22
The cpu_has_feature() mechanism has an optimisation where at build time we construct a mask of the CPU feature bits that will always be true for the given .config, based on the platform/bitness/etc. that we are building for. That is incompatible with DT CPU features, where the set of CPU features is dependent on feature flags that are given to us by firmware. The result is that some feature bits can not be *disabled* by DT CPU features. Or more accurately, they can be disabled but they will still appear in the ALWAYS mask, meaning cpu_has_feature() will always return true for them. In the past this hasn't really been a problem because on Book3S 64 (where we support DT CPU features), the set of ALWAYS bits has been very small. That was because we always built for POWER4 and later, meaning the set of common bits was small. The only bit that could be cleared by DT CPU features that was also in the ALWAYS mask was CPU_FTR_NODSISRALIGN, and that was only used in the alignment handler to create a fake DSISR. That code was itself deleted in 31bfdb036f12 ("powerpc: Use instruction emulation infrastructure to handle alignment faults") (Sep 2017). However the set of ALWAYS features changed with the recent commit db5ae1c155af ("powerpc/64s: Refine feature sets for little endian builds") which restricted the set of feature flags when building little endian to Power7 or later. That caused the ALWAYS mask to become much larger for little endian builds. The result is that the following feature bits can currently not be *disabled* by DT CPU features: CPU_FTR_REAL_LE, CPU_FTR_MMCRA, CPU_FTR_CTRL, CPU_FTR_SMT, CPU_FTR_PURR, CPU_FTR_SPURR, CPU_FTR_DSCR, CPU_FTR_PKEY, CPU_FTR_VMX_COPY, CPU_FTR_CFAR, CPU_FTR_HAS_PPR. To fix it we need to mask the set of ALWAYS features with the base set of DT CPU features, ie. the features that are always enabled by DT CPU features. That way there are no bits in the ALWAYS mask that are not also always set by DT CPU features. Fixes: db5ae1c155af ("powerpc/64s: Refine feature sets for little endian builds") Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-13firmware: dmi_scan: Use lowercase letters for UUIDJean Delvare1-2/+2
RFC 4122 asks for letters a-f in UUID to be lowercase. Follow this recommendation. Suggested by Paul Dagnelie at: https://savannah.nongnu.org/bugs/index.php?53569 Signed-off-by: Jean Delvare <jdelvare@suse.de>
2018-04-13firmware: dmi_scan: Add DMI_OEM_STRING support to dmi_matchesAlex Hung2-1/+10
OEM strings are defined by each OEM and they contain customized and useful OEM information. Supporting it provides more flexible uses of the dmi_matches function. Signed-off-by: Alex Hung <alex.hung@canonical.com> Signed-off-by: Jean Delvare <jdelvare@suse.de>
2018-04-13firmware: dmi_scan: Fix UUID length safety checkJean Delvare1-1/+1
The test which ensures that the DMI type 1 structure is long enough to hold the UUID is off by one. It would fail if the structure is exactly 24 bytes long, while that's sufficient to hold the UUID. I don't expect this bug to cause problem in practice because all implementations I have seen had length 8, 25 or 27 bytes, in line with the SMBIOS specifications. But let's fix it still. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: a814c3597a6b ("firmware: dmi_scan: Check DMI structure length") Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2018-04-13Merge branches 'thermal-core' and 'thermal-soc' into nextZhang Rui1-3/+3
2018-04-13Merge tag 'drm-fixes-for-v4.17-rc1' of ↵Linus Torvalds40-1640/+432
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "One omap, and one alsa pm fix (we merged the breaking patch via drm tree). Otherwise it's two bunches of amdgpu fixes, removing an unneeded file, some DC fixes, HDMI audio regression fix, and some vega12 fixes" * tag 'drm-fixes-for-v4.17-rc1' of git://people.freedesktop.org/~airlied/linux: (27 commits) Revert "drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)" Revert "drm/amd/display: fix dereferencing possible ERR_PTR()" drm/amd/display: Fix regamma not affecting full-intensity color values drm/amd/display: Fix FBC text console corruption drm/amd/display: Only register backlight device if embedded panel connected drm/amd/display: fix brightness level after resume from suspend drm/amd/display: HDMI has no sound after Panel power off/on drm/amdgpu: add MP1 and THM hw ip base reg offset drm/amdgpu: fix null pointer panic with direct fw loading on gpu reset drm/radeon: add PX quirk for Asus K73TK drm/omap: fix crash if there's no video PLL drm/amdgpu: Fix memory leaks at amdgpu_init() error path drm/amdgpu: Fix PCIe lane width calculation drm/radeon: Fix PCIe lane width calculation drm/amdgpu/si: implement get/set pcie_lanes asic callback drm/amdgpu: Add support for SRBM selection v3 Revert "drm/amdgpu: Don't change preferred domian when fallback GTT v5" drm/amd/powerply: fix power reading on Fiji drm/amd/powerplay: Enable ACG SS feature drm/amdgpu/sdma: fix mask in emit_pipeline_sync ...
2018-04-13Merge tag 'trace-v4.17-2' of ↵Linus Torvalds2-34/+15
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "A few clean ups and bug fixes: - replace open coded "ARRAY_SIZE()" with macro - updates to uprobes - bug fix for perf event filter on error path" * tag 'trace-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Enforce passing in filter=NULL to create_filter() trace_uprobe: Simplify probes_seq_show() trace_uprobe: Use %lx to display offset tracing/uprobe: Add support for overlayfs tracing: Use ARRAY_SIZE() macro instead of open coding it
2018-04-13proc: fixup copyright signAlexey Dobriyan9-7/+37
Add copyright in two files before they get autorubberstamped. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>