kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2018-11-10	ACPI: sysfs: Make ACPI GPE mask kernel parameter cover all GPEs	Prarit Bhargava	1	-1/+0
	[ Upstream commit 0f27cff8597d86f881ea8274b49b63b678c14a3c ] The acpi_mask_gpe= kernel parameter documentation states that the range of mask is 128 GPEs (0x00 to 0x7F). The acpi_masked_gpes mask is a u64 so only 64 GPEs (0x00 to 0x3F) can really be masked. Use a bitmap of size 0xFF instead of a u64 for the GPE mask so 256 GPEs can be masked. Fixes: 9c4aa1eecb48 (ACPI / sysfs: Provide quirk mechanism to prevent GPE flooding) Signed-off-by: Prarit Bharava <prarit@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-10-13	x86/fpu: Finish excising 'eagerfpu'	Andy Lutomirski	1	-6/+0
	commit e63650840e8b053aa09ad934877e87e9941ed135 upstream. Now that eagerfpu= is gone, remove it from the docs and some comments. Also sync the changes to tools/. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/cf430dd4481d41280e93ac6cf0def1007a67fc8e.1476740397.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Daniel Sangorrin <daniel.sangorrin@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15	x86/bugs, kvm: Introduce boot-time control of L1TF mitigations	Jiri Kosina	1	-6/+62
	commit d90a7a0ec83fb86622cd7dae23255d3c50a99ec8 upstream Introduce the 'l1tf=' kernel command line option to allow for boot-time switching of mitigation that is used on processors affected by L1TF. The possible values are: full Provides all available mitigations for the L1TF vulnerability. Disables SMT and enables all mitigations in the hypervisors. SMT control via /sys/devices/system/cpu/smt/control is still possible after boot. Hypervisors will issue a warning when the first VM is started in a potentially insecure configuration, i.e. SMT enabled or L1D flush disabled. full,force Same as 'full', but disables SMT control. Implies the 'nosmt=force' command line option. sysfs control of SMT and the hypervisor flush control is disabled. flush Leaves SMT enabled and enables the conditional hypervisor mitigation. Hypervisors will issue a warning when the first VM is started in a potentially insecure configuration, i.e. SMT enabled or L1D flush disabled. flush,nosmt Disables SMT and enables the conditional hypervisor mitigation. SMT control via /sys/devices/system/cpu/smt/control is still possible after boot. If SMT is reenabled or flushing disabled at runtime hypervisors will issue a warning. flush,nowarn Same as 'flush', but hypervisors will not warn when a VM is started in a potentially insecure configuration. off Disables hypervisor mitigations and doesn't emit any warnings. Default is 'flush'. Let KVM adhere to these semantics, which means: - 'lt1f=full,force' : Performe L1D flushes. No runtime control possible. - 'l1tf=full' - 'l1tf-flush' - 'l1tf=flush,nosmt' : Perform L1D flushes and warn on VM start if SMT has been runtime enabled or L1D flushing has been run-time enabled - 'l1tf=flush,nowarn' : Perform L1D flushes and no warnings are emitted. - 'l1tf=off' : L1D flushes are not performed and no warnings are emitted. KVM can always override the L1D flushing behavior using its 'vmentry_l1d_flush' module parameter except when lt1f=full,force is set. This makes KVM's private 'nosmt' option redundant, and as it is a bit non-systematic anyway (this is something to control globally, not on hypervisor level), remove that option. Add the missing Documentation entry for the l1tf vulnerability sysfs file while at it. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20180713142323.202758176@linutronix.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15	x86/KVM/VMX: Add module argument for L1TF mitigation	Konrad Rzeszutek Wilk	1	-0/+12
	commit a399477e52c17e148746d3ce9a483f681c2aa9a0 upstream Add a mitigation mode parameter "vmentry_l1d_flush" for CVE-2018-3620, aka L1 terminal fault. The valid arguments are: - "always" L1D cache flush on every VMENTER. - "cond" Conditional L1D cache flush, explained below - "never" Disable the L1D cache flush mitigation "cond" is trying to avoid L1D cache flushes on VMENTER if the code executed between VMEXIT and VMENTER is considered safe, i.e. is not bringing any interesting information into L1D which might exploited. [ tglx: Split out from a larger patch ] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15	x86/KVM: Warn user if KVM is loaded SMT and L1TF CPU bug being present	Konrad Rzeszutek Wilk	1	-0/+6
	commit 26acfb666a473d960f0fd971fe68f3e3ad16c70b upstream If the L1TF CPU bug is present we allow the KVM module to be loaded as the major of users that use Linux and KVM have trusted guests and do not want a broken setup. Cloud vendors are the ones that are uncomfortable with CVE 2018-3620 and as such they are the ones that should set nosmt to one. Setting 'nosmt' means that the system administrator also needs to disable SMT (Hyper-threading) in the BIOS, or via the 'nosmt' command line parameter, or via the /sys/devices/system/cpu/smt/control. See commit 05736e4ac13c ("cpu/hotplug: Provide knobs to control SMT"). Other mitigations are to use task affinity, cpu sets, interrupt binding, etc - anything to make sure that _only_ the same guests vCPUs are running on sibling threads. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15	Revert "x86/apic: Ignore secondary threads if nosmt=force"	Thomas Gleixner	1	-6/+2
	commit 506a66f374891ff08e064a058c446b336c5ac760 upstream Dave Hansen reported, that it's outright dangerous to keep SMT siblings disabled completely so they are stuck in the BIOS and wait for SIPI. The reason is that Machine Check Exceptions are broadcasted to siblings and the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or reboots the machine. The MCE chapter in the SDM contains the following blurb: Because the logical processors within a physical package are tightly coupled with respect to shared hardware resources, both logical processors are notified of machine check errors that occur within a given physical processor. If machine-check exceptions are enabled when a fatal error is reported, all the logical processors within a physical package are dispatched to the machine-check exception handler. If machine-check exceptions are disabled, the logical processors enter the shutdown state and assert the IERR# signal. When enabling machine-check exceptions, the MCE flag in control register CR4 should be set for each logical processor. Reverting the commit which ignores siblings at enumeration time solves only half of the problem. The core cpuhotplug logic needs to be adjusted as well. This thoughtful engineered mechanism also turns the boot process on all Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU before the secondary CPUs are brought up. Depending on the number of physical cores the window in which this situation can happen is smaller or larger. On a HSW-EX it's about 750ms: MCE is enabled on the boot CPU: [ 0.244017] mce: CPU supports 22 MCE banks The corresponding sibling #72 boots: [ 1.008005] .... node #0, CPUs: #72 That means if an MCE hits on physical core 0 (logical CPUs 0 and 72) between these two points the machine is going to shutdown. At least it's a known safe state. It's obvious that the early boot can be hit by an MCE as well and then runs into the same situation because MCEs are not yet enabled on the boot CPU. But after enabling them on the boot CPU, it does not make any sense to prevent the kernel from recovering. Adjust the nosmt kernel parameter documentation as well. Reverts: 2207def700f9 ("x86/apic: Ignore secondary threads if nosmt=force") Reported-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-15	cpu/hotplug: Provide knobs to control SMT	Thomas Gleixner	1	-0/+8
	commit 05736e4ac13c08a4a9b1ef2de26dd31a32cbee57 upstream Provide a command line and a sysfs knob to control SMT. The command line options are: 'nosmt': Enumerate secondary threads, but do not online them 'nosmt=force': Ignore secondary threads completely during enumeration via MP table and ACPI/MADT. The sysfs control file has the following states (read/write): 'on': SMT is enabled. Secondary threads can be freely onlined 'off': SMT is disabled. Secondary threads, even if enumerated cannot be onlined 'forceoff': SMT is permanentely disabled. Writes to the control file are rejected. 'notsupported': SMT is not supported by the CPU The command line option 'nosmt' sets the sysfs control to 'off'. This can be changed to 'on' to reenable SMT during runtime. The command line option 'nosmt=force' sets the sysfs control to 'forceoff'. This cannot be changed during runtime. When SMT is 'on' and the control file is changed to 'off' then all online secondary threads are offlined and attempts to online a secondary thread later on are rejected. When SMT is 'off' and the control file is changed to 'on' then secondary threads can be onlined again. The 'off' -> 'on' transition does not automatically online the secondary threads. When the control file is set to 'forceoff', the behaviour is the same as setting it to 'off', but the operation is irreversible and later writes to the control file are rejected. When the control status is 'notsupported' then writes to the control file are rejected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22	arm64: Add 'ssbd' command-line option	Marc Zyngier	1	-0/+17
	commit a43ae4dfe56a01f5b98ba0cb2f784b6a43bafcc6 upstream. On a system where the firmware implements ARCH_WORKAROUND_2, it may be useful to either permanently enable or disable the workaround for cases where the user decides that they'd rather not get a trap overhead, and keep the mitigation permanently on or off instead of switching it on exception entry/exit. In any case, default to the mitigation being enabled. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22	x86/speculation: Make "seccomp" the default mode for Speculative Store Bypass	Kees Cook	1	-9/+17
	commit f21b53b20c754021935ea43364dbf53778eeba32 upstream Unless explicitly opted out of, anything running under seccomp will have SSB mitigations enabled. Choosing the "prctl" mode will disable this. [ tglx: Adjusted it to the new arch_seccomp_spec_mitigate() mechanism ] Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22	x86/speculation: Add prctl for Speculative Store Bypass mitigation	Thomas Gleixner	1	-1/+5
	commit a73ec77ee17ec556fe7f165d00314cb7c047b1ac upstream Add prctl based control for Speculative Store Bypass mitigation and make it the default mitigation for Intel and AMD. Andi Kleen provided the following rationale (slightly redacted): There are multiple levels of impact of Speculative Store Bypass: 1) JITed sandbox. It cannot invoke system calls, but can do PRIME+PROBE and may have call interfaces to other code 2) Native code process. No protection inside the process at this level. 3) Kernel. 4) Between processes. The prctl tries to protect against case (1) doing attacks. If the untrusted code can do random system calls then control is already lost in a much worse way. So there needs to be system call protection in some way (using a JIT not allowing them or seccomp). Or rather if the process can subvert its environment somehow to do the prctl it can already execute arbitrary code, which is much worse than SSB. To put it differently, the point of the prctl is to not allow JITed code to read data it shouldn't read from its JITed sandbox. If it already has escaped its sandbox then it can already read everything it wants in its address space, and do much worse. The ability to control Speculative Store Bypass allows to enable the protection selectively without affecting overall system performance. Based on an initial patch from Tim Chen. Completely rewritten. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-22	x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation	Konrad Rzeszutek Wilk	1	-0/+33
	commit 24f7fc83b9204d20f878c57cb77d261ae825e033 upstream Contemporary high performance processors use a common industry-wide optimization known as "Speculative Store Bypass" in which loads from addresses to which a recent store has occurred may (speculatively) see an older value. Intel refers to this feature as "Memory Disambiguation" which is part of their "Smart Memory Access" capability. Memory Disambiguation can expose a cache side-channel attack against such speculatively read values. An attacker can create exploit code that allows them to read memory outside of a sandbox environment (for example, malicious JavaScript in a web page), or to perform more complex attacks against code running within the same privilege level, e.g. via the stack. As a first step to mitigate against such attacks, provide two boot command line control knobs: nospec_store_bypass_disable spec_store_bypass_disable=[off,auto,on] By default affected x86 processors will power on with Speculative Store Bypass enabled. Hence the provided kernel parameters are written from the point of view of whether to enable a mitigation or not. The parameters are as follows: - auto - Kernel detects whether your CPU model contains an implementation of Speculative Store Bypass and picks the most appropriate mitigation. - on - disable Speculative Store Bypass - off - enable Speculative Store Bypass [ tglx: Reordered the checks so that the whole evaluation is not done when the CPU does not support RDS ] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29	s390: introduce CPU alternatives	Vasily Gorbik	1	-0/+3
	[ Upstream commit 686140a1a9c41d85a4212a1c26d671139b76404b ] Implement CPU alternatives, which allows to optionally patch newer instructions at runtime, based on CPU facilities availability. A new kernel boot parameter "noaltinstr" disables patching. Current implementation is derived from x86 alternatives. Although ideal instructions padding (when altinstr is longer then oldinstr) is added at compile time, and no oldinstr nops optimization has to be done at runtime. Also couple of compile time sanity checks are done: 1. oldinstr and altinstr must be <= 254 bytes long, 2. oldinstr and altinstr must not have an odd length. alternative(oldinstr, altinstr, facility); alternative_2(oldinstr, altinstr1, facility1, altinstr2, facility2); Both compile time and runtime padding consists of either 6/4/2 bytes nop or a jump (brcl) + 2 bytes nop filler if padding is longer then 6 bytes. .altinstructions and .altinstr_replacement sections are part of __init_begin : __init_end region and are freed after initialization. Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-13	x86/paravirt: Remove 'noreplace-paravirt' cmdline option	Josh Poimboeuf	1	-2/+0
	(cherry picked from commit 12c69f1e94c89d40696e83804dd2f0965b5250cd) The 'noreplace-paravirt' option disables paravirt patching, leaving the original pv indirect calls in place. That's highly incompatible with retpolines, unless we want to uglify paravirt even further and convert the paravirt calls to retpolines. As far as I can tell, the option doesn't seem to be useful for much other than introducing surprising corner cases and making the kernel vulnerable to Spectre v2. It was probably a debug option from the early paravirt days. So just remove it. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jason Baron <jbaron@akamai.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Link: https://lkml.kernel.org/r/20180131041333.2x6blhxirc2kclrq@treble Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-17	x86/spectre: Add boot time option to select Spectre v2 mitigation	David Woodhouse	1	-0/+28
	commit da285121560e769cc31797bba6422eea71d473e0 upstream. Add a spectre_v2= option to select the mitigation used for the indirect branch speculation vulnerability. Currently, the only option available is retpoline, in its various forms. This will be expanded to cover the new IBRS/IBPB microcode features. The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a serializing instruction, which is indicated by the LFENCE_RDTSC feature. [ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS integration becomes simple ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-17	x86/Documentation: Add PTI description	Dave Hansen	1	-7/+14
	commit 01c9b17bf673b05bb401b76ec763e9730ccf1376 upstream. Add some details about how PTI works, what some of the downsides are, and how to debug it when things go wrong. Also document the kernel parameter: 'pti/nopti'. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Moritz Lipp <moritz.lipp@iaik.tugraz.at> Cc: Daniel Gruss <daniel.gruss@iaik.tugraz.at> Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at> Cc: Richard Fellner <richard.fellner@student.tugraz.at> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Andi Lutomirsky <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180105174436.1BC6FA2B@viggo.jf.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-05	x86/kaiser: Check boottime cmdline params	Borislav Petkov	1	-0/+6
	AMD (and possibly other vendors) are not affected by the leak KAISER is protecting against. Keep the "nopti" for traditional reasons and add pti=<on\|off\|auto> like upstream. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-05	x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling	Borislav Petkov	1	-1/+1
	Concentrate it in arch/x86/mm/kaiser.c and use the upstream string "nopti". Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-05	kaiser: add "nokaiser" boot option, using ALTERNATIVE	Hugh Dickins	1	-0/+2
	Added "nokaiser" boot option: an early param like "noinvpcid". Most places now check int kaiser_enabled (#defined 0 when not CONFIG_KAISER) instead of #ifdef CONFIG_KAISER; but entry_64.S and entry_64_compat.S are using the ALTERNATIVE technique, which patches in the preferred instructions at runtime. That technique is tied to x86 cpu features, so X86_FEATURE_KAISER is fabricated. Prior to "nokaiser", Kaiser #defined _PAGE_GLOBAL 0: revert that, but be careful with both _PAGE_GLOBAL and CR4.PGE: setting them when nokaiser like when !CONFIG_KAISER, but not setting either when kaiser - neither matters on its own, but it's hard to be sure that _PAGE_GLOBAL won't get set in some obscure corner, or something add PGE into CR4. By omitting _PAGE_GLOBAL from __supported_pte_mask when kaiser_enabled, all page table setup which uses pte_pfn() masks it out of the ptes. It's slightly shameful that the same declaration versus definition of kaiser_enabled appears in not one, not two, but in three header files (asm/kaiser.h, asm/pgtable.h, asm/tlbflush.h). I felt safer that way, than with #including any of those in any of the others; and did not feel it worth an asm/kaiser_enabled.h - kernel/cpu/common.c includes them all, so we shall hear about it if they get out of synch. Cleanups while in the area: removed the silly #ifdef CONFIG_KAISER from kaiser.c; removed the unused native_get_normal_pgd(); removed the spurious reg clutter from SWITCH_*_CR3 macro stubs; corrected some comments. But more interestingly, set CR4.PSE in secondary_startup_64: the manual is clear that it does not matter whether it's 0 or 1 when 4-level-pts are enabled, but I was distracted to find cr4 different on BSP and auxiliaries - BSP alone was adding PSE, in probe_page_size_mask(). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-02	x86/mm: Add the 'nopcid' boot option to turn off PCID	Andy Lutomirski	1	-0/+2
	commit 0790c9aad84901ca1bdc14746175549c8b5da215 upstream. The parameter is only present on x86_64 systems to save a few bytes, as PCID is always disabled on x86_32. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Nadav Amit <nadav.amit@gmail.com> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/8bbb2e65bcd249a5f18bfb8128b4689f08ac2b60.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-24	mm: larger stack guard gap, between vmas	Hugh Dickins	1	-0/+7
	commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream. Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [wt: backport to 4.11: adjust context] [wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-12	ACPI / sysfs: Provide quirk mechanism to prevent GPE flooding	Lv Zheng	1	-0/+10
	[ Upstream commit 9c4aa1eecb48cfac18ed5e3aca9d9ae58fbafc11 ] Sometimes, the users may require a quirk to be provided from ACPI subsystem core to prevent a GPE from flooding. Normally, if a GPE cannot be dispatched, ACPICA core automatically prevents the GPE from firing. But there are cases the GPE is dispatched by _Lxx/_Exx provided via AML table, and OSPM is lacking of the knowledge to get _Lxx/_Exx correctly executed to handle the GPE, thus the GPE flooding may still occur. The existing quirk mechanism can be enabled/disabled using the following commands to prevent such kind of GPE flooding during runtime: # echo mask > /sys/firmware/acpi/interrupts/gpe00 # echo unmask > /sys/firmware/acpi/interrupts/gpe00 To avoid GPE flooding during boot, we need a boot stage mechanism. This patch provides such a boot stage quirk mechanism to stop this kind of GPE flooding. This patch doesn't fix any feature gap but since the new feature gaps could be found in the future endlessly, and can disappear if the feature gaps are filled, providing a boot parameter rather than a DMI table should suffice. Link: https://bugzilla.kernel.org/show_bug.cgi?id=53071 Link: https://bugzilla.kernel.org/show_bug.cgi?id=117481 Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/887793 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-26	x86/platform/goldfish: Prevent unconditional loading	Thomas Gleixner	1	-0/+4
	commit 47512cfd0d7a8bd6ab71d01cd89fca19eb2093eb upstream. The goldfish platform code registers the platform device unconditionally which causes havoc in several ways if the goldfish_pdev_bus driver is enabled: - Access to the hardcoded physical memory region, which is either not available or contains stuff which is completely unrelated. - Prevents that the interrupt of the serial port can be requested - In case of a spurious interrupt it goes into a infinite loop in the interrupt handler of the pdev_bus driver (which needs to be fixed seperately). Add a 'goldfish' command line option to make the registration opt-in when the platform is compiled in. I'm seriously grumpy about this engineering trainwreck, which has seven SOBs from Intel developers for 50 lines of code. And none of them figured out that this is broken. Impressive fail! Fixes: ddd70cf93d78 ("goldfish: platform device for x86") Reported-by: Gabriel C <nix.or.die@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-26	swiotlb: Add swiotlb=noforce debug option	Geert Uytterhoeven	1	-1/+2
	commit fff5d99225107f5f13fe4a9805adc2a1c4b5fb00 upstream. On architectures like arm64, swiotlb is tied intimately to the core architecture DMA support. In addition, ZONE_DMA cannot be disabled. To aid debugging and catch devices not supporting DMA to memory outside the 32-bit address space, add a kernel command line option "swiotlb=noforce", which disables the use of bounce buffers. If specified, trying to map memory that cannot be used with DMA will fail, and a rate-limited warning will be printed. Note that io_tlb_nslabs is set to 1, which is the minimal supported value. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-14	Merge branch 'for-linus' of ↵	Linus Torvalds	1	-1/+8
	git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull some more input subsystem updates from Dmitry Torokhov: "An update to the ALPS driver to support the V8 protocol with touchstick, a change for i8042 to skip selftest on many Asus laptops which helps to keep their touchpads working after resume, and a couple other driver fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: i8042 - skip selftest on ASUS laptops Input: melfas_mip4 - add ic_name sysfs attribute Input: melfas_mip4 - add maintainer information Input: melfas_mip4 - add devicetree binding documentations Input: elantech - add Fujitsu Lifebook E556 to force crc_enabled Input: synaptics-rmi4 - fix error handling in I2C transport driver Input: synaptics-rmi4 - fix error handling in SPI transport driver Input: ALPS - add V8 protocol documentation Input: ALPS - set DualPoint flag for 74 03 28 devices Input: ALPS - allow touchsticks to report pressure Input: ALPS - handle 0-pressure 1F events Input: ALPS - add touchstick support for SS5 hardware Input: elantech - force needed quirks on Fujitsu H760 Input: elantech - fix Lenovo version typo
2016-10-14	Merge tag 'nfs-for-4.9-1' of git://git.linux-nfs.org/projects/anna/linux-nfs	Linus Torvalds	1	-0/+12
	Pull NFS client updates from Anna Schumaker: "Highlights include: Stable bugfixes: - sunrpc: fix writ espace race causing stalls - NFS: Fix inode corruption in nfs_prime_dcache() - NFSv4: Don't report revoked delegations as valid in nfs_have_delegation() - NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid - NFSv4: Open state recovery must account for file permission changes - NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic Features: - Add support for tracking multiple layout types with an ordered list - Add support for using multiple backchannel threads on the client - Add support for pNFS file layout session trunking - Delay xprtrdma use of DMA API (for device driver removal) - Add support for xprtrdma remote invalidation - Add support for larger xprtrdma inline thresholds - Use a scatter/gather list for sending xprtrdma RPC calls - Add support for the CB_NOTIFY_LOCK callback - Improve hashing sunrpc auth_creds by using both uid and gid Bugfixes: - Fix xprtrdma use of DMA API - Validate filenames before adding to the dcache - Fix corruption of xdr->nwords in xdr_copy_to_scratch - Fix setting buffer length in xdr_set_next_buffer() - Don't deadlock the state manager on the SEQUENCE status flags - Various delegation and stateid related fixes - Retry operations if an interrupted slot receives EREMOTEIO - Make nfs boot time y2038 safe" * tag 'nfs-for-4.9-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (100 commits) NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic fs: nfs: Make nfs boot time y2038 safe sunrpc: replace generic auth_cred hash with auth-specific function sunrpc: add RPCSEC_GSS hash_cred() function sunrpc: add auth_unix hash_cred() function sunrpc: add generic_auth hash_cred() function sunrpc: add hash_cred() function to rpc_authops struct Retry operation on EREMOTEIO on an interrupted slot pNFS: Fix atime updates on pNFS clients sunrpc: queue work on system_power_efficient_wq NFSv4.1: Even if the stateid is OK, we may need to recover the open modes NFSv4: If recovery failed for a specific open stateid, then don't retry NFSv4: Fix retry issues with nfs41_test/free_stateid NFSv4: Open state recovery must account for file permission changes NFSv4: Mark the lock and open stateids as invalid after freeing them NFSv4: Don't test open_stateid unless it is set NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid NFS: Always call nfs_inode_find_state_and_recover() when revoking a delegation NFSv4: Fix a race when updating an open_stateid NFSv4: Fix a race in nfs_inode_reclaim_delegation() ...
2016-10-12	Input: i8042 - skip selftest on ASUS laptops	Marcos Paulo de Souza	1	-1/+8
	On suspend/resume cycle, selftest is executed to reset i8042 controller. But when this is done in Asus devices, subsequent calls to detect/init functions to elantech driver fails. Skipping selftest fixes this problem. An easier step to reproduce this problem is adding i8042.reset=1 as a kernel parameter. On Asus laptops, it'll make the system to start with the touchpad already stuck, since psmouse_probe forcibly calls the selftest function. This patch was inspired by John Hiesey's change[1], but, since this problem affects a lot of models of Asus, let's avoid running selftests on them. All models affected by this problem: A455LD K401LB K501LB K501LX R409L V502LX X302LA X450LCP X450LD X455LAB X455LDB X455LF Z450LA [1]: https://marc.info/?l=linux-input&m=144312209020616&w=2 Fixes: "ETPS/2 Elantech Touchpad dies after resume from suspend" (https://bugzilla.kernel.org/show_bug.cgi?id=107971) Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2016-10-12	lib/bitmap.c: enhance bitmap syntax	Noam Camus	1	-14/+36
	Today there are platforms with many CPUs (up to 4K). Trying to boot only part of the CPUs may result in too long string. For example lets take NPS platform that is part of arch/arc. This platform have SMP system with 256 cores each with 16 HW threads (SMT machine) where HW thread appears as CPU to the kernel. In this example there is total of 4K CPUs. When one tries to boot only part of the HW threads from each core the string representing the map may be long... For example if for sake of performance we decided to boot only first half of HW threads of each core the map will look like: 0-7,16-23,32-39,...,4080-4087 This patch introduce new syntax to accommodate with such use case. I added an optional postfix to a range of CPUs which will choose according to given modulo the desired range of reminders i.e.: <cpus range>:sed_size/group_size For example, above map can be described in new syntax like this: 0-4095:8/16 Note that this patch is backward compatible with current syntax. [akpm@linux-foundation.org: rework documentation] Link: http://lkml.kernel.org/r/1473579629-4283-1-git-send-email-noamca@mellanox.com Signed-off-by: Noam Camus <noamca@mellanox.com> Cc: David Decotigny <decot@googlers.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: David S. Miller <davem@davemloft.net> Cc: Pan Xinhui <xinhui@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-10	Merge branch 'mm-pkeys-for-linus' of ↵	Linus Torvalds	1	-0/+5
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull protection keys syscall interface from Thomas Gleixner: "This is the final step of Protection Keys support which adds the syscalls so user space can actually allocate keys and protect memory areas with them. Details and usage examples can be found in the documentation. The mm side of this has been acked by Mel" * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pkeys: Update documentation x86/mm/pkeys: Do not skip PKRU register if debug registers are not used x86/pkeys: Fix pkeys build breakage for some non-x86 arches x86/pkeys: Add self-tests x86/pkeys: Allow configuration of init_pkru x86/pkeys: Default to a restrictive init PKRU pkeys: Add details of system call use to Documentation/ generic syscalls: Wire up memory protection keys syscalls x86: Wire up protection keys system calls x86/pkeys: Allocation/free syscalls x86/pkeys: Make mprotect_key() mask off additional vm_flags mm: Implement new pkey_mprotect() system call x86/pkeys: Add fault handling for PF_PK page fault bit
2016-10-06	Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	1	-0/+9
	Pull KVM updates from Radim Krčmář: "All architectures: - move `make kvmconfig` stubs from x86 - use 64 bits for debugfs stats ARM: - Important fixes for not using an in-kernel irqchip - handle SError exceptions and present them to guests if appropriate - proxying of GICV access at EL2 if guest mappings are unsafe - GICv3 on AArch32 on ARMv8 - preparations for GICv3 save/restore, including ABI docs - cleanups and a bit of optimizations MIPS: - A couple of fixes in preparation for supporting MIPS EVA host kernels - MIPS SMP host & TLB invalidation fixes PPC: - Fix the bug which caused guests to falsely report lockups - other minor fixes - a small optimization s390: - Lazy enablement of runtime instrumentation - up to 255 CPUs for nested guests - rework of machine check deliver - cleanups and fixes x86: - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery - Hyper-V TSC page - per-vcpu tsc_offset in debugfs - accelerated INS/OUTS in nVMX - cleanups and fixes" * tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits) KVM: MIPS: Drop dubious EntryHi optimisation KVM: MIPS: Invalidate TLB by regenerating ASIDs KVM: MIPS: Split kernel/user ASID regeneration KVM: MIPS: Drop other CPU ASIDs on guest MMU changes KVM: arm/arm64: vgic: Don't flush/sync without a working vgic KVM: arm64: Require in-kernel irqchip for PMU support KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie KVM: PPC: BookE: Fix a sanity check KVM: PPC: Book3S HV: Take out virtual core piggybacking code KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread ARM: gic-v3: Work around definition of gic_write_bpr1 KVM: nVMX: Fix the NMI IDT-vectoring handling KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive KVM: nVMX: Fix reload apic access page warning kvmconfig: add virtio-gpu to config fragment config: move x86 kvm_guest.config to a common location arm64: KVM: Remove duplicating init code for setting VMID ARM: KVM: Support vgic-v3 ...
2016-10-05	Merge tag 'gpio-v4.9-1' of ↵	Linus Torvalds	1	-0/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO changes for the v4.9 series: Subsystem improvements: - do away with the last users of the obsolete Kconfig options ARCH_REQUIRE_GPIOLIB and ARCH_WANT_OPTIONAL_GPIOLIB (the latter always sounded like an item on a wishlist to Santa Claus to me). We can now select GPIOLIB and be done with it, for all archs. After some struggle it even work on UM. Not that it has GPIO, but if it wants to, it can select the library. - continued efforts to make drivers properly either tristate or bool. - introduce a warning for drivers assigning default triggers to their irqchip lines when probed from device tree, so we find and fix these ambigous drivers. It is agreed that in the OF config path, the device tree defines trigger characteristics. - the same warning, mutatis mutandis, for ACPI-probed GPIO irqchips. - we introduce the ability to mark certain IRQ lines as "unusable" as they can be taken by BIOS/firmware, unrouted in silicon and generally nasty if you use them, and such things. This is put to good use in the STMPE driver and also in the Cherryview pin control driver. - a new "mockup" virtual GPIO device that can be used for testing. The plan is to add unit tests under tools/* for exercising this device and verify that the kernel code paths are working as they should. - make memory-mapped I/O-drivers depend on HAS_IOMEM. This was implicit all the time, but when people started building UM with allyesconfig or allmodconfig it exploded in their face. - move some stray bits of device tree and ACPI HW description callbacks down into their respective implementation silo. These were causing issues when compiling on !HAS_IOMEM as well, so now eventually UM compiles the GPIOLIB library if it wants to. New drivers: - new driver for the Aspeed GPIO front-end companion to the pin controller merged through the pin control tree. - new driver for the LP873x PMIC GPIO portions. - new driver for Technologic Systems' I2C FPGA GPIO such as TS4900, TS-7970, TS-7990 and TS-4100. - new driver for the Broadcom BCM63xx series including BCM6338 and BCM6345. - new driver for the Intel WhiskeyCove PMIC GPIO. - new driver for the Allwinner AXP209 PMIC GPIO portions. - new driver for Diamond Systems 48 line GPIO-MM, another of these port-mapped I/O expansion cards. - support the STMicroelectronics STMPE1600 variant in the STMPE driver. Driver improvements: - the STMPE driver now supports rising/falling edge detection properly for IRQs. - the PCA954x will now fetch and enable its VCC regulator properly. - major rework of the PCA953x driver with the goal of eventually switching it over to use regmap and thus modernize it even more. - switch the IOP driver to use the generic MMIO GPIO library. - move the ages old HTC EGPIO (extended GPIO) GPIO expander driver over to this subsystem from MFD, achieveing some separation of concerns" * tag 'gpio-v4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (81 commits) gpio: add missing static inline gpio: OF: localize some gpiochip init functions gpio: acpi: separation of concerns gpio: OF: separation of concerns gpio: make memory-mapped drivers depend on HAS_IOMEM gpio: stmpe: use BIT() macro gpio: stmpe: forbid unused lines to be mapped as IRQs mfd/gpio: Move HTC GPIO driver to GPIO subsystem gpio: MAINTAINERS: Add an entry for GPIO mockup driver gpio/mockup: add virtual gpio device gpio: Added zynq specific check for special pins on bank zero gpio: axp209: Implement get_direction gpio: aspeed: remove redundant return value check gpio: loongson1: remove redundant return value check ARM: omap2: fix missing include gpio: tc3589x: fix up complaints on unsigned gpio: tc3589x: add .get_direction() and small cleanup gpio: f7188x: use gpiochip_get_data instead of container_of gpio: tps65218: use devm_gpiochip_add_data() for gpio registration gpio: aspeed: fix return value check in aspeed_gpio_probe() ...
2016-10-04	Merge tag 'docs-4.9' of git://git.lwn.net/linux	Linus Torvalds	1	-10/+14
	Pull documentation updates from Jonathan Corbet: "This is the documentation update pull for the 4.9 merge window. The Sphinx transition is still creating a fair amount of work. Here we have a number of fixes and, importantly, a proper PDF output solution, thanks to Jani Nikula, Mauro Carvalho Chehab and Markus Heiser. I've started a couple of new books: a driver API book (based on the old device-drivers.tmpl) and a development tools book. Both are meant to show how we can integrate together our existing documentation into a more coherent and accessible whole. It involves moving some stuff around and formatting changes, but, I think, the results are worth it. The good news is that most of our existing Documentation/.txt files are almost* in RST format already; the amount of messing around required is minimal. And, of course, there's the usual set of updates, typo fixes, and more" * tag 'docs-4.9' of git://git.lwn.net/linux: (120 commits) URL changed for Linux Foundation TAB dax : Fix documentation with respect to struct pages iio: Documentation: Correct the path used to create triggers. docs: Remove space-before-label guidance from CodingStyle docs-rst: add inter-document cross references Documentation/email-clients.txt: convert it to ReST markup Documentation/kernel-docs.txt: reorder based on timestamp Documentation/kernel-docs.txt: Add dates for online docs Documentation/kernel-docs.txt: get rid of broken docs Documentation/kernel-docs.txt: move in-kernel docs Documentation/kernel-docs.txt: remove more legacy references Documentation/kernel-docs.txt: add two published books Documentation/kernel-docs.txt: sort books per publication date Documentation/kernel-docs.txt: adjust LDD references Documentation/kernel-docs.txt: some improvements on the ReST output Documentation/kernel-docs.txt: Consistent indenting: 4 spaces Documentation/kernel-docs.txt: Add 4 paper/book references Documentation/kernel-docs.txt: Improve layouting of book list Documentation/kernel-docs.txt: Remove offline or outdated entries docs: Clean up bare :: lines ...
2016-10-04	Merge tag 'usb-4.9-rc1' of ↵	Linus Torvalds	1	-0/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull usb/phy/extcon updates from Greg KH: "Here is the big USB, and PHY, and extcon, patchsets for 4.9-rc1. Full details are in the shortlog, but generally a lot of new hardware support, usb gadget updates, and Wolfram's great cleanup of USB error message handling, making the kernel image a tad bit smaller. All of this has been in linux-next with no reported issues" * tag 'usb-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (343 commits) Revert "usbtmc: convert to devm_kzalloc" USB: serial: cp210x: Add ID for a Juniper console usb: Kconfig: using select for USB_COMMON dependency bluetooth: bcm203x: don't print error when allocating urb fails mmc: host: vub300: don't print error when allocating urb fails usb: hub: change CLEAR_FEATURE to SET_FEATURE usb: core: Introduce a USB port LED trigger USB: bcma: drop Northstar PHY 2.0 initialization code usb: core: hcd: add missing header dependencies usb: musb: da8xx: fix error handling message in probe usb: musb: Fix session based PM for first invalid VBUS usb: musb: Fix PM runtime for disconnect after unconfigure musb: Export musb_root_disconnect for use in modules usb: misc: legousbtower: Fix NULL pointer deference cdc-acm: hardening against malicious devices Revert "usb: gadget: NCM: Protect dev->port_usb using dev->lock" include: extcon: Fix compilation error caused because of incomplete merge MAINTAINERS: add tree entry for USB Serial phy-twl4030-usb: initialize charging-related stuff via pm_runtime phy-twl4030-usb: better handle musb_mailbox() failure ...
2016-10-04	Merge tag 'tty-4.9-rc1' of ↵	Linus Torvalds	1	-5/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty and serial updates from Greg KH: "Here is the big tty and serial patch set for 4.9-rc1. It also includes some drivers/dma/ changes, as those were needed by some serial drivers, and they were all acked by the DMA maintainer. Also in here is the long-suffering ACPI SPCR patchset, which was passed around from maintainer to maintainer like a hot-potato. Seems I was the sucker^Wlucky one. All of those patches have been acked by the various subsystem maintainers as well. All of this has been in linux-next with no reported issues" * tag 'tty-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (111 commits) Revert "serial: pl011: add console matching function" MAINTAINERS: update entry for atmel_serial driver serial: pl011: add console matching function ARM64: ACPI: enable ACPI_SPCR_TABLE ACPI: parse SPCR and enable matching console of/serial: move earlycon early_param handling to serial Revert "drivers/tty: Explicitly pass current to show_stack" tty: amba-pl011: Don't complain on -EPROBE_DEFER when no irq nios2: dts: 10m50: Add tx-threshold parameter serial: 8250: Set Altera 16550 TX FIFO Threshold serial: 8250: of: Load TX FIFO Threshold from DT Documentation: dt: serial: Add TX FIFO threshold parameter drivers/tty: Explicitly pass current to show_stack serial: imx: Fix DCD reading serial: stm32: mark symbols static where possible serial: xuartps: Add some register initialisation to cdns_early_console_setup() serial: xuartps: Removed unwanted checks while reading the error conditions serial: xuartps: Rewrite the interrupt handling logic serial: stm32: use mapbase instead of membase for DMA tty/serial: atmel: fix fractional baud rate computation ...
2016-10-03	Merge tag 'arm64-upstream' of ↵	Linus Torvalds	1	-0/+9
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "It's a bit all over the place this time with no "killer feature" to speak of. Support for mismatched cache line sizes should help people seeing whacky JIT failures on some SoCs, and the big.LITTLE perf updates have been a long time coming, but a lot of the changes here are cleanups. We stray outside arch/arm64 in a few areas: the arch/arm/ arch_timer workaround is acked by Russell, the DT/OF bits are acked by Rob, the arch_timer clocksource changes acked by Marc, CPU hotplug by tglx and jump_label by Peter (all CC'd). Summary: - Support for execute-only page permissions - Support for hibernate and DEBUG_PAGEALLOC - Support for heterogeneous systems with mismatches cache line sizes - Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug) - arm64 PMU perf updates, including cpumasks for heterogeneous systems - Set UTS_MACHINE for building rpm packages - Yet another head.S tidy-up - Some cleanups and refactoring, particularly in the NUMA code - Lots of random, non-critical fixes across the board" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (100 commits) arm64: tlbflush.h: add __tlbi() macro arm64: Kconfig: remove SMP dependence for NUMA arm64: Kconfig: select OF/ACPI_NUMA under NUMA config arm64: fix dump_backtrace/unwind_frame with NULL tsk arm/arm64: arch_timer: Use archdata to indicate vdso suitability arm64: arch_timer: Work around QorIQ Erratum A-008585 arm64: arch_timer: Add device tree binding for A-008585 erratum arm64: Correctly bounds check virt_addr_valid arm64: migrate exception table users off module.h and onto extable.h arm64: pmu: Hoist pmu platform device name arm64: pmu: Probe default hw/cache counters arm64: pmu: add fallback probe table MAINTAINERS: Update ARM PMU PROFILING AND DEBUGGING entry arm64: Improve kprobes test for atomic sequence arm64/kvm: use alternative auto-nop arm64: use alternative auto-nop arm64: alternative: add auto-nop infrastructure arm64: lse: convert lse alternatives NOP padding to use __nops arm64: barriers: introduce nops and __nops macros for NOP sequences arm64: sysreg: replace open-coded mrs_s/msr_s with {read,write}_sysreg_s ...
2016-09-27	serial: xuartps: Add some register initialisation to cdns_early_console_setup()	Scott Telford	1	-5/+6
	Add initialisation of control register and baud rate to cdns_early_console_setup(), required when running kernel standalone without a boot loader. Baud rate is only initialised when specified in earlycon command-line option, otherwise it is assumed this has been set by a boot loader. Updated Documentation/kernel-parameters.txt accordingly. Signed-off-by: Scott Telford <stelford@cadence.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-26	gpio/mockup: add virtual gpio device	Bamvor Jian Zhang	1	-0/+4
	This patch add basic structure of a virtual gpio device(gpio-mockup) for testing gpio subsystem. The tester could manipulate such device through userspace(sysfs or char device) and check the result from debugfs. Currently, it support one or more gpiochip(determined by module parameters with base,ngpio pair). One could test the overlap of different gpiochip and test the direction and/or output values of these chips. Signed-off-by: Kamlakant Patel <kamlakant.patel@broadcom.com> Signed-off-by: Bamvor Jian Zhang <bamvor.zhangjian@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-09-23	arm64: arch_timer: Work around QorIQ Erratum A-008585	Scott Wood	1	-0/+9
	Erratum A-008585 says that the ARM generic timer counter "has the potential to contain an erroneous value for a small number of core clock cycles every time the timer value changes". Accesses to TVAL (both read and write) are also affected due to the implicit counter read. Accesses to CVAL are not affected. The workaround is to reread TVAL and count registers until successive reads return the same value. Writes to TVAL are replaced with an equivalent write to CVAL. The workaround is to reread TVAL and count registers until successive reads return the same value, and when writing TVAL to retry until counter reads before and after the write return the same value. The workaround is enabled if the fsl,erratum-a008585 property is found in the timer node in the device tree. This can be overridden with the clocksource.arm_arch_timer.fsl-a008585 boot parameter, which allows KVM users to enable the workaround until a mechanism is implemented to automatically communicate this information. This erratum can be found on LS1043A and LS2080A. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Scott Wood <oss@buserror.net> [will: renamed read macro to reflect that it's not usually unstable] Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-19	NFSv4.x: Add kernel parameter to control the callback server	Trond Myklebust	1	-0/+12
	Add support for the kernel parameter nfs.callback_nr_threads to set the number of threads that will be assigned to the callback channel. Add support for the kernel parameter nfs.nfs.max_session_cb_slots to set the maximum size of the callback channel slot table. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-09-13	scsi: introduce a quirk for false cache reporting	Oliver Neukum	1	-0/+2
	Some SATA to USB bridges fail to cooperate with some drives resulting in no cache being present being reported to the host. That causes the host to skip sending a command to synchronize caches. That causes data loss when the drive is powered down. Signed-off-by: Oliver Neukum <oneukum@suse.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-09	x86/pkeys: Default to a restrictive init PKRU	Dave Hansen	1	-0/+5
	PKRU is the register that lets you disallow writes or all access to a given protection key. The XSAVE hardware defines an "init state" of 0 for PKRU: its most permissive state, allowing access/writes to everything. Since we start off all new processes with the init state, we start all processes off with the most permissive possible PKRU. This is unfortunate. If a thread is clone()'d [1] before a program has time to set PKRU to a restrictive value, that thread will be able to write to all data, no matter what pkey is set on it. This weakens any integrity guarantees that we want pkeys to provide. To fix this, we define a very restrictive PKRU to override the XSAVE-provided value when we create a new FPU context. We choose a value that only allows access to pkey 0, which is as restrictive as we can practically make it. This does not cause any practical problems with applications using protection keys because we require them to specify initial permissions for each key when it is allocated, which override the restrictive default. In the end, this ensures that threads which do not know how to manage their own pkey rights can not do damage to data which is pkey-protected. I would have thought this was a pretty contrived scenario, except that I heard a bug report from an MPX user who was creating threads in some very early code before main(). It may be crazy, but folks evidently _do_ it. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-arch@vger.kernel.org Cc: Dave Hansen <dave@sr71.net> Cc: mgorman@techsingularity.net Cc: arnd@arndb.de Cc: linux-api@vger.kernel.org Cc: linux-mm@kvack.org Cc: luto@kernel.org Cc: akpm@linux-foundation.org Cc: torvalds@linux-foundation.org Link: http://lkml.kernel.org/r/20160729163021.F3C25D4A@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-06	documentation/scsi: Remove nodisconnect parameter	Finn Thain	1	-2/+0
	The driver that used the 'nodisconnect' parameter was removed in commit 565bae6a4a8f ("[SCSI] 53c7xx: kill driver"). Related documentation was cleaned up in commit f37a7238d379 ("[SCSI] 53c7xx: fix removal fallout"), except for the remaining two mentions that are removed here. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-09-05	iommu/amd: Detect and enable guest vAPIC support	Suravee Suthikulpanit	1	-0/+9
	This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir, which can be used to specify different interrupt remapping mode for passthrough devices to VM guest: * legacy: Legacy interrupt remapping (w/ 32-bit IRTE) * vapic : Guest vAPIC interrupt remapping (w/ GA mode 128-bit IRTE) Note that in vapic mode, it can also supports legacy interrupt remapping for non-passthrough devices with the 128-bit IRTE. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-08-25	docs: kernel-parameter: Improve the description of nr_cpus and maxcpus	Baoquan He	1	-7/+13
	From the old description people still can't get what's the exact difference between nr_cpus and maxcpus. Especially in kdump kernel nr_cpus is always suggested if it's implemented in the ARCH. The reason is nr_cpus is used to limit the max number of possible cpu in system, the sum of already plugged cpus and hot plug cpus can't exceed its value. However maxcpus is used to limit how many cpus are allowed to be brought up during bootup. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-08-19	Update the maximum depth of C-state from 6 to 9	baolex.ni	1	-1/+1
	Hi Jon, This patch is an old one, we have corrected some minor issues on the newer one. Please only review the newest version from my last mail with this subject "[PATCH] ACPI: Update the maximum depth of C-state from 6 to 9". And I also attached it to this mail. Thanks, Baole On 7/11/2016 6:37 AM, Jonathan Corbet wrote: > On Mon, 4 Jul 2016 09:55:10 +0800 > "baolex.ni" <baolex.ni@intel.com> wrote: > >> Currently, CPUIDLE_STATE_MAX has been defined as 10 in the cpuidle head file, >> and max_cstate = CPUIDLE_STATE_MAX – 1, so 9 is the right maximum depth of C-state. >> This change is reflected in one place of the kernel-param file, >> but not in the other place where I suggest changing. >> >> Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> >> Signed-off-by: Baole Ni <baolex.ni@intel.com> > > So why are there two signoffs on a single-line patch? Which one of you > is the actual author? > > Thanks, > > jon > From cf5f8aa6885874f6490b11507d3c0c86fa0a11f4 Mon Sep 17 00:00:00 2001 From: Chuansheng Liu <chuansheng.liu@intel.com> Date: Mon, 4 Jul 2016 08:52:51 +0800 Subject: [PATCH] Update the maximum depth of C-state from 6 to 9 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Currently, CPUIDLE_STATE_MAX has been defined as 10 in the cpuidle head file, and max_cstate = CPUIDLE_STATE_MAX – 1, so 9 is the right maximum depth of C-state. This change is reflected in one place of the kernel-param file, but not in the other place where I suggest changing. Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Signed-off-by: Baole Ni <baolex.ni@intel.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2016-08-09	PCI: Update "pci=resource_alignment" documentation	Mathias Koehrer	1	-0/+4
	Some uio based PCI drivers, e.g., uio_cif, do not work if the assigned PCI memory resources are not page aligned. By using the kernel option "pci=resource_alignment=<align>@<bus>:<slot>.<func>" it is possible to request page alignment for memory resources of devices. However, this is cumbersome when using several devices, and the bus/slot/func addresses may change if devices are added to or removed from the system. Extend the "pci=resource_alignment" option so we can specify the relevant devices via PCI vendor, device, subvendor, and subdevice IDs. The specification of the devices via IDs is indicated by a leading string "pci:" as argument to "pci=resource_alignment". The format of the specification is pci:<vendor>:<device>[:<subvendor>:<subdevice>] Examples: pci=resource_alignment=4096@pci:8086:9c22:103c:198f pci=resource_alignment=pci:8086:9c22 # defaults to PAGE_SIZE align [bhelgaas: changelog, use actual vendor/device IDs in examples] Signed-off-by: Mathias Koehrer <mathias.koehrer@etas.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-08-07	Merge tag 'doc-4.8-fixes' of git://git.lwn.net/linux	Linus Torvalds	1	-2/+2
	Pull documentation fixes from Jonathan Corbet: "Three fixes for the docs build, including removing an annoying warning on 'make help' if sphinx isn't present" * tag 'doc-4.8-fixes' of git://git.lwn.net/linux: DocBook: use DOCBOOKS="" to ignore DocBooks instead of IGNORE_DOCBOOKS=1 Documenation: update cgroup's document path Documentation/sphinx: do not warn about missing tools in 'make help'
2016-08-05	Merge tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux	Linus Torvalds	1	-0/+6
	Pull nfsd updates from Bruce Fields: "Highlights: - Trond made a change to the server's tcp logic that allows a fast client to better take advantage of high bandwidth networks, but may increase the risk that a single client could starve other clients; a new sunrpc.svc_rpc_per_connection_limit parameter should help mitigate this in the (hopefully unlikely) event this becomes a problem in practice. - Tom Haynes added a minimal flex-layout pnfs server, which is of no use in production for now--don't build it unless you're doing client testing or further server development" * tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux: (32 commits) nfsd: remove some dead code in nfsd_create_locked() nfsd: drop unnecessary MAY_EXEC check from create nfsd: clean up bad-type check in nfsd_create_locked nfsd: remove unnecessary positive-dentry check nfsd: reorganize nfsd_create nfsd: check d_can_lookup in fh_verify of directories nfsd: remove redundant zero-length check from create nfsd: Make creates return EEXIST instead of EACCES SUNRPC: Detect immediate closure of accepted sockets SUNRPC: accept() may return sockets that are still in SYN_RECV nfsd: allow nfsd to advertise multiple layout types nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock nfsd/blocklayout: Make sure calculate signature/designator length aligned xfs: abstract block export operations from nfsd layouts SUNRPC: Remove unused callback xpo_adjust_wspace() SUNRPC: Change TCP socket space reservation SUNRPC: Add a server side per-connection limit SUNRPC: Micro optimisation for svc_data_ready SUNRPC: Call the default socket callbacks instead of open coding SUNRPC: lock the socket while detaching it ...
2016-08-04	Merge tag 'modules-next-for-linus' of ↵	Linus Torvalds	1	-0/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell: "The only interesting thing here is Jessica's patch to add ro_after_init support to modules. The rest are all trivia" * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: extable.h: add stddef.h so "NULL" definition is not implicit modules: add ro_after_init support jump_label: disable preemption around __module_text_address(). exceptions: fork exception table content from module.h into extable.h modules: Add kernel parameter to blacklist modules module: Do a WARN_ON_ONCE() for assert module mutex not held Documentation/module-signing.txt: Note need for version info if reusing a key module: Invalidate signatures on force-loaded modules module: Issue warnings when tainting kernel module: fix redundant test. module: fix noreturn attribute for __module_put_and_exit()
2016-08-04	modules: Add kernel parameter to blacklist modules	Prarit Bhargava	1	-0/+3
	Blacklisting a module in linux has long been a problem. The current procedure is to use rd.blacklist=module_name, however, that doesn't cover the case after the initramfs and before a boot prompt (where one is supposed to use /etc/modprobe.d/blacklist.conf to blacklist runtime loading). Using rd.shell to get an early prompt is hit-or-miss, and doesn't cover all situations AFAICT. This patch adds this functionality of permanently blacklisting a module by its name via the kernel parameter module_blacklist=module_name. [v2]: Rusty, use core_param() instead of __setup() which simplifies things. [v3]: Rusty, undo wreckage from strsep() [v4]: Rusty, simpler version of blacklisted() Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: linux-doc@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2016-08-04	Documenation: update cgroup's document path	seokhoon.yoon	1	-2/+2
	cgroup's document path is changed to "cgroup-v1". update it. Signed-off-by: seokhoon.yoon <iamyooon@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>