summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-09-23objtool: Handle another GCC stack pointer adjustment bugJosh Poimboeuf2-17/+32
The kbuild bot reported the following warning with GCC 4.4 and a randconfig: net/socket.o: warning: objtool: compat_sock_ioctl()+0x1083: stack state mismatch: cfa1=7+160 cfa2=-1+0 This is caused by another GCC non-optimization, where it backs up and restores the stack pointer for no apparent reason: 2f91: 48 89 e0 mov %rsp,%rax 2f94: 4c 89 e7 mov %r12,%rdi 2f97: 4c 89 f6 mov %r14,%rsi 2f9a: ba 20 00 00 00 mov $0x20,%edx 2f9f: 48 89 c4 mov %rax,%rsp This issue would have been happily ignored before the following commit: dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug") But now that objtool is paying attention to such stack pointer writes to/from a register, it needs to understand them properly. In this case that means recognizing that the "mov %rsp, %rax" instruction is potentially a backup of the stack pointer. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug") Link: http://lkml.kernel.org/r/8c7aa8e9a36fbbb6655d9d8e7cea58958c912da8.1505942196.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-17x86/mm/32: Load a sane CR3 before cpu_init() on secondary CPUsAndy Lutomirski1-6/+7
For unknown historical reasons (i.e. Borislav doesn't recall), 32-bit kernels invoke cpu_init() on secondary CPUs with initial_page_table loaded into CR3. Then they set current->active_mm to &init_mm and call enter_lazy_tlb() before fixing CR3. This means that the x86 TLB code gets invoked while CR3 is inconsistent, and, with the improved PCID sanity checks I added, we warn. Fix it by loading swapper_pg_dir (i.e. init_mm.pgd) earlier. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 72c0098d92ce ("x86/mm: Reinitialize TLB state on hotplug and resume") Link: http://lkml.kernel.org/r/30cdfea504682ba3b9012e77717800a91c22097f.1505663533.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-17x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlierAndy Lutomirski2-8/+8
Otherwise we might have the PCID feature bit set during cpu_init(). This is just for robustness. I haven't seen any actual bugs here. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: cba4671af755 ("x86/mm: Disable PCID on 32-bit kernels") Link: http://lkml.kernel.org/r/b16dae9d6b0db5d9801ddbebbfd83384097c61f3.1505663533.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-17x86/mm/64: Stop using CR3.PCID == 0 in ASID-aware codeAndy Lutomirski1-2/+19
Putting the logical ASID into CR3's PCID bits directly means that we have two cases to consider separately: ASID == 0 and ASID != 0. This means that bugs that only hit in one of these cases trigger nondeterministically. There were some bugs like this in the past, and I think there's still one in current kernels. In particular, we have a number of ASID-unware code paths that save CR3, write some special value, and then restore CR3. This includes suspend/resume, hibernate, kexec, EFI, and maybe other things I've missed. This is currently dangerous: if ASID != 0, then this code sequence will leave garbage in the TLB tagged for ASID 0. We could potentially see corruption when switching back to ASID 0. In principle, an initialize_tlbstate_and_flush() call after these sequences would solve the problem, but EFI, at least, does not call this. (And it probably shouldn't -- initialize_tlbstate_and_flush() is rather expensive.) Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/cdc14bbe5d3c3ef2a562be09a6368ffe9bd947a6.1505663533.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-17x86/mm: Factor out CR3-building codeAndy Lutomirski2-10/+16
Current, the code that assembles a value to load into CR3 is open-coded everywhere. Factor it out into helpers build_cr3() and build_cr3_noflush(). This makes one semantic change: __get_current_cr3_fast() was wrong on SME systems. No one noticed because the only caller is in the VMX code, and there are no CPUs with both SME and VMX. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <Thomas.Lendacky@amd.com> Link: http://lkml.kernel.org/r/ce350cf11e93e2842d14d0b95b0199c7d881f527.1505663533.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-17Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds2-13/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc fixes from Thomas Gleixner: - A fix for a user space regression in /proc/$PID/stat - A couple of objtool fixes: ~ Plug a memory leak ~ Avoid accessing empty sections which upsets certain binutil versions ~ Prevent corrupting the obj file when section sizes did not change * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: fs/proc: Report eip/esp in /prod/PID/stat for coredumping objtool: Fix object file corruption objtool: Do not retrieve data from empty sections objtool: Fix memory leak in elf_create_rela_section()
2017-09-17Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "Fix for an off by one error in a cpumask result comparison" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Fix cpumask check in __irq_startup_managed()
2017-09-17Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix addressing the missing CP8 feature bit in CPUID for a range of AMD ZEN models/mask revisions" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu/AMD: Fix erratum 1076 (CPB bit)
2017-09-17Linux 4.14-rc1v4.14-rc1Linus Torvalds1-2/+2
2017-09-16Merge tag 'upstream-4.14-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds4-16/+16
Pull UBI updates from Richard Weinberger: "Minor improvements" * tag 'upstream-4.14-rc1' of git://git.infradead.org/linux-ubifs: UBI: Fix two typos in comments ubi: fastmap: fix spelling mistake: "invalidiate" -> "invalidate" ubi: pr_err() strings should end with newlines ubi: pr_err() strings should end with newlines ubi: pr_err() strings should end with newlines
2017-09-16Merge branch 'for-linus-4.14-rc1' of ↵Linus Torvalds14-27/+41
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: - minor improvements - fixes for Debian's new gcc defaults (pie enabled by default) - fixes for XSTATE/XSAVE to make UML work again on modern systems * 'for-linus-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: return negative in tuntap_open_tramp() um: remove a stray tab um: Use relative modversions with LD_SCRIPT_DYN um: link vmlinux with -no-pie um: Fix CONFIG_GCOV for modules. Fix minor typos and grammar in UML start_up help um: defconfig: Cleanup from old Kconfig options um: Fix FP register size for XSTATE/XSAVE
2017-09-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds48-234/+467
Pull networking fixes from David Miller: 1) Fix hotplug deadlock in hv_netvsc, from Stephen Hemminger. 2) Fix double-free in rmnet driver, from Dan Carpenter. 3) INET connection socket layer can double put request sockets, fix from Eric Dumazet. 4) Don't match collect metadata-mode tunnels if the device is down, from Haishuang Yan. 5) Do not perform TSO6/GSO on ipv6 packets with extensions headers in be2net driver, from Suresh Reddy. 6) Fix scaling error in gen_estimator, from Eric Dumazet. 7) Fix 64-bit statistics deadlock in systemport driver, from Florian Fainelli. 8) Fix use-after-free in sctp_sock_dump, from Xin Long. 9) Reject invalid BPF_END instructions in verifier, from Edward Cree. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) mlxsw: spectrum_router: Only handle IPv4 and IPv6 events Documentation: link in networking docs tcp: fix data delivery rate bpf/verifier: reject BPF_ALU64|BPF_END sctp: do not mark sk dumped when inet_sctp_diag_fill returns err sctp: fix an use-after-free issue in sctp_sock_dump netvsc: increase default receive buffer size tcp: update skb->skb_mstamp more carefully net: ipv4: fix l3slave check for index returned in IP_PKTINFO net: smsc911x: Quieten netif during suspend net: systemport: Fix 64-bit stats deadlock net: vrf: avoid gcc-4.6 warning qed: remove unnecessary call to memset tg3: clean up redundant initialization of tnapi tls: make tls_sw_free_resources static sctp: potential read out of bounds in sctp_ulpevent_type_enabled() MAINTAINERS: review Renesas DT bindings as well net_sched: gen_estimator: fix scaling error in bytes/packets samples nfp: wait for the NSP resource to appear on boot nfp: wait for board state before talking to the NSP ...
2017-09-16Merge branch 'for-linus' of ↵Linus Torvalds12-11/+380
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull more input updates from Dmitry Torokhov: "A second round of updates for the input subsystem: - a new driver for PWM-controlled vibrators - ucb1400 touchscreen driver had completely busted suspend/resume handling - we now handle "home" button found on some devices with Goodix touchscreens - assorted other fixups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: i8042 - add Gigabyte P57 to the keyboard reset table Input: xpad - validate USB endpoint type during probe Input: ucb1400_ts - fix suspend and resume handling Input: edt-ft5x06 - fix access to non-existing register Input: elantech - make arrays debounce_packet static, reduces object code size Input: surface3_spi - make const array header static, reduces object code size Input: goodix - add support for capacitive home button Input: add a driver for PWM controllable vibrators Input: adi - make array seq static, reduces object code size
2017-09-16genirq: Fix cpumask check in __irq_startup_managed()Thomas Gleixner1-1/+1
The result of cpumask_any_and() is invalid when result greater or equal nr_cpu_ids. The current check is checking for greater only. Fix it. Fixes: 761ea388e8c4 ("genirq: Handle managed irqs gracefully in irq_startup()") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Chen Yu <yu.c.chen@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/20170913213152.272283444@linutronix.de
2017-09-16firmware: Restore support for built-in firmwareMarkus Trippelsdorf4-5/+71
Commit 5620a0d1aac ("firmware: delete in-kernel firmware") removed the entire firmware directory. Unfortunately it thereby also removed the support for built-in firmware. This restores the ability to build firmware directly into the kernel by pruning the original Makefile to the necessary minimum. The default for EXTRA_FIRMWARE_DIR is now the standard directory /lib/firmware/. Fixes: 5620a0d1aac ("firmware: delete in-kernel firmware") Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Acked-by: Greg K-H <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-16mlxsw: spectrum_router: Only handle IPv4 and IPv6 eventsIdo Schimmel1-1/+2
The driver doesn't support events from address families other than IPv4 and IPv6, so ignore them. Otherwise, we risk queueing a work item before it's initialized. This can happen in case a VRF is configured when MROUTE_MULTIPLE_TABLES is enabled, as the VRF driver will try to add an l3mdev rule for the IPMR family. Fixes: 65e65ec137f4 ("mlxsw: spectrum_router: Don't ignore IPv6 notifications") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Andreas Rammhold <andreas@rammhold.de> Reported-by: Florian Klink <flokli@flokli.de> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16Documentation: link in networking docsPavel Machek1-1/+1
Fix link in filter.txt. Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16tcp: fix data delivery rateEric Dumazet1-4/+3
Now skb->mstamp_skb is updated later, we also need to call tcp_rate_skb_sent() after the update is done. Fixes: 8c72c65b426b ("tcp: update skb->skb_mstamp more carefully") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16Merge branch '4.14-features' of ↵Linus Torvalds205-3427/+6399
git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS updates from Ralf Baechle: "This is the main pull request for 4.14 for MIPS; below a summary of the non-merge commits: CM: - Rename mips_cm_base to mips_gcr_base - Specify register size when generating accessors - Use BIT/GENMASK for register fields, order & drop shifts - Add cluster & block args to mips_cm_lock_other() CPC: - Use common CPS accessor generation macros - Use BIT/GENMASK for register fields, order & drop shifts - Introduce register modify (set/clear/change) accessors - Use change_*, set_* & clear_* where appropriate - Add CM/CPC 3.5 register definitions - Use GlobalNumber macros rather than magic numbers - Have asm/mips-cps.h include CM & CPC headers - Cluster support for topology functions - Detect CPUs in secondary clusters CPS: - Read GIC_VL_IDENT directly, not via irqchip driver DMA: - Consolidate coherent and non-coherent dma_alloc code - Don't use dma_cache_sync to implement fd_cacheflush FPU emulation / FP assist code: - Another series of 14 commits fixing corner cases such as NaN propgagation and other special input values. - Zero bits 32-63 of the result for a CLASS.D instruction. - Enhanced statics via debugfs - Do not use bools for arithmetic. GCC 7.1 moans about this. - Correct user fault_addr type Generic MIPS: - Enhancement of stack backtraces - Cleanup from non-existing options - Handle non word sized instructions when examining frame - Fix detection and decoding of ADDIUSP instruction - Fix decoding of SWSP16 instruction - Refactor handling of stack pointer in get_frame_info - Remove unreachable code from force_fcr31_sig() - Convert to using %pOF instead of full_name - Remove the R6000 support. - Move FP code from *_switch.S to *_fpu.S - Remove unused ST_OFF from r2300_switch.S - Allow platform to specify multiple its.S files - Add #includes to various files to ensure code builds reliable and without warning.. - Remove __invalidate_kernel_vmap_range - Remove plat_timer_setup - Declare various variables & functions static - Abstract CPU core & VP(E) ID access through accessor functions - Store core & VP IDs in GlobalNumber-style variable - Unify checks for sibling CPUs - Add CPU cluster number accessors - Prevent direct use of generic_defconfig - Make CONFIG_MIPS_MT_SMP default y - Add __ioread64_copy - Remove unnecessary inclusions of linux/irqchip/mips-gic.h GIC: - Introduce asm/mips-gic.h with accessor functions - Use new GIC accessor functions in mips-gic-timer - Remove counter access functions from irq-mips-gic.c - Remove gic_read_local_vp_id() from irq-mips-gic.c - Simplify shared interrupt pending/mask reads in irq-mips-gic.c - Simplify gic_local_irq_domain_map() in irq-mips-gic.c - Drop gic_(re)set_mask() functions in irq-mips-gic.c - Remove gic_set_polarity(), gic_set_trigger(), gic_set_dual_edge(), gic_map_to_pin() and gic_map_to_vpe() from irq-mips-gic.c. - Convert remaining shared reg access, local int mask access and remaining local reg access to new accessors - Move GIC_LOCAL_INT_* to asm/mips-gic.h - Remove GIC_CPU_INT* macros from irq-mips-gic.c - Move various definitions to the driver - Remove gic_get_usm_range() - Remove __gic_irq_dispatch() forward declaration - Remove gic_init() - Use mips_gic_present() in place of gic_present and remove gic_present - Move gic_get_c0_*_int() to asm/mips-gic.h - Remove linux/irqchip/mips-gic.h - Inline __gic_init() - Inline gic_basic_init() - Make pcpu_masks a per-cpu variable - Use pcpu_masks to avoid reading GIC_SH_MASK* - Clean up mti, reserved-cpu-vectors handling - Use cpumask_first_and() in gic_set_affinity() - Let the core set struct irq_common_data affinity microMIPS: - Fix microMIPS stack unwinding on big endian systems MIPS-GIC: - SYNC after enabling GIC region NUMA: - Remove the unused parent_node() macro R6: - Constify r2_decoder_tables - Add accessor & bit definitions for GlobalNumber SMP: - Constify smp ops - Allow boot_secondary SMP op to return errors VDSO: - Drop gic_get_usm_range() usage - Avoid use of linux/irqchip/mips-gic.h Platform changes: Alchemy: - Add devboard machine type to cpuinfo - update cpu feature overrides - Threaded carddetect irqs for devboards AR7: - allow NULL clock for clk_get_rate BCM63xx: - Fix ENETDMA_6345_MAXBURST_REG offset - Allow NULL clock for clk_get_rate CI20: - Enable GPIO and RTC drivers in defconfig - Add ethernet and fixed-regulator nodes to DTS Generic platform: - Move Boston and NI 169445 FIT image source to their own files - Include asm/bootinfo.h for plat_fdt_relocated() - Include asm/time.h for get_c0_*_int() - Include asm/bootinfo.h for plat_fdt_relocated() - Include asm/time.h for get_c0_*_int() - Allow filtering enabled boards by requirements - Don't explicitly disable CONFIG_USB_SUPPORT - Bump default NR_CPUS to 16 JZ4700: - Probe the jz4740-rtc driver from devicetree Lantiq: - Drop check of boot select from the spi-falcon driver. - Drop check of boot select from the lantiq-flash MTD driver. - Access boot cause register in the watchdog driver through regmap - Add device tree binding documentation for the watchdog driver - Add docs for the RCU DT bindings. - Convert the fpi bus driver to a platform_driver - Remove ltq_reset_cause() and ltq_boot_select( - Switch to a proper reset driver - Switch to a new drivers/soc GPHY driver - Add an USB PHY driver for the Lantiq SoCs using the RCU module - Use of_platform_default_populate instead of __dt_register_buses - Enable MFD_SYSCON to be able to use it for the RCU MFD - Replace ltq_boot_select() with dummy implementation. Loongson 2F: - Allow NULL clock for clk_get_rate Malta: - Use new GIC accessor functions NI 169445: - Add support for NI 169445 board. - Only include in 32r2el kernels Octeon: - Add support for watchdog of 78XX SOCs. - Add support for watchdog of CN68XX SOCs. - Expose support for mips32r1, mips32r2 and mips64r1 - Enable more drivers in config file - Add support for accessing the boot vector. - Remove old boot vector code from watchdog driver - Define watchdog registers for 70xx, 73xx, 78xx, F75xx. - Make CSR functions node aware. - Allow access to CIU3 IRQ domains. - Misc cleanups in the watchdog driver Omega2+: - New board, add support and defconfig Pistachio: - Enable Root FS on NFS in defconfig Ralink: - Add Mediatek MT7628A SoC - Allow NULL clock for clk_get_rate - Explicitly request exclusive reset control in the pci-mt7620 PCI driver. SEAD3: - Only include in 32 bit kernels by default VoCore: - Add VoCore as a vendor t0 dt-bindings - Add defconfig file" * '4.14-features' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (167 commits) MIPS: Refactor handling of stack pointer in get_frame_info MIPS: Stacktrace: Fix microMIPS stack unwinding on big endian systems MIPS: microMIPS: Fix decoding of swsp16 instruction MIPS: microMIPS: Fix decoding of addiusp instruction MIPS: microMIPS: Fix detection of addiusp instruction MIPS: Handle non word sized instructions when examining frame MIPS: ralink: allow NULL clock for clk_get_rate MIPS: Loongson 2F: allow NULL clock for clk_get_rate MIPS: BCM63XX: allow NULL clock for clk_get_rate MIPS: AR7: allow NULL clock for clk_get_rate MIPS: BCM63XX: fix ENETDMA_6345_MAXBURST_REG offset mips: Save all registers when saving the frame MIPS: Add DWARF unwinding to assembly MIPS: Make SAVE_SOME more standard MIPS: Fix issues in backtraces MIPS: jz4780: DTS: Probe the jz4740-rtc driver from devicetree MIPS: Ci20: Enable RTC driver watchdog: octeon-wdt: Add support for 78XX SOCs. watchdog: octeon-wdt: Add support for cn68XX SOCs. watchdog: octeon-wdt: File cleaning. ...
2017-09-16Merge tag 'pci-v4.14-fixes-1' of ↵Linus Torvalds1-11/+2
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Revert an attempt to fix a race while enabling upstream bridges because it broke iwlwifi firmware loading" * tag 'pci-v4.14-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "PCI: Avoid race while enabling upstream bridges"
2017-09-16Merge tag 'drm-fixes-for-v4.14-rc1' of ↵Linus Torvalds28-160/+236
git://people.freedesktop.org/~airlied/linux Pull drm AMD fixes from Dave Airlie: "Just had a single AMD fixes pull from Alex for rc1" * tag 'drm-fixes-for-v4.14-rc1' of git://people.freedesktop.org/~airlied/linux: drm/amdgpu: revert "fix deadlock of reservation between cs and gpu reset v2" drm/amdgpu: remove duplicate return statement drm/amdgpu: check memory allocation failure drm/amd/amdgpu: fix BANK_SELECT on Vega10 (v2) drm/amdgpu: inline amdgpu_ttm_do_bind again drm/amdgpu: fix amdgpu_ttm_bind drm/amdgpu: remove the GART copy hack drm/ttm:fix wrong decoding of bo_count drm/ttm: fix missing inc bo_count drm/amdgpu: set sched_hw_submission higher for KIQ (v3) drm/amdgpu: move default gart size setting into gmc modules drm/amdgpu: refine default gart size drm/amd/powerplay: ACG frequency added in PPTable drm/amdgpu: discard commands of killed processes drm/amdgpu: fix and cleanup shadow handling drm/amdgpu: add automatic per asic settings for gart_size drm/amdgpu/gfx8: fix spelling typo in mqd allocation drm/amd/powerplay: unhalt mec after loading drm/amdgpu/virtual_dce: Virtual display doesn't support disable vblank immediately drm/amdgpu: Fix huge page updates with CPU
2017-09-16Merge branch 'i2c/for-next' of ↵Linus Torvalds9-14/+1602
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull more i2c updates from Wolfram Sang: "I2C has two more new drivers: Altera FPGA and STM32F7" * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: i2c-stm32f7: add driver i2c: i2c-stm32f4: use generic definition of speed enum dt-bindings: i2c-stm32: Document the STM32F7 I2C bindings i2c: altera: Add Altera I2C Controller driver dt-bindings: i2c: Add Altera I2C Controller
2017-09-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds18-111/+257
Pull more KVM updates from Paolo Bonzini: - PPC bugfixes - RCU splat fix - swait races fix - pointless userspace-triggerable BUG() fix - misc fixes for KVM_RUN corner cases - nested virt correctness fixes + one host DoS - some cleanups - clang build fix - fix AMD AVIC with default QEMU command line options - x86 bugfixes * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits) kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly kvm: nVMX: Remove nested_vmx_succeed after successful VM-entry kvm,mips: Fix potential swait_active() races kvm,powerpc: Serialize wq active checks in ops->vcpu_kick kvm: Serialize wq active checks in kvm_vcpu_wake_up() kvm,x86: Fix apf_task_wake_one() wq serialization kvm,lapic: Justify use of swait_active() kvm,async_pf: Use swq_has_sleeper() sched/wait: Add swq_has_sleeper() KVM: VMX: Do not BUG() on out-of-bounds guest IRQ KVM: Don't accept obviously wrong gsi values via KVM_IRQFD kvm: nVMX: Don't allow L2 to access the hardware CR8 KVM: trace events: update list of exit reasons KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously KVM: X86: Don't block vCPU if there is pending exception KVM: SVM: Add irqchip_split() checks before enabling AVIC KVM: Add struct kvm_vcpu pointer parameter to get_enable_apicv() KVM: SVM: Refactor AVIC vcpu initialization into avic_init_vcpu() KVM: x86: fix clang build ...
2017-09-16bpf/verifier: reject BPF_ALU64|BPF_ENDEdward Cree2-1/+18
Neither ___bpf_prog_run nor the JITs accept it. Also adds a new test case. Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16sctp: do not mark sk dumped when inet_sctp_diag_fill returns errXin Long1-1/+0
sctp_diag would not actually dump out sk/asoc if inet_sctp_diag_fill returns err, in which case it shouldn't mark sk dumped by setting cb->args[3] as 1 in sctp_sock_dump(). Otherwise, it could cause some asocs to have no parent's sk dumped in 'ss --sctp'. So this patch is to not set cb->args[3] when inet_sctp_diag_fill() returns err in sctp_sock_dump(). Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16sctp: fix an use-after-free issue in sctp_sock_dumpXin Long3-39/+36
Commit 86fdb3448cc1 ("sctp: ensure ep is not destroyed before doing the dump") tried to fix an use-after-free issue by checking !sctp_sk(sk)->ep with holding sock and sock lock. But Paolo noticed that endpoint could be destroyed in sctp_rcv without sock lock protection. It means the use-after-free issue still could be triggered when sctp_rcv put and destroy ep after sctp_sock_dump checks !ep, although it's pretty hard to reproduce. I could reproduce it by mdelay in sctp_rcv while msleep in sctp_close and sctp_sock_dump long time. This patch is to add another param cb_done to sctp_for_each_transport and dump ep->assocs with holding tsp after jumping out of transport's traversal in it to avoid this issue. It can also improve sctp diag dump to make it run faster, as no need to save sk into cb->args[5] and keep calling sctp_for_each_transport any more. This patch is also to use int * instead of int for the pos argument in sctp_for_each_transport, which could make postion increment only in sctp_for_each_transport and no need to keep changing cb->args[2] in sctp_sock_filter and sctp_sock_dump any more. Fixes: 86fdb3448cc1 ("sctp: ensure ep is not destroyed before doing the dump") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16netvsc: increase default receive buffer sizeStephen Hemminger1-1/+1
The default receive buffer size was reduced by recent change to a value which was appropriate for 10G and Windows Server 2016. But the value is too small for full performance with 40G on Azure. Increase the default back to maximum supported by host. Fixes: 8b5327975ae1 ("netvsc: allow controlling send/recv buffer size") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16tcp: update skb->skb_mstamp more carefullyEric Dumazet1-7/+12
liujian reported a problem in TCP_USER_TIMEOUT processing with a patch in tcp_probe_timer() : https://www.spinics.net/lists/netdev/msg454496.html After investigations, the root cause of the problem is that we update skb->skb_mstamp of skbs in write queue, even if the attempt to send a clone or copy of it failed. One reason being a routing problem. This patch prevents this, solving liujian issue. It also removes a potential RTT miscalculation, since __tcp_retransmit_skb() is not OR-ing TCP_SKB_CB(skb)->sacked with TCPCB_EVER_RETRANS if a failure happens, but skb->skb_mstamp has been changed. A future ACK would then lead to a very small RTT sample and min_rtt would then be lowered to this too small value. Tested: # cat user_timeout.pkt --local_ip=192.168.102.64 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 `ifconfig tun0 192.168.102.64/16; ip ro add 192.0.2.1 dev tun0` +0 < S 0:0(0) win 0 <mss 1460> +0 > S. 0:0(0) ack 1 <mss 1460> +.1 < . 1:1(0) ack 1 win 65530 +0 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_TCP, TCP_USER_TIMEOUT, [3000], 4) = 0 +0 write(4, ..., 24) = 24 +0 > P. 1:25(24) ack 1 win 29200 +.1 < . 1:1(0) ack 25 win 65530 //change the ipaddress +1 `ifconfig tun0 192.168.0.10/16` +1 write(4, ..., 24) = 24 +1 write(4, ..., 24) = 24 +1 write(4, ..., 24) = 24 +1 write(4, ..., 24) = 24 +0 `ifconfig tun0 192.168.102.64/16` +0 < . 1:2(1) ack 25 win 65530 +0 `ifconfig tun0 192.168.0.10/16` +3 write(4, ..., 24) = -1 # ./packetdrill user_timeout.pkt Signed-off-by: Eric Dumazet <edumazet@googl.com> Reported-by: liujian <liujian56@huawei.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16fs/proc: Report eip/esp in /prod/PID/stat for coredumpingJohn Ogness1-0/+9
Commit 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in /proc/PID/stat") stopped reporting eip/esp because it is racy and dangerous for executing tasks. The comment adds: As far as I know, there are no use programs that make any material use of these fields, so just get rid of them. However, existing userspace core-dump-handler applications (for example, minicoredumper) are using these fields since they provide an excellent cross-platform interface to these valuable pointers. So that commit introduced a user space visible regression. Partially revert the change and make the readout possible for tasks with the proper permissions and only if the target task has the PF_DUMPCORE flag set. Fixes: 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in> /proc/PID/stat") Reported-by: Marco Felsch <marco.felsch@preh.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Tycho Andersen <tycho.andersen@canonical.com> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: stable@vger.kernel.org Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Borislav Petkov <bp@alien8.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linux API <linux-api@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/87poatfwg6.fsf@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-09-16net: ipv4: fix l3slave check for index returned in IP_PKTINFODavid Ahern1-2/+6
rt_iif is only set to the actual egress device for the output path. The recent change to consider the l3slave flag when returning IP_PKTINFO works for local traffic (the correct device index is returned), but it broke the more typical use case of packets received from a remote host always returning the VRF index rather than the original ingress device. Update the fixup to consider l3slave and rt_iif actually getting set. Fixes: 1dfa76390bf05 ("net: ipv4: add check for l3slave for index returned in IP_PKTINFO") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16net: smsc911x: Quieten netif during suspendGeert Uytterhoeven1-1/+14
If the network interface is kept running during suspend, the net core may call net_device_ops.ndo_start_xmit() while the Ethernet device is still suspended, which may lead to a system crash. E.g. on sh73a0/kzm9g and r8a73a4/ape6evm, the external Ethernet chip is driven by a PM controlled clock. If the Ethernet registers are accessed while the clock is not running, the system will crash with an imprecise external abort. As this is a race condition with a small time window, it is not so easy to trigger at will. Using pm_test may increase your chances: # echo 0 > /sys/module/printk/parameters/console_suspend # echo platform > /sys/power/pm_test # echo mem > /sys/power/state To fix this, make sure the network interface is quietened during suspend. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16net: systemport: Fix 64-bit stats deadlockFlorian Fainelli1-3/+0
We can enter a deadlock situation because there is no sufficient protection when ndo_get_stats64() runs in process context to guard against RX or TX NAPI contexts running in softirq, this can lead to the following lockdep splat and actual deadlock was experienced as well with an iperf session in the background and a while loop doing ifconfig + ethtool. [ 5.780350] ================================ [ 5.784679] WARNING: inconsistent lock state [ 5.789011] 4.13.0-rc7-02179-g32fae27c725d #70 Not tainted [ 5.794561] -------------------------------- [ 5.798890] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 5.804971] swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes: [ 5.810175] (&syncp->seq#2){+.?...}, at: [<c0768a28>] bcm_sysport_tx_reclaim+0x30/0x54 [ 5.818327] {SOFTIRQ-ON-W} state was registered at: [ 5.823278] bcm_sysport_get_stats64+0x17c/0x258 [ 5.828053] dev_get_stats+0x38/0xac [ 5.831776] rtnl_fill_stats+0x30/0x118 [ 5.835761] rtnl_fill_ifinfo+0x538/0xe24 [ 5.839921] rtmsg_ifinfo_build_skb+0x6c/0xd8 [ 5.844430] rtmsg_ifinfo_event.part.5+0x14/0x44 [ 5.849201] rtmsg_ifinfo+0x20/0x28 [ 5.852837] register_netdevice+0x628/0x6b8 [ 5.857171] register_netdev+0x14/0x24 [ 5.861051] bcm_sysport_probe+0x30c/0x438 [ 5.865280] platform_drv_probe+0x50/0xb0 [ 5.869418] driver_probe_device+0x2e8/0x450 [ 5.873817] __driver_attach+0x104/0x120 [ 5.877871] bus_for_each_dev+0x7c/0xc0 [ 5.881834] bus_add_driver+0x1b0/0x270 [ 5.885797] driver_register+0x78/0xf4 [ 5.889675] do_one_initcall+0x54/0x190 [ 5.893646] kernel_init_freeable+0x144/0x1d0 [ 5.898135] kernel_init+0x8/0x110 [ 5.901665] ret_from_fork+0x14/0x2c [ 5.905363] irq event stamp: 24263 [ 5.908804] hardirqs last enabled at (24262): [<c08eecf0>] net_rx_action+0xc4/0x4e4 [ 5.916624] hardirqs last disabled at (24263): [<c0a7da00>] _raw_spin_lock_irqsave+0x1c/0x98 [ 5.925143] softirqs last enabled at (24258): [<c022a7fc>] irq_enter+0x84/0x98 [ 5.932524] softirqs last disabled at (24259): [<c022a918>] irq_exit+0x108/0x16c [ 5.939985] [ 5.939985] other info that might help us debug this: [ 5.946576] Possible unsafe locking scenario: [ 5.946576] [ 5.952556] CPU0 [ 5.955031] ---- [ 5.957506] lock(&syncp->seq#2); [ 5.960955] <Interrupt> [ 5.963604] lock(&syncp->seq#2); [ 5.967227] [ 5.967227] *** DEADLOCK *** [ 5.967227] [ 5.973222] 1 lock held by swapper/0/0: [ 5.977092] #0: (&(&ring->lock)->rlock){..-...}, at: [<c0768a18>] bcm_sysport_tx_reclaim+0x20/0x54 So just remove the u64_stats_update_begin()/end() pair in ndo_get_stats64() since it does not appear to be useful for anything. No inconsistency was observed with either ifconfig or ethtool, global TX counts equal the sum of per-queue TX counts on a 32-bit architecture. Fixes: 10377ba7673d ("net: systemport: Support 64bit statistics") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16net: vrf: avoid gcc-4.6 warningArnd Bergmann1-3/+3
When building an allmodconfig kernel with gcc-4.6, we get a rather odd warning: drivers/net/vrf.c: In function ‘vrf_ip6_input_dst’: drivers/net/vrf.c:964:3: error: initialized field with side-effects overwritten [-Werror] drivers/net/vrf.c:964:3: error: (near initialization for ‘fl6’) [-Werror] I have no idea what this warning is even trying to say, but it does seem like a false positive. Reordering the initialization in to match the structure definition gets rid of the warning, and might also avoid whatever gcc thinks is wrong here. Fixes: 9ff74384600a ("net: vrf: Handle ipv6 multicast and link-local addresses") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-16qed: remove unnecessary call to memsetHimanshu Jha1-1/+0
call to memset to assign 0 value immediately after allocating memory with kzalloc is unnecesaary as kzalloc allocates the memory filled with 0 value. Semantic patch used to resolve this issue: @@ expression e,e2; constant c; statement S; @@ e = kzalloc(e2, c); if(e == NULL) S - memset(e, 0, e2); Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com> Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com> Acked-by: Sudarsana Kalluru <sudarsana.kalluru@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-15Merge tag 'firmware_removal-4.14-rc1' of ↵Linus Torvalds153-129170/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull firmware removal from Greg KH: "Many many years ago (at the kernel summit in Boston), we all came to the agreement that the firmware/ tree should be dropped from the kernel, and everyone use the linux-firmware package instead. For some minor reason, David Woodhouse didn't send the pull request at that point in time, and everyone forgot about this. The topic came up in the hallway track at the Plumbers conference this week, so here's a single patch that drops the whole firmware tree. The last firmware update was back in 2013, and all distros have been using linux-firmware instead since at least that year, if not before. The only commits to that directory since 2013 was some kbuild fixups for various build tool issues. So lets finally drop this, we don't need to lug them around in the kernel source tree anymore, especially as no one wants or uses them. This has passed build testing with 0-day, I don't think it made it into linux-next this week, but I figured it was good to get in before 4.14-rc1 was out" * tag 'firmware_removal-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: firmware: delete in-kernel firmware
2017-09-15Merge tag 'nios2-v4.14-rc1' of ↵Linus Torvalds2-2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 Pull arch/nios2 update from Ley Foon Tan. * tag 'nios2-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2: nios2: time: Read timer in get_cycles only if initialized nios2: add earlycon support to 3c120 devboard DTS
2017-09-15Merge tag 'powerpc-4.14-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: "Just one fix, for the handling of alignment interrupts on dcbz instructions. Thanks to Paul Mackerras, Christian Zigotzky, Michal Sojka" * tag 'powerpc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Fix handling of alignment interrupt on dcbz instruction
2017-09-15Merge tag 'for-linus-4.14-ofs2' of ↵Linus Torvalds9-63/+60
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux Pull orangefs updates from Mike Marshall: "Some cleanups and a big bug fix for ACLs. When I was reviewing Jan Kara's ACL patch, I realized that Orangefs ACL code was busted, not just in the kernel module, but in the server as well. I've been working on the code in the server mostly, but here's one kernel patch, there will be more" * tag 'for-linus-4.14-ofs2' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: orangefs: Adjust three checks for null pointers orangefs: Use kcalloc() in orangefs_prepare_cdm_array() orangefs: Delete error messages for a failed memory allocation in five functions orangefs: constify xattr_handler structure orangefs: don't call filemap_write_and_wait from fsync orangefs: off by ones in xattr size checks orangefs: documentation clean up orangefs: react properly to posix_acl_update_mode's aftermath. orangefs: Don't clear SGID when inheriting ACLs
2017-09-15Merge branch 'next' into for-linusDmitry Torokhov12-11/+380
Prepare second round of input updates for 4.14 merge window.
2017-09-15Input: i8042 - add Gigabyte P57 to the keyboard reset tableKai-Heng Feng1-0/+7
Similar to other Gigabyte laptops, the touchpad on P57 requires a keyboard reset to detect Elantech touchpad correctly. BugLink: https://bugs.launchpad.net/bugs/1594214 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-09-15kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properlyJim Mattson1-59/+75
When emulating a nested VM-entry from L1 to L2, several control field validation checks are deferred to the hardware. Should one of these validation checks fail, vcpu_vmx_run will set the vmx->fail flag. When this happens, the L2 guest state is not loaded (even in part), and execution should continue in L1 with the next instruction after the VMLAUNCH/VMRESUME. The VMCS12 is not modified (except for the VM-instruction error field), the VMCS12 MSR save/load lists are not processed, and the CPU state is not loaded from the VMCS12 host area. Moreover, the vmcs02 exit reason is stale, so it should not be consulted for any reason. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15kvm: vmx: Handle VMLAUNCH/VMRESUME failure properlyJim Mattson1-6/+8
On an early VMLAUNCH/VMRESUME failure (i.e. one which sets the VM-instruction error field of the current VMCS), the launch state of the current VMCS is not set to "launched," and the VM-exit information fields of the current VMCS (including IDT-vectoring information and exit reason) are stale. On a late VMLAUNCH/VMRESUME failure (i.e. one which sets the high bit of the exit reason field), the launch state of the current VMCS is not set to "launched," and only two of the VM-exit information fields of the current VMCS are modified (exit reason and exit qualification). The remaining VM-exit information fields of the current VMCS (including IDT-vectoring information, in particular) are stale. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15kvm: nVMX: Remove nested_vmx_succeed after successful VM-entryJim Mattson1-7/+9
After a successful VM-entry, RFLAGS is cleared, with the exception of bit 1, which is always set. This is handled by load_vmcs12_host_state. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15kvm,mips: Fix potential swait_active() racesDavidlohr Bueso1-2/+2
For example, the following could occur, making us miss a wakeup: CPU0 CPU1 kvm_vcpu_block kvm_mips_comparecount_func [L] swait_active(&vcpu->wq) [S] prepare_to_swait(&vcpu->wq) [L] if (!kvm_vcpu_has_pending_timer(vcpu)) schedule() [S] queue_timer_int(vcpu) Ensure that the swait_active() check is not hoisted over the interrupt. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15kvm,powerpc: Serialize wq active checks in ops->vcpu_kickDavidlohr Bueso1-1/+1
Particularly because kvmppc_fast_vcpu_kick_hv() is a callback, ensure that we properly serialize wq active checks in order to avoid potentially missing a wakeup due to racing with the waiter side. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15kvm: Serialize wq active checks in kvm_vcpu_wake_up()Davidlohr Bueso1-1/+1
This is a generic call and can be suceptible to races in reading the wq task_list while another task is adding itself to the list. Add a full barrier by using the swq_has_sleeper() helper. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15kvm,x86: Fix apf_task_wake_one() wq serializationDavidlohr Bueso1-1/+1
During code inspection, the following potential race was seen: CPU0 CPU1 kvm_async_pf_task_wait apf_task_wake_one [L] swait_active(&n->wq) [S] prepare_to_swait(&n.wq) [L] if (!hlist_unhahed(&n.link)) schedule() [S] hlist_del_init(&n->link); Properly serialize swait_active() checks such that a wakeup is not missed. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15kvm,lapic: Justify use of swait_active()Davidlohr Bueso1-0/+4
A comment might serve future readers. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15kvm,async_pf: Use swq_has_sleeper()Davidlohr Bueso1-5/+1
... as we've got the new helper now. This caller already does the right thing, hence no changes in semantics. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-15sched/wait: Add swq_has_sleeper()Davidlohr Bueso1-2/+56
Which is the equivalent of what we have in regular waitqueues. I'm not crazy about the name, but this also helps us get both apis closer -- which iirc comes originally from the -net folks. We also duplicate the comments for the lockless swait_active(), from wait.h. Future users will make use of this interface. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>