Age  Commit message  Author  Files  Lines
2021-04-17  wilc1000: Check for errors at end of DMA write  David Mosberger-Tang  1  -1/+61
After a DMA write to the WILC chip, check for and report any errors. This is based on code from the wilc driver in the linux-at91 repository. Signed-off-by: David Mosberger-Tang <davidm@egauge.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210227172818.1711071-3-davidm@egauge.net
2021-04-17  wilc1000: Introduce symbolic names for SPI protocol register  David Mosberger-Tang  1  -9/+29
The WILC1000 protocol control register has bits for enabling the CRCs (CRC7 for commands and CRC16 for data) and to set the data packet size. Define symbolic names for those so the code is more easily understood. Signed-off-by: David Mosberger-Tang <davidm@egauge.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210227172818.1711071-2-davidm@egauge.net
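As a rough illustration of what symbolic names buy over magic numbers, the sketch below defines masks for the two CRC enable bits and the packet-size field and uses them in two small helpers. The bit positions, macro names, and helper names are assumptions made for this example; they are not taken from the commit or the datasheet.

```c
#include <stdint.h>

/* Assumed bit layout, for illustration only; the driver defines the real one. */
#define PROTO_CRC7_EN      (1u << 2)   /* CRC7 on commands */
#define PROTO_CRC16_EN     (1u << 3)   /* CRC16 on data */
#define PROTO_PKT_SZ_SHIFT 4
#define PROTO_PKT_SZ_MASK  (0x7u << PROTO_PKT_SZ_SHIFT)

/* Hypothetical helper: replace the packet-size field, leave other bits alone. */
static uint32_t proto_set_pkt_sz(uint32_t reg, uint32_t code)
{
	reg &= ~PROTO_PKT_SZ_MASK;
	reg |= (code << PROTO_PKT_SZ_SHIFT) & PROTO_PKT_SZ_MASK;
	return reg;
}

/* Hypothetical helper: clear both CRC enable bits. */
static uint32_t proto_disable_crc(uint32_t reg)
{
	return reg & ~(PROTO_CRC7_EN | PROTO_CRC16_EN);
}
```

With named masks, a call such as `proto_disable_crc(reg)` documents its intent without the reader having to decode a hex constant.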
2021-04-17  wilc1000: Make SPI transfers work at 48MHz  David Mosberger-Tang  1  -13/+29
For CMD_SINGLE_READ and CMD_INTERNAL_READ, WILC may insert one or more zero bytes between the command response and the DATA Start tag (0xf3). This behavior appears to be undocumented in "ATWILC1000 USER GUIDE" (https://tinyurl.com/4hhshdts) but we have observed 1-4 zero bytes when the SPI bus operates at 48MHz and none when it operates at 1MHz. This code is derived from the equivalent code of the wilc driver in the linux-at91 repository. Signed-off-by: David Mosberger-Tang <davidm@egauge.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210227172818.1711071-1-davidm@egauge.net
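The padding behavior described above can be sketched as a small scan over the bytes that follow the command response. This is a simplified model rather than the driver's code: the helper name, the buffer-based interface, the limit of four padding bytes, and the choice to reject any non-zero byte before the tag are all assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define DATA_START_TAG 0xf3	/* DATA Start tag value quoted in the commit message */
#define MAX_PAD_BYTES  4	/* assumption: tolerate the 1-4 zero bytes seen at 48MHz */

/*
 * Hypothetical helper: return the offset of the DATA Start tag within
 * buf[0..len), skipping up to MAX_PAD_BYTES leading zero bytes.
 * Returns -1 if the tag is missing, the padding is too long, or a
 * non-zero byte other than the tag shows up first.
 */
static int find_data_start(const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len && i <= MAX_PAD_BYTES; i++) {
		if (buf[i] == DATA_START_TAG)
			return (int)i;
		if (buf[i] != 0x00)
			return -1;	/* unexpected byte before the tag */
	}
	return -1;			/* ran out of bytes or too much padding */
}
```

At 1MHz the tag would be expected at offset 0; at 48MHz the same call would return an offset between 1 and 4.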
2021-04-17  mwifiex: don't print SSID to logs  Brian Norris  1  -7/+4
There are a few reasons not to dump SSIDs as-is in kernel logs: 1) they're not guaranteed to be any particular text encoding (UTF-8, ASCII, ...) in general 2) it's somewhat redundant; the BSSID should be enough to uniquely identify the AP/STA to which we're connecting 3) BSSIDs have an easily-recognized format, whereas SSIDs do not (they are free-form) 4) other common drivers (e.g., everything based on mac80211) get along just fine by only including BSSIDs when logging state transitions Additional notes on reason #3: this is important for the privacy-conscious, especially when providing tools that convey kernel logs on behalf of a user -- e.g., when reporting bugs. So for example, it's easy to automatically filter logs for MAC addresses, but it's much harder to filter SSIDs out of unstructured text. Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210225024454.4106485-1-briannorris@chromium.org
2021-04-17  ipw2x00: potential buffer overflow in libipw_wx_set_encodeext()  Dan Carpenter  1  -2/+4
The "ext->key_len" is a u16 that comes from the user. If it's over SCM_KEY_LEN (32) that could lead to memory corruption. Fixes: e0d369d1d969 ("[PATCH] ieee82011: Added WE-18 support to default wireless extension handler") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Stanislav Yakovlev <stas.yakovlev@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/YHaoA1i+8uT4ir4h@mwanda
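The class of fix described here is a length check before a copy into a fixed-size buffer. The following is a minimal userspace sketch of that pattern; `struct key_slot` and `set_key()` are hypothetical stand-ins, and only the limit `SCM_KEY_LEN` (32) comes from the commit message.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SCM_KEY_LEN 32	/* limit quoted in the commit message */

/* Hypothetical stand-in for the destination key storage. */
struct key_slot {
	uint8_t key[SCM_KEY_LEN];
	uint16_t key_len;
};

/* Reject an oversized, caller-supplied length before copying. */
static int set_key(struct key_slot *slot, const uint8_t *src, uint16_t key_len)
{
	if (key_len > SCM_KEY_LEN)
		return -EINVAL;

	memcpy(slot->key, src, key_len);
	slot->key_len = key_len;
	return 0;
}
```

Without the check, a `key_len` of 33 or more would write past the end of `key`, which is the memory corruption the commit message warns about.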
2021-04-17  rtlwifi: rtl8192de: Use DEFINE_SPINLOCK() for spinlock  Guobin Huang  1  -7/+3
A spinlock can be initialized automatically with DEFINE_SPINLOCK() rather than by explicitly calling spin_lock_init(). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Guobin Huang <huangguobin4@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1617711406-49649-1-git-send-email-huangguobin4@huawei.com
2021-04-17  qtnfmac: remove meaningless goto statement and labels  wengjianfeng  1  -67/+0
Some functions' labels are meaningless: the label statement immediately follows the goto statement, with no other statements in between, so just remove them. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: wengjianfeng <wengjianfeng@yulong.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210406025206.4924-1-samirweng1979@163.com
2021-04-17  rtlwifi: Simplify locking of a skb list accesses  Christophe JAILLET  2  -14/+2
The 'c2hcmd_lock' spinlock is only used to protect some __skb_queue_tail() and __skb_dequeue() calls. Use the lock provided in the skb itself and call skb_queue_tail() and skb_dequeue(). These functions already include the correct locking. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/99cf8894fd52202cb7ce2ec6e3200eef400bc071.1617609346.git.christophe.jaillet@wanadoo.fr
2021-04-17  rtlwifi: remove rtl_get_tid_h  Christophe JAILLET  1  -6/+1
'rtl_get_tid_h()' is the same as 'ieee80211_get_tid()'. So this function can be removed to save a line of code. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/db340a67a95c119e4f9ba8fa99aea1c73d0dcfc9.1617383263.git.christophe.jaillet@wanadoo.fr
2021-04-17  rtlwifi: rtl8188ee: remove redundant assignment of variable rtlpriv->btcoexist.reg_bt_sco  Yang Li  1  -2/+0
Assigning value "3" to "rtlpriv->btcoexist.reg_bt_sco" here, but that stored value is overwritten before it can be used. Coverity reports this problem as CWE563: A value assigned to a variable is never used. drivers/net/wireless/realtek/rtlwifi/rtl8188ee/hw.c: rtl8188ee_bt_reg_init Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1617182023-110950-1-git-send-email-yang.lee@linux.alibaba.com
2021-04-17  rtlwifi: remove redundant assignment to variable err  Colin Ian King  1  -1/+0
Variable err is assigned -ENODEV followed by an error return path via label error_out that does not access the variable and returns with the -ENODEV error return code. The assignment to err is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210327230014.25554-1-colin.king@canonical.com
2021-04-17  rtlwifi: Few mundane typo fixes  Bhaskar Chowdhury  1  -3/+3
s/resovle/resolve/ s/broadcase/broadcast/ s/sytem/system/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210320194426.21621-1-unixbhaskar@gmail.com
2021-04-17  qtnfmac: remove meaningless labels  wengjianfeng  1  -21/+6
Some functions' labels are meaningless: the goto only jumps to a return statement, so just remove the labels. Signed-off-by: wengjianfeng <wengjianfeng@yulong.com> Reviewed-by: Sergey Matyukevich <geomatsi@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210223065754.34392-1-samirweng1979@163.com
2021-04-17  rtlwifi: 8821ae: upgrade PHY and RF parameters  Ping-Ke Shih  1  -130/+370
The signal strength of 5G is quite low, so users can't connect to an AP that is far away. New parameters with a new format, and their parser, were added by commit 84d26fda52e2 ("rtlwifi: Update 8821ae new phy parameters and its parser."), but some parameters are missing. Use this commit to update to the new parameters that use the new format. Fixes: 84d26fda52e2 ("rtlwifi: Update 8821ae new phy parameters and its parser") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210219052607.7323-1-pkshih@realtek.com
2021-04-17  cw1200: Remove unused function pointer typedef wsm_*  Chen Lin  1  -12/+0
Remove the 'wsm_*' typedef as it is not used. Signed-off-by: Chen Lin <chen.lin5@zte.com.cn> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1613449833-4910-1-git-send-email-chen45464546@163.com
2021-04-17  cw1200: Remove unused function pointer typedef cw1200_wsm_handler  Chen Lin  1  -3/+0
Remove the 'cw1200_wsm_handler' typedef as it is not used. Signed-off-by: Chen Lin <chen.lin5@zte.com.cn> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1613446918-4532-1-git-send-email-chen45464546@163.com
2021-04-17  Merge tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  Linus Torvalds  53  -439/+479
Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.12-rc8, including fixes from netfilter, and bpf. BPF verifier changes stand out, otherwise things have slowed down. Current release - regressions: - gro: ensure frag0 meets IP header alignment - Revert "net: stmmac: re-init rx buffers when mac resume back" - ethernet: macb: fix the restore of cmp registers Previous releases - regressions: - ixgbe: Fix NULL pointer dereference in ethtool loopback test - ixgbe: fix unbalanced device enable/disable in suspend/resume - phy: marvell: fix detection of PHY on Topaz switches - make tcp_allowed_congestion_control readonly in non-init netns - xen-netback: Check for hotplug-status existence before watching Previous releases - always broken: - bpf: mitigate a speculative oob read of up to map value size by tightening the masking window - sctp: fix race condition in sctp_destroy_sock - sit, ip6_tunnel: Unregister catch-all devices - netfilter: nftables: clone set element expression template - netfilter: flowtable: fix NAT IPv6 offload mangling - net: geneve: check skb is large enough for IPv4/IPv6 header - netlink: don't call ->netlink_bind with table lock held" * tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits) netlink: don't call ->netlink_bind with table lock held MAINTAINERS: update my email bpf: Update selftests to reflect new error states bpf: Tighten speculative pointer arithmetic mask bpf: Move sanitize_val_alu out of op switch bpf: Refactor and streamline bounds check into helper bpf: Improve verifier error messages for users bpf: Rework ptr_limit into alu_limit and add common error path bpf: Ensure off_reg has no mixed signed bounds for all types bpf: Move off_reg into sanitize_ptr_alu bpf: Use correct permission flag for mixed signed bounds arithmetic ch_ktls: do not send snd_una update to TCB in middle ch_ktls: tcb close causes tls connection failure
ch_ktls: fix device connection close ch_ktls: Fix kernel panic i40e: fix the panic when running bpf in xdpdrv mode net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta net/mlx5e: Fix setting of RS FEC mode net/mlx5: Fix setting of devlink traps in switchdev mode Revert "net: stmmac: re-init rx buffers when mac resume back" ...
2021-04-17  Merge tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm  Linus Torvalds  5  -18/+56
Pull libnvdimm fixes from Dan Williams: "The largest change is for a regression that landed during -rc1 for block-device read-only handling. Vaibhav found a new use for the ability (originally introduced by virtio_pmem) to call back to the platform to flush data, but also found an original bug in that implementation. Lastly, Arnd cleans up some compile warnings in dax. This has all appeared in -next with no reported issues. Summary: - Fix a regression of read-only handling in the pmem driver - Fix a compile warning - Fix support for platform cache flush commands on powerpc/papr" * tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC libnvdimm: Notify disk drivers to revalidate region read-only dax: avoid -Wempty-body warnings
2021-04-17  Merge tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl  Linus Torvalds  1  -63/+89
Pull CXL memory class fixes from Dan Williams: "A collection of fixes for the CXL memory class driver introduced in this release cycle. The driver was primarily developed on a work-in-progress QEMU emulation of the interface and we have since found a couple places where it hid spec compliance bugs in the driver, or had a spec implementation bug itself. The biggest change here is replacing a percpu_ref with an rwsem to cleanup a couple bugs in the error unwind path during ioctl device init. Lastly there were some minor cleanups to not export the power-management sysfs-ABI for the ioctl device, use the proper sysfs helper for emitting values, and prevent subtle bugs as new administration commands are added to the supported list. The bulk of it has appeared in -next save for the top commit which was found today and validated on a fixed-up QEMU model. Summary: - Fix support for CXL memory devices with registers offset from the BAR base. - Fix the reporting of device capacity. - Fix the driver commands list definition to be disconnected from the UAPI command list. - Replace percpu_ref with rwsem to fix initialization error path. - Fix leaks in the driver initialization error path. - Drop the power/ directory from CXL device sysfs. - Use the recommended sysfs helper for attribute 'show' implementations" * tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/mem: Fix memory device capacity probing cxl/mem: Fix register block offset calculation cxl/mem: Force array size of mem_commands[] to CXL_MEM_COMMAND_ID_MAX cxl/mem: Disable cxl device power management cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations cxl/mem: Use sysfs_emit() for attribute show routines
2021-04-17  Merge branch 'akpm' (patches from Andrew)  Linus Torvalds  25  -59/+66
Merge misc fixes from Andrew Morton: "12 patches. Subsystems affected by this patch series: mm (documentation, kasan, and pagemap), csky, ia64, gcov, and lib" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: lib: remove "expecting prototype" kernel-doc warnings gcov: clang: fix clang-11+ build mm: ptdump: fix build failure mm/mapping_dirty_helpers: guard hugepage pud's usage ia64: tools: remove duplicate definition of ia64_mf() on ia64 ia64: tools: remove inclusion of ia64-specific version of errno.h header ia64: fix discontig.c section mismatches ia64: remove duplicate entries in generic_defconfig csky: change a Kconfig symbol name to fix e1000 build error kasan: remove redundant config option kasan: fix hwasan build for gcc mm: eliminate "expecting prototype" kernel-doc warnings
2021-04-17  posix-timers: Preserve return value in clock_adjtime32()  Chen Jun  1  -2/+2
The return value on success (>= 0) is overwritten by the return value of put_old_timex32(). That works correctly in the fault case, but is wrong for the success case where put_old_timex32() returns 0. Just check the return value of put_old_timex32() and return -EFAULT in case it is not zero. [ tglx: Massage changelog ] Fixes: 3a4d44b61625 ("ntp: Move adjtimex related compat syscalls to native counterparts") Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Cochran <richardcochran@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210414030449.90692-1-chenjun102@huawei.com
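The control flow described above can be modeled in userspace as follows. The two stub functions are hypothetical stand-ins for the clock adjustment call and for put_old_timex32(); only the shape of the return-value handling is meant to reflect the commit message.

```c
#include <errno.h>

/* Hypothetical stubs; the globals let a caller simulate each outcome. */
static int stub_adjtime_result;	/* >= 0 on success, negative errno on failure */
static int stub_copy_fails;	/* non-zero simulates a fault while copying out */

static int stub_do_adjtime(void) { return stub_adjtime_result; }
static int stub_put_timex(void)  { return stub_copy_fails ? -EFAULT : 0; }

static int adjtime32_sketch(void)
{
	int err = stub_do_adjtime();

	/* Replace the result only when the copy-out actually fails. */
	if (err >= 0 && stub_put_timex())
		return -EFAULT;

	return err;	/* success value (>= 0) or the original error survives */
}
```

The buggy variant would assign the copy-out result to `err` unconditionally, turning every successful non-zero return into 0.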
2021-04-17  powerpc/smp: Set numa node before updating mask  Srikar Dronamraju  1  -3/+3
Geethika reported a trace when doing a dlpar CPU add. ------------[ cut here ]------------ WARNING: CPU: 152 PID: 1134 at kernel/sched/topology.c:2057 CPU: 152 PID: 1134 Comm: kworker/152:1 Not tainted 5.12.0-rc5-master #5 Workqueue: events cpuset_hotplug_workfn NIP: c0000000001cfc14 LR: c0000000001cfc10 CTR: c0000000007e3420 REGS: c0000034a08eb260 TRAP: 0700 Not tainted (5.12.0-rc5-master+) MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 28828422 XER: 00000020 CFAR: c0000000001fd888 IRQMASK: 0 #012GPR00: c0000000001cfc10 c0000034a08eb500 c000000001f35400 0000000000000027 #012GPR04: c0000035abaa8010 c0000035abb30a00 0000000000000027 c0000035abaa8018 #012GPR08: 0000000000000023 c0000035abaaef48 00000035aa540000 c0000035a49dffe8 #012GPR12: 0000000028828424 c0000035bf1a1c80 0000000000000497 0000000000000004 #012GPR16: c00000000347a258 0000000000000140 c00000000203d468 c000000001a1a490 #012GPR20: c000000001f9c160 c0000034adf70920 c0000034aec9fd20 0000000100087bd3 #012GPR24: 0000000100087bd3 c0000035b3de09f8 0000000000000030 c0000035b3de09f8 #012GPR28: 0000000000000028 c00000000347a280 c0000034aefe0b00 c0000000010a2a68 NIP [c0000000001cfc14] build_sched_domains+0x6a4/0x1500 LR [c0000000001cfc10] build_sched_domains+0x6a0/0x1500 Call Trace: [c0000034a08eb500] [c0000000001cfc10] build_sched_domains+0x6a0/0x1500 (unreliable) [c0000034a08eb640] [c0000000001d1e6c] partition_sched_domains_locked+0x3ec/0x530 [c0000034a08eb6e0] [c0000000002936d4] rebuild_sched_domains_locked+0x524/0xbf0 [c0000034a08eb7e0] [c000000000296bb0] rebuild_sched_domains+0x40/0x70 [c0000034a08eb810] [c000000000296e74] cpuset_hotplug_workfn+0x294/0xe20 [c0000034a08ebc30] [c000000000178dd0] process_one_work+0x300/0x670 [c0000034a08ebd10] [c0000000001791b8] worker_thread+0x78/0x520 [c0000034a08ebda0] [c000000000185090] kthread+0x1a0/0x1b0 [c0000034a08ebe10] [c00000000000ccec] ret_from_kernel_thread+0x5c/0x70 Instruction dump: 7d2903a6 4e800421 e8410018 7f67db78 7fe6fb78 7f45d378 7f84e378 7c681b78 3c62ff1a 
3863c6f8 4802dc35 60000000 <0fe00000> 3920fff4 f9210070 e86100a0 ---[ end trace 532d9066d3d4d7ec ]--- Some of the per-CPU masks use cpu_cpu_mask as a filter to limit the search for related CPUs. On a dlpar add of a CPU, update cpu_cpu_mask before updating the per-CPU masks. This will ensure the cpu_cpu_mask is updated correctly before it's used in setting the masks. Setting the numa_node will ensure that when cpu_cpu_mask() gets called, the correct node number is used. This code movement helped fix the above call trace. Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210401154200.150077-1-srikar@linux.vnet.ibm.com
2021-04-17  KVM: Take mmu_lock when handling MMU notifier iff the hva hits a memslot  Sean Christopherson  1  -4/+11
Defer acquiring mmu_lock in the MMU notifier paths until a "hit" has been detected in the memslots, i.e. don't take the lock for notifications that don't affect the guest. For small VMs, spurious locking is a minor annoyance. And for "volatile" setups where the majority of notifications _are_ relevant, this barely qualifies as an optimization. But, for large VMs (hundreds of threads) with static setups, e.g. no page migration, no swapping, etc., the vast majority of MMU notifier callbacks will be unrelated to the guest, e.g. will often be in response to the userspace VMM adjusting its own virtual address space. In such large VMs, acquiring mmu_lock can be painful as it blocks vCPUs from handling page faults. In some scenarios it can even be "fatal" in the sense that it causes unacceptable brownouts, e.g. when rebuilding huge pages after live migration, a significant percentage of vCPUs will be attempting to handle page faults. x86's TDP MMU implementation is especially susceptible to spurious locking due to it taking mmu_lock for read when handling page faults. Because rwlock is fair, a single writer will stall future readers, while the writer is itself stalled waiting for in-progress readers to complete. This is exacerbated by the MMU notifiers often firing multiple times in quick succession, e.g. moving a page will (always?) invoke three separate notifiers: .invalidate_range_start(), .invalidate_range_end(), and .change_pte(). Unnecessarily taking mmu_lock each time means even a single spurious sequence can be problematic. Note, this optimizes only the unpaired callbacks. Optimizing the .invalidate_range_{start,end}() pairs is more complex and will be done in a future patch. Suggested-by: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
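The idea of taking the lock only on the first memslot hit can be sketched in userspace as below. Everything here is a simplified model: `struct slot`, the counting fake lock, and `handle_hva_range()` are hypothetical names, and a real memslot carries far more state than an address range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memslot: a half-open host virtual address range. */
struct slot { uint64_t hva_start, hva_end; };

static int lock_acquisitions;	/* counts how often the simulated lock is taken */

static void fake_lock(void)   { lock_acquisitions++; }
static void fake_unlock(void) { }

/* Returns true if any slot overlapped [start, end). */
static bool handle_hva_range(const struct slot *slots, size_t n,
			     uint64_t start, uint64_t end)
{
	bool locked = false;

	for (size_t i = 0; i < n; i++) {
		/* Skip slots that do not overlap the notified range. */
		if (start >= slots[i].hva_end || end <= slots[i].hva_start)
			continue;

		if (!locked) {		/* first hit: take the lock now */
			locked = true;
			fake_lock();
		}
		/* ... the per-slot handler would run here, under the lock ... */
	}

	if (locked)
		fake_unlock();
	return locked;
}
```

A notification for a range that touches no slot walks the array and returns without ever calling `fake_lock()`, which is the spurious acquisition the commit avoids.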
2021-04-17  KVM: Move MMU notifier's mmu_lock acquisition into common helper  Sean Christopherson  1  -41/+80
Acquire and release mmu_lock in the __kvm_handle_hva_range() helper instead of requiring the caller to do the same. This paves the way for future patches to take mmu_lock if and only if an overlapping memslot is found, without also having to introduce the on_lock() shenanigans used to manipulate the notifier count and sequence. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: Kill off the old hva-based MMU notifier callbacks  Sean Christopherson  6  -97/+0
Yank out the hva-based MMU notifier APIs now that all architectures that use the notifiers have moved to the gfn-based APIs. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: PPC: Convert to the gfn-based MMU notifier callbacks  Sean Christopherson  10  -173/+95
Move PPC to the gfn-based MMU notifier APIs, and update all 15 bajillion PPC-internal hooks to work with gfns instead of hvas. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: MIPS/MMU: Convert to the gfn-based MMU notifier callbacks  Sean Christopherson  2  -79/+14
Move MIPS to the gfn-based MMU notifier APIs, which do the hva->gfn lookup in common code, and whose code is nearly identical to MIPS' lookup. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: arm64: Convert to the gfn-based MMU notifier callbacks  Sean Christopherson  2  -84/+32
Move arm64 to the gfn-based MMU notifier APIs, which do the hva->gfn lookup in common code. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: Move x86's MMU notifier memslot walkers to generic code  Sean Christopherson  6  -246/+314
Move the hva->gfn lookup for MMU notifiers into common code. Every arch does a similar lookup, and some arch code is all but identical across multiple architectures. In addition to consolidating code, this will allow introducing optimizations that will benefit all architectures without incurring multiple walks of the memslots, e.g. by taking mmu_lock if and only if a relevant range exists in the memslots. The use of __always_inline to avoid indirect call retpolines, as done by x86, may also benefit other architectures. Consolidating the lookups also fixes a wart in x86, where the legacy MMU and TDP MMU each do their own memslot walks. Lastly, future enhancements to the memslot implementation, e.g. to add an interval tree to track host address, will need to touch far less arch specific code. MIPS, PPC, and arm64 will be converted one at a time in future patches. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: Assert that notifier count is elevated in .change_pte()  Sean Christopherson  1  -2/+7
In KVM's .change_pte() notification callback, replace the notifier sequence bump with a WARN_ON assertion that the notifier count is elevated. An elevated count provides stricter protections than bumping the sequence, and the sequence is guaranteed to be bumped before the count hits zero. When .change_pte() was added by commit 828502d30073 ("ksm: add mmu_notifier set_pte_at_notify()"), bumping the sequence was necessary as .change_pte() would be invoked without any surrounding notifications. However, since commit 6bdb913f0a70 ("mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end"), all calls to .change_pte() are guaranteed to be surrounded by start() and end(), and so are guaranteed to run with an elevated notifier count. Note, wrapping .change_pte() with .invalidate_range_{start,end}() is a bug of sorts, as invalidating the secondary MMU's (KVM's) PTE defeats the purpose of .change_pte(). Every arch's kvm_set_spte_hva() assumes .change_pte() is called when the relevant SPTE is present in KVM's MMU, as the original goal was to accelerate Kernel Samepage Merging (KSM) by updating KVM's SPTEs without requiring a VM-Exit (due to invalidating the SPTE). I.e. it means that .change_pte() is effectively dead code on _all_ architectures. x86 and MIPS are clearcut nops if the old SPTE is not-present, and that is guaranteed due to the prior invalidation. PPC simply unmaps the SPTE, which again should be a nop due to the invalidation. arm64 is a bit murky, but it's also likely a nop because kvm_pgtable_stage2_map() is called without a cache pointer, which means it will map an entry if and only if an existing PTE was found. For now, take advantage of the bug to simplify future consolidation of KVM's MMU notifier code. Doing so will not greatly complicate fixing .change_pte(), assuming it's even worth fixing. .change_pte() has been broken for 8+ years and no one has complained.
Even if there are KSM+KVM users that care deeply about its performance, the benefits of avoiding VM-Exits via .change_pte() need to be reevaluated to justify the added complexity and testing burden. Ripping out .change_pte() entirely would be a lot easier. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: MIPS: defer flush to generic MMU notifier code  Paolo Bonzini  1  -9/+2
Return 1 from kvm_unmap_hva_range and kvm_set_spte_hva if a flush is needed, so that the generic code can coalesce the flushes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
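The benefit of returning a "flush needed" flag instead of flushing inside each handler can be sketched as follows. This is a userspace model with hypothetical names (`notify_slots()`, `fake_flush()`, `handler_odd()`); it only illustrates how a caller can coalesce several flush requests into one.

```c
#include <stddef.h>

static int flush_count;		/* how many (simulated) TLB flushes were issued */

static void fake_flush(void) { flush_count++; }

/* A handler returns non-zero if its work requires a flush. */
typedef int (*range_handler)(int slot);

/* Hypothetical handler: pretend odd-numbered slots had mappings to drop. */
static int handler_odd(int slot) { return slot & 1; }

/* Run the handler on every slot, then flush at most once. */
static int notify_slots(const int *slots, size_t n, range_handler h)
{
	int need_flush = 0;

	for (size_t i = 0; i < n; i++)
		need_flush |= !!h(slots[i]);

	if (need_flush)
		fake_flush();	/* one flush covers all slots that asked for it */

	return need_flush;
}
```

If each handler flushed on its own, three affected slots would cost three flushes; with the flag, the caller issues one.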
2021-04-17  KVM: MIPS: let generic code call prepare_flush_shadow  Paolo Bonzini  3  -10/+10
Since all calls to kvm_flush_remote_tlbs must be preceded by kvm_mips_callbacks->prepare_flush_shadow, repurpose kvm_arch_flush_remote_tlb to invoke it. This makes it possible to use the TLB flushing mechanism provided by the generic MMU notifier code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: MIPS: rework flush_shadow_* callbacks into one that prepares the flush  Paolo Bonzini  5  -43/+19
Both trap-and-emulate and VZ have a single implementation that covers both .flush_shadow_all and .flush_shadow_memslot, and both of them end with a call to kvm_flush_remote_tlbs. Unify the callbacks into one and extract the call to kvm_flush_remote_tlbs. The next patches will pull it further out of the architecture-specific MMU notifier functions kvm_unmap_hva_range and kvm_set_spte_hva. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: constify kvm_arch_flush_remote_tlbs_memslot  Paolo Bonzini  4  -4/+4
memslots are stored in RCU and there should be no need to change them. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: Explicitly use GFP_KERNEL_ACCOUNT for 'struct kvm_vcpu' allocations  Sean Christopherson  1  -1/+1
Use GFP_KERNEL_ACCOUNT when allocating vCPUs to make it more obvious that the allocations are accounted, to make it easier to audit KVM's allocations in the future, and to be consistent with other cache usage in KVM. When using SLAB/SLUB, this is a nop as the cache itself is created with SLAB_ACCOUNT. When using SLOB, there are caveats within caveats. SLOB doesn't honor SLAB_ACCOUNT, so passing GFP_KERNEL_ACCOUNT will result in vCPU allocations now being accounted. But, even that depends on internal SLOB details as SLOB will only go to the page allocator when its cache is depleted. That just happens to be extremely likely for vCPUs because the size of kvm_vcpu is larger than a page for almost all combinations of architecture and page size. Whether or not the SLOB behavior is by design is unknown; it's just as likely that no SLOB users care about accounting and so no one has bothered to implement support in SLOB. Regardless, accounting vCPU allocations will not break SLOB+KVM+cgroup users, if any exist. Reviewed-by: Wanpeng Li <kernellwp@gmail.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210406190740.4055679-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: MMU: protect TDP MMU pages only down to required level  Paolo Bonzini  1  -1/+1
When using manual protection of dirty pages, it is not necessary to protect nested page tables down to the 4K level; instead KVM can protect only hugepages in order to split them lazily, and delay write protection at 4K-granularity until KVM_CLEAR_DIRTY_LOG. This was overlooked in the TDP MMU, so do it there as well. Fixes: a6a0b05da9f37 ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Cc: Ben Gardon <bgardon@google.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: s390x: implement KVM_CAP_SET_GUEST_DEBUG2  Maxim Levitsky  2  -0/+7
Define KVM_GUESTDBG_VALID_MASK and use it to implement this capability. Compile tested only. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401135451.1004564-6-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: aarch64: implement KVM_CAP_SET_GUEST_DEBUG2  Maxim Levitsky  3  -5/+6
Move KVM_GUESTDBG_VALID_MASK to kvm_host.h and use it to return the value of this capability. Compile tested only. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401135451.1004564-5-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: x86: implement KVM_CAP_SET_GUEST_DEBUG2  Maxim Levitsky  2  -0/+11
Store the supported bits into KVM_GUESTDBG_VALID_MASK macro, similar to how arm does this. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401135451.1004564-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17  KVM: introduce KVM_CAP_SET_GUEST_DEBUG2  Paolo Bonzini  2  -0/+4
This capability will allow the user to know which KVM_GUESTDBG_* bits are supported. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401135451.1004564-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
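From the user's side, a capability that reports the supported bits is typically consumed by checking requested flags against the returned mask. The sketch below assumes the capability value is a plain bitmask of supported KVM_GUESTDBG_* bits; the helper name is hypothetical, and how the mask is obtained from the kernel is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if every requested debug flag is present in
 * the mask reported by the capability query.
 */
static bool guest_debug_flags_supported(uint32_t requested, uint32_t valid_mask)
{
	return (requested & ~valid_mask) == 0;
}
```

A VMM could run this check once at startup and refuse, or degrade gracefully, when a flag it wants falls outside the mask.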
2021-04-17  KVM: x86: pending exceptions must not be blocked by an injected event  Maxim Levitsky  2  -3/+15
Injected interrupts/nmi should not block a pending exception, but rather be either lost if nested hypervisor doesn't intercept the pending exception (as in stock x86), or be delivered in exitintinfo/IDT_VECTORING_INFO field, as a part of a VMexit that corresponds to the pending exception. The only reason for an exception to be blocked is when nested run is pending (and that can't really happen currently but still worth checking for). Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401143817.1030695-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: selftests: remove redundant semi-colonYang Yingliang1-1/+1
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Message-Id: <20210401142514.1688199-1-yangyingliang@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: nSVM: call nested_svm_load_cr3 on nested state loadMaxim Levitsky1-18/+22
While KVM's MMU should be fully reset by the loading of nested CR0/CR3/CR4 by KVM_SET_SREGS, we are not in nested mode yet when we do it, and therefore only root_mmu is reset. On regular nested entries we call nested_svm_load_cr3, which both updates the guest's CR3 in the MMU when needed and initializes the MMU again, which in turn initializes walk_mmu as well when nested paging is enabled in both host and guest. Since we don't call nested_svm_load_cr3 on nested state load, walk_mmu can be left uninitialized. This can lead to a NULL pointer dereference if we happen to get a nested page fault right after entering the nested guest for the first time after the migration and we decide to emulate it: the emulator then tries to access walk_mmu->gva_to_gpa, which is NULL. Therefore we should call this function on nested state load as well. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401141814.1029036-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: x86: dump_vmcs should include the autoload/autostore MSR listsDavid Edmondson1-0/+16
When dumping the current VMCS state, include the MSRs that are being automatically loaded/stored during VM entry/exit. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-6-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: x86: dump_vmcs should show the effective EFERDavid Edmondson2-5/+17
If EFER is not being loaded from the VMCS, show the effective value by reference to the MSR autoload list or calculation. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-5-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
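The selection order described above can be sketched as a standalone function. This is only an illustration, not the kernel's dump_vmcs() code: the struct, enum and function names are hypothetical, and the "calculation" branch (forcing or clearing LMA/LME according to the IA-32e mode entry control) is an assumption about what the commit message means by calculation.

```c
#include <stddef.h>
#include <stdint.h>

/* Architectural constants, defined locally so the sketch is self-contained. */
#define MSR_EFER                0xC0000080u
#define EFER_LME                (1ull << 8)
#define EFER_LMA                (1ull << 10)
#define VM_ENTRY_IA32E_MODE     0x00000200u
#define VM_ENTRY_LOAD_IA32_EFER 0x00008000u

/* Hypothetical stand-in for one entry of the MSR autoload list. */
struct msr_entry {
    uint32_t index;
    uint64_t value;
};

enum efer_source { EFER_FROM_VMCS, EFER_FROM_AUTOLOAD, EFER_CALCULATED };

/* Pick the EFER value the guest would actually run with. */
static uint64_t effective_efer(uint32_t vmentry_ctl, uint64_t vmcs_guest_efer,
                               const struct msr_entry *autoload, size_t nr,
                               uint64_t vcpu_efer, enum efer_source *src)
{
    size_t i;

    /* 1. The VMCS field is only meaningful if VM entry loads it. */
    if (vmentry_ctl & VM_ENTRY_LOAD_IA32_EFER) {
        *src = EFER_FROM_VMCS;
        return vmcs_guest_efer;
    }

    /* 2. Otherwise look for EFER in the MSR autoload list. */
    for (i = 0; i < nr; i++) {
        if (autoload[i].index == MSR_EFER) {
            *src = EFER_FROM_AUTOLOAD;
            return autoload[i].value;
        }
    }

    /* 3. Otherwise derive LMA/LME from the IA-32e mode entry control
     *    (assumed meaning of "calculation"). */
    *src = EFER_CALCULATED;
    if (vmentry_ctl & VM_ENTRY_IA32E_MODE)
        return vcpu_efer | (EFER_LMA | EFER_LME);
    return vcpu_efer & ~(EFER_LMA | EFER_LME);
}
```

Returning the source alongside the value lets a dump routine label the printed EFER so a reader can tell whether it came from the VMCS, the autoload list, or was derived.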
2021-04-17KVM: x86: dump_vmcs should consider only the load controls of EFER/PATDavid Edmondson1-4/+2
When deciding whether to dump the GUEST_IA32_EFER and GUEST_IA32_PAT fields of the VMCS, examine only the VM entry load controls, as saving on VM exit has no effect on whether VM entry succeeds or fails. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-4-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: x86: dump_vmcs should not conflate EFER and PAT presence in VMCSDavid Edmondson1-9/+10
Show EFER and PAT based on their individual entry/exit controls. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-3-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: x86: dump_vmcs should not assume GUEST_IA32_EFER is validDavid Edmondson1-6/+3
If the VM entry/exit controls for loading/saving MSR_EFER are either not available (an older processor or explicitly disabled) or not used (host and guest values are the same), reading GUEST_IA32_EFER from the VMCS returns an inaccurate value. Because of this, in dump_vmcs() don't use GUEST_IA32_EFER to decide whether to print the PDPTRs - always do so if the fields exist. Fixes: 4eb64dce8d0a ("KVM: x86: dump VMCS on invalid entry") Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-2-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: nSVM: improve SYSENTER emulation on AMDMaxim Levitsky2-37/+68
Currently, to support Intel->AMD migration, if CPU vendor is GenuineIntel, we emulate the full 64 bit value for MSR_IA32_SYSENTER_{EIP|ESP} msrs, and we also emulate the sysenter/sysexit instruction in long mode. (Emulator does still refuse to emulate sysenter in 64 bit mode, on the grounds that the code for that wasn't tested and likely has no users) However when virtual vmload/vmsave is enabled, the vmload instruction will update these 32 bit msrs without triggering their msr intercept, which will lead to having stale values in kvm's shadow copy of these msrs, which relies on the intercept to be up to date. Fix/optimize this by doing the following: 1. Enable the MSR intercepts for SYSENTER MSRs iff vendor=GenuineIntel (This is both a tiny optimization and also ensures that in case the guest cpu vendor is AMD, the msrs will be 32 bit wide as AMD defined). 2. Store only high 32 bit part of these msrs on interception and combine it with hardware msr value on intercepted read/writes iff vendor=GenuineIntel. 3. Disable vmload/vmsave virtualization if vendor=GenuineIntel. (It is somewhat insane to set vendor=GenuineIntel and still enable SVM for the guest but well whatever). Then zero the high 32 bit parts when kvm intercepts and emulates vmload. Thanks a lot to Paolo Bonzini for helping me with fixing this in the most correct way. This patch fixes nested migration of 32 bit nested guests, which was broken because incorrect cached values of SYSENTER msrs were stored in the migration stream if L1 changed these msrs with vmload prior to L2 entry. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401111928.996871-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
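Steps 2 and 3 above can be illustrated with a small model of one SYSENTER MSR. This is a sketch under stated assumptions, not the patch itself: struct sysenter_msr and the three helpers are hypothetical names, hw_lo stands in for the 32 bit value the AMD hardware keeps, and shadow_hi stands in for the upper half that KVM would store only for guests with vendor=GenuineIntel.

```c
#include <stdint.h>

/* Hypothetical model of one SYSENTER_{EIP|ESP} MSR. */
struct sysenter_msr {
    uint32_t hw_lo;     /* 32 bit value held by the hardware */
    uint32_t shadow_hi; /* upper 32 bits kept in software for Intel guests */
};

/* Intercepted write: hardware takes the low half; the high half is kept
 * only when the guest CPU vendor is Intel. */
static void sysenter_write(struct sysenter_msr *m, uint64_t data,
                           int guest_is_intel)
{
    m->hw_lo = (uint32_t)data;
    m->shadow_hi = guest_is_intel ? (uint32_t)(data >> 32) : 0;
}

/* Intercepted read: combine the stored high half with the hardware value
 * for Intel guests; otherwise the MSR is 32 bits wide. */
static uint64_t sysenter_read(const struct sysenter_msr *m, int guest_is_intel)
{
    uint64_t val = m->hw_lo;

    if (guest_is_intel)
        val |= (uint64_t)m->shadow_hi << 32;
    return val;
}

/* Emulated vmload: the instruction loads a 32 bit value, so the stored
 * high half is zeroed to avoid returning a stale upper part later. */
static void sysenter_vmload(struct sysenter_msr *m, uint32_t loaded_lo)
{
    m->hw_lo = loaded_lo;
    m->shadow_hi = 0;
}
```

Keeping only the high half in software means the low half always comes from the hardware value, so an update to the low 32 bits cannot leave a stale cached copy behind.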
2021-04-17KVM: x86: add guest_cpuid_is_intelMaxim Levitsky1-0/+8
This is similar to the existing 'guest_cpuid_is_amd_or_hygon'. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401111928.996871-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
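For illustration, a vendor check of this kind can be expressed as a comparison of the CPUID leaf 0 vendor string. The sketch below is not the kernel helper: cpuid_vendor_is_intel is a hypothetical name, and it takes the three register values directly rather than looking them up in a vCPU's CPUID entries.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical check: does CPUID leaf 0 report "GenuineIntel"?
 * The 12 byte vendor string is laid out across EBX, EDX, ECX in that
 * order, with the lowest byte of each register first. */
static int cpuid_vendor_is_intel(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    char vendor[13];
    int i, j;

    for (i = 0; i < 3; i++)
        for (j = 0; j < 4; j++)
            vendor[i * 4 + j] = (char)((regs[i] >> (8 * j)) & 0xff);
    vendor[12] = '\0';

    return strcmp(vendor, "GenuineIntel") == 0;
}
```

Extracting the bytes by shifting keeps the check independent of host byte order.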