Age | Commit message (Collapse) | Author | Files | Lines |
|
When doing x86_emulate_instruction(EMULTYPE_SKIP) interrupt shadow has to
be cleared if and only if the skipping is successful.
There are two immediate issues:
- In SVM skip_emulated_instruction() we are not zapping interrupt shadow
in case kvm_emulate_instruction(EMULTYPE_SKIP) is used to advance RIP
(!nrpip_save).
- In VMX handle_ept_misconfig() when running as a nested hypervisor we
(static_cpu_has(X86_FEATURE_HYPERVISOR) case) forget to clear interrupt
shadow.
Note that we intentionally don't handle the case when the skipped
instruction is supposed to prolong the interrupt shadow ("MOV/POP SS") as
skip-emulation of those instructions should not happen under normal
circumstances.
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
On AMD, kvm_x86_ops->skip_emulated_instruction(vcpu) can, in theory,
fail: in !nrips case we call kvm_emulate_instruction(EMULTYPE_SKIP).
Currently, we only do printk(KERN_DEBUG) when this happens and this
is not ideal. Propagate the error up the stack.
On VMX, skip_emulated_instruction() doesn't fail, we have two call
sites calling it explicitly: handle_exception_nmi() and
handle_task_switch(), we can just ignore the result.
On SVM, we also have two explicit call sites:
svm_queue_exception() and it seems we don't need to do anything there as
we check if RIP was advanced or not. In task_switch_interception(),
however, we are better off not proceeding to kvm_task_switch() in case
skip_emulated_instruction() failed.
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
results in #GP
svm->next_rip is only used by skip_emulated_instruction() and in case
kvm_set_msr() fails we rightfully don't do that. Move svm->next_rip
advancement to 'else' branch to avoid creating false impression that
it's always advanced (and make it look like rdmsr_interception()).
This is a preparatory change to removing hardcoded RIP advancement
from instruction intercepts, no functional change.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Jump to the common error handling in x86_decode_insn() if
__do_insn_fetch_bytes() fails so that its error code is converted to the
appropriate return type. Although the various helpers used by
x86_decode_insn() return X86EMUL_* values, x86_decode_insn() itself
returns EMULATION_FAILED or EMULATION_OK.
This doesn't cause a functional issue as the sole caller,
x86_emulate_instruction(), currently only cares about success vs.
failure, and success is indicated by '0' for both types
(X86EMUL_CONTINUE and EMULATION_OK).
Fixes: 285ca9e948fa ("KVM: emulate: speed up do_insn_fetch")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Similar to AMD bits, set the Intel bits from the vendor-independent
feature and bug flags, because KVM_GET_SUPPORTED_CPUID does not care
about the vendor and they should be set on AMD processors as well.
Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Even though it is preferrable to use SPEC_CTRL (represented by
X86_FEATURE_AMD_SSBD) instead of VIRT_SPEC, VIRT_SPEC is always
supported anyway because otherwise it would be impossible to
migrate from old to new CPUs. Make this apparent in the
result of KVM_GET_SUPPORTED_CPUID as well.
However, we need to hide the bit on Intel processors, so move
the setting to svm_set_supported_cpuid.
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The AMD_* bits have to be set from the vendor-independent
feature and bug flags, because KVM_GET_SUPPORTED_CPUID does not care
about the vendor and they should be set on Intel processors as well.
On top of this, SSBD, STIBP and AMD_SSB_NO bit were not set, and
VIRT_SSBD does not have to be added manually because it is a
cpufeature that comes directly from the host's CPUID bit.
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
test_msr_platform_info_disabled() generates EXIT_SHUTDOWN but VMCB state
is undefined after that so an attempt to launch this guest again from
test_msr_platform_info_enabled() fails. Reorder the tests to make test
pass.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This reverts commit 4e103134b862314dc2f2f18f2fb0ab972adc3f5f.
Alex Williamson reported regressions with device assignment with
this patch. Even though the bug is probably elsewhere and still
latent, this is needed to fix the regression.
Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
Reported-by: Alex Willamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
state_test and smm_test are failing on older processors that do not
have xcr0. This is because on those processor KVM does provide
support for KVM_GET/SET_XSAVE (to avoid having to rely on the older
KVM_GET/SET_FPU) but not for KVM_GET/SET_XCRS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When a vpcu is about to block by calling kvm_vcpu_block, we call
back into the arch code to allow any form of synchronization that
may be required at this point (SVN stops the AVIC, ARM synchronises
the VMCR and enables GICv4 doorbells). But this synchronization
comes in quite late, as we've potentially waited for halt_poll_ns
to expire.
Instead, let's move kvm_arch_vcpu_blocking() to the beginning of
kvm_vcpu_block(), which on ARM has several benefits:
- VMCR gets synchronised early, meaning that any interrupt delivered
during the polling window will be evaluated with the correct guest
PMR
- GICv4 doorbells are enabled, which means that any guest interrupt
directly injected during that window will be immediately recognised
Tang Nianyao ran some tests on a GICv4 machine to evaluate such
change, and reported up to a 10% improvement for netperf:
<quote>
netperf result:
D06 as server, intel 8180 server as client
with change:
package 512 bytes - 5500 Mbits/s
package 64 bytes - 760 Mbits/s
without change:
package 512 bytes - 5000 Mbits/s
package 64 bytes - 710 Mbits/s
</quote>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Since commit 503a62862e8f ("KVM: arm/arm64: vgic: Rely on the GIC driver to
parse the firmware tables"), the vgic_v{2,3}_probe functions stopped using
a DT node. Commit 909777324588 ("KVM: arm/arm64: vgic-new: vgic_init:
implement kvm_vgic_hyp_init") changed the functions again, and now they
require exactly one argument, a struct gic_kvm_info populated by the GIC
driver. Unfortunately the comments regressed and state that a DT node is
used instead. Change the function comments to reflect the current
prototypes.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
For VPIPT I-caches, we need I-cache maintenance on VMID rollover to
avoid an ABA problem. Consider a single vCPU VM, with a pinned stage-2,
running with an idmap VA->IPA and idmap IPA->PA. If we don't do
maintenance on rollover:
// VMID A
Writes insn X to PA 0xF
Invalidates PA 0xF (for VMID A)
I$ contains [{A,F}->X]
[VMID ROLLOVER]
// VMID B
Writes insn Y to PA 0xF
Invalidates PA 0xF (for VMID B)
I$ contains [{A,F}->X, {B,F}->Y]
[VMID ROLLOVER]
// VMID A
I$ contains [{A,F}->X, {B,F}->Y]
Unexpectedly hits stale I$ line {A,F}->X.
However, for PIPT and VIPT I-caches, the VMID doesn't affect lookup or
constrain maintenance. Given the VMID doesn't affect PIPT and VIPT
I-caches, and given VMID rollover is independent of changes to stage-2
mappings, I-cache maintenance cannot be necessary on VMID rollover for
PIPT or VIPT I-caches.
This patch removes the maintenance on rollover for VIPT and PIPT
I-caches. At the same time, the unnecessary colons are removed from the
asm statement to make it more legible.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that we have a cache of MSI->LPI translations, it is pretty
easy to implement kvm_arch_set_irq_inatomic (this cache can be
parsed without sleeping).
Hopefully, this will improve some LPI-heavy workloads.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When performing an MSI injection, let's first check if the translation
is already in the cache. If so, let's inject it quickly without
going through the whole translation process.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
On a successful translation, preserve the parameters in the LPI
translation cache. Each translation is reusing the last slot
in the list, naturally evicting the least recently used entry.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In order to avoid leaking vgic_irq structures on teardown, we need to
drop all references to LPIs before deallocating the cache itself.
This is done by invalidating the cache on vgic teardown.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
If an ITS gets disabled, we need to make sure that further interrupts
won't hit in the cache. For that, we invalidate the translation cache
when the ITS is disabled.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
If a vcpu disables LPIs at its redistributor level, we need to make sure
we won't pend more interrupts. For this, we need to invalidate the LPI
translation cache.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
commands
The LPI translation cache needs to be discarded when an ITS command
may affect the translation of an LPI (DISCARD, MAPC and MAPD with V=0)
or the routing of an LPI to a redistributor with disabled LPIs (MOVI,
MOVALL).
We decide to perform a full invalidation of the cache, irrespective
of the LPI that is affected. Commands are supposed to be rare enough
that it doesn't matter.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
There's a number of cases where we need to invalidate the caching
of translations, so let's add basic support for that.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Our LPI translation cache needs to be able to drop the refcount
on an LPI whilst already holding the lpi_list_lock.
Let's add a new primitive for this.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add the basic data structure that expresses an MSI to LPI
translation as well as the allocation/release hooks.
The size of the cache is arbitrarily defined as 16*nr_vcpus.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Testing has revealed the existence of a race condition where a XIVE
interrupt being shut down can be in one of the XIVE interrupt queues
(of which there are up to 8 per CPU, one for each priority) at the
point where free_irq() is called. If this happens, can return an
interrupt number which has been shut down. This can lead to various
symptoms:
- irq_to_desc(irq) can be NULL. In this case, no end-of-interrupt
function gets called, resulting in the CPU's elevated interrupt
priority (numerically lowered CPPR) never gets reset. That then
means that the CPU stops processing interrupts, causing device
timeouts and other errors in various device drivers.
- The irq descriptor or related data structures can be in the process
of being freed as the interrupt code is using them. This typically
leads to crashes due to bad pointer dereferences.
This race is basically what commit 62e0468650c3 ("genirq: Add optional
hardware synchronization for shutdown", 2019-06-28) is intended to
fix, given a get_irqchip_state() method for the interrupt controller
being used. It works by polling the interrupt controller when an
interrupt is being freed until the controller says it is not pending.
With XIVE, the PQ bits of the interrupt source indicate the state of
the interrupt source, and in particular the P bit goes from 0 to 1 at
the point where the hardware writes an entry into the interrupt queue
that this interrupt is directed towards. Normally, the code will then
process the interrupt and do an end-of-interrupt (EOI) operation which
will reset PQ to 00 (assuming another interrupt hasn't been generated
in the meantime). However, there are situations where the code resets
P even though a queue entry exists (for example, by setting PQ to 01,
which disables the interrupt source), and also situations where the
code leaves P at 1 after removing the queue entry (for example, this
is done for escalation interrupts so they cannot fire again until
they are explicitly re-enabled).
The code already has a 'saved_p' flag for the interrupt source which
indicates that a queue entry exists, although it isn't maintained
consistently. This patch adds a 'stale_p' flag to indicate that
P has been left at 1 after processing a queue entry, and adds code
to set and clear saved_p and stale_p as necessary to maintain a
consistent indication of whether a queue entry may or may not exist.
With this, we can implement xive_get_irqchip_state() by looking at
stale_p, saved_p and the ESB PQ bits for the interrupt.
There is some additional code to handle escalation interrupts
properly; because they are enabled and disabled in KVM assembly code,
which does not have access to the xive_irq_data struct for the
escalation interrupt. Hence, stale_p may be incorrect when the
escalation interrupt is freed in kvmppc_xive_{,native_}cleanup_vcpu().
Fortunately, we can fix it up by looking at vcpu->arch.xive_esc_on,
with some careful attention to barriers in order to ensure the correct
result if xive_esc_irq() races with kvmppc_xive_cleanup_vcpu().
Finally, this adds code to make noise on the console (pr_crit and
WARN_ON(1)) if we find an interrupt queue entry for an interrupt
which does not have a descriptor. While this won't catch the race
reliably, if it does get triggered it will be an indication that
the race is occurring and needs to be debugged.
Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190813100648.GE9567@blackberry
|
|
At present, when running a guest on POWER9 using HV KVM but not using
an in-kernel interrupt controller (XICS or XIVE), for example if QEMU
is run with the kernel_irqchip=off option, the guest entry code goes
ahead and tries to load the guest context into the XIVE hardware, even
though no context has been set up.
To fix this, we check that the "CAM word" is non-zero before pushing
it to the hardware. The CAM word is initialized to a non-zero value
in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(),
and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu.
Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190813100100.GC9567@blackberry
|
|
Escalation interrupts are interrupts sent to the host by the XIVE
hardware when it has an interrupt to deliver to a guest VCPU but that
VCPU is not running anywhere in the system. Hence we disable the
escalation interrupt for the VCPU being run when we enter the guest
and re-enable it when the guest does an H_CEDE hypercall indicating
it is idle.
It is possible that an escalation interrupt gets generated just as we
are entering the guest. In that case the escalation interrupt may be
using a queue entry in one of the interrupt queues, and that queue
entry may not have been processed when the guest exits with an H_CEDE.
The existing entry code detects this situation and does not clear the
vcpu->arch.xive_esc_on flag as an indication that there is a pending
queue entry (if the queue entry gets processed, xive_esc_irq() will
clear the flag). There is a comment in the code saying that if the
flag is still set on H_CEDE, we have to abort the cede rather than
re-enabling the escalation interrupt, lest we end up with two
occurrences of the escalation interrupt in the interrupt queue.
However, the exit code doesn't do that; it aborts the cede in the sense
that vcpu->arch.ceded gets cleared, but it still enables the escalation
interrupt by setting the source's PQ bits to 00. Instead we need to
set the PQ bits to 10, indicating that an interrupt has been triggered.
We also need to avoid setting vcpu->arch.xive_esc_on in this case
(i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because
xive_esc_irq() will run at some point and clear it, and if we race with
that we may end up with an incorrect result (i.e. xive_esc_on set when
the escalation interrupt has just been handled).
It is extremely unlikely that having two queue entries would cause
observable problems; theoretically it could cause queue overflow, but
the CPU would have to have thousands of interrupts targetted to it for
that to be possible. However, this fix will also make it possible to
determine accurately whether there is an unhandled escalation
interrupt in the queue, which will be needed by the following patch.
Fixes: 9b9b13a6d153 ("KVM: PPC: Book3S HV: Keep XIVE escalation interrupt masked unless ceded")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190813100349.GD9567@blackberry
|
|
When a vCPU is brought done, the XIVE VP (Virtual Processor) is first
disabled and then the event notification queues are freed. When freeing
the queues, we check for possible escalation interrupts and free them
also.
But when a XIVE VP is disabled, the underlying XIVE ENDs also are
disabled in OPAL. When an END (Event Notification Descriptor) is
disabled, its ESB pages (ESn and ESe) are disabled and loads return all
1s. Which means that any access on the ESB page of the escalation
interrupt will return invalid values.
When an interrupt is freed, the shutdown handler computes a 'saved_p'
field from the value returned by a load in xive_do_source_set_mask().
This value is incorrect for escalation interrupts for the reason
described above.
This has no impact on Linux/KVM today because we don't make use of it
but we will introduce in future changes a xive_get_irqchip_state()
handler. This handler will use the 'saved_p' field to return the state
of an interrupt and 'saved_p' being incorrect, softlockup will occur.
Fix the vCPU cleanup sequence by first freeing the escalation interrupts
if any, then disable the XIVE VP and last free the queues.
Fixes: 90c73795afa2 ("KVM: PPC: Book3S HV: Add a new KVM device for the XIVE native exploitation mode")
Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190806172538.5087-1-clg@kaod.org
|
|
vmx_set_nested_state_test is trying to use the KVM_STATE_NESTED_EVMCS without
enabling enlightened VMCS first. Correct the outcome of the test, and actually
test that it succeeds after the capability is enabled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
There are two tests already enabling eVMCS and a third is coming.
Add a function that enables the capability and tests the result.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This test is only covering various edge cases of the
KVM_SET_NESTED_STATE ioctl. Running the VM does not really
add anything.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
new_entry is reassigned a new value next line. So
it's redundant and remove it.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
This is probably overdue---KVM x86 has quite a few contributors that
usually review each other's patches, which is really helpful to me.
Formalize this by listing them as reviewers. I am including people
with various expertise:
- Joerg for SVM (with designated reviewers, it makes more sense to have
him in the main KVM/x86 stanza)
- Sean for MMU and VMX
- Jim for VMX
- Vitaly for Hyper-V and possibly SVM
- Wanpeng for LAPIC and paravirtualization.
Please ack if you are okay with this arrangement, otherwise speak up.
In other news, Radim is going to leave Red Hat soon. However, he has
not been very much involved in upstream KVM development for some time,
and in the immediate future he is still going to help maintain kvm/queue
while I am on vacation. Since not much is going to change, I will let
him decide whether he wants to keep the maintainer role after he leaves.
Acked-by: Joerg Roedel <joro@8bytes.org>
Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Wanpeng Li <wanpengli@tencent.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
KVM/s390 does not have a list of its own, and linux-s390 is in the
loop anyway thanks to the generic arch/s390 match. So use the generic
KVM list for s390 patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
recalculate_apic_map does not santize ldr and it's possible that
multiple bits are set. In that case, a previous valid entry
can potentially be overwritten by an invalid one.
This condition is hit when booting a 32 bit, >8 CPU, RHEL6 guest and then
triggering a crash to boot a kdump kernel. This is the sequence of
events:
1. Linux boots in bigsmp mode and enables PhysFlat, however, it still
writes to the LDR which probably will never be used.
2. However, when booting into kdump, the stale LDR values remain as
they are not cleared by the guest and there isn't a apic reset.
3. kdump boots with 1 cpu, and uses Logical Destination Mode but the
logical map has been overwritten and points to an inactive vcpu.
Signed-off-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull dax fixes from Dan Williams:
"A filesystem-dax and device-dax fix for v5.3.
The filesystem-dax fix is tagged for stable as the implementation has
been mistakenly throwing away all cow pages on any truncate or hole
punch operation as part of the solution to coordinate device-dma vs
truncate to dax pages.
The device-dax change fixes up a regression this cycle from the
introduction of a common 'internal per-cpu-ref' implementation.
Summary:
- Fix dax_layout_busy_page() to not discard private cow pages of
fs/dax private mappings.
- Update the memremap_pages core to properly cleanup on behalf of
internal reference-count users like device-dax"
* tag 'dax-fixes-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
mm/memremap: Fix reuse of pgmap instances with internal references
dax: dax_layout_busy_page() should not unmap cow pages
|
|
Pull NTB fix from Jon Mason:
"Bug fix for NTB MSI kernel compile warning"
* tag 'ntb-5.3-bugfixes' of git://github.com/jonmason/ntb:
NTB/msi: remove incorrect MODULE defines
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
"A few minor RISC-V updates for v5.3-rc4:
- Remove __udivdi3() from the 32-bit Linux port, converting the only
upstream user to use do_div(), per Linux policy
- Convert the RISC-V standard clocksource away from per-cpu data
structures, since only one is used by Linux, even on a multi-CPU
system
- A set of DT binding updates that remove an obsolete text binding in
favor of a YAML binding, fix a bogus compatible string in the
schema (thus fixing a "make dtbs_check" warning), and clarifies the
future values expected in one of the RISC-V CPU properties"
* tag 'riscv/for-v5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
dt-bindings: riscv: fix the schema compatible string for the HiFive Unleashed board
dt-bindings: riscv: remove obsolete cpus.txt
RISC-V: Remove udivdi3
riscv: delay: use do_div() instead of __udivdi3()
dt-bindings: Update the riscv,isa string description
RISC-V: Remove per cpu clocksource
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A few fixes for x86:
- Don't reset the carefully adjusted build flags for the purgatory
and remove the unwanted flags instead. The 'reset all' approach led
to build fails under certain circumstances.
- Unbreak CLANG build of the purgatory by avoiding the builtin
memcpy/memset implementations.
- Address missing prototype warnings by including the proper header
- Fix yet more fall-through issues"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/lib/cpu: Address missing prototypes warning
x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS
x86/purgatory: Do not use __builtin_memcpy and __builtin_memset
x86: mtrr: cyrix: Mark expected switch fall-through
x86/ptrace: Mark expected switch fall-through
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling fixes from Thomas Gleixner:
"Perf tooling fixes all over the place:
- Fix the selection of the main thread COMM in db-export
- Fix the disassemmbly display for BPF in annotate
- Fix cpumap mask setup in perf ftrace when only one CPU is present
- Add the missing 'cpu_clk_unhalted.core' event
- Fix CPU 0 bindings in NUMA benchmarks
- Fix the module size calculations for s390
- Handle the gap between kernel end and module start on s390
correctly
- Build and typo fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf pmu-events: Fix missing "cpu_clk_unhalted.core" event
perf annotate: Fix s390 gap between kernel end and module start
perf record: Fix module size on s390
perf tools: Fix include paths in ui directory
perf tools: Fix a typo in a variable name in the Documentation Makefile
perf cpumap: Fix writing to illegal memory in handling cpumap mask
perf ftrace: Fix failure to set cpumask when only one cpu is present
perf db-export: Fix thread__exec_comm()
perf annotate: Fix printing of unaugmented disassembled instructions from BPF
perf bench numa: Fix cpu0 binding
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"Three fixlets for the scheduler:
- Avoid double bandwidth accounting in the push & pull code
- Use a sane FIFO priority for the Pressure Stall Information (PSI)
thread.
- Avoid permission checks when setting the scheduler params for the
PSI thread"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/psi: Do not require setsched permission from the trigger creator
sched/psi: Reduce psimon FIFO priority
sched/deadline: Fix double accounting of rq/running bw in push & pull
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Thomas Gleixner:
"A small fix for the affinity spreading code.
It failed to handle situations where a single vector was requested
either due to only one CPU being available or vector exhaustion
causing only a single interrupt to be granted.
The fix is to simply remove the requirement in the affinity spreading
code for more than one interrupt being available"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/affinity: Create affinity mask for single vector
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool warning fix from Thomas Gleixner:
"The recent objtool fixes/enhancements unearthed a unbalanced CLAC in
the i915 driver.
Chris asked me to pick the fix up and route it through"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
drm/i915: Remove redundant user_access_end() from __copy_from_user() error path
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fix from Andreas Gruenbacher:
"Fix incorrect lseek / fiemap results"
* tag 'gfs2-v5.3-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: gfs2_walk_metadata fix
|
|
for clang
A compilation -Wimplicit-fallthrough warning was enabled by commit
a035d552a93b ("Makefile: Globally enable fall-through warning")
Even though clang 10.0.0 does not currently support this warning without
a patch, clang currently does not support a value for this option.
Link: https://bugs.llvm.org/show_bug.cgi?id=39382
The gcc default for this warning is 3 so removing the =3 has no effect
for gcc and enables the warning for patched versions of clang.
Also remove the =3 from an existing use in a parisc Makefile:
arch/parisc/math-emu/Makefile
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-and-tested-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes Greg KH:
"Here are some small char/misc driver fixes for 5.3-rc4.
Two of these are for the habanalabs driver for issues found when
running on a big-endian system (are they still alive?) The others are
tiny fixes reported by people, and a MAINTAINERS update about the
location of the fpga development tree.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
coresight: Fix DEBUG_LOCKS_WARN_ON for uninitialized attribute
MAINTAINERS: Move linux-fpga tree to new location
nvmem: Use the same permissions for eeprom as for nvmem
habanalabs: fix host memory polling in BE architecture
habanalabs: fix F/W download in BE architecture
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two small fixes for some driver core issues that have been
reported. There is also a kernfs "fix" here, which was then reverted
because it was found to cause problems in linux-next.
The driver core fixes both resolve reported issues, one with gpioint
stuff that showed up in 5.3-rc1, and the other finally (and hopefully)
resolves a very long standing race when removing glue directories.
It's nice to get that issue finally resolved and the developers
involved should be applauded for the persistence it took to get this
patch finally accepted.
All of these have been in linux-next for a while with no reported
issues. Well, the one reported issue, hence the revert :)"
* tag 'driver-core-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Revert "kernfs: fix memleak in kernel_ops_readdir()"
kernfs: fix memleak in kernel_ops_readdir()
driver core: Fix use-after-free and double free on glue directory
driver core: platform: return -ENXIO for missing GpioInt
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty fix from Greg KH:
"Here is a single tty kgdb fix for 5.3-rc4.
It fixes an annoying log message that has caused kdb to become
useless. It's another fallout from commit ddde3c18b700 ("vt: More
locking checks") which tries to enforce locking checks more strictly
in the tty layer, unfortunatly when kdb is stopped, there's no need
for locks :)
This patch has been linux-next for a while with no reported issues"
* tag 'tty-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
kgdboc: disable the console lock when in kgdb
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging / IIO driver fixes from Greg KH:
"Here are some small staging and IIO driver fixes for 5.3-rc4.
Nothing major, just resolutions for a number of small reported issues,
full details in the shortlog.
All have been in linux-next for a while with no reported issues"
* tag 'staging-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: adc: gyroadc: fix uninitialized return code
docs: generic-counter.rst: fix broken references for ABI file
staging: android: ion: Bail out upon SIGKILL when allocating memory.
Staging: fbtft: Fix GPIO handling
staging: unisys: visornic: Update the description of 'poll_for_irq()'
staging: wilc1000: flush the workqueue before deinit the host
staging: gasket: apex: fix copy-paste typo
Staging: fbtft: Fix reset assertion when using gpio descriptor
Staging: fbtft: Fix probing of gpio descriptor
iio: imu: mpu6050: add missing available scan masks
iio: cros_ec_accel_legacy: Fix incorrect channel setting
IIO: Ingenic JZ47xx: Set clock divider on probe
iio: adc: max9611: Fix misuse of GENMASK macro
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 5.3-rc4.
The "biggest" one here is moving code from one file to another in
order to fix a long-standing race condition with the creation of sysfs
files for USB devices. Turns out that there are now userspace tools
out there that are hitting this long-known bug, so it's time to fix
them. Thankfully the tool-maker in this case fixed the issue :)
The other patches in here are all fixes for reported issues. Now that
syzbot knows how to fuzz USB drivers better, and is starting to now
fuzz the userspace facing side of them at the same time, there will be
more and more small fixes like these coming, which is a good thing.
All of these have been in linux-next with no reported issues"
* tag 'usb-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: setup authorized_default attributes using usb_bus_notify
usb: iowarrior: fix deadlock on disconnect
Revert "USB: rio500: simplify locking"
usb: usbfs: fix double-free of usb memory upon submiturb error
usb: yurex: Fix use-after-free in yurex_delete
usb: typec: tcpm: Ignore unsupported/unknown alternate mode requests
xhci: Fix NULL pointer dereference at endpoint zero reset.
usb: host: xhci-rcar: Fix timeout in xhci_suspend()
usb: typec: ucsi: ccg: Fix uninitilized symbol error
usb: typec: tcpm: remove tcpm dir if no children
usb: typec: tcpm: free log buf memory when remove debug file
usb: typec: tcpm: Add NULL check before dereferencing config
|