summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-21bcachefs: Fix inode early destruction pathKent Overstreet1-3/+6
discard_new_inode() is the wrong interface to use when we need to free an inode that was never inserted into the inode hash table; we can bypass the whole iput() -> evict() path and replace it with __destroy_inode(); kmem_cache_free() - this fixes a WARN_ON() about I_NEW. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-21bcachefs: Fix deadlock in journal write pathKent Overstreet1-18/+42
bch2_journal_write() was incorrectly waiting on earlier journal writes synchronously; this usually worked because most of the time we'd be running in the context of a thread that did a journal_buf_put(), but sometimes we'd be running out of the same workqueue that completes those prior journal writes. Additionally, this makes sure to punt to a workqueue before submitting preflushes - we really don't want to be calling submit_bio() in the main transaction commit path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-21bcachefs: Tweak btree key cache shrinker so it actually freesKent Overstreet1-15/+4
Freeing key cache items is a multi stage process; we need to wait for an SRCU grace period to elapse, and we handle this ourselves - partially to avoid callback overhead, but primarily so that when allocating we can first allocate from the freed items waiting for an SRCU grace period. Previously, the shrinker was counting the items on the 'waiting for SRCU grace period' lists as items being scanned, but this meant that too many items waiting for an SRCU grace period could prevent it from doing any work at all. After this, we're seeing that items skipped due to the accessed bit are the main cause of the shrinker not making any progress, and we actually want the key cache shrinker to run quite aggressively because reclaimed items will still generally be found (more compactly) in the btree node cache - so we also tweak the shrinker to not count those against nr_to_scan. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-20bcachefs: bkey_cached.btree_trans_barrier_seq needs to be a ulongKent Overstreet1-1/+1
this stores the SRCU sequence number, which we use to check if an SRCU barrier has elapsed; this is a partial fix for the key cache shrinker not actually freeing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-20Merge tag 'block-6.9-20240420' of git://git.kernel.dk/linuxLinus Torvalds4-13/+28
Pull block fixes from Jens Axboe: "Just two minor fixes that should go into the 6.9 kernel release, one fixing a regression with partition scanning errors, and one fixing a WARN_ON() that can get triggered if we race with a timer" * tag 'block-6.9-20240420' of git://git.kernel.dk/linux: blk-iocost: do not WARN if iocg was already offlined block: propagate partition scanning errors to the BLKRRPART ioctl
2024-04-20Merge tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-2/+2
Pull email address update from James Bottomley: "My IBM email has stopped working, so update to a working email address" * tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: MAINTAINERS: update to working email address
2024-04-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds25-159/+267
Pull kvm fixes from Paolo Bonzini: "This is a bit on the large side, mostly due to two changes: - Changes to disable some broken PMU virtualization (see below for details under "x86 PMU") - Clean up SVM's enter/exit assembly code so that it can be compiled without OBJECT_FILES_NON_STANDARD. This fixes a warning "Unpatched return thunk in use. This should not happen!" when running KVM selftests. Everything else is small bugfixes and selftest changes: - Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure where KVM would allow userspace to refresh the cache with a bogus GPA. The bug has existed for quite some time, but was exposed by a new sanity check added in 6.9 (to ensure a cache is either GPA-based or HVA-based). - Drop an unused param from gfn_to_pfn_cache_invalidate_start() that got left behind during a 6.9 cleanup. - Fix a math goof in x86's hugepage logic for KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow (detected by KASAN). - Fix a bug where KVM incorrectly clears root_role.direct when userspace sets guest CPUID. - Fix a dirty logging bug in the where KVM fails to write-protect SPTEs used by a nested guest, if KVM is using Page-Modification Logging and the nested hypervisor is NOT using EPT. x86 PMU: - Drop support for virtualizing adaptive PEBS, as KVM's implementation is architecturally broken without an obvious/easy path forward, and because exposing adaptive PEBS can leak host LBRs to the guest, i.e. can leak host kernel addresses to the guest. - Set the enable bits for general purpose counters in PERF_GLOBAL_CTRL at RESET time, as done by both Intel and AMD processors. - Disable LBR virtualization on CPUs that don't support LBR callstacks, as KVM unconditionally uses PERF_SAMPLE_BRANCH_CALL_STACK when creating the perf event, and would fail on such CPUs. Tests: - Fix a flaw in the max_guest_memory selftest that results in it exhausting the supply of ucall structures when run with more than 256 vCPUs. - Mark KVM_MEM_READONLY as supported for RISC-V in set_memory_region_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits) KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start() KVM: selftests: Add coverage of EPT-disabled to vmx_dirty_log_test KVM: x86/mmu: Fix and clarify comments about clearing D-bit vs. write-protecting KVM: x86/mmu: Remove function comments above clear_dirty_{gfn_range,pt_masked}() KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status KVM: x86/mmu: Precisely invalidate MMU root_role during CPUID update KVM: VMX: Disable LBR virtualization if the CPU doesn't support LBR callstacks perf/x86/intel: Expose existence of callback support to KVM KVM: VMX: Snapshot LBR capabilities during module initialization KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible KVM: x86: Stop compiling vmenter.S with OBJECT_FILES_NON_STANDARD KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run() KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run() KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding KVM: SVM: Remove a useless zeroing of allocated memory ...
2024-04-20Merge tag 'powerpc-6.9-3' of ↵Linus Torvalds3-6/+11
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix wireguard loading failure on pre-Power10 due to Power10 crypto routines - Fix papr-vpd selftest failure due to missing variable initialization - Avoid unnecessary get/put in spapr_tce_platform_iommu_attach_dev() Thanks to Geetika Moolchandani, Jason Gunthorpe, Michal Suchánek, Nathan Lynch, and Shivaprasad G Bhat. * tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc/papr-vpd: Fix missing variable initialization powerpc/crypto/chacha-p10: Fix failure on non Power10 powerpc/iommu: Refactor spapr_tce_platform_iommu_attach_dev()
2024-04-20Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds4-39/+156
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A couple clk driver fixes, a build fix, and a deadlock fix: - Mediatek mt7988 has broken PCIe because the wrong parent is used - Mediatek clk drivers may deadlock when registering their clks because the clk provider device is repeatedly runtime PM resumed and suspended during probe and clk registration. Resuming the clk provider device deadlocks with an ABBA deadlock due to genpd_lock and the clk prepare_lock. The fix is to keep the device runtime resumed while registering clks. - Another runtime PM related deadlock, this time with disabling unused clks during late init. We get an ABBA deadlock where a device is runtime PM resuming (or suspending) while the disabling of unused clks is happening in parallel. That runtime PM action calls into the clk framework and tries to grab the clk prepare_lock while the disabling of unused clks holds the prepare_lock and is waiting for that runtime PM action to complete. The fix is to runtime resume all the clk provider devices before grabbing the clk prepare_lock during disable unused. - A build fix to provide an empty devm_clk_rate_exclusive_get() function when CONFIG_COMMON_CLK=n" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: mediatek: mt7988-infracfg: fix clocks for 2nd PCIe port clk: mediatek: Do a runtime PM get on controllers during probe clk: Get runtime PM before walking tree for clk_summary clk: Get runtime PM before walking tree during disable_unused clk: Initialize struct clk_core kref earlier clk: Don't hold prepare_lock when calling kref_put() clk: Remove prepare_lock hold assertion in __clk_release() clk: Provide !COMMON_CLK dummy for devm_clk_rate_exclusive_get()
2024-04-20Revert "svcrdma: Add Write chunk WRs to the RPC's Send WR chain"Chuck Lever3-78/+26
Performance regression reported with NFS/RDMA using Omnipath, bisected to commit e084ee673c77 ("svcrdma: Add Write chunk WRs to the RPC's Send WR chain"). Tracing on the server reports: nfsd-7771 [060] 1758.891809: svcrdma_sq_post_err: cq.id=205 cid=226 sc_sq_avail=13643/851 status=-12 sq_post_err reports ENOMEM, and the rdma->sc_sq_avail (13643) is larger than rdma->sc_sq_depth (851). The number of available Send Queue entries is always supposed to be smaller than the Send Queue depth. That seems like a Send Queue accounting bug in svcrdma. As it's getting to be late in the 6.9-rc cycle, revert this commit. It can be revisited in a subsequent kernel release. Link: https://bugzilla.kernel.org/show_bug.cgi?id=218743 Fixes: e084ee673c77 ("svcrdma: Add Write chunk WRs to the RPC's Send WR chain") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-20MAINTAINERS: update to working email addressJames Bottomley1-2/+2
jejb@linux.ibm.com no longer works. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2024-04-20KVM: arm64: Drop trapping of PAuth instructions/keysMarc Zyngier7-99/+70
We currently insist on disabling PAuth on vcpu_load(), and get to enable it on first guest use of an instruction or a key (ignoring the NV case for now). It isn't clear at all what this is trying to achieve: guests tend to use PAuth when available, and nothing forces you to expose it to the guest if you don't want to. This also isn't totally free: we take a full GPR save/restore between host and guest, only to write ten 64bit registers. The "value proposition" escapes me. So let's forget this stuff and enable PAuth eagerly if exposed to the guest. This results in much simpler code. Performance wise, that's not bad either (tested on M2 Pro running a fully automated Debian installer as the workload): - On a non-NV guest, I can see reduction of 0.24% in the number of cycles (measured with perf over 10 consecutive runs) - On a NV guest (L2), I see a 2% reduction in wall-clock time (measured with 'time', as M2 doesn't have a PMUv3 and NV doesn't support it either) So overall, a much reduced complexity and a (small) performance improvement. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-16-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Advertise support for PAuthMarc Zyngier1-6/+2
Now that we (hopefully) correctly handle ERETAx, drop the masking of the PAuth feature (something that was not even complete, as APA3 and AGA3 were still exposed). Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-15-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Handle ERETA[AB] instructionsMarc Zyngier3-5/+33
Now that we have some emulation in place for ERETA[AB], we can plug it into the exception handling machinery. As for a bare ERET, an "easy" ERETAx instruction is processed as a fixup, while something that requires a translation regime transition or an exception delivery is left to the slow path. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-14-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Add emulation for ERETAx instructionsMarc Zyngier4-0/+210
FEAT_NV has the interesting property of relying on ERET being trapped. An added complexity is that it also traps ERETAA and ERETAB, meaning that the Pointer Authentication aspect of these instruction must be emulated. Add an emulation of Pointer Authentication, limited to ERETAx (always using SP_EL2 as the modifier and ELR_EL2 as the pointer), using the Generic Authentication instructions. The emulation, however small, is placed in its own compilation unit so that it can be avoided if the configuration doesn't include it (or the toolchan in not up to the task). Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-13-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Add kvm_has_pauth() helperMarc Zyngier1-0/+15
Pointer Authentication comes in many flavors, and a faithful emulation relies on correctly handling the flavour implemented by the HW. For this, provide a new kvm_has_pauth() that checks whether we expose to the guest a particular level of support. This checks across all 3 possible authentication algorithms (Q5, Q3 and IMPDEF). Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-12-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Reinject PAC exceptions caused by HCR_EL2.API==0Marc Zyngier1-3/+25
In order for a L1 hypervisor to correctly handle PAuth instructions, it must observe traps caused by a L1 PAuth instruction when HCR_EL2.API==0. Since we already handle the case for API==1 as a fixup, only the exception injection case needs to be handled. Rework the kvm_handle_ptrauth() callback to reinject the trap in this case. Note that APK==0 is already handled by the exising triage_sysreg_trap() helper. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-11-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Handle HCR_EL2.{API,APK} independentlyMarc Zyngier2-10/+27
Although KVM couples API and APK for simplicity, the architecture makes no such requirement, and the two can be independently set or cleared. Check for which of the two possible reasons we have trapped here, and if the corresponding L1 control bit isn't set, delegate the handling for forwarding. Otherwise, set this exact bit in HCR_EL2 and resume the guest. Of course, in the non-NV case, we keep setting both bits and be done with it. Note that the entry core already saves/restores the keys should any of the two control bits be set. This results in a bit of rework, and the removal of the (trivial) vcpu_ptrauth_enable() helper. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-10-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Honor HFGITR_EL2.ERET being setMarc Zyngier1-1/+2
If the L1 hypervisor decides to trap ERETs while running L2, make sure we don't try to emulate it, just like we wouldn't if it had its NV bit set. The exception will be reinjected from the core handler. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-9-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Fast-track 'InHost' exception returnsMarc Zyngier2-26/+47
A significant part of the FEAT_NV extension is to trap ERET instructions so that the hypervisor gets a chance to switch from a vEL2 L1 guest to an EL1 L2 guest. But this also has the unfortunate consequence of trapping ERET in unsuspecting circumstances, such as staying at vEL2 (interrupt handling while being in the guest hypervisor), or returning to host userspace in the case of a VHE guest. Although we already make some effort to handle these ERET quicker by not doing the put/load dance, it is still way too far down the line for it to be efficient enough. For these cases, it would ideal to ERET directly, no question asked. Of course, we can't do that. But the next best thing is to do it as early as possible, in fixup_guest_exit(), much as we would handle FPSIMD exceptions. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-8-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Add trap forwarding for ERET and SMCMarc Zyngier3-0/+35
Honor the trap forwarding bits for both ERET and SMC, using a new helper that checks for common conditions. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Co-developed-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-7-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2Marc Zyngier3-5/+36
Add the HCR_EL2 configuration for FEAT_NV2, adding the required bits for running a guest hypervisor, and overall merging the allowed bits provided by the guest. This heavily replies on unavaliable features being sanitised when the HCR_EL2 shadow register is accessed, and only a couple of bits must be explicitly disabled. Non-NV guests are completely unaffected by any of this. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-6-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flagMarc Zyngier2-8/+1
It has become obvious that HCR_EL2.NV serves the exact same use as VCPU_HYP_CONTEXT, only in an architectural way. So just drop the flag for good. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-5-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: Constraint PAuth support to consistent implementationsMarc Zyngier1-2/+36
PAuth comes it two parts: address authentication, and generic authentication. So far, KVM mandates that both are implemented. PAuth also comes in three flavours: Q5, Q3, and IMPDEF. Only one can be implemented for any of address and generic authentication. Crucially, the architecture doesn't mandate that address and generic authentication implement the *same* flavour. This would make implementing ERETAx very difficult for NV, something we are not terribly keen on. So only allow PAuth support for KVM on systems that are not totally insane. Which is so far 100% of the known HW. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: Add helpers for ESR_ELx_ERET_ISS_ERET*Marc Zyngier2-1/+13
The ESR_ELx_ERET_ISS_ERET* macros are a bit confusing: - ESR_ELx_ERET_ISS_ERET really indicates that we have trapped an ERETA* instruction, as opposed to an ERET - ESR_ELx_ERET_ISS_ERETA really indicates that we have trapped an ERETAB instruction, as opposed to an ERETAA. We could repaint those to make more sense, but these are the names that are present in the ARM ARM, and we are sentimentally attached to those. Instead, add two new helpers: - esr_iss_is_eretax() being true tells you that you need to authenticate the ERET - esr_iss_is_eretab() tells you that you need to use the B key instead of the A key Following patches will make use of these primitives. Suggested-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20KVM: arm64: Harden __ctxt_sys_reg() against out-of-range valuesMarc Zyngier1-1/+8
The unsuspecting kernel tinkerer can be easily confused into writing something that looks like this: ikey.lo = __vcpu_sys_reg(vcpu, SYS_APIAKEYLO_EL1); which seems vaguely sensible, until you realise that the second parameter is the encoding of a sysreg, and not the index into the vcpu sysreg file... Debugging what happens in this case is an interesting exercise in head<->wall interactions. As they often say: "Any resemblance to actual persons, living or dead, or actual events is purely coincidental". In order to save people's time, add some compile-time hardening that will at least weed out the "stupidly out of range" values. This will *not* catch anything that isn't a compile-time constant. Reviewed-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240419102935.1935571-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-04-20bcachefs: Fix missing call to bch2_fs_allocator_background_exit()Kent Overstreet1-0/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-20bcachefs: Check for journal entries overruning end of sb clean sectionKent Overstreet2-1/+10
Fix a missing bounds check in superblock validation. Note that we don't yet have repair code for this case - repair code for individual items is generally low priority, since the whole superblock is checksummed, validated prior to write, and we have backups. Reported-by: lei lu <llfamsec@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-20Merge branch 'mlxsw-fixes'Jakub Kicinski3-21/+11
Petr Machata says: ==================== mlxsw: Fixes This patchset fixes the following issues: - During driver de-initialization the driver unregisters the EMAD response trap by setting its action to DISCARD. However the manual only permits TRAP and FORWARD, and future firmware versions will enforce this. In patch #1, suppress the error message by aligning the driver to the manual and use a FORWARD (NOP) action when unregistering the trap. - The driver queries the Management Capabilities Mask (MCAM) register during initialization to understand if certain features are supported. However, not all firmware versions support this register, leading to the driver failing to load. Patches #2 and #3 fix this issue by treating an error in the register query as an indication that the feature is not supported. v2: - Patch #2: - Make mlxsw_env_max_module_eeprom_len_query() void ==================== Link: https://lore.kernel.org/r/cover.1713446092.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-20mlxsw: pci: Fix driver initialization with old firmwareIdo Schimmel1-6/+4
The driver queries the Management Capabilities Mask (MCAM) register during initialization to understand if a new and deeper reset flow is supported. However, not all firmware versions support this register, leading to the driver failing to load. Fix by treating an error in the register query as an indication that the feature is not supported. Fixes: f257c73e5356 ("mlxsw: pci: Add support for new reset flow") Reported-by: Tim 'mithro' Ansell <me@mith.ro> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/ee968c49d53bac96a4c66d1b09ebbd097d81aca5.1713446092.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-20mlxsw: core_env: Fix driver initialization with old firmwareIdo Schimmel1-14/+6
The driver queries the Management Capabilities Mask (MCAM) register during initialization to understand if it can read up to 128 bytes from transceiver modules. However, not all firmware versions support this register, leading to the driver failing to load. Fix by treating an error in the register query as an indication that the feature is not supported. Fixes: 1f4aea1f72da ("mlxsw: core_env: Read transceiver module EEPROM in 128 bytes chunks") Reported-by: Tim 'mithro' Ansell <me@mith.ro> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/0afa8b2e8bac178f5f88211344429176dcc72281.1713446092.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-20mlxsw: core: Unregister EMAD trap using FORWARD actionIdo Schimmel1-1/+1
The device's manual (PRM - Programmer's Reference Manual) classifies the trap that is used to deliver EMAD responses as an "event trap". Among other things, it means that the only actions that can be associated with the trap are TRAP and FORWARD (NOP). Currently, during driver de-initialization the driver unregisters the trap by setting its action to DISCARD, which violates the above guideline. Future firmware versions will prevent such misuses by returning an error. This does not prevent the driver from working, but an error will be printed to the kernel log during module removal / devlink reload: mlxsw_spectrum 0000:03:00.0: Reg cmd access status failed (status=7(bad parameter)) mlxsw_spectrum 0000:03:00.0: Reg cmd access failed (reg_id=7003(hpkt),type=write) Suppress the error message by aligning the driver to the manual and use a FORWARD (NOP) action when unregistering the trap. Fixes: 4ec14b7634b2 ("mlxsw: Add interface to access registers and process events") Cc: Jiri Pirko <jiri@resnulli.us> Cc: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/753a89e14008fde08cb4a2c1e5f537b81d8eb2d6.1713446092.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-20net: bcmasp: fix memory leak when bringing down interfaceJustin Chen1-7/+14
When bringing down the TX rings we flush the rings but forget to reclaimed the flushed packets. This leads to a memory leak since we do not free the dma mapped buffers. This also leads to tx control block corruption when bringing down the interface for power management. Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240418180541.2271719-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-20udp: preserve the connected status if only UDP cmsgYick Xie2-4/+6
If "udp_cmsg_send()" returned 0 (i.e. only UDP cmsg), "connected" should not be set to 0. Otherwise it stops the connected socket from using the cached route. Fixes: 2e8de8576343 ("udp: add gso segment cmsg") Signed-off-by: Yick Xie <yick.xie@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240418170610.867084-1-yick.xie@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-20ksmbd: add continuous availability share parameterNamjae Jeon2-19/+27
If capabilities of the share is not SMB2_SHARE_CAP_CONTINUOUS_AVAILABILITY, ksmbd should not grant a persistent handle to the client. This patch add continuous availability share parameter to control it. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-20ksmbd: common: use struct_group_attr instead of struct_group for ↵Namjae Jeon1-1/+1
network_open_info 4byte padding cause the connection issue with the applications of MacOS. smb2_close response size increases by 4 bytes by padding, And the smb client of MacOS check it and stop the connection. This patch use struct_group_attr instead of struct_group for network_open_info to use __packed to avoid padding. Fixes: 0015eb6e1238 ("smb: client, common: fix fortify warnings") Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-20ksmbd: clear RENAME_NOREPLACE before calling vfs_renameMarios Makassikis1-0/+5
File overwrite case is explicitly handled, so it is not necessary to pass RENAME_NOREPLACE to vfs_rename. Clearing the flag fixes rename operations when the share is a ntfs-3g mount. The latter uses an older version of fuse with no support for flags in the ->rename op. Cc: stable@vger.kernel.org Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-20ksmbd: validate request buffer size in smb2_allocate_rsp_buf()Namjae Jeon1-0/+4
The response buffer should be allocated in smb2_allocate_rsp_buf before validating request. But the fields in payload as well as smb2 header is used in smb2_allocate_rsp_buf(). This patch add simple buffer size validation to avoid potencial out-of-bounds in request buffer. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-20ksmbd: fix slab-out-of-bounds in smb2_allocate_rsp_bufNamjae Jeon1-8/+5
If ->ProtocolId is SMB2_TRANSFORM_PROTO_NUM, smb2 request size validation could be skipped. if request size is smaller than sizeof(struct smb2_query_info_req), slab-out-of-bounds read can happen in smb2_allocate_rsp_buf(). This patch allocate response buffer after decrypting transform request. smb3_decrypt_req() will validate transform request size and avoid slab-out-of-bound in smb2_allocate_rsp_buf(). Reported-by: Norbert Szetei <norbert@doyensec.com> Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-20ASoC: codecs: wsa881x: set clk_stop_mode1 flagSrinivas Kandagatla1-0/+1
WSA881x codecs do not retain the state while clock is stopped, so mark this with clk_stop_mode1 flag. Fixes: a0aab9e1404a ("ASoC: codecs: add wsa881x amplifier support") Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20240419140012.91384-1-srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-04-20Merge tag 'perf-tools-fixes-for-v6.9-2024-04-19' of ↵Linus Torvalds19-742/+817
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Namhyung Kim: "A random set of small bug fixes: - Fix perf annotate TUI when used with data type profiling - Work around BPF verifier about sighand lock checking And a set of kernel header synchronization" * tag 'perf-tools-fixes-for-v6.9-2024-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools/include: Sync arm64 asm/cputype.h with the kernel sources tools/include: Sync asm-generic/bitops/fls.h with the kernel sources tools/include: Sync x86 asm/msr-index.h with the kernel sources tools/include: Sync x86 asm/irq_vectors.h with the kernel sources tools/include: Sync x86 CPU feature headers with the kernel sources tools/include: Sync uapi/sound/asound.h with the kernel sources tools/include: Sync uapi/linux/kvm.h and asm/kvm.h with the kernel sources tools/include: Sync uapi/linux/fs.h with the kernel sources tools/include: Sync uapi/drm/i915_drm.h with the kernel sources perf lock contention: Add a missing NULL check perf annotate: Make sure to call symbol__annotate2() in TUI
2024-04-20Merge tag 'hardening-v6.9-rc5' of ↵Linus Torvalds2-7/+22
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Correctly disable UBSAN configs in configs/hardening (Nathan Chancellor) - Add missing signed integer overflow trap types to arm64 handler * tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: ubsan: Add awareness of signed integer overflow traps configs/hardening: Disable CONFIG_UBSAN_SIGNED_WRAP configs/hardening: Fix disabling UBSAN configurations
2024-04-20smb: client: fix rename(2) regression against sambaPaulo Alcantara1-0/+1
After commit 2c7d399e551c ("smb: client: reuse file lease key in compound operations") the client started reusing lease keys for rename, unlink and set path size operations to prevent it from breaking its own leases and thus causing unnecessary lease breaks to same connection. The implementation relies on positive dentries and cifsInodeInfo::lease_granted to decide whether reusing lease keys for the compound requests. cifsInodeInfo::lease_granted was introduced by commit 0ab95c2510b6 ("Defer close only when lease is enabled.") to indicate whether lease caching is granted for a specific file, but that can only happen until file is open, so cifsInodeInfo::lease_granted was left uninitialised in ->alloc_inode and then client started sending random lease keys for files that hadn't any leases. This fixes the following test case against samba: mount.cifs //srv/share /mnt/1 -o ...,nosharesock mount.cifs //srv/share /mnt/2 -o ...,nosharesock touch /mnt/1/foo; tail -f /mnt/1/foo & pid=$! mv /mnt/2/foo /mnt/2/bar # fails with -EIO kill $pid Fixes: 0ab95c2510b6 ("Defer close only when lease is enabled.") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-20Merge tag 'for-linus-iommufd' of ↵Linus Torvalds2-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd fixes from Jason Gunthorpe: "Two fixes for the selftests: - CONFIG_IOMMUFD_TEST needs CONFIG_IOMMUFD_DRIVER to work - The kconfig fragment sshould include fault injection so the fault injection test can work" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd: Add config needed for iommufd_fail_nth iommufd: Add missing IOMMUFD_DRIVER kconfig for the selftest
2024-04-20cifs: Add tracing for the cifs_tcon struct refcountingDavid Howells11-26/+143
Add tracing for the refcounting/lifecycle of the cifs_tcon struct, marking different events with different labels and giving each tcon its own debug ID so that the tracelines corresponding to individual tcons can be distinguished. This can be enabled with: echo 1 >/sys/kernel/debug/tracing/events/cifs/smb3_tcon_ref/enable Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-19Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds3-5/+11
Pull rdma fixes from Jason Gunthorpe: - Add a missing mutex_destroy() in rxe - Enhance the debugging print for cm_destroy failures to help debug these - Fix mlx5 MAD processing in cases where multiport devices are running in switchedev mode * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/mlx5: Fix port number for counter query in multi-port configuration RDMA/cm: Print the old state when cm_destroy_id gets timeout RDMA/rxe: Fix the problem "mutex_destroy missing"
2024-04-19cifs: Fix reacquisition of volume cookie on still-live connectionDavid Howells3-0/+18
During mount, cifs_mount_get_tcon() gets a tcon resource connection record and then attaches an fscache volume cookie to it. However, it does this irrespective of whether or not the tcon returned from cifs_get_tcon() is a new record or one that's already in use. This leads to a warning about a volume cookie collision and a leaked volume cookie because tcon->fscache gets reset. Fix this be adding a mutex and a "we've already tried this" flag and only doing it once for the lifetime of the tcon. [!] Note: Looking at cifs_mount_get_tcon(), a more general solution may actually be required. Reacquiring the volume cookie isn't the only thing that function does: it also partially reinitialises the tcon record without any locking - which may cause live filesystem ops already using the tcon through a previous mount to malfunction. This can be reproduced simply by something like: mount //example.com/test /xfstest.test -o user=shares,pass=xxx,fsc mount //example.com/test /mnt -o user=shares,pass=xxx,fsc Fixes: 70431bfd825d ("cifs: Support fscache indexing rewrite") Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Shyam Prasad N <sprasad@microsoft.com> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-19Merge tag '9p-fixes-for-6.9-rc5' of ↵Linus Torvalds4-6/+23
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull fs/9p fixes from Eric Van Hensbergen: "This contains a reversion of one of the original 6.9 patches which seems to have been the cause of most of the instability. It also incorporates several fixes to legacy support and cache fixes. There are few additional changes to improve stability, but I want another week of testing before sending them upstream" * tag '9p-fixes-for-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: drop inodes immediately on non-.L too fs/9p: Revert "fs/9p: fix dups even in uncached mode" fs/9p: remove erroneous nlink init from legacy stat2inode 9p: explicitly deny setlease attempts fs/9p: fix the cache always being enabled on files with qid flags fs/9p: translate O_TRUNC into OTRUNC fs/9p: only translate RWX permissions for plain 9P2000
2024-04-19Merge tag 'fuse-fixes-6.9-rc5' of ↵Linus Torvalds6-27/+58
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: - Fix two bugs in the new passthrough mode - Fix a statx bug introduced in v6.6 - Fix code documentation * tag 'fuse-fixes-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: cuse: add kernel-doc comments to cuse_process_init_reply() fuse: fix leaked ENOSYS error on first statx call fuse: fix parallel dio write on file open in passthrough mode fuse: fix wrong ff->iomode state changes from parallel dio write
2024-04-19Merge tag 'arm64-fixes' of ↵Linus Torvalds3-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Fix a kernel fault during page table walking in huge_pte_alloc() with PTABLE_LEVELS=5 due to using p4d_offset() instead of p4d_alloc() - head.S fix and cleanup to disable the MMU before toggling the HCR_EL2.E2H bit when entering the kernel with the MMU on from the EFI stub. Changing this bit (currently from VHE to nVHE) causes some system registers as well as page table descriptors to be interpreted differently, potentially resulting in spurious MMU faults - Fix translation fault in swsusp_save() accessing MEMBLOCK_NOMAP memory ranges due to kernel_page_present() returning true in most configurations other than rodata_full == true, CONFIG_DEBUG_PAGEALLOC=y or CONFIG_KFENCE=y * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: hibernate: Fix level3 translation fault in swsusp_save() arm64/head: Disable MMU at EL2 before clearing HCR_EL2.E2H arm64/head: Drop unnecessary pre-disable-MMU workaround arm64/hugetlb: Fix page table walk in huge_pte_alloc()