kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2024-05-02	ALSA: hda/realtek: Fix build error without CONFIG_PM	Takashi Iwai	1	-1/+3
	The alc_spec.power_hook is defined only with CONFIG_PM, and the recent fix overlooked it, resulting in a build error without CONFIG_PM. Fix it with the simple ifdef and set __maybe_unused for the function. We may drop the whole CONFIG_PM dependency there, but it should be done in a separate cleanup patch later. Fixes: 1e707769df07 ("ALSA: hda/realtek - Set GPIO3 to default at S4 state for Thinkpad with ALC1318") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405012104.Dr7h318W-lkp@intel.com/ Message-ID: <20240502062442.30545-1-tiwai@suse.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-05-02	vxlan: Pull inner IP header in vxlan_rcv().	Guillaume Nault	1	-1/+18
	Ensure the inner IP header is part of skb's linear data before reading its ECN bits. Otherwise we might read garbage. One symptom is the system erroneously logging errors like "vxlan: non-ECT from xxx.xxx.xxx.xxx with TOS=xxxx". Similar bugs have been fixed in geneve, ip_tunnel and ip6_tunnel (see commit 1ca1ba465e55 ("geneve: make sure to pull inner header in geneve_rx()") for example). So let's reuse the same code structure for consistency. Maybe we'll can add a common helper in the future. Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/1239c8db54efec341dd6455c77e0380f58923a3c.1714495737.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	tipc: fix a possible memleak in tipc_buf_append	Xin Long	1	-1/+1
	__skb_linearize() doesn't free the skb when it fails, so move '*buf = NULL' after __skb_linearize(), so that the skb can be freed on the err path. Fixes: b7df21cf1b79 ("tipc: skb_linearize the head skb when reassembling msgs") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Link: https://lore.kernel.org/r/90710748c29a1521efac4f75ea01b3b7e61414cf.1714485818.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	tipc: fix UAF in error path	Paolo Abeni	1	-1/+5
	Sam Page (sam4k) working with Trend Micro Zero Day Initiative reported a UAF in the tipc_buf_append() error path: BUG: KASAN: slab-use-after-free in kfree_skb_list_reason+0x47e/0x4c0 linux/net/core/skbuff.c:1183 Read of size 8 at addr ffff88804d2a7c80 by task poc/8034 CPU: 1 PID: 8034 Comm: poc Not tainted 6.8.2 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 Call Trace: <IRQ> __dump_stack linux/lib/dump_stack.c:88 dump_stack_lvl+0xd9/0x1b0 linux/lib/dump_stack.c:106 print_address_description linux/mm/kasan/report.c:377 print_report+0xc4/0x620 linux/mm/kasan/report.c:488 kasan_report+0xda/0x110 linux/mm/kasan/report.c:601 kfree_skb_list_reason+0x47e/0x4c0 linux/net/core/skbuff.c:1183 skb_release_data+0x5af/0x880 linux/net/core/skbuff.c:1026 skb_release_all linux/net/core/skbuff.c:1094 __kfree_skb linux/net/core/skbuff.c:1108 kfree_skb_reason+0x12d/0x210 linux/net/core/skbuff.c:1144 kfree_skb linux/./include/linux/skbuff.h:1244 tipc_buf_append+0x425/0xb50 linux/net/tipc/msg.c:186 tipc_link_input+0x224/0x7c0 linux/net/tipc/link.c:1324 tipc_link_rcv+0x76e/0x2d70 linux/net/tipc/link.c:1824 tipc_rcv+0x45f/0x10f0 linux/net/tipc/node.c:2159 tipc_udp_recv+0x73b/0x8f0 linux/net/tipc/udp_media.c:390 udp_queue_rcv_one_skb+0xad2/0x1850 linux/net/ipv4/udp.c:2108 udp_queue_rcv_skb+0x131/0xb00 linux/net/ipv4/udp.c:2186 udp_unicast_rcv_skb+0x165/0x3b0 linux/net/ipv4/udp.c:2346 __udp4_lib_rcv+0x2594/0x3400 linux/net/ipv4/udp.c:2422 ip_protocol_deliver_rcu+0x30c/0x4e0 linux/net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x2e4/0x520 linux/net/ipv4/ip_input.c:233 NF_HOOK linux/./include/linux/netfilter.h:314 NF_HOOK linux/./include/linux/netfilter.h:308 ip_local_deliver+0x18e/0x1f0 linux/net/ipv4/ip_input.c:254 dst_input linux/./include/net/dst.h:461 ip_rcv_finish linux/net/ipv4/ip_input.c:449 NF_HOOK linux/./include/linux/netfilter.h:314 NF_HOOK linux/./include/linux/netfilter.h:308 ip_rcv+0x2c5/0x5d0 linux/net/ipv4/ip_input.c:569 __netif_receive_skb_one_core+0x199/0x1e0 linux/net/core/dev.c:5534 __netif_receive_skb+0x1f/0x1c0 linux/net/core/dev.c:5648 process_backlog+0x101/0x6b0 linux/net/core/dev.c:5976 __napi_poll.constprop.0+0xba/0x550 linux/net/core/dev.c:6576 napi_poll linux/net/core/dev.c:6645 net_rx_action+0x95a/0xe90 linux/net/core/dev.c:6781 __do_softirq+0x21f/0x8e7 linux/kernel/softirq.c:553 do_softirq linux/kernel/softirq.c:454 do_softirq+0xb2/0xf0 linux/kernel/softirq.c:441 </IRQ> <TASK> __local_bh_enable_ip+0x100/0x120 linux/kernel/softirq.c:381 local_bh_enable linux/./include/linux/bottom_half.h:33 rcu_read_unlock_bh linux/./include/linux/rcupdate.h:851 __dev_queue_xmit+0x871/0x3ee0 linux/net/core/dev.c:4378 dev_queue_xmit linux/./include/linux/netdevice.h:3169 neigh_hh_output linux/./include/net/neighbour.h:526 neigh_output linux/./include/net/neighbour.h:540 ip_finish_output2+0x169f/0x2550 linux/net/ipv4/ip_output.c:235 __ip_finish_output linux/net/ipv4/ip_output.c:313 __ip_finish_output+0x49e/0x950 linux/net/ipv4/ip_output.c:295 ip_finish_output+0x31/0x310 linux/net/ipv4/ip_output.c:323 NF_HOOK_COND linux/./include/linux/netfilter.h:303 ip_output+0x13b/0x2a0 linux/net/ipv4/ip_output.c:433 dst_output linux/./include/net/dst.h:451 ip_local_out linux/net/ipv4/ip_output.c:129 ip_send_skb+0x3e5/0x560 linux/net/ipv4/ip_output.c:1492 udp_send_skb+0x73f/0x1530 linux/net/ipv4/udp.c:963 udp_sendmsg+0x1a36/0x2b40 linux/net/ipv4/udp.c:1250 inet_sendmsg+0x105/0x140 linux/net/ipv4/af_inet.c:850 sock_sendmsg_nosec linux/net/socket.c:730 __sock_sendmsg linux/net/socket.c:745 __sys_sendto+0x42c/0x4e0 linux/net/socket.c:2191 __do_sys_sendto linux/net/socket.c:2203 __se_sys_sendto linux/net/socket.c:2199 __x64_sys_sendto+0xe0/0x1c0 linux/net/socket.c:2199 do_syscall_x64 linux/arch/x86/entry/common.c:52 do_syscall_64+0xd8/0x270 linux/arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x6f/0x77 linux/arch/x86/entry/entry_64.S:120 RIP: 0033:0x7f3434974f29 Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007fff9154f2b8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3434974f29 RDX: 00000000000032c8 RSI: 00007fff9154f300 RDI: 0000000000000003 RBP: 00007fff915532e0 R08: 00007fff91553360 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000212 R12: 000055ed86d261d0 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> In the critical scenario, either the relevant skb is freed or its ownership is transferred into a frag_lists. In both cases, the cleanup code must not free it again: we need to clear the skb reference earlier. Fixes: 1149557d64c9 ("tipc: eliminate unnecessary linearization of incoming buffers") Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-23852 Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/752f1ccf762223d109845365d07f55414058e5a3.1714484273.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02	rxrpc: Clients must accept conn from any address	Jeffrey Altman	1	-7/+2
	The find connection logic of Transarc's Rx was modified in the mid-1990s to support multi-homed servers which might send a response packet from an address other than the destination address in the received packet. The rules for accepting a packet by an Rx initiator (RX_CLIENT_CONNECTION) were altered to permit acceptance of a packet from any address provided that the port number was unchanged and all of the connection identifiers matched (Epoch, CID, SecurityClass, ...). This change applies the same rules to the Linux implementation which makes it consistent with IBM AFS 3.6, Arla, OpenAFS and AuriStorFS. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: Jeffrey Altman <jaltman@auristor.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Link: https://lore.kernel.org/r/20240419163057.4141728-1-marc.dionne@auristor.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-01	Merge tag 'asoc-fix-v6.9-rc6' of ↵	Takashi Iwai	788	-4634/+10154
	https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.9 This is much larger than is ideal, partly due to your holiday but also due to several vendors having come in with relatively large fixes at similar times. It's all driver specific stuff. The meson fixes from Jerome fix some rare timing issues with blocking operations happening in triggers, plus the continuous clock support which fixes clocking for some platforms. The SOF series from Peter builds to the fix to avoid spurious resets of ChainDMA which triggered errors in cleanup paths with both PulseAudio and PipeWire, and there's also some simple new debugfs files from Pierre which make support a lot eaiser.
2024-05-01	Merge tag 'regulator-fix-v6.9-rc6' of ↵	Linus Torvalds	5	-14/+27
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "There's a few simple driver specific fixes here, plus some core cleanups from Matti which fix issues found with client drivers due to the API being confusing. The two fixes for the stubs provide more constructive behaviour with !REGULATOR configurations, issues were noticed with some hwmon drivers which would otherwise have needed confusing bodges in the users. The irq_helpers fix to duplicate the provided name for the interrupt controller was found because a driver got this wrong and it's again a case where the core is the sensible place to put the fix" * tag 'regulator-fix-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: change devm_regulator_get_enable_optional() stub to return Ok regulator: change stubbed devm_regulator_get_enable to return Ok regulator: vqmmc-ipq4019: fix module autoloading regulator: qcom-refgen: fix module autoloading regulator: mt6360: De-capitalize devicetree regulator subnodes regulator: irq_helpers: duplicate IRQ name
2024-05-01	KVM: arm64: Force injection of a data abort on NISV MMIO exit	Marc Zyngier	2	-0/+15
	If a vcpu exits for a data abort with an invalid syndrome, the expectations are that userspace has a chance to save the day if it has requested to see such exits. However, this is completely futile in the case of a protected VM, as none of the state is available. In this particular case, inject a data abort directly into the vcpu, consistent with what userspace could do. This also helps with pKVM, which discards all syndrome information when forwarding data aborts that are not known to be MMIO. Finally, document this tweak to the API. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-31-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Restrict supported capabilities for protected VMs	Fuad Tabba	1	-0/+32
	For practical reasons as well as security related ones, not all capabilities are supported for protected VMs in pKVM. Add a function that restricts the capabilities for protected VMs. This behaves as an allow-list to ensure that future capabilities are checked for compatibility and security before being allowed for protected VMs. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-30-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Refactor setting the return value in kvm_vm_ioctl_enable_cap()	Fuad Tabba	1	-15/+9
	Initialize r = -EINVAL to get rid of the error-path initializations in kvm_vm_ioctl_enable_cap(). No functional change intended. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-29-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst	Will Deacon	2	-0/+47
	KVM/arm64 makes use of the SMCCC "Vendor Specific Hypervisor Service Call Range" to expose KVM-specific hypercalls to guests in a discoverable and extensible fashion. Document the existence of this interface and the discovery hypercall. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-28-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Rename firmware pseudo-register documentation file	Will Deacon	2	-4/+4
	In preparation for describing the guest view of KVM/arm64 hypercalls in hypercalls.rst, move the existing contents of the file concerning the firmware pseudo-registers elsewhere. Cc: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-27-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Reformat/beautify PTP hypercall documentation	Will Deacon	1	-14/+24
	The PTP hypercall documentation doesn't produce the best-looking table when formatting in HTML as all of the return value definitions end up on the same line. Reformat the PTP hypercall documentation to follow the formatting used by hypercalls.rst. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-26-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit	Fuad Tabba	1	-1/+23
	Expand comment clarifying why the host value representing SVE vector length being restored for ZCR_EL1 on guest exit isn't the same as it was on guest entry. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-21-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Introduce and use predicates that check for protected VMs	Fuad Tabba	3	-8/+11
	In order to determine whether or not a VM or vcpu are protected, introduce helpers to query this state. While at it, use the vcpu helper to check vcpus protected state instead of the kvm one. Co-authored-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-19-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Add is_pkvm_initialized() helper	Quentin Perret	1	-4/+8
	Add a helper allowing to check when the pkvm static key is enabled to ease the introduction of pkvm hooks in other parts of the code. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-18-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Simplify vgic-v3 hypercalls	Marc Zyngier	10	-76/+38
	Consolidate the GICv3 VMCR accessor hypercalls into the APR save/restore hypercalls so that all of the EL2 GICv3 state is covered by a single pair of hypercalls. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-17-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Move setting the page as dirty out of the critical section	Fuad Tabba	1	-3/+5
	Move the unlock earlier in user_mem_abort() to shorten the critical section. This also helps for future refactoring and reuse of similar code. This moves out marking the page as dirty outside of the critical section. That code does not interact with the stage-2 page tables, which the read lock in the critical section protects. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-16-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Change kvm_handle_mmio_return() return polarity	Fuad Tabba	2	-3/+3
	Most exit handlers return <= 0 to indicate that the host needs to handle the exit. Make kvm_handle_mmio_return() consistent with the exit handlers in handle_exit(). This makes the code easier to reason about, and makes it easier to add other handlers in future patches. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-15-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Fix comment for __pkvm_vcpu_init_traps()	Fuad Tabba	1	-1/+1
	Fix the comment to clarify that __pkvm_vcpu_init_traps() initializes traps for all VMs in protected mode, and not only for protected VMs. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-14-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Prevent kmemleak from accessing .hyp.data	Quentin Perret	1	-0/+1
	We've added a .data section for the hypervisor, which kmemleak is eager to parse. This clearly doesn't go well, so add the section to kmemleak's block list. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-13-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Do not map the host fpsimd state to hyp in pKVM	Fuad Tabba	3	-31/+4
	pKVM maintains its own state at EL2 for tracking the host fpsimd state. Therefore, no need to map and share the host's view with it. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-12-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Rename __tlb_switch_to_{guest,host}() in VHE	Fuad Tabba	1	-13/+13
	Rename __tlb_switch_to_{guest,host}() to {enter,exit}_vmid_context() in VHE code to maintain symmetry between the nVHE and VHE TLB invalidations. No functional change intended. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-11-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Support TLB invalidation in guest context	Will Deacon	1	-24/+91
	Typically, TLB invalidation of guest stage-2 mappings using nVHE is performed by a hypercall originating from the host. For the invalidation instruction to be effective, therefore, __tlb_switch_to_{guest,host}() swizzle the active stage-2 context around the TLBI instruction. With guest-to-host memory sharing and unsharing hypercalls originating from the guest under pKVM, there is need to support both guest and host VMID invalidations issued from guest context. Replace the __tlb_switch_to_{guest,host}() functions with a more general {enter,exit}_vmid_context() implementation which supports being invoked from guest context and acts as a no-op if the target context matches the running context. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-10-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE	Will Deacon	1	-0/+15
	Break-before-make (BBM) can be expensive, as transitioning via an invalid mapping (i.e. the "break" step) requires the completion of TLB invalidation and can also cause other agents to fault concurrently on the invalid mapping. Since BBM is not required when changing only the software bits of a PTE, avoid the sequence in this case and just update the PTE directly. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-9-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Check for PTE validity when checking for executable/cacheable	Marc Zyngier	1	-3/+3
	Don't just assume that the PTE is valid when checking whether it describes an executable or cacheable mapping. This makes sure that we don't issue CMOs for invalid mappings. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-8-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Avoid BUG-ing from the host abort path	Quentin Perret	1	-1/+7
	Under certain circumstances __get_fault_info() may resolve the faulting address using the AT instruction. Given that this is being done outside of the host lock critical section, it is racy and the resolution via AT may fail. We currently BUG() in this situation, which is obviously less than ideal. Moving the address resolution to the critical section may have a performance impact, so let's keep it where it is, but bail out and return to the host to try a second time. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-7-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Issue CMOs when tearing down guest s2 pages	Quentin Perret	1	-0/+1
	On the guest teardown path, pKVM will zero the pages used to back the guest data structures before returning them to the host as they may contain secrets (e.g. in the vCPU registers). However, the zeroing is done using a cacheable alias, and CMOs are missing, hence giving the host a potential opportunity to read the original content of the guest structs from memory. Fix this by issuing CMOs after zeroing the pages. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-6-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Do not re-initialize the KVM lock	Fuad Tabba	1	-1/+0
	The lock is already initialized in core KVM code at kvm_create_vm(). Fixes: 9d0c063a4d1d ("KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1") Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-5-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Refactor checks for FP state ownership	Fuad Tabba	6	-10/+13
	To avoid direct comparison against the fp_owner enum, add a new function that performs the check, host_owns_fp_regs(), to complement the existing guest_owns_fp_regs(). To check for fpsimd state ownership, use the helpers instead of directly using the enums. No functional change intended. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-4-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Move guest_owns_fp_regs() to increase its scope	Fuad Tabba	4	-8/+8
	guest_owns_fp_regs() will be used to check fpsimd state ownership across kvm/arm64. Therefore, move it to kvm_host.h to widen its scope. Moreover, the host state is not per-vcpu anymore, the vcpu parameter isn't used, so remove it as well. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-3-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVM	Fuad Tabba	3	-0/+13
	Since the host_fpsimd_state has been removed from kvm_vcpu_arch, it isn't pointing to the hyp's version of the host fp_regs in protected mode. Initialize the host_data fpsimd_state point to the host_data's context fp_regs on pKVM initialization. Fixes: 51e09b5572d6 ("KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch") Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-2-tabba@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	KVM: arm64: Remove duplicated AA64MMFR1_EL1 XNX	Russell King	1	-1/+0
	Commit d5a32b60dc18 ("KVM: arm64: Allow userspace to change ID_AA64MMFR{0-2}_EL1") made certain fields in these registers writable, but in doing so, ID_AA64MMFR1_EL1_XNX was listed twice. Remove the duplication. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Zenghui Yu <zenghui.yu@linux.dev> Link: https://lore.kernel.org/r/E1s2AxF-00AWLv-03@rmk-PC.armlinux.org.uk Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-05-01	drm/xe/vm: prevent UAF in rebind_work_func()	Matthew Auld	1	-0/+3
	We flush the rebind worker during the vm close phase, however in places like preempt_fence_work_func() we seem to queue the rebind worker without first checking if the vm has already been closed. The concern here is the vm being closed with the worker flushed, but then being rearmed later, which looks like potential uaf, since there is no actual refcounting to track the queued worker. We can't take the vm->lock here in preempt_rebind_work_func() to first check if the vm is closed since that will deadlock, so instead flush the worker again when the vm refcount reaches zero. v2: - Grabbing vm->lock in the preempt worker creates a deadlock, so checking the closed state is tricky. Instead flush the worker when the refcount reaches zero. It should be impossible to queue the preempt worker without already holding vm ref. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1676 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1591 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1364 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1304 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1249 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240423074721.119633-4-matthew.auld@intel.com (cherry picked from commit 3d44d67c441a9fe6f81a1d705f7de009a32a5b35) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-05-01	drm/amd/display: Disable panel replay by default for now	Mario Limonciello	1	-9/+12
	Panel replay was enabled by default in commit 5950efe25ee0 ("drm/amd/display: Enable Panel Replay for static screen use case"), but it isn't working properly at least on some BOE and AUO panels. Instead of being static the screen is solid black when active. As it's a new feature that was just introduced that regressed VRR disable it for now so that problem can be properly root caused. Cc: Tom Chung <chiahsuan.chung@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3344 Fixes: 5950efe25ee0 ("drm/amd/display: Enable Panel Replay for static screen use case") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-01	net: core: reject skb_copy(_expand) for fraglist GSO skbs	Felix Fietkau	1	-8/+19
	SKB_GSO_FRAGLIST skbs must not be linearized, otherwise they become invalid. Return NULL if such an skb is passed to skb_copy or skb_copy_expand, in order to prevent a crash on a potential later call to skb_gso_segment. Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.") Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-01	net: bridge: fix multicast-to-unicast with fraglist GSO	Felix Fietkau	1	-1/+1
	Calling skb_copy on a SKB_GSO_FRAGLIST skb is not valid, since it returns an invalid linearized skb. This code only needs to change the ethernet header, so pskb_copy is the right function to call here. Fixes: 6db6f0eae605 ("bridge: multicast to unicast") Signed-off-by: Felix Fietkau <nbd@nbd.name> Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-01	nvme-tcp: strict pdu pacing to avoid send stalls on TLS	Hannes Reinecke	1	-2/+8
	TLS requires a strict pdu pacing via MSG_EOR to signal the end of a record and subsequent encryption. If we do not set MSG_EOR at the end of a sequence the record won't be closed, encryption doesn't start, and we end up with a send stall as the message will never be passed on to the TCP layer. So do not check for the queue status when TLS is enabled but rather make the MSG_MORE setting dependent on the current request only. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-01	nvmet: fix nvme status code when namespace is disabled	Sagi Grimberg	3	-1/+18
	If the user disabled a nvmet namespace, it is removed from the subsystem namespaces list. When nvmet processes a command directed to an nsid that was disabled, it cannot differentiate between a nsid that is disabled vs. a non-existent namespace, and resorts to return NVME_SC_INVALID_NS with the dnr bit set. This translates to a non-retryable status for the host, which translates to a user error. We should expect disabled namespaces to not cause an I/O error in a multipath environment. Address this by searching a configfs item for the namespace nvmet failed to find, and if we found one, conclude that the namespace is disabled (perhaps temporarily). Return NVME_SC_INTERNAL_PATH_ERROR in this case and keep DNR bit cleared. Reported-by: Jirong Feng <jirong.feng@easystack.cn> Tested-by: Jirong Feng <jirong.feng@easystack.cn> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-01	nvmet-tcp: fix possible memory leak when tearing down a controller	Sagi Grimberg	1	-7/+4
	When we teardown the controller, we wait for pending I/Os to complete (sq->ref on all queues to drop to zero) and then we go over the commands, and free their command buffers in case they are still fetching data from the host (e.g. processing nvme writes) and have yet to take a reference on the sq. However, we may miss the case where commands have failed before executing and are queued for sending a response, but will never occur because the queue socket is already down. In this case we may miss deallocating command buffers. Solve this by freeing all commands buffers as nvmet_tcp_free_cmd_buffers is idempotent anyways. Reported-by: Yi Zhang <yi.zhang@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-01	nvme: cancel pending I/O if nvme controller is in terminal state	Nilay Shroff	3	-22/+28
	While I/O is running, if the pci bus error occurs then in-flight I/O can not complete. Worst, if at this time, user (logically) hot-unplug the nvme disk then the nvme_remove() code path can't forward progress until in-flight I/O is cancelled. So these sequence of events may potentially hang hot-unplug code path indefinitely. This patch helps cancel the pending/in-flight I/O from the nvme request timeout handler in case the nvme controller is in the terminal (DEAD/DELETING/DELETING_NOIO) state and that helps nvme_remove() code path forward progress and finish successfully. Link: https://lore.kernel.org/all/199be893-5dfa-41e5-b6f2-40ac90ebccc4@linux.ibm.com/ Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-01	nvmet-auth: replace pr_debug() with pr_err() to report an error.	Maurizio Lombardi	1	-3/+3
	In nvmet_auth_host_hash(), if a mismatch is detected in the hash length the kernel should print an error. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-01	nvmet-auth: return the error code to the nvmet_auth_host_hash() callers	Maurizio Lombardi	1	-1/+1
	If the nvmet_auth_host_hash() function fails, the error code should be returned to its callers. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-01	nvme: find numa distance only if controller has valid numa id	Nilay Shroff	1	-1/+2
	On system where native nvme multipath is configured and iopolicy is set to numa but the nvme controller numa node id is undefined or -1 (NUMA_NO_NODE) then avoid calculating node distance for finding optimal io path. In such case we may access numa distance table with invalid index and that may potentially refer to incorrect memory. So this patch ensures that if the nvme controller numa node id is -1 then instead of calculating node distance for finding optimal io path, we set the numa node distance of such controller to default 10 (LOCAL_DISTANCE). Link: https://lore.kernel.org/all/20240413090614.678353-1-nilay@linux.ibm.com/ Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-05-01	s390/paes: Reestablish retry loop in paes	Harald Freudenberger	1	-2/+13
	With commit ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation") the retry loop to retry derivation of a protected key from a secure key has been removed. This was based on the assumption that theses retries are not needed any more as proper retries are done in the zcrypt layer. However, tests have revealed that there exist some cases with master key change in the HSM and immediately (< 1 second) attempt to derive a protected key from a secure key with exact this HSM may eventually fail. The low level functions in zcrypt_ccamisc.c and zcrypt_ep11misc.c detect and report this temporary failure and report it to the caller as -EBUSY. The re-established retry loop in the paes implementation catches exactly this -EBUSY and eventually may run some retries. Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation") Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-05-01	s390/zcrypt: Use EBUSY to indicate temp unavailability	Harald Freudenberger	1	-3/+3
	Use -EBUSY instead of -EAGAIN in zcrypt_ccamisc.c in cases where the CCA card returns 8/2290 to indicate a temporarily unavailability of this function. Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation") Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-05-01	s390/zcrypt: Handle ep11 cprb return code	Harald Freudenberger	1	-0/+46
	An EP11 reply cprb contains a field ret_code which may hold an error code different than the error code stored in the payload of the cprb. As of now all the EP11 misc functions do not evaluate this field but focus on the error code in the payload. Before checking the payload error, first the cprb error field should be evaluated which is introduced with this patch. If the return code value 0x000c0003 is seen, this indicates a busy situation which is reflected by -EBUSY in the zcrpyt_ep11misc.c low level function. A higher level caller should consider to retry after waiting a dedicated duration (say 1 second). Fixes: ed6776c96c60 ("s390/crypto: remove retry loop with sleep from PAES pkey invocation") Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-05-01	s390/zcrypt: Fix wrong format string in debug feature printout	Harald Freudenberger	1	-1/+1
	Fix wrong format string debug feature: %04x was used to print out a 32 bit value. - changed to %08x. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-05-01	x86/mm: Remove broken vsyscall emulation code from the page fault code	Linus Torvalds	3	-59/+3
	The syzbot-reported stack trace from hell in this discussion thread actually has three nested page faults: https://lore.kernel.org/r/000000000000d5f4fc0616e816d4@google.com ... and I think that's actually the important thing here: - the first page fault is from user space, and triggers the vsyscall emulation. - the second page fault is from __do_sys_gettimeofday(), and that should just have caused the exception that then sets the return value to -EFAULT - the third nested page fault is due to _raw_spin_unlock_irqrestore() -> preempt_schedule() -> trace_sched_switch(), which then causes a BPF trace program to run, which does that bpf_probe_read_compat(), which causes that page fault under pagefault_disable(). It's quite the nasty backtrace, and there's a lot going on. The problem is literally the vsyscall emulation, which sets current->thread.sig_on_uaccess_err = 1; and that causes the fixup_exception() code to send the signal despite the exception being caught. And I think that is in fact completely bogus. It's completely bogus exactly because it sends that signal even when it shouldn't be sent - like for the BPF user mode trace gathering. In other words, I think the whole "sig_on_uaccess_err" thing is entirely broken, because it makes any nested page-faults do all the wrong things. Now, arguably, I don't think anybody should enable vsyscall emulation any more, but this test case clearly does. I think we should just make the "send SIGSEGV" be something that the vsyscall emulation does on its own, not this broken per-thread state for something that isn't actually per thread. The x86 page fault code actually tried to deal with the "incorrect nesting" by having that: if (in_interrupt()) return; which ignores the sig_on_uaccess_err case when it happens in interrupts, but as shown by this example, these nested page faults do not need to be about interrupts at all. IOW, I think the only right thing is to remove that horrendously broken code. The attached patch looks like the ObviouslyCorrect(tm) thing to do. NOTE! This broken code goes back to this commit in 2011: 4fc3490114bb ("x86-64: Set siginfo and context on vsyscall emulation faults") ... and back then the reason was to get all the siginfo details right. Honestly, I do not for a moment believe that it's worth getting the siginfo details right here, but part of the commit says: This fixes issues with UML when vsyscall=emulate. ... and so my patch to remove this garbage will probably break UML in this situation. I do not believe that anybody should be running with vsyscall=emulate in 2024 in the first place, much less if you are doing things like UML. But let's see if somebody screams. Reported-and-tested-by: syzbot+83e7f982ca045ab4405c@syzkaller.appspotmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/CAHk-=wh9D6f7HUkDgZHKmDCHUQmp+Co89GP+b8+z+G56BKeyNg@mail.gmail.com
2024-05-01	spi: fix null pointer dereference within spi_sync	Mans Rullgard	1	-0/+1
	If spi_sync() is called with the non-empty queue and the same spi_message is then reused, the complete callback for the message remains set while the context is cleared, leading to a null pointer dereference when the callback is invoked from spi_finalize_current_message(). With function inlining disabled, the call stack might look like this: _raw_spin_lock_irqsave from complete_with_flags+0x18/0x58 complete_with_flags from spi_complete+0x8/0xc spi_complete from spi_finalize_current_message+0xec/0x184 spi_finalize_current_message from spi_transfer_one_message+0x2a8/0x474 spi_transfer_one_message from __spi_pump_transfer_message+0x104/0x230 __spi_pump_transfer_message from __spi_transfer_message_noqueue+0x30/0xc4 __spi_transfer_message_noqueue from __spi_sync+0x204/0x248 __spi_sync from spi_sync+0x24/0x3c spi_sync from mcp251xfd_regmap_crc_read+0x124/0x28c [mcp251xfd] mcp251xfd_regmap_crc_read [mcp251xfd] from _regmap_raw_read+0xf8/0x154 _regmap_raw_read from _regmap_bus_read+0x44/0x70 _regmap_bus_read from _regmap_read+0x60/0xd8 _regmap_read from regmap_read+0x3c/0x5c regmap_read from mcp251xfd_alloc_can_err_skb+0x1c/0x54 [mcp251xfd] mcp251xfd_alloc_can_err_skb [mcp251xfd] from mcp251xfd_irq+0x194/0xe70 [mcp251xfd] mcp251xfd_irq [mcp251xfd] from irq_thread_fn+0x1c/0x78 irq_thread_fn from irq_thread+0x118/0x1f4 irq_thread from kthread+0xd8/0xf4 kthread from ret_from_fork+0x14/0x28 Fix this by also setting message->complete to NULL when the transfer is complete. Fixes: ae7d2346dc89 ("spi: Don't use the message queue if possible in spi_sync") Signed-off-by: Mans Rullgard <mans@mansr.com> Link: https://lore.kernel.org/r/20240430182705.13019-1-mans@mansr.com Signed-off-by: Mark Brown <broonie@kernel.org>