<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/arch, branch v6.6.12</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.12</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.12'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-01-10T16:17:00+00:00</updated>
<entry>
<title>x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect</title>
<updated>2024-01-10T16:17:00+00:00</updated>
<author>
<name>Jinghao Jia</name>
<email>jinghao7@illinois.edu</email>
</author>
<published>2024-01-02T23:33:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dfb8a426e030c14737dfde038cf9c173c8032f08'/>
<id>urn:sha1:dfb8a426e030c14737dfde038cf9c173c8032f08</id>
<content type='text'>
commit f5d03da48d062966c94f0199d20be0b3a37a7982 upstream.

kprobe_emulate_call_indirect currently uses int3_emulate_call to emulate
indirect calls. However, int3_emulate_call always assumes the size of
the call to be 5 bytes when calculating the return address. This is
incorrect for register-based indirect calls in x86, which can be either
2 or 3 bytes depending on whether REX prefix is used. At kprobe runtime,
the incorrect return address causes control flow to land onto the wrong
place after return -- possibly not a valid instruction boundary. This
can lead to a panic like the following:

[    7.308204][    C1] BUG: unable to handle page fault for address: 000000000002b4d8
[    7.308883][    C1] #PF: supervisor read access in kernel mode
[    7.309168][    C1] #PF: error_code(0x0000) - not-present page
[    7.309461][    C1] PGD 0 P4D 0
[    7.309652][    C1] Oops: 0000 [#1] SMP
[    7.309929][    C1] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.7.0-rc5-trace-for-next #6
[    7.310397][    C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
[    7.311068][    C1] RIP: 0010:__common_interrupt+0x52/0xc0
[    7.311349][    C1] Code: 01 00 4d 85 f6 74 39 49 81 fe 00 f0 ff ff 77 30 4c 89 f7 4d 8b 5e 68 41 ba 91 76 d8 42 45 03 53 fc 74 02 0f 0b cc ff d3 65 48 &lt;8b&gt; 05 30 c7 ff 7e 65 4c 89 3d 28 c7 ff 7e 5b 41 5c 41 5e 41 5f c3
[    7.312512][    C1] RSP: 0018:ffffc900000e0fd0 EFLAGS: 00010046
[    7.312899][    C1] RAX: 0000000000000001 RBX: 0000000000000023 RCX: 0000000000000001
[    7.313334][    C1] RDX: 00000000000003cd RSI: 0000000000000001 RDI: ffff888100d302a4
[    7.313702][    C1] RBP: 0000000000000001 R08: 0ef439818636191f R09: b1621ff338a3b482
[    7.314146][    C1] R10: ffffffff81e5127b R11: ffffffff81059810 R12: 0000000000000023
[    7.314509][    C1] R13: 0000000000000000 R14: ffff888100d30200 R15: 0000000000000000
[    7.314951][    C1] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
[    7.315396][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    7.315691][    C1] CR2: 000000000002b4d8 CR3: 0000000003028003 CR4: 0000000000370ef0
[    7.316153][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    7.316508][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    7.316948][    C1] Call Trace:
[    7.317123][    C1]  &lt;IRQ&gt;
[    7.317279][    C1]  ? __die_body+0x64/0xb0
[    7.317482][    C1]  ? page_fault_oops+0x248/0x370
[    7.317712][    C1]  ? __wake_up+0x96/0xb0
[    7.317964][    C1]  ? exc_page_fault+0x62/0x130
[    7.318211][    C1]  ? asm_exc_page_fault+0x22/0x30
[    7.318444][    C1]  ? __cfi_native_send_call_func_single_ipi+0x10/0x10
[    7.318860][    C1]  ? default_idle+0xb/0x10
[    7.319063][    C1]  ? __common_interrupt+0x52/0xc0
[    7.319330][    C1]  common_interrupt+0x78/0x90
[    7.319546][    C1]  &lt;/IRQ&gt;
[    7.319679][    C1]  &lt;TASK&gt;
[    7.319854][    C1]  asm_common_interrupt+0x22/0x40
[    7.320082][    C1] RIP: 0010:default_idle+0xb/0x10
[    7.320309][    C1] Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 b8 0c 67 40 a5 66 90 0f 00 2d 09 b9 3b 00 fb f4 &lt;fa&gt; c3 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 b8 0c 67 40 a5 e9
[    7.321449][    C1] RSP: 0018:ffffc9000009bee8 EFLAGS: 00000256
[    7.321808][    C1] RAX: ffff88813bca8b68 RBX: 0000000000000001 RCX: 000000000001ef0c
[    7.322227][    C1] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000001ef0c
[    7.322656][    C1] RBP: ffffc9000009bef8 R08: 8000000000000000 R09: 00000000000008c2
[    7.323083][    C1] R10: 0000000000000000 R11: ffffffff81058e70 R12: 0000000000000000
[    7.323530][    C1] R13: ffff8881002b30c0 R14: 0000000000000000 R15: 0000000000000000
[    7.323948][    C1]  ? __cfi_lapic_next_deadline+0x10/0x10
[    7.324239][    C1]  default_idle_call+0x31/0x50
[    7.324464][    C1]  do_idle+0xd3/0x240
[    7.324690][    C1]  cpu_startup_entry+0x25/0x30
[    7.324983][    C1]  start_secondary+0xb4/0xc0
[    7.325217][    C1]  secondary_startup_64_no_verify+0x179/0x17b
[    7.325498][    C1]  &lt;/TASK&gt;
[    7.325641][    C1] Modules linked in:
[    7.325906][    C1] CR2: 000000000002b4d8
[    7.326104][    C1] ---[ end trace 0000000000000000 ]---
[    7.326354][    C1] RIP: 0010:__common_interrupt+0x52/0xc0
[    7.326614][    C1] Code: 01 00 4d 85 f6 74 39 49 81 fe 00 f0 ff ff 77 30 4c 89 f7 4d 8b 5e 68 41 ba 91 76 d8 42 45 03 53 fc 74 02 0f 0b cc ff d3 65 48 &lt;8b&gt; 05 30 c7 ff 7e 65 4c 89 3d 28 c7 ff 7e 5b 41 5c 41 5e 41 5f c3
[    7.327570][    C1] RSP: 0018:ffffc900000e0fd0 EFLAGS: 00010046
[    7.327910][    C1] RAX: 0000000000000001 RBX: 0000000000000023 RCX: 0000000000000001
[    7.328273][    C1] RDX: 00000000000003cd RSI: 0000000000000001 RDI: ffff888100d302a4
[    7.328632][    C1] RBP: 0000000000000001 R08: 0ef439818636191f R09: b1621ff338a3b482
[    7.329223][    C1] R10: ffffffff81e5127b R11: ffffffff81059810 R12: 0000000000000023
[    7.329780][    C1] R13: 0000000000000000 R14: ffff888100d30200 R15: 0000000000000000
[    7.330193][    C1] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
[    7.330632][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    7.331050][    C1] CR2: 000000000002b4d8 CR3: 0000000003028003 CR4: 0000000000370ef0
[    7.331454][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    7.331854][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    7.332236][    C1] Kernel panic - not syncing: Fatal exception in interrupt
[    7.332730][    C1] Kernel Offset: disabled
[    7.333044][    C1] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

The relevant assembly code is (from objdump, faulting address
highlighted):

ffffffff8102ed9d:       41 ff d3                  call   *%r11
ffffffff8102eda0:       65 48 &lt;8b&gt; 05 30 c7 ff    mov    %gs:0x7effc730(%rip),%rax

The emulation incorrectly sets the return address to be ffffffff8102ed9d
+ 0x5 = ffffffff8102eda2, which is the 8b byte in the middle of the next
mov. This in turn causes incorrect subsequent instruction decoding and
eventually triggers the page fault above.

Instead of invoking int3_emulate_call, perform push and jmp emulation
directly in kprobe_emulate_call_indirect. At this point we can obtain
the instruction size from p-&gt;ainsn.size so that we can calculate the
correct return address.

Link: https://lore.kernel.org/all/20240102233345.385475-1-jinghao7@illinois.edu/

Fixes: 6256e668b7af ("x86/kprobes: Use int3 instead of debug trap for single-step")
Cc: stable@vger.kernel.org
Signed-off-by: Jinghao Jia &lt;jinghao7@illinois.edu&gt;
Signed-off-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>powerpc/pseries/vas: Migration suspend waits for no in-progress open windows</title>
<updated>2024-01-10T16:16:59+00:00</updated>
<author>
<name>Haren Myneni</name>
<email>haren@linux.ibm.com</email>
</author>
<published>2023-11-25T23:51:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e1b45baa2e382f7058f993e01ac48df224ff70c6'/>
<id>urn:sha1:e1b45baa2e382f7058f993e01ac48df224ff70c6</id>
<content type='text'>
[ Upstream commit 0cf72f7f14d12cb065c3d01954cf42fc5638aa69 ]

The hypervisor returns migration failure if all VAS windows are not
closed. During pre-migration stage, vas_migration_handler() sets
migration_in_progress flag and closes all windows from the list.
The allocate VAS window routine checks the migration flag, setup
the window and then add it to the list. So there is possibility of
the migration handler missing the window that is still in the
process of setup.

t1: Allocate and open VAS	t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
  unlock vas_pseries_mutex
  return
open window HCALL
unlock vas_pseries_mutex
Modify window HCALL		lock vas_pseries_mutex
setup window			migration_in_progress=true
				Closes all windows from the list
				// May miss windows that are
				// not in the list
				unlock vas_pseries_mutex
lock vas_pseries_mutex		return
if nr_closed_windows == 0
  // No DLPAR CPU or migration
  add window to the list
  // Window will be added to the
  // list after the setup is completed
  unlock vas_pseries_mutex
  return
unlock vas_pseries_mutex
Close VAS window
// due to DLPAR CPU or migration
return -EBUSY

This patch resolves the issue with the following steps:
- Set the migration_in_progress flag without holding mutex.
- Introduce nr_open_wins_progress counter in VAS capabilities
  struct
- This counter tracks the number of open windows are still in
  progress
- The allocate setup window thread closes windows if the migration
  is set and decrements nr_open_window_progress counter
- The migration handler waits for no in-progress open windows.

The code flow with the fix is as follows:

t1: Allocate and open VAS       t2: Migration event
    window

lock vas_pseries_mutex
If migration_in_progress set
   unlock vas_pseries_mutex
   return
open window HCALL
nr_open_wins_progress++
// Window opened, but not
// added to the list yet
unlock vas_pseries_mutex
Modify window HCALL		migration_in_progress=true
setup window			lock vas_pseries_mutex
				Closes all windows from the list
				While nr_open_wins_progress {
				    unlock vas_pseries_mutex
lock vas_pseries_mutex		    sleep
if nr_closed_windows == 0	    // Wait if any open window in
or migration is not started	    // progress. The open window
   // No DLPAR CPU or migration	    // thread closes the window without
   add window to the list	    // adding to the list and return if
   nr_open_wins_progress--	    // the migration is in progress.
   unlock vas_pseries_mutex
   return
Close VAS window
nr_open_wins_progress--
unlock vas_pseries_mutex
return -EBUSY			    lock vas_pseries_mutex
				}
				unlock vas_pseries_mutex
				return

Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler")
Signed-off-by: Haren Myneni &lt;haren@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://msgid.link/20231125235104.3405008-1-haren@linux.ibm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>RISCV: KVM: update external interrupt atomically for IMSIC swfile</title>
<updated>2024-01-10T16:16:59+00:00</updated>
<author>
<name>Yong-Xuan Wang</name>
<email>yongxuan.wang@sifive.com</email>
</author>
<published>2023-12-13T06:16:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=24f855a88c9580d3aea5150d971e0ffb0f6e6b6d'/>
<id>urn:sha1:24f855a88c9580d3aea5150d971e0ffb0f6e6b6d</id>
<content type='text'>
[ Upstream commit 4ad9843e1ea088bd2529290234c6c4c6374836a7 ]

The emulated IMSIC update the external interrupt pending depending on
the value of eidelivery and topei. It might lose an interrupt when it
is interrupted before setting the new value to the pending status.

For example, when VCPU0 sends an IPI to VCPU1 via IMSIC:

VCPU0                           VCPU1

                                CSRSWAP topei = 0
                                The VCPU1 has claimed all the external
                                interrupt in its interrupt handler.

                                topei of VCPU1's IMSIC = 0

set pending in VCPU1's IMSIC

topei of VCPU1' IMSIC = 1

set the external interrupt
pending of VCPU1

                                clear the external interrupt pending
                                of VCPU1

When the VCPU1 switches back to VS mode, it exits the interrupt handler
because the result of CSRSWAP topei is 0. If there are no other external
interrupts injected into the VCPU1's IMSIC, VCPU1 will never know this
pending interrupt unless it initiative read the topei.

If the interruption occurs between updating interrupt pending in IMSIC
and updating external interrupt pending of VCPU, it will not cause a
problem. Suppose that the VCPU1 clears the IPI pending in IMSIC right
after VCPU0 sets the pending, the external interrupt pending of VCPU1
will not be set because the topei is 0. But when the VCPU1 goes back to
VS mode, the pending IPI will be reported by the CSRSWAP topei, it will
not lose this interrupt.

So we only need to make the external interrupt updating procedure as a
critical section to avoid the problem.

Fixes: db8b7e97d613 ("RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC")
Tested-by: Roy Lin &lt;roy.lin@sifive.com&gt;
Tested-by: Wayling Chen &lt;wayling.chen@sifive.com&gt;
Co-developed-by: Vincent Chen &lt;vincent.chen@sifive.com&gt;
Signed-off-by: Vincent Chen &lt;vincent.chen@sifive.com&gt;
Signed-off-by: Yong-Xuan Wang &lt;yongxuan.wang@sifive.com&gt;
Signed-off-by: Anup Patel &lt;anup@brainfault.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>KVM: s390: vsie: fix wrong VIR 37 when MSO is used</title>
<updated>2024-01-10T16:16:56+00:00</updated>
<author>
<name>Claudio Imbrenda</name>
<email>imbrenda@linux.ibm.com</email>
</author>
<published>2023-11-02T15:35:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=42db0099eca3bee8459b8c0121975dc8d7cc1c75'/>
<id>urn:sha1:42db0099eca3bee8459b8c0121975dc8d7cc1c75</id>
<content type='text'>
[ Upstream commit 80aea01c48971a1fffc0252d036995572d84950d ]

When the host invalidates a guest page, it will also check if the page
was used to map the prefix of any guest CPUs, in which case they are
stopped and marked as needing a prefix refresh. Upon starting the
affected CPUs again, their prefix pages are explicitly faulted in and
revalidated if they had been invalidated. A bit in the PGSTEs indicates
whether or not a page might contain a prefix. The bit is allowed to
overindicate. Pages above 2G are skipped, because they cannot be
prefixes, since KVM runs all guests with MSO = 0.

The same applies for nested guests (VSIE). When the host invalidates a
guest page that maps the prefix of the nested guest, it has to stop the
affected nested guest CPUs and mark them as needing a prefix refresh.
The same PGSTE bit used for the guest prefix is also used for the
nested guest. Pages above 2G are skipped like for normal guests, which
is the source of the bug.

The nested guest runs is the guest primary address space. The guest
could be running the nested guest using MSO != 0. If the MSO + prefix
for the nested guest is above 2G, the check for nested prefix will skip
it. This will cause the invalidation notifier to not stop the CPUs of
the nested guest and not mark them as needing refresh. When the nested
guest is run again, its prefix will not be refreshed, since it has not
been marked for refresh. This will cause a fatal validity intercept
with VIR code 37.

Fix this by removing the check for 2G for nested guests. Now all
invalidations of pages with the notify bit set will always scan the
existing VSIE shadow state descriptors.

This allows to catch invalidations of nested guest prefix mappings even
when the prefix is above 2G in the guest virtual address space.

Fixes: a3508fbe9dc6 ("KVM: s390: vsie: initial support for nested virtualization")
Tested-by: Nico Boehr &lt;nrb@linux.ibm.com&gt;
Reviewed-by: Nico Boehr &lt;nrb@linux.ibm.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Message-ID: &lt;20231102153549.53984-1-imbrenda@linux.ibm.com&gt;
Signed-off-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>riscv: don't probe unaligned access speed if already done</title>
<updated>2024-01-10T16:16:56+00:00</updated>
<author>
<name>Jisheng Zhang</name>
<email>jszhang@kernel.org</email>
</author>
<published>2023-09-12T15:40:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b649a7fe9a8b41a01b608e129a90af09d4b5f52f'/>
<id>urn:sha1:b649a7fe9a8b41a01b608e129a90af09d4b5f52f</id>
<content type='text'>
[ Upstream commit c20d36cc2a2073d4cdcda92bd7a1bb9b3b3b7c79 ]

If misaligned_access_speed percpu var isn't so called "HWPROBE
MISALIGNED UNKNOWN", it means the probe has happened(this is possible
for example, hotplug off then hotplug on one cpu), and the percpu var
has been set, don't probe again in this case.

Signed-off-by: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Fixes: 584ea6564bca ("RISC-V: Probe for unaligned access speed")
Reviewed-by: Conor Dooley &lt;conor.dooley@microchip.com&gt;
Link: https://lore.kernel.org/r/20230912154040.3306-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>ARM: sun9i: smp: Fix array-index-out-of-bounds read in sunxi_mc_smp_init</title>
<updated>2024-01-10T16:16:50+00:00</updated>
<author>
<name>Stefan Wahren</name>
<email>wahrenst@gmx.net</email>
</author>
<published>2023-12-28T19:39:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=015414fdc068ab54ebcac181824ca0800e5855db'/>
<id>urn:sha1:015414fdc068ab54ebcac181824ca0800e5855db</id>
<content type='text'>
[ Upstream commit 72ad3b772b6d393701df58ba1359b0bb346a19ed ]

Running a multi-arch kernel (multi_v7_defconfig) on a Raspberry Pi 3B+
with enabled CONFIG_UBSAN triggers the following warning:

 UBSAN: array-index-out-of-bounds in arch/arm/mach-sunxi/mc_smp.c:810:29
 index 2 is out of range for type 'sunxi_mc_smp_data [2]'
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc6-00248-g5254c0cbc92d
 Hardware name: BCM2835
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x40/0x4c
  dump_stack_lvl from ubsan_epilogue+0x8/0x34
  ubsan_epilogue from __ubsan_handle_out_of_bounds+0x78/0x80
  __ubsan_handle_out_of_bounds from sunxi_mc_smp_init+0xe4/0x4cc
  sunxi_mc_smp_init from do_one_initcall+0xa0/0x2fc
  do_one_initcall from kernel_init_freeable+0xf4/0x2f4
  kernel_init_freeable from kernel_init+0x18/0x158
  kernel_init from ret_from_fork+0x14/0x28

Since the enabled method couldn't match with any entry from
sunxi_mc_smp_data, the value of the index shouldn't be used right after
the loop. So move it after the check of ret in order to have a valid
index.

Fixes: 1631090e34f5 ("ARM: sun9i: smp: Add is_a83t field")
Signed-off-by: Stefan Wahren &lt;wahrenst@gmx.net&gt;
Link: https://lore.kernel.org/r/20231228193903.9078-1-wahrenst@gmx.net
Reviewed-by: Chen-Yu Tsai &lt;wens@csie.org&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL</title>
<updated>2024-01-10T16:16:46+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2024-01-04T15:15:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=30912a7f64de49408c9c0596937717770f1e7a63'/>
<id>urn:sha1:30912a7f64de49408c9c0596937717770f1e7a63</id>
<content type='text'>
commit 971079464001c6856186ca137778e534d983174a upstream.

When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
MSR emulation for extended PEBS") switched the initialization of
cpuc-&gt;guest_switch_msrs to use compound literals, it screwed up
the boolean logic:

+	u64 pebs_mask = cpuc-&gt;pebs_enabled &amp; x86_pmu.pebs_capable;
...
-	arr[0].guest = intel_ctrl &amp; ~cpuc-&gt;intel_ctrl_host_mask;
-	arr[0].guest &amp;= ~(cpuc-&gt;pebs_enabled &amp; x86_pmu.pebs_capable);
+               .guest = intel_ctrl &amp; (~cpuc-&gt;intel_ctrl_host_mask | ~pebs_mask),

Before the patch, the value of arr[0].guest would have been intel_ctrl &amp;
~cpuc-&gt;intel_ctrl_host_mask &amp; ~pebs_mask.  The intent is to always treat
PEBS events as host-only because, while the guest runs, there is no way
to tell the processor about the virtual address where to put PEBS records
intended for the host.

Unfortunately, the new expression can be expanded to

	(intel_ctrl &amp; ~cpuc-&gt;intel_ctrl_host_mask) | (intel_ctrl &amp; ~pebs_mask)

which makes no sense; it includes any bit that isn't *both* marked as
exclude_guest and using PEBS.  So, reinstate the old logic.  Another
way to write it could be "intel_ctrl &amp; ~(cpuc-&gt;intel_ctrl_host_mask |
pebs_mask)", presumably the intention of the author of the faulty.
However, I personally find the repeated application of A AND NOT B to
be a bit more readable.

This shows up as guest failures when running concurrent long-running
perf workloads on the host, and was reported to happen with rcutorture.
All guests on a given host would die simultaneously with something like an
instruction fault or a segmentation violation.

Reported-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Analyzed-by: Sean Christopherson &lt;seanjc@google.com&gt;
Tested-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: stable@vger.kernel.org
Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>kexec: fix KEXEC_FILE dependencies</title>
<updated>2024-01-05T14:19:41+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2023-10-23T11:01:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=78422b744ad90f0f99b8081c10a1da9182964efd'/>
<id>urn:sha1:78422b744ad90f0f99b8081c10a1da9182964efd</id>
<content type='text'>
[ Upstream commit c1ad12ee0efc07244be37f69311e6f7c4ac98e62 ]

The cleanup for the CONFIG_KEXEC Kconfig logic accidentally changed the
'depends on CRYPTO=y' dependency to a plain 'depends on CRYPTO', which
causes a link failure when all the crypto support is in a loadable module
and kexec_file support is built-in:

x86_64-linux-ld: vmlinux.o: in function `__x64_sys_kexec_file_load':
(.text+0x32e30a): undefined reference to `crypto_alloc_shash'
x86_64-linux-ld: (.text+0x32e58e): undefined reference to `crypto_shash_update'
x86_64-linux-ld: (.text+0x32e6ee): undefined reference to `crypto_shash_final'

Both s390 and x86 have this problem, while ppc64 and riscv have the
correct dependency already.  On riscv, the dependency is only used for the
purgatory, not for the kexec_file code itself, which may be a bit
surprising as it means that with CONFIG_CRYPTO=m, it is possible to enable
KEXEC_FILE but then the purgatory code is silently left out.

Move this into the common Kconfig.kexec file in a way that is correct
everywhere, using the dependency on CRYPTO_SHA256=y only when the
purgatory code is available.  This requires reversing the dependency
between ARCH_SUPPORTS_KEXEC_PURGATORY and KEXEC_FILE, but the effect
remains the same, other than making riscv behave like the other ones.

On s390, there is an additional dependency on CRYPTO_SHA256_S390, which
should technically not be required but gives better performance.  Remove
this dependency here, noting that it was not present in the initial
Kconfig code but was brought in without an explanation in commit
71406883fd357 ("s390/kexec_file: Add kexec_file_load system call").

[arnd@arndb.de: fix riscv build]
  Link: https://lkml.kernel.org/r/67ddd260-d424-4229-a815-e3fcfb864a77@app.fastmail.com
Link: https://lkml.kernel.org/r/20231023110308.1202042-1-arnd@kernel.org
Fixes: 6af5138083005 ("x86/kexec: refactor for kernel/Kconfig.kexec")
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Eric DeVolder &lt;eric_devolder@yahoo.com&gt;
Tested-by: Eric DeVolder &lt;eric_devolder@yahoo.com&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Christian Borntraeger &lt;borntraeger@linux.ibm.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Conor Dooley &lt;conor@kernel.org&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>x86/smpboot/64: Handle X2APIC BIOS inconsistency gracefully</title>
<updated>2024-01-01T12:42:47+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2023-12-15T08:58:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ddc4ad52c022c366c3b48f7eaed7c36f92f95db4'/>
<id>urn:sha1:ddc4ad52c022c366c3b48f7eaed7c36f92f95db4</id>
<content type='text'>
commit 69a7386c1ec25476a0c78ffeb59de08a2a08f495 upstream.

Chris reported that a Dell PowerEdge T340 system stopped to boot when upgrading
to a kernel which contains the parallel hotplug changes.  Disabling parallel
hotplug on the kernel command line makes it boot again.

It turns out that the Dell BIOS has x2APIC enabled and the boot CPU comes up in
X2APIC mode, but the APs come up inconsistently in xAPIC mode.

Parallel hotplug requires that the upcoming CPU reads out its APIC ID from the
local APIC in order to map it to the Linux CPU number.

In this particular case the readout on the APs uses the MMIO mapped registers
because the BIOS failed to enable x2APIC mode. That readout results in a page
fault because the kernel does not have the APIC MMIO space mapped when X2APIC
mode was enabled by the BIOS on the boot CPU and the kernel switched to X2APIC
mode early. That page fault can't be handled on the upcoming CPU that early and
results in a silent boot failure.

If parallel hotplug is disabled the system boots because in that case the APIC
ID read is not required as the Linux CPU number is provided to the AP in the
smpboot control word. When the kernel uses x2APIC mode then the APs are
switched to x2APIC mode too slightly later in the bringup process, but there is
no reason to do it that late.

Cure the BIOS bogosity by checking in the parallel bootup path whether the
kernel uses x2APIC mode and if so switching over the APs to x2APIC mode before
the APIC ID readout.

Fixes: 0c7ffa32dbd6 ("x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it")
Reported-by: Chris Lindee &lt;chris.lindee@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Ashok Raj &lt;ashok.raj@intel.com&gt;
Tested-by: Chris Lindee &lt;chris.lindee@gmail.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/CA%2B2tU59853R49EaU_tyvOZuOTDdcU0RshGyydccp9R1NX9bEeQ@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>x86/alternatives: Disable interrupts and sync when optimizing NOPs in place</title>
<updated>2024-01-01T12:42:47+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2023-12-07T19:49:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6778977590dacbd8964ef1da715f5b07e9e04302'/>
<id>urn:sha1:6778977590dacbd8964ef1da715f5b07e9e04302</id>
<content type='text'>
commit 2dc4196138055eb0340231aecac4d78c2ec2bea5 upstream.

apply_alternatives() treats alternatives with the ALT_FLAG_NOT flag set
special as it optimizes the existing NOPs in place.

Unfortunately, this happens with interrupts enabled and does not provide any
form of core synchronization.

So an interrupt hitting in the middle of the update and using the affected code
path will observe a half updated NOP and crash and burn. The following
3 NOP sequence was observed to expose this crash halfway reliably under QEMU
  32bit:

   0x90 0x90 0x90

which is replaced by the optimized 3 byte NOP:

   0x8d 0x76 0x00

So an interrupt can observe:

   1) 0x90 0x90 0x90		nop nop nop
   2) 0x8d 0x90 0x90		undefined
   3) 0x8d 0x76 0x90		lea    -0x70(%esi),%esi
   4) 0x8d 0x76 0x00		lea     0x0(%esi),%esi

Where only #1 and #4 are true NOPs. The same problem exists for 64bit obviously.

Disable interrupts around this NOP optimization and invoke sync_core()
before re-enabling them.

Fixes: 270a69c4485d ("x86/alternative: Support relocations in alternatives")
Reported-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/ZT6narvE%2BLxX%2B7Be@windriver.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
