<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/arch/powerpc/platforms/pseries/hotplug-cpu.c, branch linux-5.9.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-5.9.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-5.9.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-08-18T07:09:52+00:00</updated>
<entry>
<title>powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death</title>
<updated>2020-08-18T07:09:52+00:00</updated>
<author>
<name>Michael Roth</name>
<email>mdroth@linux.vnet.ibm.com</email>
</author>
<published>2020-08-11T16:15:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=801980f6497946048709b9b09771a1729551d705'/>
<id>urn:sha1:801980f6497946048709b9b09771a1729551d705</id>
<content type='text'>
For a power9 KVM guest with XIVE enabled, running a test loop
where we hotplug 384 vcpus and then unplug them, the following traces
can be seen (generally within a few loops) either from the unplugged
vcpu:

  cpu 65 (hwid 65) Ready to die...
  Querying DEAD? cpu 66 (66) shows 2
  list_del corruption. next-&gt;prev should be c00a000002470208, but was c00a000002470048
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:56!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ...
  CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1
  NIP:  c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac
  REGS: c0000009e5a17840 TRAP: 0700   Not tainted  (4.18.0-221.el8.ppc64le)
  MSR:  800000000282b033 &lt;SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE&gt;  CR: 28000842  XER: 20040000
  ...
  NIP __list_del_entry_valid+0xac/0x100
  LR  __list_del_entry_valid+0xa8/0x100
  Call Trace:
    __list_del_entry_valid+0xa8/0x100 (unreliable)
    free_pcppages_bulk+0x1f8/0x940
    free_unref_page+0xd0/0x100
    xive_spapr_cleanup_queue+0x148/0x1b0
    xive_teardown_cpu+0x1bc/0x240
    pseries_mach_cpu_die+0x78/0x2f0
    cpu_die+0x48/0x70
    arch_cpu_idle_dead+0x20/0x40
    do_idle+0x2f4/0x4c0
    cpu_startup_entry+0x38/0x40
    start_secondary+0x7bc/0x8f0
    start_secondary_prolog+0x10/0x14

or on the worker thread handling the unplug:

  pseries-hotplug-cpu: Attempting to remove CPU &lt;NULL&gt;, drc index: 1000013a
  Querying DEAD? cpu 314 (314) shows 2
  BUG: Bad page state in process kworker/u768:3  pfn:95de1
  cpu 314 (hwid 314) Ready to die...
  page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
  flags: 0x5ffffc00000000()
  raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000
  raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000
  page dumped because: nonzero mapcount
  Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ...
  CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1
  Workqueue: pseries hotplug workque pseries_hp_work_fn
  Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    bad_page+0x12c/0x1b0
    free_pcppages_bulk+0x5bc/0x940
    page_alloc_cpu_dead+0x118/0x120
    cpuhp_invoke_callback.constprop.5+0xb8/0x760
    _cpu_down+0x188/0x340
    cpu_down+0x5c/0xa0
    cpu_subsys_offline+0x24/0x40
    device_offline+0xf0/0x130
    dlpar_offline_cpu+0x1c4/0x2a0
    dlpar_cpu_remove+0xb8/0x190
    dlpar_cpu_remove_by_index+0x12c/0x150
    dlpar_cpu+0x94/0x800
    pseries_hp_work_fn+0x128/0x1e0
    process_one_work+0x304/0x5d0
    worker_thread+0xcc/0x7a0
    kthread+0x1ac/0x1c0
    ret_from_kernel_thread+0x5c/0x80

The latter trace is due to the following sequence:

  page_alloc_cpu_dead
    drain_pages
      drain_pages_zone
        free_pcppages_bulk

where drain_pages() in this case is called under the assumption that
the unplugged cpu is no longer executing. To ensure that is the case,
and early call is made to __cpu_die()-&gt;pseries_cpu_die(), which runs a
loop that waits for the cpu to reach a halted state by polling its
status via query-cpu-stopped-state RTAS calls. It only polls for 25
iterations before giving up, however, and in the trace above this
results in the following being printed only .1 seconds after the
hotplug worker thread begins processing the unplug request:

  pseries-hotplug-cpu: Attempting to remove CPU &lt;NULL&gt;, drc index: 1000013a
  Querying DEAD? cpu 314 (314) shows 2

At that point the worker thread assumes the unplugged CPU is in some
unknown/dead state and procedes with the cleanup, causing the race
with the XIVE cleanup code executed by the unplugged CPU.

Fix this by waiting indefinitely, but also making an effort to avoid
spurious lockup messages by allowing for rescheduling after polling
the CPU status and printing a warning if we wait for longer than 120s.

Fixes: eac1e731b59ee ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
Suggested-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Michael Roth &lt;mdroth@linux.vnet.ibm.com&gt;
Tested-by: Greg Kurz &lt;groug@kaod.org&gt;
Reviewed-by: Thiago Jung Bauermann &lt;bauerman@linux.ibm.com&gt;
Reviewed-by: Greg Kurz &lt;groug@kaod.org&gt;
[mpe: Trim oopses in change log slightly for readability]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200811161544.10513-1-mdroth@linux.vnet.ibm.com
</content>
</entry>
<entry>
<title>powerpc/pseries/hotplug-cpu: Remove double free in error path</title>
<updated>2020-07-30T12:53:49+00:00</updated>
<author>
<name>Nathan Lynch</name>
<email>nathanl@linux.ibm.com</email>
</author>
<published>2019-09-19T23:16:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a0ff72f9f5a780341e7ff5e9ba50a0dad5fa1980'/>
<id>urn:sha1:a0ff72f9f5a780341e7ff5e9ba50a0dad5fa1980</id>
<content type='text'>
In the unlikely event that the device tree lacks a /cpus node,
find_dlpar_cpus_to_add() oddly frees the cpu_drcs buffer it has been
passed before returning an error. Its only caller also frees the
buffer on error.

Remove the less conventional kfree() of a caller-supplied buffer from
find_dlpar_cpus_to_add().

Fixes: 90edf184b9b7 ("powerpc/pseries: Add CPU dlpar add functionality")
Signed-off-by: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20190919231633.1344-1-nathanl@linux.ibm.com
</content>
</entry>
<entry>
<title>powerpc/pseries: remove dlpar_cpu_readd()</title>
<updated>2020-07-16T03:12:40+00:00</updated>
<author>
<name>Nathan Lynch</name>
<email>nathanl@linux.ibm.com</email>
</author>
<published>2020-06-12T05:12:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=38c392cef19019457ddcfb197ff3d9c5267698e6'/>
<id>urn:sha1:38c392cef19019457ddcfb197ff3d9c5267698e6</id>
<content type='text'>
dlpar_cpu_readd() is unused now.

Signed-off-by: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200612051238.1007764-18-nathanl@linux.ibm.com
</content>
</entry>
<entry>
<title>powerpc/numa: remove timed_topology_update()</title>
<updated>2020-07-16T03:12:37+00:00</updated>
<author>
<name>Nathan Lynch</name>
<email>nathanl@linux.ibm.com</email>
</author>
<published>2020-06-12T05:12:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b1815aeac7fde2dc3412daf2efaededd21cd58e0'/>
<id>urn:sha1:b1815aeac7fde2dc3412daf2efaededd21cd58e0</id>
<content type='text'>
timed_topology_update is a no-op now, so remove it and all call sites.

Signed-off-by: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Reviewed-by: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200612051238.1007764-11-nathanl@linux.ibm.com
</content>
</entry>
<entry>
<title>powerpc/pseries: remove cede offline state for CPUs</title>
<updated>2020-07-16T03:12:34+00:00</updated>
<author>
<name>Nathan Lynch</name>
<email>nathanl@linux.ibm.com</email>
</author>
<published>2020-06-12T05:12:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=48f6e7f6d948b56489da027bc3284c709b939d28'/>
<id>urn:sha1:48f6e7f6d948b56489da027bc3284c709b939d28</id>
<content type='text'>
This effectively reverts commit 3aa565f53c39 ("powerpc/pseries: Add
hooks to put the CPU into an appropriate offline state"), which added
an offline mode for CPUs which uses the H_CEDE hcall instead of the
architected stop-self RTAS function in order to facilitate "folding"
of dedicated mode processors on PowerVM platforms to achieve energy
savings. This has been the default offline mode since its
introduction.

There's nothing about stop-self that would prevent the hypervisor from
achieving the energy savings available via H_CEDE, so the original
premise of this change appears to be flawed.

I also have encountered the claim that the transition to and from
ceded state is much faster than stop-self/start-cpu. Certainly we
would not want to use stop-self as an *idle* mode. That is what H_CEDE
is for. However, this difference is insignificant in the context of
Linux CPU hotplug, where the latency of an offline or online operation
on current systems is on the order of 100ms, mainly attributable to
all the various subsystems' cpuhp callbacks.

The cede offline mode also prevents accurate accounting, as discussed
before:
https://lore.kernel.org/linuxppc-dev/1571740391-3251-1-git-send-email-ego@linux.vnet.ibm.com/

Unconditionally use stop-self to offline processor threads. This is
the architected method for offlining CPUs on PAPR systems.

The "cede_offline" boot parameter is rendered obsolete.

Removing this code enables the removal of the partition suspend code
which temporarily onlines all present CPUs.

Fixes: 3aa565f53c39 ("powerpc/pseries: Add hooks to put the CPU into an appropriate offline state")
Signed-off-by: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Reviewed-by: Gautham R. Shenoy &lt;ego@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20200612051238.1007764-2-nathanl@linux.ibm.com
</content>
</entry>
<entry>
<title>powerpc/pseries: Add cpu DLPAR support for drc-info property</title>
<updated>2019-11-13T05:57:57+00:00</updated>
<author>
<name>Tyrel Datwyler</name>
<email>tyreld@linux.ibm.com</email>
</author>
<published>2019-11-11T05:21:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b015f6bc9547dbc056edde7177c7868ca8629c4c'/>
<id>urn:sha1:b015f6bc9547dbc056edde7177c7868ca8629c4c</id>
<content type='text'>
Older firmwares provided information about Dynamic Reconfig
Connectors (DRC) through several device tree properties, namely
ibm,drc-types, ibm,drc-indexes, ibm,drc-names, and
ibm,drc-power-domains. New firmwares have the ability to present this
same information in a much condensed format through a device tree
property called ibm,drc-info.

The existing cpu DLPAR hotplug code only understands the older DRC
property format when validating the drc-index of a cpu during a
hotplug add. This updates those code paths to use the ibm,drc-info
property, when present, instead for validation.

Signed-off-by: Tyrel Datwyler &lt;tyreld@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/1573449697-5448-4-git-send-email-tyreld@linux.ibm.com
</content>
</entry>
<entry>
<title>powerpc/pseries: safely roll back failed DLPAR cpu add</title>
<updated>2019-11-05T00:26:45+00:00</updated>
<author>
<name>Nathan Lynch</name>
<email>nathanl@linux.ibm.com</email>
</author>
<published>2019-10-16T18:36:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=80c784282859cc39617b808440a34d0be9502b87'/>
<id>urn:sha1:80c784282859cc39617b808440a34d0be9502b87</id>
<content type='text'>
dlpar_online_cpu() attempts to online all threads of a core that has
been added to an LPAR. If onlining a non-primary thread
fails (e.g. due to an allocation failure), the core is left with at
least one thread online. dlpar_cpu_add() attempts to roll back the
whole operation, releasing the core back to the platform. However,
since some threads of the core being removed are still online, the
BUG_ON(cpu_online(cpu)) in pseries_remove_processor() strikes:

LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 3 PID: 8587 Comm: drmgr Not tainted 5.3.0-rc2-00190-g9b123d1ea237-dirty #46
NIP:  c0000000000eeb2c LR: c0000000000eeac4 CTR: c0000000000ee9e0
REGS: c0000001f745b6c0 TRAP: 0700   Not tainted  (5.3.0-rc2-00190-g9b123d1ea237-dirty)
MSR:  800000010282b033 &lt;SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]&gt;  CR: 44002448  XER: 00000000
CFAR: c00000000195d718 IRQMASK: 0
GPR00: c0000000000eeac4 c0000001f745b950 c0000000032f6200 0000000000000008
GPR04: 0000000000000008 c000000003349c78 0000000000000040 00000000000001ff
GPR08: 0000000000000008 0000000000000000 0000000000000001 0007ffffffffffff
GPR12: 0000000084002844 c00000001ecacb80 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000008
GPR24: c000000003349ee0 c00000000334a2e4 c0000000fca4d7a8 c000000001d20048
GPR28: 0000000000000001 ffffffffffffffff ffffffffffffffff c0000000fca4d7c4
NIP [c0000000000eeb2c] pseries_smp_notifier+0x14c/0x2e0
LR [c0000000000eeac4] pseries_smp_notifier+0xe4/0x2e0
Call Trace:
[c0000001f745b950] [c0000000000eeac4] pseries_smp_notifier+0xe4/0x2e0 (unreliable)
[c0000001f745ba10] [c0000000001ac774] notifier_call_chain+0xb4/0x190
[c0000001f745bab0] [c0000000001ad62c] blocking_notifier_call_chain+0x7c/0xb0
[c0000001f745baf0] [c00000000167bda0] of_detach_node+0xc0/0x110
[c0000001f745bb50] [c0000000000e7ae4] dlpar_detach_node+0x64/0xa0
[c0000001f745bb80] [c0000000000edefc] dlpar_cpu_add+0x31c/0x360
[c0000001f745bc10] [c0000000000ee980] dlpar_cpu_probe+0x50/0xb0
[c0000001f745bc50] [c00000000002cf70] arch_cpu_probe+0x40/0x70
[c0000001f745bc70] [c000000000ccd808] cpu_probe_store+0x48/0x80
[c0000001f745bcb0] [c000000000cbcef8] dev_attr_store+0x38/0x60
[c0000001f745bcd0] [c00000000059c980] sysfs_kf_write+0x70/0xb0
[c0000001f745bd10] [c00000000059afb8] kernfs_fop_write+0xf8/0x280
[c0000001f745bd60] [c0000000004b437c] __vfs_write+0x3c/0x70
[c0000001f745bd80] [c0000000004b8710] vfs_write+0xd0/0x220
[c0000001f745bdd0] [c0000000004b8acc] ksys_write+0x7c/0x140
[c0000001f745be20] [c00000000000bbd8] system_call+0x5c/0x68

Move dlpar_offline_cpu() up in the file so that dlpar_online_cpu() can
use it to re-offline any threads that have been onlined when an error
is encountered.

Signed-off-by: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Fixes: e666ae0b10aa ("powerpc/pseries: Update CPU hotplug error recovery")
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20191016183611.10867-3-nathanl@linux.ibm.com
</content>
</entry>
<entry>
<title>powerpc/pseries: address checkpatch warnings in dlpar_offline_cpu</title>
<updated>2019-11-05T00:26:45+00:00</updated>
<author>
<name>Nathan Lynch</name>
<email>nathanl@linux.ibm.com</email>
</author>
<published>2019-10-16T18:36:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3366ebe9e19ba3c672a00a6fb86f0ac8636ee989'/>
<id>urn:sha1:3366ebe9e19ba3c672a00a6fb86f0ac8636ee989</id>
<content type='text'>
Remove some stray blank lines, convert a printk to pr_warn, and
address a line length violation.

One functional change: use WARN_ON instead of BUG_ON in case H_PROD of
a ceded thread yields an unexpected result from the platform. We can
expect this code path to get uninterruptibly stuck in __cpu_die() if
this happens, but that's more desirable than crashing.

Signed-off-by: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Fixes: b6db63d1a7f0 ("pseries/pseries: Add code to online/offline CPUs of a DLPAR node")
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20191016183611.10867-2-nathanl@linux.ibm.com
</content>
</entry>
<entry>
<title>treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152</title>
<updated>2019-05-30T18:26:32+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-27T06:55:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2874c5fd284268364ece81a7bd936f3c8168e567'/>
<id>urn:sha1:2874c5fd284268364ece81a7bd936f3c8168e567</id>
<content type='text'>
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Allison Randal &lt;allison@lohutok.net&gt;
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>powerpc/pseries: Perform full re-add of CPU for topology update post-migration</title>
<updated>2019-01-30T12:28:56+00:00</updated>
<author>
<name>Nathan Fontenot</name>
<email>nfont@linux.vnet.ibm.com</email>
</author>
<published>2018-10-29T18:43:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=81b61324922c67f73813d8a9c175f3c153f6a1c6'/>
<id>urn:sha1:81b61324922c67f73813d8a9c175f3c153f6a1c6</id>
<content type='text'>
On pseries systems, performing a partition migration can result in
altering the nodes a CPU is assigned to on the destination system. For
exampl, pre-migration on the source system CPUs are in node 1 and 3,
post-migration on the destination system CPUs are in nodes 2 and 3.

Handling the node change for a CPU can cause corruption in the slab
cache if we hit a timing where a CPUs node is changed while cache_reap()
is invoked. The corruption occurs because the slab cache code appears
to rely on the CPU and slab cache pages being on the same node.

The current dynamic updating of a CPUs node done in arch/powerpc/mm/numa.c
does not prevent us from hitting this scenario.

Changing the device tree property update notification handler that
recognizes an affinity change for a CPU to do a full DLPAR remove and
add of the CPU instead of dynamically changing its node resolves this
issue.

Signed-off-by: Nathan Fontenot &lt;nfont@linux.vnet.ibm.com&gt;
Signed-off-by: Michael W. Bringmann &lt;mwb@linux.vnet.ibm.com&gt;
Tested-by: Michael W. Bringmann &lt;mwb@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
</feed>
