<feed xmlns='http://www.w3.org/2005/Atom'>
<title>BMC/Intel-BMC/linux.git/arch/powerpc/include/asm/processor.h, branch dev-4.7</title>
<subtitle>Intel OpenBMC Linux kernel source tree (mirror)</subtitle>
<id>https://git.radix-linux.su/BMC/Intel-BMC/linux.git/atom?h=dev-4.7</id>
<link rel='self' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/atom?h=dev-4.7'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/'/>
<updated>2016-03-29T01:08:08+00:00</updated>
<entry>
<title>powerpc: Correct used_vsr comment</title>
<updated>2016-03-29T01:08:08+00:00</updated>
<author>
<name>Simon Guo</name>
<email>wei.guo.simon@gmail.com</email>
</author>
<published>2016-03-24T17:12:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=71528d8bd7a8aa920cd69d4223c6c87d5849257d'/>
<id>urn:sha1:71528d8bd7a8aa920cd69d4223c6c87d5849257d</id>
<content type='text'>
The used_vsr flag is set if process has used VSX registers, not Altivec
registers. But the comment says otherwise, correct the comment.

Signed-off-by: Simon Guo &lt;wei.guo.simon@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc: Restore FPU/VEC/VSX if previously used</title>
<updated>2016-03-02T12:34:48+00:00</updated>
<author>
<name>Cyril Bur</name>
<email>cyrilbur@gmail.com</email>
</author>
<published>2016-02-29T06:53:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=70fe3d980f5f14d8125869125ba9a0ea95e09c6b'/>
<id>urn:sha1:70fe3d980f5f14d8125869125ba9a0ea95e09c6b</id>
<content type='text'>
Currently the FPU, VEC and VSX facilities are lazily loaded. This is not
a problem unless a process is using these facilities.

Modern versions of GCC are very good at automatically vectorising code,
new and modernised workloads make use of floating point and vector
facilities, even the kernel makes use of vectorised memcpy.

All this combined greatly increases the cost of a syscall since the
kernel uses the facilities sometimes even in syscall fast-path making it
increasingly common for a thread to take an *_unavailable exception soon
after a syscall, not to mention potentially taking all three.

The obvious overcompensation to this problem is to simply always load
all the facilities on every exit to userspace. Loading up all FPU, VEC
and VSX registers every time can be expensive and if a workload does
avoid using them, it should not be forced to incur this penalty.

An 8bit counter is used to detect if the registers have been used in the
past and the registers are always loaded until the value wraps to back
to zero.

Several versions of the assembly in entry_64.S were tested:

  1. Always calling C.
  2. Performing a common case check and then calling C.
  3. A complex check in asm.

After some benchmarking it was determined that avoiding C in the common
case is a performance benefit (option 2). The full check in asm (option
3) greatly complicated that codepath for a negligible performance gain
and the trade-off was deemed not worth it.

Signed-off-by: Cyril Bur &lt;cyrilbur@gmail.com&gt;
[mpe: Move load_vec in the struct to fill an existing hole, reword change log]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;

fixup
</content>
</entry>
<entry>
<title>powerpc: Remove fp_enable() and vec_enable(), use msr_check_and_{set, clear}()</title>
<updated>2015-12-01T02:52:26+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2015-10-29T00:44:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=1f2e25b2d552cade43eacb2edc4e7f01c1cfecb3'/>
<id>urn:sha1:1f2e25b2d552cade43eacb2edc4e7f01c1cfecb3</id>
<content type='text'>
More consolidation of our MSR available bit handling.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc: Remove UP only lazy floating point and vector optimisations</title>
<updated>2015-12-01T02:52:24+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2015-10-29T00:43:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=af1bbc3dd3d501d27da72e1764afe5f5b0d3882d'/>
<id>urn:sha1:af1bbc3dd3d501d27da72e1764afe5f5b0d3882d</id>
<content type='text'>
The UP only lazy floating point and vector optimisations were written
back when SMP was not common, and neither glibc nor gcc used vector
instructions. Now SMP is very common, glibc aggressively uses vector
instructions and gcc autovectorises.

We want to add new optimisations that apply to both UP and SMP, but
in preparation for that remove these UP only optimisations.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc: Create context switch helpers save_sprs() and restore_sprs()</title>
<updated>2015-12-01T02:52:24+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2015-10-29T00:43:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=152d523e6307c7152f9986a542f873b5c5863937'/>
<id>urn:sha1:152d523e6307c7152f9986a542f873b5c5863937</id>
<content type='text'>
Move all our context switch SPR save and restore code into two
helpers. We do a few optimisations:

- Group all mfsprs and all mtsprs. In many cases an mtspr sets a
scoreboarding bit that an mfspr waits on, so the current practise of
mfspr A; mtspr A; mfpsr B; mtspr B is the worst scheduling we can
do.

- SPR writes are slow, so check that the value is changing before
writing it.

A context switch microbenchmark using yield():

http://ozlabs.org/~anton/junkcode/context_switch2.c

./context_switch2 --test=yield 0 0

shows an improvement of almost 10% on POWER8.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/tm: Drop tm_orig_msr from thread_struct</title>
<updated>2015-07-16T06:02:37+00:00</updated>
<author>
<name>Anshuman Khandual</name>
<email>khandual@linux.vnet.ibm.com</email>
</author>
<published>2015-07-06T10:54:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=829023df86d4ec39b110860cd5f106b7ac58f772'/>
<id>urn:sha1:829023df86d4ec39b110860cd5f106b7ac58f772</id>
<content type='text'>
Currently tm_orig_msr is getting used during process context switch only.
Then there is ckpt_regs which saves the checkpointed userspace context
The MSR slot contained in ckpt_regs structure can be used during process
context switch instead of tm_orig_msr, thus allowing us to drop it from
thread_struct structure. This patch does that change.

Acked-by: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Anshuman Khandual &lt;khandual@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/dscr: Add some in-code documentation</title>
<updated>2015-06-07T09:29:15+00:00</updated>
<author>
<name>Anshuman Khandual</name>
<email>khandual@linux.vnet.ibm.com</email>
</author>
<published>2015-05-21T06:43:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=d3cb06e0cdc9841b289a0d68acaffd2868504902'/>
<id>urn:sha1:d3cb06e0cdc9841b289a0d68acaffd2868504902</id>
<content type='text'>
This patch adds some in-code documentation to the DSCR related code to
make it more readable without having any functional change to it.

Signed-off-by: Anshuman Khandual &lt;khandual@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powernv/powerpc: Add winkle support for offline cpus</title>
<updated>2014-12-14T23:46:41+00:00</updated>
<author>
<name>Shreyas B. Prabhu</name>
<email>shreyas@linux.vnet.ibm.com</email>
</author>
<published>2014-12-09T18:56:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=77b54e9f213f76a23736940cf94bcd765fc00f40'/>
<id>urn:sha1:77b54e9f213f76a23736940cf94bcd765fc00f40</id>
<content type='text'>
Winkle is a deep idle state supported in power8 chips. A core enters
winkle when all the threads of the core enter winkle. In this state
power supply to the entire chiplet i.e core, private L2 and private L3
is turned off. As a result it gives higher powersavings compared to
sleep.

But entering winkle results in a total hypervisor state loss. Hence the
hypervisor context has to be preserved before entering winkle and
restored upon wake up.

Power-on Reset Engine (PORE) is a dedicated engine which is responsible
for powering on the chiplet during wake up. It can be programmed to
restore the register contests of a few specific registers. This patch
uses PORE to restore register state wherever possible and uses stack to
save and restore rest of the necessary registers.

With hypervisor state restore things fall under three categories-
per-core state, per-subcore state and per-thread state. To manage this,
extend the infrastructure introduced for sleep. Mainly we add a paca
variable subcore_sibling_mask. Using this and the core_idle_state we can
distingush first thread in core and subcore.

Signed-off-by: Shreyas B. Prabhu &lt;shreyas@linux.vnet.ibm.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powernv/cpuidle: Redesign idle states management</title>
<updated>2014-12-14T23:46:40+00:00</updated>
<author>
<name>Shreyas B. Prabhu</name>
<email>shreyas@linux.vnet.ibm.com</email>
</author>
<published>2014-12-09T18:56:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=7cba160ad789a3ad7e68b92bf20eaad6ed171f80'/>
<id>urn:sha1:7cba160ad789a3ad7e68b92bf20eaad6ed171f80</id>
<content type='text'>
Deep idle states like sleep and winkle are per core idle states. A core
enters these states only when all the threads enter either the
particular idle state or a deeper one. There are tasks like fastsleep
hardware bug workaround and hypervisor core state save which have to be
done only by the last thread of the core entering deep idle state and
similarly tasks like timebase resync, hypervisor core register restore
that have to be done only by the first thread waking up from these
state.

The current idle state management does not have a way to distinguish the
first/last thread of the core waking/entering idle states. Tasks like
timebase resync are done for all the threads. This is not only is
suboptimal, but can cause functionality issues when subcores and kvm is
involved.

This patch adds the necessary infrastructure to track idle states of
threads in a per-core structure. It uses this info to perform tasks like
fastsleep workaround and timebase resync only once per core.

Signed-off-by: Shreyas B. Prabhu &lt;shreyas@linux.vnet.ibm.com&gt;
Originally-by: Preeti U. Murthy &lt;preeti@linux.vnet.ibm.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Rafael J. Wysocki &lt;rjw@rjwysocki.net&gt;
Cc: linux-pm@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/powernv: Return to cpu offline loop when finished in KVM guest</title>
<updated>2014-12-08T02:16:31+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2014-12-03T03:48:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=56548fc0e86cb9156af7a7e1f15ba78f251dafaf'/>
<id>urn:sha1:56548fc0e86cb9156af7a7e1f15ba78f251dafaf</id>
<content type='text'>
When a secondary hardware thread has finished running a KVM guest, we
currently put that thread into nap mode using a nap instruction in
the KVM code.  This changes the code so that instead of doing a nap
instruction directly, we instead cause the call to power7_nap() that
put the thread into nap mode to return.  The reason for doing this is
to avoid having the KVM code having to know what low-power mode to
put the thread into.

In the case of a secondary thread used to run a KVM guest, the thread
will be offline from the point of view of the host kernel, and the
relevant power7_nap() call is the one in pnv_smp_cpu_disable().
In this case we don't want to clear pending IPIs in the offline loop
in that function, since that might cause us to miss the wakeup for
the next time the thread needs to run a guest.  To tell whether or
not to clear the interrupt, we use the SRR1 value returned from
power7_nap(), and check if it indicates an external interrupt.  We
arrange that the return from power7_nap() when we have finished running
a guest returns 0, so pending interrupts don't get flushed in that
case.

Note that it is important a secondary thread that has finished
executing in the guest, or that didn't have a guest to run, should
not return to power7_nap's caller while the kvm_hstate.hwthread_req
flag in the PACA is non-zero, because the return from power7_nap
will reenable the MMU, and the MMU might still be in guest context.
In this situation we spin at low priority in real mode waiting for
hwthread_req to become zero.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
</feed>
