<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/arch/powerpc/include/asm/tlbflush.h, branch linux-7.0.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.0.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-7.0.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2018-07-30T12:48:21+00:00</updated>
<entry>
<title>powerpc: split asm/tlbflush.h</title>
<updated>2018-07-30T12:48:21+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2018-07-05T16:25:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=405cb4024e52b137685213b377ea3f0ce3f91735'/>
<id>urn:sha1:405cb4024e52b137685213b377ea3f0ce3f91735</id>
<content type='text'>
Split asm/tlbflush.h into:
asm/nohash/tlbflush.h
asm/book3s/32/tlbflush.h
asm/book3s/64/tlbflush.h (already existing)

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/64s: Replace CONFIG_PPC_STD_MMU_64 with CONFIG_PPC_BOOK3S_64</title>
<updated>2017-11-06T05:48:14+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2017-10-19T04:08:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4e003747043d57aa75c9762fa148ef38afe68dd8'/>
<id>urn:sha1:4e003747043d57aa75c9762fa148ef38afe68dd8</id>
<content type='text'>
CONFIG_PPC_STD_MMU_64 indicates support for the "standard" powerpc MMU
on 64-bit CPUs. The "standard" MMU refers to the hash page table MMU
found in "server" processors, from IBM mainly.

Currently CONFIG_PPC_STD_MMU_64 is == CONFIG_PPC_BOOK3S_64. While it's
annoying to have two symbols that always have the same value, it's not
quite annoying enough to bother removing one.

However with the arrival of Power9, we now have the situation where
CONFIG_PPC_STD_MMU_64 is enabled, but the kernel is running using the
Radix MMU - *not* the "standard" MMU. So it is now actively confusing
to use it, because it implies that code is disabled or inactive when
the Radix MMU is in use, however that is not necessarily true.

So s/CONFIG_PPC_STD_MMU_64/CONFIG_PPC_BOOK3S_64/, and do some minor
formatting updates of some of the affected lines.

This will be a pain for backports, but c'est la vie.

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/mm: remove flush_tlb_page_nohash</title>
<updated>2016-08-01T01:15:13+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2016-07-13T09:36:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=703b41ad1a8759949adc1b641a82c2b227d18223'/>
<id>urn:sha1:703b41ad1a8759949adc1b641a82c2b227d18223</id>
<content type='text'>
This should be same as flush_tlb_page except for hash32. For hash32
I guess the existing code is wrong, because we don't seem to be
flushing tlb for Hash != 0 case at all. Fix this by switching to
calling flush_tlb_page() which does the right thing by flushing
tlb for both hash and nohash case with hash32

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/mm/radix: Add tlbflush routines</title>
<updated>2016-05-01T08:33:09+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2016-04-29T13:26:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1a472c9dba6b9646fd36717968f6a531b4441c7d'/>
<id>urn:sha1:1a472c9dba6b9646fd36717968f6a531b4441c7d</id>
<content type='text'>
Core kernel doesn't track the page size of the VA range that we are
invalidating. Hence we end up flushing TLB for the entire mm here. Later
patches will improve this.

We also don't flush page walk cache separetly instead use RIC=2 when
flushing TLB, because we do a MMU gather flush after freeing page table.

MMU_NO_CONTEXT is updated for hash.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/mm: Hash abstraction for tlbflush routines</title>
<updated>2016-05-01T08:33:08+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2016-04-29T13:26:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=676012a66f651a98808459bc8ab75661828ed96f'/>
<id>urn:sha1:676012a66f651a98808459bc8ab75661828ed96f</id>
<content type='text'>
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/mm: Move hash64 tlbflush code into a new header</title>
<updated>2016-03-03T10:19:39+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2016-03-01T07:29:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ee3b93ebfbed6279f7a329001433c75c50ddfcc9'/>
<id>urn:sha1:ee3b93ebfbed6279f7a329001433c75c50ddfcc9</id>
<content type='text'>
No code changes.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault</title>
<updated>2014-12-05T05:26:15+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2014-12-04T05:30:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aefa5688c070727b8729de1aef85cad7b9933fc7'/>
<id>urn:sha1:aefa5688c070727b8729de1aef85cad7b9933fc7</id>
<content type='text'>
upatepp can get called for a nohpte fault when we find from the linux
page table that the translation was hashed before. In that case
we are sure that there is no existing translation, hence we could
avoid doing tlbie.

We could possibly race with a parallel fault filling the TLB. But
that should be ok because updatepp is only ever relaxing permissions.
We also look at linux pte permission bits when filling hash pte
permission bits. We also hold the linux pte busy bits while
inserting/updating a hashpte entry, hence a paralle update of
linux pte is not possible. On the other hand mprotect involves
ptep_modify_prot_start which cause a hpte invalidate and not updatepp.

Performance number:
We use randbox_access_bench written by Anton.

Kernel with THP disabled and smaller hash page table size.

    86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
     2.10%  random_access_b  random_access_bench              [.] doit
     1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
     1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
     1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
     1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
     0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
     0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
     0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
     0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
     0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm

With Fix:

    27.54%  random_access_b  random_access_bench              [.] doit
    22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
     5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
     5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
     5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
     4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
     3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
     1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/mm/thp: Use tlbiel if possible</title>
<updated>2014-12-02T03:10:11+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2014-11-02T15:45:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d557b09800dab5dd6804e5b79324069abcf0be11'/>
<id>urn:sha1:d557b09800dab5dd6804e5b79324069abcf0be11</id>
<content type='text'>
If we know that user address space has never executed on other cpus
we could use tlbiel.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
</entry>
<entry>
<title>powerpc/mm/thp: Remove code duplication</title>
<updated>2014-12-02T03:10:10+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2014-11-02T15:45:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f1581bf14bc40b4a14bf10358eac0b22173b3313'/>
<id>urn:sha1:f1581bf14bc40b4a14bf10358eac0b22173b3313</id>
<content type='text'>
Rename invalidate_old_hpte to flush_hash_hugepage and use that in
other places.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
</entry>
<entry>
<title>powerpc: Replace __get_cpu_var uses</title>
<updated>2014-11-03T01:12:32+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>cl@linux.com</email>
</author>
<published>2014-10-21T20:23:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69111bac42f5ceacdd22e30947837ceb2c4493ed'/>
<id>urn:sha1:69111bac42f5ceacdd22e30947837ceb2c4493ed</id>
<content type='text'>
This still has not been merged and now powerpc is the only arch that does
not have this change. Sorry about missing linuxppc-dev before.

V2-&gt;V2
  - Fix up to work against 3.18-rc1

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &amp;__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.

This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &amp;__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&amp;y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&amp;x, this_cpu_ptr(&amp;y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)

Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
CC: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Christoph Lameter &lt;cl@linux.com&gt;
[mpe: Fix build errors caused by set/or_softirq_pending(), and rework
      assignment in __set_breakpoint() to use memcpy().]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
</feed>
