<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/asm-generic/pgtable.h, branch v3.8-rc4</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.8-rc4</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.8-rc4'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2012-12-16T23:18:08+00:00</updated>
<entry>
<title>Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma</title>
<updated>2012-12-16T23:18:08+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-16T22:33:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3d59eebc5e137bd89c6351e4c70e90ba1d0dc234'/>
<id>urn:sha1:3d59eebc5e137bd89c6351e4c70e90ba1d0dc234</id>
<content type='text'>
Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling.  In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag.  In my own tests, balancenuma does
  reasonably well.  It's dumb as rocks and does not regress against
  mainline.  On the other hand, Ingo's tests show that balancenuma is
  incapable of converging for the workloads driven by perf, which is bad
  but is potentially explained by the lack of scheduler smarts.  Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma.  Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses.  Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers, which is not reported by
  the tool by default and is sometimes missed in reports.  Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions but currently it's unclear what the source of
  this problem is.  Initially I thought it was in how numacore
  batch-handles PTEs, but I no longer think this is the case.  It's possible
  numacore is just able to trigger it due to higher rates of migration.

  These reports were quite late in the cycle, so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task&lt;-&gt;node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...
</content>
</entry>
<entry>
<title>asm-generic, mm: pgtable: consolidate zero page helpers</title>
<updated>2012-12-13T01:38:35+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2012-12-12T21:52:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=816422ad76474fed8052b6f7b905a054d082e59a'/>
<id>urn:sha1:816422ad76474fed8052b6f7b905a054d082e59a</id>
<content type='text'>
We have two different implementations of is_zero_pfn() and my_zero_pfn()
helpers: for architectures with and without zero page coloring.

Let's consolidate them in &lt;asm-generic/pgtable.h&gt;.
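
For illustration, roughly what the consolidated helpers look like after
the move (a sketch based on the description above, not the literal diff;
architectures that do zero page coloring define __HAVE_COLOR_ZERO_PAGE):

#ifdef __HAVE_COLOR_ZERO_PAGE
static inline int is_zero_pfn(unsigned long pfn)
{
        extern unsigned long zero_pfn;
        unsigned long offset_from_zero_pfn = pfn - zero_pfn;
        return offset_from_zero_pfn &lt;= (zero_page_mask &gt;&gt; PAGE_SHIFT);
}

#define my_zero_pfn(addr)       page_to_pfn(ZERO_PAGE(addr))
#else
static inline int is_zero_pfn(unsigned long pfn)
{
        extern unsigned long zero_pfn;
        return pfn == zero_pfn;
}

static inline unsigned long my_zero_pfn(unsigned long addr)
{
        extern unsigned long zero_pfn;
        return zero_pfn;
}
#endif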

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: numa: pte_numa() and pmd_numa()</title>
<updated>2012-12-11T14:42:36+00:00</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2012-10-03T23:50:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=be3a728427a605990a7a0b6dbf9e29b68e266146'/>
<id>urn:sha1:be3a728427a605990a7a0b6dbf9e29b68e266146</id>
<content type='text'>
Implement pte_numa and pmd_numa.

We must atomically set the numa bit and clear the present bit to
define a pte_numa or pmd_numa.

Once a pte or pmd has been set as pte_numa or pmd_numa, the next time
a thread touches a virtual address in the corresponding virtual range,
a NUMA hinting page fault will trigger. The NUMA hinting page fault
will clear the NUMA bit and set the present bit again to resolve the
page fault.

The expectation is that a NUMA hinting page fault is used as part
of a placement policy that decides if a page should remain on the
current node or be migrated to a different node.
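
As a sketch of the x86 flavour (reconstructed from the description, not
the literal patch; it assumes _PAGE_NUMA occupies an otherwise unused
software bit, with pmd_numa()/pmd_mknuma() built the same way):

static inline int pte_numa(pte_t pte)
{
        /* numa bit set and present bit clear identifies a pte_numa */
        return (pte_flags(pte) &amp; (_PAGE_NUMA|_PAGE_PRESENT)) == _PAGE_NUMA;
}

static inline pte_t pte_mknuma(pte_t pte)
{
        /* both bits change in one pte value, installed atomically */
        pte = pte_set_flags(pte, _PAGE_NUMA);
        return pte_clear_flags(pte, _PAGE_PRESENT);
}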

Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
</content>
</entry>
<entry>
<title>x86/mm: Introduce pte_accessible()</title>
<updated>2012-12-11T14:28:34+00:00</updated>
<author>
<name>Rik van Riel</name>
<email>riel@redhat.com</email>
</author>
<published>2012-10-09T13:31:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2c3cf556b2f7ab5823195766eddec6f1a43134f5'/>
<id>urn:sha1:2c3cf556b2f7ab5823195766eddec6f1a43134f5</id>
<content type='text'>
We need pte_present to return true for _PAGE_PROTNONE pages, to indicate that
the pte is associated with a page.

However, for TLB flushing purposes, we would like to know whether the pte
points to an actually accessible page.  This allows us to skip remote TLB
flushes for pages that are not actually accessible.

Fill in this method for x86 and provide a safe (but slower) method
on other architectures.
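
A sketch of the two variants (illustrative, not the exact diff): on x86
a pte is only accessible when _PAGE_PRESENT is set, while the generic
fallback conservatively treats every pte as accessible:

/* x86 */
#define __HAVE_ARCH_PTE_ACCESSIBLE
static inline int pte_accessible(pte_t a)
{
        return pte_flags(a) &amp; _PAGE_PRESENT;
}

/* asm-generic fallback: safe but never skips a flush */
#ifndef pte_accessible
# define pte_accessible(pte)            ((void)(pte), 1)
#endif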

Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Fixed-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-66p11te4uj23gevgh4j987ip@git.kernel.org
[ Added Linus's review fixes. ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm: thp: fix the pmd_clear() arguments in pmdp_get_and_clear()</title>
<updated>2012-10-09T07:22:53+00:00</updated>
<author>
<name>Catalin Marinas</name>
<email>catalin.marinas@arm.com</email>
</author>
<published>2012-10-08T23:32:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2d28a2275c21d04290cfba1555c89a806d2b7706'/>
<id>urn:sha1:2d28a2275c21d04290cfba1555c89a806d2b7706</id>
<content type='text'>
The CONFIG_TRANSPARENT_HUGEPAGE implementation of pmdp_get_and_clear()
calls pmd_clear() with 3 arguments instead of 1.

This only affects the !__HAVE_ARCH_PMDP_GET_AND_CLEAR case, which does
not seem to be hit in practice because x86 defines its own version and
uses pmd_update.
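
A sketch of the generic helper with the call corrected (illustrative,
not the exact diff):

#ifndef __HAVE_ARCH_PMDP_GET_AND_CLEAR
static inline pmd_t pmdp_get_and_clear(struct mm_struct *mm,
                                       unsigned long address, pmd_t *pmdp)
{
        pmd_t pmd = *pmdp;
        pmd_clear(pmdp);        /* was: pmd_clear(mm, address, pmdp) */
        return pmd;
}
#endif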

[mhocko@suse.cz: changelog addition]
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Steve Capper &lt;steve.capper@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Reviewed-by: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Gerald Schaefer &lt;gerald.schaefer@de.ibm.com&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>thp: introduce pmdp_invalidate()</title>
<updated>2012-10-09T07:22:29+00:00</updated>
<author>
<name>Gerald Schaefer</name>
<email>gerald.schaefer@de.ibm.com</email>
</author>
<published>2012-10-08T23:30:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=46dcde735c9d8953bbd8d105ca6779e5b5300c28'/>
<id>urn:sha1:46dcde735c9d8953bbd8d105ca6779e5b5300c28</id>
<content type='text'>
On s390, a valid page table entry must not be changed while it is attached
to any CPU.  So instead of pmd_mknotpresent() and set_pmd_at(), an IDTE
operation would be necessary there.  This patch introduces the
pmdp_invalidate() function, to allow architecture-specific
implementations.
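
A sketch of the generic implementation that architectures like s390 can
override (reconstructed from the description, not the literal patch):

void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
                     pmd_t *pmdp)
{
        /* generic path: mark not-present and flush; s390 uses IDTE */
        set_pmd_at(vma-&gt;vm_mm, address, pmdp, pmd_mknotpresent(*pmdp));
        flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
}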

Signed-off-by: Gerald Schaefer &lt;gerald.schaefer@de.ibm.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Hillf Danton &lt;dhillf@gmail.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>thp: remove assumptions on pgtable_t type</title>
<updated>2012-10-09T07:22:29+00:00</updated>
<author>
<name>Gerald Schaefer</name>
<email>gerald.schaefer@de.ibm.com</email>
</author>
<published>2012-10-08T23:30:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e3ebcf64381188a2744a9829a4eb5c2b60f1974c'/>
<id>urn:sha1:e3ebcf64381188a2744a9829a4eb5c2b60f1974c</id>
<content type='text'>
The thp page table pre-allocation code currently assumes that pgtable_t is
of type "struct page *".  This may not be true for all architectures, so
this patch removes that assumption by replacing the functions
prepare_pmd_huge_pte() and get_pmd_huge_pte() with two new functions
that can be defined per architecture.

It also removes two VM_BUG_ON checks for page_count() and page_mapcount()
operating on a pgtable_t.  Apart from the VM_BUG_ON removal, there will be
no functional change introduced by this patch.
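
A sketch of the replacement interface, which deals only in the opaque
pgtable_t type (the deposit/withdraw names come from this series;
signatures are illustrative):

/* deposit a pre-allocated page table for a later huge pmd split */
extern void pgtable_trans_huge_deposit(struct mm_struct *mm,
                                       pgtable_t pgtable);
/* withdraw the deposited page table when it is needed */
extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm);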

Signed-off-by: Gerald Schaefer &lt;gerald.schaefer@de.ibm.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Hillf Danton &lt;dhillf@gmail.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, x86, pat: rework linear pfn-mmap tracking</title>
<updated>2012-10-09T07:22:16+00:00</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@openvz.org</email>
</author>
<published>2012-10-08T23:28:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b3b9c2932c32e0692018ed5f12f3fd8c70eea8ce'/>
<id>urn:sha1:b3b9c2932c32e0692018ed5f12f3fd8c70eea8ce</id>
<content type='text'>
Replace the generic vma-flag VM_PFN_AT_MMAP with x86-only VM_PAT.

We can toss mapping address from remap_pfn_range() into
track_pfn_vma_new(), and collect all PAT-related logic together in
arch/x86/.

This patch also restores the original frustration-free is_cow_mapping()
check in remap_pfn_range(), as it was before commit v2.6.28-rc8-88-g3c8bb73
("x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3").

The is_linear_pfn_mapping() checks can be removed from mm/huge_memory.c,
because this is already handled by VM_PFNMAP in the VM_NO_THP bit-mask.
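
For reference, the restored check boils down to something like this
(illustrative sketch, not the exact diff):

static inline int is_cow_mapping(vm_flags_t flags)
{
        /* private mapping that may be written: COW semantics apply */
        return (flags &amp; (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* in remap_pfn_range(): */
if (is_cow_mapping(vma-&gt;vm_flags)) {
        /* pfn tracking only works if the whole vma is remapped */
        if (addr != vma-&gt;vm_start || end != vma-&gt;vm_end)
                return -EINVAL;
        vma-&gt;vm_pgoff = pfn;
}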

[suresh.b.siddha@intel.com: Reset the VM_PAT flag as part of untrack_pfn_vma()]
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Signed-off-by: Suresh Siddha &lt;suresh.b.siddha@intel.com&gt;
Cc: Venkatesh Pallipadi &lt;venki@google.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Carsten Otte &lt;cotte@de.ibm.com&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: Jason Baron &lt;jbaron@redhat.com&gt;
Cc: Kentaro Takeda &lt;takedakn@nttdata.co.jp&gt;
Cc: Matt Helsley &lt;matthltc@us.ibm.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: Venkatesh Pallipadi &lt;venki@google.com&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86, pat: separate the pfn attribute tracking for remap_pfn_range and vm_insert_pfn</title>
<updated>2012-10-09T07:22:16+00:00</updated>
<author>
<name>Suresh Siddha</name>
<email>suresh.b.siddha@intel.com</email>
</author>
<published>2012-10-08T23:28:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5180da410db6369d1f95c9014da1c9bc33fb043e'/>
<id>urn:sha1:5180da410db6369d1f95c9014da1c9bc33fb043e</id>
<content type='text'>
With PAT enabled, vm_insert_pfn() looks up the existing pfn memory
attribute and uses it.  The expectation is that the driver reserves the
memory attributes for the pfn before calling vm_insert_pfn().

remap_pfn_range() (when called for the whole vma) will setup a new
attribute (based on the prot argument) for the specified pfn range.
This addresses the legacy usage which typically calls remap_pfn_range()
with a desired memory attribute.  For ranges smaller than the vma size
(which is typically not the case), remap_pfn_range() will use the
existing memory attribute for the pfn range.

Expose two different APIs for these different behaviors:
track_pfn_insert() for tracking the pfn attribute set by vm_insert_pfn(),
and track_pfn_remap() for remap_pfn_range().
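
A sketch of the resulting interface (signatures illustrative,
reconstructed from the description above):

/* remap_pfn_range(): reserve a new attribute for the given range */
extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
                           unsigned long pfn, unsigned long size);
/* vm_insert_pfn(): look up and reuse the attribute already reserved */
extern int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
                            unsigned long pfn);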

This cleanup also prepares the ground for the track/untrack pfn vma
routines to take over the ownership of setting PAT specific vm_flag in
the 'vma'.

[khlebnikov@openvz.org: Clear checks in track_pfn_remap()]
[akpm@linux-foundation.org: tweak a few comments]
Signed-off-by: Suresh Siddha &lt;suresh.b.siddha@intel.com&gt;
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Cc: Venkatesh Pallipadi &lt;venki@google.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Nick Piggin &lt;npiggin@kernel.dk&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Carsten Otte &lt;cotte@de.ibm.com&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: Jason Baron &lt;jbaron@redhat.com&gt;
Cc: Kentaro Takeda &lt;takedakn@nttdata.co.jp&gt;
Cc: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Cc: Matt Helsley &lt;matthltc@us.ibm.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Robert Richter &lt;robert.richter@amd.com&gt;
Cc: Suresh Siddha &lt;suresh.b.siddha@intel.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE</title>
<updated>2012-06-20T21:39:35+00:00</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2012-06-20T19:52:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e4eed03fd06578571c01d4f1478c874bb432c815'/>
<id>urn:sha1:e4eed03fd06578571c01d4f1478c874bb432c815</id>
<content type='text'>
In the x86 32bit PAE CONFIG_TRANSPARENT_HUGEPAGE=y case, while holding
the mmap_sem for reading, cmpxchg8b cannot be used to read pmd contents
under Xen.

So instead of dealing only with "consistent" pmdvals in
pmd_none_or_trans_huge_or_clear_bad() (which would be conceptually
simpler) we let pmd_none_or_trans_huge_or_clear_bad() deal with pmdvals
where the low 32bit and high 32bit could be inconsistent (to avoid having
to use cmpxchg8b).

The only guarantee we get from pmd_read_atomic is that if the low part of
the pmd was found null, the high part will be null too (so the pmd will be
considered unstable).  And if the low part of the pmd is found "stable"
later, then it means the whole pmd was read atomically (because after a
pmd is stable, neither MADV_DONTNEED nor page faults can alter it anymore,
and we read the high part after the low part).

In the 32bit PAE x86 case, it is enough to read the low part of the pmdval
atomically to declare the pmd as "stable", and that is true both with and
without THP; furthermore, in the THP case we also have a barrier() that
will prevent any inconsistent pmdvals from being cached by a later re-read
of the *pmd.
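
A sketch of the resulting 32bit PAE pmd_read_atomic() (reconstructed
from the rules above, not the literal patch):

static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
{
        pmdval_t ret;
        u32 *tmp = (u32 *)pmdp;

        ret = (pmdval_t)(*tmp);         /* low 32 bits first */
        if (ret) {
                /*
                 * The low part was found non-null, i.e. "stable", so
                 * the high part may be read as well; had the low part
                 * been null, reading the high part could yield a
                 * partial, inconsistent pmd.
                 */
                smp_rmb();
                ret |= ((pmdval_t)*(tmp + 1)) &lt;&lt; 32;
        }
        return (pmd_t) { ret };
}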

Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Jonathan Nieder &lt;jrnieder@gmail.com&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Petr Matousek &lt;pmatouse@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Jan Beulich &lt;jbeulich@suse.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@gmail.com&gt;
Tested-by: Andrew Jones &lt;drjones@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
