<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm, branch v4.5.3</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v4.5.3</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v4.5.3'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2016-05-04T21:49:10+00:00</updated>
<entry>
<title>mm/hwpoison: fix wrong num_poisoned_pages accounting</title>
<updated>2016-05-04T21:49:10+00:00</updated>
<author>
<name>Minchan Kim</name>
<email>minchan@kernel.org</email>
</author>
<published>2016-04-28T23:18:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1cf34a85e4a8a268a27683387caf0372e0253b99'/>
<id>urn:sha1:1cf34a85e4a8a268a27683387caf0372e0253b99</id>
<content type='text'>
commit d7e69488bd04de165667f6bc741c1c0ec6042ab9 upstream.

Currently, the migration code increases num_poisoned_pages for a
*failed* migration page as well as for a successfully migrated one
when the migration is triggered by memory-failure.  This makes the
stat wrong.  It also marks the page PG_HWPoison even if the migration
attempt failed, which means we cannot recover the corrupted page
using the memory-failure facility.

This patch fixes it.
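
As a rough sketch of the intended behavior (condensed from the
migration path, not the literal diff), the poisoning and the counter
bump both move under the success branch:

    /* sketch, in unmap_and_move(): account only successful migrations */
    if (rc == MIGRATEPAGE_SUCCESS) {
        put_page(page);
        if (reason == MR_MEMORY_FAILURE) {
            /* mark and count only after the page really moved */
            SetPageHWPoison(page);
            num_poisoned_pages_inc();
        }
    }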

Signed-off-by: Minchan Kim &lt;minchan@kernel.org&gt;
Reported-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: vmscan: reclaim highmem zone if buffer_heads is over limit</title>
<updated>2016-05-04T21:49:10+00:00</updated>
<author>
<name>Minchan Kim</name>
<email>minchan@kernel.org</email>
</author>
<published>2016-04-28T23:18:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=93de90e5d049a0f790d0b7f4b29c9d4c50a09e59'/>
<id>urn:sha1:93de90e5d049a0f790d0b7f4b29c9d4c50a09e59</id>
<content type='text'>
commit 7bf52fb891b64b8d61caf0b82060adb9db761aec upstream.

We used to reclaim the highmem zone when buffer_heads was over the
limit, but commit 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers
from shrink_zone()") changed the behavior so that the highmem zone is
no longer reclaimed even when buffer_heads is over the limit.  This
patch restores that logic.
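
A condensed sketch of the restored behavior in shrink_zones()
(hedged; the surrounding loop body is omitted):

    /* buffer_heads pressure widens the allowed zone mask ... */
    if (buffer_heads_over_limit)
        sc-&gt;gfp_mask |= __GFP_HIGHMEM;

    /* ... and the zonelist walk must honor the widened mask so
     * that highmem zones are visited again */
    for_each_zone_zonelist_nodemask(zone, z, zonelist,
                    gfp_zone(sc-&gt;gfp_mask), sc-&gt;nodemask) {
        /* ... shrink_zone() et al. as before ... */
    }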

Fixes: 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from shrink_zone()")
Signed-off-by: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>numa: fix /proc/&lt;pid&gt;/numa_maps for THP</title>
<updated>2016-05-04T21:49:10+00:00</updated>
<author>
<name>Gerald Schaefer</name>
<email>gerald.schaefer@de.ibm.com</email>
</author>
<published>2016-04-28T23:18:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=42bbd053fcb84539dc54e450e58ffa57ed7b0ebf'/>
<id>urn:sha1:42bbd053fcb84539dc54e450e58ffa57ed7b0ebf</id>
<content type='text'>
commit 28093f9f34cedeaea0f481c58446d9dac6dd620f upstream.

In gather_pte_stats() a THP pmd is cast into a pte, which is wrong
because the layouts may differ depending on the architecture.  On s390
this will lead to inaccurate numa_maps accounting in /proc because of
misguided pte_present() and pte_dirty() checks on the fake pte.

On other architectures pte_present() and pte_dirty() may work by chance,
but there may be an issue with direct-access (dax) mappings w/o
underlying struct pages when HAVE_PTE_SPECIAL is set and THP is
available.  In vm_normal_page() the fake pte will be checked with
pte_special() and because there is no "special" bit in a pmd, this will
always return false and the VM_PFNMAP | VM_MIXEDMAP checking will be
skipped.  On dax mappings w/o struct pages, an invalid struct page
pointer would then be returned, which can crash the kernel.

This patch fixes the numa_maps THP handling by introducing new "_pmd"
variants of the can_gather_numa_stats() and vm_normal_page() functions.
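
A simplified sketch of the pmd-level sibling of vm_normal_page();
the real helper has to treat VM_MIXEDMAP with more care:

    struct page *vm_normal_page_pmd(struct vm_area_struct *vma,
                                    unsigned long addr, pmd_t pmd)
    {
        unsigned long pfn = pmd_pfn(pmd);

        /*
         * There is no pmd_special(), so VM_PFNMAP / VM_MIXEDMAP
         * mappings must be filtered out explicitly instead of via
         * a "special" bit, as the pte version does it.
         */
        if (unlikely(vma-&gt;vm_flags &amp; (VM_PFNMAP | VM_MIXEDMAP)))
            return NULL;
        if (is_zero_pfn(pfn) || unlikely(!pfn_valid(pfn)))
            return NULL;
        return pfn_to_page(pfn);  /* a page backed by a struct page */
    }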

Signed-off-by: Gerald Schaefer &lt;gerald.schaefer@de.ibm.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Jerome Marchand &lt;jmarchan@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Michael Holzheu &lt;holzheu@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/huge_memory: replace VM_NO_THP VM_BUG_ON with actual VMA check</title>
<updated>2016-05-04T21:49:10+00:00</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>koct9i@gmail.com</email>
</author>
<published>2016-04-28T23:18:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d0f803397cf872c71913bb493432e410fd51ead6'/>
<id>urn:sha1:d0f803397cf872c71913bb493432e410fd51ead6</id>
<content type='text'>
commit 3486b85a29c1741db99d0c522211c82d2b7a56d0 upstream.

Khugepaged detects the VMAs it can work on by checking vm_file and
vm_ops, but this way it cannot distinguish private /dev/zero mappings
from other special mappings like /dev/hpet, which has no vm_ops and
populates PTEs in mmap.

This patch fixes the false-positive VM_BUG_ON and prevents THP pages
from being installed where they are not expected.
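
The shape of the check is roughly the following (the helper name is
illustrative, not the upstream function):

    static bool thp_vma_is_suitable(struct vm_area_struct *vma)
    {
        if (!vma_is_anonymous(vma))      /* file-backed or has vm_ops */
            return false;
        if (vma-&gt;vm_flags &amp; VM_NO_THP)   /* special mappings opt out */
            return false;
        return true;
    }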

Link: http://lkml.kernel.org/r/CACT4Y+ZmuZMV5CjSFOeXviwQdABAgT7T+StKfTqan9YDtgEi5g@mail.gmail.com
Fixes: 78f11a255749 ("mm: thp: fix /dev/zero MAP_PRIVATE and vm_flags cleanups")
Signed-off-by: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>memcg: relocate charge moving from -&gt;attach to -&gt;post_attach</title>
<updated>2016-05-04T21:49:09+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-04-21T23:09:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1153d28d9e0c93edde896a12afd66267943823f3'/>
<id>urn:sha1:1153d28d9e0c93edde896a12afd66267943823f3</id>
<content type='text'>
commit 264a0ae164bc0e9144bebcd25ff030d067b1a878 upstream.

Hello,

So, this ended up a lot simpler than I originally expected.  I tested
it lightly and it seems to work fine.  Petr, can you please test these
two patches w/o the lru drain drop patch and see whether the problem
is gone?

Thanks.
------ 8&lt; ------
If charge moving is used, memcg performs relabeling of the affected
pages from its -&gt;attach callback, which is called with
cgroup_threadgroup_rwsem held and thus can't create new kthreads.
This is fragile, as various operations may depend on workqueues
making forward progress, which in turn relies on the ability to
create new kthreads.

There's no reason to perform charge moving from -&gt;attach, which is
deep in the task migration path.  Move it to -&gt;post_attach, which is
called after the actual migration has finished and
cgroup_threadgroup_rwsem is dropped.

* move_charge_struct-&gt;mm is added and -&gt;can_attach is now responsible
  for pinning and recording the target mm.  mem_cgroup_clear_mc() is
  updated accordingly.  This also simplifies mem_cgroup_move_task().

* mem_cgroup_move_task() is now called from -&gt;post_attach instead of
  -&gt;attach.
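
In the subsystem descriptor the change boils down to moving one
callback (a sketch of the relevant memory_cgrp_subsys fields,
assuming the -&gt;post_attach hook added by the companion cgroup
patch):

    struct cgroup_subsys memory_cgrp_subsys = {
        /* other fields omitted */
        .can_attach    = mem_cgroup_can_attach,   /* pins mc.mm now */
        .cancel_attach = mem_cgroup_cancel_attach,
        .post_attach   = mem_cgroup_move_task,    /* was .attach */
    };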

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Michal Hocko &lt;mhocko@kernel.org&gt;
Debugged-and-tested-by: Petr Mladek &lt;pmladek@suse.com&gt;
Reported-by: Cyril Hrubis &lt;chrubis@suse.cz&gt;
Reported-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Fixes: 1ed1328792ff ("sched, cgroup: replace signal_struct-&gt;group_rwsem with a global percpu_rwsem")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>slub: clean up code for kmem cgroup support to kmem_cache_free_bulk</title>
<updated>2016-05-04T21:49:09+00:00</updated>
<author>
<name>Jesper Dangaard Brouer</name>
<email>brouer@redhat.com</email>
</author>
<published>2016-03-15T21:53:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7eec7856fca7062dee2918013925feb426977f9d'/>
<id>urn:sha1:7eec7856fca7062dee2918013925feb426977f9d</id>
<content type='text'>
commit 376bf125ac781d32e202760ed7deb1ae4ed35d31 upstream.

This change is primarily an attempt to make it easier to realize the
optimizations the compiler performs in case CONFIG_MEMCG_KMEM is not
enabled.

Performance-wise, even when CONFIG_MEMCG_KMEM is compiled in, the
overhead is zero.  This is because, as long as no process has enabled
kmem cgroup accounting, the assignment is replaced by asm NOP
operations.  This is possible because memcg_kmem_enabled() uses a
static_key_false() construct.
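
The guard pattern this relies on looks roughly like the following
(illustrative; upstream the memcg_kmem_enabled() test lives inside
cache_from_obj()):

    /* with kmem accounting unused, static_key_false() patches this
     * branch out, so the lookup costs (almost) nothing */
    if (memcg_kmem_enabled())
        s = cache_from_obj(s, object);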

It also helps readability, as it avoids accessing the p[] array like
p[size - 1], which "exposes" that the array is processed backwards
inside the helper function build_detached_freelist().

Lastly, this also makes the code more robust in error cases, like
passing NULL pointers in the array.  These were handled before commit
033745189b1b ("slub: add missing kmem cgroup support to
kmem_cache_free_bulk").

Fixes: 033745189b1b ("slub: add missing kmem cgroup support to kmem_cache_free_bulk")
Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Vladimir Davydov &lt;vdavydov@virtuozzo.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: fix invalid node in alloc_migrate_target()</title>
<updated>2016-04-20T06:45:07+00:00</updated>
<author>
<name>Xishi Qiu</name>
<email>qiuxishi@huawei.com</email>
</author>
<published>2016-04-01T21:31:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=16808319c000657dc8236b25f1ef0f1847cb6872'/>
<id>urn:sha1:16808319c000657dc8236b25f1ef0f1847cb6872</id>
<content type='text'>
commit 6f25a14a7053b69917e2ebea0d31dd444cd31fd5 upstream.

It is incorrect to use next_node() to find a target node; it may
return MAX_NUMNODES or an invalid node, which leads to a crash in
buddy system allocation.
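
The wrap-around that is needed looks roughly like this (upstream
encapsulates it in the next_node_in() helper):

    /* next_node() past the last set bit returns MAX_NUMNODES,
     * so wrap around instead of indexing out of range */
    nid = next_node(nid, node_online_map);
    if (nid == MAX_NUMNODES)
        nid = first_node(node_online_map);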

Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Joonsoo Kim &lt;js1304@gmail.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: "Laura Abbott" &lt;lauraa@codeaurora.org&gt;
Cc: Hui Zhu &lt;zhuhui@xiaomi.com&gt;
Cc: Wang Xiaoqiang &lt;wangxq10@lzu.edu.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm/page_alloc: prevent merging between isolated and other pageblocks</title>
<updated>2016-04-12T14:33:48+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2016-03-25T21:21:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cbb6e048311be61c6e59d2bd413330a52eb97fab'/>
<id>urn:sha1:cbb6e048311be61c6e59d2bd413330a52eb97fab</id>
<content type='text'>
commit d9dddbf556674bf125ecd925b24e43a5cf2a568a upstream.

Hanjun Guo has reported that a CMA stress test causes broken accounting of
CMA and free pages:

&gt; Before the test, I got:
&gt; -bash-4.3# cat /proc/meminfo | grep Cma
&gt; CmaTotal:         204800 kB
&gt; CmaFree:          195044 kB
&gt;
&gt;
&gt; After running the test:
&gt; -bash-4.3# cat /proc/meminfo | grep Cma
&gt; CmaTotal:         204800 kB
&gt; CmaFree:         6602584 kB
&gt;
&gt; So the freed CMA memory is more than total..
&gt;
&gt; Also the MemFree is more than MemTotal:
&gt;
&gt; -bash-4.3# cat /proc/meminfo
&gt; MemTotal:       16342016 kB
&gt; MemFree:        22367268 kB
&gt; MemAvailable:   22370528 kB

Laura Abbott has confirmed the issue and suspected the freepage accounting
rewrite around 3.18/4.0 by Joonsoo Kim.  Joonsoo had a theory that this is
caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
pageblocks:

&gt; CMA isolates MAX_ORDER aligned blocks, but, during the process, a
&gt; partially isolated block exists.  If MAX_ORDER is 11 and
&gt; pageblock_order is 9, two pageblocks make up a MAX_ORDER aligned
&gt; block, and I can think of the following scenario because pageblock
&gt; (un)isolation would be done one by one.
&gt;
&gt; (Each character means one pageblock.  'C' and 'I' mean MIGRATE_CMA
&gt; and MIGRATE_ISOLATE, respectively.)
&gt;
&gt; CC -&gt; IC -&gt; II (Isolation)
&gt; II -&gt; CI -&gt; CC (Un-isolation)
&gt;
&gt; If some pages are freed at an intermediate state such as IC or CI,
&gt; such a page could be merged with a page resident on a different
&gt; type of pageblock, and it will cause a wrong freepage count.

This was supposed to be prevented by CMA operating on MAX_ORDER blocks,
but since it doesn't hold the zone-&gt;lock between pageblocks, a race
window does exist.

It's also likely that unexpected merging can occur between
MIGRATE_ISOLATE and non-CMA pageblocks.  This should be prevented in
__free_one_page() since commit 3c605096d315 ("mm/page_alloc: restrict
max order of merging on isolated pageblock").  However, we only check
the migratetype of the pageblock where buddy merging has been initiated,
not the migratetype of the buddy pageblock (or group of pageblocks)
which can be MIGRATE_ISOLATE.

Joonsoo has suggested checking for buddy migratetype as part of
page_is_buddy(), but that would add extra checks in allocator hotpath
and bloat-o-meter has shown significant code bloat (the function is
inline).

This patch reduces the bloat at some expense of more complicated
code.  The buddy-merging while-loop in __free_one_page() is initially
bounded to the pageblock boundary and performs no migratetype checks.
The checks are placed outside the loop, bumping max_order when
merging is allowed, and returning to the while-loop with a goto,
which can't possibly be considered harmful.
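
Schematically, the restructured merging in __free_one_page() looks
like this (condensed; the buddy lookup is a schematic helper, and
the cold-path migratetype test is simplified):

continue_merging:
    while (order &lt; max_order - 1) {
        buddy = find_buddy(page, order);       /* schematic helper */
        if (!page_is_buddy(page, buddy, order))
            goto done_merging;
        /* detach the buddy from its free list and combine */
        order++;
    }
    if (max_order &lt; MAX_ORDER) {
        /* cold path: only now inspect pageblock migratetypes */
        if (unlikely(has_isolate_pageblock(zone)) &amp;&amp;
            is_migrate_isolate(get_pageblock_migratetype(buddy)))
            goto done_merging;
        max_order++;
        goto continue_merging;
    }
done_merging:
    /* set the final order and put the page on a free list */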

This fixes the accounting bug and also removes the arguably weird state
in the original commit 3c605096d315 where buddies could be left
unmerged.

Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
Link: https://lkml.org/lkml/2016/3/2/280
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reported-by: Hanjun Guo &lt;guohanjun@huawei.com&gt;
Tested-by: Hanjun Guo &lt;guohanjun@huawei.com&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Debugged-by: Laura Abbott &lt;labbott@redhat.com&gt;
Debugged-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Cc: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: "Aneesh Kumar K.V" &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: memcontrol: reclaim and OOM kill when shrinking memory.max below usage</title>
<updated>2016-04-12T14:33:35+00:00</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2016-03-17T21:20:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=97729c37439040ec07807f0dbb71ea60ccbafd27'/>
<id>urn:sha1:97729c37439040ec07807f0dbb71ea60ccbafd27</id>
<content type='text'>
commit b6e6edcfa40561e9c8abe5eecf1c96f8e5fd9c6f upstream.

Setting the original memory.limit_in_bytes hardlimit is subject to a
race condition when the desired value is below the current usage.  The
code tries a few times to first reclaim and then see if the usage has
dropped to where we would like it to be, but there is no locking, and
the workload is free to continue making new charges up to the old limit.
Thus, attempting to shrink a workload relies on pure luck and hope that
the workload happens to cooperate.

To fix this in the cgroup2 memory.max knob, do it the other way round:
set the limit first, then try enforcement.  And if reclaim is not able
to succeed, trigger OOM kills in the group.  Keep going until the new
limit is met, we run out of OOM victims and there's only unreclaimable
memory left, or the task writing to memory.max is killed.  This allows
users to shrink groups reliably, and the behavior is consistent with
what happens when new charges are attempted in excess of memory.max.
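
Condensed, the new memory.max write path does something like this (a
sketch, not the exact upstream function; the name is illustrative):

    static int memory_max_write_sketch(struct mem_cgroup *memcg,
                                       unsigned long max)
    {
        unsigned int nr_reclaims = MEM_CGROUP_RECLAIM_RETRIES;

        xchg(&amp;memcg-&gt;memory.limit, max);  /* commit the limit first */

        for (;;) {
            unsigned long nr_pages = page_counter_read(&amp;memcg-&gt;memory);

            if (nr_pages &lt;= max)
                break;
            if (signal_pending(current))  /* the writing task was killed */
                return -EINTR;
            if (nr_reclaims) {            /* reclaim the excess a few times */
                if (!try_to_free_mem_cgroup_pages(memcg, nr_pages - max,
                                                  GFP_KERNEL, true))
                    nr_reclaims--;
                continue;
            }
            if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
                break;                    /* only unreclaimable memory left */
        }
        return 0;
    }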

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Vladimir Davydov &lt;vdavydov@virtuozzo.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: memcontrol: reclaim when shrinking memory.high below usage</title>
<updated>2016-04-12T14:33:35+00:00</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2016-03-17T21:20:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aa748be2b004bdb0c750b2ae2631be1a5da7ed08'/>
<id>urn:sha1:aa748be2b004bdb0c750b2ae2631be1a5da7ed08</id>
<content type='text'>
commit 588083bb37a3cea8533c392370a554417c8f29cb upstream.

When memory.high is set below the current usage, nothing happens
until the next charge comes along, and then it will only reclaim its
own charge and not the now potentially huge excess over the new
memory.high.  This can cause groups to stay in excess of their
memory.high indefinitely.

To fix that, when shrinking memory.high, kick off a reclaim cycle that
goes after the delta.
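
The fix is essentially (a sketch of the memory.high write path):

    /* publish the lowered high limit ... */
    memcg-&gt;high = high;

    /* ... and go after the delta right away */
    nr_pages = page_counter_read(&amp;memcg-&gt;memory);
    if (nr_pages &gt; high)
        try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
                                     GFP_KERNEL, true);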

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Vladimir Davydov &lt;vdavydov@virtuozzo.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
