<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/gfp.h, branch v5.10.257</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.257</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.257'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-04-13T10:59:22+00:00</updated>
<entry>
<title>mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations</title>
<updated>2024-04-13T10:59:22+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2024-02-21T11:43:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=af47e6a95eb7575ce8c1ad2ce5650b44e7f46769'/>
<id>urn:sha1:af47e6a95eb7575ce8c1ad2ce5650b44e7f46769</id>
<content type='text'>
commit 803de9000f334b771afacb6ff3e78622916668b0 upstream.

Sven reports an infinite loop in __alloc_pages_slowpath() for costly order
__GFP_RETRY_MAYFAIL allocations that are also GFP_NOIO.  Such a combination
can happen in a suspend/resume context where a GFP_KERNEL allocation can
have __GFP_IO masked out via gfp_allowed_mask.

Quoting Sven:

1. try to do a "costly" allocation (order &gt; PAGE_ALLOC_COSTLY_ORDER)
   with __GFP_RETRY_MAYFAIL set.

2. page alloc's __alloc_pages_slowpath tries to get a page from the
   freelist. This fails because there is nothing free of that costly
   order.

3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim,
   which bails out because a zone is ready to be compacted; it pretends
   to have made a single page of progress.

4. page alloc tries to compact, but this always bails out early because
   __GFP_IO is not set (it's not passed by the snd allocator, and even
   if it were, we are suspending so the __GFP_IO flag would be cleared
   anyway).

5. page alloc believes reclaim progress was made (because of the
   pretense in item 3) and so it checks whether it should retry
   compaction. The compaction retry logic thinks it should try again,
   because:
    a) reclaim is needed because of the early bail-out in item 4
    b) a zonelist is suitable for compaction

6. goto 2. indefinite stall.

(end quote)

The immediate root cause is that the COMPACT_SKIPPED returned from
__alloc_pages_direct_compact() (step 4) due to lack of __GFP_IO is
misinterpreted as indicating a lack of order-0 pages, and step 5 evaluates
that in should_compact_retry() as a reason to retry, before incrementing and
limiting the number of retries.  There are however other places that
wrongly assume that compaction can happen while we lack __GFP_IO.

To fix this, introduce gfp_compaction_allowed() to abstract the __GFP_IO
evaluation and switch the open-coded test in try_to_compact_pages() to use
it.

Also use the new helper in:
- compaction_ready(), which will make reclaim not bail out in step 3, so
  there's at least one attempt to actually reclaim, even if chances are
  small for a costly order
- in_reclaim_compaction() which will make should_continue_reclaim()
  return false and we don't over-reclaim unnecessarily
- in __alloc_pages_slowpath() to set a local variable can_compact,
  which is then used to avoid retrying reclaim/compaction for costly
  allocations (step 5) if we can't compact and also to skip the early
  compaction attempt that we do in some cases
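
As an illustration only, the retry decision can be modelled in a few lines
of Python.  This is a toy sketch, not the kernel code: flags are modelled
as a set of names, and slowpath_attempts(), MAX_ITERATIONS and the
simplified control flow are assumptions made for the example.  Only
gfp_compaction_allowed() and can_compact correspond to names in this patch.

```python
# Toy model of the costly-order retry decision; not kernel code.
GFP_NOIO = frozenset({"__GFP_RECLAIM"})
GFP_KERNEL = GFP_NOIO | {"__GFP_IO", "__GFP_FS"}
MAX_ITERATIONS = 1000  # hypothetical cap so the model always terminates


def gfp_compaction_allowed(gfp_mask):
    # Compaction may need I/O, so it is only attempted with __GFP_IO.
    return "__GFP_IO" in gfp_mask


def slowpath_attempts(gfp_mask, fixed):
    """Count loop iterations for a costly allocation that never succeeds."""
    can_compact = gfp_compaction_allowed(gfp_mask)
    compaction_retries = 0
    for attempt in range(1, MAX_ITERATIONS + 1):
        if fixed and not can_compact:
            # With the fix: costly order and no compaction, so give up.
            return attempt
        if can_compact:
            # Compaction runs and fails; retries are counted and limited.
            compaction_retries += 1
            if compaction_retries == 16:  # loosely mirrors MAX_COMPACT_RETRIES
                return attempt
        # Without the fix and without __GFP_IO: compaction is skipped,
        # which is misread as "reclaim more and retry"; nothing is counted.
    return MAX_ITERATIONS
```

In this model slowpath_attempts(GFP_NOIO, fixed=False) only stops at the
artificial cap, which stands in for the indefinite stall, while
slowpath_attempts(GFP_NOIO, fixed=True) returns after the first attempt.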

Link: https://lkml.kernel.org/r/20240221114357.13655-2-vbabka@suse.cz
Fixes: 3250845d0526 ("Revert "mm, oom: prevent premature OOM killer invocation for high order request"")
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reported-by: Sven van Ashbrook &lt;svenva@chromium.org&gt;
Closes: https://lore.kernel.org/all/CAG-rBihs_xMKb3wrMO1%2B-%2Bp4fowP9oy1pa_OTkfxBzPUVOZF%2Bg@mail.gmail.com/
Tested-by: Karthikeyan Ramasubramanian &lt;kramasub@chromium.org&gt;
Cc: Brian Geffon &lt;bgeffon@google.com&gt;
Cc: Curtis Malainey &lt;cujomalainey@chromium.org&gt;
Cc: Jaroslav Kysela &lt;perex@perex.cz&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Takashi Iwai &lt;tiwai@suse.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping</title>
<updated>2020-10-15T21:43:29+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-10-15T21:43:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5a32c3413d3340f90c82c84b375ad4b335a59f28'/>
<id>urn:sha1:5a32c3413d3340f90c82c84b375ad4b335a59f28</id>
<content type='text'>
Pull dma-mapping updates from Christoph Hellwig:

 - rework the non-coherent DMA allocator

 - move private definitions out of &lt;linux/dma-mapping.h&gt;

 - lower CMA_ALIGNMENT (Paul Cercueil)

 - remove the omap1 dma address translation in favor of the common code

 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

 - support per-node DMA CMA areas (Barry Song)

 - increase the default seg boundary limit (Nicolin Chen)

 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

 - various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge &lt;linux/dma-noncoherent.h&gt; into &lt;linux/dma-map-ops.h&gt;
  dma-mapping: move large parts of &lt;linux/dma-direct.h&gt; to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove &lt;asm/dma-contiguous.h&gt;
  dma-mapping: merge &lt;linux/dma-contiguous.h&gt; into &lt;linux/dma-map-ops.h&gt;
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split &lt;linux/dma-mapping.h&gt;
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement -&gt;alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...
</content>
</entry>
<entry>
<title>mm: remove unused alloc_page_vma_node()</title>
<updated>2020-10-14T01:38:34+00:00</updated>
<author>
<name>Wei Yang</name>
<email>richard.weiyang@linux.alibaba.com</email>
</author>
<published>2020-10-13T23:57:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f8fd52535c7326d72645c9878d7897aaf44db51c'/>
<id>urn:sha1:f8fd52535c7326d72645c9878d7897aaf44db51c</id>
<content type='text'>
No one uses this macro anymore.

Also fix code style of policy_node().

Signed-off-by: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lkml.kernel.org/r/20200921021401.84508-1-richard.weiyang@linux.alibaba.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>include/linux/gfp.h: clarify usage of GFP_ATOMIC in !preemptible contexts</title>
<updated>2020-10-14T01:38:33+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2020-10-13T23:56:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ab00db216c9c78cc0a68bc4e27889c1ee374598d'/>
<id>urn:sha1:ab00db216c9c78cc0a68bc4e27889c1ee374598d</id>
<content type='text'>
There is a general understanding that GFP_ATOMIC/GFP_NOWAIT are to be used
from atomic contexts, e.g. from within a spin lock or from the IRQ
context.  This is correct but there are some atomic contexts where the
above doesn't hold.  One of them would be an NMI context.  Page allocator
has never supported that and the general fear of this context has kept
anybody from even trying to use the allocator there.  Good, but let's
be more specific about that.

Another such context, and that is where people seem to be more daring,
is raw_spin_lock.  Mostly because it simply resembles a regular spin lock,
which is supported by the allocator, and there is not any implementation
difference with !RT kernels in the first place.  Be explicit that such a
context is not supported by the allocator.  The underlying reason is that
zone-&gt;lock would have to become raw_spin_lock as well and that has turned
out to be a problem for RT
(http://lkml.kernel.org/r/87mu305c1w.fsf@nanos.tec.linutronix.de).

Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Uladzislau Rezki &lt;urezki@gmail.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Link: https://lkml.kernel.org/r/20200929123010.5137-1-mhocko@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: turn alloc_pages into an inline function</title>
<updated>2020-09-25T04:20:40+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-08-17T17:17:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=43ee5b6daa6c45246098493dab2c229d196c9cf6'/>
<id>urn:sha1:43ee5b6daa6c45246098493dab2c229d196c9cf6</id>
<content type='text'>
To prevent a compiler error when a method called alloc_pages is
added (which I plan to do for the dma_map_ops).

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>mm: rename gfpflags_to_migratetype to gfp_migratetype for same convention</title>
<updated>2020-06-04T03:09:45+00:00</updated>
<author>
<name>Wei Yang</name>
<email>richard.weiyang@gmail.com</email>
</author>
<published>2020-06-03T22:59:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=01c0bfe061f309b848d51619f20495ee2acd7727'/>
<id>urn:sha1:01c0bfe061f309b848d51619f20495ee2acd7727</id>
<content type='text'>
Pageblock migrate type is encoded in GFP flags, just as zone_type and
zonelist.

Currently we use gfp_zone() and gfp_zonelist() to extract related
information; it would be proper to use the same naming convention for
migrate type.

Signed-off-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Link: http://lkml.kernel.org/r/20200329080823.7735-1-richard.weiyang@gmail.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: clarify __GFP_MEMALLOC usage</title>
<updated>2020-06-04T03:09:43+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2020-06-03T22:56:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=574c1ae66c12410a08aeef8474936baa50e0371d'/>
<id>urn:sha1:574c1ae66c12410a08aeef8474936baa50e0371d</id>
<content type='text'>
It seems that the existing documentation is not explicit enough about the
expected usage and potential risks.  While it calls out that users have to
free memory when using this flag, it is not really apparent that users have
to be careful not to deplete memory reserves and that they should implement
some sort of throttling with respect to the freeing process.

This is partly based on Neil's explanation [1].

Let's also call out that a pre-allocated pool allocator should be
considered.

[1] http://lkml.kernel.org/r/877dz0yxoa.fsf@notabene.neil.brown.name

[akpm@linux-foundation.org: coding style fixes]
[mhocko@kernel.org: update]
  Link: http://lkml.kernel.org/r/20200406070137.GC19426@dhcp22.suse.cz
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Link: http://lkml.kernel.org/r/20200403083543.11552-2-mhocko@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: make it clear that gfp reclaim modifiers are valid only for sleepable allocations</title>
<updated>2020-04-07T17:43:38+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2020-04-07T03:04:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=29fd1897070125ab49634524a20f146fb4240a51'/>
<id>urn:sha1:29fd1897070125ab49634524a20f146fb4240a51</id>
<content type='text'>
While it might be really clear to MM developers that gfp reclaim modifiers
are applicable only to sleepable allocations (those with
__GFP_DIRECT_RECLAIM) it seems that actual users of the API are not always
sure.  Make it explicit that they are not applicable for GFP_NOWAIT or
GFP_ATOMIC allocations which are the most commonly used non-sleepable
allocation masks.
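
A minimal Python sketch of the rule, for illustration only: flags are
modelled as a set of names rather than the kernel's bit mask, and
ignored_modifiers() is a hypothetical helper, not a kernel function.  The
three retry modifiers checked here are chosen for the example.

```python
# Toy model: GFP masks as sets of flag names; not kernel code.
GFP_NOWAIT = frozenset({"__GFP_KSWAPD_RECLAIM"})
GFP_ATOMIC = frozenset({"__GFP_HIGH", "__GFP_ATOMIC", "__GFP_KSWAPD_RECLAIM"})
GFP_KERNEL = frozenset(
    {"__GFP_DIRECT_RECLAIM", "__GFP_KSWAPD_RECLAIM", "__GFP_IO", "__GFP_FS"}
)
# Retry-policy modifiers picked for the example.
RETRY_MODIFIERS = frozenset(
    {"__GFP_NORETRY", "__GFP_RETRY_MAYFAIL", "__GFP_NOFAIL"}
)


def ignored_modifiers(gfp_mask):
    """Return the retry modifiers that have no effect for this mask."""
    if "__GFP_DIRECT_RECLAIM" in gfp_mask:
        return frozenset()  # sleepable: the modifiers are honoured
    return gfp_mask.intersection(RETRY_MODIFIERS)
```

In this model, adding __GFP_NORETRY to GFP_NOWAIT is reported as ignored,
while the same modifier added to GFP_KERNEL is honoured and nothing is
reported.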

Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Acked-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Link: http://lkml.kernel.org/r/20200403083543.11552-3-mhocko@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/gup/writeback: add callbacks for inaccessible pages</title>
<updated>2020-04-02T16:35:27+00:00</updated>
<author>
<name>Claudio Imbrenda</name>
<email>imbrenda@linux.ibm.com</email>
</author>
<published>2020-04-02T04:05:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f28d43636d6f940e60abef4f0131119836c8ebd4'/>
<id>urn:sha1:f28d43636d6f940e60abef4f0131119836c8ebd4</id>
<content type='text'>
With the introduction of protected KVM guests on s390 there is now a
concept of inaccessible pages.  These pages need to be made accessible
before the host can access them.

While CPU accesses will trigger a fault that can be resolved, I/O accesses
will just fail.  We need to add a callback into architecture code for
places that will do I/O, namely when writeback is started or when a page
reference is taken.

This is not only to enable paging, file backing etc.; it is also necessary
to protect the host against a malicious user space.  For example a bad
QEMU could simply start direct I/O on such protected memory.  We do not
want userspace to be able to trigger I/O errors and thus the logic is
"whenever somebody accesses that page (gup) or does I/O, make sure that
this page can be accessed".  When the guest tries to access that page we
will wait in the page fault handler for writeback to have finished and for
the page_ref to be the expected value.

On s390x the function is not supposed to fail, so it is ok to use a
WARN_ON on failure.  If we ever need some more fine-grained handling we can
tackle this when we know the details.

Signed-off-by: Claudio Imbrenda &lt;imbrenda@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Reviewed-by: John Hubbard &lt;jhubbard@nvidia.com&gt;
Acked-by: Will Deacon &lt;will@kernel.org&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jérôme Glisse &lt;jglisse@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Link: http://lkml.kernel.org/r/20200306132537.783769-3-imbrenda@linux.ibm.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_alloc: add alloc_contig_pages()</title>
<updated>2019-12-01T20:59:06+00:00</updated>
<author>
<name>Anshuman Khandual</name>
<email>anshuman.khandual@arm.com</email>
</author>
<published>2019-12-01T01:55:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5e27a2df03b8933aa7c1579816ecb6a071bb0e0d'/>
<id>urn:sha1:5e27a2df03b8933aa7c1579816ecb6a071bb0e0d</id>
<content type='text'>
HugeTLB helper alloc_gigantic_page() implements a fairly generic
allocation method where it scans over various zones looking for a large
contiguous pfn range before trying to allocate it with
alloc_contig_range().

Other than deriving the requested order from 'struct hstate', there is
nothing HugeTLB specific in there.  This can be made available for
general use to allocate contiguous memory which could not have been
allocated through the buddy allocator.

alloc_gigantic_page() has been split, carving out the actual allocation
method, which is then made available via the new alloc_contig_pages() helper
wrapped under CONFIG_CONTIG_ALLOC.  All references to 'gigantic' have
been replaced with the more generic term 'contig'.  Allocated pages here
should be freed with free_contig_range() or by calling __free_page() on
each allocated page.

Link: http://lkml.kernel.org/r/1571300646-32240-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Pavel Tatashin &lt;pavel.tatashin@microsoft.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
