<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/mm/page-writeback.c, branch v6.1.168</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-06-27T10:07:30+00:00</updated>
<entry>
<title>mm: fix ratelimit_pages update error in dirty_ratio_handler()</title>
<updated>2025-06-27T10:07:30+00:00</updated>
<author>
<name>Jinliang Zheng</name>
<email>alexjlzheng@tencent.com</email>
</author>
<published>2025-04-15T09:02:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e6ccc1a5115b7c605be2530730555304f8c16fc1'/>
<id>urn:sha1:e6ccc1a5115b7c605be2530730555304f8c16fc1</id>
<content type='text'>
commit f83f362d40ccceb647f7d80eb92206733d76a36b upstream.

In dirty_ratio_handler(), vm_dirty_bytes must be set to zero before
calling writeback_set_ratelimit(), as global_dirty_limits() always
prioritizes the value of vm_dirty_bytes.

It's domain_dirty_limits() that's relevant here, not node_dirty_ok:

  dirty_ratio_handler
    writeback_set_ratelimit
      global_dirty_limits(&amp;dirty_thresh)           &lt;- ratelimit_pages based on dirty_thresh
        domain_dirty_limits
          if (bytes)                               &lt;- bytes = vm_dirty_bytes &lt;--------+
            thresh = f1(bytes)                     &lt;- prioritizes vm_dirty_bytes      |
          else                                                                        |
            thresh = f2(ratio)                                                        |
      ratelimit_pages = f3(dirty_thresh)                                              |
    vm_dirty_bytes = 0                             &lt;- it's late! ---------------------+

This causes ratelimit_pages to still use the value calculated based on
vm_dirty_bytes, which is wrong now.


The impact visible to userspace is difficult to capture directly because
there is no procfs/sysfs interface exported to user space.  However, it
will have a real impact on the balance of dirty pages.

For example:

1. On default, we have vm_dirty_ratio=40, vm_dirty_bytes=0

2. echo 8192 &gt; dirty_bytes, then vm_dirty_bytes=8192,
   vm_dirty_ratio=0, and ratelimit_pages is calculated based on
   vm_dirty_bytes now.

3. echo 20 &gt; dirty_ratio, then since vm_dirty_bytes is not reset to
   zero when writeback_set_ratelimit() -&gt; global_dirty_limits() -&gt;
   domain_dirty_limits() is called, reallimit_pages is still calculated
   based on vm_dirty_bytes instead of vm_dirty_ratio.  This does not
   conform to the actual intent of the user.

Link: https://lkml.kernel.org/r/20250415090232.7544-1-alexjlzheng@tencent.com
Fixes: 9d823e8f6b1b ("writeback: per task dirty rate limit")
Signed-off-by: Jinliang Zheng &lt;alexjlzheng@tencent.com&gt;
Reviewed-by: MengEn Sun &lt;mengensun@tencent.com&gt;
Cc: Andrea Righi &lt;andrea@betterlinux.com&gt;
Cc: Fenggaung Wu &lt;fengguang.wu@intel.com&gt;
Cc: Jinliang Zheng &lt;alexjlzheng@tencent.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again"</title>
<updated>2024-07-11T10:47:14+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2024-06-21T14:42:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cbbe17a324437c0ff99881a3ee453da45b228a00'/>
<id>urn:sha1:cbbe17a324437c0ff99881a3ee453da45b228a00</id>
<content type='text'>
commit 30139c702048f1097342a31302cbd3d478f50c63 upstream.

Patch series "mm: Avoid possible overflows in dirty throttling".

Dirty throttling logic assumes dirty limits in page units fit into
32-bits.  This patch series makes sure this is true (see patch 2/2 for
more details).


This patch (of 2):

This reverts commit 9319b647902cbd5cc884ac08a8a6d54ce111fc78.

The commit is broken in several ways.  Firstly, the removed (u64) cast
from the multiplication will introduce a multiplication overflow on 32-bit
archs if wb_thresh * bg_thresh &gt;= 1&lt;&lt;32 (which is actually common - the
default settings with 4GB of RAM will trigger this).  Secondly, the
div64_u64() is unnecessarily expensive on 32-bit archs.  We have
div64_ul() in case we want to be safe &amp; cheap.  Thirdly, if dirty
thresholds are larger than 1&lt;&lt;32 pages, then dirty balancing is going to
blow up in many other spectacular ways anyway so trying to fix one
possible overflow is just moot.

Link: https://lkml.kernel.org/r/20240621144017.30993-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240621144246.11148-1-jack@suse.cz
Fixes: 9319b647902c ("mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again")
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-By: Zach O'Keefe &lt;zokeefe@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm: avoid overflows in dirty throttling logic</title>
<updated>2024-07-11T10:47:13+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2024-06-21T14:42:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c83ed422c24f0d4b264f89291d4fabe285f80dbc'/>
<id>urn:sha1:c83ed422c24f0d4b264f89291d4fabe285f80dbc</id>
<content type='text'>
commit 385d838df280eba6c8680f9777bfa0d0bfe7e8b2 upstream.

The dirty throttling logic is interspersed with assumptions that dirty
limits in PAGE_SIZE units fit into 32-bit (so that various multiplications
fit into 64-bits).  If limits end up being larger, we will hit overflows,
possible divisions by 0 etc.  Fix these problems by never allowing so
large dirty limits as they have dubious practical value anyway.  For
dirty_bytes / dirty_background_bytes interfaces we can just refuse to set
so large limits.  For dirty_ratio / dirty_background_ratio it isn't so
simple as the dirty limit is computed from the amount of available memory
which can change due to memory hotplug etc.  So when converting dirty
limits from ratios to numbers of pages, we just don't allow the result to
exceed UINT_MAX.

This is root-only triggerable problem which occurs when the operator
sets dirty limits to &gt;16 TB.

Link: https://lkml.kernel.org/r/20240621144246.11148-2-jack@suse.cz
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Reported-by: Zach O'Keefe &lt;zokeefe@google.com&gt;
Reviewed-By: Zach O'Keefe &lt;zokeefe@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again</title>
<updated>2024-02-23T08:12:32+00:00</updated>
<author>
<name>Zach O'Keefe</name>
<email>zokeefe@google.com</email>
</author>
<published>2024-01-18T18:19:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=16b1025eaa8fc223ab4273ece20d1c3a4211a95d'/>
<id>urn:sha1:16b1025eaa8fc223ab4273ece20d1c3a4211a95d</id>
<content type='text'>
commit 9319b647902cbd5cc884ac08a8a6d54ce111fc78 upstream.

(struct dirty_throttle_control *)-&gt;thresh is an unsigned long, but is
passed as the u32 divisor argument to div_u64().  On architectures where
unsigned long is 64 bytes, the argument will be implicitly truncated.

Use div64_u64() instead of div_u64() so that the value used in the "is
this a safe division" check is the same as the divisor.

Also, remove redundant cast of the numerator to u64, as that should happen
implicitly.

This would be difficult to exploit in memcg domain, given the ratio-based
arithmetic domain_drity_limits() uses, but is much easier in global
writeback domain with a BDI_CAP_STRICTLIMIT-backing device, using e.g.
vm.dirty_bytes=(1&lt;&lt;32)*PAGE_SIZE so that dtc-&gt;thresh == (1&lt;&lt;32)

Link: https://lkml.kernel.org/r/20240118181954.1415197-1-zokeefe@google.com
Fixes: f6789593d5ce ("mm/page-writeback.c: fix divide by zero in bdi_dirty_limits()")
Signed-off-by: Zach O'Keefe &lt;zokeefe@google.com&gt;
Cc: Maxim Patlasov &lt;MPatlasov@parallels.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>filemap: add a per-mapping stable writes flag</title>
<updated>2024-01-10T16:10:32+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2023-10-25T14:10:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a8e4300ae58dae7f181d7daa9034f127a33f217a'/>
<id>urn:sha1:a8e4300ae58dae7f181d7daa9034f127a33f217a</id>
<content type='text'>
[ Upstream commit 762321dab9a72760bf9aec48362f932717c9424d ]

folio_wait_stable waits for writeback to finish before modifying the
contents of a folio again, e.g. to support check summing of the data
in the block integrity code.

Currently this behavior is controlled by the SB_I_STABLE_WRITES flag
on the super_block, which means it is uniform for the entire file system.
This is wrong for the block device pseudofs which is shared by all
block devices, or file systems that can use multiple devices like XFS
witht the RT subvolume or btrfs (although btrfs currently reimplements
folio_wait_stable anyway).

Add a per-address_space AS_STABLE_WRITES flag to control the behavior
in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
to initialize AS_STABLE_WRITES to the existing default which covers
most cases.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20231025141020.192413-2-hch@lst.de
Tested-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm: export balance_dirty_pages_ratelimited_flags()</title>
<updated>2022-09-26T10:28:07+00:00</updated>
<author>
<name>Stefan Roesch</name>
<email>shr@fb.com</email>
</author>
<published>2022-09-12T19:27:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=611df5d6616d80a22906c352ccd80c395982fbd9'/>
<id>urn:sha1:611df5d6616d80a22906c352ccd80c395982fbd9</id>
<content type='text'>
Export the function balance_dirty_pages_ratelimited_flags(). It is now
also called from btrfs.

Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>writeback: avoid use-after-free after removing device</title>
<updated>2022-08-28T21:02:43+00:00</updated>
<author>
<name>Khazhismel Kumykov</name>
<email>khazhy@chromium.org</email>
</author>
<published>2022-08-01T15:50:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f87904c075515f3e1d8f4a7115869d3b914674fd'/>
<id>urn:sha1:f87904c075515f3e1d8f4a7115869d3b914674fd</id>
<content type='text'>
When a disk is removed, bdi_unregister gets called to stop further
writeback and wait for associated delayed work to complete.  However,
wb_inode_writeback_end() may schedule bandwidth estimation dwork after
this has completed, which can result in the timer attempting to access the
just freed bdi_writeback.

Fix this by checking if the bdi_writeback is alive, similar to when
scheduling writeback work.

Since this requires wb-&gt;work_lock, and wb_inode_writeback_end() may get
called from interrupt, switch wb-&gt;work_lock to an irqsafe lock.

Link: https://lkml.kernel.org/r/20220801155034.3772543-1-khazhy@google.com
Fixes: 45a2966fd641 ("writeback: fix bandwidth estimate for spiky workload")
Signed-off-by: Khazhismel Kumykov &lt;khazhy@google.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Michael Stapelberg &lt;stapelberg+linux@google.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: Add balance_dirty_pages_ratelimited_flags() function</title>
<updated>2022-07-25T00:39:31+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2022-06-23T17:51:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fe6c9c6e3e3e332b998393d214fba9d09ab0acb0'/>
<id>urn:sha1:fe6c9c6e3e3e332b998393d214fba9d09ab0acb0</id>
<content type='text'>
This adds the helper function balance_dirty_pages_ratelimited_flags().
It adds the parameter flags to balance_dirty_pages_ratelimited().
The flags parameter is passed to balance_dirty_pages(). For async
buffered writes the flag value will be BDP_ASYNC.

If balance_dirty_pages() gets called for async buffered write, we don't
want to wait. Instead we need to indicate to the caller that throttling
is needed so that it can stop writing and offload the rest of the write
to a context that can block.

The new helper function is also used by balance_dirty_pages_ratelimited().

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20220623175157.1715274-4-shr@fb.com
[axboe: fix kerneltest bot 'ret' issue]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>mm: Move updates of dirty_exceeded into one place</title>
<updated>2022-07-25T00:39:31+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2022-06-23T17:51:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e92eebbb09218e128e559cf12b65317721309324'/>
<id>urn:sha1:e92eebbb09218e128e559cf12b65317721309324</id>
<content type='text'>
Transition of wb-&gt;dirty_exceeded from 0 to 1 happens before we go to
sleep in balance_dirty_pages() while transition from 1 to 0 happens when
exiting from balance_dirty_pages(), possibly based on old values. This
does not make a lot of sense since wb-&gt;dirty_exceeded should simply
reflect whether wb is over dirty limit and so we should ratelimit
entering to balance_dirty_pages() less. Move the two updates together.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Link: https://lore.kernel.org/r/20220623175157.1715274-3-shr@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>mm: Move starting of background writeback into the main balancing loop</title>
<updated>2022-07-25T00:39:31+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2022-06-23T17:51:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ea6813be07dcdc072aa9ad18099115a74cecb5e1'/>
<id>urn:sha1:ea6813be07dcdc072aa9ad18099115a74cecb5e1</id>
<content type='text'>
We start background writeback if we are over background threshold after
exiting the main loop in balance_dirty_pages(). This may result in
basing the decision on already stale values (we may have slept for
significant amount of time) and it is also inconvenient for refactoring
needed for async dirty throttling. Move the check into the main waiting
loop.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Link: https://lore.kernel.org/r/20220623175157.1715274-2-shr@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
