<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/dax.c, branch v5.10.257</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.257</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.257'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-12-06T21:08:18+00:00</updated>
<entry>
<title>fsdax: mark the iomap argument to dax_iomap_sector as const</title>
<updated>2025-12-06T21:08:18+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2025-11-09T11:47:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e5c426a83685f249da0c6d3c049c5f4e82b44f9a'/>
<id>urn:sha1:e5c426a83685f249da0c6d3c049c5f4e82b44f9a</id>
<content type='text'>
[ Upstream commit 7e4f4b2d689d959b03cb07dfbdb97b9696cb1076 ]

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: Eliav Farber &lt;farbere@amazon.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>fsdax: Fix infinite loop in dax_iomap_rw()</title>
<updated>2025-10-29T13:01:25+00:00</updated>
<author>
<name>Li Jinlin</name>
<email>lijinlin3@huawei.com</email>
</author>
<published>2022-07-25T03:20:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=463f36137c40342fb03bba380c1bf703c40d89a6'/>
<id>urn:sha1:463f36137c40342fb03bba380c1bf703c40d89a6</id>
<content type='text'>
commit 17d9c15c9b9e7fb285f7ac5367dfb5f00ff575e3 upstream.

I got an infinite loop and a WARNING report when executing a tail command
in virtiofs.

  WARNING: CPU: 10 PID: 964 at fs/iomap/iter.c:34 iomap_iter+0x3a2/0x3d0
  Modules linked in:
  CPU: 10 PID: 964 Comm: tail Not tainted 5.19.0-rc7
  Call Trace:
  &lt;TASK&gt;
  dax_iomap_rw+0xea/0x620
  ? __this_cpu_preempt_check+0x13/0x20
  fuse_dax_read_iter+0x47/0x80
  fuse_file_read_iter+0xae/0xd0
  new_sync_read+0xfe/0x180
  ? 0xffffffff81000000
  vfs_read+0x14d/0x1a0
  ksys_read+0x6d/0xf0
  __x64_sys_read+0x1a/0x20
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

The tail command will call read() with a count of 0. In this case,
iomap_iter() will report this WARNING, and always return 1 which casuing
the infinite loop in dax_iomap_rw().

Fixing by checking count whether is 0 in dax_iomap_rw().

Fixes: ca289e0b95af ("fsdax: switch dax_iomap_rw to use iomap_iter")
Signed-off-by: Li Jinlin &lt;lijinlin3@huawei.com&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Link: https://lore.kernel.org/r/20220725032050.3873372-1-lijinlin3@huawei.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dax: skip read lock assertion for read-only filesystems</title>
<updated>2025-10-29T13:01:18+00:00</updated>
<author>
<name>Yuezhang Mo</name>
<email>Yuezhang.Mo@sony.com</email>
</author>
<published>2025-09-30T05:42:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=92ce9b4d2de7fd33f56ad45cc314d01ad26a76e8'/>
<id>urn:sha1:92ce9b4d2de7fd33f56ad45cc314d01ad26a76e8</id>
<content type='text'>
[ Upstream commit 154d1e7ad9e5ce4b2aaefd3862b3dba545ad978d ]

The commit 168316db3583("dax: assert that i_rwsem is held
exclusive for writes") added lock assertions to ensure proper
locking in DAX operations. However, these assertions trigger
false-positive lockdep warnings since read lock is unnecessary
on read-only filesystems(e.g., erofs).

This patch skips the read lock assertion for read-only filesystems,
eliminating the spurious warnings while maintaining the integrity
checks for writable filesystems.

Fixes: 168316db3583 ("dax: assert that i_rwsem is held exclusive for writes")
Signed-off-by: Yuezhang Mo &lt;Yuezhang.Mo@sony.com&gt;
Reviewed-by: Friendy Su &lt;friendy.su@sony.com&gt;
Reviewed-by: Daniel Palmer &lt;daniel.palmer@sony.com&gt;
Reviewed-by: Gao Xiang &lt;hsiangkao@linux.alibaba.com&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>fsdax: switch dax_iomap_rw to use iomap_iter</title>
<updated>2025-10-29T13:01:18+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2021-08-11T01:33:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8df4919cb921b28809d05feae3e98dc5d8b48146'/>
<id>urn:sha1:8df4919cb921b28809d05feae3e98dc5d8b48146</id>
<content type='text'>
[ Upstream commit ca289e0b95afa973d204c77a4ad5c37e06145fbf ]

Switch the dax_iomap_rw implementation to use iomap_iter.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Stable-dep-of: 154d1e7ad9e5 ("dax: skip read lock assertion for read-only filesystems")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dax: fix cache flush on PMD-mapped pages</title>
<updated>2022-06-09T08:21:16+00:00</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2022-04-29T06:16:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c1219429179d861aef8e723596bd9e4d928e543a'/>
<id>urn:sha1:c1219429179d861aef8e723596bd9e4d928e543a</id>
<content type='text'>
[ Upstream commit e583b5c472bd23d450e06f148dc1f37be74f7666 ]

The flush_cache_page() only remove a PAGE_SIZE sized range from the cache.
However, it does not cover the full pages in a THP except a head page.
Replace it with flush_cache_range() to fix this issue.  This is just a
documentation issue with the respect to properly documenting the expected
usage of cache flushing before modifying the pmd.  However, in practice
this is not a problem due to the fact that DAX is not available on
architectures with virtually indexed caches per:

  commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")

Link: https://lkml.kernel.org/r/20220403053957.10770-3-songmuchun@bytedance.com
Fixes: f729c8c9b24f ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Reviewed-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Cc: Ross Zwisler &lt;zwisler@kernel.org&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Cc: Xiyu Yang &lt;xiyuyang19@fudan.edu.cn&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dax: fix ENOMEM handling in grab_mapping_entry()</title>
<updated>2021-07-14T14:56:13+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2021-06-29T02:35:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c872674da72415a41159ed97e8965890f102488e'/>
<id>urn:sha1:c872674da72415a41159ed97e8965890f102488e</id>
<content type='text'>
[ Upstream commit 1a14e3779dd58c16b30e56558146e5cc850ba8b0 ]

grab_mapping_entry() has a bug in handling of ENOMEM condition.  Suppose
we have a PMD entry at index i which we are downgrading to a PTE entry.
grab_mapping_entry() will set pmd_downgrade to true, lock the entry, clear
the entry in xarray, and decrement mapping-&gt;nrpages.  The it will call:

	entry = dax_make_entry(pfn_to_pfn_t(0), flags);
	dax_lock_entry(xas, entry);

which inserts new PTE entry into xarray.  However this may fail allocating
the new node.  We handle this by:

	if (xas_nomem(xas, mapping_gfp_mask(mapping) &amp; ~__GFP_HIGHMEM))
		goto retry;

however pmd_downgrade stays set to true even though 'entry' returned from
get_unlocked_entry() will be NULL now.  And we will go again through the
downgrade branch.  This is mostly harmless except that mapping-&gt;nrpages is
decremented again and we temporarily have an invalid entry stored in
xarray.  Fix the problem by setting pmd_downgrade to false each time we
lookup the entry we work with so that it matches the entry we found.

Link: https://lkml.kernel.org/r/20210622160015.18004-1-jack@suse.cz
Fixes: b15cd800682f ("dax: Convert page fault handlers to XArray")
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: "Aneesh Kumar K.V" &lt;aneesh.kumar@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dax: Wake up all waiters after invalidating dax entry</title>
<updated>2021-05-19T08:13:12+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2021-04-28T19:03:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9eaa10be0c08d99e8d5e6063f670b2f6e1e3f02b'/>
<id>urn:sha1:9eaa10be0c08d99e8d5e6063f670b2f6e1e3f02b</id>
<content type='text'>
[ Upstream commit 237388320deffde7c2d65ed8fc9eef670dc979b3 ]

I am seeing missed wakeups which ultimately lead to a deadlock when I am
using virtiofs with DAX enabled and running "make -j". I had to mount
virtiofs as rootfs and also reduce to dax window size to 256M to reproduce
the problem consistently.

So here is the problem. put_unlocked_entry() wakes up waiters only
if entry is not null as well as !dax_is_conflict(entry). But if I
call multiple instances of invalidate_inode_pages2() in parallel,
then I can run into a situation where there are waiters on
this index but nobody will wake these waiters.

invalidate_inode_pages2()
  invalidate_inode_pages2_range()
    invalidate_exceptional_entry2()
      dax_invalidate_mapping_entry_sync()
        __dax_invalidate_entry() {
                xas_lock_irq(&amp;xas);
                entry = get_unlocked_entry(&amp;xas, 0);
                ...
                ...
                dax_disassociate_entry(entry, mapping, trunc);
                xas_store(&amp;xas, NULL);
                ...
                ...
                put_unlocked_entry(&amp;xas, entry);
                xas_unlock_irq(&amp;xas);
        }

Say a fault in in progress and it has locked entry at offset say "0x1c".
Now say three instances of invalidate_inode_pages2() are in progress
(A, B, C) and they all try to invalidate entry at offset "0x1c". Given
dax entry is locked, all tree instances A, B, C will wait in wait queue.

When dax fault finishes, say A is woken up. It will store NULL entry
at index "0x1c" and wake up B. When B comes along it will find "entry=0"
at page offset 0x1c and it will call put_unlocked_entry(&amp;xas, 0). And
this means put_unlocked_entry() will not wake up next waiter, given
the current code. And that means C continues to wait and is not woken
up.

This patch fixes the issue by waking up all waiters when a dax entry
has been invalidated. This seems to fix the deadlock I am facing
and I can make forward progress.

Reported-by: Sergio Lopez &lt;slp@redhat.com&gt;
Fixes: ac401cc78242 ("dax: New fault locking")
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Suggested-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Link: https://lore.kernel.org/r/20210428190314.1865312-4-vgoyal@redhat.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dax: Add a wakeup mode parameter to put_unlocked_entry()</title>
<updated>2021-05-19T08:13:12+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2021-04-28T19:03:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e9e70b78e163f768aee90f621566a5b7055fce17'/>
<id>urn:sha1:e9e70b78e163f768aee90f621566a5b7055fce17</id>
<content type='text'>
[ Upstream commit 4c3d043d271d4d629aa2328796cdfc96b37d3b3c ]

As of now put_unlocked_entry() always wakes up next waiter. In next
patches we want to wake up all waiters at one callsite. Hence, add a
parameter to the function.

This patch does not introduce any change of behavior.

Reviewed-by: Greg Kurz &lt;groug@kaod.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Suggested-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Link: https://lore.kernel.org/r/20210428190314.1865312-3-vgoyal@redhat.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dax: Add an enum for specifying dax wakup mode</title>
<updated>2021-05-19T08:13:12+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2021-04-28T19:03:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b93d3410e789b027dd6845362a8738d58382194a'/>
<id>urn:sha1:b93d3410e789b027dd6845362a8738d58382194a</id>
<content type='text'>
[ Upstream commit 698ab77aebffe08b312fbcdddeb0e8bd08b78717 ]

Dan mentioned that he is not very fond of passing around a boolean true/false
to specify if only next waiter should be woken up or all waiters should be
woken up. He instead prefers that we introduce an enum and make it very
explicity at the callsite itself. Easier to read code.

This patch should not introduce any change of behavior.

Reviewed-by: Greg Kurz &lt;groug@kaod.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Suggested-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Link: https://lore.kernel.org/r/20210428190314.1865312-2-vgoyal@redhat.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm: provide a saner PTE walking API for modules</title>
<updated>2021-02-26T09:13:01+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2021-02-05T10:07:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a42150f1c965d23ea858c1931f53b591d9e817d4'/>
<id>urn:sha1:a42150f1c965d23ea858c1931f53b591d9e817d4</id>
<content type='text'>
commit 9fd6dad1261a541b3f5fa7dc5b152222306e6702 upstream.

Currently, the follow_pfn function is exported for modules but
follow_pte is not.  However, follow_pfn is very easy to misuse,
because it does not provide protections (so most of its callers
assume the page is writable!) and because it returns after having
already unlocked the page table lock.

Provide instead a simplified version of follow_pte that does
not have the pmdpp and range arguments.  The older version
survives as follow_invalidate_pte() for use by fs/dax.c.

Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
