<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/aio.c, branch v5.15.208</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.208</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.15.208'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-04-10T14:18:46+00:00</updated>
<entry>
<title>fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion</title>
<updated>2024-04-10T14:18:46+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2024-03-04T23:57:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a71cba07783abc76b547568b6452cd1dd9981410'/>
<id>urn:sha1:a71cba07783abc76b547568b6452cd1dd9981410</id>
<content type='text'>
commit 961ebd120565cb60cebe21cb634fbc456022db4a upstream.

The first kiocb_set_cancel_fn() argument may point at a struct kiocb
that is not embedded inside struct aio_kiocb. With the current code,
depending on the compiler, the req-&gt;ki_ctx read happens either before
the IOCB_AIO_RW test or after that test. Move the req-&gt;ki_ctx read such
that it is guaranteed that the IOCB_AIO_RW test happens first.

Reported-by: Eric Biggers &lt;ebiggers@kernel.org&gt;
Cc: Benjamin LaHaise &lt;ben@communityfibre.ca&gt;
Cc: Eric Biggers &lt;ebiggers@google.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Sandeep Dhavale &lt;dhavale@google.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Cc: stable@vger.kernel.org
Fixes: b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio")
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Link: https://lore.kernel.org/r/20240304235715.3790858-1-bvanassche@acm.org
Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio</title>
<updated>2024-03-01T12:22:00+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2024-02-15T20:47:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d7b6fa97ec894edd02f64b83e5e72e1aa352f353'/>
<id>urn:sha1:d7b6fa97ec894edd02f64b83e5e72e1aa352f353</id>
<content type='text'>
commit b820de741ae48ccf50dd95e297889c286ff4f760 upstream.

If kiocb_set_cancel_fn() is called for I/O submitted via io_uring, the
following kernel warning appears:

WARNING: CPU: 3 PID: 368 at fs/aio.c:598 kiocb_set_cancel_fn+0x9c/0xa8
Call trace:
 kiocb_set_cancel_fn+0x9c/0xa8
 ffs_epfile_read_iter+0x144/0x1d0
 io_read+0x19c/0x498
 io_issue_sqe+0x118/0x27c
 io_submit_sqes+0x25c/0x5fc
 __arm64_sys_io_uring_enter+0x104/0xab0
 invoke_syscall+0x58/0x11c
 el0_svc_common+0xb4/0xf4
 do_el0_svc+0x2c/0xb0
 el0_svc+0x2c/0xa4
 el0t_64_sync_handler+0x68/0xb4
 el0t_64_sync+0x1a4/0x1a8

Fix this by setting the IOCB_AIO_RW flag for read and write I/O that is
submitted by libaio.

Suggested-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Avi Kivity &lt;avi@scylladb.com&gt;
Cc: Sandeep Dhavale &lt;dhavale@google.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Link: https://lore.kernel.org/r/20240215204739.2677806-2-bvanassche@acm.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>aio: fix mremap after fork null-deref</title>
<updated>2023-02-22T11:57:05+00:00</updated>
<author>
<name>Seth Jenkins</name>
<email>sethjenkins@google.com</email>
</author>
<published>2023-01-31T17:25:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=178993157e8c50aef7f35d7d6d3b44bb428199e1'/>
<id>urn:sha1:178993157e8c50aef7f35d7d6d3b44bb428199e1</id>
<content type='text'>
commit 81e9d6f8647650a7bead74c5f926e29970e834d1 upstream.

Commit e4a0d3e720e7 ("aio: Make it possible to remap aio ring") introduced
a null-deref if mremap is called on an old aio mapping after fork as
mm-&gt;ioctx_table will be set to NULL.

[jmoyer@redhat.com: fix 80 column issue]
Link: https://lkml.kernel.org/r/x49sffq4nvg.fsf@segfault.boston.devel.redhat.com
Fixes: e4a0d3e720e7 ("aio: Make it possible to remap aio ring")
Signed-off-by: Seth Jenkins &lt;sethjenkins@google.com&gt;
Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Benjamin LaHaise &lt;bcrl@kvack.org&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>aio: Fix incorrect usage of eventfd_signal_allowed()</title>
<updated>2021-12-14T09:57:22+00:00</updated>
<author>
<name>Xie Yongji</name>
<email>xieyongji@bytedance.com</email>
</author>
<published>2021-09-13T11:19:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4a7c65506473dc1415e6eadcea16e53d12740f6b'/>
<id>urn:sha1:4a7c65506473dc1415e6eadcea16e53d12740f6b</id>
<content type='text'>
commit 4b3749865374899e115aa8c48681709b086fe6d3 upstream.

We should defer eventfd_signal() to the workqueue when
eventfd_signal_allowed() return false rather than return
true.

Fixes: b542e383d8c0 ("eventfd: Make signal recursion protection a task bit")
Signed-off-by: Xie Yongji &lt;xieyongji@bytedance.com&gt;
Link: https://lore.kernel.org/r/20210913111928.98-1-xieyongji@bytedance.com
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>aio: fix use-after-free due to missing POLLFREE handling</title>
<updated>2021-12-14T09:57:15+00:00</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2021-12-09T01:04:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=60d311f9e6381d779d7d53371f87285698ecee24'/>
<id>urn:sha1:60d311f9e6381d779d7d53371f87285698ecee24</id>
<content type='text'>
commit 50252e4b5e989ce64555c7aef7516bdefc2fea72 upstream.

signalfd_poll() and binder_poll() are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case.  This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution.  This solution is for the queue to be cleared
before it is freed, by sending a POLLFREE notification to all waiters.

Unfortunately, only eventpoll handles POLLFREE.  A second type of
non-blocking poll, aio poll, was added in kernel v4.18, and it doesn't
handle POLLFREE.  This allows a use-after-free to occur if a signalfd or
binder fd is polled with aio poll, and the waitqueue gets freed.

Fix this by making aio poll handle POLLFREE.

A patch by Ramji Jiyani &lt;ramjiyani@google.com&gt;
(https://lore.kernel.org/r/20211027011834.2497484-1-ramjiyani@google.com)
tried to do this by making aio_poll_wake() always complete the request
inline if POLLFREE is seen.  However, that solution had two bugs.
First, it introduced a deadlock, as it unconditionally locked the aio
context while holding the waitqueue lock, which inverts the normal
locking order.  Second, it didn't consider that POLLFREE notifications
are missed while the request has been temporarily de-queued.

The second problem was solved by my previous patch.  This patch then
properly fixes the use-after-free by handling POLLFREE in a
deadlock-free way.  It does this by taking advantage of the fact that
freeing of the waitqueue is RCU-delayed, similar to what eventpoll does.

Fixes: 2c14fa838cbe ("aio: implement IOCB_CMD_POLL")
Cc: &lt;stable@vger.kernel.org&gt; # v4.18+
Link: https://lore.kernel.org/r/20211209010455.42744-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>aio: keep poll requests on waitqueue until completed</title>
<updated>2021-12-14T09:57:15+00:00</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2021-12-09T01:04:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=924f51534d428e91c98ea309ab16270f5e8289c6'/>
<id>urn:sha1:924f51534d428e91c98ea309ab16270f5e8289c6</id>
<content type='text'>
commit 363bee27e25804d8981dd1c025b4ad49dc39c530 upstream.

Currently, aio_poll_wake() will always remove the poll request from the
waitqueue.  Then, if aio_poll_complete_work() sees that none of the
polled events are ready and the request isn't cancelled, it re-adds the
request to the waitqueue.  (This can easily happen when polling a file
that doesn't pass an event mask when waking up its waitqueue.)

This is fundamentally broken for two reasons:

  1. If a wakeup occurs between vfs_poll() and the request being
     re-added to the waitqueue, it will be missed because the request
     wasn't on the waitqueue at the time.  Therefore, IOCB_CMD_POLL
     might never complete even if the polled file is ready.

  2. When the request isn't on the waitqueue, there is no way to be
     notified that the waitqueue is being freed (which happens when its
     lifetime is shorter than the struct file's).  This is supposed to
     happen via the waitqueue entries being woken up with POLLFREE.

Therefore, leave the requests on the waitqueue until they are actually
completed (or cancelled).  To keep track of when aio_poll_complete_work
needs to be scheduled, use new fields in struct poll_iocb.  Remove the
'done' field which is now redundant.

Note that this is consistent with how sys_poll() and eventpoll work;
their wakeup functions do *not* remove the waitqueue entries.

Fixes: 2c14fa838cbe ("aio: implement IOCB_CMD_POLL")
Cc: &lt;stable@vger.kernel.org&gt; # v4.18+
Link: https://lore.kernel.org/r/20211209010455.42744-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>eventfd: Make signal recursion protection a task bit</title>
<updated>2021-08-27T23:33:02+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-07-29T11:01:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b542e383d8c005f06a131e2b40d5889b812f19c6'/>
<id>urn:sha1:b542e383d8c005f06a131e2b40d5889b812f19c6</id>
<content type='text'>
The recursion protection for eventfd_signal() is based on a per CPU
variable and relies on the !RT semantics of spin_lock_irqsave() for
protecting this per CPU variable. On RT kernels spin_lock_irqsave() neither
disables preemption nor interrupts which allows the spin lock held section
to be preempted. If the preempting task invokes eventfd_signal() as well,
then the recursion warning triggers.

Paolo suggested to protect the per CPU variable with a local lock, but
that's heavyweight and actually not necessary. The goal of this protection
is to prevent the task stack from overflowing, which can be achieved with a
per task recursion protection as well.

Replace the per CPU variable with a per task bit similar to other recursion
protection bits like task_struct::in_page_owner. This works on both !RT and
RT kernels and removes as a side effect the extra per CPU storage.

No functional change for !RT kernels.

Reported-by: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/r/87wnp9idso.ffs@tglx

</content>
</entry>
<entry>
<title>Revert "mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio"</title>
<updated>2021-04-30T18:20:39+00:00</updated>
<author>
<name>Brian Geffon</name>
<email>bgeffon@google.com</email>
</author>
<published>2021-04-30T05:57:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=14d071134c740cfe61c09fc506fd3ab052beea10'/>
<id>urn:sha1:14d071134c740cfe61c09fc506fd3ab052beea10</id>
<content type='text'>
This reverts commit cd544fd1dc9293c6702fab6effa63dac1cc67e99.

As discussed in [1] this commit was a no-op because the mapping type was
checked in vma_to_resize before move_vma is ever called.  This meant that
vm_ops-&gt;mremap() would never be called on such mappings.  Furthermore,
we've since expanded support of MREMAP_DONTUNMAP to non-anonymous
mappings, and these special mappings are still protected by the existing
check of !VM_DONTEXPAND and !VM_PFNMAP which will result in a -EINVAL.

1. https://lkml.org/lkml/2020/12/28/2340

Link: https://lkml.kernel.org/r/20210323182520.2712101-2-bgeffon@google.com
Signed-off-by: Brian Geffon &lt;bgeffon@google.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: Dmitry Safonov &lt;0x7f454c46@gmail.com&gt;
Cc: Alejandro Colomar &lt;alx.manpages@gmail.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Axel Rasmussen &lt;axelrasmussen@google.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Lokesh Gidra &lt;lokeshgidra@google.com&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Cc: "Michael S . Tsirkin" &lt;mst@redhat.com&gt;
Cc: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Sonny Rao &lt;sonnyrao@google.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (patches from Andrew)</title>
<updated>2020-12-15T20:53:37+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-12-15T20:53:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ac73e3dc8acd0a3be292755db30388c3580f5674'/>
<id>urn:sha1:ac73e3dc8acd0a3be292755db30388c3580f5674</id>
<content type='text'>
Merge misc updates from Andrew Morton:

 - a few random little subsystems

 - almost all of the MM patches which are staged ahead of linux-next
   material. I'll trickle to post-linux-next work in as the dependents
   get merged up.

Subsystems affected by this patch series: kthread, kbuild, ide, ntfs,
ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache,
gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation,
kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction,
oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc,
uaccess, zram, and cleanups).

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (200 commits)
  mm: cleanup kstrto*() usage
  mm: fix fall-through warnings for Clang
  mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at
  mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
  mm:backing-dev: use sysfs_emit in macro defining functions
  mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
  mm: use sysfs_emit for struct kobject * uses
  mm: fix kernel-doc markups
  zram: break the strict dependency from lzo
  zram: add stat to gather incompressible pages since zram set up
  zram: support page writeback
  mm/process_vm_access: remove redundant initialization of iov_r
  mm/zsmalloc.c: rework the list_add code in insert_zspage()
  mm/zswap: move to use crypto_acomp API for hardware acceleration
  mm/zswap: fix passing zero to 'PTR_ERR' warning
  mm/zswap: make struct kernel_param_ops definitions const
  userfaultfd/selftests: hint the test runner on required privilege
  userfaultfd/selftests: fix retval check for userfaultfd_open()
  userfaultfd/selftests: always dump something in modes
  userfaultfd: selftests: make __{s,u}64 format specifiers portable
  ...
</content>
</entry>
<entry>
<title>mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio</title>
<updated>2020-12-15T20:13:41+00:00</updated>
<author>
<name>Dmitry Safonov</name>
<email>dima@arista.com</email>
</author>
<published>2020-12-15T03:08:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cd544fd1dc9293c6702fab6effa63dac1cc67e99'/>
<id>urn:sha1:cd544fd1dc9293c6702fab6effa63dac1cc67e99</id>
<content type='text'>
As kernel expect to see only one of such mappings, any further operations
on the VMA-copy may be unexpected by the kernel.  Maybe it's being on the
safe side, but there doesn't seem to be any expected use-case for this, so
restrict it now.

Link: https://lkml.kernel.org/r/20201013013416.390574-4-dima@arista.com
Fixes: commit e346b3813067 ("mm/mremap: add MREMAP_DONTUNMAP to mremap()")
Signed-off-by: Dmitry Safonov &lt;dima@arista.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Brian Geffon &lt;bgeffon@google.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
