summaryrefslogtreecommitdiff
path: root/fs/fuse/dev.c
AgeCommit message (Collapse)AuthorFilesLines
2022-04-20fuse: fix pipe buffer lifetime for direct_ioMiklos Szeredi1-1/+11
commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 upstream. In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then imports the write buffer with fuse_get_user_pages(), which uses iov_iter_get_pages() to grab references to userspace pages instead of actually copying memory. On the filesystem device side, these pages can then either be read to userspace (via fuse_dev_read()), or splice()d over into a pipe using fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops. This is wrong because after fuse_dev_do_read() unlocks the FUSE request, the userspace filesystem can mark the request as completed, causing write() to return. At that point, the userspace filesystem should no longer have access to the pipe buffer. Fix by copying pages coming from the user address space to new pipe buffers. Reported-by: Jann Horn <jannh@google.com> Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08fuse: release pipe buf after last useMiklos Szeredi1-5/+5
commit 473441720c8616dfaf4451f9c7ea14f0eb5e5d65 upstream. Checking buf->flags should be done before the pipe_buf_release() is called on the pipe buffer, since releasing the buffer might modify the flags. This is exactly what page_cache_pipe_buf_release() does, and which results in the same VM_BUG_ON_PAGE(PageLRU(page)) that the original patch was trying to fix. Reported-by: Justin Forbes <jmforbes@linuxtx.org> Fixes: 712a951025c0 ("fuse: fix page stealing") Cc: <stable@vger.kernel.org> # v2.6.35 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08fuse: fix page stealingMiklos Szeredi1-2/+12
commit 712a951025c0667ff00b25afc360f74e639dfabe upstream. It is possible to trigger a crash by splicing anon pipe bufs to the fuse device. The reason for this is that anon_pipe_buf_release() will reuse buf->page if the refcount is 1, but that page might have already been stolen and its flags modified (e.g. PG_lru added). This happens in the unlikely case of fuse_dev_splice_write() getting around to calling pipe_buf_release() after a page has been stolen, added to the page cache and removed from the page cache. Fix by calling pipe_buf_release() right after the page was inserted into the page cache. In this case the page has an elevated refcount so any release function will know that the page isn't reusable. Reported-by: Frank Dinoff <fdinoff@google.com> Link: https://lore.kernel.org/r/CAAmZXrsGg2xsP1CK+cbuEMumtrqdvD-NKnWzhNcvn71RV3c1yw@mail.gmail.com/ Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device") Cc: <stable@vger.kernel.org> # v2.6.35 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20fuse: reject internal errnoMiklos Szeredi1-1/+1
commit 49221cf86d18bb66fe95d3338cb33bd4b9880ca5 upstream. Don't allow userspace to report errors that could be kernel-internal. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Fixes: 334f485df85a ("[PATCH] FUSE - device functions") Cc: <stable@vger.kernel.org> # v2.6.14 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20fuse: check connected before queueing on fpq->ioMiklos Szeredi1-0/+9
commit 80ef08670d4c28a06a3de954bd350368780bcfef upstream. A request could end up on the fpq->io list after fuse_abort_conn() has reset fpq->connected and aborted requests on that list: Thread-1 Thread-2 ======== ======== ->fuse_simple_request() ->shutdown ->__fuse_request_send() ->queue_request() ->fuse_abort_conn() ->fuse_dev_do_read() ->acquire(fpq->lock) ->wait_for(fpq->lock) ->set err to all req's in fpq->io ->release(fpq->lock) ->acquire(fpq->lock) ->add req to fpq->io After the userspace copy is done the request will be ended, but req->out.h.error will remain uninitialized. Also the copy might block despite being already aborted. Fix both issues by not allowing the request to be queued on the fpq->io list after fuse_abort_conn() has processed this list. Reported-by: Pradeep P V K <pragalla@codeaurora.org> Fixes: fd22d62ed0c3 ("fuse: no fc->lock for iqueue parts") Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-05fuse: fix page dereference after freeMiklos Szeredi1-10/+18
commit d78092e4937de9ce55edcb4ee4c5e3c707be0190 upstream. After unlock_request() pages from the ap->pages[] array may be put (e.g. by aborting the connection) and the pages can be freed. Prevent use after free by grabbing a reference to the page before calling unlock_request(). The original patch was created by Pradeep P V K. Reported-by: Pradeep P V K <ppvk@codeaurora.org> Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-01fuse: don't check refcount after stealing pageMiklos Szeredi1-1/+0
[ Upstream commit 32f98877c57bee6bc27f443a96f49678a2cd6a50 ] page_count() is unstable. Unless there has been an RCU grace period between when the page was removed from the page cache and now, a speculative reference may exist from the page cache. Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-15fuse: retrieve: cap requested size to negotiated max_writeKirill Smelkov1-1/+1
[ Upstream commit 7640682e67b33cab8628729afec8ca92b851394f ] FUSE filesystem server and kernel client negotiate during initialization phase, what should be the maximum write size the client will ever issue. Correspondingly the filesystem server then queues sys_read calls to read requests with buffer capacity large enough to carry request header + that max_write bytes. A filesystem server is free to set its max_write in anywhere in the range between [1*page, fc->max_pages*page]. In particular go-fuse[2] sets max_write by default as 64K, wheres default fc->max_pages corresponds to 128K. Libfuse also allows users to configure max_write, but by default presets it to possible maximum. If max_write is < fc->max_pages*page, and in NOTIFY_RETRIEVE handler we allow to retrieve more than max_write bytes, corresponding prepared NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the filesystem server, in full correspondence with server/client contract, will be only queuing sys_read with ~max_write buffer capacity, and fuse_dev_do_read throws away requests that cannot fit into server request buffer. In turn the filesystem server could get stuck waiting indefinitely for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK which is understood by clients as that NOTIFY_REPLY was queued and will be sent back. Cap requested size to negotiate max_write to avoid the problem. This aligns with the way NOTIFY_RETRIEVE handler works, which already unconditionally caps requested retrieve size to fuse_conn->max_pages. This way it should not hurt NOTIFY_RETRIEVE semantic if we return less data than was originally requested. Please see [1] for context where the problem of stuck filesystem was hit for real, how the situation was traced and for more involving patch that did not make it into the tree. [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 [2] https://github.com/hanwen/go-fuse Signed-off-by: Kirill Smelkov <kirr@nexedi.com> Cc: Han-Wen Nienhuys <hanwen@google.com> Cc: Jakob Unterwurzacher <jakobunt@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-16fuse: fix possibly missed wake-up after abortMiklos Szeredi1-3/+9
[ Upstream commit 2d84a2d19b6150c6dbac1e6ebad9c82e4c123772 ] In current fuse_drop_waiting() implementation it's possible that fuse_wait_aborted() will not be woken up in the unlikely case that fuse_abort_conn() + fuse_wait_aborted() runs in between checking fc->connected and calling atomic_dec(&fc->num_waiting). Do the atomic_dec_and_test() unconditionally, which also provides the necessary barrier against reordering with the fc->connected check. The explicit smp_mb() in fuse_wait_aborted() is not actually needed, since the spin_unlock() in fuse_abort_conn() provides the necessary RELEASE barrier after resetting fc->connected. However, this is not a performance sensitive path, and adding the explicit barrier makes it easier to document. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: b8f95e5d13f5 ("fuse: umount should wait for all requests") Cc: <stable@vger.kernel.org> #v4.19 Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
2019-05-04fs: prevent page refcount overflow in pipe_buf_getMatthew Wilcox1-6/+6
commit 15fab63e1e57be9fdb5eec1bbc5916e9825e9acb upstream. Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Matthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12fuse: handle zero sized retrieve correctlyMiklos Szeredi1-1/+1
commit 97e1532ef81acb31c30f9e75bf00306c33a77812 upstream. Dereferencing req->page_descs[0] will Oops if req->max_pages is zero. Reported-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com Tested-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com Fixes: b2430d7567a3 ("fuse: add per-page descriptor <offset, length> to fuse_req") Cc: <stable@vger.kernel.org> # v3.9 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12fuse: call pipe_buf_release() under pipe lockJann Horn1-0/+2
commit 9509941e9c534920ccc4771ae70bd6cbbe79df1c upstream. Some of the pipe_buf_release() handlers seem to assume that the pipe is locked - in particular, anon_pipe_buf_release() accesses pipe->tmp_page without taking any extra locks. From a glance through the callers of pipe_buf_release(), it looks like FUSE is the only one that calls pipe_buf_release() without having the pipe locked. This bug should only lead to a memory leak, nothing terrible. Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21fuse: fix leaked notify replyMiklos Szeredi1-1/+3
commit 7fabaf303458fcabb694999d6fa772cc13d4e217 upstream. fuse_request_send_notify_reply() may fail if the connection was reset for some reason (e.g. fs was unmounted). Don't leak request reference in this case. Besides leaking memory, this resulted in fc->num_waiting not being decremented and hence fuse_wait_aborted() left in a hanging and unkillable state. Fixes: 2d45ba381a74 ("fuse: add retrieve request") Fixes: b8f95e5d13f5 ("fuse: umount should wait for all requests") Reported-and-tested-by: syzbot+6339eda9cb4ebbc4c37b@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> #v2.6.36 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21fuse: set FR_SENT while lockedMiklos Szeredi1-1/+1
commit 4c316f2f3ff315cb48efb7435621e5bfb81df96d upstream. Otherwise fuse_dev_do_write() could come in and finish off the request, and the set_bit(FR_SENT, ...) could trigger the WARN_ON(test_bit(FR_SENT, ...)) in request_end(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reported-by: syzbot+ef054c4d3f64cd7f7cec@syzkaller.appspotmai Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21fuse: fix blocked_waitq wakeupMiklos Szeredi1-4/+11
commit 908a572b80f6e9577b45e81b3dfe2e22111286b8 upstream. Using waitqueue_active() is racy. Make sure we issue a wake_up() unconditionally after storing into fc->blocked. After that it's okay to optimize with waitqueue_active() since the first wake up provides the necessary barrier for all waiters, not the just the woken one. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 3c18ef8117f0 ("fuse: optimize wake_up") Cc: <stable@vger.kernel.org> # v3.10 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21fuse: Fix use-after-free in fuse_dev_do_write()Kirill Tkhai1-1/+5
commit d2d2d4fb1f54eff0f3faa9762d84f6446a4bc5d0 upstream. After we found req in request_find() and released the lock, everything may happen with the req in parallel: cpu0 cpu1 fuse_dev_do_write() fuse_dev_do_write() req = request_find(fpq, ...) ... spin_unlock(&fpq->lock) ... ... req = request_find(fpq, oh.unique) ... spin_unlock(&fpq->lock) queue_interrupt(&fc->iq, req); ... ... ... ... ... request_end(fc, req); fuse_put_request(fc, req); ... queue_interrupt(&fc->iq, req); Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-21fuse: Fix use-after-free in fuse_dev_do_read()Kirill Tkhai1-0/+2
commit bc78abbd55dd28e2287ec6d6502b842321a17c87 upstream. We may pick freed req in this way: [cpu0] [cpu1] fuse_dev_do_read() fuse_dev_do_write() list_move_tail(&req->list, ...); ... spin_unlock(&fpq->lock); ... ... request_end(fc, req); ... fuse_put_request(fc, req); if (test_bit(FR_INTERRUPTED, ...)) queue_interrupt(fiq, req); Fix that by keeping req alive until we finish all manipulations. Reported-by: syzbot+4e975615ca01f2277bdd@syzkaller.appspotmail.com Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05fuse: umount should wait for all requestsMiklos Szeredi1-4/+19
commit b8f95e5d13f5f0191dcb4b9113113d241636e7cb upstream. fuse_abort_conn() does not guarantee that all async requests have actually finished aborting (i.e. their ->end() function is called). This could actually result in still used inodes after umount. Add a helper to wait until all requests are fully done. This is done by looking at the "num_waiting" counter. When this counter drops to zero, we can be sure that no more requests are outstanding. Fixes: 0d8e84b0432b ("fuse: simplify request abort") Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05fuse: fix unlocked access to processing queueMiklos Szeredi1-1/+7
commit 45ff350bbd9d0f0977ff270a0d427c71520c0c37 upstream. fuse_dev_release() assumes that it's the only one referencing the fpq->processing list, but that's not true, since fuse_abort_conn() can be doing the same without any serialization between the two. Fixes: c3696046beb3 ("fuse: separate pqueue for clones") Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05fuse: fix double request_end()Miklos Szeredi1-2/+3
commit 87114373ea507895a62afb10d2910bd9adac35a8 upstream. Refcounting of request is broken when fuse_abort_conn() is called and request is on the fpq->io list: - ref is taken too late - then it is not dropped Fixes: 0d8e84b0432b ("fuse: simplify request abort") Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-05fuse: Don't access pipe->buffers without pipe_lock()Andrey Ryabinin1-2/+5
commit a2477b0e67c52f4364a47c3ad70902bc2a61bd4c upstream. fuse_dev_splice_write() reads pipe->buffers to determine the size of 'bufs' array before taking the pipe_lock(). This is not safe as another thread might change the 'pipe->buffers' between the allocation and taking the pipe_lock(). So we end up with too small 'bufs' array. Move the bufs allocations inside pipe_lock()/pipe_unlock() to fix this. Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: <stable@vger.kernel.org> # v2.6.35 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03fuse: fix congested state leak on aborted connectionsTejun Heo1-2/+1
commit 8a301eb16d99983a4961f884690ec97b92e7dcfe upstream. If a connection gets aborted while congested, FUSE can leave nr_wb_congested[] stuck until reboot causing wait_iff_congested() to wait spuriously which can lead to severe performance degradation. The leak is caused by gating congestion state clearing with fc->connected test in request_end(). This was added way back in 2009 by 26c3679101db ("fuse: destroy bdi on umount"). While the commit description doesn't explain why the test was added, it most likely was to avoid dereferencing bdi after it got destroyed. Since then, bdi lifetime rules have changed many times and now we're always guaranteed to have access to the bdi while the superblock is alive (fc->sb). Drop fc->connected conditional to avoid leaking congestion states. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Joshua Miller <joshmiller@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: stable@vger.kernel.org # v2.6.29+ Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-12fuse: allow server to run in different pid_nsMiklos Szeredi1-6/+7
Commit 0b6e9ea041e6 ("fuse: Add support for pid namespaces") broke Sandstorm.io development tools, which have been sending FUSE file descriptors across PID namespace boundaries since early 2014. The above patch added a check that prevented I/O on the fuse device file descriptor if the pid namespace of the reader/writer was different from the pid namespace of the mounter. With this change passing the device file descriptor to a different pid namespace simply doesn't work. The check was added because pids are transferred to/from the fuse userspace server in the namespace registered at mount time. To fix this regression, remove the checks and do the following: 1) the pid in the request header (the pid of the task that initiated the filesystem operation) is translated to the reader's pid namespace. If a mapping doesn't exist for this pid, then a zero pid is used. Note: even if a mapping would exist between the initiator task's pid namespace and the reader's pid namespace the pid will be zero if either mapping from initator's to mounter's namespace or mapping from mounter's to reader's namespace doesn't exist. 2) The lk.pid value in setlk/setlkw requests and getlk reply is left alone. Userspace should not interpret this value anyway. Also allow the setlk/setlkw operations if the pid of the task cannot be represented in the mounter's namespace (pid being zero in that case). Reported-by: Kenton Varda <kenton@sandstorm.io> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 0b6e9ea041e6 ("fuse: Add support for pid namespaces") Cc: <stable@vger.kernel.org> # v4.12+ Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Seth Forshee <seth.forshee@canonical.com>
2017-05-10Merge branch 'for-linus' of ↵Linus Torvalds1-9/+15
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: "Support for pid namespaces from Seth and refcount_t work from Elena" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: Add support for pid namespaces fuse: convert fuse_conn.count from atomic_t to refcount_t fuse: convert fuse_req.count from atomic_t to refcount_t fuse: convert fuse_file.count from atomic_t to refcount_t
2017-04-20fuse: Get rid of bdi_initializedJan Kara1-3/+2
It is not needed anymore since bdi is initialized whenever superblock exists. CC: Miklos Szeredi <miklos@szeredi.hu> CC: linux-fsdevel@vger.kernel.org Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-20fuse: Convert to separately allocated bdiJan Kara1-4/+4
Allocate struct backing_dev_info separately instead of embedding it inside the superblock. This unifies handling of bdi among users. CC: Miklos Szeredi <miklos@szeredi.hu> CC: linux-fsdevel@vger.kernel.org Acked-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-18fuse: Add support for pid namespacesSeth Forshee1-4/+11
When the userspace process servicing fuse requests is running in a pid namespace then pids passed via the fuse fd are not being translated into that process' namespace. Translation is necessary for the pid to be useful to that process. Since no use case currently exists for changing namespaces all translations can be done relative to the pid namespace in use when fuse_conn_init() is called. For fuse this translates to mount time, and for cuse this is when /dev/cuse is opened. IO for this connection from another namespace will return errors. Requests from processes whose pid cannot be translated into the target namespace will have a value of 0 for in.h.pid. File locking changes based on previous work done by Eric Biederman. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-04-18fuse: convert fuse_req.count from atomic_t to refcount_tElena Reshetova1-5/+4
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-03-02sched/headers: Prepare to move signal wakeup & sigpending methods from ↵Ingo Molnar1-0/+1
<linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-16fuse: fix uninitialized flags in pipe_bufferMiklos Szeredi1-0/+1
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: d82718e348fe ("fuse_dev_splice_read(): switch to add_to_pipe()") Cc: <stable@vger.kernel.org> # 4.9+
2017-02-15fuse: fix use after free issue in fuse_dev_do_read()Sahitya Tummala1-0/+4
There is a potential race between fuse_dev_do_write() and request_wait_answer() contexts as shown below: TASK 1: __fuse_request_send(): |--spin_lock(&fiq->waitq.lock); |--queue_request(); |--spin_unlock(&fiq->waitq.lock); |--request_wait_answer(): |--if (test_bit(FR_SENT, &req->flags)) <gets pre-empted after it is validated true> TASK 2: fuse_dev_do_write(): |--clears bit FR_SENT, |--request_end(): |--sets bit FR_FINISHED |--spin_lock(&fiq->waitq.lock); |--list_del_init(&req->intr_entry); |--spin_unlock(&fiq->waitq.lock); |--fuse_put_request(); |--queue_interrupt(); <request gets queued to interrupts list> |--wake_up_locked(&fiq->waitq); |--wait_event_freezable(); <as FR_FINISHED is set, it returns and then the caller frees this request> Now, the next fuse_dev_do_read(), see interrupts list is not empty and then calls fuse_read_interrupt() which tries to access the request which is already free'd and gets the below crash: [11432.401266] Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b6b6b ... [11432.418518] Kernel BUG at ffffff80083720e0 [11432.456168] PC is at __list_del_entry+0x6c/0xc4 [11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474 ... [11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4 [11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474 [11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78 [11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8 [11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108 [11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94 As FR_FINISHED bit is set before deleting the intr_entry with input queue lock in request completion path, do the testing of this flag and queueing atomically with the same lock in queue_interrupt(). Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: fd22d62ed0c3 ("fuse: no fc->lock for iqueue parts") Cc: <stable@vger.kernel.org> # 4.2+
2017-01-13fuse: clear FR_PENDING flag when moving requests out of pending queueTahsin Erdogan1-1/+2
fuse_abort_conn() moves requests from pending list to a temporary list before canceling them. This operation races with request_wait_answer() which also tries to remove the request after it gets a fatal signal. It checks FR_PENDING flag to determine whether the request is still in the pending list. Make fuse_abort_conn() clear FR_PENDING flag so that request_wait_answer() does not remove the request from temporary list. This bug causes an Oops when trying to delete an already deleted list entry in end_requests(). Fixes: ee314a870e40 ("fuse: abort: no fc->lock needed for request ending") Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> # 4.2+
2016-10-08Merge branch 'work.splice_read' of ↵Linus Torvalds1-47/+16
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS splice updates from Al Viro: "There's a bunch of branches this cycle, both mine and from other folks and I'd rather send pull requests separately. This one is the conversion of ->splice_read() to ITER_PIPE iov_iter (and introduction of such). Gets rid of a lot of code in fs/splice.c and elsewhere; there will be followups, but these are for the next cycle... Some pipe/splice-related cleanups from Miklos in the same branch as well" * 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: pipe: fix comment in pipe_buf_operations pipe: add pipe_buf_steal() helper pipe: add pipe_buf_confirm() helper pipe: add pipe_buf_release() helper pipe: add pipe_buf_get() helper relay: simplify relay_file_read() switch default_file_splice_read() to use of pipe-backed iov_iter switch generic_file_splice_read() to use of ->read_iter() new iov_iter flavour: pipe-backed fuse_dev_splice_read(): switch to add_to_pipe() skb_splice_bits(): get rid of callback new helper: add_to_pipe() splice: lift pipe_lock out of splice_to_pipe() splice: switch get_iovec_page_array() to iov_iter splice_to_pipe(): don't open-code wakeup_pipe_readers() consistent treatment of EFAULT on O_DIRECT read/write
2016-10-06pipe: add pipe_buf_steal() helperMiklos Szeredi1-1/+1
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-06pipe: add pipe_buf_confirm() helperMiklos Szeredi1-2/+2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-06pipe: add pipe_buf_release() helperMiklos Szeredi1-4/+3
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-06pipe: add pipe_buf_get() helperMiklos Szeredi1-1/+1
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-04fuse_dev_splice_read(): switch to add_to_pipe()Al Viro1-37/+9
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-04splice: lift pipe_lock out of splice_to_pipe()Al Viro1-2/+0
* splice_to_pipe() stops at pipe overflow and does *not* take pipe_lock * ->splice_read() instances do the same * vmsplice_to_pipe() and do_splice() (ultimate callers of splice_to_pipe()) arrange for waiting, looping, etc. themselves. That should make pipe_lock the outermost one. Unfortunately, existing rules for the amount passed by vmsplice_to_pipe() and do_splice() are quite ugly _and_ userland code can be easily broken by changing those. It's not even "no more than the maximal capacity of this pipe" - it's "once we'd fed pipe->nr_buffers pages into the pipe, leave instead of waiting". Considering how poorly these rules are documented, let's try "wait for some space to appear, unless given SPLICE_F_NONBLOCK, then push into pipe and if we run into overflow, we are done". Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-01fuse: remove duplicate cs->offset assignmentMiklos Szeredi1-1/+0
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-07-29Merge branch 'for-linus' of ↵Linus Torvalds1-27/+3
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: "This fixes error propagation from writeback to fsync/close for writeback cache mode as well as adding a missing capability flag to the INIT message. The rest are cleanups. (The commits are recent but all the code actually sat in -next for a while now. The recommits are due to conflict avoidance and the addition of Cc: stable@...)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: use filemap_check_errors() mm: export filemap_check_errors() to modules fuse: fix wrong assignment of ->flags in fuse_send_init() fuse: fuse_flush must check mapping->flags for errors fuse: fsync() did not return IO errors fuse: don't mess with blocking signals new helper: wait_event_killable_exclusive() fuse: improve aio directIO write performance for size extending writes
2016-07-19fuse: don't mess with blocking signalsAl Viro1-27/+3
just use wait_event_killable{,_exclusive}(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-11vfs: make the string hashes salt the hashLinus Torvalds1-2/+0
We always mixed in the parent pointer into the dentry name hash, but we did it late at lookup time. It turns out that we can simplify that lookup-time action by salting the hash with the parent pointer early instead of late. A few other users of our string hashes also wanted to mix in their own pointers into the hash, and those are updated to use the same mechanism. Hash users that don't have any particular initial salt can just use the NULL pointer as a no-salt. Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov1-13/+13
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-16fs/fuse: fix ioctl type confusionJann Horn1-1/+9
fuse_dev_ioctl() performed fuse_get_dev() on a user-supplied fd, leading to a type confusion issue. Fix it by checking file->f_op. Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-01fuse: separate pqueue for clonesMiklos Szeredi1-26/+37
Make each fuse device clone refer to a separate processing queue. The only constraint on userspace code is that the request answer must be written to the same device clone as it was read off. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2015-07-01fuse: introduce per-instance fuse_dev structureMiklos Szeredi1-28/+42
Allow fuse device clones to refer to be distinguished. This patch just adds the infrastructure by associating a separate "struct fuse_dev" with each clone. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
2015-07-01fuse: device fd cloneMiklos Szeredi1-0/+40
Allow an open fuse device to be "cloned". Userspace can create a clone by: newfd = open("/dev/fuse", O_RDWR) ioctl(newfd, FUSE_DEV_IOC_CLONE, &oldfd); At this point newfd will refer to the same fuse connection as oldfd. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
2015-07-01fuse: abort: no fc->lock needed for request endingMiklos Szeredi1-9/+5
In fuse_abort_conn() when all requests are on private lists we no longer need fc->lock protection. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
2015-07-01fuse: no fc->lock for pqueue partsMiklos Szeredi1-14/+2
Remove fc->lock protection from processing queue members, now protected by fpq->lock. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reviewed-by: Ashish Samant <ashish.samant@oracle.com>