<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/ipc/mqueue.c, branch v5.10.78</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.78</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.78'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-05-26T10:06:54+00:00</updated>
<entry>
<title>ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry</title>
<updated>2021-05-26T10:06:54+00:00</updated>
<author>
<name>Varad Gautam</name>
<email>varad.gautam@suse.com</email>
</author>
<published>2021-05-23T00:41:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4528c0c323085e645b8765913b4a7fd42cf49b65'/>
<id>urn:sha1:4528c0c323085e645b8765913b4a7fd42cf49b65</id>
<content type='text'>
commit a11ddb37bf367e6b5239b95ca759e5389bb46048 upstream.

do_mq_timedreceive calls wq_sleep with a stack local address.  The
sender (do_mq_timedsend) uses this address to later call pipelined_send.

This leads to a very hard to trigger race where a do_mq_timedreceive
call might return and leave do_mq_timedsend to rely on an invalid
address, causing the following crash:

  RIP: 0010:wake_q_add_safe+0x13/0x60
  Call Trace:
   __x64_sys_mq_timedsend+0x2a9/0x490
   do_syscall_64+0x80/0x680
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f5928e40343

The race occurs as:

1. do_mq_timedreceive calls wq_sleep with the address of `struct
   ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it
   holds a valid `struct ext_wait_queue *` as long as the stack has not
   been overwritten.

2. `ewq_addr` gets added to info-&gt;e_wait_q[RECV].list in wq_add, and
   do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call
   __pipelined_op.

3. Sender calls __pipelined_op::smp_store_release(&amp;this-&gt;state,
   STATE_READY).  Here is where the race window begins.  (`this` is
   `ewq_addr`.)

4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it
   will see `state == STATE_READY` and break.

5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed
   to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's
   stack.  (Although the address may not get overwritten until another
   function happens to touch it, which means it can persist around for an
   indefinite time.)

6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a
   `struct ext_wait_queue *`, and uses it to find a task_struct to pass to
   the wake_q_add_safe call.  In the lucky case where nothing has
   overwritten `ewq_addr` yet, `ewq_addr-&gt;task` is the right task_struct.
   In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a
   bogus address as the receiver's task_struct causing the crash.

do_mq_timedsend::__pipelined_op() should not dereference `this` after
setting STATE_READY, as the receiver counterpart is now free to return.
Change __pipelined_op to call wake_q_add_safe on the receiver's
task_struct returned by get_task_struct, instead of dereferencing `this`
which sits on the receiver's stack.

As Manfred pointed out, the race potentially also exists in
ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare.  Fix
those in the same way.

Link: https://lkml.kernel.org/r/20210510102950.12551-1-varad.gautam@suse.com
Fixes: c5b2cbdbdac563 ("ipc/mqueue.c: update/document memory barriers")
Fixes: 8116b54e7e23ef ("ipc/sem.c: document and update memory barriers")
Fixes: 0d97a82ba830d8 ("ipc/msg.c: update and document memory barriers")
Signed-off-by: Varad Gautam &lt;varad.gautam@suse.com&gt;
Reported-by: Matthias von Faber &lt;matthias.vonfaber@aox-tech.de&gt;
Acked-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Acked-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()</title>
<updated>2020-05-08T02:27:20+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2020-05-08T01:35:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b5f2006144c6ae941726037120fa1001ddede784'/>
<id>urn:sha1:b5f2006144c6ae941726037120fa1001ddede784</id>
<content type='text'>
Commit cc731525f26a ("signal: Remove kernel interal si_code magic")
changed the value of SI_FROMUSER(SI_MESGQ), this means that mq_notify() no
longer works if the sender doesn't have rights to send a signal.

Change __do_notify() to use do_send_sig_info() instead of kill_pid_info()
to avoid check_kill_permission().

This needs the additional notify.sigev_signo != 0 check, shouldn't we
change do_mq_notify() to deny sigev_signo == 0 ?

Test-case:

	#include &lt;signal.h&gt;
	#include &lt;mqueue.h&gt;
	#include &lt;unistd.h&gt;
	#include &lt;sys/wait.h&gt;
	#include &lt;assert.h&gt;

	static int notified;

	static void sigh(int sig)
	{
		notified = 1;
	}

	int main(void)
	{
		signal(SIGIO, sigh);

		int fd = mq_open("/mq", O_RDWR|O_CREAT, 0666, NULL);
		assert(fd &gt;= 0);

		struct sigevent se = {
			.sigev_notify	= SIGEV_SIGNAL,
			.sigev_signo	= SIGIO,
		};
		assert(mq_notify(fd, &amp;se) == 0);

		if (!fork()) {
			assert(setuid(1) == 0);
			mq_send(fd, "",1,0);
			return 0;
		}

		wait(NULL);
		mq_unlink("/mq");
		assert(notified);
		return 0;
	}

[manfred@colorfullife.com: 1) Add self_exec_id evaluation so that the implementation matches do_notify_parent 2) use PIDTYPE_TGID everywhere]
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic")
Reported-by: Yoji &lt;yoji.fujihar.min@gmail.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Cc: &lt;1vier1@web.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/e2a782e4-eab9-4f5c-c749-c07a8f7a4e66@colorfullife.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/mqueue.c: fix a brace coding style issue</title>
<updated>2020-04-07T17:43:45+00:00</updated>
<author>
<name>Somala Swaraj</name>
<email>somalaswaraj@gmail.com</email>
</author>
<published>2020-04-07T03:12:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=43afe4d3661b04915aa357c5ab0f2a08718af9fd'/>
<id>urn:sha1:43afe4d3661b04915aa357c5ab0f2a08718af9fd</id>
<content type='text'>
Signed-off-by: somala swaraj &lt;somalaswaraj@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/20200301135530.18340-1-somalaswaraj@gmail.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/mqueue.c: update/document memory barriers</title>
<updated>2020-02-04T03:05:23+00:00</updated>
<author>
<name>Manfred Spraul</name>
<email>manfred@colorfullife.com</email>
</author>
<published>2020-02-04T01:34:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c5b2cbdbdac563f46ecd5e187253ab1abbd6fc04'/>
<id>urn:sha1:c5b2cbdbdac563f46ecd5e187253ab1abbd6fc04</id>
<content type='text'>
Update and document memory barriers for mqueue.c:

- ewp-&gt;state is read without any locks, thus READ_ONCE is required.

- add smp_aquire__after_ctrl_dep() after the READ_ONCE, we need
  acquire semantics if the value is STATE_READY.

- use wake_q_add_safe()

- document why __set_current_state() may be used:
  Reading task-&gt;state cannot happen before the wake_q_add() call,
  which happens while holding info-&gt;lock. Thus the spin_unlock()
  is the RELEASE, and the spin_lock() is the ACQUIRE.

For completeness: there is also a 3 CPU scenario, if the to be woken
up task is already on another wake_q.
Then:
- CPU1: spin_unlock() of the task that goes to sleep is the RELEASE
- CPU2: the spin_lock() of the waker is the ACQUIRE
- CPU2: smp_mb__before_atomic inside wake_q_add() is the RELEASE
- CPU3: smp_mb__after_spinlock() inside try_to_wake_up() is the ACQUIRE

Link: http://lkml.kernel.org/r/20191020123305.14715-4-manfred@colorfullife.com
Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Reviewed-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: &lt;1vier1@web.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/mqueue.c: remove duplicated code</title>
<updated>2020-02-04T03:05:23+00:00</updated>
<author>
<name>Davidlohr Bueso</name>
<email>dave@stgolabs.net</email>
</author>
<published>2020-02-04T01:34:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ed29f171518cbe11c81e8c20d393bb094a9e2ce7'/>
<id>urn:sha1:ed29f171518cbe11c81e8c20d393bb094a9e2ce7</id>
<content type='text'>
pipelined_send() and pipelined_receive() are identical, so merge them.

[manfred@colorfullife.com: add changelog]
Link: http://lkml.kernel.org/r/20191020123305.14715-3-manfred@colorfullife.com
Signed-off-by: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: &lt;1vier1@web.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/mqueue: improve exception handling in do_mq_notify()</title>
<updated>2019-09-26T00:51:41+00:00</updated>
<author>
<name>Markus Elfring</name>
<email>elfring@users.sourceforge.net</email>
</author>
<published>2019-09-25T23:48:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c231740dd95e854de5034cff8f49737d942bc098'/>
<id>urn:sha1:c231740dd95e854de5034cff8f49737d942bc098</id>
<content type='text'>
Null pointers were assigned to local variables in a few cases as exception
handling.  The jump target “out” was used where no meaningful data
processing actions should eventually be performed by branches of an if
statement then.  Use an additional jump target for calling dev_kfree_skb()
directly.

Return also directly after error conditions were detected when no extra
clean-up is needed by this function implementation.

Link: http://lkml.kernel.org/r/592ef10e-0b69-72d0-9789-fc48f638fdfd@web.de
Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/mqueue.c: delete an unnecessary check before the macro call dev_kfree_skb()</title>
<updated>2019-09-26T00:51:41+00:00</updated>
<author>
<name>Markus Elfring</name>
<email>elfring@users.sourceforge.net</email>
</author>
<published>2019-09-25T23:48:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=97b0b1ad58fab9f71b1a6bc056a09af6065ec3bc'/>
<id>urn:sha1:97b0b1ad58fab9f71b1a6bc056a09af6065ec3bc</id>
<content type='text'>
dev_kfree_skb() input parameter validation, thus the test around the call
is not needed.

This issue was detected by using the Coccinelle software.

Link: http://lkml.kernel.org/r/07477187-63e5-cc80-34c1-32dd16b38e12@web.de
Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>new helper: get_tree_keyed()</title>
<updated>2019-09-05T18:34:22+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2019-09-03T23:05:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=533770cc0ae84890624dc129609f3d75855c8982'/>
<id>urn:sha1:533770cc0ae84890624dc129609f3d75855c8982</id>
<content type='text'>
For vfs_get_keyed_super users.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2019-07-19T17:42:02+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-19T17:42:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=933a90bf4f3505f8ec83bda21a3c7d70d7c2b426'/>
<id>urn:sha1:933a90bf4f3505f8ec83bda21a3c7d70d7c2b426</id>
<content type='text'>
Pull vfs mount updates from Al Viro:
 "The first part of mount updates.

  Convert filesystems to use the new mount API"

* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  mnt_init(): call shmem_init() unconditionally
  constify ksys_mount() string arguments
  don't bother with registering rootfs
  init_rootfs(): don't bother with init_ramfs_fs()
  vfs: Convert smackfs to use the new mount API
  vfs: Convert selinuxfs to use the new mount API
  vfs: Convert securityfs to use the new mount API
  vfs: Convert apparmorfs to use the new mount API
  vfs: Convert openpromfs to use the new mount API
  vfs: Convert xenfs to use the new mount API
  vfs: Convert gadgetfs to use the new mount API
  vfs: Convert oprofilefs to use the new mount API
  vfs: Convert ibmasmfs to use the new mount API
  vfs: Convert qib_fs/ipathfs to use the new mount API
  vfs: Convert efivarfs to use the new mount API
  vfs: Convert configfs to use the new mount API
  vfs: Convert binfmt_misc to use the new mount API
  convenience helper: get_tree_single()
  convenience helper get_tree_nodev()
  vfs: Kill sget_userns()
  ...
</content>
</entry>
<entry>
<title>ipc/mqueue.c: only perform resource calculation if user valid</title>
<updated>2019-07-17T02:23:24+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2019-07-16T23:30:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a318f12ed8843cfac53198390c74a565c632f417'/>
<id>urn:sha1:a318f12ed8843cfac53198390c74a565c632f417</id>
<content type='text'>
Andreas Christoforou reported:

  UBSAN: Undefined behaviour in ipc/mqueue.c:414:49 signed integer overflow:
  9 * 2305843009213693951 cannot be represented in type 'long int'
  ...
  Call Trace:
    mqueue_evict_inode+0x8e7/0xa10 ipc/mqueue.c:414
    evict+0x472/0x8c0 fs/inode.c:558
    iput_final fs/inode.c:1547 [inline]
    iput+0x51d/0x8c0 fs/inode.c:1573
    mqueue_get_inode+0x8eb/0x1070 ipc/mqueue.c:320
    mqueue_create_attr+0x198/0x440 ipc/mqueue.c:459
    vfs_mkobj+0x39e/0x580 fs/namei.c:2892
    prepare_open ipc/mqueue.c:731 [inline]
    do_mq_open+0x6da/0x8e0 ipc/mqueue.c:771

Which could be triggered by:

        struct mq_attr attr = {
                .mq_flags = 0,
                .mq_maxmsg = 9,
                .mq_msgsize = 0x1fffffffffffffff,
                .mq_curmsgs = 0,
        };

        if (mq_open("/testing", 0x40, 3, &amp;attr) == (mqd_t) -1)
                perror("mq_open");

mqueue_get_inode() was correctly rejecting the giant mq_msgsize, and
preparing to return -EINVAL.  During the cleanup, it calls
mqueue_evict_inode() which performed resource usage tracking math for
updating "user", before checking if there was a valid "user" at all
(which would indicate that the calculations would be sane).  Instead,
delay this check to after seeing a valid "user".

The overflow was real, but the results went unused, so while the flaw is
harmless, it's noisy for kernel fuzzers, so just fix it by moving the
calculation under the non-NULL "user" where it actually gets used.

Link: http://lkml.kernel.org/r/201906072207.ECB65450@keescook
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reported-by: Andreas Christoforou &lt;andreaschristofo@gmail.com&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
