<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/exec.c, branch v3.16.50</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v3.16.50</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v3.16.50'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2017-10-12T14:28:22+00:00</updated>
<entry>
<title>sched: move no_new_privs into new atomic flags</title>
<updated>2017-10-12T14:28:22+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2014-05-21T22:23:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5f3333219189a3fe4bbbd2d67e636ed1bb74d243'/>
<id>urn:sha1:5f3333219189a3fe4bbbd2d67e636ed1bb74d243</id>
<content type='text'>
commit 1d4457f99928a968767f6405b4a1f50845aa15fd upstream.

Since seccomp transitions between threads requires updates to the
no_new_privs flag to be atomic, the flag must be part of an atomic flag
set. This moves the nnp flag into a separate task field, and introduces
accessors.

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>fs/exec.c: account for argv/envp pointers</title>
<updated>2017-07-18T17:40:40+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2017-06-23T22:08:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d7575b387f81c1a92e32c25dcb6a0e14458ae66c'/>
<id>urn:sha1:d7575b387f81c1a92e32c25dcb6a0e14458ae66c</id>
<content type='text'>
commit 98da7d08850fb8bdeb395d6368ed15753304aa0c upstream.

When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included.  This means
that an exec with huge numbers of tiny strings could eat 1/4 of the stack
limit in strings and then additional space would be later used by the
pointers to the strings.

For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721
single-byte strings would consume less than 2MB of stack, the max (8MB /
4) amount allowed, but the pointers to the strings would consume the
remaining additional stack space (1677721 * 4 == 6710884).

The result (1677721 + 6710884 == 8388605) would exhaust stack space
entirely.  Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-1000365).

[akpm@linux-foundation.org: additional commenting from Kees]
Fixes: b6a2fea39318 ("mm: variable length argument support")
Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Qualys Security Advisory &lt;qsa@qualys.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.16: use ACCESS_ONCE() instead of READ_ONCE()]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>vfs: Commit to never having exectuables on proc and sysfs.</title>
<updated>2017-04-04T21:21:56+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-06-29T19:42:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=495d1af4041fcb3cb726ca414d3c09cbb251fefa'/>
<id>urn:sha1:495d1af4041fcb3cb726ca414d3c09cbb251fefa</id>
<content type='text'>
commit 22f6b4d34fcf039c63a94e7670e0da24f8575a5a upstream.

Today proc and sysfs do not contain any executable files.  Several
applications today mount proc or sysfs without noexec and nosuid and
then depend on there being no exectuables files on proc or sysfs.
Having any executable files show on proc or sysfs would cause
a user space visible regression, and most likely security problems.

Therefore commit to never allowing executables on proc and sysfs by
adding a new flag to mark them as filesystems without executables and
enforce that flag.

Test the flag where MNT_NOEXEC is tested today, so that the only user
visible effect will be that exectuables will be treated as if the
execute bit is cleared.

The filesystems proc and sysfs do not currently incoporate any
executable files so this does not result in any user visible effects.

This makes it unnecessary to vet changes to proc and sysfs tightly for
adding exectuable files or changes to chattr that would modify
existing files, as no matter what the individual file say they will
not be treated as exectuable files by the vfs.

Not having to vet changes to closely is important as without this we
are only one proc_create call (or another goof up in the
implementation of notify_change) from having problematic executables
on proc.  Those mistakes are all too easy to make and would create
a situation where there are security issues or the assumptions of
some program having to be broken (and cause userspace regressions).

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
[bwh: Backported to 3.16: we don't have super_block::s_iflags; use
 file_system_type::fs_flags instead]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>fs: exec: apply CLOEXEC before changing dumpable task flags</title>
<updated>2017-03-16T02:26:38+00:00</updated>
<author>
<name>Aleksa Sarai</name>
<email>asarai@suse.de</email>
</author>
<published>2016-12-21T05:26:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8f7ceb94581c9bc3f91f5ef29e3efc3a67de0b57'/>
<id>urn:sha1:8f7ceb94581c9bc3f91f5ef29e3efc3a67de0b57</id>
<content type='text'>
commit 613cc2b6f272c1a8ad33aefa21cad77af23139f7 upstream.

If you have a process that has set itself to be non-dumpable, and it
then undergoes exec(2), any CLOEXEC file descriptors it has open are
"exposed" during a race window between the dumpable flags of the process
being reset for exec(2) and CLOEXEC being applied to the file
descriptors. This can be exploited by a process by attempting to access
/proc/&lt;pid&gt;/fd/... during this window, without requiring CAP_SYS_PTRACE.

The race in question is after set_dumpable has been (for get_link,
though the trace is basically the same for readlink):

[vfs]
-&gt; proc_pid_link_inode_operations.get_link
   -&gt; proc_pid_get_link
      -&gt; proc_fd_access_allowed
         -&gt; ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);

Which will return 0, during the race window and CLOEXEC file descriptors
will still be open during this window because do_close_on_exec has not
been called yet. As a result, the ordering of these calls should be
reversed to avoid this race window.

This is of particular concern to container runtimes, where joining a
PID namespace with file descriptors referring to the host filesystem
can result in security issues (since PRCTL_SET_DUMPABLE doesn't protect
against access of CLOEXEC file descriptors -- file descriptors which may
reference filesystem objects the container shouldn't have access to).

Cc: dev@opencontainers.org
Reported-by: Michael Crosby &lt;crosbymichael@gmail.com&gt;
Signed-off-by: Aleksa Sarai &lt;asarai@suse.de&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures</title>
<updated>2015-05-28T09:00:03+00:00</updated>
<author>
<name>Helge Deller</name>
<email>deller@gmx.de</email>
</author>
<published>2015-05-11T20:01:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=441b34775e8375194ce741e53f0e19241749ef35'/>
<id>urn:sha1:441b34775e8375194ce741e53f0e19241749ef35</id>
<content type='text'>
commit d045c77c1a69703143a36169c224429c48b9eecd upstream.

On architectures where the stack grows upwards (CONFIG_STACK_GROWSUP=y,
currently parisc and metag only) stack randomization sometimes leads to crashes
when the stack ulimit is set to lower values than STACK_RND_MASK (which is 8 MB
by default if not defined in arch-specific headers).

The problem is, that when the stack vm_area_struct is set up in fs/exec.c, the
additional space needed for the stack randomization (as defined by the value of
STACK_RND_MASK) was not taken into account yet and as such, when the stack
randomization code added a random offset to the stack start, the stack
effectively got smaller than what the user defined via rlimit_max(RLIMIT_STACK)
which then sometimes leads to out-of-stack situations and crashes.

This patch fixes it by adding the maximum possible amount of memory (based on
STACK_RND_MASK) which theoretically could be added by the stack randomization
code to the initial stack size. That way, the user-defined stack size is always
guaranteed to be at minimum what is defined via rlimit_max(RLIMIT_STACK).

This bug is currently not visible on the metag architecture, because on metag
STACK_RND_MASK is defined to 0 which effectively disables stack randomization.

The changes to fs/exec.c are inside an "#ifdef CONFIG_STACK_GROWSUP"
section, so it does not affect other platformws beside those where the
stack grows upwards (parisc and metag).

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Cc: linux-parisc@vger.kernel.org
Cc: James Hogan &lt;james.hogan@imgtec.com&gt;
Cc: linux-metag@vger.kernel.org
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>fs: take i_mutex during prepare_binprm for set[ug]id executables</title>
<updated>2015-04-28T11:35:37+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jann@thejh.net</email>
</author>
<published>2015-04-19T00:48:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fb8aa6acf19e19953b9bd25c52409406a0f7ea62'/>
<id>urn:sha1:fb8aa6acf19e19953b9bd25c52409406a0f7ea62</id>
<content type='text'>
commit 8b01fc86b9f425899f8a3a8fc1c47d73c2c20543 upstream.

This prevents a race between chown() and execve(), where chowning a
setuid-user binary to root would momentarily make the binary setuid
root.

This patch was mostly written by Linus Torvalds.

Signed-off-by: Jann Horn &lt;jann@thejh.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[ luis: backported to 3.16:
  - relaced task_no_new_privs() by current-&gt;no_new_privs
  - replaced READ_ONCE() by ACCESS_ONCE() ]
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>perf: Differentiate exec() and non-exec() comm events</title>
<updated>2014-06-06T05:56:22+00:00</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2014-05-28T08:45:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=82b897782d10fcc4930c9d4a15b175348fdd2871'/>
<id>urn:sha1:82b897782d10fcc4930c9d4a15b175348fdd2871</id>
<content type='text'>
perf tools like 'perf report' can aggregate samples by comm strings,
which generally works.  However, there are other potential use-cases.
For example, to pair up 'calls' with 'returns' accurately (from branch
events like Intel BTS) it is necessary to identify whether the process
has exec'd.  Although a comm event is generated when an 'exec' happens
it is also generated whenever the comm string is changed on a whim
(e.g. by prctl PR_SET_NAME).  This patch adds a flag to the comm event
to differentiate one case from the other.

In order to determine whether the kernel supports the new flag, a
selection bit named 'exec' is added to struct perf_event_attr.  The
bit does nothing but will cause perf_event_open() to fail if the bit
is set on kernels that do not have it defined.

Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/537D9EBE.7030806@intel.com
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Fix perf_event_comm() vs. exec() assumption</title>
<updated>2014-06-06T05:54:02+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-05-21T15:32:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e041e328c4b41e1db79bfe5ba9992c2ed771ad19'/>
<id>urn:sha1:e041e328c4b41e1db79bfe5ba9992c2ed771ad19</id>
<content type='text'>
perf_event_comm() assumes that set_task_comm() is only called on
exec(), and in particular that its only called on current.

Neither are true, as Dave reported a WARN triggered by set_task_comm()
being called on !current.

Separate the exec() hook from the comm hook.

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20140521153219.GH5226@laptop.programming.kicks-ass.net
[ Build fix. ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>metag: Reduce maximum stack size to 256MB</title>
<updated>2014-05-14T23:00:35+00:00</updated>
<author>
<name>James Hogan</name>
<email>james.hogan@imgtec.com</email>
</author>
<published>2014-05-13T22:58:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d71f290b4e98a39f49f2595a13be3b4d5ce8e1f1'/>
<id>urn:sha1:d71f290b4e98a39f49f2595a13be3b4d5ce8e1f1</id>
<content type='text'>
Specify the maximum stack size for arches where the stack grows upward
(parisc and metag) in asm/processor.h rather than hard coding in
fs/exec.c so that metag can specify a smaller value of 256MB rather than
1GB.

This fixes a BUG on metag if the RLIMIT_STACK hard limit is increased
beyond a safe value by root. E.g. when starting a process after running
"ulimit -H -s unlimited" it will then attempt to use a stack size of the
maximum 1GB which is far too big for metag's limited user virtual
address space (stack_top is usually 0x3ffff000):

BUG: failure at fs/exec.c:589/shift_arg_pages()!

Signed-off-by: James Hogan &lt;james.hogan@imgtec.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: "James E.J. Bottomley" &lt;jejb@parisc-linux.org&gt;
Cc: linux-parisc@vger.kernel.org
Cc: linux-metag@vger.kernel.org
Cc: John David Anglin &lt;dave.anglin@bell.net&gt;
Cc: stable@vger.kernel.org # only needed for &gt;= v3.9 (arch/metag)
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2014-04-12T21:49:50+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-04-12T21:49:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5166701b368caea89d57b14bf41cf39e819dad51'/>
<id>urn:sha1:5166701b368caea89d57b14bf41cf39e819dad51</id>
<content type='text'>
Pull vfs updates from Al Viro:
 "The first vfs pile, with deep apologies for being very late in this
  window.

  Assorted cleanups and fixes, plus a large preparatory part of iov_iter
  work.  There's a lot more of that, but it'll probably go into the next
  merge window - it *does* shape up nicely, removes a lot of
  boilerplate, gets rid of locking inconsistencie between aio_write and
  splice_write and I hope to get Kent's direct-io rewrite merged into
  the same queue, but some of the stuff after this point is having
  (mostly trivial) conflicts with the things already merged into
  mainline and with some I want more testing.

  This one passes LTP and xfstests without regressions, in addition to
  usual beating.  BTW, readahead02 in ltp syscalls testsuite has started
  giving failures since "mm/readahead.c: fix readahead failure for
  memoryless NUMA nodes and limit readahead pages" - might be a false
  positive, might be a real regression..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  missing bits of "splice: fix racy pipe-&gt;buffers uses"
  cifs: fix the race in cifs_writev()
  ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
  kill generic_file_buffered_write()
  ocfs2_file_aio_write(): switch to generic_perform_write()
  ceph_aio_write(): switch to generic_perform_write()
  xfs_file_buffered_aio_write(): switch to generic_perform_write()
  export generic_perform_write(), start getting rid of generic_file_buffer_write()
  generic_file_direct_write(): get rid of ppos argument
  btrfs_file_aio_write(): get rid of ppos
  kill the 5th argument of generic_file_buffered_write()
  kill the 4th argument of __generic_file_aio_write()
  lustre: don't open-code kernel_recvmsg()
  ocfs2: don't open-code kernel_recvmsg()
  drbd: don't open-code kernel_recvmsg()
  constify blk_rq_map_user_iov() and friends
  lustre: switch to kernel_sendmsg()
  ocfs2: don't open-code kernel_sendmsg()
  take iov_iter stuff to mm/iov_iter.c
  process_vm_access: tidy up a bit
  ...
</content>
</entry>
</feed>
