<feed xmlns='http://www.w3.org/2005/Atom'>
<title>BMC/Intel-BMC/linux.git/fs/exec.c, branch v3.3</title>
<subtitle>Intel OpenBMC Linux kernel source tree (mirror)</subtitle>
<id>https://git.radix-linux.su/BMC/Intel-BMC/linux.git/atom?h=v3.3</id>
<link rel='self' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/atom?h=v3.3'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/'/>
<updated>2012-03-05T23:49:42+00:00</updated>
<entry>
<title>coredump_wait: don't call complete_vfork_done()</title>
<updated>2012-03-05T23:49:42+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-03-05T22:59:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=57b59c4a1400fa6c34764eab2e35a8762dc05a09'/>
<id>urn:sha1:57b59c4a1400fa6c34764eab2e35a8762dc05a09</id>
<content type='text'>
Now that CLONE_VFORK is killable, coredump_wait() no longer needs
complete_vfork_done().  zap_threads() should find and kill all tasks with
the same -&gt;mm, this includes our parent if -&gt;vfork_done is set.

mm_release() becomes the only caller, unexport complete_vfork_done().

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>vfork: introduce complete_vfork_done()</title>
<updated>2012-03-05T23:49:42+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-03-05T22:59:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=c415c3b47ea2754659d915cca387a20999044163'/>
<id>urn:sha1:c415c3b47ea2754659d915cca387a20999044163</id>
<content type='text'>
No functional changes.

Move the clear-and-complete-vfork_done code into the new trivial helper,
complete_vfork_done().

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>exec: fix use-after-free bug in setup_new_exec()</title>
<updated>2012-02-06T23:15:20+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2012-02-04T09:47:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=96e02d1586782eadf051fa3d6bc4132d2447ac2c'/>
<id>urn:sha1:96e02d1586782eadf051fa3d6bc4132d2447ac2c</id>
<content type='text'>
Setting the task name is done within setup_new_exec() by accessing
bprm-&gt;filename. However this happens after flush_old_exec().
This may result in a use after free bug, flush_old_exec() may
"complete" vfork_done, which will wake up the parent which in turn
may free the passed in filename.
To fix this add a new tcomm field in struct linux_binprm which
contains the now early generated task name until it is used.

Fixes this bug on s390:

  Unable to handle kernel pointer dereference at virtual kernel address 0000000039768000
  Process kworker/u:3 (pid: 245, task: 000000003a3dc840, ksp: 0000000039453818)
  Krnl PSW : 0704000180000000 0000000000282e94 (setup_new_exec+0xa0/0x374)
  Call Trace:
  ([&lt;0000000000282e2c&gt;] setup_new_exec+0x38/0x374)
   [&lt;00000000002dd12e&gt;] load_elf_binary+0x402/0x1bf4
   [&lt;0000000000280a42&gt;] search_binary_handler+0x38e/0x5bc
   [&lt;0000000000282b6c&gt;] do_execve_common+0x410/0x514
   [&lt;0000000000282cb6&gt;] do_execve+0x46/0x58
   [&lt;00000000005bce58&gt;] kernel_execve+0x28/0x70
   [&lt;000000000014ba2e&gt;] ____call_usermodehelper+0x102/0x140
   [&lt;00000000005bc8da&gt;] kernel_thread_starter+0x6/0xc
   [&lt;00000000005bc8d4&gt;] kernel_thread_starter+0x0/0xc
  Last Breaking-Event-Address:
   [&lt;00000000002830f0&gt;] setup_new_exec+0x2fc/0x374

  Kernel panic - not syncing: Fatal exception: panic_on_oops

Reported-by: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>tracepoint: add tracepoints for debugging oom_score_adj</title>
<updated>2012-01-11T00:30:44+00:00</updated>
<author>
<name>KAMEZAWA Hiroyuki</name>
<email>kamezawa.hiroyu@jp.fujitsu.com</email>
</author>
<published>2012-01-10T23:08:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=43d2b113241d6797b890318767e0af78e313414b'/>
<id>urn:sha1:43d2b113241d6797b890318767e0af78e313414b</id>
<content type='text'>
oom_score_adj is used for guarding processes from OOM-Killer.  One of
problem is that it's inherited at fork().  When a daemon set oom_score_adj
and make children, it's hard to know where the value is set.

This patch adds some tracepoints useful for debugging. This patch adds
3 trace points.
  - creating new task
  - renaming a task (exec)
  - set oom_score_adj

To debug, users need to enable some trace pointer. Maybe filtering is useful as

# EVENT=/sys/kernel/debug/tracing/events/task/
# echo "oom_score_adj != 0" &gt; $EVENT/task_newtask/filter
# echo "oom_score_adj != 0" &gt; $EVENT/task_rename/filter
# echo 1 &gt; $EVENT/enable
# EVENT=/sys/kernel/debug/tracing/events/oom/
# echo 1 &gt; $EVENT/enable

output will be like this.
# grep oom /sys/kernel/debug/tracing/trace
bash-7699  [007] d..3  5140.744510: oom_score_adj_update: pid=7699 comm=bash oom_score_adj=-1000
bash-7699  [007] ...1  5151.818022: task_newtask: pid=7729 comm=bash clone_flags=1200011 oom_score_adj=-1000
ls-7729  [003] ...2  5151.818504: task_rename: pid=7729 oldcomm=bash newcomm=ls oom_score_adj=-1000
bash-7699  [002] ...1  5175.701468: task_newtask: pid=7730 comm=bash clone_flags=1200011 oom_score_adj=-1000
grep-7730  [007] ...2  5175.701993: task_rename: pid=7730 oldcomm=bash newcomm=grep oom_score_adj=-1000

Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>trim fs/internal.h</title>
<updated>2012-01-04T03:52:35+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-11-22T02:15:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=f47ec3f28354795f000c14bf18ed967ec81a3ec3'/>
<id>urn:sha1:f47ec3f28354795f000c14bf18ed967ec81a3ec3</id>
<content type='text'>
some stuff in there can actually become static; some belongs to pnode.h
as it's a private interface between namespace.c and pnode.c...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>oom: remove oom_disable_count</title>
<updated>2011-11-01T00:30:45+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2011-11-01T00:07:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=c9f01245b6a7d77d17deaa71af10f6aca14fa24e'/>
<id>urn:sha1:c9f01245b6a7d77d17deaa71af10f6aca14fa24e</id>
<content type='text'>
This removes mm-&gt;oom_disable_count entirely since it's unnecessary and
currently buggy.  The counter was intended to be per-process but it's
currently decremented in the exit path for each thread that exits, causing
it to underflow.

The count was originally intended to prevent oom killing threads that
share memory with threads that cannot be killed since it doesn't lead to
future memory freeing.  The counter could be fixed to represent all
threads sharing the same mm, but it's better to remove the count since:

 - it is possible that the OOM_DISABLE thread sharing memory with the
   victim is waiting on that thread to exit and will actually cause
   future memory freeing, and

 - there is no guarantee that a thread is disabled from oom killing just
   because another thread sharing its mm is oom disabled.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ying Han &lt;yinghan@google.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>move RLIMIT_NPROC check from set_user() to do_execve_common()</title>
<updated>2011-08-11T18:24:42+00:00</updated>
<author>
<name>Vasiliy Kulikov</name>
<email>segoon@openwall.com</email>
</author>
<published>2011-08-08T15:02:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=72fa59970f8698023045ab0713d66f3f4f96945c'/>
<id>urn:sha1:72fa59970f8698023045ab0713d66f3f4f96945c</id>
<content type='text'>
The patch http://lkml.org/lkml/2003/7/13/226 introduced an RLIMIT_NPROC
check in set_user() to check for NPROC exceeding via setuid() and
similar functions.

Before the check there was a possibility to greatly exceed the allowed
number of processes by an unprivileged user if the program relied on
rlimit only.  But the check created new security threat: many poorly
written programs simply don't check setuid() return code and believe it
cannot fail if executed with root privileges.  So, the check is removed
in this patch because of too often privilege escalations related to
buggy programs.

The NPROC can still be enforced in the common code flow of daemons
spawning user processes.  Most of daemons do fork()+setuid()+execve().
The check introduced in execve() (1) enforces the same limit as in
setuid() and (2) doesn't create similar security issues.

Neil Brown suggested to track what specific process has exceeded the
limit by setting PF_NPROC_EXCEEDED process flag.  With the change only
this process would fail on execve(), and other processes' execve()
behaviour is not changed.

Solar Designer suggested to re-check whether NPROC limit is still
exceeded at the moment of execve().  If the process was sleeping for
days between set*uid() and execve(), and the NPROC counter step down
under the limit, the defered execve() failure because NPROC limit was
exceeded days ago would be unexpected.  If the limit is not exceeded
anymore, we clear the flag on successful calls to execve() and fork().

The flag is also cleared on successful calls to set_user() as the limit
was exceeded for the previous user, not the current one.

Similar check was introduced in -ow patches (without the process flag).

v3 - clear PF_NPROC_EXCEEDED on successful calls to set_user().

Reviewed-by: James Morris &lt;jmorris@namei.org&gt;
Signed-off-by: Vasiliy Kulikov &lt;segoon@openwall.com&gt;
Acked-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/exec.c:acct_arg_size(): ptl is no longer needed for add_mm_counter()</title>
<updated>2011-07-26T23:49:44+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2011-07-26T23:08:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=32e107f71e4a993ac438f0049aa4019457911ffb'/>
<id>urn:sha1:32e107f71e4a993ac438f0049aa4019457911ffb</id>
<content type='text'>
acct_arg_size() takes -&gt;page_table_lock around add_mm_counter() if
!SPLIT_RSS_COUNTING.  This is not needed after commit 172703b08cd0 ("mm:
delete non-atomic mm counter implementation").

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Matt Fleming &lt;matt.fleming@linux.intel.com&gt;
Cc: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>exec: do not retry load_binary method if CONFIG_MODULES=n</title>
<updated>2011-07-26T23:49:44+00:00</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@i-love.sakura.ne.jp</email>
</author>
<published>2011-07-26T23:08:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=b4edf8bd06916645b57df23a720b17cae4051c43'/>
<id>urn:sha1:b4edf8bd06916645b57df23a720b17cae4051c43</id>
<content type='text'>
If CONFIG_MODULES=n, it makes no sense to retry the list of binary formats
handler because the list will not be modified by request_module().

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>exec: do not call request_module() twice from search_binary_handler()</title>
<updated>2011-07-26T23:49:44+00:00</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2011-07-26T23:08:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=912193521b719fbfc2f16776febf5232fe8ba261'/>
<id>urn:sha1:912193521b719fbfc2f16776febf5232fe8ba261</id>
<content type='text'>
Currently, search_binary_handler() tries to load binary loader module
using request_module() if a loader for the requested program is not yet
loaded.  But second attempt of request_module() does not affect the result
of search_binary_handler().

If request_module() triggered recursion, calling request_module() twice
causes 2 to the power of MAX_KMOD_CONCURRENT (= 50) repetitions.  It is
not an infinite loop but is sufficient for users to consider as a hang up.

Therefore, this patch changes not to call request_module() twice, making 1
to the power of MAX_KMOD_CONCURRENT repetitions in case of recursion.

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Reported-by: Richard Weinberger &lt;richard@nod.at&gt;
Tested-by: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
