<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/sched.h, branch linux-2.6.28.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-2.6.28.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-2.6.28.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2009-05-02T17:56:49+00:00</updated>
<entry>
<title>sched: do not count frozen tasks toward load</title>
<updated>2009-05-02T17:56:49+00:00</updated>
<author>
<name>Nathan Lynch</name>
<email>ntl@pobox.com</email>
</author>
<published>2009-04-09T18:20:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2c8a5ffdefb7e231f3d8ecdc029112a646c1562e'/>
<id>urn:sha1:2c8a5ffdefb7e231f3d8ecdc029112a646c1562e</id>
<content type='text'>
upstream commit: e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd

Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).

Some applications which perform job-scheduling functions consult the
load average when making decisions.  If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications.  This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs.  Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.

Change task_contributes_to_load() to return false if the task is
frozen.  This results in /proc/loadavg behavior that better meets
users' expectations.

Signed-off-by: Nathan Lynch &lt;ntl@pobox.com&gt;
Acked-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Nigel Cunningham &lt;nigel@tuxonice.net&gt;
Tested-by: Nigel Cunningham &lt;nigel@tuxonice.net&gt;
Cc: containers@lists.linux-foundation.org
Cc: linux-pm@lists.linux-foundation.org
Cc: Matt Helsley &lt;matthltc@us.ibm.com&gt;
LKML-Reference: &lt;20090408194512.47a99b95@manatee.lan&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
<entry>
<title>epoll: drop max_user_instances and rely only on max_user_watches</title>
<updated>2009-02-02T17:53:26+00:00</updated>
<author>
<name>Davide Libenzi</name>
<email>davidel@xmailserver.org</email>
</author>
<published>2009-01-29T22:25:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=273140886a6985eb08245334210550b340daa1c3'/>
<id>urn:sha1:273140886a6985eb08245334210550b340daa1c3</id>
<content type='text'>
commit 9df04e1f25effde823a600e755b51475d438f56b upstream.

Linus suggested to put limits where the money is, and max_user_watches
already does that w/out the need of max_user_instances.  That has the
advantage to mitigate the potential DoS while allowing pretty generous
default behavior.

Allowing top 4% of low memory (per user) to be allocated in epoll watches,
we have:

LOMEM    MAX_WATCHES (per user)
512MB    ~178000
1GB      ~356000
2GB      ~712000

A box with 512MB of lomem, will meet some challenge in hitting 180K
watches, socket buffers math teaches us.  No more max_user_instances
limits then.

Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@googlemail.com&gt;
Cc: Bron Gondwana &lt;brong@fastmail.fm&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>epoll: introduce resource usage limits</title>
<updated>2008-12-02T03:55:24+00:00</updated>
<author>
<name>Davide Libenzi</name>
<email>davidel@xmailserver.org</email>
</author>
<published>2008-12-01T21:13:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7ef9964e6d1b911b78709f144000aacadd0ebc21'/>
<id>urn:sha1:7ef9964e6d1b911b78709f144000aacadd0ebc21</id>
<content type='text'>
It has been thought that the per-user file descriptors limit would also
limit the resources that a normal user can request via the epoll
interface.  Vegard Nossum reported a very simple program (a modified
version attached) that can make a normal user to request a pretty large
amount of kernel memory, well within the its maximum number of fds.  To
solve such problem, default limits are now imposed, and /proc based
configuration has been introduced.  A new directory has been created,
named /proc/sys/fs/epoll/ and inside there, there are two configuration
points:

  max_user_instances = Maximum number of devices - per user

  max_user_watches   = Maximum number of "watched" fds - per user

The current default for "max_user_watches" limits the memory used by epoll
to store "watches", to 1/32 of the amount of the low RAM.  As example, a
256MB 32bit machine, will have "max_user_watches" set to roughly 90000.
That should be enough to not break existing heavy epoll users.  The
default value for "max_user_instances" is set to 128, that should be
enough too.

This also changes the userspace, because a new error code can now come out
from EPOLL_CTL_ADD (-ENOSPC).  The EMFILE from epoll_create() was already
listed, so that should be ok.

[akpm@linux-foundation.org: use get_current_user()]
Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Reported-by: Vegard Nossum &lt;vegardno@ifi.uio.no&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fix for account_group_exec_runtime(), make sure -&gt;signal can't be freed under rq-&gt;lock</title>
<updated>2008-11-11T07:01:43+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2008-11-10T14:39:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ad474caca3e2a0550b7ce0706527ad5ab389a4d4'/>
<id>urn:sha1:ad474caca3e2a0550b7ce0706527ad5ab389a4d4</id>
<content type='text'>
Impact: fix hang/crash on ia64 under high load

This is ugly, but the simplest patch by far.

Unlike other similar routines, account_group_exec_runtime() could be
called "implicitly" from within scheduler after exit_notify(). This
means we can race with the parent doing release_task(), we can't just
check -&gt;signal != NULL.

Change __exit_signal() to do spin_unlock_wait(&amp;task_rq(tsk)-&gt;lock)
before __cleanup_signal() to make sure -&gt;signal can't be freed under
task_rq(tsk)-&gt;lock. Note that task_rq_unlock_wait() doesn't care
about the case when tsk changes cpu/rq under us, this should be OK.

Thanks to Ingo who nacked my previous buggy patch.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Reported-by: Doug Chapman &lt;doug.chapman@hp.com&gt;
</content>
</entry>
<entry>
<title>net: Fix recursive descent in __scm_destroy().</title>
<updated>2008-11-06T21:51:50+00:00</updated>
<author>
<name>David Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2008-11-06T08:37:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f8d570a4745835f2238a33b537218a1bb03fc671'/>
<id>urn:sha1:f8d570a4745835f2238a33b537218a1bb03fc671</id>
<content type='text'>
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.

Those, in turn, can close sockets and invoke __scm_destroy() again.

There is nothing which limits how deeply this can occur.

The idea for how to fix this is from Linus.  Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput().  Inside of the initial __scm_destroy() we
keep running the list until it is empty.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge commit 'v2.6.28-rc1' into sched/urgent</title>
<updated>2008-10-24T10:48:46+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-10-24T10:48:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8c82a17e9c924c0e9f13e75e4c2f6bca19a4b516'/>
<id>urn:sha1:8c82a17e9c924c0e9f13e75e4c2f6bca19a4b516</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc</title>
<updated>2008-10-23T19:04:37+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-10-23T19:04:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=88ed86fee6651033de9b7038dac7869a9f19775a'/>
<id>urn:sha1:88ed86fee6651033de9b7038dac7869a9f19775a</id>
<content type='text'>
* 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits)
  proc: remove fs/proc/proc_misc.c
  proc: move /proc/vmcore creation to fs/proc/vmcore.c
  proc: move pagecount stuff to fs/proc/page.c
  proc: move all /proc/kcore stuff to fs/proc/kcore.c
  proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
  proc: move /proc/modules boilerplate to kernel/module.c
  proc: move /proc/diskstats boilerplate to block/genhd.c
  proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmstat boilerplate to mm/vmstat.c
  proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
  proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmallocinfo to mm/vmalloc.c
  proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
  proc: move /proc/slab_allocators boilerplate to mm/slab.c
  proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
  proc: move /proc/stat to fs/proc/stat.c
  proc: move rest of /proc/partitions code to block/genhd.c
  proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
  proc: move /proc/devices code to fs/proc/devices.c
  proc: move rest of /proc/locks to fs/locks.c
  ...
</content>
</entry>
<entry>
<title>Merge branch 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip</title>
<updated>2008-10-23T17:53:02+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-10-23T17:53:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1f6d6e8ebe73ba9d9d4c693f7f6f50f661dbd6e4'/>
<id>urn:sha1:1f6d6e8ebe73ba9d9d4c693f7f6f50f661dbd6e4</id>
<content type='text'>
* 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
  hrtimers: add missing docbook comments to struct hrtimer
  hrtimers: simplify hrtimer_peek_ahead_timers()
  hrtimers: fix docbook comments
  DECLARE_PER_CPU needs linux/percpu.h
  hrtimers: fix typo
  rangetimers: fix the bug reported by Ingo for real
  rangetimer: fix BUG_ON reported by Ingo
  rangetimer: fix x86 build failure for the !HRTIMERS case
  select: fix alpha OSF wrapper
  select: fix alpha OSF wrapper
  hrtimer: peek at the timer queue just before going idle
  hrtimer: make the futex() system call use the per process slack value
  hrtimer: make the nanosleep() syscall use the per process slack
  hrtimer: fix signed/unsigned bug in slack estimator
  hrtimer: show the timer ranges in /proc/timer_list
  hrtimer: incorporate feedback from Peter Zijlstra
  hrtimer: add a hrtimer_start_range() function
  hrtimer: another build fix
  hrtimer: fix build bug found by Ingo
  hrtimer: make select() and poll() use the hrtimer range feature
  ...
</content>
</entry>
<entry>
<title>Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip</title>
<updated>2008-10-23T16:37:16+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-10-23T16:37:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=133e887f90208d339088dd60cb1d08a72ba27288'/>
<id>urn:sha1:133e887f90208d339088dd60cb1d08a72ba27288</id>
<content type='text'>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: disable the hrtick for now
  sched: revert back to per-rq vruntime
  sched: fair scheduler should not resched rt tasks
  sched: optimize group load balancer
  sched: minor fast-path overhead reduction
  sched: fix the wrong mask_len, cleanup
  sched: kill unused scheduler decl.
  sched: fix the wrong mask_len
  sched: only update rq-&gt;clock while holding rq-&gt;lock
</content>
</entry>
<entry>
<title>proc: move /proc/schedstat boilerplate to kernel/sched_stats.h</title>
<updated>2008-10-23T14:06:12+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2008-10-06T09:23:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b5aadf7f14c1acc94956aa257e018e9de3881f41'/>
<id>urn:sha1:b5aadf7f14c1acc94956aa257e018e9de3881f41</id>
<content type='text'>
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
</content>
</entry>
</feed>
