<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/sched.h, branch v6.6.132</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.132'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-02-06T15:48:28+00:00</updated>
<entry>
<title>perf: sched: Fix perf crash with new is_user_task() helper</title>
<updated>2026-02-06T15:48:28+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2026-02-03T20:27:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d84a4836dc246b7dc244e46a08ff992956b68db0'/>
<id>urn:sha1:d84a4836dc246b7dc244e46a08ff992956b68db0</id>
<content type='text'>
[ Upstream commit 76ed27608f7dd235b727ebbb12163438c2fbb617 ]

In order to do a user space stacktrace the current task needs to be a user
task that has executed in user space. It use to be possible to test if a
task is a user task or not by simply checking the task_struct mm field. If
it was non NULL, it was a user task and if not it was a kernel task.

But things have changed over time, and some kernel tasks now have their
own mm field.

An idea was made to instead test PF_KTHREAD and two functions were used to
wrap this check in case it became more complex to test if a task was a
user task or not[1]. But this was rejected and the C code simply checked
the PF_KTHREAD directly.

It was later found that not all kernel threads set PF_KTHREAD. The io-uring
helpers instead set PF_USER_WORKER and this needed to be added as well.

But checking the flags is still not enough. There's a very small window
when a task exits that it frees its mm field and it is set back to NULL.
If perf were to trigger at this moment, the flags test would say its a
user space task but when perf would read the mm field it would crash with
at NULL pointer dereference.

Now there are flags that can be used to test if a task is exiting, but
they are set in areas that perf may still want to profile the user space
task (to see where it exited). The only real test is to check both the
flags and the mm field.

Instead of making this modification in every location, create a new
is_user_task() helper function that does all the tests needed to know if
it is safe to read the user space memory or not.

[1] https://lore.kernel.org/all/20250425204120.639530125@goodmis.org/

Fixes: 90942f9fac05 ("perf: Use current-&gt;flags &amp; PF_KTHREAD|PF_USER_WORKER instead of current-&gt;mm == NULL")
Closes: https://lore.kernel.org/all/0d877e6f-41a7-4724-875d-0b0a27b8a545@roeck-us.net/
Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260129102821.46484722@gandalf.local.home
[ Adjust context ]
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>rseq: Protect event mask against membarrier IPI</title>
<updated>2025-10-19T14:30:59+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2025-10-16T01:31:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d0d9fa88d7ab7e3ef6adc4cb4d1419b7c2d8016f'/>
<id>urn:sha1:d0d9fa88d7ab7e3ef6adc4cb4d1419b7c2d8016f</id>
<content type='text'>
[ Upstream commit 6eb350a2233100a283f882c023e5ad426d0ed63b ]

rseq_need_restart() reads and clears task::rseq_event_mask with preemption
disabled to guard against the scheduler.

But membarrier() uses an IPI and sets the PREEMPT bit in the event mask
from the IPI, which leaves that RMW operation unprotected.

Use guard(irq) if CONFIG_MEMBARRIER is enabled to fix that.

Fixes: 2a36ab717e8f ("rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Reviewed-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: stable@vger.kernel.org
[ Applied changes to include/linux/sched.h instead of include/linux/rseq.h ]
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Remove ifdeffery for saved_state</title>
<updated>2025-08-15T10:09:07+00:00</updated>
<author>
<name>Elliot Berman</name>
<email>quic_eberman@quicinc.com</email>
</author>
<published>2023-09-08T22:49:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8afa818c77330fcb269c77540b268f199d84ee8e'/>
<id>urn:sha1:8afa818c77330fcb269c77540b268f199d84ee8e</id>
<content type='text'>
commit fbaa6a181a4b1886cbf4214abdf9a2df68471510 upstream.

In preparation for freezer to also use saved_state, remove the
CONFIG_PREEMPT_RT compilation guard around saved_state.

On the arm64 platform I tested which did not have CONFIG_PREEMPT_RT,
there was no statistically significant deviation by applying this patch.

Test methodology:

perf bench sched message -g 40 -l 40

Signed-off-by: Elliot Berman &lt;quic_eberman@quicinc.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Chen Ridong &lt;chenridong@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>NFS: fix nfs_release_folio() to not deadlock via kcompactd writeback</title>
<updated>2025-03-13T11:58:27+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@kernel.org</email>
</author>
<published>2025-02-25T02:20:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ab0727d6e2196682351c25c1dd112136f6991f11'/>
<id>urn:sha1:ab0727d6e2196682351c25c1dd112136f6991f11</id>
<content type='text'>
commit ce6d9c1c2b5cc785016faa11b48b6cd317eb367e upstream.

Add PF_KCOMPACTD flag and current_is_kcompactd() helper to check for it so
nfs_release_folio() can skip calling nfs_wb_folio() from kcompactd.

Otherwise NFS can deadlock waiting for kcompactd enduced writeback which
recurses back to NFS (which triggers writeback to NFSD via NFS loopback
mount on the same host, NFSD blocks waiting for XFS's call to
__filemap_get_folio):

6070.550357] INFO: task kcompactd0:58 blocked for more than 4435 seconds.

{---
[58] "kcompactd0"
[&lt;0&gt;] folio_wait_bit+0xe8/0x200
[&lt;0&gt;] folio_wait_writeback+0x2b/0x80
[&lt;0&gt;] nfs_wb_folio+0x80/0x1b0 [nfs]
[&lt;0&gt;] nfs_release_folio+0x68/0x130 [nfs]
[&lt;0&gt;] split_huge_page_to_list_to_order+0x362/0x840
[&lt;0&gt;] migrate_pages_batch+0x43d/0xb90
[&lt;0&gt;] migrate_pages_sync+0x9a/0x240
[&lt;0&gt;] migrate_pages+0x93c/0x9f0
[&lt;0&gt;] compact_zone+0x8e2/0x1030
[&lt;0&gt;] compact_node+0xdb/0x120
[&lt;0&gt;] kcompactd+0x121/0x2e0
[&lt;0&gt;] kthread+0xcf/0x100
[&lt;0&gt;] ret_from_fork+0x31/0x40
[&lt;0&gt;] ret_from_fork_asm+0x1a/0x30
---}

[akpm@linux-foundation.org: fix build]
Link: https://lkml.kernel.org/r/20250225022002.26141-1-snitzer@kernel.org
Fixes: 96780ca55e3c ("NFS: fix up nfs_release_folio() to try to release the page")
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Cc: Anna Schumaker &lt;anna.schumaker@oracle.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Fix value reported by hot tasks pulled in /proc/schedstat</title>
<updated>2025-02-08T08:51:44+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-12-20T06:32:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=32fe5c4c3e553ae83f6c45242b50c79cfd74cd5a'/>
<id>urn:sha1:32fe5c4c3e553ae83f6c45242b50c79cfd74cd5a</id>
<content type='text'>
[ Upstream commit a430d99e349026d53e2557b7b22bd2ebd61fe12a ]

In /proc/schedstat, lb_hot_gained reports the number hot tasks pulled
during load balance. This value is incremented in can_migrate_task()
if the task is migratable and hot. After incrementing the value,
load balancer can still decide not to migrate this task leading to wrong
accounting. Fix this by incrementing stats when hot tasks are detached.
This issue only exists in detach_tasks() where we can decide to not
migrate hot task even if it is migratable. However, in detach_one_task(),
we migrate it unconditionally.

[Swapnil: Handled the case where nr_failed_migrations_hot was not accounted properly and wrote commit log]

Fixes: d31980846f96 ("sched: Move up affinity check to mitigate useless redoing overhead")
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reported-by: "Gautham R. Shenoy" &lt;gautham.shenoy@amd.com&gt;
Not-yet-signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Swapnil Sapkal &lt;swapnil.sapkal@amd.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241220063224.17767-2-swapnil.sapkal@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>freezer, sched: Report frozen tasks as 'D' instead of 'R'</title>
<updated>2025-01-02T09:32:09+00:00</updated>
<author>
<name>Chen Ridong</name>
<email>chenridong@huawei.com</email>
</author>
<published>2024-12-17T00:48:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c1a26ea77f8138cc88e136ac5f15930e8ab2d5a0'/>
<id>urn:sha1:c1a26ea77f8138cc88e136ac5f15930e8ab2d5a0</id>
<content type='text'>
[ Upstream commit f718faf3940e95d5d34af9041f279f598396ab7d ]

Before commit:

  f5d39b020809 ("freezer,sched: Rewrite core freezer logic")

the frozen task stat was reported as 'D' in cgroup v1.

However, after rewriting the core freezer logic, the frozen task stat is
reported as 'R'. This is confusing, especially when a task with stat of
'S' is frozen.

This bug can be reproduced with these steps:

	$ cd /sys/fs/cgroup/freezer/
	$ mkdir test
	$ sleep 1000 &amp;
	[1] 739         // task whose stat is 'S'
	$ echo 739 &gt; test/cgroup.procs
	$ echo FROZEN &gt; test/freezer.state
	$ ps -aux | grep 739
	root     739  0.1  0.0   8376  1812 pts/0    R    10:56   0:00 sleep 1000

As shown above, a task whose stat is 'S' was changed to 'R' when it was
frozen.

To solve this regression, simply maintain the same reported state as
before the rewrite.

[ mingo: Enhanced the changelog and comments ]

Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
Signed-off-by: Chen Ridong &lt;chenridong@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Link: https://lore.kernel.org/r/20241217004818.3200515-1-chenridong@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Unify runtime accounting across classes</title>
<updated>2024-12-14T19:00:19+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-11-04T10:59:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4db5988bb0996126895df56784f59076bc7b370a'/>
<id>urn:sha1:4db5988bb0996126895df56784f59076bc7b370a</id>
<content type='text'>
[ Upstream commit 5d69eca542ee17c618f9a55da52191d5e28b435f ]

All classes use sched_entity::exec_start to track runtime and have
copies of the exact same code around to compute runtime.

Collapse all that.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Daniel Bristot de Oliveira &lt;bristot@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Phil Auld &lt;pauld@redhat.com&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Reviewed-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Link: https://lkml.kernel.org/r/54d148a144f26d9559698c4dd82d8859038a7380.1699095159.git.bristot@kernel.org
Stable-dep-of: 0664e2c311b9 ("sched/deadline: Fix warning in migrate_enable for boosted tasks")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/headers: Move 'struct sched_param' out of uapi, to work around glibc/musl breakage</title>
<updated>2024-12-14T19:00:19+00:00</updated>
<author>
<name>Kir Kolyshkin</name>
<email>kolyshkin@gmail.com</email>
</author>
<published>2023-08-08T03:03:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=654f3294c69e0064df5c6e8552dc188433b123aa'/>
<id>urn:sha1:654f3294c69e0064df5c6e8552dc188433b123aa</id>
<content type='text'>
[ Upstream commit d844fe65f0957024c3e1b0bf2a0615246184d9bc ]

Both glibc and musl define 'struct sched_param' in sched.h, while kernel
has it in uapi/linux/sched/types.h, making it cumbersome to use
sched_getattr(2) or sched_setattr(2) from userspace.

For example, something like this:

	#include &lt;sched.h&gt;
	#include &lt;linux/sched/types.h&gt;

	struct sched_attr sa;

will result in "error: redefinition of ‘struct sched_param’" (note the
code doesn't need sched_param at all -- it needs struct sched_attr
plus some stuff from sched.h).

The situation is, glibc is not going to provide a wrapper for
sched_{get,set}attr, thus the need to include linux/sched_types.h
directly, which leads to the above problem.

Thus, the userspace is left with a few sub-par choices when it wants to
use e.g. sched_setattr(2), such as maintaining a copy of struct
sched_attr definition, or using some other ugly tricks.

OTOH, 'struct sched_param' is well known, defined in POSIX, and it won't
be ever changed (as that would break backward compatibility).

So, while 'struct sched_param' is indeed part of the kernel uapi,
exposing it the way it's done now creates an issue, and hiding it
(like this patch does) fixes that issue, hopefully without creating
another one: common userspace software rely on libc headers, and as
for "special" software (like libc), it looks like glibc and musl
do not rely on kernel headers for 'struct sched_param' definition
(but let's Cc their mailing lists in case it's otherwise).

The alternative to this patch would be to move struct sched_attr to,
say, linux/sched.h, or linux/sched/attr.h (the new file).

Oh, and here is the previous attempt to fix the issue:

  https://lore.kernel.org/all/20200528135552.GA87103@google.com/

While I support Linus arguments, the issue is still here
and needs to be fixed.

[ mingo: Linus is right, this shouldn't be needed - but on the other
         hand I agree that this header is not really helpful to
	 user-space as-is. So let's pretend that
	 &lt;uapi/linux/sched/types.h&gt; is only about sched_attr, and
	 call this commit a workaround for user-space breakage
	 that it in reality is ... Also, remove the Fixes tag. ]

Signed-off-by: Kir Kolyshkin &lt;kolyshkin@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20230808030357.1213829-1-kolyshkin@gmail.com
Stable-dep-of: 0664e2c311b9 ("sched/deadline: Fix warning in migrate_enable for boosted tasks")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu-tasks: Add data to eliminate RCU-tasks/do_exit() deadlocks</title>
<updated>2024-11-08T15:28:22+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2024-02-05T21:08:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dc5d4d4c12246b802177742e965fcf81691d2da8'/>
<id>urn:sha1:dc5d4d4c12246b802177742e965fcf81691d2da8</id>
<content type='text'>
[ Upstream commit bfe93930ea1ea3c6c115a7d44af6e4fea609067e ]

Holding a mutex across synchronize_rcu_tasks() and acquiring
that same mutex in code called from do_exit() after its call to
exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop()
results in deadlock.  This is by design, because tasks that are far
enough into do_exit() are no longer present on the tasks list, making
it a bit difficult for RCU Tasks to find them, let alone wait on them
to do a voluntary context switch.  However, such deadlocks are becoming
more frequent.  In addition, lockdep currently does not detect such
deadlocks and they can be difficult to reproduce.

In addition, if a task voluntarily context switches during that time
(for example, if it blocks acquiring a mutex), then this task is in an
RCU Tasks quiescent state.  And with some adjustments, RCU Tasks could
just as well take advantage of that fact.

This commit therefore adds the data structures that will be needed
to rely on these quiescent states and to eliminate these deadlocks.

Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/

Reported-by: Chen Zhongjin &lt;chenzhongjin@huawei.com&gt;
Reported-by: Yang Jihong &lt;yangjihong1@huawei.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Tested-by: Yang Jihong &lt;yangjihong1@huawei.com&gt;
Tested-by: Chen Zhongjin &lt;chenzhongjin@huawei.com&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Stable-dep-of: fd70e9f1d85f ("rcu-tasks: Fix access non-existent percpu rtpcp variable in rcu_tasks_need_gpcb()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Report correct state for TASK_IDLE | TASK_FREEZABLE</title>
<updated>2023-08-30T08:08:38+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2023-08-29T23:04:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0d6b35283bcf1a379cf20066544af8e6a6b16b46'/>
<id>urn:sha1:0d6b35283bcf1a379cf20066544af8e6a6b16b46</id>
<content type='text'>
task_state_index() ignores uninteresting state flags (such as
TASK_FREEZABLE) for most states, but for TASK_IDLE and TASK_RTLOCK_WAIT
it does not.

So if a task is waiting TASK_IDLE|TASK_FREEZABLE it gets incorrectly
reported as TASK_UNINTERRUPTIBLE or "D".  (it is planned for nfsd to
change to use this state).

Fix this by only testing the interesting bits and not the irrelevant
bits in __task_state_index()

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/169335025927.5133.4781141800413736103@noble.neil.brown.name
</content>
</entry>
</feed>
