author | Vincent Donnefort <vincent.donnefort@arm.com> | 2021-06-21 13:37:51 +0300 |
---|---|---|
committer | Peter Zijlstra <peterz@infradead.org> | 2021-06-22 17:41:59 +0300 |
commit | fecfcbc288e9f4923f40fd23ca78a6acdc7fdf6c (patch) | |
tree | 3e31673c76ca64c60d17b456665f96d9f5495043 | |
parent | 2f064a59a11ff9bc22e52e9678bc601404c7cb34 (diff) | |
sched/rt: Fix RT utilization tracking during policy change
RT keeps track of the utilization on a per-rq basis with the structure
avg_rt. This utilization is updated during task_tick_rt(),
put_prev_task_rt() and set_next_task_rt(). However, when the currently
running task changes its policy, set_next_task_rt(), which would usually
take care of updating the utilization when the rq starts running RT tasks,
does not see such a change, leaving the avg_rt structure outdated. When
that very same task is dequeued later, put_prev_task_rt() then
updates the utilization based on a wrong last_update_time, leading to a
huge spike in the RT utilization signal.
The signal eventually recovers from this issue after a few ms. Even if
no RT tasks are run, avg_rt is also updated in __update_blocked_others().
But as the CPU capacity depends partly on avg_rt, this issue
nonetheless has a significant impact on the scheduler.
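To make the capacity dependency concrete, here is a deliberately simplified sketch of how an inflated RT utilization shrinks the capacity left for other tasks. The name toy_capacity_left() and its two-argument form are assumptions for illustration only; the kernel's own computation takes more inputs than RT utilization alone.

```c
/*
 * Toy model, not kernel code: capacity remaining for non-RT work once
 * the RT utilization has been subtracted from the CPU's maximum.
 * Both values are assumed to share one scale (e.g. 0..1024).
 */
static unsigned long toy_capacity_left(unsigned long max_cap,
				       unsigned long rt_util)
{
	/* Never report zero capacity; clamp to a minimal value instead. */
	if (rt_util >= max_cap)
		return 1;

	return max_cap - rt_util;
}
```

With a full scale of 1024, a spurious RT utilization close to the maximum leaves almost no capacity in this model, even though little RT work was actually done.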
Fix this issue by ensuring a load update when a running task changes
its policy to RT.
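The effect of the stale timestamp, and of the extra update, can be illustrated with a toy user-space model. This is a sketch under stated assumptions, not the kernel's PELT code: struct toy_avg, toy_update(), toy_policy_change(), the 1024 full scale, the 32 ms window and the 200 ms / 10 ms timeline are all hypothetical, and the model assumes nothing else refreshes the signal before the policy change. Only the idea is taken from the commit message: an update credits all the time elapsed since last_update_time, either as RT running time or as idle time.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_UTIL_MAX 1024 /* assumed full-scale utilization */
#define TOY_WINDOW   32   /* assumed smoothing window, in ms */

/* Toy stand-in for the avg_rt signal; not the kernel's structure. */
struct toy_avg {
	uint64_t last_update_time; /* time of the last update, in ms */
	int util;                  /* smoothed utilization, 0..TOY_UTIL_MAX */
};

/*
 * Account every millisecond elapsed since last_update_time, either as
 * RT running time (running != 0) or as time without RT activity.
 */
static void toy_update(struct toy_avg *avg, uint64_t now, int running)
{
	int target = running ? TOY_UTIL_MAX : 0;
	uint64_t delta;

	assert(now >= avg->last_update_time);
	delta = now - avg->last_update_time;

	/* One smoothing step per elapsed millisecond. */
	while (delta--)
		avg->util += (target - avg->util) / TOY_WINDOW;

	avg->last_update_time = now;
}

/*
 * A task runs under a non-RT policy for 200 ms, switches to RT, then
 * is dequeued 10 ms later. Returns the utilization seen at dequeue.
 */
static int toy_policy_change(int update_on_switch)
{
	struct toy_avg avg = { .last_update_time = 0, .util = 0 };

	/* t = 200 ms: policy change; the fix syncs the timestamp here. */
	if (update_on_switch)
		toy_update(&avg, 200, 0);

	/* t = 210 ms: dequeue, the elapsed time is credited as running. */
	toy_update(&avg, 210, 1);

	return avg.util;
}
```

In this model, toy_policy_change(0) credits the whole 210 ms as RT running time and ends close to full scale, while toy_policy_change(1) only credits the 10 ms spent under the RT policy and stays well below it.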
Fixes: 371bf427 ("sched/rt: Add rt_rq utilization tracking")
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/1624271872-211872-2-git-send-email-vincent.donnefort@arm.com
-rw-r--r-- | kernel/sched/rt.c | 17 |
1 files changed, 12 insertions, 5 deletions
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index a5254471371c..3daf42a0f462 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2341,13 +2341,20 @@ void __init init_sched_rt_class(void)
 static void switched_to_rt(struct rq *rq, struct task_struct *p)
 {
 	/*
-	 * If we are already running, then there's nothing
-	 * that needs to be done. But if we are not running
-	 * we may need to preempt the current running task.
-	 * If that current running task is also an RT task
+	 * If we are running, update the avg_rt tracking, as the running time
+	 * will now on be accounted into the latter.
+	 */
+	if (task_current(rq, p)) {
+		update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 0);
+		return;
+	}
+
+	/*
+	 * If we are not running we may need to preempt the current
+	 * running task. If that current running task is also an RT task
 	 * then see if we can move to another run queue.
 	 */
-	if (task_on_rq_queued(p) && rq->curr != p) {
+	if (task_on_rq_queued(p)) {
 #ifdef CONFIG_SMP
 		if (p->nr_cpus_allowed > 1 && rq->rt.overloaded)
 			rt_queue_push_tasks(rq);