author    Valentin Schneider <valentin.schneider@arm.com>  2022-01-27 18:40:59 +0300
committer Peter Zijlstra <peterz@infradead.org>            2022-03-01 18:18:38 +0300
commit    49bef33e4b87b743495627a529029156c6e09530 (patch)
tree      1022a0794c2ffbb6d35f75f9e014d3a2937cb21c /include/linux/sched.h
parent    3eba0505d03a9c1eb30d40c2330c0880b22d1b3a (diff)
sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
John reported that push_rt_task() can end up invoking
find_lowest_rq(rq->curr) when curr is not an RT task (in this case a CFS
one), which causes mayhem down convert_prio().

This can happen when current gets demoted to e.g. CFS when releasing an
rt_mutex, and the local CPU gets hit with an rto_push_work irqwork before
getting the chance to reschedule. Exactly who triggers this work isn't
entirely clear to me - switched_from_rt() only invokes rt_queue_pull_task()
if there are no RT tasks on the local RQ, which means the local CPU can't
be in the rto_mask.

My current suspected sequence is something along the lines of the below,
with the demoted task being current.

  mark_wakeup_next_waiter()
    rt_mutex_adjust_prio()
      rt_mutex_setprio() // deboost originally-CFS task
        check_class_changed()
          switched_from_rt() // Only rt_queue_pull_task() if !rq->rt.rt_nr_running
          switched_to_fair() // Sets need_resched
        __balance_callbacks() // if pull_rt_task(), tell_cpu_to_push() can't select local CPU per the above
        raw_spin_rq_unlock(rq)

        // need_resched is set, so task_woken_rt() can't
        // invoke push_rt_tasks(). Best I can come up with is
        // local CPU has rt_nr_migratory >= 2 after the demotion, so stays
        // in the rto_mask, and then:

        <some other CPU running rto_push_irq_work_func() queues rto_push_work on this CPU>
          push_rt_task()
            // breakage follows here as rq->curr is CFS

Move an existing check to check rq->curr vs the next pushable task's
priority before getting anywhere near find_lowest_rq(). While at it, add an
explicit sched_class of rq->curr check prior to invoking
find_lowest_rq(rq->curr). Align the DL logic to also reschedule regardless
of next_task's migratability.

Fixes: a7c81556ec4d ("sched: Fix migrate_disable() vs rt/dl balancing")
Reported-by: John Keeping <john@metanate.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: John Keeping <john@metanate.com>
Link: https://lore.kernel.org/r/20220127154059.974729-1-valentin.schneider@arm.com
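For readers not following along in kernel/sched/rt.c, the ordering the message
describes ends up looking roughly like the fragment below. This is a sketch
reconstructed from the description above, not the commit's diff: the surrounding
code is elided and exact placement may differ.

  static int push_rt_task(struct rq *rq, bool pull)
  {
          struct task_struct *next_task = pick_next_pushable_task(rq);
          int cpu;

          if (!next_task)
                  return 0;

          /*
           * Moved up: compare rq->curr against the next pushable task's
           * priority before anything that could reach find_lowest_rq().
           * If next_task outranks curr, just reschedule curr and bail.
           */
          if (unlikely(next_task->prio < rq->curr->prio)) {
                  resched_curr(rq);
                  return 0;
          }

          if (is_migration_disabled(next_task)) {
                  /* ... */

                  /*
                   * Added: find_lowest_rq(rq->curr) only makes sense for an
                   * RT task, and after an rt_mutex deboost curr may already
                   * be CFS, so bail out in that case.
                   */
                  if (rq->curr->sched_class != &rt_sched_class)
                          return 0;

                  cpu = find_lowest_rq(rq->curr);
                  /* ... push rq->curr away instead of next_task ... */
          }

          /* ... */
  }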