diff options
author | Rafael J. Wysocki <rafael.j.wysocki@intel.com> | 2023-12-18 22:25:02 +0300 |
---|---|---|
committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2024-02-05 23:14:15 +0300 |
commit | fcecef9a84f689631074240d63f4e00c4c94a614 (patch) | |
tree | be1c4ec471a01d5221c2bd560d82c4452dab9280 /include/linux/thermal.h | |
parent | 410063c9e100cb694f83591704fa69f28696dfa1 (diff) | |
download | linux-fcecef9a84f689631074240d63f4e00c4c94a614.tar.xz |
thermal: core: Fix thermal zone suspend-resume synchronization
[ Upstream commit 4e814173a8c4f432fd068b1c796f0416328c9d99 ]
There are 3 synchronization issues with thermal zone suspend-resume
during system-wide transitions:
1. The resume code runs in a PM notifier which is invoked after user
space has been thawed, so it can run concurrently with user space
which can trigger a thermal zone device removal. If that happens,
the thermal zone resume code may use a stale pointer to the next
list element and crash, because it does not hold thermal_list_lock
while walking thermal_tz_list.
2. The thermal zone resume code calls thermal_zone_device_init()
outside the zone lock, so user space or an update triggered by
the platform firmware may see an inconsistent state of a
thermal zone leading to unexpected behavior.
3. Clearing the in_suspend global variable in thermal_pm_notify()
allows __thermal_zone_device_update() to continue for all thermal
zones and it may as well run before the thermal_tz_list walk (or
at any point during the list walk for that matter) and attempt to
operate on a thermal zone that has not been resumed yet. It may
also race destructively with thermal_zone_device_init().
To address these issues, add thermal_list_lock locking to
thermal_pm_notify(), especially arount the thermal_tz_list,
make it call thermal_zone_device_init() back-to-back with
__thermal_zone_device_update() under the zone lock and replace
in_suspend with per-zone bool "suspend" indicators set and unset
under the given zone's lock.
Link: https://lore.kernel.org/linux-pm/20231218162348.69101-1-bo.ye@mediatek.com/
Reported-by: Bo Ye <bo.ye@mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Diffstat (limited to 'include/linux/thermal.h')
-rw-r--r-- | include/linux/thermal.h | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/include/linux/thermal.h b/include/linux/thermal.h index a5ae4af955ff..4012f440bfdc 100644 --- a/include/linux/thermal.h +++ b/include/linux/thermal.h @@ -150,6 +150,7 @@ struct thermal_cooling_device { * @node: node in thermal_tz_list (in thermal_core.c) * @poll_queue: delayed work for polling * @notify_event: Last notification event + * @suspended: thermal zone suspend indicator */ struct thermal_zone_device { int id; @@ -183,6 +184,7 @@ struct thermal_zone_device { struct list_head node; struct delayed_work poll_queue; enum thermal_notify_event notify_event; + bool suspended; }; /** |