summaryrefslogtreecommitdiff
path: root/drivers/thermal/intel/intel_pch_thermal.c
AgeCommit message (Collapse)AuthorFilesLines
2022-05-19thermal: intel: pch: improve the cooling delay logZhang Rui1-11/+20
Previously, during suspend, intel_pch_thermal driver logs for every cooling iteration, about the current PCH temperature and number of cooling iterations that have been tried, like below [ 100.955526] intel_pch_thermal 0000:00:14.2: CPU-PCH current temp [53C] higher than the threshold temp [50C], sleep 1 times for 100 ms duration [ 101.064156] intel_pch_thermal 0000:00:14.2: CPU-PCH current temp [53C] higher than the threshold temp [50C], sleep 2 times for 100 ms duration After changing the default delay_cnt to 600, in practice, it is common to see tens of the above messages if the system is suspended when PCH overheats. Thus, change this log message from dev_warn to dev_dbg because it is only useful when we want to check the temperature trend. At the same time, there is always a one-line message given by the driver with the patch applied, with below four possibilities. 1. PCH is cool, no cooling delay needed [ 1791.902853] intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [48C] 2. PCH overheats and becomes cool after the cooling delays [ 1475.511617] intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [49C] after 30700 ms delay 3. PCH still overheats after the overall cooling timeout [ 2250.157487] intel_pch_thermal 0000:00:12.0: CPU-PCH is hot [60C] after 60000 ms delay. S0ix might fail 4. PCH aborts cooling because of wakeup event detected during the delay [ 1933.639509] intel_pch_thermal 0000:00:12.0: Wakeup event detected, abort cooling Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19thermal: intel: pch: enhance overheat handlingZhang Rui1-5/+8
Commit ef63b043ac86 ("thermal: intel: pch: fix S0ix failure due to PCH temperature above threshold") introduces delay loop mechanism that allows PCH temperature to go down below threshold during suspend so it won't block S0ix. And the default overall delay timeout is 1 second. However, in practice, we found that the time it takes to cool the PCH down below threshold highly depends on the initial PCH temperature when the delay starts, as well as the ambient temperature. And in some cases, the 1 second delay is not sufficient. As a result, the system stays in a shallower power state like PCx instead of S0ix, and drains the battery power, without user' notice. To make sure S0ix is not blocked by the PCH overheating, we 1. expand the default overall timeout to 60 seconds. 2. make sure the temperature is below threshold rather than equal to it. At the same time, as the cooling delay can be much longer and many wakeup events (ACPI Power Button press, USB mouse move, etc) becomes valid in the suspend_noirq phase, add detection of wakeup event so that the driver does not delay blindly when the system suspend is likely to abort soon. This patch may introduce longer suspend time, but only in the cases when the system overheats and Linux used to enter a shallower S2idle state, say, PCx instead of S0ix. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19thermal: intel: pch: move cooling delay to suspend_noirq phaseZhang Rui1-2/+3
Move the PCH Thermal driver suspend callback to suspend_noirq to do cooling while the system is more quiescent. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-20thermal: intel: pch: Fix unexpected shutdown at critical temperatureKai-Heng Feng1-0/+6
Like previous patch, the intel_pch_thermal device is not in ACPI ThermalZone namespace, so a critical trip doesn't mean shutdown. Override the default .critical callback to prevent surprising thermal shutdoown. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201221172345.36976-2-kai.heng.feng@canonical.com
2020-12-10thermal: intel: pch: use macro for temperature calculationSumeet Pawnikar1-10/+3
Use macro for temperature calculation Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201210124801.13850-1-sumeet.r.pawnikar@intel.com
2020-11-17thermal: intel_pch_thermal: fix build for ACPI not enabledRandy Dunlap1-0/+4
The reference to acpi_gbl_FADT causes a build error when ACPI is not enabled. Fix by making that conditional on CONFIG_ACPI. ../drivers/thermal/intel/intel_pch_thermal.c: In function 'pch_wpt_suspend': ../drivers/thermal/intel/intel_pch_thermal.c:217:8: error: 'acpi_gbl_FADT' undeclared (first use in this function); did you mean 'acpi_get_type'? if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) ^~~~~~~~~~~~~ Fixes: ef63b043ac86 ("thermal: intel: pch: fix S0ix failure due to PCH temperature above threshold") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Amit Kucheria <amitk@kernel.org> Cc: linux-pm@vger.kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201117023807.8266-1-rdunlap@infradead.org
2020-11-14thermal: intel_pch_thermal: Add PCI ids for Lewisburg PCH.Andres Freund1-1/+9
I noticed that I couldn't read the PCH temperature on my workstation (C620 series chipset, w/ 2x Xeon Gold 5215 CPUs) directly, but had to go through IPMI. Looking at the data sheet, it looks to me like the existing intel PCH thermal driver should work without changes for Lewisburg. I suspect there's some other PCI IDs missing. But I hope somebody at Intel would have an easier time figuring that out than I... Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Tushar Dave <tushar.n.dave@intel.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/lkml/20200115184415.1726953-1-andres@anarazel.de/ Signed-off-by: Andres Freund <andres@anarazel.de> Reviewed-by: Pandruvada, Srinivas <srinivas.pandruvada@linux.intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201113204916.1144907-1-andres@anarazel.de
2020-11-07thermal: intel: pch: fix S0ix failure due to PCH temperature above thresholdSumeet Pawnikar1-6/+70
When system tries to enter S0ix suspend state, just after active load scenarios, it fails due to PCH current temperature is higher than set threshold. This patch introduces delay loop mechanism that allows PCH temperature to go down below threshold during suspend so it won't fail to enter S0ix. Add delay loop timeout and count as module parameters for user to tune it, if required based on system design. This change notifies the different warning messages like when PCH temperature above the threshold and executing delay loop. Also, notify the messages when it success or failure for S0ix entry. Previously out of 1000 runs around 3 to 5 times it might fail to enter S0ix just after heavy workload. With this change, S0ix failures reduced as PCH cools down below threshold. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201106170633.20838-1-sumeet.r.pawnikar@intel.com
2020-08-04thermal: intel: intel_pch_thermal: Add Cannon Lake Low Power PCH supportSumeet Pawnikar1-0/+3
Add LP (Low Power) PCH id for Cannon Lake (CNL) based platforms. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1596097503-27924-1-git-send-email-sumeet.r.pawnikar@intel.com
2020-06-29thermal: Explicitly enable non-changing thermal zone devicesAndrzej Pietrasiewicz1-0/+5
Some thermal zone devices never change their state, so they should be always enabled. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200629122925.21729-9-andrzej.p@collabora.com
2020-01-31thermal: intel_pch: switch to use <linux/units.h> helpersAkinobu Mita1-1/+2
This switches the intel pch thermal driver to use deci_kelvin_to_millicelsius() in <linux/units.h> instead of helpers in <linux/thermal.h>. This is preparation for centralizing the kelvin to/from Celsius conversion helpers in <linux/units.h>. Link: http://lkml.kernel.org/r/1576386975-7941-7-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-27thermal: intel: intel_pch_thermal: Add Comet Lake (CML) platform supportGayatri Kammela1-0/+8
Add Comet Lake to the list of the platforms to support intel_pch_thermal driver. Cc: Zhang rui <rui.zhang@intel.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20191211200043.4985-1-gayatri.kammela@intel.com
2020-01-27thermal: intel: Fix unmatched pci_release_regionChuhong Yuan1-1/+1
The driver calls pci_request_regions() in probe and uses pci_release_regions() in probe failure. However, it calls pci_release_region() in remove, which does match the other two calls. Use pci_release_regions() instead to unify them. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20191206075531.18637-1-hslester96@gmail.com
2019-09-24thermal: intel: Use dev_get_drvdataChuhong Yuan1-4/+2
Instead of using to_pci_dev + pci_get_drvdata, use dev_get_drvdata to make code simpler. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288Thomas Gleixner1-10/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 263 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-07drivers: thermal: Move various drivers for intel platforms into a subdirAmit Kucheria1-0/+432
This cleans up the directory a bit, now that we have several other platforms using platform-specific sub-directories. Compile-tested with ARCH=x86 defconfig and the drivers explicitly enabled with menuconfig. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>