summaryrefslogtreecommitdiff
path: root/drivers/thermal
AgeCommit message (Collapse)AuthorFilesLines
2023-03-11thermal: intel: powerclamp: Fix cur_state for multi package systemSrinivas Pandruvada1-4/+16
commit 8e47363588377e1bdb65e2b020b409cfb44dd260 upstream. The powerclamp cooling device cur_state shows actual idle observed by package C-state idle counters. But the implementation is not sufficient for multi package or multi die system. The cur_state value is incorrect. On these systems, these counters must be read from each package/die and somehow aggregate them. But there is no good method for aggregation. It was not a problem when explicit CPU model addition was required to enable intel powerclamp. In this way certain CPU models could have been avoided. But with the removal of CPU model check with the availability of Package C-state counters, the driver is loaded on most of the recent systems. For multi package/die systems, just show the actual target idle state, the system is trying to achieve. In powerclamp this is the user set state minus one. Also there is no use of starting a worker thread for polling package C-state counters and applying any compensation for multiple package or multiple die systems. Fixes: b721ca0d1927 ("thermal/powerclamp: remove cpu whitelist") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-11thermal: intel: quark_dts: fix error pointer dereferenceDan Carpenter1-10/+2
[ Upstream commit f1b930e740811d416de4d2074da48b6633a672c8 ] If alloc_soc_dts() fails, then we can just return. Trying to free "soc_dts" will lead to an Oops. Fixes: 8c1876939663 ("thermal: intel Quark SoC X1000 DTS thermal driver") Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11thermal: intel: Fix unsigned comparison with less than zeroYang Li1-1/+1
[ Upstream commit e7fcfe67f9f410736b758969477b17ea285e8e6c ] The return value from the call to intel_tcc_get_tjmax() is int, which can be a negative error code. However, the return value is being assigned to an u32 variable 'tj_max', so making 'tj_max' an int. Eliminate the following warning: ./drivers/thermal/intel/intel_soc_dts_iosf.c:394:5-11: WARNING: Unsigned expression compared with zero: tj_max < 0 Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3637 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22thermal: intel: int340x: Add locking to int340x_thermal_get_trip_type()Rafael J. Wysocki1-3/+7
commit acd7e9ee57c880b99671dd99680cb707b7b5b0ee upstream. In order to prevent int340x_thermal_get_trip_type() from possibly racing with int340x_thermal_read_trips() invoked by int3403_notify() add locking to it in analogy with int340x_thermal_get_trip_temp(). Fixes: 6757a7abe47b ("thermal: intel: int340x: Protect trip temperature from concurrent updates") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22thermal: intel: int340x: Protect trip temperature from concurrent updatesSrinivas Pandruvada2-3/+16
[ Upstream commit 6757a7abe47bcb12cb2d45661067e182424b0ee3 ] Trip temperatures are read using ACPI methods and stored in the memory during zone initializtion and when the firmware sends a notification for change. This trip temperature is returned when the thermal core calls via callback get_trip_temp(). But it is possible that while updating the memory copy of the trips when the firmware sends a notification for change, thermal core is reading the trip temperature via the callback get_trip_temp(). This may return invalid trip temperature. To address this add a mutex to protect the invalid temperature reads in the callback get_trip_temp() and int340x_thermal_read_trips(). Fixes: 5fbf7f27fa3d ("Thermal/int340x: Add common thermal zone handler") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 5.0+ <stable@vger.kernel.org> # 5.0+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26thermal: intel_powerclamp: Use first online CPU as control_cpuRafael J. Wysocki1-5/+1
commit 4bb7f6c2781e46fc5bd00475a66df2ea30ef330d upstream. Commit 68b99e94a4a2 ("thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash") fixed an issue related to using smp_processor_id() in preemptible context by replacing it with a pair of get_cpu()/put_cpu(), but what is needed there really is any online CPU and not necessarily the one currently running the code. Arguably, getting the one that's running the code in there is confusing. For this reason, simply give the control CPU role to the first online one which automatically will be CPU0 if it is online, so one check can be dropped from the code for an added benefit. Link: https://lore.kernel.org/linux-pm/20221011113646.GA12080@duo.ucw.cz/ Fixes: 68b99e94a4a2 ("thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-26thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to ↵Srinivas Pandruvada1-2/+4
avoid crash [ Upstream commit 68b99e94a4a2db6ba9b31fe0485e057b9354a640 ] When CPU 0 is offline and intel_powerclamp is used to inject idle, it generates kernel BUG: BUG: using smp_processor_id() in preemptible [00000000] code: bash/15687 caller is debug_smp_processor_id+0x17/0x20 CPU: 4 PID: 15687 Comm: bash Not tainted 5.19.0-rc7+ #57 Call Trace: <TASK> dump_stack_lvl+0x49/0x63 dump_stack+0x10/0x16 check_preemption_disabled+0xdd/0xe0 debug_smp_processor_id+0x17/0x20 powerclamp_set_cur_state+0x7f/0xf9 [intel_powerclamp] ... ... Here CPU 0 is the control CPU by default and changed to the current CPU, if CPU 0 offlined. This check has to be performed under cpus_read_lock(), hence the above warning. Use get_cpu() instead of smp_processor_id() to avoid this BUG. Suggested-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25thermal: sysfs: Fix cooling_device_stats_setup() error code pathRafael J. Wysocki1-3/+7
commit d5a8aa5d7d80d21ab6b266f1bed4194b61746199 upstream. If cooling_device_stats_setup() fails to create the stats object, it must clear the last slot in cooling_device_attr_groups that was initially empty (so as to make it possible to add stats attributes to the cooling device attribute groups). Failing to do so may cause the stats attributes to be created by mistake for a device that doesn't have a stats object, because the slot in question might be populated previously during the registration of another cooling device. Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs") Reported-by: Di Shen <di.shen@unisoc.com> Tested-by: Di Shen <di.shen@unisoc.com> Cc: 4.17+ <stable@vger.kernel.org> # 4.17+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-15thermal: int340x: Increase bitmap sizeSrinivas Pandruvada1-1/+1
commit 668f69a5f863b877bc3ae129efe9a80b6f055141 upstream. The number of policies are 10, so can't be supported by the bitmap size of u8. Even though there are no platfoms with these many policies, but for correctness increase to u32. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Fixes: 16fc8eca1975 ("thermal/int340x_thermal: Add additional UUIDs") Cc: 5.1+ <stable@vger.kernel.org> # 5.1+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-28thermal: int340x: fix memory leak in int3400_notify()Chuansheng Liu1-0/+4
commit 3abea10e6a8f0e7804ed4c124bea2d15aca977c8 upstream. It is easy to hit the below memory leaks in my TigerLake platform: unreferenced object 0xffff927c8b91dbc0 (size 32): comm "kworker/0:2", pid 112, jiffies 4294893323 (age 83.604s) hex dump (first 32 bytes): 4e 41 4d 45 3d 49 4e 54 33 34 30 30 20 54 68 65 NAME=INT3400 The 72 6d 61 6c 00 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 rmal.kkkkkkkkkk. backtrace: [<ffffffff9c502c3e>] __kmalloc_track_caller+0x2fe/0x4a0 [<ffffffff9c7b7c15>] kvasprintf+0x65/0xd0 [<ffffffff9c7b7d6e>] kasprintf+0x4e/0x70 [<ffffffffc04cb662>] int3400_notify+0x82/0x120 [int3400_thermal] [<ffffffff9c8b7358>] acpi_ev_notify_dispatch+0x54/0x71 [<ffffffff9c88f1a7>] acpi_os_execute_deferred+0x17/0x30 [<ffffffff9c2c2c0a>] process_one_work+0x21a/0x3f0 [<ffffffff9c2c2e2a>] worker_thread+0x4a/0x3b0 [<ffffffff9c2cb4dd>] kthread+0xfd/0x130 [<ffffffff9c201c1f>] ret_from_fork+0x1f/0x30 Fix it by calling kfree() accordingly. Fixes: 38e44da59130 ("thermal: int3400_thermal: process "thermal table changed" event") Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [sudip: change in old path] Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08thermal: core: Reset previous low and high trip during thermal zone initManaf Meethalavalappu Pallikunhi1-0/+2
[ Upstream commit 99b63316c39988039965693f5f43d8b4ccb1c86c ] During the suspend is in process, thermal_zone_device_update bails out thermal zone re-evaluation for any sensor trip violation without setting next valid trip to that sensor. It assumes during resume it will re-evaluate same thermal zone and update trip. But when it is in suspend temperature goes down and on resume path while updating thermal zone if temperature is less than previously violated trip, thermal zone set trip function evaluates the same previous high and previous low trip as new high and low trip. Since there is no change in high/low trip, it bails out from thermal zone set trip API without setting any trip. It leads to a case where sensor high trip or low trip is disabled forever even though thermal zone has a valid high or low trip. During thermal zone device init, reset thermal zone previous high and low trip. It resolves above mentioned scenario. Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-06thermal/core: Potential buffer overflow in thermal_build_list_of_policies()Dan Carpenter1-4/+3
[ Upstream commit 1bb30b20b49773369c299d4d6c65227201328663 ] After printing the list of thermal governors, then this function prints a newline character. The problem is that "size" has not been updated after printing the last governor. This means that it can write one character (the NUL terminator) beyond the end of the buffer. Get rid of the "size" variable and just use "PAGE_SIZE - count" directly. Fixes: 1b4f48494eb2 ("thermal: core: group functions related to governor handling") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210916131342.GB25094@kili Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-26thermal/drivers/exynos: Fix an error code in exynos_tmu_probe()Dan Carpenter1-0/+1
commit 02d438f62c05f0d055ceeedf12a2f8796b258c08 upstream. This error path return success but it should propagate the negative error code from devm_clk_get(). Fixes: 6c247393cfdd ("thermal: exynos: Add TMU support for Exynos7 SoC") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210810084413.GA23810@kili Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-28thermal/core: Correct function name thermal_zone_device_unregister()Yang Yingliang1-1/+1
[ Upstream commit a052b5118f13febac1bd901fe0b7a807b9d6b51c ] Fix the following make W=1 kernel build warning: drivers/thermal/thermal_core.c:1376: warning: expecting prototype for thermal_device_unregister(). Prototype was for thermal_zone_device_unregister() instead Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210517051020.3463536-1-yangyingliang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-22thermal/core/fair share: Lock the thermal zone while looping over instancesLukasz Luba1-0/+4
commit fef05776eb02238dcad8d5514e666a42572c3f32 upstream. The tz->lock must be hold during the looping over the instances in that thermal zone. This lock was missing in the governor code since the beginning, so it's hard to point into a particular commit. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210422153624.6074-2-lukasz.luba@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07thermal/core: Add NULL pointer check before using cooling device statsManaf Meethalavalappu Pallikunhi1-0/+3
[ Upstream commit 2046a24ae121cd107929655a6aaf3b8c5beea01f ] There is a possible chance that some cooling device stats buffer allocation fails due to very high cooling device max state value. Later cooling device update sysfs can try to access stats data for the same cooling device. It will lead to NULL pointer dereference issue. Add a NULL pointer check before accessing thermal cooling device stats data. It fixes the following bug [ 26.812833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004 [ 27.122960] Call trace: [ 27.122963] do_raw_spin_lock+0x18/0xe8 [ 27.122966] _raw_spin_lock+0x24/0x30 [ 27.128157] thermal_cooling_device_stats_update+0x24/0x98 [ 27.128162] cur_state_store+0x88/0xb8 [ 27.128166] dev_attr_store+0x40/0x58 [ 27.128169] sysfs_kf_write+0x50/0x68 [ 27.133358] kernfs_fop_write+0x12c/0x1c8 [ 27.133362] __vfs_write+0x54/0x160 [ 27.152297] vfs_write+0xcc/0x188 [ 27.157132] ksys_write+0x78/0x108 [ 27.162050] ksys_write+0xf8/0x108 [ 27.166968] __arm_smccc_hvc+0x158/0x4b0 [ 27.166973] __arm_smccc_hvc+0x9c/0x4b0 [ 27.186005] el0_svc+0x8/0xc Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1607367181-24589-1-git-send-email-manafm@codeaurora.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01thermal: rcar_thermal: Handle probe error gracefullyNiklas Söderlund1-2/+4
[ Upstream commit 39056e8a989ef52486e063e34b4822b341e47b0e ] If the common register memory resource is not available the driver needs to fail gracefully to disable PM. Instead of returning the error directly store it in ret and use the already existing error path. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200310114709.1483860-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-09thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430Tony Lindgren2-14/+19
[ Upstream commit 30d24faba0532d6972df79a1bf060601994b5873 ] We can sometimes get bogus thermal shutdowns on omap4430 at least with droid4 running idle with a battery charger connected: thermal thermal_zone0: critical temperature reached (143 C), shutting down Dumping out the register values shows we can occasionally get a 0x7f value that is outside the TRM listed values in the ADC conversion table. And then we get a normal value when reading again after that. Reading the register multiple times does not seem help avoiding the bogus values as they stay until the next sample is ready. Looking at the TRM chapter "18.4.10.2.3 ADC Codes Versus Temperature", we should have values from 13 to 107 listed with a total of 95 values. But looking at the omap4430_adc_to_temp array, the values are off, and the end values are missing. And it seems that the 4430 ADC table is similar to omap3630 rather than omap4460. Let's fix the issue by using values based on the omap3630 table and just ignoring invalid values. Compared to the 4430 TRM, the omap3630 table has the missing values added while the TRM table only shows every second value. Note that sometimes the ADC register values within the valid table can also be way off for about 1 out of 10 values. But it seems that those just show about 25 C too low values rather than too high values. So those do not cause a bogus thermal shutdown. Fixes: 1a31270e54d7 ("staging: omap-thermal: add OMAP4 data structures") Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200706183338.25622-1-tony@atomide.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19thermal: ti-soc-thermal: Fix reversed condition in ti_thermal_expose_sensor()Dan Carpenter1-1/+1
[ Upstream commit 0f348db01fdf128813fdd659fcc339038fb421a4 ] This condition is reversed and will cause breakage. Fixes: 7440f518dad9 ("thermal/drivers/ti-soc-thermal: Avoid dereferencing ERR_PTR") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200616091949.GA11940@mwanda Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-22thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from powerFinley Xiao1-3/+3
commit 371a3bc79c11b707d7a1b7a2c938dc3cc042fffb upstream. The function cpu_power_to_freq is used to find a frequency and set the cooling device to consume at most the power to be converted. For example, if the power to be converted is 80mW, and the em table is as follow. struct em_cap_state table[] = { /* KHz mW */ { 1008000, 36, 0 }, { 1200000, 49, 0 }, { 1296000, 59, 0 }, { 1416000, 72, 0 }, { 1512000, 86, 0 }, }; The target frequency should be 1416000KHz, not 1512000KHz. Fixes: 349d39dc5739 ("thermal: cpu_cooling: merge frequency and power tables") Cc: <stable@vger.kernel.org> # v4.13+ Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200619090825.32747-1-finley.xiao@rock-chips.com Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22Revert "thermal: mediatek: fix register index error"Enric Balletbo i Serra1-4/+2
[ Upstream commit a8f62f183021be389561570ab5f8c701a5e70298 ] This reverts commit eb9aecd90d1a39601e91cd08b90d5fee51d321a6 The above patch is supposed to fix a register index error on mt2701. It is not clear if the problem solved is a hang or just an invalid value returned, my guess is the second. The patch introduces, though, a new hang on MT8173 device making them unusable. So, seems reasonable, revert the patch because introduces a worst issue. The reason I send a revert instead of trying to fix the issue for MT8173 is because the information needed to fix the issue is in the datasheet and is not public. So I am not really able to fix it. Fixes the following bug when CONFIG_MTK_THERMAL is set on MT8173 devices. [ 2.222488] Unable to handle kernel paging request at virtual address ffff8000125f5001 [ 2.230421] Mem abort info: [ 2.233207] ESR = 0x96000021 [ 2.236261] EC = 0x25: DABT (current EL), IL = 32 bits [ 2.241571] SET = 0, FnV = 0 [ 2.244623] EA = 0, S1PTW = 0 [ 2.247762] Data abort info: [ 2.250640] ISV = 0, ISS = 0x00000021 [ 2.254473] CM = 0, WnR = 0 [ 2.257544] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041850000 [ 2.264251] [ffff8000125f5001] pgd=000000013ffff003, pud=000000013fffe003, pmd=000000013fff9003, pte=006800001100b707 [ 2.274867] Internal error: Oops: 96000021 [#1] PREEMPT SMP [ 2.280432] Modules linked in: [ 2.283483] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc6+ #162 [ 2.289914] Hardware name: Google Elm (DT) [ 2.294003] pstate: 20000005 (nzCv daif -PAN -UAO) [ 2.298792] pc : mtk_read_temp+0xb8/0x1c8 [ 2.302793] lr : mtk_read_temp+0x7c/0x1c8 [ 2.306794] sp : ffff80001003b930 [ 2.310100] x29: ffff80001003b930 x28: 0000000000000000 [ 2.315404] x27: 0000000000000002 x26: ffff0000f9550b10 [ 2.320709] x25: ffff0000f9550a80 x24: 0000000000000090 [ 2.326014] x23: ffff80001003ba24 x22: 00000000610344c0 [ 2.331318] x21: 0000000000002710 x20: 00000000000001f4 [ 2.336622] x19: 0000000000030d40 x18: ffff800011742ec0 [ 2.341926] x17: 0000000000000001 x16: 0000000000000001 [ 2.347230] x15: ffffffffffffffff x14: ffffff0000000000 [ 2.352535] x13: ffffffffffffffff x12: 0000000000000028 [ 2.357839] x11: 0000000000000003 x10: ffff800011295ec8 [ 2.363143] x9 : 000000000000291b x8 : 0000000000000002 [ 2.368447] x7 : 00000000000000a8 x6 : 0000000000000004 [ 2.373751] x5 : 0000000000000000 x4 : ffff800011295cb0 [ 2.379055] x3 : 0000000000000002 x2 : ffff8000125f5001 [ 2.384359] x1 : 0000000000000001 x0 : ffff0000f9550a80 [ 2.389665] Call trace: [ 2.392105] mtk_read_temp+0xb8/0x1c8 [ 2.395760] of_thermal_get_temp+0x2c/0x40 [ 2.399849] thermal_zone_get_temp+0x78/0x160 [ 2.404198] thermal_zone_device_update.part.0+0x3c/0x1f8 [ 2.409589] thermal_zone_device_update+0x34/0x48 [ 2.414286] of_thermal_set_mode+0x58/0x88 [ 2.418375] thermal_zone_of_sensor_register+0x1a8/0x1d8 [ 2.423679] devm_thermal_zone_of_sensor_register+0x64/0xb0 [ 2.429242] mtk_thermal_probe+0x690/0x7d0 [ 2.433333] platform_drv_probe+0x5c/0xb0 [ 2.437335] really_probe+0xe4/0x448 [ 2.440901] driver_probe_device+0xe8/0x140 [ 2.445077] device_driver_attach+0x7c/0x88 [ 2.449252] __driver_attach+0xac/0x178 [ 2.453082] bus_for_each_dev+0x78/0xc8 [ 2.456909] driver_attach+0x2c/0x38 [ 2.460476] bus_add_driver+0x14c/0x230 [ 2.464304] driver_register+0x6c/0x128 [ 2.468131] __platform_driver_register+0x50/0x60 [ 2.472831] mtk_thermal_driver_init+0x24/0x30 [ 2.477268] do_one_initcall+0x50/0x298 [ 2.481098] kernel_init_freeable+0x1ec/0x264 [ 2.485450] kernel_init+0x1c/0x110 [ 2.488931] ret_from_fork+0x10/0x1c [ 2.492502] Code: f9401081 f9400402 b8a67821 8b010042 (b9400042) [ 2.498599] ---[ end trace e43e3105ed27dc99 ]--- [ 2.503367] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b [ 2.511020] SMP: stopping secondary CPUs [ 2.514941] Kernel Offset: disabled [ 2.518421] CPU features: 0x090002,25006005 [ 2.522595] Memory Limit: none [ 2.525644] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]-- Cc: Michael Kao <michael.kao@mediatek.com> Fixes: eb9aecd90d1a ("thermal: mediatek: fix register index error") Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200707103412.1010823-1-enric.balletbo@collabora.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-25thermal/drivers/ti-soc-thermal: Avoid dereferencing ERR_PTRSudip Mukherjee1-3/+3
[ Upstream commit 7440f518dad9d861d76c64956641eeddd3586f75 ] On error the function ti_bandgap_get_sensor_data() returns the error code in ERR_PTR() but we only checked if the return value is NULL or not. And, so we can dereference an error code inside ERR_PTR. While at it, convert a check to IS_ERR_OR_NULL. Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200424161944.6044-1-sudipm.mukherjee@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-05thermal: brcmstb_thermal: Do not use DT coefficientsFlorian Fainelli1-22/+9
commit e1ff6fc22f19e2af8adbad618526b80067911d40 upstream. At the time the brcmstb_thermal driver and its binding were merged, the DT binding did not make the coefficients properties a mandatory one, therefore all users of the brcmstb_thermal driver out there have a non functional implementation with zero coefficients. Even if these properties were provided, the formula used for computation is incorrect. The coefficients are entirely process specific (right now, only 28nm is supported) and not board or SoC specific, it is therefore appropriate to hard code them in the driver given the compatibility string we are probed with which has to be updated whenever a new process is introduced. We remove the existing coefficients definition since subsequent patches are going to add support for a new process and will introduce new coefficients as well. Fixes: 9e03cf1b2dd5 ("thermal: add brcmstb AVS TMON driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200114190607.29339-2-f.fainelli@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-27thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_powerMatthias Kaehlcke1-1/+1
[ Upstream commit bf45ac18b78038e43af3c1a273cae4ab5704d2ce ] The CPU load values passed to the thermal_power_cpu_get_power tracepoint are zero for all CPUs, unless, unless the thermal_power_cpu_limit tracepoint is enabled too: irq/41-rockchip-98 [000] .... 290.972410: thermal_power_cpu_get_power: cpus=0000000f freq=1800000 load={{0x0,0x0,0x0,0x0}} dynamic_power=4815 vs irq/41-rockchip-96 [000] .... 95.773585: thermal_power_cpu_get_power: cpus=0000000f freq=1800000 load={{0x56,0x64,0x64,0x5e}} dynamic_power=4959 irq/41-rockchip-96 [000] .... 95.773596: thermal_power_cpu_limit: cpus=0000000f freq=408000 cdev_state=10 power=416 There seems to be no good reason for omitting the CPU load information depending on another tracepoint. My guess is that the intention was to check whether thermal_power_cpu_get_power is (still) enabled, however 'load_cpu != NULL' already indicates that it was at least enabled when cpufreq_get_requested_power() was entered, there seems little gain from omitting the assignment if the tracepoint was just disabled, so just remove the check. Fixes: 6828a4711f99 ("thermal: add trace events to the power allocator governor") Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Javi Merino <javi.merino@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27thermal: rcar_gen3_thermal: fix interrupt typeJiada Wang1-32/+6
[ Upstream commit 2c0928c9e004589dc9e7672c40a38d6c4ca12701 ] Currently IRQF_SHARED type interrupt line is allocated, but it is not appropriate, as the interrupt line isn't shared between different devices, instead IRQF_ONESHOT is the proper type. By changing interrupt type to IRQF_ONESHOT, now irq handler is no longer needed, as clear of interrupt status can be done in threaded interrupt context. Because IRQF_ONESHOT type interrupt line is kept disabled until the threaded handler has been run, so there is no need to protect read/write of REG_GEN3_IRQSTR with lock. Fixes: 7d4b269776ec6 ("enable hardware interrupts for trip points") Signed-off-by: Jiada Wang <jiada_wang@mentor.com> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Tested-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27thermal: mediatek: fix register index errorMichael Kao1-2/+4
[ Upstream commit eb9aecd90d1a39601e91cd08b90d5fee51d321a6 ] The index of msr and adcpnp should match the sensor which belongs to the selected bank in the for loop. Fixes: b7cf0053738c ("thermal: Add Mediatek thermal driver for mt2701.") Signed-off-by: Michael Kao <michael.kao@mediatek.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-13thermal: Fix deadlock in thermal thermal_zone_device_checkWei Wang1-2/+2
commit 163b00cde7cf2206e248789d2780121ad5e6a70b upstream. 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") changed cancel_delayed_work to cancel_delayed_work_sync to avoid a use-after-free issue. However, cancel_delayed_work_sync could be called insides the WQ causing deadlock. [54109.642398] c0 1162 kworker/u17:1 D 0 11030 2 0x00000000 [54109.642437] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.642447] c0 1162 Call trace: [54109.642456] c0 1162 __switch_to+0x138/0x158 [54109.642467] c0 1162 __schedule+0xba4/0x1434 [54109.642480] c0 1162 schedule_timeout+0xa0/0xb28 [54109.642492] c0 1162 wait_for_common+0x138/0x2e8 [54109.642511] c0 1162 flush_work+0x348/0x40c [54109.642522] c0 1162 __cancel_work_timer+0x180/0x218 [54109.642544] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.642553] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.642563] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.642574] c0 1162 process_one_work+0x3cc/0x69c [54109.642583] c0 1162 worker_thread+0x49c/0x7c0 [54109.642593] c0 1162 kthread+0x17c/0x1b0 [54109.642602] c0 1162 ret_from_fork+0x10/0x18 [54109.643051] c0 1162 kworker/u17:2 D 0 16245 2 0x00000000 [54109.643067] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check [54109.643077] c0 1162 Call trace: [54109.643085] c0 1162 __switch_to+0x138/0x158 [54109.643095] c0 1162 __schedule+0xba4/0x1434 [54109.643104] c0 1162 schedule_timeout+0xa0/0xb28 [54109.643114] c0 1162 wait_for_common+0x138/0x2e8 [54109.643122] c0 1162 flush_work+0x348/0x40c [54109.643131] c0 1162 __cancel_work_timer+0x180/0x218 [54109.643141] c0 1162 handle_thermal_trip+0x2c4/0x5a4 [54109.643150] c0 1162 thermal_zone_device_update+0x1b4/0x25c [54109.643159] c0 1162 thermal_zone_device_check+0x18/0x24 [54109.643167] c0 1162 process_one_work+0x3cc/0x69c [54109.643177] c0 1162 worker_thread+0x49c/0x7c0 [54109.643186] c0 1162 kthread+0x17c/0x1b0 [54109.643195] c0 1162 ret_from_fork+0x10/0x18 [54109.644500] c0 1162 cat D 0 7766 1 0x00000001 [54109.644515] c0 1162 Call trace: [54109.644524] c0 1162 __switch_to+0x138/0x158 [54109.644536] c0 1162 __schedule+0xba4/0x1434 [54109.644546] c0 1162 schedule_preempt_disabled+0x80/0xb0 [54109.644555] c0 1162 __mutex_lock+0x3a8/0x7f0 [54109.644563] c0 1162 __mutex_lock_slowpath+0x14/0x20 [54109.644575] c0 1162 thermal_zone_get_temp+0x84/0x360 [54109.644586] c0 1162 temp_show+0x30/0x78 [54109.644609] c0 1162 dev_attr_show+0x5c/0xf0 [54109.644628] c0 1162 sysfs_kf_seq_show+0xcc/0x1a4 [54109.644636] c0 1162 kernfs_seq_show+0x48/0x88 [54109.644656] c0 1162 seq_read+0x1f4/0x73c [54109.644664] c0 1162 kernfs_fop_read+0x84/0x318 [54109.644683] c0 1162 __vfs_read+0x50/0x1bc [54109.644692] c0 1162 vfs_read+0xa4/0x140 [54109.644701] c0 1162 SyS_read+0xbc/0x144 [54109.644708] c0 1162 el0_svc_naked+0x34/0x38 [54109.845800] c0 1162 D 720.000s 1->7766->7766 cat [panic] Fixes: 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device") Cc: stable@vger.kernel.org Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-01thermal: rcar_thermal: Prevent hardware access during system suspendGeert Uytterhoeven1-2/+2
[ Upstream commit 3a31386217628ffe2491695be2db933c25dde785 ] On r8a7791/koelsch, sometimes the following message is printed during system suspend: rcar_thermal e61f0000.thermal: thermal sensor was broken This happens if the workqueue runs while the device is already suspended. Fix this by using the freezable system workqueue instead, cfr. commit 51e20d0e3a60cf46 ("thermal: Prevent polling from happening during system suspend"). Fixes: e0a5172e9eec7f0d ("thermal: rcar: add interrupt support") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01thermal: rcar_thermal: fix duplicate IRQ requestSergei Shtylyov1-1/+1
[ Upstream commit df016bbba63743bbef9ff5c6c282561211dd72cc ] The driver on R8A77995 requests the same IRQ twice since platform_get_resource() is always called for the 1st IRQ resource. Fixes: 1969d9dc2079 ("thermal: rcar_thermal: add r8a77995 support") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01thermal: armada: fix a test in probe()Dan Carpenter1-2/+2
[ Upstream commit d1d2c290b3c04b65fa6132eeebe50a070746d8f6 ] The platform_get_resource() function doesn't return error pointers, it returns NULL on error. Fixes: 3d4e51844a4e ("thermal: armada: convert driver to syscon register accesses") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-11thermal_hwmon: Sanitize thermal_zone typeStefan Mavrodiev1-2/+6
[ Upstream commit 8c7aa184281c01fc26f319059efb94725012921d ] When calling thermal_add_hwmon_sysfs(), the device type is sanitized by replacing '-' with '_'. However tz->type remains unsanitized. Thus calling thermal_hwmon_lookup_by_type() returns no device. And if there is no device, thermal_remove_hwmon_sysfs() fails with "hwmon device lookup failed!". The result is unregisted hwmon devices in the sysfs. Fixes: 409ef0bacacf ("thermal_hwmon: Sanitize attribute name passed to hwmon") Signed-off-by: Stefan Mavrodiev <stefan@olimex.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-11thermal: Fix use-after-free when unregistering thermal zone deviceIdo Schimmel1-1/+1
[ Upstream commit 1851799e1d2978f68eea5d9dff322e121dcf59c1 ] thermal_zone_device_unregister() cancels the delayed work that polls the thermal zone, but it does not wait for it to finish. This is racy with respect to the freeing of the thermal zone device, which can result in a use-after-free [1]. Fix this by waiting for the delayed work to finish before freeing the thermal zone device. Note that thermal_zone_device_set_polling() is never invoked from an atomic context, so it is safe to call cancel_delayed_work_sync() that can block. [1] [ +0.002221] ================================================================== [ +0.000064] BUG: KASAN: use-after-free in __mutex_lock+0x1076/0x11c0 [ +0.000016] Read of size 8 at addr ffff8881e48e0450 by task kworker/1:0/17 [ +0.000023] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.2.0-rc6-custom-02495-g8e73ca3be4af #1701 [ +0.000010] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016 [ +0.000016] Workqueue: events_freezable_power_ thermal_zone_device_check [ +0.000012] Call Trace: [ +0.000021] dump_stack+0xa9/0x10e [ +0.000020] print_address_description.cold.2+0x9/0x25e [ +0.000018] __kasan_report.cold.3+0x78/0x9d [ +0.000016] kasan_report+0xe/0x20 [ +0.000016] __mutex_lock+0x1076/0x11c0 [ +0.000014] step_wise_throttle+0x72/0x150 [ +0.000018] handle_thermal_trip+0x167/0x760 [ +0.000019] thermal_zone_device_update+0x19e/0x5f0 [ +0.000019] process_one_work+0x969/0x16f0 [ +0.000017] worker_thread+0x91/0xc40 [ +0.000014] kthread+0x33d/0x400 [ +0.000015] ret_from_fork+0x3a/0x50 [ +0.000020] Allocated by task 1: [ +0.000015] save_stack+0x19/0x80 [ +0.000015] __kasan_kmalloc.constprop.4+0xc1/0xd0 [ +0.000014] kmem_cache_alloc_trace+0x152/0x320 [ +0.000015] thermal_zone_device_register+0x1b4/0x13a0 [ +0.000015] mlxsw_thermal_init+0xc92/0x23d0 [ +0.000014] __mlxsw_core_bus_device_register+0x659/0x11b0 [ +0.000013] mlxsw_core_bus_device_register+0x3d/0x90 [ +0.000013] mlxsw_pci_probe+0x355/0x4b0 [ +0.000014] local_pci_probe+0xc3/0x150 [ +0.000013] pci_device_probe+0x280/0x410 [ +0.000013] really_probe+0x26a/0xbb0 [ +0.000013] driver_probe_device+0x208/0x2e0 [ +0.000013] device_driver_attach+0xfe/0x140 [ +0.000013] __driver_attach+0x110/0x310 [ +0.000013] bus_for_each_dev+0x14b/0x1d0 [ +0.000013] driver_register+0x1c0/0x400 [ +0.000015] mlxsw_sp_module_init+0x5d/0xd3 [ +0.000014] do_one_initcall+0x239/0x4dd [ +0.000013] kernel_init_freeable+0x42b/0x4e8 [ +0.000012] kernel_init+0x11/0x18b [ +0.000013] ret_from_fork+0x3a/0x50 [ +0.000015] Freed by task 581: [ +0.000013] save_stack+0x19/0x80 [ +0.000014] __kasan_slab_free+0x125/0x170 [ +0.000013] kfree+0xf3/0x310 [ +0.000013] thermal_release+0xc7/0xf0 [ +0.000014] device_release+0x77/0x200 [ +0.000014] kobject_put+0x1a8/0x4c0 [ +0.000014] device_unregister+0x38/0xc0 [ +0.000014] thermal_zone_device_unregister+0x54e/0x6a0 [ +0.000014] mlxsw_thermal_fini+0x184/0x35a [ +0.000014] mlxsw_core_bus_device_unregister+0x10a/0x640 [ +0.000013] mlxsw_devlink_core_bus_device_reload+0x92/0x210 [ +0.000015] devlink_nl_cmd_reload+0x113/0x1f0 [ +0.000014] genl_family_rcv_msg+0x700/0xee0 [ +0.000013] genl_rcv_msg+0xca/0x170 [ +0.000013] netlink_rcv_skb+0x137/0x3a0 [ +0.000012] genl_rcv+0x29/0x40 [ +0.000013] netlink_unicast+0x49b/0x660 [ +0.000013] netlink_sendmsg+0x755/0xc90 [ +0.000013] __sys_sendto+0x3de/0x430 [ +0.000013] __x64_sys_sendto+0xe2/0x1b0 [ +0.000013] do_syscall_64+0xa4/0x4d0 [ +0.000013] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ +0.000017] The buggy address belongs to the object at ffff8881e48e0008 which belongs to the cache kmalloc-2k of size 2048 [ +0.000012] The buggy address is located 1096 bytes inside of 2048-byte region [ffff8881e48e0008, ffff8881e48e0808) [ +0.000007] The buggy address belongs to the page: [ +0.000012] page:ffffea0007923800 refcount:1 mapcount:0 mapping:ffff88823680d0c0 index:0x0 compound_mapcount: 0 [ +0.000020] flags: 0x200000000010200(slab|head) [ +0.000019] raw: 0200000000010200 ffffea0007682008 ffffea00076ab808 ffff88823680d0c0 [ +0.000016] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000 [ +0.000007] page dumped because: kasan: bad access detected [ +0.000012] Memory state around the buggy address: [ +0.000012] ffff8881e48e0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] ffff8881e48e0380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] >ffff8881e48e0400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000008] ^ [ +0.000012] ffff8881e48e0480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000012] ffff8881e48e0500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ +0.000007] ================================================================== Fixes: b1569e99c795 ("ACPI: move thermal trip handling to generic thermal layer") Reported-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-15drivers: thermal: tsens: Don't print error message on -EPROBE_DEFERAmit Kucheria1-1/+2
[ Upstream commit fc7d18cf6a923cde7f5e7ba2c1105bb106d3e29a ] We print a calibration failure message on -EPROBE_DEFER from nvmem/qfprom as follows: [ 3.003090] qcom-tsens 4a9000.thermal-sensor: version: 1.4 [ 3.005376] qcom-tsens 4a9000.thermal-sensor: tsens calibration failed [ 3.113248] qcom-tsens 4a9000.thermal-sensor: version: 1.4 This confuses people when, in fact, calibration succeeds later when nvmem/qfprom device is available. Don't print this message on a -EPROBE_DEFER. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-15thermal: rcar_gen3_thermal: disable interrupt in .removeJiada Wang1-0/+3
[ Upstream commit 63f55fcea50c25ae5ad45af92d08dae3b84534c2 ] Currently IRQ remains enabled after .remove, later if device is probed, IRQ is requested before .thermal_init, this may cause IRQ function be called before device is initialized. this patch disables interrupt in .remove, to ensure irq function only be called after device is fully initialized. Signed-off-by: Jiada Wang <jiada_wang@mentor.com> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-14x86/cpu: Sanitize FAM6_ATOM namingPeter Zijlstra1-1/+1
commit f2c4db1bd80720cd8cb2a5aa220d9bc9f374f04e upstream Going primarily by: https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors with additional information gleaned from other related pages; notably: - Bonnell shrink was called Saltwell - Moorefield is the Merriefield refresh which makes it Airmont The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE for i in `git grep -l FAM6_ATOM` ; do sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \ -e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \ -e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \ -e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \ -e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \ -e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \ -e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \ -e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \ -e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \ -e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \ -e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i} done Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: dave.hansen@linux.intel.com Cc: len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-20thermal/intel_powerclamp: fix truncated kthread nameZhang Rui1-1/+1
[ Upstream commit e925b5be5751f6a7286bbd9a4cbbc4ac90cc5fa6 ] kthread name only allows 15 characters (TASK_COMMON_LEN is 16). Thus rename the kthreads created by intel_powerclamp driver from "kidle_inject/ + decimal cpuid" to "kidle_inj/ + decimal cpuid" to avoid truncated kthead name for cpu 100 and later. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20thermal/int340x_thermal: fix mode settingMatthew Garrett1-4/+3
[ Upstream commit 396ee4d0cd52c13b3f6421b8d324d65da5e7e409 ] int3400 only pushes the UUID into the firmware when the mode is flipped to "enable". The current code only exposes the mode flag if the firmware supports the PASSIVE_1 UUID, which not all machines do. Remove the restriction. Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20thermal/int340x_thermal: Add additional UUIDsMatthew Garrett1-0/+14
[ Upstream commit 16fc8eca1975358111dbd7ce65e4ce42d1a848fb ] Add more supported DPTF policies than the driver currently exposes. Signed-off-by: Matthew Garrett <mjg59@google.com> Cc: Nisha Aram <nisha.aram@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20thermal: bcm2835: Fix crash in bcm2835_thermal_debugfsPhil Elwell1-5/+4
[ Upstream commit 35122495a8c6683e863acf7b05a7036b2be64c7a ] "cat /sys/kernel/debug/bcm2835_thermal/regset" causes a NULL pointer dereference in bcm2835_thermal_debugfs. The driver makes use of the implementation details of the thermal framework to retrieve a pointer to its private data from a struct thermal_zone_device, and gets it wrong - leading to the crash. Instead, store its private data as the drvdata and retrieve the thermal_zone_device pointer from it. Fixes: bcb7dd9ef206 ("thermal: bcm2835: add thermal driver for bcm2835 SoC") Signed-off-by: Phil Elwell <phil@raspberrypi.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20thermal: samsung: Fix incorrect check after code mergeMarek Szyprowski1-1/+1
[ Upstream commit 3b5236cc5d086dd3ddd01113ee9255421aab9fab ] Merge commit 19785cf93b6c ("Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal") broke the code introduced by commit ffe6e16f14fa ("thermal: exynos: Reduce severity of too early temperature read"). Restore the original code from the mentioned commit to finally fix the warning message during boot: thermal thermal_zone0: failed to read out thermal zone (-22) Reported-by: Marian Mihailescu <mihailescu2m@gmail.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 19785cf93b6c ("Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal") Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20thermal/intel_powerclamp: fix __percpu declaration of worker_dataLuc Van Oostenryck1-1/+1
[ Upstream commit aa36e3616532f82a920b5ebf4e059fbafae63d88 ] This variable is declared as: static struct powerclamp_worker_data * __percpu worker_data; In other words, a percpu pointer to struct ... But this variable not used like so but as a pointer to a percpu struct powerclamp_worker_data. So fix the declaration as: static struct powerclamp_worker_data __percpu *worker_data; This also quiets Sparse's warnings from __verify_pcpu_ptr(), like: 494:49: warning: incorrect type in initializer (different address spaces) 494:49: expected void const [noderef] <asn:3> *__vpp_verify 494:49: got struct powerclamp_worker_data * Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-03-05drivers: thermal: int340x_thermal: Fix sysfs race conditionAaron Hill1-13/+15
[ Upstream commit 129699bb8c7572106b5bbb2407c2daee4727ccad ] Changes since V1: * Use dev_info instead of printk * Use dev_warn instead of BUG_ON Previously, sysfs_create_group was called before all initialization had fully run - specifically, before pci_set_drvdata was called. Since the sysctl group is visible to userspace as soon as sysfs_create_group returns, a small window of time existed during which a process could read from an uninitialized/partially-initialized device. This commit moves the creation of the sysctl group to after all initialized is completed. This ensures that it's impossible for userspace to read from a sysctl file before initialization has fully completed. To catch any future regressions, I've added a check to ensure that proc_thermal_emum_mode is never PROC_THERMAL_NONE when a process tries to read from a sysctl file. Previously, the aforementioned race condition could result in the 'else' branch running while PROC_THERMAL_NONE was set, leading to a null pointer deference. Signed-off-by: Aaron Hill <aa1ronham@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-03-05thermal: int340x_thermal: Fix a NULL vs IS_ERR() checkDan Carpenter1-1/+1
[ Upstream commit 3fe931b31a4078395c1967f0495dcc9e5ec6b5e3 ] The intel_soc_dts_iosf_init() function doesn't return NULL, it returns error pointers. Fixes: 4d0dd6c1576b ("Thermal/int340x/processor_thermal: Enable auxiliary DTS for Braswell") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12thermal: hwmon: inline helpers when CONFIG_THERMAL_HWMON is not setEduardo Valentin1-2/+2
commit 03334ba8b425b2ad275c8f390cf83c7b081c3095 upstream. Avoid warnings like this: thermal_hwmon.h:29:1: warning: ‘thermal_remove_hwmon_sysfs’ defined but not used [-Wunused-function] thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz) Fixes: 0dd88793aacd ("thermal: hwmon: move hwmon support to single file") Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-12thermal: generic-adc: Fix adc to temp interpolationBjorn Andersson1-4/+8
[ Upstream commit 9d216211fded20fff301d0317af3238d8383634c ] First correct the edge case to return the last element if we're outside the range, rather than at the last element, so that interpolation is not omitted for points between the two last entries in the table. Then correct the formula to perform linear interpolation based the two points surrounding the read ADC value. The indices for temp are kept as "hi" and "lo" to pair with the adc indices, but there's no requirement that the temperature is provided in descendent order. mult_frac() is used to prevent issues with overflowing the int. Cc: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12thermal: bcm2835: enable hwmon explicitlyMatthias Brugger1-0/+11
[ Upstream commit d56c19d07e0bc3ceff366a49b7d7a2440c967b1b ] By defaul of-based thermal driver do not enable hwmon. This patch does this explicitly, so that the temperature can be read through the common hwmon sysfs. Signed-off-by: Matthias Brugger <mbrugger@suse.com> Acked-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12thermal: Fix locking in cooling device sysfs update cur_stateThara Gopinath1-4/+7
[ Upstream commit 68000a0d983f539c95ebe5dccd4f29535c7ac0af ] Sysfs interface to update cooling device cur_state does not currently holding cooling device lock sometimes leading to stale values in cur_state if getting updated simultanelously from user space and thermal framework. Adding the proper locking code fixes this issue. Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-12Thermal: do not clear passive state during system sleepWei Wang1-4/+8
[ Upstream commit 964f4843a455d2ffb199512b08be8d5f077c4cac ] commit ff140fea847e ("Thermal: handle thermal zone device properly during system sleep") added PM hook to call thermal zone reset during sleep. However resetting thermal zone will also clear the passive state and thus cancel the polling queue which leads the passive cooling device state not being cleared properly after sleep. thermal_pm_notify => thermal_zone_device_reset set passive to 0 thermal_zone_trip_update will skip update passive as `old_target == instance->target'. monitor_thermal_zone => thermal_zone_device_set_polling will cancel tz->poll_queue, so the cooling device state will not be changed afterwards. Reported-by: Kame Wang <kamewang@google.com> Signed-off-by: Wei Wang <wvw@google.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-21thermal: armada: fix legacy validity test senseRussell King1-1/+1
[ Upstream commit 70bb27b79adf63ea39e37371d09c823c7a8f93ce ] Commit 8c0e64ac4075 ("thermal: armada: get rid of the ->is_valid() pointer") removed the unnecessary indirection through a function pointer, but in doing so, also removed the negation operator too: - if (priv->data->is_valid && !priv->data->is_valid(priv)) { + if (armada_is_valid(priv)) { which results in: armada_thermal f06f808c.thermal: Temperature sensor reading not valid armada_thermal f2400078.thermal: Temperature sensor reading not valid armada_thermal f4400078.thermal: Temperature sensor reading not valid at boot, or whenever the "temp" sysfs file is read. Replace the negation operator. Fixes: 8c0e64ac4075 ("thermal: armada: get rid of the ->is_valid() pointer") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-21thermal: core: Fix use-after-free in thermal_cooling_device_destroy_sysfsDmitry Osipenko1-1/+2
commit 3c587768271e9c20276522025729e4ebca51583b upstream. This patch fixes use-after-free that was detected by KASAN. The bug is triggered on a CPUFreq driver module unload by freeing 'cdev' on device unregister and then using the freed structure during of the cdev's sysfs data destruction. The solution is to unregister the sysfs at first, then destroy sysfs data and finally release the cooling device. Cc: <stable@vger.kernel.org> # v4.17+ Fixes: 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>