summaryrefslogtreecommitdiff
path: root/drivers/thermal/intel
AgeCommit message (Collapse)AuthorFilesLines
2025-06-04thermal: intel: x86_pkg_temp_thermal: Fix bogus trip temperatureZhang Rui1-0/+1
commit cf948c8e274e8b406e846cdf6cc48fe47f98cf57 upstream. The tj_max value obtained from the Intel TCC library are in Celsius, whereas the thermal subsystem operates in milli-Celsius. This discrepancy leads to incorrect trip temperature calculations. Fix bogus trip temperature by converting tj_max to milli-Celsius Unit. Fixes: 8ef0ca4a177d ("Merge back other thermal control material for 6.3.") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reported-by: zhang ning <zhangn1985@outlook.com> Closes: https://lore.kernel.org/all/TY2PR01MB3786EF0FE24353026293F5ACCD97A@TY2PR01MB3786.jpnprd01.prod.outlook.com/ Tested-by: zhang ning <zhangn1985@outlook.com> Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Link: https://patch.msgid.link/20250519070901.1031233-1-rui.zhang@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-10thermal: int340x: Add NULL check for adevChenyuan Yang1-0/+3
[ Upstream commit 2542a3f70e563a9e70e7ded314286535a3321bdb ] Not all devices have an ACPI companion fwnode, so adev might be NULL. This is similar to the commit cd2fd6eab480 ("platform/x86: int3472: Check for adev == NULL"). Add a check for adev not being set and return -ENODEV in that case to avoid a possible NULL pointer deref in int3402_thermal_probe(). Note, under the same directory, int3400_thermal_probe() has such a check. Fixes: 77e337c6e23e ("Thermal: introduce INT3402 thermal driver") Signed-off-by: Chenyuan Yang <chenyuan0y@gmail.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://patch.msgid.link/20250313043611.1212116-1-chenyuan0y@gmail.com [ rjw: Subject edit, added Fixes: ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09thermal: int3400: Fix reading of current_uuid for active policySrinivas Pandruvada1-1/+1
commit 7082503622986537f57bdb5ef23e69e70cfad881 upstream. When the current_uuid attribute is set to the active policy UUID, reading back the same attribute is returning "INVALID" instead of the active policy UUID on some platforms before Ice Lake. In platforms before Ice Lake, firmware provides a list of supported thermal policies. In this case, user space can select any of the supported thermal policies via a write to attribute "current_uuid". In commit c7ff29763989 ("thermal: int340x: Update OS policy capability handshake")', the OS policy handshake was updated to support Ice Lake and later platforms and it treated priv->current_uuid_index=0 as invalid. However, priv->current_uuid_index=0 is for the active policy, only priv->current_uuid_index=-1 is invalid. Fix this issue by updating the priv->current_uuid_index check. Fixes: c7ff29763989 ("thermal: int340x: Update OS policy capability handshake") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 5.18+ <stable@vger.kernel.org> # 5.18+ Link: https://patch.msgid.link/20241114200213.422303-1-srinivas.pandruvada@linux.intel.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-08thermal: intel: int340x: processor: Add MMIO RAPL PL4 supportZhang Rui1-2/+2
[ Upstream commit 3fb0eea8a1c4be5884e0731ea76cbd3ce126e1f3 ] Similar to the MSR RAPL interface, MMIO RAPL supports PL4 too, so add MMIO RAPL PL4d support to the processor_thermal driver. As a result, the powercap sysfs for MMIO RAPL will show a new "peak power" constraint. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20240930081801.28502-7-rui.zhang@intel.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-08thermal: intel: int340x: processor: Remove MMIO RAPL CPU hotplug supportZhang Rui1-44/+22
[ Upstream commit bfc6819e4bf56a55df6178f93241b5845ad672eb ] CPU0/package0 is always online and the MMIO RAPL driver runs on single package systems only, so there is no need to handle CPU hotplug in it. Always register a RAPL package device for package 0 and remove the unnecessary CPU hotplug support. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20240930081801.28502-6-rui.zhang@intel.com [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17thermal: intel: int340x: processor: Fix warning during module unloadZhang Rui1-2/+0
[ Upstream commit 99ca0b57e49fb73624eede1c4396d9e3d10ccf14 ] The processor_thermal driver uses pcim_device_enable() to enable a PCI device, which means the device will be automatically disabled on driver detach. Thus there is no need to call pci_disable_device() again on it. With recent PCI device resource management improvements, e.g. commit f748a07a0b64 ("PCI: Remove legacy pcim_release()"), this problem is exposed and triggers the warining below. [ 224.010735] proc_thermal_pci 0000:00:04.0: disabling already-disabled device [ 224.010747] WARNING: CPU: 8 PID: 4442 at drivers/pci/pci.c:2250 pci_disable_device+0xe5/0x100 ... [ 224.010844] Call Trace: [ 224.010845] <TASK> [ 224.010847] ? show_regs+0x6d/0x80 [ 224.010851] ? __warn+0x8c/0x140 [ 224.010854] ? pci_disable_device+0xe5/0x100 [ 224.010856] ? report_bug+0x1c9/0x1e0 [ 224.010859] ? handle_bug+0x46/0x80 [ 224.010862] ? exc_invalid_op+0x1d/0x80 [ 224.010863] ? asm_exc_invalid_op+0x1f/0x30 [ 224.010867] ? pci_disable_device+0xe5/0x100 [ 224.010869] ? pci_disable_device+0xe5/0x100 [ 224.010871] ? kfree+0x21a/0x2b0 [ 224.010873] pcim_disable_device+0x20/0x30 [ 224.010875] devm_action_release+0x16/0x20 [ 224.010878] release_nodes+0x47/0xc0 [ 224.010880] devres_release_all+0x9f/0xe0 [ 224.010883] device_unbind_cleanup+0x12/0x80 [ 224.010885] device_release_driver_internal+0x1ca/0x210 [ 224.010887] driver_detach+0x4e/0xa0 [ 224.010889] bus_remove_driver+0x6f/0xf0 [ 224.010890] driver_unregister+0x35/0x60 [ 224.010892] pci_unregister_driver+0x44/0x90 [ 224.010894] proc_thermal_pci_driver_exit+0x14/0x5f0 [processor_thermal_device_pci] ... [ 224.010921] ---[ end trace 0000000000000000 ]--- Remove the excess pci_disable_device() calls. Fixes: acd65d5d1cf4 ("thermal/drivers/int340x/processor_thermal: Add PCI MMIO based thermal driver") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20240930081801.28502-3-rui.zhang@intel.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17thermal: int340x: processor_thermal: Set feature mask before proc_thermal_addSrinivas Pandruvada1-11/+10
[ Upstream commit 6ebc25d8b053a208786295bab58abbb66b39c318 ] The function proc_thermal_add() adds sysfs entries for power limits. The feature mask of available features is not present at that time, so it cannot be used by proc_thermal_add() to selectively create sysfs attributes. The feature mask is set by proc_thermal_mmio_add(), so modify the code to call it before proc_thermal_add() so as to allow the latter to use the feature mask. There is no functional impact with this change. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Stable-dep-of: 99ca0b57e49f ("thermal: intel: int340x: processor: Fix warning during module unload") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-03powercap: intel_rapl: Fix locking in TPMI RAPLZhang Rui1-4/+4
[ Upstream commit 1aa09b9379a7a644cd2f75ae0bac82b8783df600 ] The RAPL framework uses CPU hotplug locking to protect the rapl_packages list and rp->lead_cpu to guarantee that 1. the RAPL package device is not unprobed and freed 2. the cached rp->lead_cpu is always valid for operations like powercap sysfs accesses. Current RAPL APIs assume being called from CPU hotplug callbacks which hold the CPU hotplug lock, but TPMI RAPL driver invokes the APIs in the driver's .probe() function without acquiring the CPU hotplug lock. Fix the problem by providing both locked and lockless versions of RAPL APIs. Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.5+ <stable@vger.kernel.org> # 6.5+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-03thermal/intel: Fix intel_tcc_get_temp() to support negative CPU temperatureZhang Rui3-14/+14
[ Upstream commit 7251b9e8a007ddd834aa81f8c7ea338884629fec ] CPU temperature can be negative in some cases. Thus the negative CPU temperature should not be considered as a failure. Fix intel_tcc_get_temp() and its users to support negative CPU temperature. Fixes: a3c1f066e1c5 ("thermal/intel: Introduce Intel TCC library") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-01thermal: intel: hfi: Add syscore callbacks for system-wide PMRicardo Neri1-0/+28
[ Upstream commit 97566d09fd02d2ab329774bb89a2cdf2267e86d9 ] The kernel allocates a memory buffer and provides its location to the hardware, which uses it to update the HFI table. This allocation occurs during boot and remains constant throughout runtime. When resuming from hibernation, the restore kernel allocates a second memory buffer and reprograms the HFI hardware with the new location as part of a normal boot. The location of the second memory buffer may differ from the one allocated by the image kernel. When the restore kernel transfers control to the image kernel, its HFI buffer becomes invalid, potentially leading to memory corruption if the hardware writes to it (the hardware continues to use the buffer from the restore kernel). It is also possible that the hardware "forgets" the address of the memory buffer when resuming from "deep" suspend. Memory corruption may also occur in such a scenario. To prevent the described memory corruption, disable HFI when preparing to suspend or hibernate. Enable it when resuming. Add syscore callbacks to handle the package of the boot CPU (packages of non-boot CPUs are handled via CPU offline). Syscore ops always run on the boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend and hibernation. Syscore ops only run in these cases. Cc: 6.1+ <stable@vger.kernel.org> # 6.1+ Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> [ rjw: Comment adjustment, subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-01thermal: intel: hfi: Disable an HFI instance when all its CPUs go offlineRicardo Neri1-0/+35
[ Upstream commit 1c53081d773c2cb4461636559b0d55b46559ceec ] In preparation to support hibernation, add functionality to disable an HFI instance during CPU offline. The last CPU of an instance that goes offline will disable such instance. The Intel Software Development Manual states that the operating system must wait for the hardware to set MSR_IA32_PACKAGE_THERM_STATUS[26] after disabling an HFI instance to ensure that it will no longer write on the HFI memory. Some processors, however, do not ever set such bit. Wait a minimum of 2ms to give time hardware to complete any pending memory writes. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Stable-dep-of: 97566d09fd02 ("thermal: intel: hfi: Add syscore callbacks for system-wide PM") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-01thermal: intel: hfi: Refactor enabling code into helper functionsRicardo Neri1-21/+22
[ Upstream commit 8a8b6bb93c704776c4b05cb517c3fa8baffb72f5 ] In preparation for the addition of a suspend notifier, wrap the logic to enable HFI and program its memory buffer into helper functions. Both the CPU hotplug callback and the suspend notifier will use them. This refactoring does not introduce functional changes. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Stable-dep-of: 97566d09fd02 ("thermal: intel: hfi: Add syscore callbacks for system-wide PM") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28thermal: intel: powerclamp: fix mismatch in get function for max_idleDavid Arcari1-1/+1
commit fae633cfb729da2771b5433f6b84ae7e8b4aa5f7 upstream. KASAN reported this [ 444.853098] BUG: KASAN: global-out-of-bounds in param_get_int+0x77/0x90 [ 444.853111] Read of size 4 at addr ffffffffc16c9220 by task cat/2105 ... [ 444.853442] The buggy address belongs to the variable: [ 444.853443] max_idle+0x0/0xffffffffffffcde0 [intel_powerclamp] There is a mismatch between the param_get_int and the definition of max_idle. Replacing param_get_int with param_get_byte resolves this issue. Fixes: ebf519710218 ("thermal: intel: powerclamp: Add two module parameters") Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Signed-off-by: David Arcari <darcari@redhat.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-05thermal: Use thermal_tripless_zone_device_register()Rafael J. Wysocki1-3/+3
All of the remaining callers of thermal_zone_device_register() can use thermal_tripless_zone_device_register(), so make them do so in order to allow the former to be dropped. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
2023-08-29Merge tag 'thermal-6.6-rc1' of ↵Linus Torvalds6-175/+104
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "These rework the Intel DTS IOSF and the ACPI thermal drivers to pass tables of generic trip point structures to the core during initialization and make some requisite modifications in the thermal core, fix a few issues elsewhere and clean up code. This includes changes that are present in the ACPI updates too, because they involve both ACPI and the thermal core. The list of specific changes below is limited to thermal control, however. Specifics: - Make the ACPI thermal driver use its own Notify() handler (Michal Wilczynski) - Rework the ACPI thermal driver to use a table of generic trip point structures on top of the internal representation of trip points and remove thermal zone callbacks that are not necessary any more from that driver (Rafael Wysocki) - Fix a few issues in the Intel DTS IOSF thermal driver, clean up code in it and make it pass tables of generic trip point structures to the core during thermal zone registration (Rafael Wysocki) - Drop a redundant check from the Intel DTS IOSF thermal driver's "remove" routine (Zhang Rui) - Use module_platform_driver() to replace an open-coded counterpart of it in the int340x thermal driver (Yang Yingliang) - Fix possible uninitialized value access in __thermal_of_bind() and __thermal_of_unbind() (Peng Fan) - Make the int3400 driver use thermal zone device wrappers (Daniel Lezcano) - Remove redundant thermal zone state check from the int340x thermal driver (Daniel Lezcano) - Drop non-functional nocrt parameter from ACPI thermal (Mario Limonciello) - Explicitly include correct DT includes in the thermal core and drivers (Rob Herring)" * tag 'thermal-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: intel: intel_soc_dts_iosf: Remove redundant check thermal: intel: int340x: simplify the code with module_platform_driver() thermal/of: Fix potential uninitialized value access thermal: intel: intel_soc_dts_iosf: Use struct thermal_trip thermal: intel: intel_soc_dts_iosf: Rework critical trip setup thermal: intel: intel_soc_dts_iosf: Add helper for resetting trip points thermal: intel: intel_soc_dts_iosf: Change initialization ordering thermal: intel: intel_soc_dts_iosf: Pass sensors to update_trip_temp() thermal: intel: intel_soc_dts_iosf: Untangle update_trip_temp() thermal: intel: intel_soc_dts_iosf: Always assume notification support thermal: intel: intel_soc_dts_iosf: Drop redundant symbol definition thermal: intel: intel_soc_dts_iosf: Always use 2 trips thermal: Explicitly include correct DT includes thermal/drivers/int340x: Do not check the thermal zone state thermal/drivers/int3400: Use thermal zone device wrappers
2023-08-29Merge tag 'perf-core-2023-08-28' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event updates from Ingo Molnar: - AMD IBS improvements - Intel PMU driver updates - Extend core perf facilities & the ARM PMU driver to better handle ARM big.LITTLE events - Micro-optimize software events and the ring-buffer code - Misc cleanups & fixes * tag 'perf-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/uncore: Remove unnecessary ?: operator around pcibios_err_to_errno() call perf/x86/intel: Add Crestmont PMU x86/cpu: Update Hybrids x86/cpu: Fix Crestmont uarch x86/cpu: Fix Gracemont uarch perf: Remove unused extern declaration arch_perf_get_page_size() perf: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability arm_pmu: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability perf/x86: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability perf/x86/ibs: Set mem_lvl_num, mem_remote and mem_hops for data_src perf/mem: Add PERF_MEM_LVLNUM_NA to PERF_MEM_NA perf/mem: Introduce PERF_MEM_LVLNUM_UNC perf/ring_buffer: Use local_try_cmpxchg in __perf_output_begin locking/arch: Avoid variable shadowing in local_try_cmpxchg() perf/core: Use local64_try_cmpxchg in perf_swevent_set_period perf/x86: Use local64_try_cmpxchg perf/amd: Prevent grouping of IBS events
2023-08-22thermal: intel: intel_soc_dts_iosf: Remove redundant checkZhang Rui1-5/+3
Remove the redundant check in remove_dts_thermal_zone() because all of its existing callers pass a valid pointer as the argument. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-22thermal: intel: int340x: simplify the code with module_platform_driver()Yang Yingliang1-12/+1
The init/exit() of the driver only calls platform_driver_{un}register(), so it can be simpilfied by using module_platform_driver(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-08-11thermal: intel: intel_soc_dts_iosf: Use struct thermal_tripRafael J. Wysocki2-45/+8
Because the number of trip points in each thermal zone and their types are known to intel_soc_dts_iosf_init() prior to the registration of the thermal zones, make it create an array of struct thermal_trip entries in each struct intel_soc_dts_sensor_entry object and make add_dts_thermal_zone() use thermal_zone_device_register_with_trips() for thermal zone registration and pass that array as its second argument. Drop the sys_get_trip_temp() and sys_get_trip_type() callback functions along with the respective callback pointers in tzone_ops, because they are not necessary any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11thermal: intel: intel_soc_dts_iosf: Rework critical trip setupRafael J. Wysocki4-51/+27
Critical trip points appear in the DTS thermal zones only after those thermal zones have been registered via intel_soc_dts_iosf_init(). Moreover, they are "created" by changing the type of an existing trip point from THERMAL_TRIP_PASSIVE to THERMAL_TRIP_CRITICAL via intel_soc_dts_iosf_add_read_only_critical_trip(), the caller of which has to be careful enough to pass at least 1 as the number of read-only trip points to intel_soc_dts_iosf_init() beforehand. This is questionable, because user space may have started to use the trips at the time when intel_soc_dts_iosf_add_read_only_critical_trip() runs and there is no synchronization between it and sys_set_trip_temp(). To address it, use the observation that nonzero number of read-only trip points is only passed to intel_soc_dts_iosf_init() when critical trip points are going to be used, so in fact that function may get all of the information regarding the critical trip points upfront and it can configure them before registering the corresponding thermal zones. Accordingly, replace the read_only_trip_count argument of intel_soc_dts_iosf_init() with a pair of new arguments related to critical trip points: a bool one indicating whether or not critical trip points are to be used at all and an int one representing the critical trip point temperature offset relative to Tj_max. Use these arguments to configure the critical trip points before the registration of the thermal zones and to compute the number of writeable trip points in add_dts_thermal_zone(). Modify both callers of intel_soc_dts_iosf_init() to take these changes into account and drop the intel_soc_dts_iosf_add_read_only_critical_trip() call, that is not necessary any more, from intel_soc_thermal_init(), which also allows it to return success right after requesting the IRQ. Finally, drop intel_soc_dts_iosf_add_read_only_critical_trip() altogether, because it does not have any more users. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11thermal: intel: intel_soc_dts_iosf: Add helper for resetting trip pointsRafael J. Wysocki1-6/+9
Because trip points are reset for each sensor in two places in the same way, add a helper function for that to reduce code duplication a bit. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11thermal: intel: intel_soc_dts_iosf: Change initialization orderingRafael J. Wysocki1-9/+16
The initial configuration of trip points in intel_soc_dts_iosf_init() takes place after registering the sensor thermal zones which is potentially problematic, because it may race with the setting of trip point temperatures via sysfs, as there is no synchronization between it and sys_set_trip_temp(). To address this, change the initialization ordering so that the trip points are configured prior to the registration of thermal zones. Accordingly, change the cleanup ordering in intel_soc_dts_iosf_exit() to remove the thermal zones before resetting the trip points. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11thermal: intel: intel_soc_dts_iosf: Pass sensors to update_trip_temp()Rafael J. Wysocki1-4/+3
After previous changes, update_trip_temp() only uses its dts argument to get to the sensors field in the struct intel_soc_dts_sensor_entry object pointed to by that argument, so pass the value of that field directly to it instead. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11thermal: intel: intel_soc_dts_iosf: Untangle update_trip_temp()Rafael J. Wysocki1-13/+24
Function update_trip_temp() is currently used for the initialization of trip points as well as for changing trip point temperatures in sys_set_trip_temp(). This is quite confusing and passing the value of dts->trip_types[trip] to it so that it can store that value in the same memory location is not particularly useful, because it only is necessary to set the trip point type once, at the initialization time. For this reason, drop the last argument from update_trip_temp() and introduce configure_trip() calling the former internally for the initial configuration of trip points. Modify the majority of update_trip_temp() callers to use configure_trip() instead of it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11thermal: intel: intel_soc_dts_iosf: Always assume notification supportRafael J. Wysocki1-13/+8
None of the existing callers of intel_soc_dts_iosf_init() passes INTEL_SOC_DTS_INTERRUPT_NONE as the first argument to it, so the notification local variable in it is always true and the notification_support argument of add_dts_thermal_zone() is always true either. For this reason, drop the notification local variable from intel_soc_dts_iosf_init() and the notification_support argument from add_dts_thermal_zone() and rearrange the latter to always set writable_trip_cnt and trip_mask. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-10thermal: intel: intel_soc_dts_iosf: Drop redundant symbol definitionRafael J. Wysocki1-3/+0
SOC_MAX_DTS_SENSORS is already defined in intel_soc_dts_iosf.h which is included in intel_soc_dts_iosf.c, so it does not need to be defined in the latter again. Drop the redundant definition of that symbol from intel_soc_dts_iosf.c. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-10thermal: intel: intel_soc_dts_iosf: Always use 2 tripsRafael J. Wysocki4-24/+15
Both the existing callers of intel_soc_dts_iosf_init() pass 2 as the trip count argument, so it can be replaced with SOC_MAX_DTS_TRIPS everywhere in the code and the trip_count argument of that function can be dropped. This also allows the trip_count field to be dropped from struct intel_soc_dts_sensor_entry, as it is always equal to 2, and some related code can be simplified. Make changes accordingly. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-09x86/cpu: Fix Gracemont uarchPeter Zijlstra1-1/+1
Alderlake N is an E-core only product using Gracemont micro-architecture. It fits the pre-existing naming scheme perfectly fine, adhere to it. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230807150405.686834933@infradead.org
2023-08-01powercap: intel_rapl: Fix a sparse warning in TPMI interfaceZhang Rui1-8/+8
Depends on the interface used, the RAPL registers can be either MSR indexes or memory mapped IO addresses. Current RAPL common code uses u64 to save both MSR and memory mapped IO registers. With this, when handling register address with an __iomem annotation, it triggers a sparse warning like below: sparse warnings: (new ones prefixed by >>) >> drivers/powercap/intel_rapl_tpmi.c:141:41: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected unsigned long long [usertype] *tpmi_rapl_regs @@ got void [noderef] __iomem * @@ drivers/powercap/intel_rapl_tpmi.c:141:41: sparse: expected unsigned long long [usertype] *tpmi_rapl_regs drivers/powercap/intel_rapl_tpmi.c:141:41: sparse: got void [noderef] __iomem * Fix the problem by using a union to save the registers instead. Suggested-by: David Laight <David.Laight@ACULAB.COM> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307031405.dy3druuy-lkp@intel.com/ Tested-by: Wang Wendy <wendy.wang@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-07-14thermal/drivers/int340x: Do not check the thermal zone stateDaniel Lezcano1-18/+14
The driver is accessing the thermal zone state to ensure the state is different from the one we want to set. We don't want the driver to access the thermal zone device internals. Actually, the thermal core code already checks if the thermal zone's state is different before calling this function, thus this check is duplicate. Remove it. Acked-by: srinivas pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-07-14thermal/drivers/int3400: Use thermal zone device wrappersDaniel Lezcano1-4/+8
Use the thermal core API to access the thermal zone "type" field instead of directly using the structure field. While here, remove access to the temperature field, as this driver is reporting fake temperature, which can be replaced with INT3400_FAKE_TEMP. Also replace hardcoded 20C with INT3400_FAKE_TEMP Acked-by: srinivas pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-27Merge tag 'thermal-6.5-rc1' of ↵Linus Torvalds2-0/+275
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "These extend the int340x thermal driver, add thermal DT bindings for some Qcom platforms, add DT bindings and support for Armada AP807 and MSM8909, allow selecting the bang-bang thermal governor as the default one, address issues in several thermal drivers for ARM platforms and clean up code. Specifics: - Add new IOCTLs to the int340x thermal driver to allow user space to retrieve the Passive v2 thermal table (Srinivas Pandruvada) - Add DT bindings for SM6375, MSM8226 and QCM2290 Qcom platforms (Konrad Dybcio) - Add DT bindings and support for QCom MSM8226 (Matti Lehtimäki) - Add DT bindings for QCom ipq9574 (Praveenkumar I) - Convert bcm2835 DT bindings to the yaml schema (Stefan Wahren) - Allow selecting the bang-bang governor as default (Thierry Reding) - Refactor and prepare the code to set the scene for RCar Gen4 (Wolfram Sang) - Clean up and fix the QCom tsens drivers. Add DT bindings and calibration for the MSM8909 platform (Stephan Gerhold) - Revert a patch introducing a wrong usage of devm_of_iomap() on the Mediatek platform (Ricardo Cañuelo) - Fix the clock vs reset ordering in order to conform to the documentation on the sun8i (Christophe JAILLET) - Prevent setting up undocumented registers, enable the only described sensors and add the version 2.1 on the Qoriq sensor (Peng Fan) - Add DT bindings and support for the Armada AP807 (Alex Leibovich) - Update the mlx5 driver with the recent thermal changes (Daniel Lezcano) - Convert to platform remove callback returning void on STM32 (Uwe Kleine-König) - Add an error information printing for devm_thermal_add_hwmon_sysfs() and remove the error from the Sun8i, Amlogic, i.MX, TI, K3, Tegra, Qoriq, Mediateka and QCom (Yangtao Li) - Register as hwmon sensor for the Generic ADC (Chen-Yu Tsai) - Use the dev_err_probe() function in the QCom tsens alarm driver (Luca Weiss)" * tag 'thermal-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits) thermal/drivers/qcom/temp-alarm: Use dev_err_probe thermal/drivers/generic-adc: Register thermal zones as hwmon sensors thermal/drivers/mediatek/lvts_thermal: Remove redundant msg in lvts_ctrl_start() thermal/drivers/qcom: Remove redundant msg at probe time thermal/drivers/ti-soc: Remove redundant msg in ti_thermal_expose_sensor() thermal/drivers/qoriq: Remove redundant msg in qoriq_tmu_register_tmu_zone() thermal/drivers/tegra: Remove redundant msg in tegra_tsensor_register_channel() drivers/thermal/k3: Remove redundant msg in k3_bandgap_probe() thermal/drivers/imx: Remove redundant msg in imx8mm_tmu_probe() and imx_sc_thermal_probe() thermal/drivers/amlogic: Remove redundant msg in amlogic_thermal_probe() thermal/drivers/sun8i: Remove redundant msg in sun8i_ths_register() thermal/hwmon: Add error information printing for devm_thermal_add_hwmon_sysfs() thermal/drivers/stm32: Convert to platform remove callback returning void net/mlx5: Update the driver with the recent thermal changes thermal/drivers/armada: Add support for AP807 thermal data dt-bindings: armada-thermal: Add armada-ap807-thermal compatible thermal/drivers/qoriq: Support version 2.1 thermal/drivers/qoriq: Only enable supported sensors thermal/drivers/qoriq: No need to program site adjustment register thermal/drivers/mediatek/lvts_thermal: Register thermal zones as hwmon sensors ...
2023-06-27Merge tag 'pm-6.5-rc1' of ↵Linus Torvalds1-5/+6
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add Intel TPMI (Topology Aware Register and PM Capsule Interface) support to the power capping subsystem, extend the intel_idle driver to work in VM guests where MWAIT is not available, extend the system-wide power management diagnostics, fix bugs and clean up code. Specifics: - Introduce power capping core support for Intel TPMI (Topology Aware Register and PM Capsule Interface) and a TPMI interface driver for Intel RAPL (Zhang Rui, Dan Carpenter) - Fix CONFIG_IOSF_MBI dependency in the Intel RAPL power capping driver (Zhang Rui) - Fix invalid initialization for pl4_supported field in the Intel RAPL power capping driver (Sumeet Pawnikar) - Clean up the intel_idle driver, make it work with VM guests that cannot use the MWAIT instruction and address the case in which the host may enter a deep idle state when the guest is idle (Arjan van de Ven) - Prevent cpufreq drivers that provide the ->adjust_perf() callback without a ->fast_switch() one which is used as a fallback from the former in some cases (Wyes Karny) - Fix some issues related to the AMD P-state cpufreq driver (Mario Limonciello, Wyes Karny) - Fix the energy_performance_preference attribute handling in the intel_pstate driver in passive mode (Tero Kristo) - Fix the handling of pm_suspend_target_state when CONFIG_PM is unset (Kai-Heng Feng) - Correct spelling mistake in a comment in the hibernation code (Wang Honghui) - Add arch_resume_nosmt() prototype to avoid a "missing prototypes" build warning (Arnd Bergmann) - Restrict pm_pr_dbg() to system-wide power transitions and use it in a few additional places (Mario Limonciello) - Drop verification of in-params from genpd_add_device() and ensure that all of its callers will do it (Ulf Hansson) - Prevent possible integer overflows from occurring in genpd_parse_state() (Nikita Zhandarovich) - Reorder fieldls in 'struct devfreq_dev_status' to reduce its size somewhat (Christophe JAILLET) - Ensure that the Exynos PPMU driver is already loaded before the Exynos Bus driver starts probing so as to avoid a possible freeze loading of the kernel modules (Marek Szyprowski) - Fix variable deferencing before NULL check in the mtk-cci devfreq driver (Sukrut Bellary)" * tag 'pm-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (42 commits) intel_idle: Add a "Long HLT" C1 state for the VM guest mode cpufreq: intel_pstate: Fix energy_performance_preference for passive cpufreq: amd-pstate: Add a kernel config option to set default mode cpufreq: amd-pstate: Set a fallback policy based on preferred_profile ACPI: CPPC: Add definition for undefined FADT preferred PM profile value cpufreq: amd-pstate: Set default governor to schedutil PM: domains: Move the verification of in-params from genpd_add_device() cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated cpufreq: amd-pstate: Write CPPC enable bit per-socket intel_idle: Add support for using intel_idle in a VM guest using just hlt cpufreq: Fail driver register if it has adjust_perf without fast_switch intel_idle: clean up the (new) state_update_enter_method function intel_idle: refactor state->enter manipulation into its own function platform/x86/amd: pmc: Use pm_pr_dbg() for suspend related messages pinctrl: amd: Use pm_pr_dbg to show debugging messages ACPI: x86: Add pm_debug_messages for LPS0 _DSM state tracking include/linux/suspend.h: Only show pm_pr_dbg messages at suspend/resume powercap: RAPL: Fix a NULL vs IS_ERR() bug powercap: RAPL: Fix CONFIG_IOSF_MBI dependency powercap: RAPL: fix invalid initialization for pl4_supported field ...
2023-06-26Merge back earlier Intel thermal control material for 6.5.Rafael J. Wysocki2-0/+275
2023-06-26Merge branch 'powercap'Rafael J. Wysocki1-5/+6
Merge power capping updates for 6.5-rc1: - Introduce power capping core support for Intel TPMI (Topology Aware Register and PM Capsule Interface) and a TPMI interface driver for Intel RAPL (Zhang Rui, Dan Carpenter). - Fix CONFIG_IOSF_MBI dependency in the Intel RAPL power capping driver (Zhang Rui). - Fix invalid initialization for pl4_supported field in the Intel RAPL power capping driver (Sumeet Pawnikar). * powercap: powercap: RAPL: Fix a NULL vs IS_ERR() bug powercap: RAPL: Fix CONFIG_IOSF_MBI dependency powercap: RAPL: fix invalid initialization for pl4_supported field powercap: intel_rapl: Introduce RAPL TPMI interface driver powercap: intel_rapl: Introduce core support for TPMI interface powercap: intel_rapl: Introduce RAPL I/F type powercap: intel_rapl: Make cpu optional for rapl_package powercap: intel_rapl: Remove redundant cpu parameter powercap: intel_rapl: Add support for lock bit per Power Limit powercap: intel_rapl: Cleanup Power Limits support powercap: intel_rapl: Use bitmap for Power Limits powercap: intel_rapl: Change primitive order powercap: intel_rapl: Use index to initialize primitive information powercap: intel_rapl: Support per domain energy/power/time unit powercap: intel_rapl: Support per Interface primitive information powercap: intel_rapl: Support per Interface rapl_defaults powercap: intel_rapl: Allow probing without CPUID match powercap: intel_rapl: Remove unused field in struct rapl_if_priv
2023-06-15thermal/intel/intel_soc_dts_iosf: Fix reporting wrong temperaturesHans de Goede1-1/+1
Since commit 955fb8719efb ("thermal/intel/intel_soc_dts_iosf: Use Intel TCC library") intel_soc_dts_iosf is reporting the wrong temperature. The driver expects tj_max to be in milli-degrees-celcius but after the switch to the TCC library this is now in degrees celcius so instead of e.g. 90000 it is set to 90 causing a temperature 45 degrees below tj_max to be reported as -44910 milli-degrees instead of as 45000 milli-degrees. Fix this by adding back the lost factor of 1000. Fixes: 955fb8719efb ("thermal/intel/intel_soc_dts_iosf: Use Intel TCC library") Reported-by: Bernhard Krug <b.krug@elektronenpumpe.de> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-13thermal: intel: int340x_thermal: New IOCTLs for Passive v2 tableSrinivas Pandruvada2-0/+275
Export Passive version 2 table similar to the way _TRT and _ART tables via IOCTLs. This removes need for binary utility to read ACPI Passive 2 table by providing open source support. This table already has open source implementation in the user space thermald, when the table is part of data vault exported by the int3400 sysfs. This table is supported in some older platforms before Ice Lake generation. Passive 2 tables contain multiple entries. Each entry has following fields: * Source: Named Reference (String). This is the source device for temperature. * Target: Named Reference (String). This is the target device to control. * Priority: Priority of this device compared to others. * SamplingPeriod: Time Period in 1/10 of seconds unit. * PassiveTemp: Passive Temperature in 1/10 of Kelvin. * SourceDomain: Domain for the source (00:Processor, others reserved). * ControlKnob: Type of control knob (00:Power Limit 1, others: reserved) * Limit: The target state to set on reaching passive temperature. This can be a string "max", "min" or a power limit value. * LimitStepSize: Step size during activation. * UnLimitStepSize: Step size during deactivation. * Reserved1: Reserved Three IOCTLs are added similar to IOCTLs for reading TRT: ACPI_THERMAL_GET_PSVT_COUNT: Number of passive 2 entries. ACPI_THERMAL_GET_PSVT_LEN: Total return data size (count x each passive 2 entry size). ACPI_THERMAL_GET_PSVT: Get the data as an array of objects with passive 2 entries. This change is based on original development done by: Todd Brandt <todd.e.brandt@linux.intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog and subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24thermal: intel: int340x: Add new line for UUID displaySrinivas Pandruvada1-2/+2
Prior to the commit "763bd29fd3d1 ("thermal: int340x_thermal: Use sysfs_emit_at() instead of scnprintf()", there was a new line after each UUID string. With the newline removed, existing user space like "thermald" fails to compare each supported UUID as it is using getline() to read UUID and apply correct thermal table. To avoid breaking existing user space, add newline after each UUID string. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Fixes: 763bd29fd3d1 ("thermal: int340x_thermal: Use sysfs_emit_at() instead of scnprintf()") Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24powercap: intel_rapl: Introduce RAPL I/F typeZhang Rui1-0/+1
Different RAPL Interfaces may have different primitive information and rapl_defaults calls. To better distinguish this difference in the RAPL framework code, introduce a new enum to represent different types of RAPL Interfaces. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Wang Wendy <wendy.wang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24powercap: intel_rapl: Make cpu optional for rapl_packageZhang Rui1-3/+3
MSR RAPL Interface always removes a rapl_package when all the CPUs in that rapl_package are offlined. This is because it relies on an online CPU to access the MSR. But for RAPL Interface using MMIO registers, when all the cpus within the rapl_package are offlined, 1. the register can still be accessed 2. monitoring and setting the Power Pimits for the rapl_package is still meaningful because of uncore power. This means that, a valid rapl_package doesn't rely on one or more cpus being onlined. For this sense, make cpu optional for rapl_package. A rapl_package can be registered either using a CPU id to represent the physical package/die, or using the physical package id directly. Note that, the thermal throttling interrupt is not disabled via MSR_IA32_PACKAGE_THERM_INTERRUPT for such rapl_package at the moment. If it is still needed in the future, this can be achieved by selecting an onlined CPU using the physical package id. Note that, processor_thermal_rapl, the current MMIO RAPL Interface driver, can also be converted to register using a package id instead. But this is not done right now because processor_thermal_rapl driver works on single-package systems only, and offlining the only package will not happen. So keep the previous logic. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Wang Wendy <wendy.wang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-24powercap: intel_rapl: Use bitmap for Power LimitsZhang Rui1-2/+2
Currently, a RAPL package is registered with the number of Power Limits supported in each RAPL domain. But this doesn't tell which Power Limits are available. Using the number of Power Limits supported to guess the availability of each Power Limit is fragile. Use bitmap to represent the availability of each Power Limit. Note that PL1 is mandatory thus it does not need to be set explicitly by the RAPL Interface drivers. No functional change intended. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested-by: Wang Wendy <wendy.wang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-05-04thermal: intel: powerclamp: Fix NULL pointer access issueSrinivas Pandruvada1-0/+4
If cur_state for the powerclamp cooling device is set to the default minimum state of 0, without setting first to cur_state > 0, this results in NULL pointer access. This NULL pointer access happens in the powercap core idle-inject function idle_inject_set_duration() as there is no NULL check for idle_inject_device pointer. This pointer must be allocated by calling idle_inject_register() or idle_inject_register_full(). In the function powerclamp_set_cur_state(), idle_inject_device pointer is allocated only when the cur_state > 0. But setting 0 without changing to any other state, idle_inject_set_duration() will be called with a NULL idle_inject_device pointer. To address this, just return from powerclamp_set_cur_state() if the current cooling device state is the same as the last one. Since the power-up default cooling device state is 0, changing the state to 0 again here will return without calling idle_inject_set_duration(). Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Fixes: 8526eb7fc75a ("thermal: intel: powerclamp: Use powercap idle-inject feature") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=217386 Tested-by: Risto A. Paju <teknohog@iki.fi> Cc: 6.3+ <stable@kernel.org> # 6.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-27thermal: intel: menlow: Get rid of this driverRafael J. Wysocki3-531/+0
According to my information, there are no active users of this driver in the field. Moreover, it does some really questionable things and gets in the way of thermal core improvements, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-04-27thermal: intel: pch_thermal: Use thermal driver device to write a traceDaniel Lezcano1-1/+2
The pch_critical() callback accesses the thermal zone device structure internals, it dereferences the thermal zone struct device and the 'type'. Use the available accessors instead of accessing the structure directly. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-18thermal: intel: int340x: Add DLVR support for RFIM controlSrinivas Pandruvada4-7/+91
Add support for DLVR (Digital Linear Voltage Regulator) attributes, which can be used to control RFIM. Here instead of "fivr" another directory "dlvr" is created with DLVR attributes: /sys/bus/pci/devices/0000:00:04.0/dlvr ├── dlvr_freq_mhz ├── dlvr_freq_select ├── dlvr_hardware_rev ├── dlvr_pll_busy ├── dlvr_rfim_enable └── dlvr_spread_spectrum_pct └── dlvr_control_mode └── dlvr_control_lock Attributes dlvr_freq_mhz (RO): Current DLVR PLL frequency in MHz. dlvr_freq_select (RW): Sets DLVR PLL clock frequency. dlvr_hardware_rev (RO): DLVR hardware revision. dlvr_pll_busy (RO): PLL can't accept frequency change when set. dlvr_rfim_enable (RW): 0: Disable RF frequency hopping, 1: Enable RF frequency hopping. dlvr_control_mode (RW): Specifies how frequencies are spread. 0: Down spread, 1: Spread in Center. dlvr_control_lock (RW): 1: future writes are ignored. dlvr_spread_spectrum_pct (RW) A write to this register updates the DLVR spread spectrum percent value. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-04-14Merge back Intel thermal control material for 6.4-rc1.Rafael J. Wysocki7-22/+21
2023-04-11thermal: intel: Avoid updating unsupported THERM_STATUS_CLEAR mask bitsSrinivas Pandruvada1-7/+66
Some older processors don't allow BIT(13) and BIT(15) in the current mask set by "THERM_STATUS_CLEAR_CORE_MASK". This results in: unchecked MSR access error: WRMSR to 0x19c (tried to write 0x000000000000aaa8) at rIP: 0xffffffff816f66a6 (throttle_active_work+0xa6/0x1d0) To avoid unchecked MSR issues, check CPUID for each relevant feature and use that information to set the supported feature bits only in the "clear" mask for cores. Do the same for the analogous package mask set by "THERM_STATUS_CLEAR_PKG_MASK". Introduce functions thermal_intr_init_core_clear_mask() and thermal_intr_init_pkg_clear_mask() to set core and package mask bits, respectively. These functions are called during initialization. Fixes: 6fe1e64b6026 ("thermal: intel: Prevent accidental clearing of HFI status") Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Link: https://lore.kernel.org/lkml/cdf43fb423368ee3994124a9e8c9b4f8d00712c6.camel@linux.intel.com/T/ Tested-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 6.2+ <stable@kernel.org> # 6.2+ [ rjw: Renamed 2 funtions and 2 static variables, edited subject and changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-31Merge back Intel thermal driver changes for 6.4-rc1.Rafael J. Wysocki7-22/+21
2023-03-30thermal: intel: powerclamp: Fix cpumask and max_idle module parametersDavid Arcari1-1/+8
When cpumask is specified as a module parameter the value is overwritten by the module init routine. This can easily be fixed by checking to see if the mask has already been allocated in the init routine. When max_idle is specified as a module parameter a panic will occur. The problem is that the idle_injection_cpu_mask is not allocated until the module init routine executes. This can easily be fixed by allocating the cpumask if it's not already allocated. Fixes: ebf519710218 ("thermal: intel: powerclamp: Add two module parameters") Signed-off-by: David Arcari <darcari@redhat.com> Reviewed-by: Srinivas Pandruvada<srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-29thermal: intel: int340x: processor_thermal: Fix additional deadlockSrinivas Pandruvada1-1/+0
Commit 52f04f10b900 ("thermal: intel: int340x: processor_thermal: Fix deadlock") addressed deadlock issue during user space trip update. But it missed a case when thermal zone device is disabled when user writes 0. Call to thermal_zone_device_disable() also causes deadlock as it also tries to lock tz->lock, which is already claimed by trip_point_temp_store() in the thermal core code. Remove call to thermal_zone_device_disable() in the function sys_set_trip_temp(), which is called from trip_point_temp_store(). Fixes: 52f04f10b900 ("thermal: intel: int340x: processor_thermal: Fix deadlock") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 6.2+ <stable@vger.kernel.org> # 6.2+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>