Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"The majority of changes here are related to the general switch-over to
using arrays of generic trip point structures registered along with a
thermal zone instead of trip point callbacks (this has been done
mostly by Daniel Lezcano with some help from yours truly on the Intel
drivers front).
Apart from that and the related reorganization of code, there are some
enhancements of the existing driver and a new Mediatek Low Voltage
Thermal Sensor (LVTS) driver. The Intel powerclamp undergoes a major
rework so it will use the generic idle_inject facility for CPU idle
time injection going forward and it will take additional module
parameters for specifying the subset of CPUs to be affected by it
(work done by Srinivas Pandruvada).
Also included are assorted fixes and a whole bunch of cleanups.
Specifics:
- Rework a large bunch of drivers to use the generic thermal trip
structure and use the opportunity to do more cleanups by removing
unused functions from the OF code (Daniel Lezcano)
- Remove core header inclusion from drivers (Daniel Lezcano)
- Fix some locking issues related to the generic thermal trip rework
(Johan Hovold)
- Fix a crash when requesting the critical temperature on tegra,
which is related to the generic trip point work (Jon Hunter)
- Clean up thermal device unregistration code (Viresh Kumar)
- Fix and clean up thermal control core initialization error code
paths (Daniel Lezcano)
- Relocate the trip points handling code into a separate file (Daniel
Lezcano)
- Make the thermal core fail registration of thermal zones and
cooling devices if the thermal class has not been registered
(Rafael Wysocki)
- Add trip point initialization helper functions for ACPI-defined
trip points and modify two thermal drivers to use them (Rafael
Wysocki, Daniel Lezcano)
- Make the core thermal control code use sysfs_emit_at() instead of
scnprintf() where applicable (ye xingchen)
- Consolidate code accessing the Intel TCC (Thermal Control
Circuitry) MSRs by introducing library functions for that and
making the TCC-related code in thermal drivers use them (Zhang Rui)
- Enhance the x86_pkg_temp_thermal driver to support dynamic tjmax
changes (Zhang Rui)
- Address an "unsigned expression compared with zero" warning in the
intel_soc_dts_iosf thermal driver (Yang Li)
- Update comments regarding two functions in the Intel Menlow thermal
driver (Deming Wang)
- Use sysfs_emit_at() instead of scnprintf() in the int340x thermal
driver (ye xingchen)
- Make the intel_pch thermal driver support the Wellsburg PCH (Tim
Zimmermann)
- Modify the intel_pch and processor_thermal_device_pci thermal
drivers use generic trip point tables instead of thermal zone trip
point callbacks (Daniel Lezcano)
- Add production mode attribute sysfs attribute to the int340x
thermal driver (Srinivas Pandruvada)
- Rework dynamic trip point updates handling and locking in the
int340x thermal driver (Rafael Wysocki)
- Make the int340x thermal driver use a generic trip points table
instead of thermal zone trip point callbacks (Rafael Wysocki,
Daniel Lezcano)
- Clean up and improve the int340x thermal driver (Rafael Wysocki)
- Simplify and clean up the intel_pch thermal driver (Rafael Wysocki)
- Fix the Intel powerclamp thermal driver and make it use the common
idle injection framework (Srinivas Pandruvada)
- Add two module parameters, cpumask and max_idle, to the Intel
powerclamp thermal driver to allow it to affect only a specific
subset of CPUs instead of all of them (Srinivas Pandruvada)
- Make the Intel quark_dts thermal driver Use generic trip point
objects instead of its own trip point representation (Daniel
Lezcano)
- Add toctree entry for thermal documents and fix two issues in the
Intel powerclamp driver documentation (Bagas Sanjaya)
- Use strscpy() to instead of strncpy() in the thermal core (Xu
Panda)
- Fix thermal_sampling_exit() (Vincent Guittot)
- Add Mediatek Low Voltage Thermal Sensor (LVTS) driver (Balsam
Chihi)
- Add r8a779g0 RCar support to the rcar_gen3 thermal driver (Geert
Uytterhoeven)
- Fix useless call to set_trips() when resuming in the rcar_gen3
thermal control driver and add interrupt support detection at init
time to it (Niklas Söderlund)
- Fix memory corruption in the hi3660 thermal driver (Yongqin Liu)
- Fix include path for libnl3 in pkg-config file for libthermal
(Vibhav Pant)
- Remove syscfg-based driver for st as the platform is not supported
any more (Alain Volmat)"
* tag 'thermal-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (135 commits)
thermal/drivers/st: Remove syscfg based driver
thermal: Remove core header inclusion from drivers
tools/lib/thermal: Fix include path for libnl3 in pkg-config file.
thermal/drivers/hisi: Drop second sensor hi3660
thermal/drivers/rcar_gen3_thermal: Fix device initialization
thermal/drivers/rcar_gen3_thermal: Create device local ops struct
thermal/drivers/rcar_gen3_thermal: Do not call set_trips() when resuming
thermal/drivers/rcar_gen3: Add support for R-Car V4H
dt-bindings: thermal: rcar-gen3-thermal: Add r8a779g0 support
thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver
dt-bindings: thermal: mediatek: Add LVTS thermal controllers
thermal/drivers/mediatek: Relocate driver to mediatek folder
tools/lib/thermal: Fix thermal_sampling_exit()
Documentation: powerclamp: Fix numbered lists formatting
Documentation: powerclamp: Escape wildcard in cpumask description
Documentation: admin-guide: Add toctree entry for thermal docs
thermal: intel: powerclamp: Add two module parameters
Documentation: admin-guide: Move intel_powerclamp documentation
thermal: core: Use sysfs_emit_at() instead of scnprintf()
thermal: intel: powerclamp: Fix duration module parameter
...
|
|
The powercap/idle_inject core uses play_idle_precise() to inject idle
time. But play_idle_precise() can't ensure that the CPU is fully idle
for the specified duration because of wakeups due to interrupts. To
compensate for the reduced idle time due to these wakes, the caller
can adjust requested idle time for the next cycle.
The goal of idle injection is to keep system at some idle percent on
average, so this is fine to overshoot or undershoot instantaneous idle
times.
The idle inject core provides an interface idle_inject_set_duration()
to set idle and runtime duration.
Some architectures provide interface to get actual idle time observed
by the hardware. So, the effective idle percent can be adjusted using
the hardware feedback. For example, Intel CPUs provides package idle
counters, which is currently used by Intel powerclamp driver to
readjust runtime duration.
When the caller's desired idle time over a period is less or greater
than the actual CPU idle time observed by the hardware, caller can
readjust idle and runtime duration for the next cycle.
The only way this can be done currently is by monitoring hardware idle
time from a different software thread and readjust idle and runtime
duration using idle_inject_set_duration().
This can be avoided by adding a callback which callers can register and
readjust from this callback function.
Add a capability to register an optional update() callback, which can be
called from the idle inject core before waking up CPUs for idle injection.
This callback can be registered via a new interface:
idle_inject_register_full().
During this process of constantly adjusting idle and runtime duration
there can be some cases where actual idle time is more than the desired.
In this case idle inject can be skipped for a cycle. If update() callback
returns false, then the idle inject core skips waking up CPUs for the
idle injection.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Export symbols for external interfaces, so that they can be used in
other loadable modules.
Export is done under name space IDLE_INJECT.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The users of the idle injection framework allow 100% idle injection. For
example: thermal/cpuidle_cooling.c driver. When the ratio is set to
100%, the runtime_duration becomes zero.
However, idle_inject_set_duration() in the idle injection framework
silently ignores run_duration_us == 0 without any error (it is a void
function). The caller will then assume that everything is fine and
100% idle is effective, but in reality the idle duration will not
change.
There are two options:
- The caller may change their max state to 99% instead of 100% and
document that 100% is not supported by the idle inject framework.
- Add 100% idle support to the idle inject framework.
Since there are other protections via RT throttling, this framework can
allow 100% idle. The RT throttling will be activated at 95% idle by
default. The caller disabling RT throttling and injecting 100% idle,
should be aware that CPU can't be used at all.
The idle inject timer is started for (run_duration_us + idle_duration_us)
duration. Hence replace (run_duration_us && idle_duration_us) with
(run_duration_us + idle_duration_us) in the function
idle_inject_set_duration(). Also check for !(run_duration_us +
idle_duration_us) to return -EINVAL in idle_inject_start().
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Fix following warning at three places:
Function parameter or member 'ii_dev' not described.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Now that wait_task_inactive()'s @match_state argument is a mask (like
ttwu()) it is possible to replace the special !match_state case with
an 'all-states' value such that any blocked state will match.
Suggested-by: Ingo Molnar (mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YxhkzfuFTvRnpUaH@hirez.programming.kicks-ass.net
|
|
Drop superfluous "the" from the comment in line 15.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
[ rjw: Subject edit, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Include the linux/idle_inject.h header to fix W=1 build warning:
drivers/powercap/idle_inject.c:152:6: warning: no previous prototype for ‘idle_inject_set_duration’ [-Wmissing-prototypes]
drivers/powercap/idle_inject.c:167:6: warning: no previous prototype for ‘idle_inject_get_duration’ [-Wmissing-prototypes]
drivers/powercap/idle_inject.c:179:6: warning: no previous prototype for ‘idle_inject_set_latency’ [-Wmissing-prototypes]
drivers/powercap/idle_inject.c:195:5: warning: no previous prototype for ‘idle_inject_start’ [-Wmissing-prototypes]
drivers/powercap/idle_inject.c:227:6: warning: no previous prototype for ‘idle_inject_stop’ [-Wmissing-prototypes]
drivers/powercap/idle_inject.c:299:28: warning: no previous prototype for ‘idle_inject_register’ [-Wmissing-prototypes]
drivers/powercap/idle_inject.c:345:6: warning: no previous prototype for ‘idle_inject_unregister’ [-Wmissing-prototypes]
Signed-off-by: Pujin Shi <shipj@lemote.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sched/fifo updates from Ingo Molnar:
"This adds the sched_set_fifo*() encapsulation APIs to remove static
priority level knowledge from non-scheduler code.
The three APIs for non-scheduler code to set SCHED_FIFO are:
- sched_set_fifo()
- sched_set_fifo_low()
- sched_set_normal()
These are two FIFO priority levels: default (high), and a 'low'
priority level, plus sched_set_normal() to set the policy back to
non-SCHED_FIFO.
Since the changes affect a lot of non-scheduler code, we kept this in
a separate tree"
* tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
sched,tracing: Convert to sched_set_fifo()
sched: Remove sched_set_*() return value
sched: Remove sched_setscheduler*() EXPORTs
sched,psi: Convert to sched_set_fifo_low()
sched,rcutorture: Convert to sched_set_fifo_low()
sched,rcuperf: Convert to sched_set_fifo_low()
sched,locktorture: Convert to sched_set_fifo()
sched,irq: Convert to sched_set_fifo()
sched,watchdog: Convert to sched_set_fifo()
sched,serial: Convert to sched_set_fifo()
sched,powerclamp: Convert to sched_set_fifo()
sched,ion: Convert to sched_set_normal()
sched,powercap: Convert to sched_set_fifo*()
sched,spi: Convert to sched_set_fifo*()
sched,mmc: Convert to sched_set_fifo*()
sched,ivtv: Convert to sched_set_fifo*()
sched,drm/scheduler: Convert to sched_set_fifo*()
sched,msm: Convert to sched_set_fifo*()
sched,psci: Convert to sched_set_fifo*()
sched,drbd: Convert to sched_set_fifo*()
...
|
|
After commit 333cff6c963fbc ("powercap/drivers/idle_inject: Specify
idle state max latency"), we convert to use play_idle_precise() with
max allowed latency to specify the idle state.
Some function comments still use play_idle(), let's update it to
play_idle_precise().
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Frank Lee <frank@allwinnertech.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Because SCHED_FIFO is a broken scheduler model (see previous patches)
take away the priority field, the kernel can't possibly make an
informed decision.
Effectively no change.
Cc: daniel.lezcano@linaro.org
Cc: rafael.j.wysocki@intel.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
|
|
Currently the idle injection framework uses the play_idle() function
which puts the current CPU in an idle state. The idle state is the
deepest one, as specified by the latency constraint when calling the
subsequent play_idle_precise() function with the INT_MAX.
The idle_injection is used by the cpuidle_cooling device which
computes the idle / run duration to mitigate the temperature by
injecting idle cycles. The cooling device has no control on the depth
of the idle state.
Allow finer control of the idle injection mechanism by allowing to
specify the latency for the idle state. Thus the cooling device has
the ability to have a guarantee on the exit latency of the idle states
it is injecting.
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Link: https://lore.kernel.org/r/20200429103644.5492-1-daniel.lezcano@linaro.org
|
|
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
Lastly, fix the following checkpatch warning:
WARNING: Prefer 'unsigned long' over 'unsigned long int' as the int is unnecessary
+ unsigned long int cpumask[];
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The resolution of the idle injection is limited to 1ms. If there is
a need for an injection of 1.2 ms, it is not possible.
The idle injection API is not yet used, so it is safe to convert the
existing API to the new time unit instead of adding more functions.
Convert to microsecond in order to use a finer grain time unit when
injecting idle cycles.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The play_idle resolution is 1ms. The intel_powerclamp bases the idle
duration on jiffies. The idle injection API is also using msec based
duration but has no user yet.
Unfortunately, msec based time does not fit well when we want to
inject idle cycle precisely with shallow idle state.
In order to set the scene for the incoming idle injection user, move
the precision up to usec when calling play_idle.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Initially, the cpu_cooling device for ARM was changed by adding a new
policy inserting idle cycles. The intel_powerclamp driver does a
similar action.
Instead of implementing idle injections privately in the cpu_cooling
device, move the idle injection code in a dedicated framework and give
the opportunity to other frameworks to make use of it.
The framework relies on the smpboot kthreads which handles via its
main loop the common code for hotplugging and [un]parking.
This code was previously tested with the cpu cooling device and went
through several iterations. It results now in split code and API
exported in the header file. It was tested with the cpu cooling device
with success.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Rewrite of all comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|