author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-27 01:10:25 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-04-27 01:10:25 +0300 |
commit | 5469f160e6bf38b84eb237055868286e629b8d44 (patch) | |
tree | f4ca3ebd04e46af3e895f941ec76d714f92670ff /drivers | |
parent | d8f9176b4ece17e831306072678cd9ae49688cf5 (diff) | |
parent | 59e2c959f20f9f255a42de52cde54a2962fb726f (diff) | |
download | linux-5469f160e6bf38b84eb237055868286e629b8d44.tar.xz |
Merge tag 'pm-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add some new hardware support (for example, IceLake-D idle
states in intel_idle), fix some issues (for example, the handling of
negative "sleep length" values in cpuidle governors), add new
functionality to the existing drivers (for example, scale-invariance
support in the ACPI CPPC cpufreq driver) and clean up code all over.
Specifics:
- Add idle states table for IceLake-D to the intel_idle driver and
update IceLake-X C6 data in it (Artem Bityutskiy).
- Fix the C7 idle state on Tegra114 in the tegra cpuidle driver and
drop the unused do_idle() firmware call from it (Dmitry Osipenko).
- Fix cpuidle-qcom-spm Kconfig entry (He Ying).
- Fix handling of possible negative tick_nohz_get_next_hrtimer()
return values in cpuidle governors (Rafael Wysocki).
- Add support for frequency-invariance to the ACPI CPPC cpufreq
driver and update the frequency-invariance engine (FIE) to use it
as needed (Viresh Kumar).
- Simplify the default delay_us setting in the ACPI CPPC cpufreq
driver (Tom Saeger).
- Clean up frequency-related computations in the intel_pstate cpufreq
driver (Rafael Wysocki).
- Fix TBG parent setting for load levels in the armada-37xx cpufreq
driver and drop the CPU PM clock .set_parent method for armada-37xx
(Marek Behún).
- Fix multiple issues in the armada-37xx cpufreq driver (Pali Rohár).
- Fix handling of dev_pm_opp_of_cpumask_add_table() return values in
cpufreq-dt to take the -EPROBE_DEFER one into account as
appropriate (Quanyang Wang).
- Fix format string in ia64-acpi-cpufreq (Sergei Trofimovich).
- Drop the unused for_each_policy() macro from cpufreq (Shaokun
Zhang).
- Simplify computations in the schedutil cpufreq governor to avoid
unnecessary overhead (Yue Hu).
- Fix typos in the s5pv210 cpufreq driver (Bhaskar Chowdhury).
- Fix cpufreq documentation links in Kconfig (Alexander Monakov).
- Fix PCI device power state handling in pci_enable_device_flags() to
avoid issues in some cases when the device depends on an ACPI power
resource (Rafael Wysocki).
- Add missing documentation of pm_runtime_resume_and_get() (Alan
Stern).
- Add missing static inline stub for pm_runtime_has_no_callbacks() to
pm_runtime.h and drop the unused try_to_freeze_nowarn() definition
(YueHaibing).
- Drop duplicate struct device declaration from pm.h and fix a
structure type declaration in intel_rapl.h (Wan Jiabing).
- Use dev_set_name() instead of an open-coded equivalent of it in the
wakeup sources code and drop a redundant local variable
initialization from it (Andy Shevchenko, Colin Ian King).
- Use crc32 instead of md5 for e820 memory map integrity check during
resume from hibernation on x86 (Chris von Recklinghausen).
- Fix typos in comments in the system-wide and hibernation support
code (Lu Jialin).
- Modify the generic power domains (genpd) code to avoid resuming
devices in the "prepare" phase of system-wide suspend and
hibernation (Ulf Hansson).
- Add Hygon Fam18h RAPL support to the intel_rapl power capping
driver (Pu Wen).
- Add MAINTAINERS entry for the dynamic thermal power management
(DTPM) code (Daniel Lezcano).
- Add devm variants of operating performance points (OPP) API
functions and switch over some users of the OPP framework to the
new resource-managed API (Yangtao Li and Dmitry Osipenko).
- Update devfreq core:
* Register devfreq devices as cooling devices on demand (Daniel
Lezcano).
* Add missing unlock operation in devfreq_add_device() (Lukasz
Luba).
* Use the next frequency as resume_freq instead of the previous
frequency when using the opp-suspend property (Dong Aisheng).
* Check get_dev_status in devfreq_update_stats() (Dong Aisheng).
* Fix set_freq path for the userspace governor in Kconfig (Dong
Aisheng).
* Remove invalid description of get_target_freq() (Dong Aisheng).
- Update devfreq drivers:
* imx8m-ddrc: Remove imx8m_ddrc_get_dev_status() and unneeded
of_match_ptr() (Dong Aisheng, Fabio Estevam).
* rk3399_dmc: dt-bindings: Add rockchip,pmu phandle and drop
references to undefined symbols (Enric Balletbo i Serra, Gaël
PORTAY).
* rk3399_dmc: Use dev_err_probe() to simplify the code (Krzysztof
Kozlowski).
* imx-bus: Remove unneeded of_match_ptr() (Fabio Estevam).
- Fix kernel-doc warnings in three places (Pierre-Louis Bossart).
- Fix typo in the pm-graph utility code (Ricardo Ribalda)"
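The frequency-invariance items in the pull message come down to a fixed-point ratio on the scheduler's capacity scale. The following is a minimal userspace sketch of that arithmetic, not the kernel code itself: `CAPACITY_SHIFT`, `freq_scale_from_cpufreq()` and `freq_scale_from_perf()` are hypothetical names chosen for illustration, and the shift of 10 is assumed to match the kernel's `SCHED_CAPACITY_SHIFT`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror SCHED_CAPACITY_SHIFT / SCHED_CAPACITY_SCALE (1024). */
#define CAPACITY_SHIFT 10
#define CAPACITY_SCALE (1UL << CAPACITY_SHIFT)

/* cpufreq-driven scale: current frequency as a fraction of the maximum. */
static unsigned long freq_scale_from_cpufreq(unsigned long cur_freq,
					     unsigned long max_freq)
{
	assert(max_freq != 0);
	return (cur_freq << CAPACITY_SHIFT) / max_freq;
}

/*
 * Counter-driven scale: delivered performance relative to the highest
 * performance level, clamped so it never exceeds the full scale.
 */
static unsigned long freq_scale_from_perf(uint64_t perf, uint64_t highest_perf)
{
	uint64_t scale;

	assert(highest_perf != 0);
	scale = (perf << CAPACITY_SHIFT) / highest_perf;
	if (scale > CAPACITY_SCALE)
		scale = CAPACITY_SCALE;
	return (unsigned long)scale;
}
```

With these helpers, a CPU running at half its maximum frequency gets a scale of 512, and a delivered-performance reading above the highest level is capped at 1024.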
* tag 'pm-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
PM: wakeup: remove redundant assignment to variable retval
PM: hibernate: x86: Use crc32 instead of md5 for hibernation e820 integrity check
cpufreq: Kconfig: fix documentation links
PM: wakeup: use dev_set_name() directly
PM: runtime: Add documentation for pm_runtime_resume_and_get()
cpufreq: intel_pstate: Simplify intel_pstate_update_perf_limits()
cpufreq: armada-37xx: Fix module unloading
cpufreq: armada-37xx: Remove cur_frequency variable
cpufreq: armada-37xx: Fix determining base CPU frequency
cpufreq: armada-37xx: Fix driver cleanup when registration failed
clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0
clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz
cpufreq: armada-37xx: Fix the AVS value for load L1
clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock
cpufreq: armada-37xx: Fix setting TBG parent for load levels
cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration
cpuidle: tegra: Remove do_idle firmware call
cpuidle: tegra: Fix C7 idling state on Tegra114
PM: sleep: fix typos in comments
cpufreq: Remove unused for_each_policy macro
...
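The armada-37xx-periph workaround commits listed above change when the CPU has to park at load level L1 for 20 ms before going to L0. The decision can be modeled in userspace roughly as below. This is a sketch under stated assumptions: `wa_state`, `l0_switch_needs_l1_detour()`, `now` and `delay_ticks` are hypothetical stand-ins for the driver's `l1_expiration` field, `jiffies` and `msecs_to_jiffies(20)`, and the register writes and the sleep itself are left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

enum load_level { LOAD_L0 = 0, LOAD_L1, LOAD_L2, LOAD_L3 };

struct wa_state {
	unsigned long l1_expiration;	/* 0 means "no valid L1 timestamp" */
};

/*
 * Returns true when the caller should first switch to L1 and wait
 * before programming the requested level, false otherwise.
 */
static bool l0_switch_needs_l1_detour(struct wa_state *st,
				      enum load_level cur, enum load_level next,
				      unsigned long rate_hz,
				      unsigned long now,
				      unsigned long delay_ticks)
{
	assert(st);

	if (cur == next)
		return false;

	/* Going to L1 voluntarily: record when the settle time will be over. */
	if (next == LOAD_L1) {
		st->l1_expiration = (cur == LOAD_L0) ? now : now + delay_ticks;
		return false;
	}

	/* Only a jump to a rate of 1 GHz or more may need the detour. */
	if (rate_hz >= 1000UL * 1000 * 1000 &&
	    !(st->l1_expiration && now >= st->l1_expiration)) {
		st->l1_expiration = 0;
		return true;
	}

	/* L2/L3 target, or already spent long enough at L1. */
	st->l1_expiration = 0;
	return false;
}
```

A direct L2 to L0 request at 1.2 GHz reports that the detour is needed, while an L1 to L0 request issued after the recorded expiration does not.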
Diffstat (limited to 'drivers')
42 files changed, 801 insertions, 484 deletions
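The arch_topology.c part of this diff introduces per-CPU scale-frequency sources in which an architecture-provided source is never displaced by another one. A small userspace model of that rule might look like the sketch below. The array `sft_data_demo`, the `SOURCE_*` enumerators and the `set_source()` / `clear_source()` helpers are illustrative names only, and a plain array of CPU numbers replaces the kernel's cpumask.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_DEMO 4

/* Hypothetical stand-ins for the kernel's scale-frequency source kinds. */
enum scale_freq_source { SOURCE_CPUFREQ = 0, SOURCE_ARCH, SOURCE_CPPC };

struct scale_freq_data {
	enum scale_freq_source source;
	void (*set_freq_scale)(void);
};

/* One registered source per CPU; NULL means no counter-based source. */
static struct scale_freq_data *sft_data_demo[NR_CPUS_DEMO];

/* Register data for the given CPUs unless an ARCH source is already there. */
static void set_source(struct scale_freq_data *data, const int *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct scale_freq_data *sfd;

		assert(cpus[i] >= 0 && cpus[i] < NR_CPUS_DEMO);
		sfd = sft_data_demo[cpus[i]];
		if (!sfd || sfd->source != SOURCE_ARCH)
			sft_data_demo[cpus[i]] = data;
	}
}

/* Drop the registration only where it belongs to the given source kind. */
static void clear_source(enum scale_freq_source source, const int *cpus,
			 size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct scale_freq_data *sfd;

		assert(cpus[i] >= 0 && cpus[i] < NR_CPUS_DEMO);
		sfd = sft_data_demo[cpus[i]];
		if (sfd && sfd->source == source)
			sft_data_demo[cpus[i]] = NULL;
	}
}
```

In this model a CPPC registration over CPUs 0 and 1 leaves an existing ARCH registration on CPU 1 untouched, and clearing the CPPC source later removes only the CPU 0 entry.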
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index de8587cc119e..c1179edc0f3b 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -21,17 +21,94 @@
 #include <linux/sched.h>
 #include <linux/smp.h>
 
+static DEFINE_PER_CPU(struct scale_freq_data *, sft_data);
+static struct cpumask scale_freq_counters_mask;
+static bool scale_freq_invariant;
+
+static bool supports_scale_freq_counters(const struct cpumask *cpus)
+{
+	return cpumask_subset(cpus, &scale_freq_counters_mask);
+}
+
 bool topology_scale_freq_invariant(void)
 {
 	return cpufreq_supports_freq_invariance() ||
-	       arch_freq_counters_available(cpu_online_mask);
+	       supports_scale_freq_counters(cpu_online_mask);
 }
 
-__weak bool arch_freq_counters_available(const struct cpumask *cpus)
+static void update_scale_freq_invariant(bool status)
 {
-	return false;
+	if (scale_freq_invariant == status)
+		return;
+
+	/*
+	 * Task scheduler behavior depends on frequency invariance support,
+	 * either cpufreq or counter driven. If the support status changes as
+	 * a result of counter initialisation and use, retrigger the build of
+	 * scheduling domains to ensure the information is propagated properly.
+	 */
+	if (topology_scale_freq_invariant() == status) {
+		scale_freq_invariant = status;
+		rebuild_sched_domains_energy();
+	}
 }
-DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
+
+void topology_set_scale_freq_source(struct scale_freq_data *data,
+				    const struct cpumask *cpus)
+{
+	struct scale_freq_data *sfd;
+	int cpu;
+
+	/*
+	 * Avoid calling rebuild_sched_domains() unnecessarily if FIE is
+	 * supported by cpufreq.
+	 */
+	if (cpumask_empty(&scale_freq_counters_mask))
+		scale_freq_invariant = topology_scale_freq_invariant();
+
+	for_each_cpu(cpu, cpus) {
+		sfd = per_cpu(sft_data, cpu);
+
+		/* Use ARCH provided counters whenever possible */
+		if (!sfd || sfd->source != SCALE_FREQ_SOURCE_ARCH) {
+			per_cpu(sft_data, cpu) = data;
+			cpumask_set_cpu(cpu, &scale_freq_counters_mask);
+		}
+	}
+
+	update_scale_freq_invariant(true);
+}
+EXPORT_SYMBOL_GPL(topology_set_scale_freq_source);
+
+void topology_clear_scale_freq_source(enum scale_freq_source source,
+				      const struct cpumask *cpus)
+{
+	struct scale_freq_data *sfd;
+	int cpu;
+
+	for_each_cpu(cpu, cpus) {
+		sfd = per_cpu(sft_data, cpu);
+
+		if (sfd && sfd->source == source) {
+			per_cpu(sft_data, cpu) = NULL;
+			cpumask_clear_cpu(cpu, &scale_freq_counters_mask);
+		}
+	}
+
+	update_scale_freq_invariant(false);
+}
+EXPORT_SYMBOL_GPL(topology_clear_scale_freq_source);
+
+void topology_scale_freq_tick(void)
+{
+	struct scale_freq_data *sfd = *this_cpu_ptr(&sft_data);
+
+	if (sfd)
+		sfd->set_freq_scale();
+}
+
+DEFINE_PER_CPU(unsigned long, arch_freq_scale) = SCHED_CAPACITY_SCALE;
+EXPORT_PER_CPU_SYMBOL_GPL(arch_freq_scale);
 
 void topology_set_freq_scale(const struct cpumask *cpus, unsigned long cur_freq,
 			     unsigned long max_freq)
@@ -47,13 +124,13 @@ void topology_set_freq_scale(const struct cpumask *cpus, unsigned long cur_freq,
 	 * want to update the scale factor with information from CPUFREQ.
 	 * Instead the scale factor will be updated from arch_scale_freq_tick.
 	 */
-	if (arch_freq_counters_available(cpus))
+	if (supports_scale_freq_counters(cpus))
 		return;
 
 	scale = (cur_freq << SCHED_CAPACITY_SHIFT) / max_freq;
 
 	for_each_cpu(i, cpus)
-		per_cpu(freq_scale, i) = scale;
+		per_cpu(arch_freq_scale, i) = scale;
 }
 
 DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
index 84d5acb6301b..0251f3e6e61d 100644
--- a/drivers/base/power/clock_ops.c
+++ b/drivers/base/power/clock_ops.c
@@ -140,7 +140,7 @@ static void pm_clk_op_unlock(struct pm_subsys_data *psd, unsigned long *flags)
 }
 
 /**
- * pm_clk_enable - Enable a clock, reporting any errors
+ * __pm_clk_enable - Enable a clock, reporting any errors
  * @dev: The device for the given clock
  * @ce: PM clock entry corresponding to the clock.
  */
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 78c310d3179d..b6a782c31613 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1088,34 +1088,6 @@ static void genpd_sync_power_on(struct generic_pm_domain *genpd, bool use_lock,
 }
 
 /**
- * resume_needed - Check whether to resume a device before system suspend.
- * @dev: Device to check.
- * @genpd: PM domain the device belongs to.
- *
- * There are two cases in which a device that can wake up the system from sleep
- * states should be resumed by genpd_prepare(): (1) if the device is enabled
- * to wake up the system and it has to remain active for this purpose while the
- * system is in the sleep state and (2) if the device is not enabled to wake up
- * the system from sleep states and it generally doesn't generate wakeup signals
- * by itself (those signals are generated on its behalf by other parts of the
- * system). In the latter case it may be necessary to reconfigure the device's
- * wakeup settings during system suspend, because it may have been set up to
- * signal remote wakeup from the system's working state as needed by runtime PM.
- * Return 'true' in either of the above cases.
- */
-static bool resume_needed(struct device *dev,
-			  const struct generic_pm_domain *genpd)
-{
-	bool active_wakeup;
-
-	if (!device_can_wakeup(dev))
-		return false;
-
-	active_wakeup = genpd_is_active_wakeup(genpd);
-	return device_may_wakeup(dev) ? active_wakeup : !active_wakeup;
-}
-
-/**
  * genpd_prepare - Start power transition of a device in a PM domain.
  * @dev: Device to start the transition of.
  *
@@ -1135,14 +1107,6 @@ static int genpd_prepare(struct device *dev)
 	if (IS_ERR(genpd))
 		return -EINVAL;
 
-	/*
-	 * If a wakeup request is pending for the device, it should be woken up
-	 * at this point and a system wakeup event should be reported if it's
-	 * set up to wake up the system from sleep states.
-	 */
-	if (resume_needed(dev, genpd))
-		pm_runtime_resume(dev);
-
 	genpd_lock(genpd);
 
 	if (genpd->prepared_count++ == 0)
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index fe1dad68aee4..1fc1a992f90c 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -951,7 +951,7 @@ static void pm_runtime_work(struct work_struct *work)
 
 /**
  * pm_suspend_timer_fn - Timer function for pm_schedule_suspend().
- * @data: Device pointer passed by pm_schedule_suspend().
+ * @timer: hrtimer used by pm_schedule_suspend().
  *
  * Check if the time is right and queue a suspend request.
  */
diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index 92073ac68473..f0b37c188514 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -400,9 +400,9 @@ void device_wakeup_detach_irq(struct device *dev)
 }
 
 /**
- * device_wakeup_arm_wake_irqs(void)
+ * device_wakeup_arm_wake_irqs -
  *
- * Itereates over the list of device wakeirqs to arm them.
+ * Iterates over the list of device wakeirqs to arm them.
  */
 void device_wakeup_arm_wake_irqs(void)
 {
@@ -416,9 +416,9 @@ void device_wakeup_arm_wake_irqs(void)
 }
 
 /**
- * device_wakeup_disarm_wake_irqs(void)
+ * device_wakeup_disarm_wake_irqs -
  *
- * Itereates over the list of device wakeirqs to disarm them.
+ * Iterates over the list of device wakeirqs to disarm them.
  */
 void device_wakeup_disarm_wake_irqs(void)
 {
@@ -532,6 +532,7 @@ EXPORT_SYMBOL_GPL(device_init_wakeup);
 /**
  * device_set_wakeup_enable - Enable or disable a device to wake up the system.
  * @dev: Device to handle.
+ * @enable: enable/disable flag
  */
 int device_set_wakeup_enable(struct device *dev, bool enable)
 {
@@ -581,7 +582,7 @@ static bool wakeup_source_not_registered(struct wakeup_source *ws)
  */
 
 /**
- * wakup_source_activate - Mark given wakeup source as active.
+ * wakeup_source_activate - Mark given wakeup source as active.
  * @ws: Wakeup source to handle.
  *
  * Update the @ws' statistics and, if @ws has just been activated, notify the PM
@@ -686,7 +687,7 @@ static inline void update_prevent_sleep_time(struct wakeup_source *ws,
 #endif
 
 /**
- * wakup_source_deactivate - Mark given wakeup source as inactive.
+ * wakeup_source_deactivate - Mark given wakeup source as inactive.
  * @ws: Wakeup source to handle.
  *
  * Update the @ws' statistics and notify the PM core that the wakeup source has
@@ -785,7 +786,7 @@ EXPORT_SYMBOL_GPL(pm_relax);
 
 /**
  * pm_wakeup_timer_fn - Delayed finalization of a wakeup event.
- * @data: Address of the wakeup source object associated with the event source.
+ * @t: timer list
  *
  * Call wakeup_source_deactivate() for the wakeup source whose address is stored
  * in @data if it is currently active and its timer has not been canceled and
@@ -1021,7 +1022,7 @@ bool pm_save_wakeup_count(unsigned int count)
 #ifdef CONFIG_PM_AUTOSLEEP
 /**
  * pm_wakep_autosleep_enabled - Modify autosleep_enabled for all wakeup sources.
- * @enabled: Whether to set or to clear the autosleep_enabled flags.
+ * @set: Whether to set or to clear the autosleep_enabled flags.
  */
 void pm_wakep_autosleep_enabled(bool set)
 {
diff --git a/drivers/base/power/wakeup_stats.c b/drivers/base/power/wakeup_stats.c
index 5ade7539ac02..924fac493c4f 100644
--- a/drivers/base/power/wakeup_stats.c
+++ b/drivers/base/power/wakeup_stats.c
@@ -137,7 +137,7 @@ static struct device *wakeup_source_device_create(struct device *parent,
 						   struct wakeup_source *ws)
 {
 	struct device *dev = NULL;
-	int retval = -ENODEV;
+	int retval;
 
 	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
 	if (!dev) {
diff --git a/drivers/clk/mvebu/armada-37xx-periph.c b/drivers/clk/mvebu/armada-37xx-periph.c
index f5746f9ea929..32ac6b6b7530 100644
--- a/drivers/clk/mvebu/armada-37xx-periph.c
+++ b/drivers/clk/mvebu/armada-37xx-periph.c
@@ -84,6 +84,7 @@ struct clk_pm_cpu {
 	void __iomem *reg_div;
 	u8 shift_div;
 	struct regmap *nb_pm_base;
+	unsigned long l1_expiration;
 };
 
 #define to_clk_double_div(_hw) container_of(_hw, struct clk_double_div, hw)
@@ -440,33 +441,6 @@ static u8 clk_pm_cpu_get_parent(struct clk_hw *hw)
 	return val;
 }
 
-static int clk_pm_cpu_set_parent(struct clk_hw *hw, u8 index)
-{
-	struct clk_pm_cpu *pm_cpu = to_clk_pm_cpu(hw);
-	struct regmap *base = pm_cpu->nb_pm_base;
-	int load_level;
-
-	/*
-	 * We set the clock parent only if the DVFS is available but
-	 * not enabled.
-	 */
-	if (IS_ERR(base) || armada_3700_pm_dvfs_is_enabled(base))
-		return -EINVAL;
-
-	/* Set the parent clock for all the load level */
-	for (load_level = 0; load_level < LOAD_LEVEL_NR; load_level++) {
-		unsigned int reg, mask, val,
-			offset = ARMADA_37XX_NB_TBG_SEL_OFF;
-
-		armada_3700_pm_dvfs_update_regs(load_level, &reg, &offset);
-
-		val = index << offset;
-		mask = ARMADA_37XX_NB_TBG_SEL_MASK << offset;
-		regmap_update_bits(base, reg, mask, val);
-	}
-	return 0;
-}
-
 static unsigned long clk_pm_cpu_recalc_rate(struct clk_hw *hw,
 					    unsigned long parent_rate)
 {
@@ -514,8 +488,10 @@ static long clk_pm_cpu_round_rate(struct clk_hw *hw, unsigned long rate,
 }
 
 /*
- * Switching the CPU from the L2 or L3 frequencies (300 and 200 Mhz
- * respectively) to L0 frequency (1.2 Ghz) requires a significant
+ * Workaround when base CPU frequnecy is 1000 or 1200 MHz
+ *
+ * Switching the CPU from the L2 or L3 frequencies (250/300 or 200 MHz
+ * respectively) to L0 frequency (1/1.2 GHz) requires a significant
  * amount of time to let VDD stabilize to the appropriate
  * voltage. This amount of time is large enough that it cannot be
  * covered by the hardware countdown register. Due to this, the CPU
@@ -525,26 +501,56 @@ static long clk_pm_cpu_round_rate(struct clk_hw *hw, unsigned long rate,
  * To work around this problem, we prevent switching directly from the
  * L2/L3 frequencies to the L0 frequency, and instead switch to the L1
  * frequency in-between. The sequence therefore becomes:
- * 1. First switch from L2/L3(200/300MHz) to L1(600MHZ)
+ * 1. First switch from L2/L3 (200/250/300 MHz) to L1 (500/600 MHz)
  * 2. Sleep 20ms for stabling VDD voltage
- * 3. Then switch from L1(600MHZ) to L0(1200Mhz).
+ * 3. Then switch from L1 (500/600 MHz) to L0 (1000/1200 MHz).
  */
-static void clk_pm_cpu_set_rate_wa(unsigned long rate, struct regmap *base)
+static void clk_pm_cpu_set_rate_wa(struct clk_pm_cpu *pm_cpu,
+				   unsigned int new_level, unsigned long rate,
+				   struct regmap *base)
 {
 	unsigned int cur_level;
 
-	if (rate != 1200 * 1000 * 1000)
-		return;
-
 	regmap_read(base, ARMADA_37XX_NB_CPU_LOAD, &cur_level);
 	cur_level &= ARMADA_37XX_NB_CPU_LOAD_MASK;
-	if (cur_level <= ARMADA_37XX_DVFS_LOAD_1)
+
+	if (cur_level == new_level)
+		return;
+
+	/*
+	 * System wants to go to L1 on its own. If we are going from L2/L3,
+	 * remember when 20ms will expire. If from L0, set the value so that
+	 * next switch to L0 won't have to wait.
+	 */
+	if (new_level == ARMADA_37XX_DVFS_LOAD_1) {
+		if (cur_level == ARMADA_37XX_DVFS_LOAD_0)
+			pm_cpu->l1_expiration = jiffies;
+		else
+			pm_cpu->l1_expiration = jiffies + msecs_to_jiffies(20);
 		return;
+	}
+
+	/*
+	 * If we are setting to L2/L3, just invalidate L1 expiration time,
+	 * sleeping is not needed.
+	 */
+	if (rate < 1000*1000*1000)
+		goto invalidate_l1_exp;
+
+	/*
+	 * We are going to L0 with rate >= 1GHz. Check whether we have been at
+	 * L1 for long enough time. If not, go to L1 for 20ms.
+	 */
+	if (pm_cpu->l1_expiration && jiffies >= pm_cpu->l1_expiration)
+		goto invalidate_l1_exp;
 
 	regmap_update_bits(base, ARMADA_37XX_NB_CPU_LOAD,
 			   ARMADA_37XX_NB_CPU_LOAD_MASK,
 			   ARMADA_37XX_DVFS_LOAD_1);
 	msleep(20);
+
+invalidate_l1_exp:
+	pm_cpu->l1_expiration = 0;
 }
 
 static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate,
@@ -578,7 +584,9 @@ static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate,
 			reg = ARMADA_37XX_NB_CPU_LOAD;
 			mask = ARMADA_37XX_NB_CPU_LOAD_MASK;
 
-			clk_pm_cpu_set_rate_wa(rate, base);
+			/* Apply workaround when base CPU frequency is 1000 or 1200 MHz */
+			if (parent_rate >= 1000*1000*1000)
+				clk_pm_cpu_set_rate_wa(pm_cpu, load_level, rate, base);
 
 			regmap_update_bits(base, reg, mask, load_level);
 
@@ -592,7 +600,6 @@ static int clk_pm_cpu_set_rate(struct clk_hw *hw, unsigned long rate,
 
 static const struct clk_ops clk_pm_cpu_ops = {
 	.get_parent = clk_pm_cpu_get_parent,
-	.set_parent = clk_pm_cpu_set_parent,
 	.round_rate = clk_pm_cpu_round_rate,
 	.set_rate = clk_pm_cpu_set_rate,
 	.recalc_rate = clk_pm_cpu_recalc_rate,
diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index 85de313ddec2..c3038cdc6865 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -13,7 +13,8 @@ config CPU_FREQ
 	  clock speed, you need to either enable a dynamic cpufreq governor
 	  (see below) after boot, or use a userspace tool.
 
-	  For details, take a look at <file:Documentation/cpu-freq>.
+	  For details, take a look at
+	  <file:Documentation/admin-guide/pm/cpufreq.rst>.
 
 	  If in doubt, say N.
 
@@ -140,8 +141,6 @@ config CPU_FREQ_GOV_USERSPACE
 	  To compile this driver as a module, choose M here: the
 	  module will be called cpufreq_userspace.
 
-	  For details, take a look at <file:Documentation/cpu-freq/>.
-
 	  If in doubt, say Y.
 
 config CPU_FREQ_GOV_ONDEMAND
@@ -158,7 +157,8 @@ config CPU_FREQ_GOV_ONDEMAND
 	  To compile this driver as a module, choose M here: the
 	  module will be called cpufreq_ondemand.
 
-	  For details, take a look at linux/Documentation/cpu-freq.
+	  For details, take a look at
+	  <file:Documentation/admin-guide/pm/cpufreq.rst>.
 
 	  If in doubt, say N.
 
@@ -182,7 +182,8 @@ config CPU_FREQ_GOV_CONSERVATIVE
 	  To compile this driver as a module, choose M here: the
 	  module will be called cpufreq_conservative.
 
-	  For details, take a look at linux/Documentation/cpu-freq.
+	  For details, take a look at
+	  <file:Documentation/admin-guide/pm/cpufreq.rst>.
 
 	  If in doubt, say N.
 
@@ -246,8 +247,6 @@ config IA64_ACPI_CPUFREQ
 	  This driver adds a CPUFreq driver which utilizes the ACPI
 	  Processor Performance States.
 
-	  For details, take a look at <file:Documentation/cpu-freq/>.
-
 	  If in doubt, say N.
 
 endif
@@ -271,8 +270,6 @@ config LOONGSON2_CPUFREQ
 
 	  Loongson2F and it's successors support this feature.
 
-	  For details, take a look at <file:Documentation/cpu-freq/>.
-
 	  If in doubt, say N.
 
 config LOONGSON1_CPUFREQ
@@ -282,8 +279,6 @@ config LOONGSON1_CPUFREQ
 	  This option adds a CPUFreq driver for loongson1 processors which
 	  support software configurable cpu frequency.
 
-	  For details, take a look at <file:Documentation/cpu-freq/>.
-
 	  If in doubt, say N.
 
 endif
@@ -293,8 +288,6 @@ config SPARC_US3_CPUFREQ
 	help
 	  This adds the CPUFreq driver for UltraSPARC-III processors.
 
-	  For details, take a look at <file:Documentation/cpu-freq>.
-
 	  If in doubt, say N.
 
 config SPARC_US2E_CPUFREQ
@@ -302,8 +295,6 @@ config SPARC_US2E_CPUFREQ
 	help
 	  This adds the CPUFreq driver for UltraSPARC-IIe processors.
 
-	  For details, take a look at <file:Documentation/cpu-freq>.
-
 	  If in doubt, say N.
 
 endif
@@ -318,8 +309,6 @@ config SH_CPU_FREQ
 	  will also generate a notice in the boot log before disabling
 	  itself if the CPU in question is not capable of rate rounding.
 
-	  For details, take a look at <file:Documentation/cpu-freq>.
-
 	  If unsure, say N.
 
 endif
diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index e65e0a43be64..a5c5f70acfc9 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -19,6 +19,16 @@ config ACPI_CPPC_CPUFREQ
 
 	  If in doubt, say N.
 
+config ACPI_CPPC_CPUFREQ_FIE
+	bool "Frequency Invariance support for CPPC cpufreq driver"
+	depends on ACPI_CPPC_CPUFREQ && GENERIC_ARCH_TOPOLOGY
+	default y
+	help
+	  This extends frequency invariance support in the CPPC cpufreq driver,
+	  by using CPPC delivered and reference performance counters.
+
+	  If in doubt, say N.
+
 config ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM
 	tristate "Allwinner nvmem based SUN50I CPUFreq driver"
 	depends on ARCH_SUNXI
diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c
index b4af4094309b..3fc98a3ffd91 100644
--- a/drivers/cpufreq/armada-37xx-cpufreq.c
+++ b/drivers/cpufreq/armada-37xx-cpufreq.c
@@ -25,6 +25,10 @@
 
 #include "cpufreq-dt.h"
 
+/* Clk register set */
+#define ARMADA_37XX_CLK_TBG_SEL 0
+#define ARMADA_37XX_CLK_TBG_SEL_CPU_OFF 22
+
 /* Power management in North Bridge register set */
 #define ARMADA_37XX_NB_L0L1 0x18
 #define ARMADA_37XX_NB_L2L3 0x1C
@@ -69,6 +73,8 @@
 #define LOAD_LEVEL_NR 4
 
 #define MIN_VOLT_MV 1000
+#define MIN_VOLT_MV_FOR_L1_1000MHZ 1108
+#define MIN_VOLT_MV_FOR_L1_1200MHZ 1155
 
 /* AVS value for the corresponding voltage (in mV) */
 static int avs_map[] = {
@@ -80,6 +86,8 @@ static int avs_map[] = {
 };
 
 struct armada37xx_cpufreq_state {
+	struct platform_device *pdev;
+	struct device *cpu_dev;
 	struct regmap *regmap;
 	u32 nb_l0l1;
 	u32 nb_l2l3;
@@ -120,10 +128,15 @@ static struct armada_37xx_dvfs *armada_37xx_cpu_freq_info_get(u32 freq)
  * will be configured then the DVFS will be enabled.
  */
 static void __init armada37xx_cpufreq_dvfs_setup(struct regmap *base,
-						 struct clk *clk, u8 *divider)
+						 struct regmap *clk_base, u8 *divider)
 {
+	u32 cpu_tbg_sel;
 	int load_lvl;
-	struct clk *parent;
+
+	/* Determine to which TBG clock is CPU connected */
+	regmap_read(clk_base, ARMADA_37XX_CLK_TBG_SEL, &cpu_tbg_sel);
+	cpu_tbg_sel >>= ARMADA_37XX_CLK_TBG_SEL_CPU_OFF;
+	cpu_tbg_sel &= ARMADA_37XX_NB_TBG_SEL_MASK;
 
 	for (load_lvl = 0; load_lvl < LOAD_LEVEL_NR; load_lvl++) {
 		unsigned int reg, mask, val, offset = 0;
@@ -142,6 +155,11 @@ static void __init armada37xx_cpufreq_dvfs_setup(struct regmap *base,
 		mask = (ARMADA_37XX_NB_CLK_SEL_MASK
 			<< ARMADA_37XX_NB_CLK_SEL_OFF);
 
+		/* Set TBG index, for all levels we use the same TBG */
+		val = cpu_tbg_sel << ARMADA_37XX_NB_TBG_SEL_OFF;
+		mask = (ARMADA_37XX_NB_TBG_SEL_MASK
+			<< ARMADA_37XX_NB_TBG_SEL_OFF);
+
 		/*
 		 * Set cpu divider based on the pre-computed array in
 		 * order to have balanced step.
@@ -160,14 +178,6 @@ static void __init armada37xx_cpufreq_dvfs_setup(struct regmap *base,
 
 		regmap_update_bits(base, reg, mask, val);
 	}
-
-	/*
-	 * Set cpu clock source, for all the level we keep the same
-	 * clock source that the one already configured. For this one
-	 * we need to use the clock framework
-	 */
-	parent = clk_get_parent(clk);
-	clk_set_parent(clk, parent);
 }
 
 /*
@@ -202,6 +212,8 @@ static u32 armada_37xx_avs_val_match(int target_vm)
  * - L2 & L3 voltage should be about 150mv smaller than L0 voltage.
  * This function calculates L1 & L2 & L3 AVS values dynamically based
  * on L0 voltage and fill all AVS values to the AVS value table.
+ * When base CPU frequency is 1000 or 1200 MHz then there is additional
+ * minimal avs value for load L1.
  */
 static void __init armada37xx_cpufreq_avs_configure(struct regmap *base,
 						struct armada_37xx_dvfs *dvfs)
@@ -233,6 +245,19 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base,
 		for (load_level = 1; load_level < LOAD_LEVEL_NR; load_level++)
 			dvfs->avs[load_level] = avs_min;
 
+		/*
+		 * Set the avs values for load L0 and L1 when base CPU frequency
+		 * is 1000/1200 MHz to its typical initial values according to
+		 * the Armada 3700 Hardware Specifications.
+		 */
+		if (dvfs->cpu_freq_max >= 1000*1000*1000) {
+			if (dvfs->cpu_freq_max >= 1200*1000*1000)
+				avs_min = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1200MHZ);
+			else
+				avs_min = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1000MHZ);
+			dvfs->avs[0] = dvfs->avs[1] = avs_min;
+		}
+
 		return;
 	}
 
@@ -252,6 +277,26 @@ static void __init armada37xx_cpufreq_avs_configure(struct regmap *base,
 	target_vm = avs_map[l0_vdd_min] - 150;
 	target_vm = target_vm > MIN_VOLT_MV ? target_vm : MIN_VOLT_MV;
 	dvfs->avs[2] = dvfs->avs[3] = armada_37xx_avs_val_match(target_vm);
+
+	/*
+	 * Fix the avs value for load L1 when base CPU frequency is 1000/1200 MHz,
+	 * otherwise the CPU gets stuck when switching from load L1 to load L0.
+	 * Also ensure that avs value for load L1 is not higher than for L0.
+	 */
+	if (dvfs->cpu_freq_max >= 1000*1000*1000) {
+		u32 avs_min_l1;
+
+		if (dvfs->cpu_freq_max >= 1200*1000*1000)
+			avs_min_l1 = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1200MHZ);
+		else
+			avs_min_l1 = armada_37xx_avs_val_match(MIN_VOLT_MV_FOR_L1_1000MHZ);
+
+		if (avs_min_l1 > dvfs->avs[0])
+			avs_min_l1 = dvfs->avs[0];
+
+		if (dvfs->avs[1] < avs_min_l1)
+			dvfs->avs[1] = avs_min_l1;
+	}
 }
 
 static void __init armada37xx_cpufreq_avs_setup(struct regmap *base,
@@ -357,12 +402,17 @@ static int __init armada37xx_cpufreq_driver_init(void)
 	struct armada_37xx_dvfs *dvfs;
 	struct platform_device *pdev;
 	unsigned long freq;
-	unsigned int cur_frequency, base_frequency;
-	struct regmap *nb_pm_base, *avs_base;
+	unsigned int base_frequency;
+	struct regmap *nb_clk_base, *nb_pm_base, *avs_base;
 	struct device *cpu_dev;
 	int load_lvl, ret;
 	struct clk *clk, *parent;
 
+	nb_clk_base =
+		syscon_regmap_lookup_by_compatible("marvell,armada-3700-periph-clock-nb");
+	if (IS_ERR(nb_clk_base))
+		return -ENODEV;
+
 	nb_pm_base =
 		syscon_regmap_lookup_by_compatible("marvell,armada-3700-nb-pm");
 
@@ -413,15 +463,7 @@ static int __init armada37xx_cpufreq_driver_init(void)
 		return -EINVAL;
 	}
 
-	/* Get nominal (current) CPU frequency */
-	cur_frequency = clk_get_rate(clk);
-	if (!cur_frequency) {
-		dev_err(cpu_dev, "Failed to get clock rate for CPU\n");
-		clk_put(clk);
-		return -EINVAL;
-	}
-
-	dvfs = armada_37xx_cpu_freq_info_get(cur_frequency);
+	dvfs = armada_37xx_cpu_freq_info_get(base_frequency);
 	if (!dvfs) {
 		clk_put(clk);
 		return -EINVAL;
@@ -439,7 +481,7 @@ static int __init armada37xx_cpufreq_driver_init(void)
 	armada37xx_cpufreq_avs_configure(avs_base, dvfs);
 	armada37xx_cpufreq_avs_setup(avs_base, dvfs);
 
-	armada37xx_cpufreq_dvfs_setup(nb_pm_base, clk, dvfs->divider);
+	armada37xx_cpufreq_dvfs_setup(nb_pm_base, nb_clk_base, dvfs->divider);
 	clk_put(clk);
 
 	for (load_lvl = ARMADA_37XX_DVFS_LOAD_0; load_lvl < LOAD_LEVEL_NR;
@@ -466,6 +508,9 @@ static int __init armada37xx_cpufreq_driver_init(void)
 	if (ret)
 		goto disable_dvfs;
 
+	armada37xx_cpufreq_state->cpu_dev = cpu_dev;
+	armada37xx_cpufreq_state->pdev = pdev;
+	platform_set_drvdata(pdev, dvfs);
 	return 0;
 
 disable_dvfs:
@@ -473,7 +518,7 @@ disable_dvfs:
 remove_opp:
 	/* clean-up the already added opp before leaving */
 	while (load_lvl-- > ARMADA_37XX_DVFS_LOAD_0) {
-		freq = cur_frequency / dvfs->divider[load_lvl];
+		freq = base_frequency / dvfs->divider[load_lvl];
 		dev_pm_opp_remove(cpu_dev, freq);
 	}
 
@@ -484,6 +529,26 @@ remove_opp:
 /* late_initcall, to guarantee the driver is loaded after A37xx clock driver */
 late_initcall(armada37xx_cpufreq_driver_init);
 
+static void __exit armada37xx_cpufreq_driver_exit(void)
+{
+	struct platform_device *pdev = armada37xx_cpufreq_state->pdev;
+	struct armada_37xx_dvfs *dvfs = platform_get_drvdata(pdev);
+	unsigned long freq;
+	int load_lvl;
+
+	platform_device_unregister(pdev);
+
+	armada37xx_cpufreq_disable_dvfs(armada37xx_cpufreq_state->regmap);
+
+	for (load_lvl = ARMADA_37XX_DVFS_LOAD_0; load_lvl < LOAD_LEVEL_NR; load_lvl++) {
+		freq = dvfs->cpu_freq_max / dvfs->divider[load_lvl];
+		dev_pm_opp_remove(armada37xx_cpufreq_state->cpu_dev, freq);
+	}
+
+	kfree(armada37xx_cpufreq_state);
+}
+module_exit(armada37xx_cpufreq_driver_exit);
+
 static const struct of_device_id __maybe_unused armada37xx_cpufreq_of_match[] = {
 	{ .compatible = "marvell,armada-3700-nb-pm" },
 	{ },
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 8a482c434ea6..3848b4c222e1 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -10,14 +10,18 @@
 
 #define pr_fmt(fmt) "CPPC Cpufreq:" fmt
 
+#include <linux/arch_topology.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/delay.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
 #include <linux/dmi.h>
+#include <linux/irq_work.h>
+#include <linux/kthread.h>
 #include <linux/time.h>
 #include <linux/vmalloc.h>
+#include <uapi/linux/sched/types.h>
 
 #include <asm/unaligned.h>
 
@@ -57,6 +61,204 @@ static struct cppc_workaround_oem_info wa_info[] = {
 	}
 };
 
+#ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
+
+/* Frequency invariance support */
+struct cppc_freq_invariance {
+	int cpu;
+	struct irq_work irq_work;
+	struct kthread_work work;
+	struct cppc_perf_fb_ctrs prev_perf_fb_ctrs;
+	struct cppc_cpudata *cpu_data;
+};
+
+static DEFINE_PER_CPU(struct cppc_freq_invariance, cppc_freq_inv);
+static struct kthread_worker *kworker_fie;
+static bool fie_disabled;
+
+static struct cpufreq_driver cppc_cpufreq_driver;
+static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu);
+static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data,
+				 struct cppc_perf_fb_ctrs fb_ctrs_t0,
+				 struct cppc_perf_fb_ctrs fb_ctrs_t1);
+
+/**
+ * cppc_scale_freq_workfn - CPPC arch_freq_scale updater for frequency invariance
+ * @work: The work item.
+ *
+ * The CPPC driver register itself with the topology core to provide its own
+ * implementation (cppc_scale_freq_tick()) of topology_scale_freq_tick() which
+ * gets called by the scheduler on every tick.
+ *
+ * Note that the arch specific counters have higher priority than CPPC counters,
+ * if available, though the CPPC driver doesn't need to have any special
+ * handling for that.
+ *
+ * On an invocation of cppc_scale_freq_tick(), we schedule an irq work (since we
+ * reach here from hard-irq context), which then schedules a normal work item
+ * and cppc_scale_freq_workfn() updates the per_cpu arch_freq_scale variable
+ * based on the counter updates since the last tick.
+ */ +static void cppc_scale_freq_workfn(struct kthread_work *work) +{ + struct cppc_freq_invariance *cppc_fi; + struct cppc_perf_fb_ctrs fb_ctrs = {0}; + struct cppc_cpudata *cpu_data; + unsigned long local_freq_scale; + u64 perf; + + cppc_fi = container_of(work, struct cppc_freq_invariance, work); + cpu_data = cppc_fi->cpu_data; + + if (cppc_get_perf_ctrs(cppc_fi->cpu, &fb_ctrs)) { + pr_warn("%s: failed to read perf counters\n", __func__); + return; + } + + perf = cppc_perf_from_fbctrs(cpu_data, cppc_fi->prev_perf_fb_ctrs, + fb_ctrs); + cppc_fi->prev_perf_fb_ctrs = fb_ctrs; + + perf <<= SCHED_CAPACITY_SHIFT; + local_freq_scale = div64_u64(perf, cpu_data->perf_caps.highest_perf); + if (WARN_ON(local_freq_scale > 1024)) + local_freq_scale = 1024; + + per_cpu(arch_freq_scale, cppc_fi->cpu) = local_freq_scale; +} + +static void cppc_irq_work(struct irq_work *irq_work) +{ + struct cppc_freq_invariance *cppc_fi; + + cppc_fi = container_of(irq_work, struct cppc_freq_invariance, irq_work); + kthread_queue_work(kworker_fie, &cppc_fi->work); +} + +static void cppc_scale_freq_tick(void) +{ + struct cppc_freq_invariance *cppc_fi = &per_cpu(cppc_freq_inv, smp_processor_id()); + + /* + * cppc_get_perf_ctrs() can potentially sleep, call that from the right + * context. 
+ */ + irq_work_queue(&cppc_fi->irq_work); +} + +static struct scale_freq_data cppc_sftd = { + .source = SCALE_FREQ_SOURCE_CPPC, + .set_freq_scale = cppc_scale_freq_tick, +}; + +static void cppc_freq_invariance_policy_init(struct cpufreq_policy *policy, + struct cppc_cpudata *cpu_data) +{ + struct cppc_perf_fb_ctrs fb_ctrs = {0}; + struct cppc_freq_invariance *cppc_fi; + int i, ret; + + if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate) + return; + + if (fie_disabled) + return; + + for_each_cpu(i, policy->cpus) { + cppc_fi = &per_cpu(cppc_freq_inv, i); + cppc_fi->cpu = i; + cppc_fi->cpu_data = cpu_data; + kthread_init_work(&cppc_fi->work, cppc_scale_freq_workfn); + init_irq_work(&cppc_fi->irq_work, cppc_irq_work); + + ret = cppc_get_perf_ctrs(i, &fb_ctrs); + if (ret) { + pr_warn("%s: failed to read perf counters: %d\n", + __func__, ret); + fie_disabled = true; + } else { + cppc_fi->prev_perf_fb_ctrs = fb_ctrs; + } + } +} + +static void __init cppc_freq_invariance_init(void) +{ + struct sched_attr attr = { + .size = sizeof(struct sched_attr), + .sched_policy = SCHED_DEADLINE, + .sched_nice = 0, + .sched_priority = 0, + /* + * Fake (unused) bandwidth; workaround to "fix" + * priority inheritance. 
+ */ + .sched_runtime = 1000000, + .sched_deadline = 10000000, + .sched_period = 10000000, + }; + int ret; + + if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate) + return; + + if (fie_disabled) + return; + + kworker_fie = kthread_create_worker(0, "cppc_fie"); + if (IS_ERR(kworker_fie)) + return; + + ret = sched_setattr_nocheck(kworker_fie->task, &attr); + if (ret) { + pr_warn("%s: failed to set SCHED_DEADLINE: %d\n", __func__, + ret); + kthread_destroy_worker(kworker_fie); + return; + } + + /* Register for freq-invariance */ + topology_set_scale_freq_source(&cppc_sftd, cpu_present_mask); +} + +static void cppc_freq_invariance_exit(void) +{ + struct cppc_freq_invariance *cppc_fi; + int i; + + if (cppc_cpufreq_driver.get == hisi_cppc_cpufreq_get_rate) + return; + + if (fie_disabled) + return; + + topology_clear_scale_freq_source(SCALE_FREQ_SOURCE_CPPC, cpu_present_mask); + + for_each_possible_cpu(i) { + cppc_fi = &per_cpu(cppc_freq_inv, i); + irq_work_sync(&cppc_fi->irq_work); + } + + kthread_destroy_worker(kworker_fie); + kworker_fie = NULL; +} + +#else +static inline void +cppc_freq_invariance_policy_init(struct cpufreq_policy *policy, + struct cppc_cpudata *cpu_data) +{ +} + +static inline void cppc_freq_invariance_init(void) +{ +} + +static inline void cppc_freq_invariance_exit(void) +{ +} +#endif /* CONFIG_ACPI_CPPC_CPUFREQ_FIE */ + /* Callback function used to retrieve the max frequency from DMI */ static void cppc_find_dmi_mhz(const struct dmi_header *dm, void *private) { @@ -216,26 +418,16 @@ static unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu) { unsigned long implementor = read_cpuid_implementor(); unsigned long part_num = read_cpuid_part_number(); - unsigned int delay_us = 0; switch (implementor) { case ARM_CPU_IMP_QCOM: switch (part_num) { case QCOM_CPU_PART_FALKOR_V1: case QCOM_CPU_PART_FALKOR: - delay_us = 10000; - break; - default: - delay_us = cppc_get_transition_latency(cpu) / NSEC_PER_USEC; - break; + return 10000; } - 
break; - default: - delay_us = cppc_get_transition_latency(cpu) / NSEC_PER_USEC; - break; } - - return delay_us; + return cppc_get_transition_latency(cpu) / NSEC_PER_USEC; } #else @@ -355,9 +547,12 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy) cpu_data->perf_ctrls.desired_perf = caps->highest_perf; ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls); - if (ret) + if (ret) { pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n", caps->highest_perf, cpu, ret); + } else { + cppc_freq_invariance_policy_init(policy, cpu_data); + } return ret; } @@ -370,12 +565,12 @@ static inline u64 get_delta(u64 t1, u64 t0) return (u32)t1 - (u32)t0; } -static int cppc_get_rate_from_fbctrs(struct cppc_cpudata *cpu_data, - struct cppc_perf_fb_ctrs fb_ctrs_t0, - struct cppc_perf_fb_ctrs fb_ctrs_t1) +static int cppc_perf_from_fbctrs(struct cppc_cpudata *cpu_data, + struct cppc_perf_fb_ctrs fb_ctrs_t0, + struct cppc_perf_fb_ctrs fb_ctrs_t1) { u64 delta_reference, delta_delivered; - u64 reference_perf, delivered_perf; + u64 reference_perf; reference_perf = fb_ctrs_t0.reference_perf; @@ -384,12 +579,21 @@ static int cppc_get_rate_from_fbctrs(struct cppc_cpudata *cpu_data, delta_delivered = get_delta(fb_ctrs_t1.delivered, fb_ctrs_t0.delivered); - /* Check to avoid divide-by zero */ - if (delta_reference || delta_delivered) - delivered_perf = (reference_perf * delta_delivered) / - delta_reference; - else - delivered_perf = cpu_data->perf_ctrls.desired_perf; + /* Check to avoid divide-by zero and invalid delivered_perf */ + if (!delta_reference || !delta_delivered) + return cpu_data->perf_ctrls.desired_perf; + + return (reference_perf * delta_delivered) / delta_reference; +} + +static int cppc_get_rate_from_fbctrs(struct cppc_cpudata *cpu_data, + struct cppc_perf_fb_ctrs fb_ctrs_t0, + struct cppc_perf_fb_ctrs fb_ctrs_t1) +{ + u64 delivered_perf; + + delivered_perf = cppc_perf_from_fbctrs(cpu_data, fb_ctrs_t0, + fb_ctrs_t1); return cppc_cpufreq_perf_to_khz(cpu_data, 
delivered_perf); } @@ -514,6 +718,8 @@ static void cppc_check_hisi_workaround(void) static int __init cppc_cpufreq_init(void) { + int ret; + if ((acpi_disabled) || !acpi_cpc_valid()) return -ENODEV; @@ -521,7 +727,11 @@ static int __init cppc_cpufreq_init(void) cppc_check_hisi_workaround(); - return cpufreq_register_driver(&cppc_cpufreq_driver); + ret = cpufreq_register_driver(&cppc_cpufreq_driver); + if (!ret) + cppc_freq_invariance_init(); + + return ret; } static inline void free_cpu_data(void) @@ -538,6 +748,7 @@ static inline void free_cpu_data(void) static void __exit cppc_cpufreq_exit(void) { + cppc_freq_invariance_exit(); cpufreq_unregister_driver(&cppc_cpufreq_driver); free_cpu_data(); diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c index b1e1bdc63b01..ece52863ba62 100644 --- a/drivers/cpufreq/cpufreq-dt.c +++ b/drivers/cpufreq/cpufreq-dt.c @@ -255,10 +255,15 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu) * before updating priv->cpus. Otherwise, we will end up creating * duplicate OPPs for the CPUs. * - * OPPs might be populated at runtime, don't check for error here. + * OPPs might be populated at runtime, don't fail for error here unless + * it is -EPROBE_DEFER. 
*/ - if (!dev_pm_opp_of_cpumask_add_table(priv->cpus)) + ret = dev_pm_opp_of_cpumask_add_table(priv->cpus); + if (!ret) { priv->have_static_opps = true; + } else if (ret == -EPROBE_DEFER) { + goto out; + } /* * The OPP table must be initialized, statically or dynamically, by this diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 1d1b563cea4b..802abc925b2a 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -42,9 +42,6 @@ static LIST_HEAD(cpufreq_policy_list); #define for_each_inactive_policy(__policy) \ for_each_suitable_policy(__policy, false) -#define for_each_policy(__policy) \ - list_for_each_entry(__policy, &cpufreq_policy_list, policy_list) - /* Iterate over governors */ static LIST_HEAD(cpufreq_governor_list); #define for_each_governor(__governor) \ diff --git a/drivers/cpufreq/ia64-acpi-cpufreq.c b/drivers/cpufreq/ia64-acpi-cpufreq.c index 2efe7189ccc4..c6bdc455517f 100644 --- a/drivers/cpufreq/ia64-acpi-cpufreq.c +++ b/drivers/cpufreq/ia64-acpi-cpufreq.c @@ -54,7 +54,7 @@ processor_set_pstate ( retval = ia64_pal_set_pstate((u64)value); if (retval) { - pr_debug("Failed to set freq to 0x%x, with error 0x%lx\n", + pr_debug("Failed to set freq to 0x%x, with error 0x%llx\n", value, retval); return -ENODEV; } @@ -77,7 +77,7 @@ processor_get_pstate ( if (retval) pr_debug("Failed to get current freq with " - "error 0x%lx, idx 0x%x\n", retval, *value); + "error 0x%llx, idx 0x%x\n", retval, *value); return (int)retval; } diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index 5175ae3cac44..f0401064d7aa 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -819,19 +819,21 @@ static struct freq_attr *hwp_cpufreq_attrs[] = { NULL, }; -static void intel_pstate_get_hwp_max(struct cpudata *cpu, int *phy_max, - int *current_max) +static void __intel_pstate_get_hwp_cap(struct cpudata *cpu) { u64 cap; rdmsrl_on_cpu(cpu->cpu, MSR_HWP_CAPABILITIES, &cap); 
WRITE_ONCE(cpu->hwp_cap_cached, cap); - if (global.no_turbo || global.turbo_disabled) - *current_max = HWP_GUARANTEED_PERF(cap); - else - *current_max = HWP_HIGHEST_PERF(cap); + cpu->pstate.max_pstate = HWP_GUARANTEED_PERF(cap); + cpu->pstate.turbo_pstate = HWP_HIGHEST_PERF(cap); +} - *phy_max = HWP_HIGHEST_PERF(cap); +static void intel_pstate_get_hwp_cap(struct cpudata *cpu) +{ + __intel_pstate_get_hwp_cap(cpu); + cpu->pstate.max_freq = cpu->pstate.max_pstate * cpu->pstate.scaling; + cpu->pstate.turbo_freq = cpu->pstate.turbo_pstate * cpu->pstate.scaling; } static void intel_pstate_hwp_set(unsigned int cpu) @@ -1195,12 +1197,13 @@ static ssize_t store_no_turbo(struct kobject *a, struct kobj_attribute *b, static void update_qos_request(enum freq_qos_req_type type) { - int max_state, turbo_max, freq, i, perf_pct; struct freq_qos_request *req; struct cpufreq_policy *policy; + int i; for_each_possible_cpu(i) { struct cpudata *cpu = all_cpu_data[i]; + unsigned int freq, perf_pct; policy = cpufreq_cpu_get(i); if (!policy) @@ -1213,9 +1216,7 @@ static void update_qos_request(enum freq_qos_req_type type) continue; if (hwp_active) - intel_pstate_get_hwp_max(cpu, &turbo_max, &max_state); - else - turbo_max = cpu->pstate.turbo_pstate; + intel_pstate_get_hwp_cap(cpu); if (type == FREQ_QOS_MIN) { perf_pct = global.min_perf_pct; @@ -1224,8 +1225,7 @@ static void update_qos_request(enum freq_qos_req_type type) perf_pct = global.max_perf_pct; } - freq = DIV_ROUND_UP(turbo_max * perf_pct, 100); - freq *= cpu->pstate.scaling; + freq = DIV_ROUND_UP(cpu->pstate.turbo_freq * perf_pct, 100); if (freq_qos_update_request(req, freq) < 0) pr_warn("Failed to update freq constraint: CPU%d\n", i); @@ -1715,21 +1715,17 @@ static void intel_pstate_get_cpu_pstates(struct cpudata *cpu) { cpu->pstate.min_pstate = pstate_funcs.get_min(); cpu->pstate.max_pstate_physical = pstate_funcs.get_max_physical(); - cpu->pstate.turbo_pstate = pstate_funcs.get_turbo(); cpu->pstate.scaling = 
pstate_funcs.get_scaling(); if (hwp_active && !hwp_mode_bdw) { - unsigned int phy_max, current_max; - - intel_pstate_get_hwp_max(cpu, &phy_max, &current_max); - cpu->pstate.turbo_freq = phy_max * cpu->pstate.scaling; - cpu->pstate.turbo_pstate = phy_max; - cpu->pstate.max_pstate = HWP_GUARANTEED_PERF(READ_ONCE(cpu->hwp_cap_cached)); + __intel_pstate_get_hwp_cap(cpu); } else { - cpu->pstate.turbo_freq = cpu->pstate.turbo_pstate * cpu->pstate.scaling; cpu->pstate.max_pstate = pstate_funcs.get_max(); + cpu->pstate.turbo_pstate = pstate_funcs.get_turbo(); } + cpu->pstate.max_freq = cpu->pstate.max_pstate * cpu->pstate.scaling; + cpu->pstate.turbo_freq = cpu->pstate.turbo_pstate * cpu->pstate.scaling; if (pstate_funcs.get_aperf_mperf_shift) cpu->aperf_mperf_shift = pstate_funcs.get_aperf_mperf_shift(); @@ -2199,41 +2195,34 @@ static void intel_pstate_update_perf_limits(struct cpudata *cpu, unsigned int policy_min, unsigned int policy_max) { + int scaling = cpu->pstate.scaling; int32_t max_policy_perf, min_policy_perf; - int max_state, turbo_max; - int max_freq; /* - * HWP needs some special consideration, because on BDX the - * HWP_REQUEST uses abstract value to represent performance - * rather than pure ratios. + * HWP needs some special consideration, because HWP_REQUEST uses + * abstract values to represent performance rather than pure ratios. */ - if (hwp_active) { - intel_pstate_get_hwp_max(cpu, &turbo_max, &max_state); - } else { - max_state = global.no_turbo || global.turbo_disabled ? 
- cpu->pstate.max_pstate : cpu->pstate.turbo_pstate; - turbo_max = cpu->pstate.turbo_pstate; - } - max_freq = max_state * cpu->pstate.scaling; + if (hwp_active) + intel_pstate_get_hwp_cap(cpu); - max_policy_perf = max_state * policy_max / max_freq; + max_policy_perf = policy_max / scaling; if (policy_max == policy_min) { min_policy_perf = max_policy_perf; } else { - min_policy_perf = max_state * policy_min / max_freq; + min_policy_perf = policy_min / scaling; min_policy_perf = clamp_t(int32_t, min_policy_perf, 0, max_policy_perf); } - pr_debug("cpu:%d max_state %d min_policy_perf:%d max_policy_perf:%d\n", - cpu->cpu, max_state, min_policy_perf, max_policy_perf); + pr_debug("cpu:%d min_policy_perf:%d max_policy_perf:%d\n", + cpu->cpu, min_policy_perf, max_policy_perf); /* Normalize user input to [min_perf, max_perf] */ if (per_cpu_limits) { cpu->min_perf_ratio = min_policy_perf; cpu->max_perf_ratio = max_policy_perf; } else { + int turbo_max = cpu->pstate.turbo_pstate; int32_t global_min, global_max; /* Global limits are in percent of the maximum turbo P-state. */ @@ -2322,10 +2311,9 @@ static void intel_pstate_verify_cpu_policy(struct cpudata *cpu, update_turbo_state(); if (hwp_active) { - int max_state, turbo_max; - - intel_pstate_get_hwp_max(cpu, &turbo_max, &max_state); - max_freq = max_state * cpu->pstate.scaling; + intel_pstate_get_hwp_cap(cpu); + max_freq = global.no_turbo || global.turbo_disabled ? 
+ cpu->pstate.max_freq : cpu->pstate.turbo_freq; } else { max_freq = intel_pstate_get_max_freq(cpu); } @@ -2416,25 +2404,15 @@ static int __intel_pstate_cpu_init(struct cpufreq_policy *policy) cpu->max_perf_ratio = 0xFF; cpu->min_perf_ratio = 0; - policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling; - policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling; - /* cpuinfo and default policy values */ policy->cpuinfo.min_freq = cpu->pstate.min_pstate * cpu->pstate.scaling; update_turbo_state(); global.turbo_disabled_mf = global.turbo_disabled; policy->cpuinfo.max_freq = global.turbo_disabled ? - cpu->pstate.max_pstate : cpu->pstate.turbo_pstate; - policy->cpuinfo.max_freq *= cpu->pstate.scaling; - - if (hwp_active) { - unsigned int max_freq; - - max_freq = global.turbo_disabled ? - cpu->pstate.max_freq : cpu->pstate.turbo_freq; - if (max_freq < policy->cpuinfo.max_freq) - policy->cpuinfo.max_freq = max_freq; - } + cpu->pstate.max_freq : cpu->pstate.turbo_freq; + + policy->min = policy->cpuinfo.min_freq; + policy->max = policy->cpuinfo.max_freq; intel_pstate_init_acpi_perf_limits(policy); @@ -2683,10 +2661,10 @@ static void intel_cpufreq_adjust_perf(unsigned int cpunum, static int intel_cpufreq_cpu_init(struct cpufreq_policy *policy) { - int max_state, turbo_max, min_freq, max_freq, ret; struct freq_qos_request *req; struct cpudata *cpu; struct device *dev; + int ret, freq; dev = get_cpu_device(policy->cpu); if (!dev) @@ -2711,30 +2689,31 @@ static int intel_cpufreq_cpu_init(struct cpufreq_policy *policy) if (hwp_active) { u64 value; - intel_pstate_get_hwp_max(cpu, &turbo_max, &max_state); policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY_HWP; + + intel_pstate_get_hwp_cap(cpu); + rdmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, &value); WRITE_ONCE(cpu->hwp_req_cached, value); + cpu->epp_cached = intel_pstate_get_epp(cpu, value); } else { - turbo_max = cpu->pstate.turbo_pstate; policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY; } - min_freq = DIV_ROUND_UP(turbo_max * global.min_perf_pct, 100); 
- min_freq *= cpu->pstate.scaling; - max_freq = DIV_ROUND_UP(turbo_max * global.max_perf_pct, 100); - max_freq *= cpu->pstate.scaling; + freq = DIV_ROUND_UP(cpu->pstate.turbo_freq * global.min_perf_pct, 100); ret = freq_qos_add_request(&policy->constraints, req, FREQ_QOS_MIN, - min_freq); + freq); if (ret < 0) { dev_err(dev, "Failed to add min-freq constraint (%d)\n", ret); goto free_req; } + freq = DIV_ROUND_UP(cpu->pstate.turbo_freq * global.max_perf_pct, 100); + ret = freq_qos_add_request(&policy->constraints, req + 1, FREQ_QOS_MAX, - max_freq); + freq); if (ret < 0) { dev_err(dev, "Failed to add max-freq constraint (%d)\n", ret); goto remove_min_req; diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c index 69786e5bbf05..ad7d4f272ddc 100644 --- a/drivers/cpufreq/s5pv210-cpufreq.c +++ b/drivers/cpufreq/s5pv210-cpufreq.c @@ -91,7 +91,7 @@ static DEFINE_MUTEX(set_freq_lock); /* Use 800MHz when entering sleep mode */ #define SLEEP_FREQ (800 * 1000) -/* Tracks if cpu freqency can be updated anymore */ +/* Tracks if CPU frequency can be updated anymore */ static bool no_cpufreq_access; /* @@ -190,7 +190,7 @@ static u32 clkdiv_val[5][11] = { /* * This function set DRAM refresh counter - * accoriding to operating frequency of DRAM + * according to operating frequency of DRAM * ch: DMC port number 0 or 1 * freq: Operating frequency of DRAM(KHz) */ @@ -320,7 +320,7 @@ static int s5pv210_target(struct cpufreq_policy *policy, unsigned int index) /* * 3. DMC1 refresh count for 133Mhz if (index == L4) is - * true refresh counter is already programed in upper + * true refresh counter is already programmed in upper * code. 0x287@83Mhz */ if (!bus_speed_changing) @@ -378,7 +378,7 @@ static int s5pv210_target(struct cpufreq_policy *policy, unsigned int index) /* * 6. Turn on APLL * 6-1. Set PMS values - * 6-2. Wait untile the PLL is locked + * 6-2. 
Wait until the PLL is locked */ if (index == L0) writel_relaxed(APLL_VAL_1000, S5P_APLL_CON); @@ -390,7 +390,7 @@ static int s5pv210_target(struct cpufreq_policy *policy, unsigned int index) } while (!(reg & (0x1 << 29))); /* - * 7. Change souce clock from SCLKMPLL(667Mhz) + * 7. Change source clock from SCLKMPLL(667Mhz) * to SCLKA2M(200Mhz) in MFC_MUX and G3D MUX * (667/4=166)->(200/4=50)Mhz */ @@ -439,8 +439,8 @@ static int s5pv210_target(struct cpufreq_policy *policy, unsigned int index) } /* - * L4 level need to change memory bus speed, hence onedram clock divier - * and memory refresh parameter should be changed + * L4 level needs to change memory bus speed, hence ONEDRAM clock + * divider and memory refresh parameter should be changed */ if (bus_speed_changing) { reg = readl_relaxed(S5P_CLK_DIV6); diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm index 0844fadc4be8..334f83e56120 100644 --- a/drivers/cpuidle/Kconfig.arm +++ b/drivers/cpuidle/Kconfig.arm @@ -107,7 +107,7 @@ config ARM_TEGRA_CPUIDLE config ARM_QCOM_SPM_CPUIDLE bool "CPU Idle Driver for Qualcomm Subsystem Power Manager (SPM)" - depends on (ARCH_QCOM || COMPILE_TEST) && !ARM64 + depends on (ARCH_QCOM || COMPILE_TEST) && !ARM64 && MMU select ARM_CPU_SUSPEND select CPU_IDLE_MULTIPLE_DRIVERS select DT_IDLE_STATES diff --git a/drivers/cpuidle/cpuidle-tegra.c b/drivers/cpuidle/cpuidle-tegra.c index 191966dc8d02..508bd9f23792 100644 --- a/drivers/cpuidle/cpuidle-tegra.c +++ b/drivers/cpuidle/cpuidle-tegra.c @@ -48,11 +48,6 @@ enum tegra_state { static atomic_t tegra_idle_barrier; static atomic_t tegra_abort_flag; -static inline bool tegra_cpuidle_using_firmware(void) -{ - return firmware_ops->prepare_idle && firmware_ops->do_idle; -} - static void tegra_cpuidle_report_cpus_state(void) { unsigned long cpu, lcpu, csr; @@ -135,13 +130,9 @@ static int tegra_cpuidle_c7_enter(void) { int err; - if (tegra_cpuidle_using_firmware()) { - err = call_firmware_op(prepare_idle, 
TF_PM_MODE_LP2_NOFLUSH_L2); - if (err) - return err; - - return call_firmware_op(do_idle, 0); - } + err = call_firmware_op(prepare_idle, TF_PM_MODE_LP2_NOFLUSH_L2); + if (err && err != -ENOSYS) + return err; return cpu_suspend(0, tegra30_pm_secondary_cpu_suspend); } @@ -356,9 +347,7 @@ static int tegra_cpuidle_probe(struct platform_device *pdev) * is disabled. */ if (!IS_ENABLED(CONFIG_PM_SLEEP)) { - if (!tegra_cpuidle_using_firmware()) - tegra_cpuidle_disable_state(TEGRA_C7); - + tegra_cpuidle_disable_state(TEGRA_C7); tegra_cpuidle_disable_state(TEGRA_CC6); } diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index 4070e573bf43..f70aa17e2a8e 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -181,9 +181,13 @@ static void __cpuidle_driver_init(struct cpuidle_driver *drv) */ if (s->target_residency > 0) s->target_residency_ns = s->target_residency * NSEC_PER_USEC; + else if (s->target_residency_ns < 0) + s->target_residency_ns = 0; if (s->exit_latency > 0) s->exit_latency_ns = s->exit_latency * NSEC_PER_USEC; + else if (s->exit_latency_ns < 0) + s->exit_latency_ns = 0; } } diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index b0a7ad566081..c3aa8d6ccee3 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -271,7 +271,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, u64 predicted_ns; u64 interactivity_req; unsigned long nr_iowaiters; - ktime_t delta_next; + ktime_t delta, delta_tick; int i, idx; if (data->needs_update) { @@ -280,7 +280,12 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, } /* determine the expected residency time, round up */ - data->next_timer_ns = tick_nohz_get_sleep_length(&delta_next); + delta = tick_nohz_get_sleep_length(&delta_tick); + if (unlikely(delta < 0)) { + delta = 0; + delta_tick = 0; + } + data->next_timer_ns = delta; nr_iowaiters = nr_iowait_cpu(dev->cpu); data->bucket 
= which_bucket(data->next_timer_ns, nr_iowaiters); @@ -318,7 +323,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * state selection. */ if (predicted_ns < TICK_NSEC) - predicted_ns = delta_next; + predicted_ns = data->next_timer_ns; } else { /* * Use the performance multiplier and the user-configurable @@ -377,7 +382,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * stuck in the shallow one for too long. */ if (drv->states[idx].target_residency_ns < TICK_NSEC && - s->target_residency_ns <= delta_next) + s->target_residency_ns <= delta_tick) idx = i; return idx; @@ -399,7 +404,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, predicted_ns < TICK_NSEC) && !tick_nohz_tick_stopped()) { *stop_tick = false; - if (idx > 0 && drv->states[idx].target_residency_ns > delta_next) { + if (idx > 0 && drv->states[idx].target_residency_ns > delta_tick) { /* * The tick is not going to be stopped and the target * residency of the state to be returned is not within @@ -411,7 +416,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, continue; idx = i; - if (drv->states[i].target_residency_ns <= delta_next) + if (drv->states[i].target_residency_ns <= delta_tick) break; } } diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c index 6deaaf5f05b5..ac4bb27d69b0 100644 --- a/drivers/cpuidle/governors/teo.c +++ b/drivers/cpuidle/governors/teo.c @@ -100,8 +100,8 @@ struct teo_idle_state { * @intervals: Saved idle duration values. 
*/ struct teo_cpu { - u64 time_span_ns; - u64 sleep_length_ns; + s64 time_span_ns; + s64 sleep_length_ns; struct teo_idle_state states[CPUIDLE_STATE_MAX]; int interval_idx; u64 intervals[INTERVALS]; @@ -117,7 +117,8 @@ static DEFINE_PER_CPU(struct teo_cpu, teo_cpus); static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) { struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu); - int i, idx_hit = -1, idx_timer = -1; + int i, idx_hit = 0, idx_timer = 0; + unsigned int hits, misses; u64 measured_ns; if (cpu_data->time_span_ns >= cpu_data->sleep_length_ns) { @@ -174,25 +175,22 @@ static void teo_update(struct cpuidle_driver *drv, struct cpuidle_device *dev) * also increase the "early hits" metric for the state that actually * matches the measured idle duration. */ - if (idx_timer >= 0) { - unsigned int hits = cpu_data->states[idx_timer].hits; - unsigned int misses = cpu_data->states[idx_timer].misses; - - hits -= hits >> DECAY_SHIFT; - misses -= misses >> DECAY_SHIFT; - - if (idx_timer > idx_hit) { - misses += PULSE; - if (idx_hit >= 0) - cpu_data->states[idx_hit].early_hits += PULSE; - } else { - hits += PULSE; - } + hits = cpu_data->states[idx_timer].hits; + hits -= hits >> DECAY_SHIFT; + + misses = cpu_data->states[idx_timer].misses; + misses -= misses >> DECAY_SHIFT; - cpu_data->states[idx_timer].misses = misses; - cpu_data->states[idx_timer].hits = hits; + if (idx_timer == idx_hit) { + hits += PULSE; + } else { + misses += PULSE; + cpu_data->states[idx_hit].early_hits += PULSE; } + cpu_data->states[idx_timer].misses = misses; + cpu_data->states[idx_timer].hits = hits; + /* * Save idle duration values corresponding to non-timer wakeups for * pattern detection. 
@@ -216,7 +214,7 @@ static bool teo_time_ok(u64 interval_ns) */ static int teo_find_shallower_state(struct cpuidle_driver *drv, struct cpuidle_device *dev, int state_idx, - u64 duration_ns) + s64 duration_ns) { int i; @@ -242,10 +240,10 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, { struct teo_cpu *cpu_data = per_cpu_ptr(&teo_cpus, dev->cpu); s64 latency_req = cpuidle_governor_latency_req(dev->cpu); - u64 duration_ns; + int max_early_idx, prev_max_early_idx, constraint_idx, idx0, idx, i; unsigned int hits, misses, early_hits; - int max_early_idx, prev_max_early_idx, constraint_idx, idx, i; ktime_t delta_tick; + s64 duration_ns; if (dev->last_state_idx >= 0) { teo_update(drv, dev); @@ -264,6 +262,7 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, prev_max_early_idx = -1; constraint_idx = drv->state_count; idx = -1; + idx0 = idx; for (i = 0; i < drv->state_count; i++) { struct cpuidle_state *s = &drv->states[i]; @@ -324,6 +323,7 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, idx = i; /* first enabled state */ hits = cpu_data->states[i].hits; misses = cpu_data->states[i].misses; + idx0 = i; } if (s->target_residency_ns > duration_ns) @@ -376,11 +376,16 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, if (idx < 0) { idx = 0; /* No states enabled. Must use 0. */ - } else if (idx > 0) { + } else if (idx > idx0) { unsigned int count = 0; u64 sum = 0; /* + * The target residencies of at least two different enabled idle + * states are less than or equal to the current expected idle + * duration. Try to refine the selection using the most recent + * measured idle duration values. + * * Count and sum the most recent idle duration values less than * the current expected idle duration value. 
*/ @@ -428,7 +433,8 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev, * till the closest timer including the tick, try to correct * that. */ - if (idx > 0 && drv->states[idx].target_residency_ns > delta_tick) + if (idx > idx0 && + drv->states[idx].target_residency_ns > delta_tick) idx = teo_find_shallower_state(drv, dev, idx, delta_tick); } diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig index 00704efe6398..20373a893b44 100644 --- a/drivers/devfreq/Kconfig +++ b/drivers/devfreq/Kconfig @@ -62,7 +62,7 @@ config DEVFREQ_GOV_USERSPACE help Sets the frequency at the user specified one. This governor returns the user configured frequency if there - has been an input to /sys/devices/.../power/devfreq_set_freq. + has been an input to /sys/devices/.../userspace/set_freq. Otherwise, the governor does not change the frequency given at the initialization. diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c index bf3047896e41..fe08c46642f7 100644 --- a/drivers/devfreq/devfreq.c +++ b/drivers/devfreq/devfreq.c @@ -11,6 +11,7 @@ #include <linux/kmod.h> #include <linux/sched.h> #include <linux/debugfs.h> +#include <linux/devfreq_cooling.h> #include <linux/errno.h> #include <linux/err.h> #include <linux/init.h> @@ -387,7 +388,7 @@ static int devfreq_set_target(struct devfreq *devfreq, unsigned long new_freq, devfreq->previous_freq = new_freq; if (devfreq->suspend_freq) - devfreq->resume_freq = cur_freq; + devfreq->resume_freq = new_freq; return err; } @@ -821,7 +822,8 @@ struct devfreq *devfreq_add_device(struct device *dev, if (devfreq->profile->timer < 0 || devfreq->profile->timer >= DEVFREQ_TIMER_NUM) { - goto err_out; + mutex_unlock(&devfreq->lock); + goto err_dev; } if (!devfreq->profile->max_state && !devfreq->profile->freq_table) { @@ -935,6 +937,12 @@ struct devfreq *devfreq_add_device(struct device *dev, mutex_unlock(&devfreq_list_lock); + if (devfreq->profile->is_cooling_device) { + devfreq->cdev = 
devfreq_cooling_em_register(devfreq, NULL); + if (IS_ERR(devfreq->cdev)) + devfreq->cdev = NULL; + } + return devfreq; err_init: @@ -960,6 +968,8 @@ int devfreq_remove_device(struct devfreq *devfreq) if (!devfreq) return -EINVAL; + devfreq_cooling_unregister(devfreq->cdev); + if (devfreq->governor) { devfreq->governor->event_handler(devfreq, DEVFREQ_GOV_STOP, NULL); diff --git a/drivers/devfreq/governor.h b/drivers/devfreq/governor.h index 70f44b3ca42e..2d69a0ce6291 100644 --- a/drivers/devfreq/governor.h +++ b/drivers/devfreq/governor.h @@ -57,8 +57,6 @@ * Basically, get_target_freq will run * devfreq_dev_profile.get_dev_status() to get the * status of the device (load = busy_time / total_time). - * If no_central_polling is set, this callback is called - * only with update_devfreq() notified by OPP. * @event_handler: Callback for devfreq core framework to notify events * to governors. Events include per device governor * init and exit, opp changes out of devfreq, suspend @@ -91,6 +89,9 @@ int devfreq_update_target(struct devfreq *devfreq, unsigned long freq); static inline int devfreq_update_stats(struct devfreq *df) { + if (!df->profile->get_dev_status) + return -EINVAL; + return df->profile->get_dev_status(df->dev.parent, &df->last_status); } #endif /* _GOVERNOR_H */ diff --git a/drivers/devfreq/imx-bus.c b/drivers/devfreq/imx-bus.c index 4f38455ad742..3fc3fd77492d 100644 --- a/drivers/devfreq/imx-bus.c +++ b/drivers/devfreq/imx-bus.c @@ -169,7 +169,7 @@ static struct platform_driver imx_bus_platdrv = { .probe = imx_bus_probe, .driver = { .name = "imx-bus-devfreq", - .of_match_table = of_match_ptr(imx_bus_of_match), + .of_match_table = imx_bus_of_match, }, }; module_platform_driver(imx_bus_platdrv); diff --git a/drivers/devfreq/imx8m-ddrc.c b/drivers/devfreq/imx8m-ddrc.c index bc82d3653bff..16636973eb10 100644 --- a/drivers/devfreq/imx8m-ddrc.c +++ b/drivers/devfreq/imx8m-ddrc.c @@ -280,18 +280,6 @@ static int imx8m_ddrc_get_cur_freq(struct device *dev, unsigned 
long *freq) return 0; } -static int imx8m_ddrc_get_dev_status(struct device *dev, - struct devfreq_dev_status *stat) -{ - struct imx8m_ddrc *priv = dev_get_drvdata(dev); - - stat->busy_time = 0; - stat->total_time = 0; - stat->current_frequency = clk_get_rate(priv->dram_core); - - return 0; -} - static int imx8m_ddrc_init_freq_info(struct device *dev) { struct imx8m_ddrc *priv = dev_get_drvdata(dev); @@ -429,9 +417,7 @@ static int imx8m_ddrc_probe(struct platform_device *pdev) if (ret < 0) goto err; - priv->profile.polling_ms = 1000; priv->profile.target = imx8m_ddrc_target; - priv->profile.get_dev_status = imx8m_ddrc_get_dev_status; priv->profile.exit = imx8m_ddrc_exit; priv->profile.get_cur_freq = imx8m_ddrc_get_cur_freq; priv->profile.initial_freq = clk_get_rate(priv->dram_core); @@ -461,7 +447,7 @@ static struct platform_driver imx8m_ddrc_platdrv = { .probe = imx8m_ddrc_probe, .driver = { .name = "imx8m-ddrc-devfreq", - .of_match_table = of_match_ptr(imx8m_ddrc_of_match), + .of_match_table = imx8m_ddrc_of_match, }, }; module_platform_driver(imx8m_ddrc_platdrv); diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c index 9e9d3b4c6d48..293857ebfd75 100644 --- a/drivers/devfreq/rk3399_dmc.c +++ b/drivers/devfreq/rk3399_dmc.c @@ -324,22 +324,14 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev) mutex_init(&data->lock); data->vdd_center = devm_regulator_get(dev, "center"); - if (IS_ERR(data->vdd_center)) { - if (PTR_ERR(data->vdd_center) == -EPROBE_DEFER) - return -EPROBE_DEFER; - - dev_err(dev, "Cannot get the regulator \"center\"\n"); - return PTR_ERR(data->vdd_center); - } + if (IS_ERR(data->vdd_center)) + return dev_err_probe(dev, PTR_ERR(data->vdd_center), + "Cannot get the regulator \"center\"\n"); data->dmc_clk = devm_clk_get(dev, "dmc_clk"); - if (IS_ERR(data->dmc_clk)) { - if (PTR_ERR(data->dmc_clk) == -EPROBE_DEFER) - return -EPROBE_DEFER; - - dev_err(dev, "Cannot get the clk dmc_clk\n"); - return PTR_ERR(data->dmc_clk); - 
}
+	if (IS_ERR(data->dmc_clk))
+		return dev_err_probe(dev, PTR_ERR(data->dmc_clk),
+				     "Cannot get the clk dmc_clk\n");
 	data->edev = devfreq_event_get_edev_by_phandle(dev, "devfreq-events", 0);
 	if (IS_ERR(data->edev))
diff --git a/drivers/gpu/drm/lima/lima_devfreq.c b/drivers/gpu/drm/lima/lima_devfreq.c
index 5686ad4aaf7c..dbc1d1eb9543 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.c
+++ b/drivers/gpu/drm/lima/lima_devfreq.c
@@ -99,20 +99,12 @@ void lima_devfreq_fini(struct lima_device *ldev)
 		devm_devfreq_remove_device(ldev->dev, devfreq->devfreq);
 		devfreq->devfreq = NULL;
 	}
-
-	dev_pm_opp_of_remove_table(ldev->dev);
-
-	dev_pm_opp_put_regulators(devfreq->regulators_opp_table);
-	dev_pm_opp_put_clkname(devfreq->clkname_opp_table);
-	devfreq->regulators_opp_table = NULL;
-	devfreq->clkname_opp_table = NULL;
 }
 int lima_devfreq_init(struct lima_device *ldev)
 {
 	struct thermal_cooling_device *cooling;
 	struct device *dev = ldev->dev;
-	struct opp_table *opp_table;
 	struct devfreq *devfreq;
 	struct lima_devfreq *ldevfreq = &ldev->devfreq;
 	struct dev_pm_opp *opp;
@@ -125,40 +117,28 @@ int lima_devfreq_init(struct lima_device *ldev)
 	spin_lock_init(&ldevfreq->lock);
-	opp_table = dev_pm_opp_set_clkname(dev, "core");
-	if (IS_ERR(opp_table)) {
-		ret = PTR_ERR(opp_table);
-		goto err_fini;
-	}
-
-	ldevfreq->clkname_opp_table = opp_table;
-
-	opp_table = dev_pm_opp_set_regulators(dev,
-					      (const char *[]){ "mali" },
-					      1);
-	if (IS_ERR(opp_table)) {
-		ret = PTR_ERR(opp_table);
+	ret = devm_pm_opp_set_clkname(dev, "core");
+	if (ret)
+		return ret;
+	ret = devm_pm_opp_set_regulators(dev, (const char *[]){ "mali" }, 1);
+	if (ret) {
 		/* Continue if the optional regulator is missing */
 		if (ret != -ENODEV)
-			goto err_fini;
-	} else {
-		ldevfreq->regulators_opp_table = opp_table;
+			return ret;
 	}
-	ret = dev_pm_opp_of_add_table(dev);
+	ret = devm_pm_opp_of_add_table(dev);
 	if (ret)
-		goto err_fini;
+		return ret;
 	lima_devfreq_reset(ldevfreq);
 	cur_freq = clk_get_rate(ldev->clk_gpu);
 	opp =
devfreq_recommended_opp(dev, &cur_freq, 0);
-	if (IS_ERR(opp)) {
-		ret = PTR_ERR(opp);
-		goto err_fini;
-	}
+	if (IS_ERR(opp))
+		return PTR_ERR(opp);
 	lima_devfreq_profile.initial_freq = cur_freq;
 	dev_pm_opp_put(opp);
@@ -167,8 +147,7 @@ int lima_devfreq_init(struct lima_device *ldev)
 					  DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL);
 	if (IS_ERR(devfreq)) {
 		dev_err(dev, "Couldn't initialize GPU devfreq\n");
-		ret = PTR_ERR(devfreq);
-		goto err_fini;
+		return PTR_ERR(devfreq);
 	}
 	ldevfreq->devfreq = devfreq;
@@ -180,10 +159,6 @@ int lima_devfreq_init(struct lima_device *ldev)
 	ldevfreq->cooling = cooling;
 	return 0;
-
-err_fini:
-	lima_devfreq_fini(ldev);
-	return ret;
 }
 void lima_devfreq_record_busy(struct lima_devfreq *devfreq)
diff --git a/drivers/gpu/drm/lima/lima_devfreq.h b/drivers/gpu/drm/lima/lima_devfreq.h
index 2d9b3008ce77..688ee71e263a 100644
--- a/drivers/gpu/drm/lima/lima_devfreq.h
+++ b/drivers/gpu/drm/lima/lima_devfreq.h
@@ -8,15 +8,12 @@
 #include <linux/ktime.h>
 struct devfreq;
-struct opp_table;
 struct thermal_cooling_device;
 struct lima_device;
 struct lima_devfreq {
 	struct devfreq *devfreq;
-	struct opp_table *clkname_opp_table;
-	struct opp_table *regulators_opp_table;
 	struct thermal_cooling_device *cooling;
 	ktime_t busy_time;
diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
index 56b3f5935703..c878391f3e8c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c
@@ -89,29 +89,25 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
 	unsigned long cur_freq;
 	struct device *dev = &pfdev->pdev->dev;
 	struct devfreq *devfreq;
-	struct opp_table *opp_table;
 	struct thermal_cooling_device *cooling;
 	struct panfrost_devfreq *pfdevfreq = &pfdev->pfdevfreq;
-	opp_table = dev_pm_opp_set_regulators(dev, pfdev->comp->supply_names,
-					      pfdev->comp->num_supplies);
-	if (IS_ERR(opp_table)) {
-		ret = PTR_ERR(opp_table);
+	ret = devm_pm_opp_set_regulators(dev,
pfdev->comp->supply_names,
+					 pfdev->comp->num_supplies);
+	if (ret) {
 		/* Continue if the optional regulator is missing */
 		if (ret != -ENODEV) {
 			DRM_DEV_ERROR(dev, "Couldn't set OPP regulators\n");
-			goto err_fini;
+			return ret;
 		}
-	} else {
-		pfdevfreq->regulators_opp_table = opp_table;
 	}
-	ret = dev_pm_opp_of_add_table(dev);
+	ret = devm_pm_opp_of_add_table(dev);
 	if (ret) {
 		/* Optional, continue without devfreq */
 		if (ret == -ENODEV)
 			ret = 0;
-		goto err_fini;
+		return ret;
 	}
 	pfdevfreq->opp_of_table_added = true;
@@ -122,10 +118,8 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
 	cur_freq = clk_get_rate(pfdev->clock);
 	opp = devfreq_recommended_opp(dev, &cur_freq, 0);
-	if (IS_ERR(opp)) {
-		ret = PTR_ERR(opp);
-		goto err_fini;
-	}
+	if (IS_ERR(opp))
+		return PTR_ERR(opp);
 	panfrost_devfreq_profile.initial_freq = cur_freq;
 	dev_pm_opp_put(opp);
@@ -134,8 +128,7 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
 					  DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL);
 	if (IS_ERR(devfreq)) {
 		DRM_DEV_ERROR(dev, "Couldn't initialize GPU devfreq\n");
-		ret = PTR_ERR(devfreq);
-		goto err_fini;
+		return PTR_ERR(devfreq);
 	}
 	pfdevfreq->devfreq = devfreq;
@@ -146,10 +139,6 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev)
 	pfdevfreq->cooling = cooling;
 	return 0;
-
-err_fini:
-	panfrost_devfreq_fini(pfdev);
-	return ret;
 }
 void panfrost_devfreq_fini(struct panfrost_device *pfdev)
@@ -160,14 +149,6 @@ void panfrost_devfreq_fini(struct panfrost_device *pfdev)
 		devfreq_cooling_unregister(pfdevfreq->cooling);
 		pfdevfreq->cooling = NULL;
 	}
-
-	if (pfdevfreq->opp_of_table_added) {
-		dev_pm_opp_of_remove_table(&pfdev->pdev->dev);
-		pfdevfreq->opp_of_table_added = false;
-	}
-
-	dev_pm_opp_put_regulators(pfdevfreq->regulators_opp_table);
-	pfdevfreq->regulators_opp_table = NULL;
 }
 void panfrost_devfreq_resume(struct panfrost_device *pfdev)
diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.h b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
index db6ea48e21f9..210269944687 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.h
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
@@ -8,14 +8,12 @@
 #include <linux/ktime.h>
 struct devfreq;
-struct opp_table;
 struct thermal_cooling_device;
 struct panfrost_device;
 struct panfrost_devfreq {
 	struct devfreq *devfreq;
-	struct opp_table *regulators_opp_table;
 	struct thermal_cooling_device *cooling;
 	bool opp_of_table_added;
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 3273360f30f7..ec1b9d306ba6 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -744,8 +744,8 @@ static struct cpuidle_state icx_cstates[] __initdata = {
 		.name = "C6",
 		.desc = "MWAIT 0x20",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
-		.exit_latency = 128,
-		.target_residency = 384,
+		.exit_latency = 170,
+		.target_residency = 600,
 		.enter = &intel_idle,
 		.enter_s2idle = intel_idle_s2idle, },
 	{
@@ -1156,6 +1156,7 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
 	X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE, &idle_cpu_skl),
 	X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
 	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &idle_cpu_icx),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &idle_cpu_icx),
 	X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &idle_cpu_knl),
 	X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &idle_cpu_knl),
 	X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT, &idle_cpu_bxt),
diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
index dee503640e12..9c8318923ed0 100644
--- a/drivers/memory/samsung/exynos5422-dmc.c
+++ b/drivers/memory/samsung/exynos5422-dmc.c
@@ -343,7 +343,7 @@ static int exynos5_init_freq_table(struct exynos5_dmc *dmc,
 	int idx;
 	unsigned long freq;
-	ret = dev_pm_opp_of_add_table(dmc->dev);
+	ret = devm_pm_opp_of_add_table(dmc->dev);
 	if (ret < 0) {
 		dev_err(dmc->dev, "Failed to get OPP table\n");
 		return ret;
@@ -354,7 +354,7 @@ static int exynos5_init_freq_table(struct exynos5_dmc *dmc,
 	dmc->opp = devm_kmalloc_array(dmc->dev, dmc->opp_count,
sizeof(struct dmc_opp_table), GFP_KERNEL);
 	if (!dmc->opp)
-		goto err_opp;
+		return -ENOMEM;
 	idx = dmc->opp_count - 1;
 	for (i = 0, freq = ULONG_MAX; i < dmc->opp_count; i++, freq--) {
@@ -362,7 +362,7 @@ static int exynos5_init_freq_table(struct exynos5_dmc *dmc,
 		opp = dev_pm_opp_find_freq_floor(dmc->dev, &freq);
 		if (IS_ERR(opp))
-			goto err_opp;
+			return PTR_ERR(opp);
 		dmc->opp[idx - i].freq_hz = freq;
 		dmc->opp[idx - i].volt_uv = dev_pm_opp_get_voltage(opp);
@@ -371,11 +371,6 @@ static int exynos5_init_freq_table(struct exynos5_dmc *dmc,
 	}
 	return 0;
-
-err_opp:
-	dev_pm_opp_of_remove_table(dmc->dev);
-
-	return -EINVAL;
 }
 /**
@@ -1569,8 +1564,6 @@ static int exynos5_dmc_remove(struct platform_device *pdev)
 	clk_disable_unprepare(dmc->mout_bpll);
 	clk_disable_unprepare(dmc->fout_bpll);
-	dev_pm_opp_remove_table(dmc->dev);
-
 	return 0;
 }
diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index 5e1da4df096f..d170c919e6e4 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -264,7 +264,6 @@ struct sdhci_msm_host {
 	struct clk_bulk_data bulk_clks[5];
 	unsigned long clk_rate;
 	struct mmc_host *mmc;
-	struct opp_table *opp_table;
 	bool use_14lpp_dll_reset;
 	bool tuning_done;
 	bool calibration_done;
@@ -2551,17 +2550,15 @@ static int sdhci_msm_probe(struct platform_device *pdev)
 	if (ret)
 		goto bus_clk_disable;
-	msm_host->opp_table = dev_pm_opp_set_clkname(&pdev->dev, "core");
-	if (IS_ERR(msm_host->opp_table)) {
-		ret = PTR_ERR(msm_host->opp_table);
+	ret = devm_pm_opp_set_clkname(&pdev->dev, "core");
+	if (ret)
 		goto bus_clk_disable;
-	}
 	/* OPP table is optional */
-	ret = dev_pm_opp_of_add_table(&pdev->dev);
+	ret = devm_pm_opp_of_add_table(&pdev->dev);
 	if (ret && ret != -ENODEV) {
 		dev_err(&pdev->dev, "Invalid OPP table in Device tree\n");
-		goto opp_put_clkname;
+		goto bus_clk_disable;
 	}
 	/* Vote for maximum clock rate for maximum performance */
@@ -2587,7 +2584,7 @@ static int sdhci_msm_probe(struct platform_device *pdev)
 	ret =
clk_bulk_prepare_enable(ARRAY_SIZE(msm_host->bulk_clks),
 				      msm_host->bulk_clks);
 	if (ret)
-		goto opp_cleanup;
+		goto bus_clk_disable;
 	/*
 	 * xo clock is needed for FLL feature of cm_dll.
@@ -2732,10 +2729,6 @@ pm_runtime_disable:
 clk_disable:
 	clk_bulk_disable_unprepare(ARRAY_SIZE(msm_host->bulk_clks),
 				   msm_host->bulk_clks);
-opp_cleanup:
-	dev_pm_opp_of_remove_table(&pdev->dev);
-opp_put_clkname:
-	dev_pm_opp_put_clkname(msm_host->opp_table);
 bus_clk_disable:
 	if (!IS_ERR(msm_host->bus_clk))
 		clk_disable_unprepare(msm_host->bus_clk);
@@ -2754,8 +2747,6 @@ static int sdhci_msm_remove(struct platform_device *pdev)
 	sdhci_remove_host(host, dead);
-	dev_pm_opp_of_remove_table(&pdev->dev);
-	dev_pm_opp_put_clkname(msm_host->opp_table);
 	pm_runtime_get_sync(&pdev->dev);
 	pm_runtime_disable(&pdev->dev);
 	pm_runtime_put_noidle(&pdev->dev);
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 1556998425d5..e366218d6736 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1857,6 +1857,35 @@ void dev_pm_opp_put_supported_hw(struct opp_table *opp_table)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_supported_hw);
+static void devm_pm_opp_supported_hw_release(void *data)
+{
+	dev_pm_opp_put_supported_hw(data);
+}
+
+/**
+ * devm_pm_opp_set_supported_hw() - Set supported platforms
+ * @dev: Device for which supported-hw has to be set.
+ * @versions: Array of hierarchy of versions to match.
+ * @count: Number of elements in the array.
+ *
+ * This is a resource-managed variant of dev_pm_opp_set_supported_hw().
+ *
+ * Return: 0 on success and errorno otherwise.
+ */
+int devm_pm_opp_set_supported_hw(struct device *dev, const u32 *versions,
+				 unsigned int count)
+{
+	struct opp_table *opp_table;
+
+	opp_table = dev_pm_opp_set_supported_hw(dev, versions, count);
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
+
+	return devm_add_action_or_reset(dev, devm_pm_opp_supported_hw_release,
+					opp_table);
+}
+EXPORT_SYMBOL_GPL(devm_pm_opp_set_supported_hw);
+
 /**
  * dev_pm_opp_set_prop_name() - Set prop-extn name
  * @dev: Device for which the prop-name has to be set.
@@ -2047,6 +2076,36 @@ put_opp_table:
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_regulators);
+static void devm_pm_opp_regulators_release(void *data)
+{
+	dev_pm_opp_put_regulators(data);
+}
+
+/**
+ * devm_pm_opp_set_regulators() - Set regulator names for the device
+ * @dev: Device for which regulator name is being set.
+ * @names: Array of pointers to the names of the regulator.
+ * @count: Number of regulators.
+ *
+ * This is a resource-managed variant of dev_pm_opp_set_regulators().
+ *
+ * Return: 0 on success and errorno otherwise.
+ */
+int devm_pm_opp_set_regulators(struct device *dev,
+			       const char * const names[],
+			       unsigned int count)
+{
+	struct opp_table *opp_table;
+
+	opp_table = dev_pm_opp_set_regulators(dev, names, count);
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
+
+	return devm_add_action_or_reset(dev, devm_pm_opp_regulators_release,
+					opp_table);
+}
+EXPORT_SYMBOL_GPL(devm_pm_opp_set_regulators);
+
 /**
  * dev_pm_opp_set_clkname() - Set clk name for the device
  * @dev: Device for which clk name is being set.
@@ -2119,6 +2178,33 @@ void dev_pm_opp_put_clkname(struct opp_table *opp_table)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_put_clkname);
+static void devm_pm_opp_clkname_release(void *data)
+{
+	dev_pm_opp_put_clkname(data);
+}
+
+/**
+ * devm_pm_opp_set_clkname() - Set clk name for the device
+ * @dev: Device for which clk name is being set.
+ * @name: Clk name.
+ *
+ * This is a resource-managed variant of dev_pm_opp_set_clkname().
+ *
+ * Return: 0 on success and errorno otherwise.
+ */
+int devm_pm_opp_set_clkname(struct device *dev, const char *name)
+{
+	struct opp_table *opp_table;
+
+	opp_table = dev_pm_opp_set_clkname(dev, name);
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
+
+	return devm_add_action_or_reset(dev, devm_pm_opp_clkname_release,
+					opp_table);
+}
+EXPORT_SYMBOL_GPL(devm_pm_opp_set_clkname);
+
 /**
  * dev_pm_opp_register_set_opp_helper() - Register custom set OPP helper
  * @dev: Device for which the helper is getting registered.
@@ -2209,25 +2295,19 @@ static void devm_pm_opp_unregister_set_opp_helper(void *data)
  *
  * This is a resource-managed version of dev_pm_opp_register_set_opp_helper().
  *
- * Return: pointer to 'struct opp_table' on success and errorno otherwise.
+ * Return: 0 on success and errorno otherwise.
  */
-struct opp_table *
-devm_pm_opp_register_set_opp_helper(struct device *dev,
-				    int (*set_opp)(struct dev_pm_set_opp_data *data))
+int devm_pm_opp_register_set_opp_helper(struct device *dev,
+					int (*set_opp)(struct dev_pm_set_opp_data *data))
 {
 	struct opp_table *opp_table;
-	int err;
 	opp_table = dev_pm_opp_register_set_opp_helper(dev, set_opp);
 	if (IS_ERR(opp_table))
-		return opp_table;
-
-	err = devm_add_action_or_reset(dev, devm_pm_opp_unregister_set_opp_helper,
-				       opp_table);
-	if (err)
-		return ERR_PTR(err);
+		return PTR_ERR(opp_table);
-	return opp_table;
+	return devm_add_action_or_reset(dev, devm_pm_opp_unregister_set_opp_helper,
+					opp_table);
 }
 EXPORT_SYMBOL_GPL(devm_pm_opp_register_set_opp_helper);
@@ -2380,25 +2460,19 @@ static void devm_pm_opp_detach_genpd(void *data)
  *
  * This is a resource-managed version of dev_pm_opp_attach_genpd().
  *
- * Return: pointer to 'struct opp_table' on success and errorno otherwise.
+ * Return: 0 on success and errorno otherwise.
  */
-struct opp_table *
-devm_pm_opp_attach_genpd(struct device *dev, const char **names,
-			 struct device ***virt_devs)
+int devm_pm_opp_attach_genpd(struct device *dev, const char **names,
+			     struct device ***virt_devs)
 {
 	struct opp_table *opp_table;
-	int err;
 	opp_table = dev_pm_opp_attach_genpd(dev, names, virt_devs);
 	if (IS_ERR(opp_table))
-		return opp_table;
-
-	err = devm_add_action_or_reset(dev, devm_pm_opp_detach_genpd,
-				       opp_table);
-	if (err)
-		return ERR_PTR(err);
+		return PTR_ERR(opp_table);
-	return opp_table;
+	return devm_add_action_or_reset(dev, devm_pm_opp_detach_genpd,
+					opp_table);
 }
 EXPORT_SYMBOL_GPL(devm_pm_opp_attach_genpd);
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index f480c10e6314..c582a9ca397b 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -1104,6 +1104,42 @@ static int _of_add_table_indexed(struct device *dev, int index, bool getclk)
 	return ret;
 }
+static void devm_pm_opp_of_table_release(void *data)
+{
+	dev_pm_opp_of_remove_table(data);
+}
+
+/**
+ * devm_pm_opp_of_add_table() - Initialize opp table from device tree
+ * @dev: device pointer used to lookup OPP table.
+ *
+ * Register the initial OPP table with the OPP library for given device.
+ *
+ * The opp_table structure will be freed after the device is destroyed.
+ *
+ * Return:
+ * 0 On success OR
+ * Duplicate OPPs (both freq and volt are same) and opp->available
+ * -EEXIST Freq are same and volt are different OR
+ * Duplicate OPPs (both freq and volt are same) and !opp->available
+ * -ENOMEM Memory allocation failure
+ * -ENODEV when 'operating-points' property is not found or is invalid data
+ * in device node.
+ * -ENODATA when empty 'operating-points' property is found
+ * -EINVAL when invalid entries are found in opp-v2 table
+ */
+int devm_pm_opp_of_add_table(struct device *dev)
+{
+	int ret;
+
+	ret = dev_pm_opp_of_add_table(dev);
+	if (ret)
+		return ret;
+
+	return devm_add_action_or_reset(dev, devm_pm_opp_of_table_release, dev);
+}
+EXPORT_SYMBOL_GPL(devm_pm_opp_of_add_table);
+
 /**
  * dev_pm_opp_of_add_table() - Initialize opp table from device tree
  * @dev: device pointer used to lookup OPP table.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 16a17215f633..e4d4e399004b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1870,20 +1870,10 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
 	int err;
 	int i, bars = 0;
-	/*
-	 * Power state could be unknown at this point, either due to a fresh
-	 * boot or a device removal call. So get the current power state
-	 * so that things like MSI message writing will behave as expected
-	 * (e.g. if the device really is in D0 at enable time).
-	 */
-	if (dev->pm_cap) {
-		u16 pmcsr;
-		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
-		dev->current_state = (pmcsr & PCI_PM_CTRL_STATE_MASK);
-	}
-
-	if (atomic_inc_return(&dev->enable_cnt) > 1)
+	if (atomic_inc_return(&dev->enable_cnt) > 1) {
+		pci_update_current_state(dev, dev->current_state);
 		return 0; /* already enabled */
+	}
 	bridge = pci_upstream_bridge(dev);
 	if (bridge)
diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index fdda2a737186..73cf68af9770 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -1069,6 +1069,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
 	X86_MATCH_VENDOR_FAM(AMD, 0x17, &rapl_defaults_amd),
 	X86_MATCH_VENDOR_FAM(AMD, 0x19, &rapl_defaults_amd),
+	X86_MATCH_VENDOR_FAM(HYGON, 0x18, &rapl_defaults_amd),
 	{}
 };
 MODULE_DEVICE_TABLE(x86cpu, rapl_ids);
diff --git a/drivers/powercap/intel_rapl_msr.c b/drivers/powercap/intel_rapl_msr.c
index 78213d4b5b16..cc3b22881bfe 100644
--- a/drivers/powercap/intel_rapl_msr.c
+++ b/drivers/powercap/intel_rapl_msr.c
@@ -150,6 +150,7 @@ static int rapl_msr_probe(struct platform_device *pdev)
 	case X86_VENDOR_INTEL:
 		rapl_msr_priv = &rapl_msr_priv_intel;
 		break;
+	case X86_VENDOR_HYGON:
 	case X86_VENDOR_AMD:
 		rapl_msr_priv = &rapl_msr_priv_amd;
 		break;
diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 881f645661cc..3d0d8ddd5772 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -691,14 +691,15 @@ static int spi_geni_probe(struct platform_device *pdev)
 	mas->se.wrapper = dev_get_drvdata(dev->parent);
 	mas->se.base = base;
 	mas->se.clk = clk;
-	mas->se.opp_table = dev_pm_opp_set_clkname(&pdev->dev, "se");
-	if (IS_ERR(mas->se.opp_table))
-		return PTR_ERR(mas->se.opp_table);
+
+	ret = devm_pm_opp_set_clkname(&pdev->dev, "se");
+	if (ret)
+		return ret;
 	/* OPP table is optional */
-	ret = dev_pm_opp_of_add_table(&pdev->dev);
+	ret =
devm_pm_opp_of_add_table(&pdev->dev);
 	if (ret && ret != -ENODEV) {
 		dev_err(&pdev->dev, "invalid OPP table in device tree\n");
-		goto put_clkname;
+		return ret;
 	}
 	spi->bus_num = -1;
@@ -750,9 +751,6 @@ spi_geni_probe_free_irq:
 	free_irq(mas->irq, spi);
 spi_geni_probe_runtime_disable:
 	pm_runtime_disable(dev);
-	dev_pm_opp_of_remove_table(&pdev->dev);
-put_clkname:
-	dev_pm_opp_put_clkname(mas->se.opp_table);
 	return ret;
 }
@@ -766,8 +764,6 @@ static int spi_geni_remove(struct platform_device *pdev)
 	free_irq(mas->irq, spi);
 	pm_runtime_disable(&pdev->dev);
-	dev_pm_opp_of_remove_table(&pdev->dev);
-	dev_pm_opp_put_clkname(mas->se.opp_table);
 	return 0;
 }
diff --git a/drivers/spi/spi-qcom-qspi.c b/drivers/spi/spi-qcom-qspi.c
index 1dbcc410cd35..c334dfec4117 100644
--- a/drivers/spi/spi-qcom-qspi.c
+++ b/drivers/spi/spi-qcom-qspi.c
@@ -142,7 +142,6 @@ struct qcom_qspi {
 	struct clk_bulk_data *clks;
 	struct qspi_xfer xfer;
 	struct icc_path *icc_path_cpu_to_qspi;
-	struct opp_table *opp_table;
 	unsigned long last_speed;
 	/* Lock to protect data accessed by IRQs */
 	spinlock_t lock;
@@ -530,14 +529,14 @@ static int qcom_qspi_probe(struct platform_device *pdev)
 	master->handle_err = qcom_qspi_handle_err;
 	master->auto_runtime_pm = true;
-	ctrl->opp_table = dev_pm_opp_set_clkname(&pdev->dev, "core");
-	if (IS_ERR(ctrl->opp_table))
-		return PTR_ERR(ctrl->opp_table);
+	ret = devm_pm_opp_set_clkname(&pdev->dev, "core");
+	if (ret)
+		return ret;
 	/* OPP table is optional */
-	ret = dev_pm_opp_of_add_table(&pdev->dev);
+	ret = devm_pm_opp_of_add_table(&pdev->dev);
 	if (ret && ret != -ENODEV) {
 		dev_err(&pdev->dev, "invalid OPP table in device tree\n");
-		goto exit_probe_put_clkname;
+		return ret;
 	}
 	pm_runtime_use_autosuspend(dev);
@@ -549,10 +548,6 @@ static int qcom_qspi_probe(struct platform_device *pdev)
 		return 0;
 	pm_runtime_disable(dev);
-	dev_pm_opp_of_remove_table(&pdev->dev);
-
-exit_probe_put_clkname:
-	dev_pm_opp_put_clkname(ctrl->opp_table);
 	return ret;
 }
@@ -560,14 +555,11 @@
exit_probe_put_clkname:
 static int qcom_qspi_remove(struct platform_device *pdev)
 {
 	struct spi_master *master = platform_get_drvdata(pdev);
-	struct qcom_qspi *ctrl = spi_master_get_devdata(master);
 	/* Unregister _before_ disabling pm_runtime() so we stop transfers */
 	spi_unregister_master(master);
 	pm_runtime_disable(&pdev->dev);
-	dev_pm_opp_of_remove_table(&pdev->dev);
-	dev_pm_opp_put_clkname(ctrl->opp_table);
 	return 0;
 }
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 00bb88a71606..23d729ed3bf6 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1426,14 +1426,14 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
 	if (of_property_read_bool(pdev->dev.of_node, "cts-rts-swap"))
 		port->cts_rts_swap = true;
-	port->se.opp_table = dev_pm_opp_set_clkname(&pdev->dev, "se");
-	if (IS_ERR(port->se.opp_table))
-		return PTR_ERR(port->se.opp_table);
+	ret = devm_pm_opp_set_clkname(&pdev->dev, "se");
+	if (ret)
+		return ret;
 	/* OPP table is optional */
-	ret = dev_pm_opp_of_add_table(&pdev->dev);
+	ret = devm_pm_opp_of_add_table(&pdev->dev);
 	if (ret && ret != -ENODEV) {
 		dev_err(&pdev->dev, "invalid OPP table in device tree\n");
-		goto put_clkname;
+		return ret;
 	}
 	port->private_data.drv = drv;
@@ -1443,7 +1443,7 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
 	ret = uart_add_one_port(drv, uport);
 	if (ret)
-		goto err;
+		return ret;
 	irq_set_status_flags(uport->irq, IRQ_NOAUTOEN);
 	ret = devm_request_irq(uport->dev, uport->irq, qcom_geni_serial_isr,
@@ -1451,7 +1451,7 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
 	if (ret) {
 		dev_err(uport->dev, "Failed to get IRQ ret %d\n", ret);
 		uart_remove_one_port(drv, uport);
-		goto err;
+		return ret;
 	}
 	/*
@@ -1468,16 +1468,11 @@ static int qcom_geni_serial_probe(struct platform_device *pdev)
 		if (ret) {
 			device_init_wakeup(&pdev->dev, false);
 			uart_remove_one_port(drv, uport);
-			goto err;
+			return
ret;
 		}
 	}
 	return 0;
-err:
-	dev_pm_opp_of_remove_table(&pdev->dev);
-put_clkname:
-	dev_pm_opp_put_clkname(port->se.opp_table);
-	return ret;
 }
 static int qcom_geni_serial_remove(struct platform_device *pdev)
@@ -1485,8 +1480,6 @@ static int qcom_geni_serial_remove(struct platform_device *pdev)
 	struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
 	struct uart_driver *drv = port->private_data.drv;
-	dev_pm_opp_of_remove_table(&pdev->dev);
-	dev_pm_opp_put_clkname(port->se.opp_table);
 	dev_pm_clear_wake_irq(&pdev->dev);
 	device_init_wakeup(&pdev->dev, false);
 	uart_remove_one_port(drv, &port->uport); |