summaryrefslogtreecommitdiff
path: root/drivers/cpufreq/cpufreq_governor.h
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2016-03-17 00:10:53 +0300
committerLinus Torvalds <torvalds@linux-foundation.org>2016-03-17 00:10:53 +0300
commit277edbabf6fece057b14fb6db5e3a34e00f42f42 (patch)
treed33314ae118cf387fa697643d10f1549ba4d6bfe /drivers/cpufreq/cpufreq_governor.h
parent271ecc5253e2b317d729d366560789cd7f93836c (diff)
parent0d571b62dd8eb341788599259c3dbc92c0dc8f22 (diff)
downloadlinux-277edbabf6fece057b14fb6db5e3a34e00f42f42.tar.xz
Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki: "This time the majority of changes go into cpufreq and they are significant. First off, the way CPU frequency updates are triggered is different now. Instead of having to set up and manage a deferrable timer for each CPU in the system to evaluate and possibly change its frequency periodically, cpufreq governors set up callbacks to be invoked by the scheduler on a regular basis (basically on utilization updates). The "old" governors, "ondemand" and "conservative", still do all of their work in process context (although that is triggered by the scheduler now), but intel_pstate does it all in the callback invoked by the scheduler with no need for any additional asynchronous processing. Of course, this eliminates the overhead related to the management of all those timers, but also it allows the cpufreq governor code to be simplified quite a bit. On top of that, the common code and data structures used by the "ondemand" and "conservative" governors are cleaned up and made more straightforward and some long-standing and quite annoying problems are addressed. In particular, the handling of governor sysfs attributes is modified and the related locking becomes more fine grained which allows some concurrency problems to be avoided (particularly deadlocks with the core cpufreq code). In principle, the new mechanism for triggering frequency updates allows utilization information to be passed from the scheduler to cpufreq. Although the current code doesn't make use of it, in the works is a new cpufreq governor that will make decisions based on the scheduler's utilization data. That should allow the scheduler and cpufreq to work more closely together in the long run. In addition to the core and governor changes, cpufreq drivers are updated too. Fixes and optimizations go into intel_pstate, the cpufreq-dt driver is updated on top of some modification in the Operating Performance Points (OPP) framework and there are fixes and other updates in the powernv cpufreq driver. Apart from the cpufreq updates there is some new ACPICA material, including a fix for a problem introduced by previous ACPICA updates, and some less significant changes in the ACPI code, like CPPC code optimizations, ACPI processor driver cleanups and support for loading ACPI tables from initrd. Also updated are the generic power domains framework, the Intel RAPL power capping driver and the turbostat utility and we have a bunch of traditional assorted fixes and cleanups. Specifics: - Redesign of cpufreq governors and the intel_pstate driver to make them use callbacks invoked by the scheduler to trigger CPU frequency evaluation instead of using per-CPU deferrable timers for that purpose (Rafael Wysocki). - Reorganization and cleanup of cpufreq governor code to make it more straightforward and fix some concurrency problems in it (Rafael Wysocki, Viresh Kumar). - Cleanup and improvements of locking in the cpufreq core (Viresh Kumar). - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh Kumar, Eric Biggers). - intel_pstate driver updates including fixes, optimizations and a modification to make it enable enable hardware-coordinated P-state selection (HWP) by default if supported by the processor (Philippe Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe Franciosi). - Operating Performance Points (OPP) framework updates to improve its handling of voltage regulators and device clocks and updates of the cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter). - Updates of the powernv cpufreq driver to fix initialization and cleanup problems in it and correct its worker thread handling with respect to CPU offline, new powernv_throttle tracepoint (Shilpasri Bhat). - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki). - ACPICA updates including one fix for a regression introduced by previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box, Colin Ian King). - Support for installing ACPI tables from initrd (Lv Zheng). - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin Chaugule). - Support for _HID(ACPI0010) devices (ACPI processor containers) and ACPI processor driver cleanups (Sudeep Holla). - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory, Aleksey Makarov). - Modification of the ACPI PCI IRQ management code to make it treat 255 in the Interrupt Line register as "not connected" on x86 (as per the specification) and avoid attempts to use that value as a valid interrupt vector (Chen Fan). - ACPI APEI fixes related to resource leaks (Josh Hunt). - Removal of modularity from a few ACPI drivers (BGRT, GHES, intel_pmic_crc) that cannot be built as modules in practice (Paul Gortmaker). - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS as a valid resource type (Harb Abdulhamid). - New device ID (future AMD I2C controller) in the ACPI driver for AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu). - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin). - cpuidle menu governor optimization to avoid a square root computation in it (Rasmus Villemoes). - Fix for potential use-after-free in the generic device properties framework (Heikki Krogerus). - Updates of the generic power domains (genpd) framework including support for multiple power states of a domain, fixes and debugfs output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart, Geert Uytterhoeven). - Intel RAPL power capping driver updates to reduce IPI overhead in it (Jacob Pan). - System suspend/hibernation code cleanups (Eric Biggers, Saurabh Sengar). - Year 2038 fix for the process freezer (Abhilash Jindal). - turbostat utility updates including new features (decoding of more registers and CPUID fields, sub-second intervals support, GFX MHz and RC6 printout, --out command line option), fixes (syscall jitter detection and workaround, reductioin of the number of syscalls made, fixes related to Xeon x200 processors, compiler warning fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)" * tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits) tools/power turbostat: bugfix: TDP MSRs print bits fixing tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump tools/power turbostat: call __cpuid() instead of __get_cpuid() tools/power turbostat: indicate SMX and SGX support tools/power turbostat: detect and work around syscall jitter tools/power turbostat: show GFX%rc6 tools/power turbostat: show GFXMHz tools/power turbostat: show IRQs per CPU tools/power turbostat: make fewer systems calls tools/power turbostat: fix compiler warnings tools/power turbostat: add --out option for saving output in a file tools/power turbostat: re-name "%Busy" field to "Busy%" tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding tools/power turbostat: Intel Xeon x200: fix erroneous bclk value tools/power turbostat: allow sub-sec intervals ACPI / APEI: ERST: Fixed leaked resources in erst_init ACPI / APEI: Fix leaked resources intel_pstate: Do not skip samples partially intel_pstate: Remove freq calculation from intel_pstate_calc_busy() intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance() ...
Diffstat (limited to 'drivers/cpufreq/cpufreq_governor.h')
-rw-r--r--drivers/cpufreq/cpufreq_governor.h261
1 files changed, 84 insertions, 177 deletions
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 91e767a058a7..61ff82fe0613 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -18,6 +18,7 @@
#define _CPUFREQ_GOVERNOR_H
#include <linux/atomic.h>
+#include <linux/irq_work.h>
#include <linux/cpufreq.h>
#include <linux/kernel_stat.h>
#include <linux/module.h>
@@ -41,96 +42,68 @@
enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
/*
- * Macro for creating governors sysfs routines
- *
- * - gov_sys: One governor instance per whole system
- * - gov_pol: One governor instance per policy
+ * Abbreviations:
+ * dbs: used as a shortform for demand based switching It helps to keep variable
+ * names smaller, simpler
+ * cdbs: common dbs
+ * od_*: On-demand governor
+ * cs_*: Conservative governor
*/
-/* Create attributes */
-#define gov_sys_attr_ro(_name) \
-static struct global_attr _name##_gov_sys = \
-__ATTR(_name, 0444, show_##_name##_gov_sys, NULL)
-
-#define gov_sys_attr_rw(_name) \
-static struct global_attr _name##_gov_sys = \
-__ATTR(_name, 0644, show_##_name##_gov_sys, store_##_name##_gov_sys)
-
-#define gov_pol_attr_ro(_name) \
-static struct freq_attr _name##_gov_pol = \
-__ATTR(_name, 0444, show_##_name##_gov_pol, NULL)
-
-#define gov_pol_attr_rw(_name) \
-static struct freq_attr _name##_gov_pol = \
-__ATTR(_name, 0644, show_##_name##_gov_pol, store_##_name##_gov_pol)
+/* Governor demand based switching data (per-policy or global). */
+struct dbs_data {
+ int usage_count;
+ void *tuners;
+ unsigned int min_sampling_rate;
+ unsigned int ignore_nice_load;
+ unsigned int sampling_rate;
+ unsigned int sampling_down_factor;
+ unsigned int up_threshold;
+ unsigned int io_is_busy;
-#define gov_sys_pol_attr_rw(_name) \
- gov_sys_attr_rw(_name); \
- gov_pol_attr_rw(_name)
+ struct kobject kobj;
+ struct list_head policy_dbs_list;
+ /*
+ * Protect concurrent updates to governor tunables from sysfs,
+ * policy_dbs_list and usage_count.
+ */
+ struct mutex mutex;
+};
-#define gov_sys_pol_attr_ro(_name) \
- gov_sys_attr_ro(_name); \
- gov_pol_attr_ro(_name)
+/* Governor's specific attributes */
+struct dbs_data;
+struct governor_attr {
+ struct attribute attr;
+ ssize_t (*show)(struct dbs_data *dbs_data, char *buf);
+ ssize_t (*store)(struct dbs_data *dbs_data, const char *buf,
+ size_t count);
+};
-/* Create show/store routines */
-#define show_one(_gov, file_name) \
-static ssize_t show_##file_name##_gov_sys \
-(struct kobject *kobj, struct attribute *attr, char *buf) \
+#define gov_show_one(_gov, file_name) \
+static ssize_t show_##file_name \
+(struct dbs_data *dbs_data, char *buf) \
{ \
- struct _gov##_dbs_tuners *tuners = _gov##_dbs_cdata.gdbs_data->tuners; \
- return sprintf(buf, "%u\n", tuners->file_name); \
-} \
- \
-static ssize_t show_##file_name##_gov_pol \
-(struct cpufreq_policy *policy, char *buf) \
-{ \
- struct dbs_data *dbs_data = policy->governor_data; \
struct _gov##_dbs_tuners *tuners = dbs_data->tuners; \
return sprintf(buf, "%u\n", tuners->file_name); \
}
-#define store_one(_gov, file_name) \
-static ssize_t store_##file_name##_gov_sys \
-(struct kobject *kobj, struct attribute *attr, const char *buf, size_t count) \
-{ \
- struct dbs_data *dbs_data = _gov##_dbs_cdata.gdbs_data; \
- return store_##file_name(dbs_data, buf, count); \
-} \
- \
-static ssize_t store_##file_name##_gov_pol \
-(struct cpufreq_policy *policy, const char *buf, size_t count) \
+#define gov_show_one_common(file_name) \
+static ssize_t show_##file_name \
+(struct dbs_data *dbs_data, char *buf) \
{ \
- struct dbs_data *dbs_data = policy->governor_data; \
- return store_##file_name(dbs_data, buf, count); \
+ return sprintf(buf, "%u\n", dbs_data->file_name); \
}
-#define show_store_one(_gov, file_name) \
-show_one(_gov, file_name); \
-store_one(_gov, file_name)
+#define gov_attr_ro(_name) \
+static struct governor_attr _name = \
+__ATTR(_name, 0444, show_##_name, NULL)
-/* create helper routines */
-#define define_get_cpu_dbs_routines(_dbs_info) \
-static struct cpu_dbs_info *get_cpu_cdbs(int cpu) \
-{ \
- return &per_cpu(_dbs_info, cpu).cdbs; \
-} \
- \
-static void *get_cpu_dbs_info_s(int cpu) \
-{ \
- return &per_cpu(_dbs_info, cpu); \
-}
-
-/*
- * Abbreviations:
- * dbs: used as a shortform for demand based switching It helps to keep variable
- * names smaller, simpler
- * cdbs: common dbs
- * od_*: On-demand governor
- * cs_*: Conservative governor
- */
+#define gov_attr_rw(_name) \
+static struct governor_attr _name = \
+__ATTR(_name, 0644, show_##_name, store_##_name)
/* Common to all CPUs of a policy */
-struct cpu_common_dbs_info {
+struct policy_dbs_info {
struct cpufreq_policy *policy;
/*
* Per policy mutex that serializes load evaluation from limit-change
@@ -138,11 +111,27 @@ struct cpu_common_dbs_info {
*/
struct mutex timer_mutex;
- ktime_t time_stamp;
- atomic_t skip_work;
+ u64 last_sample_time;
+ s64 sample_delay_ns;
+ atomic_t work_count;
+ struct irq_work irq_work;
struct work_struct work;
+ /* dbs_data may be shared between multiple policy objects */
+ struct dbs_data *dbs_data;
+ struct list_head list;
+ /* Multiplier for increasing sample delay temporarily. */
+ unsigned int rate_mult;
+ /* Status indicators */
+ bool is_shared; /* This object is used by multiple CPUs */
+ bool work_in_progress; /* Work is being queued up or in progress */
};
+static inline void gov_update_sample_delay(struct policy_dbs_info *policy_dbs,
+ unsigned int delay_us)
+{
+ policy_dbs->sample_delay_ns = delay_us * NSEC_PER_USEC;
+}
+
/* Per cpu structures */
struct cpu_dbs_info {
u64 prev_cpu_idle;
@@ -155,54 +144,14 @@ struct cpu_dbs_info {
* wake-up from idle.
*/
unsigned int prev_load;
- struct timer_list timer;
- struct cpu_common_dbs_info *shared;
-};
-
-struct od_cpu_dbs_info_s {
- struct cpu_dbs_info cdbs;
- struct cpufreq_frequency_table *freq_table;
- unsigned int freq_lo;
- unsigned int freq_lo_jiffies;
- unsigned int freq_hi_jiffies;
- unsigned int rate_mult;
- unsigned int sample_type:1;
-};
-
-struct cs_cpu_dbs_info_s {
- struct cpu_dbs_info cdbs;
- unsigned int down_skip;
- unsigned int requested_freq;
-};
-
-/* Per policy Governors sysfs tunables */
-struct od_dbs_tuners {
- unsigned int ignore_nice_load;
- unsigned int sampling_rate;
- unsigned int sampling_down_factor;
- unsigned int up_threshold;
- unsigned int powersave_bias;
- unsigned int io_is_busy;
-};
-
-struct cs_dbs_tuners {
- unsigned int ignore_nice_load;
- unsigned int sampling_rate;
- unsigned int sampling_down_factor;
- unsigned int up_threshold;
- unsigned int down_threshold;
- unsigned int freq_step;
+ struct update_util_data update_util;
+ struct policy_dbs_info *policy_dbs;
};
/* Common Governor data across policies */
-struct dbs_data;
-struct common_dbs_data {
- /* Common across governors */
- #define GOV_ONDEMAND 0
- #define GOV_CONSERVATIVE 1
- int governor;
- struct attribute_group *attr_group_gov_sys; /* one governor - system */
- struct attribute_group *attr_group_gov_pol; /* one governor - policy */
+struct dbs_governor {
+ struct cpufreq_governor gov;
+ struct kobj_type kobj_type;
/*
* Common data for platforms that don't set
@@ -210,74 +159,32 @@ struct common_dbs_data {
*/
struct dbs_data *gdbs_data;
- struct cpu_dbs_info *(*get_cpu_cdbs)(int cpu);
- void *(*get_cpu_dbs_info_s)(int cpu);
- unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy,
- bool modify_all);
- void (*gov_check_cpu)(int cpu, unsigned int load);
+ unsigned int (*gov_dbs_timer)(struct cpufreq_policy *policy);
+ struct policy_dbs_info *(*alloc)(void);
+ void (*free)(struct policy_dbs_info *policy_dbs);
int (*init)(struct dbs_data *dbs_data, bool notify);
void (*exit)(struct dbs_data *dbs_data, bool notify);
-
- /* Governor specific ops, see below */
- void *gov_ops;
-
- /*
- * Protects governor's data (struct dbs_data and struct common_dbs_data)
- */
- struct mutex mutex;
+ void (*start)(struct cpufreq_policy *policy);
};
-/* Governor Per policy data */
-struct dbs_data {
- struct common_dbs_data *cdata;
- unsigned int min_sampling_rate;
- int usage_count;
- void *tuners;
-};
+static inline struct dbs_governor *dbs_governor_of(struct cpufreq_policy *policy)
+{
+ return container_of(policy->governor, struct dbs_governor, gov);
+}
-/* Governor specific ops, will be passed to dbs_data->gov_ops */
+/* Governor specific operations */
struct od_ops {
- void (*powersave_bias_init_cpu)(int cpu);
unsigned int (*powersave_bias_target)(struct cpufreq_policy *policy,
unsigned int freq_next, unsigned int relation);
- void (*freq_increase)(struct cpufreq_policy *policy, unsigned int freq);
};
-static inline int delay_for_sampling_rate(unsigned int sampling_rate)
-{
- int delay = usecs_to_jiffies(sampling_rate);
-
- /* We want all CPUs to do sampling nearly on same jiffy */
- if (num_online_cpus() > 1)
- delay -= jiffies % delay;
-
- return delay;
-}
-
-#define declare_show_sampling_rate_min(_gov) \
-static ssize_t show_sampling_rate_min_gov_sys \
-(struct kobject *kobj, struct attribute *attr, char *buf) \
-{ \
- struct dbs_data *dbs_data = _gov##_dbs_cdata.gdbs_data; \
- return sprintf(buf, "%u\n", dbs_data->min_sampling_rate); \
-} \
- \
-static ssize_t show_sampling_rate_min_gov_pol \
-(struct cpufreq_policy *policy, char *buf) \
-{ \
- struct dbs_data *dbs_data = policy->governor_data; \
- return sprintf(buf, "%u\n", dbs_data->min_sampling_rate); \
-}
-
-extern struct mutex cpufreq_governor_lock;
-
-void gov_add_timers(struct cpufreq_policy *policy, unsigned int delay);
-void gov_cancel_work(struct cpu_common_dbs_info *shared);
-void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
-int cpufreq_governor_dbs(struct cpufreq_policy *policy,
- struct common_dbs_data *cdata, unsigned int event);
+unsigned int dbs_update(struct cpufreq_policy *policy);
+int cpufreq_governor_dbs(struct cpufreq_policy *policy, unsigned int event);
void od_register_powersave_bias_handler(unsigned int (*f)
(struct cpufreq_policy *, unsigned int, unsigned int),
unsigned int powersave_bias);
void od_unregister_powersave_bias_handler(void);
+ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
+ size_t count);
+void gov_update_cpu_data(struct dbs_data *dbs_data);
#endif /* _CPUFREQ_GOVERNOR_H */