summaryrefslogtreecommitdiff
path: root/drivers/cpufreq/cpufreq.c
AgeCommit message (Collapse)AuthorFilesLines
2016-10-04Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds1-27/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug updates from Thomas Gleixner: "Yet another batch of cpu hotplug core updates and conversions: - Provide core infrastructure for multi instance drivers so the drivers do not have to keep custom lists. - Convert custom lists to the new infrastructure. The block-mq custom list conversion comes through the block tree and makes the diffstat tip over to more lines removed than added. - Handle unbalanced hotplug enable/disable calls more gracefully. - Remove the obsolete CPU_STARTING/DYING notifier support. - Convert another batch of notifier users. The relayfs changes which conflicted with the conversion have been shipped to me by Andrew. The remaining lot is targeted for 4.10 so that we finally can remove the rest of the notifiers" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) cpufreq: Fix up conversion to hotplug state machine blk/mq: Reserve hotplug states for block multiqueue x86/apic/uv: Convert to hotplug state machine s390/mm/pfault: Convert to hotplug state machine mips/loongson/smp: Convert to hotplug state machine mips/octeon/smp: Convert to hotplug state machine fault-injection/cpu: Convert to hotplug state machine padata: Convert to hotplug state machine cpufreq: Convert to hotplug state machine ACPI/processor: Convert to hotplug state machine virtio scsi: Convert to hotplug state machine oprofile/timer: Convert to hotplug state machine block/softirq: Convert to hotplug state machine lib/irq_poll: Convert to hotplug state machine x86/microcode: Convert to hotplug state machine sh/SH-X3 SMP: Convert to hotplug state machine ia64/mca: Convert to hotplug state machine ARM/OMAP/wakeupgen: Convert to hotplug state machine ARM/shmobile: Convert to hotplug state machine arm64/FP/SIMD: Convert to hotplug state machine ...
2016-09-20cpufreq: Fix up conversion to hotplug state machineSebastian Andrzej Siewior1-0/+1
The function cpufreq_register_driver() returns zero on success and since commit 27622b061eb4 ("cpufreq: Convert to hotplug state machine") erroneously a positive number. Due to the "if (x) assume_error" construct all callers assumed an error and as a consequence the cpu freq kworker crashes with a NULL pointer dereference. Reset the return value back to zero in the success case. Fixes: 27622b061eb4 ("cpufreq: Convert to hotplug state machine") Reported-by: Borislav Petkov <bp@alien8.de> Reported-and-tested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: peterz@infradead.org Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/20160920145628.lp2bmq72ip3oiash@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-19cpufreq: Convert to hotplug state machineSebastian Andrzej Siewior1-26/+12
Install the callbacks via the state machine. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Viresh Kumar <viresh.kumar@linaro.or Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160906170457.32393-13-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13cpufreq: create link to policy only for registered CPUsViresh Kumar1-61/+28
If a cpufreq driver is registered very early in the boot stage (e.g. registered from postcore_initcall()), then cpufreq core may generate kernel warnings for it. In this case, the CPUs are brought online, then the cpufreq driver is registered, and then the CPU topology devices are registered. However, by the time cpufreq_add_dev() gets called, the cpu device isn't stored in the per-cpu variable (cpu_sys_devices,) which is read by get_cpu_device(). So the cpufreq core fails to get device for the CPU, for which cpufreq_add_dev() was called in the first place and we will hit a WARN_ON(!cpu_dev). Even if we reuse the 'dev' parameter passed to cpufreq_add_dev() to avoid that warning, there might be other CPUs online that share the policy with the cpu for which cpufreq_add_dev() is called. Eventually get_cpu_device() will return NULL for them as well, and we will hit the same WARN_ON() again. In order to fix these issues, change cpufreq core to create links to the policy for a cpu only when cpufreq_add_dev() is called for that CPU. Reuse the 'real_cpus' mask to track that as well. Note that cpufreq_remove_dev() already handles removal of the links for individual CPUs and cpufreq_add_dev() has aligned with that now. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-01cpufreq: Drop unnecessary check from cpufreq_policy_alloc()Rafael J. Wysocki1-4/+0
Since cpufreq_policy_alloc() doesn't use its dev variable for anything useful, drop that variable from there along with the NULL check against it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-07-22cpufreq: export cpufreq_driver_resolve_freq()Steve Muckle1-0/+1
Export cpufreq_driver_resolve_freq() since governors may be compiled as modules. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Steve Muckle <smuckle@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-22cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index()Viresh Kumar1-4/+12
The handlers provided by cpufreq core are sufficient for resolving the frequency for drivers providing ->target_index(), as the core already has the frequency table and so ->resolve_freq() isn't required for such platforms. This patch disallows drivers with ->target_index() callback to use the ->resolve_freq() callback. Also, it fixes a potential kernel crash for drivers providing ->target() but no ->resolve_freq(). Fixes: e3c062360870 "cpufreq: add cpufreq_driver_resolve_freq()" Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21cpufreq: add cpufreq_driver_resolve_freq()Steve Muckle1-0/+25
Cpufreq governors may need to know what a particular target frequency maps to in the driver without necessarily wanting to set the frequency. Support this operation via a new cpufreq API, cpufreq_driver_resolve_freq(). This API returns the lowest driver frequency equal or greater than the target frequency (CPUFREQ_RELATION_L), subject to any policy (min/max) or driver limitations. The mapping is also cached in the policy so that a subsequent fast_switch operation can avoid repeating the same lookup. The API will call a new cpufreq driver callback, resolve_freq(), if it has been registered by the driver. Otherwise the frequency is resolved via cpufreq_frequency_table_target(). Rather than require ->target() style drivers to provide a resolve_freq() callback it is left to the caller to ensure that the driver implements this callback if necessary to use cpufreq_driver_resolve_freq(). Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Steve Muckle <smuckle@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-04cpufreq: Drop redundant check from cpufreq_update_current_freq()Rafael J. Wysocki1-3/+0
Both callers of cpufreq_update_current_freq(), cpufreq_update_policy() and cpufreq_start_governor(), check cpufreq_suspended before calling that function, so drop the redundant cpufreq_suspended check from it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-07-04Merge back earlier cpufreq material for v4.8.Rafael J. Wysocki1-84/+102
2016-06-28cpufreq: Avoid false-positive WARN_ON()s in cpufreq_update_policy()Rafael J. Wysocki1-0/+4
CPU notifications from the firmware coming in when cpufreq is suspended cause cpufreq_update_current_freq() to return 0 which triggers the WARN_ON() in cpufreq_update_policy() for no reason. Avoid that by checking cpufreq_suspended before calling cpufreq_update_current_freq(). Fixes: c9d9c929e674 (cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
2016-06-14Merge back earlier cpufreq changes for v4.8.Rafael J. Wysocki1-84/+102
2016-06-09cpufreq: Return index from cpufreq_frequency_table_target()Viresh Kumar1-7/+2
This routine can't fail unless the frequency table is invalid and doesn't contain any valid entries. Make it return the index and WARN() in case it is used for an invalid table. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-09cpufreq: Drop 'freq_table' argument of __target_index()Viresh Kumar1-14/+7
It is already present as part of the policy and so no need to pass it from the caller. Also, 'freq_table' is guaranteed to be valid in this function and so no need to check it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-09cpufreq: Drop freq-table param to cpufreq_frequency_table_target()Viresh Kumar1-2/+2
The policy already has this pointer set, use it instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-09cpufreq: Remove cpufreq_frequency_get_table()Viresh Kumar1-24/+14
Most of the callers of cpufreq_frequency_get_table() already have the pointer to a valid 'policy' structure and they don't really need to go through the per-cpu variable first and then a check to validate the frequency, in order to find the freq-table for the policy. Directly use the policy->freq_table field instead for them. Only one user of that API is left after above changes, cpu_cooling.c and it accesses the freq_table in a racy way as the policy can get freed in between. Fix it by using cpufreq_cpu_get() properly. Since there are no more users of cpufreq_frequency_get_table() left, get rid of it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Javi Merino <javi.merino@arm.com> (cpu_cooling.c) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-03cpufreq: stats: Make the stats code non-modularRafael J. Wysocki1-0/+4
The modularity of cpufreq_stats is quite problematic. First off, the usage of policy notifiers for the initialization and cleanup in the cpufreq_stats module is inherently racy with respect to CPU offline/online and the initialization and cleanup of the cpufreq driver. Second, fast frequency switching (used by the schedutil governor) cannot be enabled if any transition notifiers are registered, so if the cpufreq_stats module (that registers a transition notifier for updating transition statistics) is loaded, the schedutil governor cannot use fast frequency switching. On the other hand, allowing cpufreq_stats to be built as a module doesn't really add much value. Arguably, there's not much reason for that code to be modular at all. For the above reasons, make the cpufreq stats code non-modular, modify the core to invoke functions provided by that code directly and drop the notifiers from it. Make the stats sysfs attributes appear empty if fast frequency switching is enabled as the statistics will not be updated in that case anyway (and returning -EBUSY from those attributes breaks powertop). While at it, clean up Kconfig help for the CPU_FREQ_STAT and CPU_FREQ_STAT_DETAILS options. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-06-03cpufreq: Use clamp_val() in __cpufreq_driver_target()Viresh Kumar1-4/+1
Use clamp_val() instead of open coding it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-03cpufreq: Send START policy notification after sending CREATEViresh Kumar1-3/+3
The sequence got a bit wrong as we are sending CPUFREQ_START notifications even before we have sent CPUFREQ_CREATE_POLICY. Fix it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-03cpufreq: Drop the 'initialized' field from struct cpufreq_governorRafael J. Wysocki1-3/+0
The 'initialized' field in struct cpufreq_governor is only used by the conservative governor (as a usage counter) and the way that happens is far from straightforward and arguably incorrect. Namely, the value of 'initialized' is checked by cpufreq_dbs_governor_init() and cpufreq_dbs_governor_exit() and the results of those checks are passed (as the second argument) to the ->init() and ->exit() callbacks in struct dbs_governor. Those callbacks are only implemented by the ondemand and conservative governors and ondemand doesn't use their second argument at all. In turn, the conservative governor uses it to decide whether or not to either register or unregister a transition notifier. That whole mechanism is not only unnecessarily convoluted, but also racy, because the 'initialized' field of struct cpufreq_governor is updated in cpufreq_init_governor() and cpufreq_exit_governor() under policy->rwsem which doesn't help if one of these functions is run twice in parallel for different policies (which isn't impossible in principle), for example. Instead of it, add a proper usage counter to the conservative governor and update it from cs_init() and cs_exit() which is guaranteed to be non-racy, as those functions are only called under gov_dbs_data_mutex which is global. With that in place, drop the 'initialized' field from struct cpufreq_governor as it is not used any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-06-03cpufreq: governor: Get rid of governor eventsRafael J. Wysocki1-11/+20
The design of the cpufreq governor API is not very straightforward, as struct cpufreq_governor provides only one callback to be invoked from different code paths for different purposes. The purpose it is invoked for is determined by its second "event" argument, causing it to act as a "callback multiplexer" of sorts. Unfortunately, that leads to extra complexity in governors, some of which implement the ->governor() callback as a switch statement that simply checks the event argument and invokes a separate function to handle that specific event. That extra complexity can be eliminated by replacing the all-purpose ->governor() callback with a family of callbacks to carry out specific governor operations: initialization and exit, start and stop and policy limits updates. That also turns out to reduce the code size too, so do it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-06-01cpufreq: Fix clamp_val() usage in cpufreq_driver_fast_switch()Rafael J. Wysocki1-1/+1
The return value of clamp_val() has to be stored actually. Fixes: b7898fda5bc7 (cpufreq: Support for fast frequency switching) Reported-by: Steve Muckle <steve.muckle@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-05-30cpufreq: Split cpufreq_governor() into simpler functionsRafael J. Wysocki1-43/+70
The cpufreq_governor() routine is used by the cpufreq core to invoke the current governor's ->governor() callback with appropriate arguments and do some housekeeping related to that. Unfortunately, the way it mixes different governor events in one code path makes it rather hard to follow the code. For this reason, split cpufreq_governor() into five simpler functions that each will handle just one specific governor event and put all of the code related to the given event into its own function. This change is a prerequisite for a redesign of the cpufreq governor API that will be done subsequently. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-05-30cpufreq: governor: Check transition latecy at init time onlyRafael J. Wysocki1-13/+14
It is not necessary to check the governor's max_transition_latency attribute every time cpufreq_governor() runs, so check it only if the event argument is CPUFREQ_GOV_POLICY_INIT. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-05-30cpufreq: governor: CPUFREQ_GOV_LIMITS never failsRafael J. Wysocki1-2/+7
None of the cpufreq governors currently in the tree will ever fail an invocation of the ->governor() callback with the event argument equal to CPUFREQ_GOV_LIMITS (unless invoked with incorrect arguments which doesn't matter anyway) and had it ever failed, the result of it wouldn't have been very clean. For this reason, rearrange the code in the core to ignore the return value of cpufreq_governor() when called with event equal to CPUFREQ_GOV_LIMITS. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-05-18cpufreq: simplified goto out in cpufreq_register_driver()Pankaj Gupta1-5/+4
simplified goto out in cpufreq_register_driver for increasing code readability Signed-off-by: Pankaj Gupta <pankaj.gupta@spreadtrum.com> Signed-off-by: Sanjeev Yadav <sanjeev.yadav@spreadtrum.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-18cpufreq: governor: CPUFREQ_GOV_STOP never failsRafael J. Wysocki1-29/+11
None of the cpufreq governors currently in the tree will ever fail an invocation of the ->governor() callback with the event argument equal to CPUFREQ_GOV_STOP (unless invoked with incorrect arguments which doesn't matter anyway) and it is rather difficult to imagine a valid reason for such a failure. Accordingly, rearrange the code in the core to make it clear that this call never fails. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-05-18cpufreq: governor: CPUFREQ_GOV_POLICY_EXIT never failsRafael J. Wysocki1-23/+12
None of the cpufreq governors currently in the tree will ever fail an invocation of the ->governor() callback with the event argument equal to CPUFREQ_GOV_POLICY_EXIT (unless invoked with incorrect arguments which doesn't matter anyway) and it wouldn't really make sense to fail it, because the caller won't be able to handle that failure in a meaningful way. Accordingly, rearrange the code in the core to make it clear that this call never fails. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-05-06Merge cpufreq fixes going into v4.6.Rafael J. Wysocki1-11/+15
* pm-cpufreq-fixes: intel_pstate: Fix intel_pstate_get() cpufreq: intel_pstate: Fix HWP on boot CPU after system resume cpufreq: st: enable selective initialization based on the platform cpufreq: intel_pstate: Fix processing for turbo activation ratio
2016-05-02cpufreq: intel_pstate: Fix HWP on boot CPU after system resumeRafael J. Wysocki1-11/+15
Commit 41cfd64cf49fc "Update frequencies of policy->cpus only from ->set_policy()" changed the way the intel_pstate driver's ->set_policy callback updates the HWP (hardware-managed P-states) settings. A side effect of it is that if those settings are modified on the boot CPU during system suspend and wakeup, they will never be restored during subsequent system resume. To address this problem, allow cpufreq drivers that don't provide ->target or ->target_index callbacks to use ->suspend and ->resume callbacks and add a ->resume callback to intel_pstate to restore the HWP settings on the CPUs that belong to the given policy. Fixes: 41cfd64cf49fc "Update frequencies of policy->cpus only from ->set_policy()" Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-04-25Merge back cpufreq changes for v4.7.Rafael J. Wysocki1-23/+141
2016-04-19cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended setRafael J. Wysocki1-0/+3
Since governor operations are generally skipped if cpufreq_suspended is set, cpufreq_start_governor() should do nothing in that case. That function is called in the cpufreq_online() path, and may also be called from cpufreq_offline() in some cases, which are invoked by the nonboot CPUs disabing/enabling code during system suspend to RAM and resume. That happens when all devices have been suspended, so if the cpufreq driver relies on things like I2C to get the current frequency, it may not be ready to do that then. To prevent problems from happening for this reason, make cpufreq_update_current_freq(), which is the only function invoked by cpufreq_start_governor() that doesn't check cpufreq_suspended already, return 0 upfront if cpufreq_suspended is set. Fixes: 3bbf8fe3ae08 (cpufreq: Always update current frequency before startig governor) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-04-09cpufreq: Rearrange cpufreq_add_dev()Rafael J. Wysocki1-14/+12
Reorganize the code in cpufreq_add_dev() to avoid using the ret variable and reduce the indentation level in it. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-04-09cpufreq: Simplify switch () in cpufreq_cpu_callback()Rafael J. Wysocki1-4/+1
Merge two switch entries that do the same thing in cpufreq_cpu_callback(). No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-04-08cpufreq: Call cpufreq_disable_fast_switch() in sugov_exit()Rafael J. Wysocki1-8/+11
Due to differences in the cpufreq core's handling of runtime CPU offline and nonboot CPUs disabling during system suspend-to-RAM, fast frequency switching gets disabled after a suspend-to-RAM and resume cycle on all of the nonboot CPUs. To prevent that from happening, move the invocation of cpufreq_disable_fast_switch() from cpufreq_exit_governor() to sugov_exit(), as the schedutil governor is the only user of fast frequency switching today anyway. That simply prevents cpufreq_disable_fast_switch() from being called without invoking the ->governor callback for the CPUFREQ_GOV_POLICY_EXIT event (which happens during system suspend now). Fixes: b7898fda5bc7 (cpufreq: Support for fast frequency switching) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-04-02cpufreq: Support for fast frequency switchingRafael J. Wysocki1-5/+125
Modify the ACPI cpufreq driver to provide a method for switching CPU frequencies from interrupt context and update the cpufreq core to support that method if available. Introduce a new cpufreq driver callback, ->fast_switch, to be invoked for frequency switching from interrupt context by (future) governors supporting that feature via (new) helper function cpufreq_driver_fast_switch(). Add two new policy flags, fast_switch_possible, to be set by the cpufreq driver if fast frequency switching can be used for the given policy and fast_switch_enabled, to be set by the governor if it is going to use fast frequency switching for the given policy. Also add a helper for setting the latter. Since fast frequency switching is inherently incompatible with cpufreq transition notifiers, make it possible to set the fast_switch_enabled only if there are no transition notifiers already registered and make the registration of new transition notifiers fail if fast_switch_enabled is set for at least one policy. Implement the ->fast_switch callback in the ACPI cpufreq driver and make it set fast_switch_possible during policy initialization as appropriate. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-23cpufreq: Always update current frequency before startig governorRafael J. Wysocki1-11/+3
Make policy->cur match the current frequency returned by the driver's ->get() callback before starting the governor in case they went out of sync in the meantime and drop the piece of code attempting to resync policy->cur with the real frequency of the boot CPU from cpufreq_resume() as it serves no purpose any more (and it's racy and super-ugly anyway). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-23cpufreq: Introduce cpufreq_update_current_freq()Rafael J. Wysocki1-9/+19
Move the part of cpufreq_update_policy() that obtains the current frequency from the driver and updates policy->cur if necessary to a separate function, cpufreq_get_current_freq(). That should not introduce functional changes and subsequent change set will need it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-23cpufreq: Introduce cpufreq_start_governor()Rafael J. Wysocki1-22/+22
Starting a governor in cpufreq always follows the same pattern involving two calls to cpufreq_governor(), one with the event argument set to CPUFREQ_GOV_START and one with that argument set to CPUFREQ_GOV_LIMITS. Introduce cpufreq_start_governor() that will carry out those two operations and make all places where governors are started use it. That slightly modifies the behavior of cpufreq_set_policy() which now also will go back to the old governor if the second call to cpufreq_governor() (the one with event equal to CPUFREQ_GOV_LIMITS) fails, but that really is how it should work in the first place. Also cpufreq_resume() will now pring an error message if the CPUFREQ_GOV_LIMITS call to cpufreq_governor() fails, but that makes it follow cpufreq_add_policy_cpu() and cpufreq_offline() in that respect. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2016-03-18cpufreq: Make cpufreq_quick_get() safe to callRichard Cochran1-2/+10
The function, cpufreq_quick_get, accesses the global 'cpufreq_driver' and its fields without taking the associated lock, cpufreq_driver_lock. Without the locking, nothing guarantees that 'cpufreq_driver' remains consistent during the call. This patch fixes the issue by taking the lock before accessing the data structure. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-10Merge branch 'pm-cpufreq-governor' into pm-cpufreqRafael J. Wysocki1-98/+67
2016-03-10cpufreq: Move scheduler-related code to the sched directoryRafael J. Wysocki1-53/+0
Create cpufreq.c under kernel/sched/ and move the cpufreq code related to the scheduler to that file and to sched.h. Redefine cpufreq_update_util() as a static inline function to avoid function calls at its call sites in the scheduler code (as suggested by Peter Zijlstra). Also move the definition of struct update_util_data and declaration of cpufreq_set_update_util_data() from include/linux/cpufreq.h to include/linux/sched.h. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-03-09Revert "cpufreq: postfix policy directory with the first CPU in related_cpus"Viresh Kumar1-11/+10
Revert commit 3510fac45492 (cpufreq: postfix policy directory with the first CPU in related_cpus). Earlier, the policy->kobj was added to the kobject core, before ->init() callback was called for the cpufreq drivers. Which allowed those drivers to add or remove, driver dependent, sysfs files/directories to the same kobj from their ->init() and ->exit() callbacks. That isn't possible anymore after commit 3510fac45492. Now, there is no other clean alternative that people can adopt. Its better to revert the earlier commit to allow cpufreq drivers to create/remove sysfs files from ->init() and ->exit() callbacks. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09cpufreq: Reduce cpufreq_update_util() overhead a bitRafael J. Wysocki1-8/+17
Use the observation that cpufreq_update_util() is only called by the scheduler with rq->lock held, so the callers of cpufreq_set_update_util_data() can use synchronize_sched() instead of synchronize_rcu() to wait for cpufreq_update_util() to complete. Moreover, if they are updated to do that, rcu_read_(un)lock() calls in cpufreq_update_util() might be replaced with rcu_read_(un)lock_sched(), respectively, but those aren't really necessary, because the scheduler calls that function from RCU-sched read-side critical sections already. In addition to that, if cpufreq_set_update_util_data() checks the func field in the struct update_util_data before setting the per-CPU pointer to it, the data->func check may be dropped from cpufreq_update_util() as well. Make the above changes to reduce the overhead from cpufreq_update_util() in the scheduler paths invoking it and to make the cleanup after removing its callbacks less heavy-weight somewhat. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2016-03-09cpufreq: Remove 'policy->governor_enabled'Viresh Kumar1-17/+0
The entire sequence of events (like INIT/START or STOP/EXIT) for which cpufreq_governor() is called, is guaranteed to be protected by policy->rwsem now. The additional checks that were added earlier (as we were forced to drop policy->rwsem before calling cpufreq_governor() for EXIT event), aren't required anymore. Over that, they weren't sufficient really. They just take care of START/STOP events, but not INIT/EXIT and the state machine was never maintained properly by them. Kill the unnecessary checks and policy->governor_enabled field. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09cpufreq: Rename __cpufreq_governor() to cpufreq_governor()Viresh Kumar1-23/+21
The __ at the beginning of the routine aren't really necessary at all. Rename it to cpufreq_governor() instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09cpufreq: Relocate handle_update() to kill its declarationViresh Kumar1-10/+9
handle_update() is declared at the top of the file as its user appear before its definition. Relocate the routine to get rid of this. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09cpufreq: Remove cpufreq_governor_lockViresh Kumar1-8/+0
We used to drop policy->rwsem just before calling __cpufreq_governor() in some cases earlier and so it was possible that __cpufreq_governor() ran concurrently via separate threads for the same policy. In order to guarantee valid state transitions for governors, 'governor_enabled' was required to be protected using some locking and cpufreq_governor_lock was added for that. But now __cpufreq_governor() is always called under policy->rwsem, and 'governor_enabled' is protected against races even without cpufreq_governor_lock. Get rid of the extra lock now. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw : Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09cpufreq: Call __cpufreq_governor() with policy->rwsem heldViresh Kumar1-16/+33
The cpufreq core code is not consistent with respect to invoking __cpufreq_governor() under policy->rwsem. Changing all code to always hold policy->rwsem around __cpufreq_governor() invocations will allow us to remove cpufreq_governor_lock that is used today because we can't guarantee that __cpufreq_governor() isn't executed twice in parallel for the same policy. We should also ensure that policy->rwsem is held across governor state changes. For example, while adding a CPU to the policy in the CPU online path, we need to stop the governor, change policy->cpus, start the governor and then refresh its limits. The complete sequence must be guaranteed to complete without interruptions by concurrent governor state updates. That can be achieved by holding policy->rwsem around those sequences of operations. Also note that after this patch cpufreq_driver->stop_cpu() and ->exit() will get called under policy->rwsem which wasn't the case earlier. That shouldn't have any side effects, though. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-09cpufreq: Merge cpufreq_offline_prepare/finish routinesViresh Kumar1-26/+10
Commit 1aee40ac9c86 (cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock) split the cpufreq's CPU offline routine in two pieces, one of them to be run with CPU offline/online locked and the other to be called later. The reason for that split was a possible deadlock scenario involving cpufreq sysfs attributes and CPU offline. However, the handling of CPU offline in cpufreq has changed since then. Policy sysfs attributes are never removed during CPU offline, so there's no need to worry about accessing them during CPU offline, because that can't lead to any deadlocks now. Governor sysfs attributes are still removed in __cpufreq_governor(_EXIT), but there is a new kobject type for them now and its show/store callbacks don't lock CPU offline/online (they don't need to do that). This means that the CPU offline code in cpufreq doesn't need to be split any more, so combine cpufreq_offline_prepare() with cpufreq_offline_finish(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog ] Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>