BMC/Intel-BMC/linux.git - Intel OpenBMC Linux kernel source tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2013-09-30	cpufreq: pmac: use cpufreq_table_validate_and_show()	Viresh Kumar	2	-5/+2
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: pasemi: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-3/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: p4-clockmod: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-2/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: omap: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-3/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: maple: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-3/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: loongson2: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-4/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: John Crispin <blogic@openwrt.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: longhaul: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-7/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: kirkwood: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-9/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: imx6q: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-2/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: ia64-acpi: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-3/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: exynos: use cpufreq_table_validate_and_show()	Viresh Kumar	2	-6/+2
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: elanfreq: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-7/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: e_powersaver: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-2/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: dbx500: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-4/+2
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: davinci: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-4/+2
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: cris: use cpufreq_table_validate_and_show()	Viresh Kumar	2	-18/+2
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Mikael Starvik <starvik@axis.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: cpufreq-cpu0: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-3/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Acked-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: blackfin: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-2/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Cc: Steven Miao <realmz6@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: arm_big_little: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-3/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: acpi-cpufreq: use cpufreq_table_validate_and_show()	Viresh Kumar	1	-3/+1
	Lets use cpufreq_table_validate_and_show() instead of calling cpufreq_frequency_table_cpuinfo() and cpufreq_frequency_table_get_attr(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: sparc: call cpufreq_frequency_table_get_attr()	Viresh Kumar	2	-4/+17
	This exposes frequency table of driver to cpufreq core and is required for core to guess what the index for a target frequency is, when it calls cpufreq_frequency_table_target(). And so this driver needs to expose it. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: s3cx4xx: call cpufreq_frequency_table_get_attr()	Viresh Kumar	2	-1/+6
	This exposes frequency table of driver to cpufreq core and is required for core to guess what the index for a target frequency is, when it calls cpufreq_frequency_table_target(). And so this driver needs to expose it. Cc: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: pxa: call cpufreq_frequency_table_get_attr()	Viresh Kumar	2	-3/+25
	This exposes frequency table of driver to cpufreq core and is required for core to guess what the index for a target frequency is, when it calls cpufreq_frequency_table_target(). And so this driver needs to expose it. Cc: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: Add new helper cpufreq_table_validate_and_show()	Viresh Kumar	1	-0/+12
	Almost every cpufreq driver is required to validate its frequency table with: cpufreq_frequency_table_cpuinfo() and then expose it to cpufreq core with: cpufreq_frequency_table_get_attr(). This patch creates another helper routine cpufreq_table_validate_and_show() that will do both these steps in a single call and will return 0 for success, error otherwise. This also fixes potential bugs in cpufreq drivers where people have called cpufreq_frequency_table_get_attr() before calling cpufreq_frequency_table_cpuinfo(), as the later may fail. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: cpufreq-cpu0: NULL is a valid regulator, part 2	Philipp Zabel	1	-1/+1
	Since the patch "cpufreq: cpufreq-cpu0: NULL is a valid regulator", cpu_reg contains an error value if the regulator is not set, instead of NULL. Accordingly, fix the remaining check for non-NULL cpu_reg. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30	cpufreq: SPEAr: Fix incorrect variable type	Sachin Kamat	1	-1/+1
	'clk_round_rate' returns a negative error code upon failure. This will never get detected by unsigned 'newfreq'. Make it signed. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-25	cpufreq: exynos5440: Fix potential NULL pointer dereference	Sachin Kamat	1	-1/+1
	If 'dvfs_info' is NULL (due to devm_kzalloc failure) the failure error message would try to dereference it. Use 'pdev' instead. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-25	cpufreq: check cpufreq driver is valid and cpufreq isn't disabled in ↵	Viresh Kumar	1	-0/+3
	cpufreq_get() cpufreq_get() can be called from external drivers which might not be aware if cpufreq driver is registered or not. And so we should actually check if cpufreq driver is registered or not and also if cpufreq is active or disabled, at the beginning of cpufreq_get(). Otherwise call to lock_policy_rwsem_read() might hit BUG_ON(!policy). Reported-and-tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-25	acpi-cpufreq: skip loading acpi_cpufreq after intel_pstate	Yinghai Lu	1	-0/+4
	If the hw supports intel_pstate and acpi_cpufreq, intel_pstate will get loaded first. acpi_cpufreq_init() will call acpi_cpufreq_early_init() and that will allocate perf data and init those perf data in ACPI core, (that will cover all CPUs). But later it will free them as cpufreq_register_driver(acpi_cpufreq) will fail as intel_pstate is already registered Use cpufreq_get_current_driver() to check if we can skip the acpi_cpufreq loading. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-24	pcc_freq: convert acpi_get_handle() to acpi_has_method()	Zhang Rui	1	-3/+2
	acpi_has_method() is a new ACPI API introduced to check the existence of an ACPI control method. It can be used to replace acpi_get_handle() in the case that 1. the calling function doesn't need the ACPI handle of the control method. and 2. the calling function doesn't care the reason why the method is unavailable. Convert acpi_get_handle() to acpi_has_method() in drivers/cpufreq/pcc_freq.c in this patch. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-20	cpufreq: return EEXIST instead of EBUSY for second registering	Yinghai Lu	1	-1/+1
	On systems that support intel_pstate, acpi_cpufreq fails to load, and udev keeps trying until trace gets filled up and kernel crashes. The root cause is driver return ret from cpufreq_register_driver(), because when some other driver takes over before, it will return EBUSY and then udev will keep trying ... cpufreq_register_driver() should return EEXIST instead so that the system can boot without appending intel_pstate=disable and still use intel_pstate. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-19	cpufreq: imx6q-cpufreq: assign cpu_dev correctly to cpu0 device	Sudeep KarkadaNagesha	1	-1/+6
	Commit cdc58d602d2e657602a90c190cbf745886c95977 "cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes" assumed the pdev->dev is set to cpu0 device in the platform code. But it actually points to the virtual cpufreq-cpu0 platform device which is not present in the device tree. Most of the information needed by cpufreq is stored in cpu0 DT node. So cpu_dev must point to cpu0 device. This patch fixes the wrong assignment to cpu_dev. Reported-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Tested-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-19	cpufreq: cpufreq-cpu0: assign cpu_dev correctly to cpu0 device	Sudeep KarkadaNagesha	1	-1/+6
	Commit f837a9b5ab05c52a07108c6f09ca66f2e0aee757 "cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes" assumed the pdev->dev is set to cpu0 device in the platform code. But it actually points to the virtual cpufreq-cpu0 platform device which is not present in the device tree. Most of the information needed by cpufreq is stored in cpu0 DT node. So cpu_dev must point to cpu0 device. This patch fixes the wrong assignment to cpu_dev. Reported-and-tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Cc: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-18	cpufreq: unlock correct rwsem while updating policy->cpu	Viresh Kumar	1	-2/+11
	Current code looks like this: WARN_ON(lock_policy_rwsem_write(cpu)); update_policy_cpu(policy, new_cpu); unlock_policy_rwsem_write(cpu); {lock\|unlock}_policy_rwsem_write(cpu) takes/releases policy->cpu's rwsem. Because cpu is changing with the call to update_policy_cpu(), the unlock_policy_rwsem_write() will release the incorrect lock. The right solution would be to release the same lock as was taken earlier. Also update_policy_cpu() was also called from cpufreq_add_dev() without any locks and so its better if we move this locking to inside update_policy_cpu(). This patch fixes a regression introduced in 3.12 by commit f9ba680d23 (cpufreq: Extract the handover of policy cpu to a helper function). Reported-and-tested-by: Jon Medhurst<tixy@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-18	cpufreq: Clear policy->cpus bits in __cpufreq_remove_dev_finish()	Viresh Kumar	1	-8/+8
	This broke after a recent change "cedb70a cpufreq: Split __cpufreq_remove_dev() into two parts" from Srivatsa. Consider a scenario where we have two CPUs in a policy (0 & 1) and we are removing CPU 1. On the call to __cpufreq_remove_dev_prepare() we have cleared 1 from policy->cpus and now on a call to __cpufreq_remove_dev_finish() we read cpumask_weight of policy->cpus, which will come as 1 and this code will behave as if we are removing the last CPU from policy :) Fix it by clearing the CPU mask in __cpufreq_remove_dev_finish() instead of __cpufreq_remove_dev_prepare(). Tested-by: Stephen Warren <swarren@wwwdotorg.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-12	Merge branch 'pm-cpufreq'	Rafael J. Wysocki	1	-17/+32
	* pm-cpufreq: cpufreq: Acquire the lock in cpufreq_policy_restore() for reading cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu cpufreq: Restructure if/else block to avoid unintended behavior cpufreq: Fix crash in cpufreq-stats during suspend/resume
2013-09-12	cpufreq: Acquire the lock in cpufreq_policy_restore() for reading	Lan Tianyu	1	-2/+2
	In cpufreq_policy_restore() before system suspend policy is read from percpu's cpufreq_cpu_data_fallback. It's a read operation rather than a write one, so take the lock for reading in there. Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-12	cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu	Srivatsa S. Bhat	1	-0/+3
	If update_policy_cpu() is invoked with the existing policy->cpu itself as the new-cpu parameter, then a lot of things can go terribly wrong. In its present form, update_policy_cpu() always assumes that the new-cpu is different from policy->cpu and invokes other functions to perform their respective updates. And those functions implement the actual update like this: per_cpu(..., new_cpu) = per_cpu(..., last_cpu); per_cpu(..., last_cpu) = NULL; Thus, when new_cpu == last_cpu, the final NULL assignment makes the per-cpu references vanish into thin air! (memory leak). From there, it leads to more problems: cpufreq_stats_create_table() now doesn't find the per-cpu reference and hence tries to create a new sysfs-group; but sysfs already had created the group earlier, so it complains that it cannot create a duplicate filename. In short, the repercussions of a rather innocuous invocation of update_policy_cpu() can turn out to be pretty nasty. Ideally update_policy_cpu() should handle this situation (new == last) gracefully, and not lead to such severe problems. So fix it by adding an appropriate check. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-12	cpufreq: Restructure if/else block to avoid unintended behavior	Srivatsa S. Bhat	1	-2/+3
	In __cpufreq_remove_dev_prepare(), the code which decides whether to remove the sysfs link or nominate a new policy cpu, is governed by an if/else block with a rather complex set of conditionals. Worse, they harbor a subtlety which leads to certain unintended behavior. The code looks like this: if (cpu != policy->cpu && !frozen) { sysfs_remove_link(&dev->kobj, "cpufreq"); } else if (cpus > 1) { new_cpu = cpufreq_nominate_new_policy_cpu(...); ... update_policy_cpu(..., new_cpu); } The original intention was: If the CPU going offline is not policy->cpu, just remove the link. On the other hand, if the CPU going offline is the policy->cpu itself, handover the policy->cpu job to some other surviving CPU in that policy. But because the 'if' condition also includes the 'frozen' check, now there are two possibilities by which we can enter the 'else' block: 1. cpu == policy->cpu (intended) 2. cpu != policy->cpu && frozen (unintended) Due to the second (unintended) scenario, we end up spuriously nominating a CPU as the policy->cpu, even when the existing policy->cpu is alive and well. This can cause problems further down the line, especially when we end up nominating the same policy->cpu as the new one (ie., old == new), because it totally confuses update_policy_cpu(). To avoid this mess, restructure the if/else block to only do what was originally intended, and thus prevent any unwelcome surprises. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-12	cpufreq: Fix crash in cpufreq-stats during suspend/resume	Srivatsa S. Bhat	1	-13/+24
	Stephen Warren reported that the cpufreq-stats code hits a NULL pointer dereference during the second attempt to suspend a system. He also pin-pointed the problem to commit 5302c3f "cpufreq: Perform light-weight init/teardown during suspend/resume". That commit actually ensured that the cpufreq-stats table and the cpufreq-stats sysfs entries are not torn down (ie., not freed) during suspend/resume, which makes it all the more surprising. However, it turns out that the root-cause is not that we access an already freed memory, but that the reference to the allocated memory gets moved around and we lose track of that during resume, leading to the reported crash in a subsequent suspend attempt. In the suspend path, during CPU offline, the value of policy->cpu is updated by choosing one of the surviving CPUs in that policy, as long as there is atleast one CPU in that policy. And cpufreq_stats_update_policy_cpu() is invoked to update the reference to the stats structure by assigning it to the new CPU. However, in the resume path, during CPU online, we end up assigning a fresh CPU as the policy->cpu, without letting cpufreq-stats know about this. Thus the reference to the stats structure remains (incorrectly) associated with the old CPU. So, in a subsequent suspend attempt, during CPU offline, we end up accessing an incorrect location to get the stats structure, which eventually leads to the NULL pointer dereference. Fix this by letting cpufreq-stats know about the update of the policy->cpu during CPU online in the resume path. (Also, move the update_policy_cpu() function higher up in the file, so that __cpufreq_add_dev() can invoke it). Reported-and-tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-11	Merge branch 'pm-cpufreq'	Rafael J. Wysocki	3	-34/+76
	* pm-cpufreq: intel_pstate: Add Haswell CPU models Revert "cpufreq: make sure frequency transitions are serialized" cpufreq: Use signed type for 'ret' variable, to store negative error values cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock cpufreq: Split __cpufreq_remove_dev() into two parts cpufreq: Fix wrong time unit conversion cpufreq: serialize calls to __cpufreq_governor() cpufreq: don't allow governor limits to be changed when it is disabled
2013-09-11	intel_pstate: Add Haswell CPU models	Nell Hardcastle	1	-0/+5
	Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge model (0x3E) is also included. Models referenced from tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit Signed-off-by: Nell Hardcastle <nell@spicious.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10	Revert "cpufreq: make sure frequency transitions are serialized"	Rafael J. Wysocki	1	-15/+0
	Commit 7c30ed5 (cpufreq: make sure frequency transitions are serialized) attempted to serialize frequency transitions by adding checks to the CPUFREQ_PRECHANGE and CPUFREQ_POSTCHANGE notifications. However, it assumed that the notifications will always originate from the driver's .target() callback, but they also can be triggered by cpufreq_out_of_sync() and that leads to warnings like this on some systems: WARNING: CPU: 0 PID: 14543 at drivers/cpufreq/cpufreq.c:317 __cpufreq_notify_transition+0x238/0x260() In middle of another frequency transition accompanied by a call trace similar to this one: [<ffffffff81720daa>] dump_stack+0x46/0x58 [<ffffffff8106534c>] warn_slowpath_common+0x8c/0xc0 [<ffffffff815b8560>] ? acpi_cpufreq_target+0x320/0x320 [<ffffffff81065436>] warn_slowpath_fmt+0x46/0x50 [<ffffffff815b1ec8>] __cpufreq_notify_transition+0x238/0x260 [<ffffffff815b33be>] cpufreq_notify_transition+0x3e/0x70 [<ffffffff815b345d>] cpufreq_out_of_sync+0x6d/0xb0 [<ffffffff815b370c>] cpufreq_update_policy+0x10c/0x160 [<ffffffff815b3760>] ? cpufreq_update_policy+0x160/0x160 [<ffffffff81413813>] cpufreq_set_cur_state+0x8c/0xb5 [<ffffffff814138df>] processor_set_cur_state+0xa3/0xcf [<ffffffff8158e13c>] thermal_cdev_update+0x9c/0xb0 [<ffffffff8159046a>] step_wise_throttle+0x5a/0x90 [<ffffffff8158e21f>] handle_thermal_trip+0x4f/0x140 [<ffffffff8158e377>] thermal_zone_device_update+0x57/0xa0 [<ffffffff81415b36>] acpi_thermal_check+0x2e/0x30 [<ffffffff81415ca0>] acpi_thermal_notify+0x40/0xdc [<ffffffff813e7dbd>] acpi_device_notify+0x19/0x1b [<ffffffff813f8241>] acpi_ev_notify_dispatch+0x41/0x5c [<ffffffff813e3fbe>] acpi_os_execute_deferred+0x25/0x32 [<ffffffff81081060>] process_one_work+0x170/0x4a0 [<ffffffff81082121>] worker_thread+0x121/0x390 [<ffffffff81082000>] ? manage_workers.isra.20+0x170/0x170 [<ffffffff81088fe0>] kthread+0xc0/0xd0 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0 [<ffffffff8173582c>] ret_from_fork+0x7c/0xb0 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0 For this reason, revert commit 7c30ed5 along with the fix 266c13d (cpufreq: Fix serialization of frequency transitions) on top of it and we will revisit the serialization problem later. Reported-by: Alessandro Bono <alessandro.bono@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10	cpufreq: Use signed type for 'ret' variable, to store negative error values	Srivatsa S. Bhat	1	-2/+2
	There are places where the variable 'ret' is declared as unsigned int and then used to store negative return values such as -EINVAL. Fix them by declaring the variable as a signed quantity. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10	cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes	Srivatsa S. Bhat	1	-6/+1
	Commit "cpufreq: serialize calls to __cpufreq_governor()" had been a temporary and partial solution to the race condition between writing to a cpufreq sysfs file and taking a CPU offline. Now that we have a proper and complete solution to that problem, remove the temporary fix. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10	cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug	Srivatsa S. Bhat	1	-2/+9
	The functions that are used to write to cpufreq sysfs files (such as store_scaling_max_freq()) are not hotplug safe. They can race with CPU hotplug tasks and lead to problems such as trying to acquire an already destroyed timer-mutex etc. Eg: __cpufreq_remove_dev() __cpufreq_governor(policy, CPUFREQ_GOV_STOP); policy->governor->governor(policy, CPUFREQ_GOV_STOP); cpufreq_governor_dbs() case CPUFREQ_GOV_STOP: mutex_destroy(&cpu_cdbs->timer_mutex) cpu_cdbs->cur_policy = NULL; <PREEMPT> store() __cpufreq_set_policy() __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); policy->governor->governor(policy, CPUFREQ_GOV_LIMITS); case CPUFREQ_GOV_LIMITS: mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex) if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL So use get_online_cpus()/put_online_cpus() in the store_() functions, to synchronize with CPU hotplug. However, there is an additional point to note here: some parts of the CPU teardown in the cpufreq subsystem are done in the CPU_POST_DEAD stage, with cpu_hotplug.lock released*. So, using the get/put_online_cpus() functions alone is insufficient; we should also ensure that we don't race with those latter steps in the hotplug sequence. We can easily achieve this by checking if the CPU is online before proceeding with the store, since the CPU would have been marked offline by the time the CPU_POST_DEAD notifiers are executed. Reported-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10	cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock	Srivatsa S. Bhat	1	-0/+3
	__cpufreq_remove_dev_finish() handles the kobject cleanup for a CPU going offline. But because we destroy the kobject towards the end of the CPU offline phase, there are certain race windows where a task can try to write to a cpufreq sysfs file (eg: using store_scaling_max_freq()) while we are taking that CPU offline, and this can bump up the kobject refcount, which in turn might hinder the CPU offline task from running to completion. (It can also cause other more serious problems such as trying to acquire a destroyed timer-mutex etc., depending on the exact stage of the cleanup at which the task managed to take a new refcount). To fix the race window, we will need to synchronize those store_() call-sites with CPU hotplug, using get_online_cpus()/put_online_cpus(). However, that in turn can cause a total deadlock because it can end up waiting for the CPU offline task to complete, with incremented refcount! Write to sysfs CPU offline task -------------- ---------------- kobj_refcnt++ Acquire cpu_hotplug.lock get_online_cpus(); Wait for kobj_refcnt to drop to zero DEADLOCK* A simple way to avoid this problem is to perform the kobject cleanup in the CPU offline path, with the cpu_hotplug.lock released. That is, we can perform the wait-for-kobj-refcnt-to-drop as well as the subsequent cleanup in the CPU_POST_DEAD stage of CPU offline, which is run with cpu_hotplug.lock released. Doing this helps us avoid deadlocks due to holding kobject refcounts and waiting on each other on the cpu_hotplug.lock. (Note: We can't move all of the cpufreq CPU offline steps to the CPU_POST_DEAD stage, because certain things such as stopping the governors have to be done before the outgoing CPU is marked offline. So retain those parts in the CPU_DOWN_PREPARE stage itself). Reported-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10	cpufreq: Split __cpufreq_remove_dev() into two parts	Srivatsa S. Bhat	1	-12/+53
	During CPU offline, the cpufreq core invokes __cpufreq_remove_dev() to perform work such as stopping the cpufreq governor, clearing the CPU from the policy structure etc, and finally cleaning up the kobject. There are certain subtle issues related to the kobject cleanup, and it would be much easier to deal with them if we separate that part from the rest of the cleanup-work in the CPU offline phase. So split the __cpufreq_remove_dev() function into 2 parts: one that handles the kobject cleanup, and the other that handles the rest of the work. Reported-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10	cpufreq: Fix wrong time unit conversion	Andreas Schwab	1	-1/+1
	The time spent by a CPU under a given frequency is stored in jiffies unit in the cpu var cpufreq_stats_table->time_in_state[i], i being the index of the frequency. This is what is displayed in the following file on the right column: cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state 2301000 19835820 2300000 3172 [...] Now cpufreq converts this jiffies unit delta to clock_t before returning it to the user as in the above file. And that conversion is achieved using the API cputime64_to_clock_t(). Although it accidentally works on traditional tick based cputime accounting, where cputime_t maps directly to jiffies, it doesn't work with other types of cputime accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs or any granularity preffered by the architecture. For example we get a buggy zero delta on full dyntick configurations: cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state 2301000 0 2300000 0 [...] Fix this with using the proper jiffies_64_t to clock_t conversion. Reported-and-tested-by: Carsten Emde <C.Emde@osadl.org> Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10	cpufreq: serialize calls to __cpufreq_governor()	Viresh Kumar	1	-1/+6
	We can't take a big lock around __cpufreq_governor() as this causes recursive locking for some cases. But calls to this routine must be serialized for every policy. Otherwise we can see some unpredictable events. For example, consider following scenario: __cpufreq_remove_dev() __cpufreq_governor(policy, CPUFREQ_GOV_STOP); policy->governor->governor(policy, CPUFREQ_GOV_STOP); cpufreq_governor_dbs() case CPUFREQ_GOV_STOP: mutex_destroy(&cpu_cdbs->timer_mutex) cpu_cdbs->cur_policy = NULL; <PREEMPT> store() __cpufreq_set_policy() __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); policy->governor->governor(policy, CPUFREQ_GOV_LIMITS); case CPUFREQ_GOV_LIMITS: mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex) if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL And so store() will eventually result in a crash if cur_policy is NULL at this point. Introduce an additional variable which would guarantee serialization here. Reported-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>