diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2020-01-27 22:23:54 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2020-01-27 22:23:54 +0300 |
commit | 6d277aca488fdf0a1e67cd14b5a58869f66197c9 (patch) | |
tree | 2ed50bf4bb32092a9a95e7952533cbde98baeb24 | |
parent | aae1464f46a2403565f75717438118691d31ccf1 (diff) | |
parent | c102671af085aacf17219e9bdcfccddc6620a866 (diff) | |
download | linux-6d277aca488fdf0a1e67cd14b5a58869f66197c9.tar.xz |
Merge tag 'pm-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add ACPI support to the intel_idle driver along with an admin
guide document for it, add support for CPR (Core Power Reduction) to
the AVS (Adaptive Voltage Scaling) subsystem, add new hardware support
in a few places, add some new sysfs attributes, debugfs files and
tracepoints, fix bugs and clean up a bunch of things all over.
Specifics:
- Update the ACPI processor driver in order to export
acpi_processor_evaluate_cst() to the code outside of it, add ACPI
support to the intel_idle driver based on that and clean up that
driver somewhat (Rafael Wysocki).
- Add an admin guide document for the intel_idle driver (Rafael
Wysocki).
- Clean up cpuidle core and drivers, enable compilation testing for
some of them (Benjamin Gaignard, Krzysztof Kozlowski, Rafael
Wysocki, Yangtao Li).
- Fix reference counting of OPP (operating performance points) table
structures (Viresh Kumar).
- Add support for CPR (Core Power Reduction) to the AVS (Adaptive
Voltage Scaling) subsystem (Niklas Cassel, Colin Ian King,
YueHaibing).
- Add support for TigerLake Mobile and JasperLake to the Intel RAPL
power capping driver (Zhang Rui).
- Update cpufreq drivers:
- Add i.MX8MP support to imx-cpufreq-dt (Anson Huang).
- Fix usage of a macro in loongson2_cpufreq (Alexandre Oliva).
- Fix cpufreq policy reference counting issues in s3c and
brcmstb-avs (chenqiwu).
- Fix ACPI table reference counting issue and HiSilicon quirk
handling in the CPPC driver (Hanjun Guo).
- Clean up spelling mistake in intel_pstate (Harry Pan).
- Convert the kirkwood and tegra186 drivers to using
devm_platform_ioremap_resource() (Yangtao Li).
- Update devfreq core:
- Add 'name' sysfs attribute for devfreq devices (Chanwoo Choi).
- Clean up the handing of transition statistics and allow them to
be reset by writing 0 to the 'trans_stat' devfreq device
attribute in sysfs (Kamil Konieczny).
- Add 'devfreq_summary' to debugfs (Chanwoo Choi).
- Clean up kerneldoc comments and Kconfig indentation (Krzysztof
Kozlowski, Randy Dunlap).
- Update devfreq drivers:
- Add dynamic scaling for the imx8m DDR controller and clean up
imx8m-ddrc (Leonard Crestez, YueHaibing).
- Fix DT node reference counting and nitialization error code path
in rk3399_dmc and add COMPILE_TEST and HAVE_ARM_SMCCC dependency
for it (Chanwoo Choi, Yangtao Li).
- Fix DT node reference counting in rockchip-dfi and make it use
devm_platform_ioremap_resource() (Yangtao Li).
- Fix excessive stack usage in exynos-ppmu (Arnd Bergmann).
- Fix initialization error code paths in exynos-bus (Yangtao Li).
- Clean up exynos-bus and exynos somewhat (Artur Świgoń, Krzysztof
Kozlowski).
- Add tracepoints for tracking usage_count updates unrelated to
status changes in PM-runtime (Michał Mirosław).
- Add sysfs attribute to control the "sync on suspend" behavior
during system-wide suspend (Jonas Meurer).
- Switch system-wide suspend tests over to 64-bit time (Alexandre
Belloni).
- Make wakeup sources statistics in debugfs cover deleted ones which
used to be the case some time ago (zhuguangqing).
- Clean up computations carried out during hibernation, update
messages related to hibernation and fix a spelling mistake in one
of them (Wen Yang, Luigi Semenzato, Colin Ian King).
- Add mailmap entry for maintainer e-mail address that has not been
functional for several years (Rafael Wysocki)"
* tag 'pm-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (83 commits)
cpufreq: loongson2_cpufreq: adjust cpufreq uses of LOONGSON_CHIPCFG
intel_idle: Clean up irtl_2_usec()
intel_idle: Move 3 functions closer to their callers
intel_idle: Annotate initialization code and data structures
intel_idle: Move and clean up intel_idle_cpuidle_devices_uninit()
intel_idle: Rearrange intel_idle_cpuidle_driver_init()
intel_idle: Clean up NULL pointer check in intel_idle_init()
intel_idle: Fold intel_idle_probe() into intel_idle_init()
intel_idle: Eliminate __setup_broadcast_timer()
cpuidle: fix cpuidle_find_deepest_state() kerneldoc warnings
cpuidle: sysfs: fix warnings when compiling with W=1
cpuidle: coupled: fix warnings when compiling with W=1
cpufreq: brcmstb-avs: fix imbalance of cpufreq policy refcount
PM: suspend: Add sysfs attribute to control the "sync on suspend" behavior
PM / devfreq: Add debugfs support with devfreq_summary file
Documentation: admin-guide: PM: Add intel_idle document
cpuidle: arm: Enable compile testing for some of drivers
PM-runtime: add tracepoints for usage_count changes
cpufreq: intel_pstate: fix spelling mistake: "Whethet" -> "Whether"
PM: hibernate: fix spelling mistake "shapshot" -> "snapshot"
...
65 files changed, 3829 insertions, 606 deletions
@@ -217,6 +217,7 @@ Praveen BP <praveenbp@ti.com> Punit Agrawal <punitagrawal@gmail.com> <punit.agrawal@arm.com> Qais Yousef <qsyousef@gmail.com> <qais.yousef@imgtec.com> Quentin Perret <qperret@qperret.net> <quentin.perret@arm.com> +Rafael J. Wysocki <rjw@rjwysocki.net> <rjw@sisk.pl> Rajesh Shah <rajesh.shah@intel.com> Ralf Baechle <ralf@linux-mips.org> Ralf Wildenhues <Ralf.Wildenhues@gmx.de> diff --git a/Documentation/ABI/testing/sysfs-class-devfreq b/Documentation/ABI/testing/sysfs-class-devfreq index 01196e19afca..9758eb85ade3 100644 --- a/Documentation/ABI/testing/sysfs-class-devfreq +++ b/Documentation/ABI/testing/sysfs-class-devfreq @@ -7,6 +7,13 @@ Description: The name of devfreq object denoted as ... is same as the name of device using devfreq. +What: /sys/class/devfreq/.../name +Date: November 2019 +Contact: Chanwoo Choi <cw00.choi@samsung.com> +Description: + The /sys/class/devfreq/.../name shows the name of device + of the corresponding devfreq object. + What: /sys/class/devfreq/.../governor Date: September 2011 Contact: MyungJoo Ham <myungjoo.ham@samsung.com> @@ -48,12 +55,15 @@ What: /sys/class/devfreq/.../trans_stat Date: October 2012 Contact: MyungJoo Ham <myungjoo.ham@samsung.com> Description: - This ABI shows the statistics of devfreq behavior on a - specific device. It shows the time spent in each state and - the number of transitions between states. + This ABI shows or clears the statistics of devfreq behavior + on a specific device. It shows the time spent in each state + and the number of transitions between states. In order to activate this ABI, the devfreq target device driver should provide the list of available frequencies - with its profile. + with its profile. If need to reset the statistics of devfreq + behavior on a specific device, enter 0(zero) to 'trans_stat' + as following: + echo 0 > /sys/class/devfreq/.../trans_stat What: /sys/class/devfreq/.../userspace/set_freq Date: September 2011 diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index fc20cde63d1e..2e0e3b45d02a 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -196,6 +196,12 @@ Description: does not reflect it. Likewise, if one enables a deep state but a lighter state still is disabled, then this has no effect. +What: /sys/devices/system/cpu/cpuX/cpuidle/stateN/default_status +Date: December 2019 +KernelVersion: v5.6 +Contact: Linux power management list <linux-pm@vger.kernel.org> +Description: + (RO) The default status of this state, "enabled" or "disabled". What: /sys/devices/system/cpu/cpuX/cpuidle/stateN/residency Date: March 2014 diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power index 6f87b9dd384b..5e6ead29124c 100644 --- a/Documentation/ABI/testing/sysfs-power +++ b/Documentation/ABI/testing/sysfs-power @@ -407,3 +407,16 @@ Contact: Kalesh Singh <kaleshsingh96@gmail.com> Description: The /sys/power/suspend_stats/last_failed_step file contains the last failed step in the suspend/resume path. + +What: /sys/power/sync_on_suspend +Date: October 2019 +Contact: Jonas Meurer <jonas@freesources.org> +Description: + This file controls whether or not the kernel will sync() + filesystems during system suspend (after freezing user space + and before suspending devices). + + Writing a "1" to this file enables the sync() and writing a "0" + disables it. Reads from the file return the current value. + The default is "1" if the build-time "SUSPEND_SKIP_SYNC" config + flag is unset, or "0" otherwise. diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst index e70b365dbc60..311cd7cc2b75 100644 --- a/Documentation/admin-guide/pm/cpuidle.rst +++ b/Documentation/admin-guide/pm/cpuidle.rst @@ -506,6 +506,9 @@ object corresponding to it, as follows: ``disable`` Whether or not this idle state is disabled. +``default_status`` + The default status of this state, "enabled" or "disabled". + ``latency`` Exit latency of the idle state in microseconds. diff --git a/Documentation/admin-guide/pm/intel_idle.rst b/Documentation/admin-guide/pm/intel_idle.rst new file mode 100644 index 000000000000..afbf778035f8 --- /dev/null +++ b/Documentation/admin-guide/pm/intel_idle.rst @@ -0,0 +1,246 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. include:: <isonum.txt> + +============================================== +``intel_idle`` CPU Idle Time Management Driver +============================================== + +:Copyright: |copy| 2020 Intel Corporation + +:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com> + + +General Information +=================== + +``intel_idle`` is a part of the +:doc:`CPU idle time management subsystem <cpuidle>` in the Linux kernel +(``CPUIdle``). It is the default CPU idle time management driver for the +Nehalem and later generations of Intel processors, but the level of support for +a particular processor model in it depends on whether or not it recognizes that +processor model and may also depend on information coming from the platform +firmware. [To understand ``intel_idle`` it is necessary to know how ``CPUIdle`` +works in general, so this is the time to get familiar with :doc:`cpuidle` if you +have not done that yet.] + +``intel_idle`` uses the ``MWAIT`` instruction to inform the processor that the +logical CPU executing it is idle and so it may be possible to put some of the +processor's functional blocks into low-power states. That instruction takes two +arguments (passed in the ``EAX`` and ``ECX`` registers of the target CPU), the +first of which, referred to as a *hint*, can be used by the processor to +determine what can be done (for details refer to Intel Software Developer’s +Manual [1]_). Accordingly, ``intel_idle`` refuses to work with processors in +which the support for the ``MWAIT`` instruction has been disabled (for example, +via the platform firmware configuration menu) or which do not support that +instruction at all. + +``intel_idle`` is not modular, so it cannot be unloaded, which means that the +only way to pass early-configuration-time parameters to it is via the kernel +command line. + + +.. _intel-idle-enumeration-of-states: + +Enumeration of Idle States +========================== + +Each ``MWAIT`` hint value is interpreted by the processor as a license to +reconfigure itself in a certain way in order to save energy. The processor +configurations (with reduced power draw) resulting from that are referred to +as C-states (in the ACPI terminology) or idle states. The list of meaningful +``MWAIT`` hint values and idle states (i.e. low-power configurations of the +processor) corresponding to them depends on the processor model and it may also +depend on the configuration of the platform. + +In order to create a list of available idle states required by the ``CPUIdle`` +subsystem (see :ref:`idle-states-representation` in :doc:`cpuidle`), +``intel_idle`` can use two sources of information: static tables of idle states +for different processor models included in the driver itself and the ACPI tables +of the system. The former are always used if the processor model at hand is +recognized by ``intel_idle`` and the latter are used if that is required for +the given processor model (which is the case for all server processor models +recognized by ``intel_idle``) or if the processor model is not recognized. + +If the ACPI tables are going to be used for building the list of available idle +states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI +objects corresponding to the CPUs in the system (refer to the ACPI specification +[2]_ for the description of ``_CST`` and its output package). Because the +``CPUIdle`` subsystem expects that the list of idle states supplied by the +driver will be suitable for all of the CPUs handled by it and ``intel_idle`` is +registered as the ``CPUIdle`` driver for all of the CPUs in the system, the +driver looks for the first ``_CST`` object returning at least one valid idle +state description and such that all of the idle states included in its return +package are of the FFH (Functional Fixed Hardware) type, which means that the +``MWAIT`` instruction is expected to be used to tell the processor that it can +enter one of them. The return package of that ``_CST`` is then assumed to be +applicable to all of the other CPUs in the system and the idle state +descriptions extracted from it are stored in a preliminary list of idle states +coming from the ACPI tables. [This step is skipped if ``intel_idle`` is +configured to ignore the ACPI tables; see `below <intel-idle-parameters_>`_.] + +Next, the first (index 0) entry in the list of available idle states is +initialized to represent a "polling idle state" (a pseudo-idle state in which +the target CPU continuously fetches and executes instructions), and the +subsequent (real) idle state entries are populated as follows. + +If the processor model at hand is recognized by ``intel_idle``, there is a +(static) table of idle state descriptions for it in the driver. In that case, +the "internal" table is the primary source of information on idle states and the +information from it is copied to the final list of available idle states. If +using the ACPI tables for the enumeration of idle states is not required +(depending on the processor model), all of the listed idle state are enabled by +default (so all of them will be taken into consideration by ``CPUIdle`` +governors during CPU idle state selection). Otherwise, some of the listed idle +states may not be enabled by default if there are no matching entries in the +preliminary list of idle states coming from the ACPI tables. In that case user +space still can enable them later (on a per-CPU basis) with the help of +the ``disable`` idle state attribute in ``sysfs`` (see +:ref:`idle-states-representation` in :doc:`cpuidle`). This basically means that +the idle states "known" to the driver may not be enabled by default if they have +not been exposed by the platform firmware (through the ACPI tables). + +If the given processor model is not recognized by ``intel_idle``, but it +supports ``MWAIT``, the preliminary list of idle states coming from the ACPI +tables is used for building the final list that will be supplied to the +``CPUIdle`` core during driver registration. For each idle state in that list, +the description, ``MWAIT`` hint and exit latency are copied to the corresponding +entry in the final list of idle states. The name of the idle state represented +by it (to be returned by the ``name`` idle state attribute in ``sysfs``) is +"CX_ACPI", where X is the index of that idle state in the final list (note that +the minimum value of X is 1, because 0 is reserved for the "polling" state), and +its target residency is based on the exit latency value. Specifically, for +C1-type idle states the exit latency value is also used as the target residency +(for compatibility with the majority of the "internal" tables of idle states for +various processor models recognized by ``intel_idle``) and for the other idle +state types (C2 and C3) the target residency value is 3 times the exit latency +(again, that is because it reflects the target residency to exit latency ratio +in the majority of cases for the processor models recognized by ``intel_idle``). +All of the idle states in the final list are enabled by default in this case. + + +.. _intel-idle-initialization: + +Initialization +============== + +The initialization of ``intel_idle`` starts with checking if the kernel command +line options forbid the use of the ``MWAIT`` instruction. If that is the case, +an error code is returned right away. + +The next step is to check whether or not the processor model is known to the +driver, which determines the idle states enumeration method (see +`above <intel-idle-enumeration-of-states_>`_), and whether or not the processor +supports ``MWAIT`` (the initialization fails if that is not the case). Then, +the ``MWAIT`` support in the processor is enumerated through ``CPUID`` and the +driver initialization fails if the level of support is not as expected (for +example, if the total number of ``MWAIT`` substates returned is 0). + +Next, if the driver is not configured to ignore the ACPI tables (see +`below <intel-idle-parameters_>`_), the idle states information provided by the +platform firmware is extracted from them. + +Then, ``CPUIdle`` device objects are allocated for all CPUs and the list of +available idle states is created as explained +`above <intel-idle-enumeration-of-states_>`_. + +Finally, ``intel_idle`` is registered with the help of cpuidle_register_driver() +as the ``CPUIdle`` driver for all CPUs in the system and a CPU online callback +for configuring individual CPUs is registered via cpuhp_setup_state(), which +(among other things) causes the callback routine to be invoked for all of the +CPUs present in the system at that time (each CPU executes its own instance of +the callback routine). That routine registers a ``CPUIdle`` device for the CPU +running it (which enables the ``CPUIdle`` subsystem to operate that CPU) and +optionally performs some CPU-specific initialization actions that may be +required for the given processor model. + + +.. _intel-idle-parameters: + +Kernel Command Line Options and Module Parameters +================================================= + +The *x86* architecture support code recognizes three kernel command line +options related to CPU idle time management: ``idle=poll``, ``idle=halt``, +and ``idle=nomwait``. If any of them is present in the kernel command line, the +``MWAIT`` instruction is not allowed to be used, so the initialization of +``intel_idle`` will fail. + +Apart from that there are two module parameters recognized by ``intel_idle`` +itself that can be set via the kernel command line (they cannot be updated via +sysfs, so that is the only way to change their values). + +The ``max_cstate`` parameter value is the maximum idle state index in the list +of idle states supplied to the ``CPUIdle`` core during the registration of the +driver. It is also the maximum number of regular (non-polling) idle states that +can be used by ``intel_idle``, so the enumeration of idle states is terminated +after finding that number of usable idle states (the other idle states that +potentially might have been used if ``max_cstate`` had been greater are not +taken into consideration at all). Setting ``max_cstate`` can prevent +``intel_idle`` from exposing idle states that are regarded as "too deep" for +some reason to the ``CPUIdle`` core, but it does so by making them effectively +invisible until the system is shut down and started again which may not always +be desirable. In practice, it is only really necessary to do that if the idle +states in question cannot be enabled during system startup, because in the +working state of the system the CPU power management quality of service (PM +QoS) feature can be used to prevent ``CPUIdle`` from touching those idle states +even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`). +Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail. + +The ``noacpi`` module parameter (which is recognized by ``intel_idle`` if the +kernel has been configured with ACPI support), can be set to make the driver +ignore the system's ACPI tables entirely (it is unset by default). + + +.. _intel-idle-core-and-package-idle-states: + +Core and Package Levels of Idle States +====================================== + +Typically, in a processor supporting the ``MWAIT`` instruction there are (at +least) two levels of idle states (or C-states). One level, referred to as +"core C-states", covers individual cores in the processor, whereas the other +level, referred to as "package C-states", covers the entire processor package +and it may also involve other components of the system (GPUs, memory +controllers, I/O hubs etc.). + +Some of the ``MWAIT`` hint values allow the processor to use core C-states only +(most importantly, that is the case for the ``MWAIT`` hint value corresponding +to the ``C1`` idle state), but the majority of them give it a license to put +the target core (i.e. the core containing the logical CPU executing ``MWAIT`` +with the given hint value) into a specific core C-state and then (if possible) +to enter a specific package C-state at the deeper level. For example, the +``MWAIT`` hint value representing the ``C3`` idle state allows the processor to +put the target core into the low-power state referred to as "core ``C3``" (or +``CC3``), which happens if all of the logical CPUs (SMT siblings) in that core +have executed ``MWAIT`` with the ``C3`` hint value (or with a hint value +representing a deeper idle state), and in addition to that (in the majority of +cases) it gives the processor a license to put the entire package (possibly +including some non-CPU components such as a GPU or a memory controller) into the +low-power state referred to as "package ``C3``" (or ``PC3``), which happens if +all of the cores have gone into the ``CC3`` state and (possibly) some additional +conditions are satisfied (for instance, if the GPU is covered by ``PC3``, it may +be required to be in a certain GPU-specific low-power state for ``PC3`` to be +reachable). + +As a rule, there is no simple way to make the processor use core C-states only +if the conditions for entering the corresponding package C-states are met, so +the logical CPU executing ``MWAIT`` with a hint value that is not core-level +only (like for ``C1``) must always assume that this may cause the processor to +enter a package C-state. [That is why the exit latency and target residency +values corresponding to the majority of ``MWAIT`` hint values in the "internal" +tables of idle states in ``intel_idle`` reflect the properties of package +C-states.] If using package C-states is not desirable at all, either +:ref:`PM QoS <cpu-pm-qos>` or the ``max_cstate`` module parameter of +``intel_idle`` described `above <intel-idle-parameters_>`_ must be used to +restrict the range of permissible idle states to the ones with core-level only +``MWAIT`` hint values (like ``C1``). + + +References +========== + +.. [1] *Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2B*, + https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-2b-manual.html + +.. [2] *Advanced Configuration and Power Interface (ACPI) Specification*, + https://uefi.org/specifications diff --git a/Documentation/admin-guide/pm/working-state.rst b/Documentation/admin-guide/pm/working-state.rst index fc298eb1234b..88f717e59a42 100644 --- a/Documentation/admin-guide/pm/working-state.rst +++ b/Documentation/admin-guide/pm/working-state.rst @@ -8,6 +8,7 @@ Working-State Power Management :maxdepth: 2 cpuidle + intel_idle cpufreq intel_pstate intel_epb diff --git a/Documentation/devicetree/bindings/memory-controllers/fsl/imx8m-ddrc.yaml b/Documentation/devicetree/bindings/memory-controllers/fsl/imx8m-ddrc.yaml new file mode 100644 index 000000000000..c9e6c22cb5be --- /dev/null +++ b/Documentation/devicetree/bindings/memory-controllers/fsl/imx8m-ddrc.yaml @@ -0,0 +1,72 @@ +# SPDX-License-Identifier: GPL-2.0 +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/memory-controllers/fsl/imx8m-ddrc.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: i.MX8M DDR Controller + +maintainers: + - Leonard Crestez <leonard.crestez@nxp.com> + +description: + The DDRC block is integrated in i.MX8M for interfacing with DDR based + memories. + + It supports switching between different frequencies at runtime but during + this process RAM itself becomes briefly inaccessible so actual frequency + switching is implemented by TF-A code which runs from a SRAM area. + + The Linux driver for the DDRC doesn't even map registers (they're included + for the sake of "describing hardware"), it mostly just exposes firmware + capabilities through standard Linux mechanism like devfreq and OPP tables. + +properties: + compatible: + items: + - enum: + - fsl,imx8mn-ddrc + - fsl,imx8mm-ddrc + - fsl,imx8mq-ddrc + - const: fsl,imx8m-ddrc + + reg: + maxItems: 1 + description: + Base address and size of DDRC CTL area. + This is not currently mapped by the imx8m-ddrc driver. + + clocks: + maxItems: 4 + + clock-names: + items: + - const: core + - const: pll + - const: alt + - const: apb + + operating-points-v2: true + opp-table: true + +required: + - reg + - compatible + - clocks + - clock-names + +additionalProperties: false + +examples: + - | + #include <dt-bindings/clock/imx8mm-clock.h> + ddrc: memory-controller@3d400000 { + compatible = "fsl,imx8mm-ddrc", "fsl,imx8m-ddrc"; + reg = <0x3d400000 0x400000>; + clock-names = "core", "pll", "alt", "apb"; + clocks = <&clk IMX8MM_CLK_DRAM_CORE>, + <&clk IMX8MM_DRAM_PLL>, + <&clk IMX8MM_CLK_DRAM_ALT>, + <&clk IMX8MM_CLK_DRAM_APB>; + operating-points-v2 = <&ddrc_opp_table>; + }; diff --git a/Documentation/devicetree/bindings/power/avs/qcom,cpr.txt b/Documentation/devicetree/bindings/power/avs/qcom,cpr.txt new file mode 100644 index 000000000000..ab0d5ebbad4e --- /dev/null +++ b/Documentation/devicetree/bindings/power/avs/qcom,cpr.txt @@ -0,0 +1,130 @@ +QCOM CPR (Core Power Reduction) + +CPR (Core Power Reduction) is a technology to reduce core power on a CPU +or other device. Each OPP of a device corresponds to a "corner" that has +a range of valid voltages for a particular frequency. While the device is +running at a particular frequency, CPR monitors dynamic factors such as +temperature, etc. and suggests adjustments to the voltage to save power +and meet silicon characteristic requirements. + +- compatible: + Usage: required + Value type: <string> + Definition: should be "qcom,qcs404-cpr", "qcom,cpr" for qcs404 + +- reg: + Usage: required + Value type: <prop-encoded-array> + Definition: base address and size of the rbcpr register region + +- interrupts: + Usage: required + Value type: <prop-encoded-array> + Definition: should specify the CPR interrupt + +- clocks: + Usage: required + Value type: <prop-encoded-array> + Definition: phandle to the reference clock + +- clock-names: + Usage: required + Value type: <stringlist> + Definition: must be "ref" + +- vdd-apc-supply: + Usage: required + Value type: <phandle> + Definition: phandle to the vdd-apc-supply regulator + +- #power-domain-cells: + Usage: required + Value type: <u32> + Definition: should be 0 + +- operating-points-v2: + Usage: required + Value type: <phandle> + Definition: A phandle to the OPP table containing the + performance states supported by the CPR + power domain + +- acc-syscon: + Usage: optional + Value type: <phandle> + Definition: phandle to syscon for writing ACC settings + +- nvmem-cells: + Usage: required + Value type: <phandle> + Definition: phandle to nvmem cells containing the data + that makes up a fuse corner, for each fuse corner. + As well as the CPR fuse revision. + +- nvmem-cell-names: + Usage: required + Value type: <stringlist> + Definition: should be "cpr_quotient_offset1", "cpr_quotient_offset2", + "cpr_quotient_offset3", "cpr_init_voltage1", + "cpr_init_voltage2", "cpr_init_voltage3", "cpr_quotient1", + "cpr_quotient2", "cpr_quotient3", "cpr_ring_osc1", + "cpr_ring_osc2", "cpr_ring_osc3", "cpr_fuse_revision" + for qcs404. + +Example: + + cpr_opp_table: cpr-opp-table { + compatible = "operating-points-v2-qcom-level"; + + cpr_opp1: opp1 { + opp-level = <1>; + qcom,opp-fuse-level = <1>; + }; + cpr_opp2: opp2 { + opp-level = <2>; + qcom,opp-fuse-level = <2>; + }; + cpr_opp3: opp3 { + opp-level = <3>; + qcom,opp-fuse-level = <3>; + }; + }; + + power-controller@b018000 { + compatible = "qcom,qcs404-cpr", "qcom,cpr"; + reg = <0x0b018000 0x1000>; + interrupts = <0 15 IRQ_TYPE_EDGE_RISING>; + clocks = <&xo_board>; + clock-names = "ref"; + vdd-apc-supply = <&pms405_s3>; + #power-domain-cells = <0>; + operating-points-v2 = <&cpr_opp_table>; + acc-syscon = <&tcsr>; + + nvmem-cells = <&cpr_efuse_quot_offset1>, + <&cpr_efuse_quot_offset2>, + <&cpr_efuse_quot_offset3>, + <&cpr_efuse_init_voltage1>, + <&cpr_efuse_init_voltage2>, + <&cpr_efuse_init_voltage3>, + <&cpr_efuse_quot1>, + <&cpr_efuse_quot2>, + <&cpr_efuse_quot3>, + <&cpr_efuse_ring1>, + <&cpr_efuse_ring2>, + <&cpr_efuse_ring3>, + <&cpr_efuse_revision>; + nvmem-cell-names = "cpr_quotient_offset1", + "cpr_quotient_offset2", + "cpr_quotient_offset3", + "cpr_init_voltage1", + "cpr_init_voltage2", + "cpr_init_voltage3", + "cpr_quotient1", + "cpr_quotient2", + "cpr_quotient3", + "cpr_ring_osc1", + "cpr_ring_osc2", + "cpr_ring_osc3", + "cpr_fuse_revision"; + }; diff --git a/MAINTAINERS b/MAINTAINERS index 5924365f36b2..589916273a1a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13715,6 +13715,14 @@ S: Maintained F: Documentation/devicetree/bindings/opp/qcom-nvmem-cpufreq.txt F: drivers/cpufreq/qcom-cpufreq-nvmem.c +QUALCOMM CORE POWER REDUCTION (CPR) AVS DRIVER +M: Niklas Cassel <nks@flawful.org> +L: linux-pm@vger.kernel.org +L: linux-arm-msm@vger.kernel.org +S: Maintained +F: Documentation/devicetree/bindings/power/avs/qcom,cpr.txt +F: drivers/power/avs/qcom-cpr.c + QUALCOMM EMAC GIGABIT ETHERNET DRIVER M: Timur Tabi <timur@kernel.org> L: netdev@vger.kernel.org diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h index c606c0b70738..4981c293f926 100644 --- a/arch/x86/include/asm/intel-family.h +++ b/arch/x86/include/asm/intel-family.h @@ -111,6 +111,7 @@ #define INTEL_FAM6_ATOM_TREMONT_D 0x86 /* Jacobsville */ #define INTEL_FAM6_ATOM_TREMONT 0x96 /* Elkhart Lake */ +#define INTEL_FAM6_ATOM_TREMONT_L 0x9C /* Jasper Lake */ /* Xeon Phi */ diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index 002838d23b86..cc57bab146b5 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -241,6 +241,7 @@ config ACPI_CPU_FREQ_PSS config ACPI_PROCESSOR_CSTATE def_bool y + depends on ACPI_PROCESSOR depends on IA64 || X86 config ACPI_PROCESSOR_IDLE diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c index 2c4dda0787e8..5379bc3f275d 100644 --- a/drivers/acpi/acpi_processor.c +++ b/drivers/acpi/acpi_processor.c @@ -705,3 +705,185 @@ void __init acpi_processor_init(void) acpi_scan_add_handler_with_hotplug(&processor_handler, "processor"); acpi_scan_add_handler(&processor_container_handler); } + +#ifdef CONFIG_ACPI_PROCESSOR_CSTATE +/** + * acpi_processor_claim_cst_control - Request _CST control from the platform. + */ +bool acpi_processor_claim_cst_control(void) +{ + static bool cst_control_claimed; + acpi_status status; + + if (!acpi_gbl_FADT.cst_control || cst_control_claimed) + return true; + + status = acpi_os_write_port(acpi_gbl_FADT.smi_command, + acpi_gbl_FADT.cst_control, 8); + if (ACPI_FAILURE(status)) { + pr_warn("ACPI: Failed to claim processor _CST control\n"); + return false; + } + + cst_control_claimed = true; + return true; +} +EXPORT_SYMBOL_GPL(acpi_processor_claim_cst_control); + +/** + * acpi_processor_evaluate_cst - Evaluate the processor _CST control method. + * @handle: ACPI handle of the processor object containing the _CST. + * @cpu: The numeric ID of the target CPU. + * @info: Object write the C-states information into. + * + * Extract the C-state information for the given CPU from the output of the _CST + * control method under the corresponding ACPI processor object (or processor + * device object) and populate @info with it. + * + * If any ACPI_ADR_SPACE_FIXED_HARDWARE C-states are found, invoke + * acpi_processor_ffh_cstate_probe() to verify them and update the + * cpu_cstate_entry data for @cpu. + */ +int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu, + struct acpi_processor_power *info) +{ + struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL }; + union acpi_object *cst; + acpi_status status; + u64 count; + int last_index = 0; + int i, ret = 0; + + status = acpi_evaluate_object(handle, "_CST", NULL, &buffer); + if (ACPI_FAILURE(status)) { + acpi_handle_debug(handle, "No _CST\n"); + return -ENODEV; + } + + cst = buffer.pointer; + + /* There must be at least 2 elements. */ + if (!cst || cst->type != ACPI_TYPE_PACKAGE || cst->package.count < 2) { + acpi_handle_warn(handle, "Invalid _CST output\n"); + ret = -EFAULT; + goto end; + } + + count = cst->package.elements[0].integer.value; + + /* Validate the number of C-states. */ + if (count < 1 || count != cst->package.count - 1) { + acpi_handle_warn(handle, "Inconsistent _CST data\n"); + ret = -EFAULT; + goto end; + } + + for (i = 1; i <= count; i++) { + union acpi_object *element; + union acpi_object *obj; + struct acpi_power_register *reg; + struct acpi_processor_cx cx; + + /* + * If there is not enough space for all C-states, skip the + * excess ones and log a warning. + */ + if (last_index >= ACPI_PROCESSOR_MAX_POWER - 1) { + acpi_handle_warn(handle, + "No room for more idle states (limit: %d)\n", + ACPI_PROCESSOR_MAX_POWER - 1); + break; + } + + memset(&cx, 0, sizeof(cx)); + + element = &cst->package.elements[i]; + if (element->type != ACPI_TYPE_PACKAGE) + continue; + + if (element->package.count != 4) + continue; + + obj = &element->package.elements[0]; + + if (obj->type != ACPI_TYPE_BUFFER) + continue; + + reg = (struct acpi_power_register *)obj->buffer.pointer; + + obj = &element->package.elements[1]; + if (obj->type != ACPI_TYPE_INTEGER) + continue; + + cx.type = obj->integer.value; + /* + * There are known cases in which the _CST output does not + * contain C1, so if the type of the first state found is not + * C1, leave an empty slot for C1 to be filled in later. + */ + if (i == 1 && cx.type != ACPI_STATE_C1) + last_index = 1; + + cx.address = reg->address; + cx.index = last_index + 1; + + if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE) { + if (!acpi_processor_ffh_cstate_probe(cpu, &cx, reg)) { + /* + * In the majority of cases _CST describes C1 as + * a FIXED_HARDWARE C-state, but if the command + * line forbids using MWAIT, use CSTATE_HALT for + * C1 regardless. + */ + if (cx.type == ACPI_STATE_C1 && + boot_option_idle_override == IDLE_NOMWAIT) { + cx.entry_method = ACPI_CSTATE_HALT; + snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT"); + } else { + cx.entry_method = ACPI_CSTATE_FFH; + } + } else if (cx.type == ACPI_STATE_C1) { + /* + * In the special case of C1, FIXED_HARDWARE can + * be handled by executing the HLT instruction. + */ + cx.entry_method = ACPI_CSTATE_HALT; + snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT"); + } else { + continue; + } + } else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) { + cx.entry_method = ACPI_CSTATE_SYSTEMIO; + snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI IOPORT 0x%x", + cx.address); + } else { + continue; + } + + if (cx.type == ACPI_STATE_C1) + cx.valid = 1; + + obj = &element->package.elements[2]; + if (obj->type != ACPI_TYPE_INTEGER) + continue; + + cx.latency = obj->integer.value; + + obj = &element->package.elements[3]; + if (obj->type != ACPI_TYPE_INTEGER) + continue; + + memcpy(&info->states[++last_index], &cx, sizeof(cx)); + } + + acpi_handle_info(handle, "Found %d idle states\n", last_index); + + info->count = last_index; + + end: + kfree(buffer.pointer); + + return ret; +} +EXPORT_SYMBOL_GPL(acpi_processor_evaluate_cst); +#endif /* CONFIG_ACPI_PROCESSOR_CSTATE */ diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 2ae95df2e74f..dcc289e30166 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -299,164 +299,24 @@ static int acpi_processor_get_power_info_default(struct acpi_processor *pr) static int acpi_processor_get_power_info_cst(struct acpi_processor *pr) { - acpi_status status; - u64 count; - int current_count; - int i, ret = 0; - struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL }; - union acpi_object *cst; + int ret; if (nocst) return -ENODEV; - current_count = 0; - - status = acpi_evaluate_object(pr->handle, "_CST", NULL, &buffer); - if (ACPI_FAILURE(status)) { - ACPI_DEBUG_PRINT((ACPI_DB_INFO, "No _CST, giving up\n")); - return -ENODEV; - } - - cst = buffer.pointer; - - /* There must be at least 2 elements */ - if (!cst || (cst->type != ACPI_TYPE_PACKAGE) || cst->package.count < 2) { - pr_err("not enough elements in _CST\n"); - ret = -EFAULT; - goto end; - } - - count = cst->package.elements[0].integer.value; + ret = acpi_processor_evaluate_cst(pr->handle, pr->id, &pr->power); + if (ret) + return ret; - /* Validate number of power states. */ - if (count < 1 || count != cst->package.count - 1) { - pr_err("count given by _CST is not valid\n"); - ret = -EFAULT; - goto end; - } + /* + * It is expected that there will be at least 2 states, C1 and + * something else (C2 or C3), so fail if that is not the case. + */ + if (pr->power.count < 2) + return -EFAULT; - /* Tell driver that at least _CST is supported. */ pr->flags.has_cst = 1; - - for (i = 1; i <= count; i++) { - union acpi_object *element; - union acpi_object *obj; - struct acpi_power_register *reg; - struct acpi_processor_cx cx; - - memset(&cx, 0, sizeof(cx)); - - element = &(cst->package.elements[i]); - if (element->type != ACPI_TYPE_PACKAGE) - continue; - - if (element->package.count != 4) - continue; - - obj = &(element->package.elements[0]); - - if (obj->type != ACPI_TYPE_BUFFER) - continue; - - reg = (struct acpi_power_register *)obj->buffer.pointer; - - if (reg->space_id != ACPI_ADR_SPACE_SYSTEM_IO && - (reg->space_id != ACPI_ADR_SPACE_FIXED_HARDWARE)) - continue; - - /* There should be an easy way to extract an integer... */ - obj = &(element->package.elements[1]); - if (obj->type != ACPI_TYPE_INTEGER) - continue; - - cx.type = obj->integer.value; - /* - * Some buggy BIOSes won't list C1 in _CST - - * Let acpi_processor_get_power_info_default() handle them later - */ - if (i == 1 && cx.type != ACPI_STATE_C1) - current_count++; - - cx.address = reg->address; - cx.index = current_count + 1; - - cx.entry_method = ACPI_CSTATE_SYSTEMIO; - if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE) { - if (acpi_processor_ffh_cstate_probe - (pr->id, &cx, reg) == 0) { - cx.entry_method = ACPI_CSTATE_FFH; - } else if (cx.type == ACPI_STATE_C1) { - /* - * C1 is a special case where FIXED_HARDWARE - * can be handled in non-MWAIT way as well. - * In that case, save this _CST entry info. - * Otherwise, ignore this info and continue. - */ - cx.entry_method = ACPI_CSTATE_HALT; - snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT"); - } else { - continue; - } - if (cx.type == ACPI_STATE_C1 && - (boot_option_idle_override == IDLE_NOMWAIT)) { - /* - * In most cases the C1 space_id obtained from - * _CST object is FIXED_HARDWARE access mode. - * But when the option of idle=halt is added, - * the entry_method type should be changed from - * CSTATE_FFH to CSTATE_HALT. - * When the option of idle=nomwait is added, - * the C1 entry_method type should be - * CSTATE_HALT. - */ - cx.entry_method = ACPI_CSTATE_HALT; - snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT"); - } - } else { - snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI IOPORT 0x%x", - cx.address); - } - - if (cx.type == ACPI_STATE_C1) { - cx.valid = 1; - } - - obj = &(element->package.elements[2]); - if (obj->type != ACPI_TYPE_INTEGER) - continue; - - cx.latency = obj->integer.value; - - obj = &(element->package.elements[3]); - if (obj->type != ACPI_TYPE_INTEGER) - continue; - - current_count++; - memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx)); - - /* - * We support total ACPI_PROCESSOR_MAX_POWER - 1 - * (From 1 through ACPI_PROCESSOR_MAX_POWER - 1) - */ - if (current_count >= (ACPI_PROCESSOR_MAX_POWER - 1)) { - pr_warn("Limiting number of power states to max (%d)\n", - ACPI_PROCESSOR_MAX_POWER); - pr_warn("Please increase ACPI_PROCESSOR_MAX_POWER if needed.\n"); - break; - } - } - - ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found %d power states\n", - current_count)); - - /* Validate number of power states discovered */ - if (current_count < 2) - ret = -EFAULT; - - end: - kfree(buffer.pointer); - - return ret; + return 0; } static void acpi_processor_power_verify_c3(struct acpi_processor *pr, @@ -909,7 +769,6 @@ static int acpi_processor_setup_cstates(struct acpi_processor *pr) static inline void acpi_processor_cstate_first_run_checks(void) { - acpi_status status; static int first_run; if (first_run) @@ -921,13 +780,10 @@ static inline void acpi_processor_cstate_first_run_checks(void) max_cstate); first_run++; - if (acpi_gbl_FADT.cst_control && !nocst) { - status = acpi_os_write_port(acpi_gbl_FADT.smi_command, - acpi_gbl_FADT.cst_control, 8); - if (ACPI_FAILURE(status)) - ACPI_EXCEPTION((AE_INFO, status, - "Notifying BIOS of _CST ability failed")); - } + if (nocst) + return; + + acpi_processor_claim_cst_control(); } #else diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index 48616f358854..16134a69bf6f 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1006,8 +1006,10 @@ int __pm_runtime_idle(struct device *dev, int rpmflags) int retval; if (rpmflags & RPM_GET_PUT) { - if (!atomic_dec_and_test(&dev->power.usage_count)) + if (!atomic_dec_and_test(&dev->power.usage_count)) { + trace_rpm_usage_rcuidle(dev, rpmflags); return 0; + } } might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe); @@ -1038,8 +1040,10 @@ int __pm_runtime_suspend(struct device *dev, int rpmflags) int retval; if (rpmflags & RPM_GET_PUT) { - if (!atomic_dec_and_test(&dev->power.usage_count)) + if (!atomic_dec_and_test(&dev->power.usage_count)) { + trace_rpm_usage_rcuidle(dev, rpmflags); return 0; + } } might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe); @@ -1101,6 +1105,7 @@ int pm_runtime_get_if_in_use(struct device *dev) retval = dev->power.disable_depth > 0 ? -EINVAL : dev->power.runtime_status == RPM_ACTIVE && atomic_inc_not_zero(&dev->power.usage_count); + trace_rpm_usage_rcuidle(dev, 0); spin_unlock_irqrestore(&dev->power.lock, flags); return retval; } @@ -1434,6 +1439,8 @@ void pm_runtime_allow(struct device *dev) dev->power.runtime_auto = true; if (atomic_dec_and_test(&dev->power.usage_count)) rpm_idle(dev, RPM_AUTO | RPM_ASYNC); + else + trace_rpm_usage_rcuidle(dev, RPM_AUTO | RPM_ASYNC); out: spin_unlock_irq(&dev->power.lock); @@ -1501,6 +1508,8 @@ static void update_autosuspend(struct device *dev, int old_delay, int old_use) if (!old_use || old_delay >= 0) { atomic_inc(&dev->power.usage_count); rpm_resume(dev, 0); + } else { + trace_rpm_usage_rcuidle(dev, 0); } } diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c index 70a9edb5f525..27f3e60608e5 100644 --- a/drivers/base/power/wakeup.c +++ b/drivers/base/power/wakeup.c @@ -1125,6 +1125,9 @@ static void *wakeup_sources_stats_seq_next(struct seq_file *m, break; } + if (!next_ws) + print_wakeup_source_stats(m, &deleted_ws); + return next_ws; } diff --git a/drivers/cpufreq/brcmstb-avs-cpufreq.c b/drivers/cpufreq/brcmstb-avs-cpufreq.c index 77b0e5d0fb13..4f86ce2db34f 100644 --- a/drivers/cpufreq/brcmstb-avs-cpufreq.c +++ b/drivers/cpufreq/brcmstb-avs-cpufreq.c @@ -455,6 +455,8 @@ static unsigned int brcm_avs_cpufreq_get(unsigned int cpu) struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); struct private_data *priv = policy->driver_data; + cpufreq_cpu_put(policy); + return brcm_avs_get_frequency(priv->base); } diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c index 8d8da763adc5..a06777c35fc0 100644 --- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -39,7 +39,7 @@ static struct cppc_cpudata **all_cpu_data; struct cppc_workaround_oem_info { - char oem_id[ACPI_OEM_ID_SIZE +1]; + char oem_id[ACPI_OEM_ID_SIZE + 1]; char oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1]; u32 oem_revision; }; @@ -93,9 +93,13 @@ static void cppc_check_hisi_workaround(void) for (i = 0; i < ARRAY_SIZE(wa_info); i++) { if (!memcmp(wa_info[i].oem_id, tbl->oem_id, ACPI_OEM_ID_SIZE) && !memcmp(wa_info[i].oem_table_id, tbl->oem_table_id, ACPI_OEM_TABLE_ID_SIZE) && - wa_info[i].oem_revision == tbl->oem_revision) + wa_info[i].oem_revision == tbl->oem_revision) { apply_hisi_workaround = true; + break; + } } + + acpi_put_table(tbl); } /* Callback function used to retrieve the max frequency from DMI */ diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c index aba591d57c67..f2ae9cd455c1 100644 --- a/drivers/cpufreq/cpufreq-dt-platdev.c +++ b/drivers/cpufreq/cpufreq-dt-platdev.c @@ -109,6 +109,7 @@ static const struct of_device_id blacklist[] __initconst = { { .compatible = "fsl,imx8mq", }, { .compatible = "fsl,imx8mm", }, { .compatible = "fsl,imx8mn", }, + { .compatible = "fsl,imx8mp", }, { .compatible = "marvell,armadaxp", }, diff --git a/drivers/cpufreq/imx-cpufreq-dt.c b/drivers/cpufreq/imx-cpufreq-dt.c index 85a6efd6b68f..6cb8193421ea 100644 --- a/drivers/cpufreq/imx-cpufreq-dt.c +++ b/drivers/cpufreq/imx-cpufreq-dt.c @@ -35,7 +35,8 @@ static int imx_cpufreq_dt_probe(struct platform_device *pdev) if (ret) return ret; - if (of_machine_is_compatible("fsl,imx8mn")) + if (of_machine_is_compatible("fsl,imx8mn") || + of_machine_is_compatible("fsl,imx8mp")) speed_grade = (cell_value & IMX8MN_OCOTP_CFG3_SPEED_GRADE_MASK) >> OCOTP_CFG3_SPEED_GRADE_SHIFT; else @@ -54,7 +55,8 @@ static int imx_cpufreq_dt_probe(struct platform_device *pdev) if (of_machine_is_compatible("fsl,imx8mm") || of_machine_is_compatible("fsl,imx8mq")) speed_grade = 1; - if (of_machine_is_compatible("fsl,imx8mn")) + if (of_machine_is_compatible("fsl,imx8mn") || + of_machine_is_compatible("fsl,imx8mp")) speed_grade = 0xb; } diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index d2fa3e9ccd97..ad6a17cf0011 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -172,7 +172,7 @@ struct vid_data { /** * struct global_params - Global parameters, mostly tunable via sysfs. * @no_turbo: Whether or not to use turbo P-states. - * @turbo_disabled: Whethet or not turbo P-states are available at all, + * @turbo_disabled: Whether or not turbo P-states are available at all, * based on the MSR_IA32_MISC_ENABLE value and whether or * not the maximum reported turbo P-state is different from * the maximum reported non-turbo one. diff --git a/drivers/cpufreq/kirkwood-cpufreq.c b/drivers/cpufreq/kirkwood-cpufreq.c index cb74bdc5baaa..70ad8fe1d78b 100644 --- a/drivers/cpufreq/kirkwood-cpufreq.c +++ b/drivers/cpufreq/kirkwood-cpufreq.c @@ -102,13 +102,11 @@ static struct cpufreq_driver kirkwood_cpufreq_driver = { static int kirkwood_cpufreq_probe(struct platform_device *pdev) { struct device_node *np; - struct resource *res; int err; priv.dev = &pdev->dev; - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - priv.base = devm_ioremap_resource(&pdev->dev, res); + priv.base = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(priv.base)) return PTR_ERR(priv.base); diff --git a/drivers/cpufreq/loongson2_cpufreq.c b/drivers/cpufreq/loongson2_cpufreq.c index e9caa9586982..909f40fbcde2 100644 --- a/drivers/cpufreq/loongson2_cpufreq.c +++ b/drivers/cpufreq/loongson2_cpufreq.c @@ -144,9 +144,11 @@ static void loongson2_cpu_wait(void) u32 cpu_freq; spin_lock_irqsave(&loongson2_wait_lock, flags); - cpu_freq = LOONGSON_CHIPCFG(0); - LOONGSON_CHIPCFG(0) &= ~0x7; /* Put CPU into wait mode */ - LOONGSON_CHIPCFG(0) = cpu_freq; /* Restore CPU state */ + cpu_freq = readl(LOONGSON_CHIPCFG); + /* Put CPU into wait mode */ + writel(readl(LOONGSON_CHIPCFG) & ~0x7, LOONGSON_CHIPCFG); + /* Restore CPU state */ + writel(cpu_freq, LOONGSON_CHIPCFG); spin_unlock_irqrestore(&loongson2_wait_lock, flags); local_irq_enable(); } diff --git a/drivers/cpufreq/s3c2416-cpufreq.c b/drivers/cpufreq/s3c2416-cpufreq.c index 106910351c41..5c221bc90210 100644 --- a/drivers/cpufreq/s3c2416-cpufreq.c +++ b/drivers/cpufreq/s3c2416-cpufreq.c @@ -304,6 +304,7 @@ static int s3c2416_cpufreq_reboot_notifier_evt(struct notifier_block *this, { struct s3c2416_data *s3c_freq = &s3c2416_cpufreq; int ret; + struct cpufreq_policy *policy; mutex_lock(&cpufreq_lock); @@ -318,7 +319,16 @@ static int s3c2416_cpufreq_reboot_notifier_evt(struct notifier_block *this, */ if (s3c_freq->is_dvs) { pr_debug("cpufreq: leave dvs on reboot\n"); - ret = cpufreq_driver_target(cpufreq_cpu_get(0), FREQ_SLEEP, 0); + + policy = cpufreq_cpu_get(0); + if (!policy) { + pr_debug("cpufreq: get no policy for cpu0\n"); + return NOTIFY_BAD; + } + + ret = cpufreq_driver_target(policy, FREQ_SLEEP, 0); + cpufreq_cpu_put(policy); + if (ret < 0) return NOTIFY_BAD; } diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c index 5d10030f2560..e84281e2561d 100644 --- a/drivers/cpufreq/s5pv210-cpufreq.c +++ b/drivers/cpufreq/s5pv210-cpufreq.c @@ -555,8 +555,17 @@ static int s5pv210_cpufreq_reboot_notifier_event(struct notifier_block *this, unsigned long event, void *ptr) { int ret; + struct cpufreq_policy *policy; + + policy = cpufreq_cpu_get(0); + if (!policy) { + pr_debug("cpufreq: get no policy for cpu0\n"); + return NOTIFY_BAD; + } + + ret = cpufreq_driver_target(policy, SLEEP_FREQ, 0); + cpufreq_cpu_put(policy); - ret = cpufreq_driver_target(cpufreq_cpu_get(0), SLEEP_FREQ, 0); if (ret < 0) return NOTIFY_BAD; diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c index bcecb068b51b..2e233ad72758 100644 --- a/drivers/cpufreq/tegra186-cpufreq.c +++ b/drivers/cpufreq/tegra186-cpufreq.c @@ -187,7 +187,6 @@ static int tegra186_cpufreq_probe(struct platform_device *pdev) { struct tegra186_cpufreq_data *data; struct tegra_bpmp *bpmp; - struct resource *res; unsigned int i = 0, err; data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL); @@ -205,8 +204,7 @@ static int tegra186_cpufreq_probe(struct platform_device *pdev) if (IS_ERR(bpmp)) return PTR_ERR(bpmp); - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - data->regs = devm_ioremap_resource(&pdev->dev, res); + data->regs = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(data->regs)) { err = PTR_ERR(data->regs); goto put_bpmp; diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm index a224d33dda7f..62272ecfa771 100644 --- a/drivers/cpuidle/Kconfig.arm +++ b/drivers/cpuidle/Kconfig.arm @@ -25,7 +25,7 @@ config ARM_PSCI_CPUIDLE config ARM_BIG_LITTLE_CPUIDLE bool "Support for ARM big.LITTLE processors" - depends on ARCH_VEXPRESS_TC2_PM || ARCH_EXYNOS + depends on ARCH_VEXPRESS_TC2_PM || ARCH_EXYNOS || COMPILE_TEST depends on MCPM && !ARM64 select ARM_CPU_SUSPEND select CPU_IDLE_MULTIPLE_DRIVERS @@ -51,13 +51,13 @@ config ARM_HIGHBANK_CPUIDLE config ARM_KIRKWOOD_CPUIDLE bool "CPU Idle Driver for Marvell Kirkwood SoCs" - depends on MACH_KIRKWOOD && !ARM64 + depends on (MACH_KIRKWOOD || COMPILE_TEST) && !ARM64 help This adds the CPU Idle driver for Marvell Kirkwood SoCs. config ARM_ZYNQ_CPUIDLE bool "CPU Idle Driver for Xilinx Zynq processors" - depends on ARCH_ZYNQ && !ARM64 + depends on (ARCH_ZYNQ || COMPILE_TEST) && !ARM64 help Select this to enable cpuidle on Xilinx Zynq processors. @@ -70,19 +70,19 @@ config ARM_U8500_CPUIDLE config ARM_AT91_CPUIDLE bool "Cpu Idle Driver for the AT91 processors" default y - depends on ARCH_AT91 && !ARM64 + depends on (ARCH_AT91 || COMPILE_TEST) && !ARM64 help Select this to enable cpuidle for AT91 processors. config ARM_EXYNOS_CPUIDLE bool "Cpu Idle Driver for the Exynos processors" - depends on ARCH_EXYNOS && !ARM64 + depends on (ARCH_EXYNOS || COMPILE_TEST) && !ARM64 select ARCH_NEEDS_CPU_IDLE_COUPLED if SMP help Select this to enable cpuidle for Exynos processors. config ARM_MVEBU_V7_CPUIDLE bool "CPU Idle Driver for mvebu v7 family processors" - depends on ARCH_MVEBU && !ARM64 + depends on (ARCH_MVEBU || COMPILE_TEST) && !ARM64 help Select this to enable cpuidle on Armada 370, 38x and XP processors. diff --git a/drivers/cpuidle/coupled.c b/drivers/cpuidle/coupled.c index b607278df25b..04003b90dc49 100644 --- a/drivers/cpuidle/coupled.c +++ b/drivers/cpuidle/coupled.c @@ -89,6 +89,7 @@ * @coupled_cpus: mask of cpus that are part of the coupled set * @requested_state: array of requested states for cpus in the coupled set * @ready_waiting_counts: combined count of cpus in ready or waiting loops + * @abort_barrier: synchronisation point for abort cases * @online_count: count of cpus that are online * @refcnt: reference count of cpuidle devices that are using this struct * @prevent: flag to prevent coupled idle while a cpu is hotplugging @@ -338,7 +339,7 @@ static void cpuidle_coupled_poke(int cpu) /** * cpuidle_coupled_poke_others - wake up all other cpus that may be waiting - * @dev: struct cpuidle_device for this cpu + * @this_cpu: target cpu * @coupled: the struct coupled that contains the current cpu * * Calls cpuidle_coupled_poke on all other online cpus. @@ -355,7 +356,7 @@ static void cpuidle_coupled_poke_others(int this_cpu, /** * cpuidle_coupled_set_waiting - mark this cpu as in the wait loop - * @dev: struct cpuidle_device for this cpu + * @cpu: target cpu * @coupled: the struct coupled that contains the current cpu * @next_state: the index in drv->states of the requested state for this cpu * @@ -376,7 +377,7 @@ static int cpuidle_coupled_set_waiting(int cpu, /** * cpuidle_coupled_set_not_waiting - mark this cpu as leaving the wait loop - * @dev: struct cpuidle_device for this cpu + * @cpu: target cpu * @coupled: the struct coupled that contains the current cpu * * Removes the requested idle state for the specified cpuidle device. @@ -412,7 +413,7 @@ static void cpuidle_coupled_set_done(int cpu, struct cpuidle_coupled *coupled) /** * cpuidle_coupled_clear_pokes - spin until the poke interrupt is processed - * @cpu - this cpu + * @cpu: this cpu * * Turns on interrupts and spins until any outstanding poke interrupts have * been processed and the poke bit has been cleared. diff --git a/drivers/cpuidle/cpuidle-clps711x.c b/drivers/cpuidle/cpuidle-clps711x.c index 6e36740f5719..fc22c59b6c73 100644 --- a/drivers/cpuidle/cpuidle-clps711x.c +++ b/drivers/cpuidle/cpuidle-clps711x.c @@ -37,10 +37,7 @@ static struct cpuidle_driver clps711x_idle_driver = { static int __init clps711x_cpuidle_probe(struct platform_device *pdev) { - struct resource *res; - - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - clps711x_halt = devm_ioremap_resource(&pdev->dev, res); + clps711x_halt = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(clps711x_halt)) return PTR_ERR(clps711x_halt); diff --git a/drivers/cpuidle/cpuidle-kirkwood.c b/drivers/cpuidle/cpuidle-kirkwood.c index d23d8f468c12..511c4f46027a 100644 --- a/drivers/cpuidle/cpuidle-kirkwood.c +++ b/drivers/cpuidle/cpuidle-kirkwood.c @@ -55,10 +55,7 @@ static struct cpuidle_driver kirkwood_idle_driver = { /* Initialize CPU idle by registering the idle states */ static int kirkwood_cpuidle_probe(struct platform_device *pdev) { - struct resource *res; - - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - ddr_operation_base = devm_ioremap_resource(&pdev->dev, res); + ddr_operation_base = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(ddr_operation_base)) return PTR_ERR(ddr_operation_base); diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 33d19c8eb027..de81298051b3 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -121,6 +121,9 @@ void cpuidle_use_deepest_state(u64 latency_limit_ns) * cpuidle_find_deepest_state - Find the deepest available idle state. * @drv: cpuidle driver for the given CPU. * @dev: cpuidle device for the given CPU. + * @latency_limit_ns: Idle state exit latency limit + * + * Return: the index of the deepest available idle state. */ int cpuidle_find_deepest_state(struct cpuidle_driver *drv, struct cpuidle_device *dev, @@ -572,10 +575,14 @@ static int __cpuidle_register_device(struct cpuidle_device *dev) if (!try_module_get(drv->owner)) return -EINVAL; - for (i = 0; i < drv->state_count; i++) + for (i = 0; i < drv->state_count; i++) { if (drv->states[i].flags & CPUIDLE_FLAG_UNUSABLE) dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_DRIVER; + if (drv->states[i].flags & CPUIDLE_FLAG_OFF) + dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_USER; + } + per_cpu(cpuidle_devices, dev->cpu) = dev; list_add(&dev->device_list, &cpuidle_detected_devices); diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c index ce6a5f80fb83..4070e573bf43 100644 --- a/drivers/cpuidle/driver.c +++ b/drivers/cpuidle/driver.c @@ -155,8 +155,6 @@ static void __cpuidle_driver_init(struct cpuidle_driver *drv) { int i; - drv->refcnt = 0; - /* * Use all possible CPUs as the default, because if the kernel boots * with some CPUs offline and then we online one of them, the CPU @@ -240,9 +238,6 @@ static int __cpuidle_register_driver(struct cpuidle_driver *drv) */ static void __cpuidle_unregister_driver(struct cpuidle_driver *drv) { - if (WARN_ON(drv->refcnt > 0)) - return; - if (drv->bctimer) { drv->bctimer = 0; on_each_cpu_mask(drv->cpumask, cpuidle_setup_broadcast_timer, @@ -350,47 +345,6 @@ struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev) EXPORT_SYMBOL_GPL(cpuidle_get_cpu_driver); /** - * cpuidle_driver_ref - get a reference to the driver. - * - * Increment the reference counter of the cpuidle driver associated with - * the current CPU. - * - * Returns a pointer to the driver, or NULL if the current CPU has no driver. - */ -struct cpuidle_driver *cpuidle_driver_ref(void) -{ - struct cpuidle_driver *drv; - - spin_lock(&cpuidle_driver_lock); - - drv = cpuidle_get_driver(); - if (drv) - drv->refcnt++; - - spin_unlock(&cpuidle_driver_lock); - return drv; -} - -/** - * cpuidle_driver_unref - puts down the refcount for the driver - * - * Decrement the reference counter of the cpuidle driver associated with - * the current CPU. - */ -void cpuidle_driver_unref(void) -{ - struct cpuidle_driver *drv; - - spin_lock(&cpuidle_driver_lock); - - drv = cpuidle_get_driver(); - if (drv && !WARN_ON(drv->refcnt <= 0)) - drv->refcnt--; - - spin_unlock(&cpuidle_driver_lock); -} - -/** * cpuidle_driver_state_disabled - Disable or enable an idle state * @drv: cpuidle driver owning the state * @idx: State index diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c index 38ef770be90d..cdeedbf02646 100644 --- a/drivers/cpuidle/sysfs.c +++ b/drivers/cpuidle/sysfs.c @@ -142,6 +142,7 @@ static struct attribute_group cpuidle_attr_group = { /** * cpuidle_add_interface - add CPU global sysfs attributes + * @dev: the target device */ int cpuidle_add_interface(struct device *dev) { @@ -153,6 +154,7 @@ int cpuidle_add_interface(struct device *dev) /** * cpuidle_remove_interface - remove CPU global sysfs attributes + * @dev: the target device */ void cpuidle_remove_interface(struct device *dev) { @@ -327,6 +329,14 @@ static ssize_t store_state_disable(struct cpuidle_state *state, return size; } +static ssize_t show_state_default_status(struct cpuidle_state *state, + struct cpuidle_state_usage *state_usage, + char *buf) +{ + return sprintf(buf, "%s\n", + state->flags & CPUIDLE_FLAG_OFF ? "disabled" : "enabled"); +} + define_one_state_ro(name, show_state_name); define_one_state_ro(desc, show_state_desc); define_one_state_ro(latency, show_state_exit_latency); @@ -337,6 +347,7 @@ define_one_state_ro(time, show_state_time); define_one_state_rw(disable, show_state_disable, store_state_disable); define_one_state_ro(above, show_state_above); define_one_state_ro(below, show_state_below); +define_one_state_ro(default_status, show_state_default_status); static struct attribute *cpuidle_state_default_attrs[] = { &attr_name.attr, @@ -349,6 +360,7 @@ static struct attribute *cpuidle_state_default_attrs[] = { &attr_disable.attr, &attr_above.attr, &attr_below.attr, + &attr_default_status.attr, NULL }; @@ -615,7 +627,7 @@ static struct kobj_type ktype_driver_cpuidle = { /** * cpuidle_add_driver_sysfs - adds the driver name sysfs attribute - * @device: the target device + * @dev: the target device */ static int cpuidle_add_driver_sysfs(struct cpuidle_device *dev) { @@ -646,7 +658,7 @@ static int cpuidle_add_driver_sysfs(struct cpuidle_device *dev) /** * cpuidle_remove_driver_sysfs - removes the driver name sysfs attribute - * @device: the target device + * @dev: the target device */ static void cpuidle_remove_driver_sysfs(struct cpuidle_device *dev) { diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig index 35535833b6f7..0b1df12e0f21 100644 --- a/drivers/devfreq/Kconfig +++ b/drivers/devfreq/Kconfig @@ -77,7 +77,7 @@ config DEVFREQ_GOV_PASSIVE comment "DEVFREQ Drivers" config ARM_EXYNOS_BUS_DEVFREQ - tristate "ARM EXYNOS Generic Memory Bus DEVFREQ Driver" + tristate "ARM Exynos Generic Memory Bus DEVFREQ Driver" depends on ARCH_EXYNOS || COMPILE_TEST select DEVFREQ_GOV_SIMPLE_ONDEMAND select DEVFREQ_GOV_PASSIVE @@ -91,6 +91,16 @@ config ARM_EXYNOS_BUS_DEVFREQ and adjusts the operating frequencies and voltages with OPP support. This does not yet operate with optimal voltages. +config ARM_IMX8M_DDRC_DEVFREQ + tristate "i.MX8M DDRC DEVFREQ Driver" + depends on (ARCH_MXC && HAVE_ARM_SMCCC) || \ + (COMPILE_TEST && HAVE_ARM_SMCCC) + select DEVFREQ_GOV_SIMPLE_ONDEMAND + select DEVFREQ_GOV_USERSPACE + help + This adds the DEVFREQ driver for the i.MX8M DDR Controller. It allows + adjusting DRAM frequency. + config ARM_TEGRA_DEVFREQ tristate "NVIDIA Tegra30/114/124/210 DEVFREQ Driver" depends on ARCH_TEGRA_3x_SOC || ARCH_TEGRA_114_SOC || \ @@ -115,14 +125,15 @@ config ARM_TEGRA20_DEVFREQ config ARM_RK3399_DMC_DEVFREQ tristate "ARM RK3399 DMC DEVFREQ Driver" - depends on ARCH_ROCKCHIP + depends on (ARCH_ROCKCHIP && HAVE_ARM_SMCCC) || \ + (COMPILE_TEST && HAVE_ARM_SMCCC) select DEVFREQ_EVENT_ROCKCHIP_DFI select DEVFREQ_GOV_SIMPLE_ONDEMAND select PM_DEVFREQ_EVENT help - This adds the DEVFREQ driver for the RK3399 DMC(Dynamic Memory Controller). - It sets the frequency for the memory controller and reads the usage counts - from hardware. + This adds the DEVFREQ driver for the RK3399 DMC(Dynamic Memory Controller). + It sets the frequency for the memory controller and reads the usage counts + from hardware. source "drivers/devfreq/event/Kconfig" diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile index 338ae8440db6..3eb4d5e6635c 100644 --- a/drivers/devfreq/Makefile +++ b/drivers/devfreq/Makefile @@ -9,6 +9,7 @@ obj-$(CONFIG_DEVFREQ_GOV_PASSIVE) += governor_passive.o # DEVFREQ Drivers obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ) += exynos-bus.o +obj-$(CONFIG_ARM_IMX8M_DDRC_DEVFREQ) += imx8m-ddrc.o obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ) += rk3399_dmc.o obj-$(CONFIG_ARM_TEGRA_DEVFREQ) += tegra30-devfreq.o obj-$(CONFIG_ARM_TEGRA20_DEVFREQ) += tegra20-devfreq.o diff --git a/drivers/devfreq/devfreq-event.c b/drivers/devfreq/devfreq-event.c index 3dc5fd6065a3..8c31b0f2e28f 100644 --- a/drivers/devfreq/devfreq-event.c +++ b/drivers/devfreq/devfreq-event.c @@ -346,9 +346,9 @@ EXPORT_SYMBOL_GPL(devfreq_event_add_edev); /** * devfreq_event_remove_edev() - Remove the devfreq-event device registered. - * @dev : the devfreq-event device + * @edev : the devfreq-event device * - * Note that this function remove the registered devfreq-event device. + * Note that this function removes the registered devfreq-event device. */ int devfreq_event_remove_edev(struct devfreq_event_dev *edev) { diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c index 57f6944d65a6..cceee8bc3c2f 100644 --- a/drivers/devfreq/devfreq.c +++ b/drivers/devfreq/devfreq.c @@ -10,6 +10,7 @@ #include <linux/kernel.h> #include <linux/kmod.h> #include <linux/sched.h> +#include <linux/debugfs.h> #include <linux/errno.h> #include <linux/err.h> #include <linux/init.h> @@ -33,6 +34,7 @@ #define HZ_PER_KHZ 1000 static struct class *devfreq_class; +static struct dentry *devfreq_debugfs; /* * devfreq core provides delayed work based load monitoring helper @@ -209,10 +211,10 @@ static int set_freq_table(struct devfreq *devfreq) int devfreq_update_status(struct devfreq *devfreq, unsigned long freq) { int lev, prev_lev, ret = 0; - unsigned long cur_time; + u64 cur_time; lockdep_assert_held(&devfreq->lock); - cur_time = jiffies; + cur_time = get_jiffies_64(); /* Immediately exit if previous_freq is not initialized yet. */ if (!devfreq->previous_freq) @@ -224,8 +226,8 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq) goto out; } - devfreq->time_in_state[prev_lev] += - cur_time - devfreq->last_stat_updated; + devfreq->stats.time_in_state[prev_lev] += + cur_time - devfreq->stats.last_update; lev = devfreq_get_freq_level(devfreq, freq); if (lev < 0) { @@ -234,13 +236,13 @@ int devfreq_update_status(struct devfreq *devfreq, unsigned long freq) } if (lev != prev_lev) { - devfreq->trans_table[(prev_lev * - devfreq->profile->max_state) + lev]++; - devfreq->total_trans++; + devfreq->stats.trans_table[ + (prev_lev * devfreq->profile->max_state) + lev]++; + devfreq->stats.total_trans++; } out: - devfreq->last_stat_updated = cur_time; + devfreq->stats.last_update = cur_time; return ret; } EXPORT_SYMBOL(devfreq_update_status); @@ -535,7 +537,7 @@ void devfreq_monitor_resume(struct devfreq *devfreq) msecs_to_jiffies(devfreq->profile->polling_ms)); out_update: - devfreq->last_stat_updated = jiffies; + devfreq->stats.last_update = get_jiffies_64(); devfreq->stop_polling = false; if (devfreq->profile->get_cur_freq && @@ -807,28 +809,29 @@ struct devfreq *devfreq_add_device(struct device *dev, goto err_out; } - devfreq->trans_table = devm_kzalloc(&devfreq->dev, + devfreq->stats.trans_table = devm_kzalloc(&devfreq->dev, array3_size(sizeof(unsigned int), devfreq->profile->max_state, devfreq->profile->max_state), GFP_KERNEL); - if (!devfreq->trans_table) { + if (!devfreq->stats.trans_table) { mutex_unlock(&devfreq->lock); err = -ENOMEM; goto err_devfreq; } - devfreq->time_in_state = devm_kcalloc(&devfreq->dev, + devfreq->stats.time_in_state = devm_kcalloc(&devfreq->dev, devfreq->profile->max_state, - sizeof(unsigned long), + sizeof(*devfreq->stats.time_in_state), GFP_KERNEL); - if (!devfreq->time_in_state) { + if (!devfreq->stats.time_in_state) { mutex_unlock(&devfreq->lock); err = -ENOMEM; goto err_devfreq; } - devfreq->last_stat_updated = jiffies; + devfreq->stats.total_trans = 0; + devfreq->stats.last_update = get_jiffies_64(); srcu_init_notifier_head(&devfreq->transition_notifier_list); @@ -1259,6 +1262,14 @@ err_out: } EXPORT_SYMBOL(devfreq_remove_governor); +static ssize_t name_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct devfreq *devfreq = to_devfreq(dev); + return sprintf(buf, "%s\n", dev_name(devfreq->dev.parent)); +} +static DEVICE_ATTR_RO(name); + static ssize_t governor_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -1461,6 +1472,7 @@ static ssize_t min_freq_show(struct device *dev, struct device_attribute *attr, return sprintf(buf, "%lu\n", min_freq); } +static DEVICE_ATTR_RW(min_freq); static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) @@ -1501,7 +1513,6 @@ static ssize_t max_freq_store(struct device *dev, struct device_attribute *attr, return count; } -static DEVICE_ATTR_RW(min_freq); static ssize_t max_freq_show(struct device *dev, struct device_attribute *attr, char *buf) @@ -1580,18 +1591,47 @@ static ssize_t trans_stat_show(struct device *dev, devfreq->profile->freq_table[i]); for (j = 0; j < max_state; j++) len += sprintf(buf + len, "%10u", - devfreq->trans_table[(i * max_state) + j]); - len += sprintf(buf + len, "%10u\n", - jiffies_to_msecs(devfreq->time_in_state[i])); + devfreq->stats.trans_table[(i * max_state) + j]); + + len += sprintf(buf + len, "%10llu\n", (u64) + jiffies64_to_msecs(devfreq->stats.time_in_state[i])); } len += sprintf(buf + len, "Total transition : %u\n", - devfreq->total_trans); + devfreq->stats.total_trans); return len; } -static DEVICE_ATTR_RO(trans_stat); + +static ssize_t trans_stat_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct devfreq *df = to_devfreq(dev); + int err, value; + + if (df->profile->max_state == 0) + return count; + + err = kstrtoint(buf, 10, &value); + if (err || value != 0) + return -EINVAL; + + mutex_lock(&df->lock); + memset(df->stats.time_in_state, 0, (df->profile->max_state * + sizeof(*df->stats.time_in_state))); + memset(df->stats.trans_table, 0, array3_size(sizeof(unsigned int), + df->profile->max_state, + df->profile->max_state)); + df->stats.total_trans = 0; + df->stats.last_update = get_jiffies_64(); + mutex_unlock(&df->lock); + + return count; +} +static DEVICE_ATTR_RW(trans_stat); static struct attribute *devfreq_attrs[] = { + &dev_attr_name.attr, &dev_attr_governor.attr, &dev_attr_available_governors.attr, &dev_attr_cur_freq.attr, @@ -1605,6 +1645,81 @@ static struct attribute *devfreq_attrs[] = { }; ATTRIBUTE_GROUPS(devfreq); +/** + * devfreq_summary_show() - Show the summary of the devfreq devices + * @s: seq_file instance to show the summary of devfreq devices + * @data: not used + * + * Show the summary of the devfreq devices via 'devfreq_summary' debugfs file. + * It helps that user can know the detailed information of the devfreq devices. + * + * Return 0 always because it shows the information without any data change. + */ +static int devfreq_summary_show(struct seq_file *s, void *data) +{ + struct devfreq *devfreq; + struct devfreq *p_devfreq = NULL; + unsigned long cur_freq, min_freq, max_freq; + unsigned int polling_ms; + + seq_printf(s, "%-30s %-10s %-10s %-15s %10s %12s %12s %12s\n", + "dev_name", + "dev", + "parent_dev", + "governor", + "polling_ms", + "cur_freq_Hz", + "min_freq_Hz", + "max_freq_Hz"); + seq_printf(s, "%30s %10s %10s %15s %10s %12s %12s %12s\n", + "------------------------------", + "----------", + "----------", + "---------------", + "----------", + "------------", + "------------", + "------------"); + + mutex_lock(&devfreq_list_lock); + + list_for_each_entry_reverse(devfreq, &devfreq_list, node) { +#if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE) + if (!strncmp(devfreq->governor_name, DEVFREQ_GOV_PASSIVE, + DEVFREQ_NAME_LEN)) { + struct devfreq_passive_data *data = devfreq->data; + + if (data) + p_devfreq = data->parent; + } else { + p_devfreq = NULL; + } +#endif + + mutex_lock(&devfreq->lock); + cur_freq = devfreq->previous_freq, + get_freq_range(devfreq, &min_freq, &max_freq); + polling_ms = devfreq->profile->polling_ms, + mutex_unlock(&devfreq->lock); + + seq_printf(s, + "%-30s %-10s %-10s %-15s %10d %12ld %12ld %12ld\n", + dev_name(devfreq->dev.parent), + dev_name(&devfreq->dev), + p_devfreq ? dev_name(&p_devfreq->dev) : "null", + devfreq->governor_name, + polling_ms, + cur_freq, + min_freq, + max_freq); + } + + mutex_unlock(&devfreq_list_lock); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(devfreq_summary); + static int __init devfreq_init(void) { devfreq_class = class_create(THIS_MODULE, "devfreq"); @@ -1621,6 +1736,11 @@ static int __init devfreq_init(void) } devfreq_class->dev_groups = devfreq_groups; + devfreq_debugfs = debugfs_create_dir("devfreq", NULL); + debugfs_create_file("devfreq_summary", 0444, + devfreq_debugfs, NULL, + &devfreq_summary_fops); + return 0; } subsys_initcall(devfreq_init); @@ -1814,7 +1934,7 @@ static void devm_devfreq_notifier_release(struct device *dev, void *res) /** * devm_devfreq_register_notifier() - - Resource-managed devfreq_register_notifier() + * - Resource-managed devfreq_register_notifier() * @dev: The devfreq user device. (parent of devfreq) * @devfreq: The devfreq object. * @nb: The notifier block to be unregistered. @@ -1850,7 +1970,7 @@ EXPORT_SYMBOL(devm_devfreq_register_notifier); /** * devm_devfreq_unregister_notifier() - - Resource-managed devfreq_unregister_notifier() + * - Resource-managed devfreq_unregister_notifier() * @dev: The devfreq user device. (parent of devfreq) * @devfreq: The devfreq object. * @nb: The notifier block to be unregistered. diff --git a/drivers/devfreq/event/Kconfig b/drivers/devfreq/event/Kconfig index cef2cf5347ca..878825372f6f 100644 --- a/drivers/devfreq/event/Kconfig +++ b/drivers/devfreq/event/Kconfig @@ -15,7 +15,7 @@ menuconfig PM_DEVFREQ_EVENT if PM_DEVFREQ_EVENT config DEVFREQ_EVENT_EXYNOS_NOCP - tristate "EXYNOS NoC (Network On Chip) Probe DEVFREQ event Driver" + tristate "Exynos NoC (Network On Chip) Probe DEVFREQ event Driver" depends on ARCH_EXYNOS || COMPILE_TEST select PM_OPP select REGMAP_MMIO @@ -24,7 +24,7 @@ config DEVFREQ_EVENT_EXYNOS_NOCP (Network on Chip) Probe counters to measure the bandwidth of AXI bus. config DEVFREQ_EVENT_EXYNOS_PPMU - tristate "EXYNOS PPMU (Platform Performance Monitoring Unit) DEVFREQ event Driver" + tristate "Exynos PPMU (Platform Performance Monitoring Unit) DEVFREQ event Driver" depends on ARCH_EXYNOS || COMPILE_TEST select PM_OPP help @@ -34,7 +34,7 @@ config DEVFREQ_EVENT_EXYNOS_PPMU config DEVFREQ_EVENT_ROCKCHIP_DFI tristate "ROCKCHIP DFI DEVFREQ event Driver" - depends on ARCH_ROCKCHIP + depends on ARCH_ROCKCHIP || COMPILE_TEST help This add the devfreq-event driver for Rockchip SoC. It provides DFI (DDR Monitor Module) driver to count ddr load. diff --git a/drivers/devfreq/event/exynos-nocp.c b/drivers/devfreq/event/exynos-nocp.c index 1c565926db9f..ccc531ee6938 100644 --- a/drivers/devfreq/event/exynos-nocp.c +++ b/drivers/devfreq/event/exynos-nocp.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * exynos-nocp.c - EXYNOS NoC (Network On Chip) Probe support + * exynos-nocp.c - Exynos NoC (Network On Chip) Probe support * * Copyright (c) 2016 Samsung Electronics Co., Ltd. * Author : Chanwoo Choi <cw00.choi@samsung.com> diff --git a/drivers/devfreq/event/exynos-nocp.h b/drivers/devfreq/event/exynos-nocp.h index 55cc96284a36..2d6f08cfd0c5 100644 --- a/drivers/devfreq/event/exynos-nocp.h +++ b/drivers/devfreq/event/exynos-nocp.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* - * exynos-nocp.h - EXYNOS NoC (Network on Chip) Probe header file + * exynos-nocp.h - Exynos NoC (Network on Chip) Probe header file * * Copyright (c) 2016 Samsung Electronics Co., Ltd. * Author : Chanwoo Choi <cw00.choi@samsung.com> diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c index 85c7a77bf3f0..17ed980d9099 100644 --- a/drivers/devfreq/event/exynos-ppmu.c +++ b/drivers/devfreq/event/exynos-ppmu.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * exynos_ppmu.c - EXYNOS PPMU (Platform Performance Monitoring Unit) support + * exynos_ppmu.c - Exynos PPMU (Platform Performance Monitoring Unit) support * * Copyright (c) 2014-2015 Samsung Electronics Co., Ltd. * Author : Chanwoo Choi <cw00.choi@samsung.com> @@ -101,17 +101,22 @@ static struct __exynos_ppmu_events { PPMU_EVENT(dmc1_1), }; -static int exynos_ppmu_find_ppmu_id(struct devfreq_event_dev *edev) +static int __exynos_ppmu_find_ppmu_id(const char *edev_name) { int i; for (i = 0; i < ARRAY_SIZE(ppmu_events); i++) - if (!strcmp(edev->desc->name, ppmu_events[i].name)) + if (!strcmp(edev_name, ppmu_events[i].name)) return ppmu_events[i].id; return -EINVAL; } +static int exynos_ppmu_find_ppmu_id(struct devfreq_event_dev *edev) +{ + return __exynos_ppmu_find_ppmu_id(edev->desc->name); +} + /* * The devfreq-event ops structure for PPMU v1.1 */ @@ -556,13 +561,11 @@ static int of_get_devfreq_events(struct device_node *np, * use default if not. */ if (info->ppmu_type == EXYNOS_TYPE_PPMU_V2) { - struct devfreq_event_dev edev; int id; /* Not all registers take the same value for * read+write data count. */ - edev.desc = &desc[j]; - id = exynos_ppmu_find_ppmu_id(&edev); + id = __exynos_ppmu_find_ppmu_id(desc[j].name); switch (id) { case PPMU_PMNCNT0: diff --git a/drivers/devfreq/event/exynos-ppmu.h b/drivers/devfreq/event/exynos-ppmu.h index 284420047455..97f667d0cbdd 100644 --- a/drivers/devfreq/event/exynos-ppmu.h +++ b/drivers/devfreq/event/exynos-ppmu.h @@ -1,6 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* - * exynos_ppmu.h - EXYNOS PPMU header file + * exynos_ppmu.h - Exynos PPMU header file * * Copyright (c) 2015 Samsung Electronics Co., Ltd. * Author : Chanwoo Choi <cw00.choi@samsung.com> diff --git a/drivers/devfreq/event/rockchip-dfi.c b/drivers/devfreq/event/rockchip-dfi.c index 5d1042188727..9a88faaf8b27 100644 --- a/drivers/devfreq/event/rockchip-dfi.c +++ b/drivers/devfreq/event/rockchip-dfi.c @@ -177,7 +177,6 @@ static int rockchip_dfi_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; struct rockchip_dfi *data; - struct resource *res; struct devfreq_event_desc *desc; struct device_node *np = pdev->dev.of_node, *node; @@ -185,8 +184,7 @@ static int rockchip_dfi_probe(struct platform_device *pdev) if (!data) return -ENOMEM; - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - data->regs = devm_ioremap_resource(&pdev->dev, res); + data->regs = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(data->regs)) return PTR_ERR(data->regs); @@ -200,6 +198,7 @@ static int rockchip_dfi_probe(struct platform_device *pdev) node = of_parse_phandle(np, "rockchip,pmu", 0); if (node) { data->regmap_pmu = syscon_node_to_regmap(node); + of_node_put(node); if (IS_ERR(data->regmap_pmu)) return PTR_ERR(data->regmap_pmu); } diff --git a/drivers/devfreq/exynos-bus.c b/drivers/devfreq/exynos-bus.c index c832673273a2..8fa8eb541373 100644 --- a/drivers/devfreq/exynos-bus.c +++ b/drivers/devfreq/exynos-bus.c @@ -15,11 +15,10 @@ #include <linux/device.h> #include <linux/export.h> #include <linux/module.h> -#include <linux/of_device.h> +#include <linux/of.h> #include <linux/pm_opp.h> #include <linux/platform_device.h> #include <linux/regulator/consumer.h> -#include <linux/slab.h> #define DEFAULT_SATURATION_RATIO 40 @@ -127,6 +126,7 @@ static int exynos_bus_get_dev_status(struct device *dev, ret = exynos_bus_get_event(bus, &edata); if (ret < 0) { + dev_err(dev, "failed to get event from devfreq-event devices\n"); stat->total_time = stat->busy_time = 0; goto err; } @@ -287,52 +287,12 @@ err_clk: return ret; } -static int exynos_bus_probe(struct platform_device *pdev) +static int exynos_bus_profile_init(struct exynos_bus *bus, + struct devfreq_dev_profile *profile) { - struct device *dev = &pdev->dev; - struct device_node *np = dev->of_node, *node; - struct devfreq_dev_profile *profile; + struct device *dev = bus->dev; struct devfreq_simple_ondemand_data *ondemand_data; - struct devfreq_passive_data *passive_data; - struct devfreq *parent_devfreq; - struct exynos_bus *bus; - int ret, max_state; - unsigned long min_freq, max_freq; - bool passive = false; - - if (!np) { - dev_err(dev, "failed to find devicetree node\n"); - return -EINVAL; - } - - bus = devm_kzalloc(&pdev->dev, sizeof(*bus), GFP_KERNEL); - if (!bus) - return -ENOMEM; - mutex_init(&bus->lock); - bus->dev = &pdev->dev; - platform_set_drvdata(pdev, bus); - - profile = devm_kzalloc(dev, sizeof(*profile), GFP_KERNEL); - if (!profile) - return -ENOMEM; - - node = of_parse_phandle(dev->of_node, "devfreq", 0); - if (node) { - of_node_put(node); - passive = true; - } else { - ret = exynos_bus_parent_parse_of(np, bus); - if (ret < 0) - return ret; - } - - /* Parse the device-tree to get the resource information */ - ret = exynos_bus_parse_of(np, bus); - if (ret < 0) - goto err_reg; - - if (passive) - goto passive; + int ret; /* Initialize the struct profile and governor data for parent device */ profile->polling_ms = 50; @@ -341,10 +301,9 @@ static int exynos_bus_probe(struct platform_device *pdev) profile->exit = exynos_bus_exit; ondemand_data = devm_kzalloc(dev, sizeof(*ondemand_data), GFP_KERNEL); - if (!ondemand_data) { - ret = -ENOMEM; - goto err; - } + if (!ondemand_data) + return -ENOMEM; + ondemand_data->upthreshold = 40; ondemand_data->downdifferential = 5; @@ -354,15 +313,14 @@ static int exynos_bus_probe(struct platform_device *pdev) ondemand_data); if (IS_ERR(bus->devfreq)) { dev_err(dev, "failed to add devfreq device\n"); - ret = PTR_ERR(bus->devfreq); - goto err; + return PTR_ERR(bus->devfreq); } /* Register opp_notifier to catch the change of OPP */ ret = devm_devfreq_register_opp_notifier(dev, bus->devfreq); if (ret < 0) { dev_err(dev, "failed to register opp notifier\n"); - goto err; + return ret; } /* @@ -372,33 +330,44 @@ static int exynos_bus_probe(struct platform_device *pdev) ret = exynos_bus_enable_edev(bus); if (ret < 0) { dev_err(dev, "failed to enable devfreq-event devices\n"); - goto err; + return ret; } ret = exynos_bus_set_event(bus); if (ret < 0) { dev_err(dev, "failed to set event to devfreq-event devices\n"); - goto err; + goto err_edev; } - goto out; -passive: + return 0; + +err_edev: + if (exynos_bus_disable_edev(bus)) + dev_warn(dev, "failed to disable the devfreq-event devices\n"); + + return ret; +} + +static int exynos_bus_profile_init_passive(struct exynos_bus *bus, + struct devfreq_dev_profile *profile) +{ + struct device *dev = bus->dev; + struct devfreq_passive_data *passive_data; + struct devfreq *parent_devfreq; + /* Initialize the struct profile and governor data for passive device */ profile->target = exynos_bus_target; profile->exit = exynos_bus_passive_exit; /* Get the instance of parent devfreq device */ parent_devfreq = devfreq_get_devfreq_by_phandle(dev, 0); - if (IS_ERR(parent_devfreq)) { - ret = -EPROBE_DEFER; - goto err; - } + if (IS_ERR(parent_devfreq)) + return -EPROBE_DEFER; passive_data = devm_kzalloc(dev, sizeof(*passive_data), GFP_KERNEL); - if (!passive_data) { - ret = -ENOMEM; - goto err; - } + if (!passive_data) + return -ENOMEM; + passive_data->parent = parent_devfreq; /* Add devfreq device for exynos bus with passive governor */ @@ -407,11 +376,61 @@ passive: if (IS_ERR(bus->devfreq)) { dev_err(dev, "failed to add devfreq dev with passive governor\n"); - ret = PTR_ERR(bus->devfreq); - goto err; + return PTR_ERR(bus->devfreq); + } + + return 0; +} + +static int exynos_bus_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct device_node *np = dev->of_node, *node; + struct devfreq_dev_profile *profile; + struct exynos_bus *bus; + int ret, max_state; + unsigned long min_freq, max_freq; + bool passive = false; + + if (!np) { + dev_err(dev, "failed to find devicetree node\n"); + return -EINVAL; } -out: + bus = devm_kzalloc(&pdev->dev, sizeof(*bus), GFP_KERNEL); + if (!bus) + return -ENOMEM; + mutex_init(&bus->lock); + bus->dev = &pdev->dev; + platform_set_drvdata(pdev, bus); + + profile = devm_kzalloc(dev, sizeof(*profile), GFP_KERNEL); + if (!profile) + return -ENOMEM; + + node = of_parse_phandle(dev->of_node, "devfreq", 0); + if (node) { + of_node_put(node); + passive = true; + } else { + ret = exynos_bus_parent_parse_of(np, bus); + if (ret < 0) + return ret; + } + + /* Parse the device-tree to get the resource information */ + ret = exynos_bus_parse_of(np, bus); + if (ret < 0) + goto err_reg; + + if (passive) + ret = exynos_bus_profile_init_passive(bus, profile); + else + ret = exynos_bus_profile_init(bus, profile); + + if (ret < 0) + goto err; + max_state = bus->devfreq->profile->max_state; min_freq = (bus->devfreq->profile->freq_table[0] / 1000); max_freq = (bus->devfreq->profile->freq_table[max_state - 1] / 1000); diff --git a/drivers/devfreq/imx8m-ddrc.c b/drivers/devfreq/imx8m-ddrc.c new file mode 100644 index 000000000000..bc82d3653bff --- /dev/null +++ b/drivers/devfreq/imx8m-ddrc.c @@ -0,0 +1,471 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright 2019 NXP + */ + +#include <linux/module.h> +#include <linux/device.h> +#include <linux/of_device.h> +#include <linux/platform_device.h> +#include <linux/devfreq.h> +#include <linux/pm_opp.h> +#include <linux/clk.h> +#include <linux/clk-provider.h> +#include <linux/arm-smccc.h> + +#define IMX_SIP_DDR_DVFS 0xc2000004 + +/* Query available frequencies. */ +#define IMX_SIP_DDR_DVFS_GET_FREQ_COUNT 0x10 +#define IMX_SIP_DDR_DVFS_GET_FREQ_INFO 0x11 + +/* + * This should be in a 1:1 mapping with devicetree OPPs but + * firmware provides additional info. + */ +struct imx8m_ddrc_freq { + unsigned long rate; + unsigned long smcarg; + int dram_core_parent_index; + int dram_alt_parent_index; + int dram_apb_parent_index; +}; + +/* Hardware limitation */ +#define IMX8M_DDRC_MAX_FREQ_COUNT 4 + +/* + * i.MX8M DRAM Controller clocks have the following structure (abridged): + * + * +----------+ |\ +------+ + * | dram_pll |-------|M| dram_core | | + * +----------+ |U|---------->| D | + * /--|X| | D | + * dram_alt_root | |/ | R | + * | | C | + * +---------+ | | + * |FIX DIV/4| | | + * +---------+ | | + * composite: | | | + * +----------+ | | | + * | dram_alt |----/ | | + * +----------+ | | + * | dram_apb |-------------------->| | + * +----------+ +------+ + * + * The dram_pll is used for higher rates and dram_alt is used for lower rates. + * + * Frequency switching is implemented in TF-A (via SMC call) and can change the + * configuration of the clocks, including mux parents. The dram_alt and + * dram_apb clocks are "imx composite" and their parent can change too. + * + * We need to prepare/enable the new mux parents head of switching and update + * their information afterwards. + */ +struct imx8m_ddrc { + struct devfreq_dev_profile profile; + struct devfreq *devfreq; + + /* For frequency switching: */ + struct clk *dram_core; + struct clk *dram_pll; + struct clk *dram_alt; + struct clk *dram_apb; + + int freq_count; + struct imx8m_ddrc_freq freq_table[IMX8M_DDRC_MAX_FREQ_COUNT]; +}; + +static struct imx8m_ddrc_freq *imx8m_ddrc_find_freq(struct imx8m_ddrc *priv, + unsigned long rate) +{ + struct imx8m_ddrc_freq *freq; + int i; + + /* + * Firmware reports values in MT/s, so we round-down from Hz + * Rounding is extra generous to ensure a match. + */ + rate = DIV_ROUND_CLOSEST(rate, 250000); + for (i = 0; i < priv->freq_count; ++i) { + freq = &priv->freq_table[i]; + if (freq->rate == rate || + freq->rate + 1 == rate || + freq->rate - 1 == rate) + return freq; + } + + return NULL; +} + +static void imx8m_ddrc_smc_set_freq(int target_freq) +{ + struct arm_smccc_res res; + u32 online_cpus = 0; + int cpu; + + local_irq_disable(); + + for_each_online_cpu(cpu) + online_cpus |= (1 << (cpu * 8)); + + /* change the ddr freqency */ + arm_smccc_smc(IMX_SIP_DDR_DVFS, target_freq, online_cpus, + 0, 0, 0, 0, 0, &res); + + local_irq_enable(); +} + +static struct clk *clk_get_parent_by_index(struct clk *clk, int index) +{ + struct clk_hw *hw; + + hw = clk_hw_get_parent_by_index(__clk_get_hw(clk), index); + + return hw ? hw->clk : NULL; +} + +static int imx8m_ddrc_set_freq(struct device *dev, struct imx8m_ddrc_freq *freq) +{ + struct imx8m_ddrc *priv = dev_get_drvdata(dev); + struct clk *new_dram_core_parent; + struct clk *new_dram_alt_parent; + struct clk *new_dram_apb_parent; + int ret; + + /* + * Fetch new parents + * + * new_dram_alt_parent and new_dram_apb_parent are optional but + * new_dram_core_parent is not. + */ + new_dram_core_parent = clk_get_parent_by_index( + priv->dram_core, freq->dram_core_parent_index - 1); + if (!new_dram_core_parent) { + dev_err(dev, "failed to fetch new dram_core parent\n"); + return -EINVAL; + } + if (freq->dram_alt_parent_index) { + new_dram_alt_parent = clk_get_parent_by_index( + priv->dram_alt, + freq->dram_alt_parent_index - 1); + if (!new_dram_alt_parent) { + dev_err(dev, "failed to fetch new dram_alt parent\n"); + return -EINVAL; + } + } else + new_dram_alt_parent = NULL; + + if (freq->dram_apb_parent_index) { + new_dram_apb_parent = clk_get_parent_by_index( + priv->dram_apb, + freq->dram_apb_parent_index - 1); + if (!new_dram_apb_parent) { + dev_err(dev, "failed to fetch new dram_apb parent\n"); + return -EINVAL; + } + } else + new_dram_apb_parent = NULL; + + /* increase reference counts and ensure clks are ON before switch */ + ret = clk_prepare_enable(new_dram_core_parent); + if (ret) { + dev_err(dev, "failed to enable new dram_core parent: %d\n", + ret); + goto out; + } + ret = clk_prepare_enable(new_dram_alt_parent); + if (ret) { + dev_err(dev, "failed to enable new dram_alt parent: %d\n", + ret); + goto out_disable_core_parent; + } + ret = clk_prepare_enable(new_dram_apb_parent); + if (ret) { + dev_err(dev, "failed to enable new dram_apb parent: %d\n", + ret); + goto out_disable_alt_parent; + } + + imx8m_ddrc_smc_set_freq(freq->smcarg); + + /* update parents in clk tree after switch. */ + ret = clk_set_parent(priv->dram_core, new_dram_core_parent); + if (ret) + dev_warn(dev, "failed to set dram_core parent: %d\n", ret); + if (new_dram_alt_parent) { + ret = clk_set_parent(priv->dram_alt, new_dram_alt_parent); + if (ret) + dev_warn(dev, "failed to set dram_alt parent: %d\n", + ret); + } + if (new_dram_apb_parent) { + ret = clk_set_parent(priv->dram_apb, new_dram_apb_parent); + if (ret) + dev_warn(dev, "failed to set dram_apb parent: %d\n", + ret); + } + + /* + * Explicitly refresh dram PLL rate. + * + * Even if it's marked with CLK_GET_RATE_NOCACHE the rate will not be + * automatically refreshed when clk_get_rate is called on children. + */ + clk_get_rate(priv->dram_pll); + + /* + * clk_set_parent transfer the reference count from old parent. + * now we drop extra reference counts used during the switch + */ + clk_disable_unprepare(new_dram_apb_parent); +out_disable_alt_parent: + clk_disable_unprepare(new_dram_alt_parent); +out_disable_core_parent: + clk_disable_unprepare(new_dram_core_parent); +out: + return ret; +} + +static int imx8m_ddrc_target(struct device *dev, unsigned long *freq, u32 flags) +{ + struct imx8m_ddrc *priv = dev_get_drvdata(dev); + struct imx8m_ddrc_freq *freq_info; + struct dev_pm_opp *new_opp; + unsigned long old_freq, new_freq; + int ret; + + new_opp = devfreq_recommended_opp(dev, freq, flags); + if (IS_ERR(new_opp)) { + ret = PTR_ERR(new_opp); + dev_err(dev, "failed to get recommended opp: %d\n", ret); + return ret; + } + dev_pm_opp_put(new_opp); + + old_freq = clk_get_rate(priv->dram_core); + if (*freq == old_freq) + return 0; + + freq_info = imx8m_ddrc_find_freq(priv, *freq); + if (!freq_info) + return -EINVAL; + + /* + * Read back the clk rate to verify switch was correct and so that + * we can report it on all error paths. + */ + ret = imx8m_ddrc_set_freq(dev, freq_info); + + new_freq = clk_get_rate(priv->dram_core); + if (ret) + dev_err(dev, "ddrc failed freq switch to %lu from %lu: error %d. now at %lu\n", + *freq, old_freq, ret, new_freq); + else if (*freq != new_freq) + dev_err(dev, "ddrc failed freq update to %lu from %lu, now at %lu\n", + *freq, old_freq, new_freq); + else + dev_dbg(dev, "ddrc freq set to %lu (was %lu)\n", + *freq, old_freq); + + return ret; +} + +static int imx8m_ddrc_get_cur_freq(struct device *dev, unsigned long *freq) +{ + struct imx8m_ddrc *priv = dev_get_drvdata(dev); + + *freq = clk_get_rate(priv->dram_core); + + return 0; +} + +static int imx8m_ddrc_get_dev_status(struct device *dev, + struct devfreq_dev_status *stat) +{ + struct imx8m_ddrc *priv = dev_get_drvdata(dev); + + stat->busy_time = 0; + stat->total_time = 0; + stat->current_frequency = clk_get_rate(priv->dram_core); + + return 0; +} + +static int imx8m_ddrc_init_freq_info(struct device *dev) +{ + struct imx8m_ddrc *priv = dev_get_drvdata(dev); + struct arm_smccc_res res; + int index; + + /* An error here means DDR DVFS API not supported by firmware */ + arm_smccc_smc(IMX_SIP_DDR_DVFS, IMX_SIP_DDR_DVFS_GET_FREQ_COUNT, + 0, 0, 0, 0, 0, 0, &res); + priv->freq_count = res.a0; + if (priv->freq_count <= 0 || + priv->freq_count > IMX8M_DDRC_MAX_FREQ_COUNT) + return -ENODEV; + + for (index = 0; index < priv->freq_count; ++index) { + struct imx8m_ddrc_freq *freq = &priv->freq_table[index]; + + arm_smccc_smc(IMX_SIP_DDR_DVFS, IMX_SIP_DDR_DVFS_GET_FREQ_INFO, + index, 0, 0, 0, 0, 0, &res); + /* Result should be strictly positive */ + if ((long)res.a0 <= 0) + return -ENODEV; + + freq->rate = res.a0; + freq->smcarg = index; + freq->dram_core_parent_index = res.a1; + freq->dram_alt_parent_index = res.a2; + freq->dram_apb_parent_index = res.a3; + + /* dram_core has 2 options: dram_pll or dram_alt_root */ + if (freq->dram_core_parent_index != 1 && + freq->dram_core_parent_index != 2) + return -ENODEV; + /* dram_apb and dram_alt have exactly 8 possible parents */ + if (freq->dram_alt_parent_index > 8 || + freq->dram_apb_parent_index > 8) + return -ENODEV; + /* dram_core from alt requires explicit dram_alt parent */ + if (freq->dram_core_parent_index == 2 && + freq->dram_alt_parent_index == 0) + return -ENODEV; + } + + return 0; +} + +static int imx8m_ddrc_check_opps(struct device *dev) +{ + struct imx8m_ddrc *priv = dev_get_drvdata(dev); + struct imx8m_ddrc_freq *freq_info; + struct dev_pm_opp *opp; + unsigned long freq; + int i, opp_count; + + /* Enumerate DT OPPs and disable those not supported by firmware */ + opp_count = dev_pm_opp_get_opp_count(dev); + if (opp_count < 0) + return opp_count; + for (i = 0, freq = 0; i < opp_count; ++i, ++freq) { + opp = dev_pm_opp_find_freq_ceil(dev, &freq); + if (IS_ERR(opp)) { + dev_err(dev, "Failed enumerating OPPs: %ld\n", + PTR_ERR(opp)); + return PTR_ERR(opp); + } + dev_pm_opp_put(opp); + + freq_info = imx8m_ddrc_find_freq(priv, freq); + if (!freq_info) { + dev_info(dev, "Disable unsupported OPP %luHz %luMT/s\n", + freq, DIV_ROUND_CLOSEST(freq, 250000)); + dev_pm_opp_disable(dev, freq); + } + } + + return 0; +} + +static void imx8m_ddrc_exit(struct device *dev) +{ + dev_pm_opp_of_remove_table(dev); +} + +static int imx8m_ddrc_probe(struct platform_device *pdev) +{ + struct device *dev = &pdev->dev; + struct imx8m_ddrc *priv; + const char *gov = DEVFREQ_GOV_USERSPACE; + int ret; + + priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + + platform_set_drvdata(pdev, priv); + + ret = imx8m_ddrc_init_freq_info(dev); + if (ret) { + dev_err(dev, "failed to init firmware freq info: %d\n", ret); + return ret; + } + + priv->dram_core = devm_clk_get(dev, "core"); + if (IS_ERR(priv->dram_core)) { + ret = PTR_ERR(priv->dram_core); + dev_err(dev, "failed to fetch core clock: %d\n", ret); + return ret; + } + priv->dram_pll = devm_clk_get(dev, "pll"); + if (IS_ERR(priv->dram_pll)) { + ret = PTR_ERR(priv->dram_pll); + dev_err(dev, "failed to fetch pll clock: %d\n", ret); + return ret; + } + priv->dram_alt = devm_clk_get(dev, "alt"); + if (IS_ERR(priv->dram_alt)) { + ret = PTR_ERR(priv->dram_alt); + dev_err(dev, "failed to fetch alt clock: %d\n", ret); + return ret; + } + priv->dram_apb = devm_clk_get(dev, "apb"); + if (IS_ERR(priv->dram_apb)) { + ret = PTR_ERR(priv->dram_apb); + dev_err(dev, "failed to fetch apb clock: %d\n", ret); + return ret; + } + + ret = dev_pm_opp_of_add_table(dev); + if (ret < 0) { + dev_err(dev, "failed to get OPP table\n"); + return ret; + } + + ret = imx8m_ddrc_check_opps(dev); + if (ret < 0) + goto err; + + priv->profile.polling_ms = 1000; + priv->profile.target = imx8m_ddrc_target; + priv->profile.get_dev_status = imx8m_ddrc_get_dev_status; + priv->profile.exit = imx8m_ddrc_exit; + priv->profile.get_cur_freq = imx8m_ddrc_get_cur_freq; + priv->profile.initial_freq = clk_get_rate(priv->dram_core); + + priv->devfreq = devm_devfreq_add_device(dev, &priv->profile, + gov, NULL); + if (IS_ERR(priv->devfreq)) { + ret = PTR_ERR(priv->devfreq); + dev_err(dev, "failed to add devfreq device: %d\n", ret); + goto err; + } + + return 0; + +err: + dev_pm_opp_of_remove_table(dev); + return ret; +} + +static const struct of_device_id imx8m_ddrc_of_match[] = { + { .compatible = "fsl,imx8m-ddrc", }, + { /* sentinel */ }, +}; +MODULE_DEVICE_TABLE(of, imx8m_ddrc_of_match); + +static struct platform_driver imx8m_ddrc_platdrv = { + .probe = imx8m_ddrc_probe, + .driver = { + .name = "imx8m-ddrc-devfreq", + .of_match_table = of_match_ptr(imx8m_ddrc_of_match), + }, +}; +module_platform_driver(imx8m_ddrc_platdrv); + +MODULE_DESCRIPTION("i.MX8M DDR Controller frequency driver"); +MODULE_AUTHOR("Leonard Crestez <leonard.crestez@nxp.com>"); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/devfreq/rk3399_dmc.c b/drivers/devfreq/rk3399_dmc.c index 2e65d7279d79..24f04f78285b 100644 --- a/drivers/devfreq/rk3399_dmc.c +++ b/drivers/devfreq/rk3399_dmc.c @@ -364,7 +364,8 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev) if (res.a0) { dev_err(dev, "Failed to set dram param: %ld\n", res.a0); - return -EINVAL; + ret = -EINVAL; + goto err_edev; } } } @@ -372,8 +373,11 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev) node = of_parse_phandle(np, "rockchip,pmu", 0); if (node) { data->regmap_pmu = syscon_node_to_regmap(node); - if (IS_ERR(data->regmap_pmu)) - return PTR_ERR(data->regmap_pmu); + of_node_put(node); + if (IS_ERR(data->regmap_pmu)) { + ret = PTR_ERR(data->regmap_pmu); + goto err_edev; + } } regmap_read(data->regmap_pmu, RK3399_PMUGRF_OS_REG2, &val); @@ -391,7 +395,8 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev) data->odt_dis_freq = data->timing.lpddr4_odt_dis_freq; break; default: - return -EINVAL; + ret = -EINVAL; + goto err_edev; }; arm_smccc_smc(ROCKCHIP_SIP_DRAM_FREQ, 0, 0, @@ -425,7 +430,8 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev) */ if (dev_pm_opp_of_add_table(dev)) { dev_err(dev, "Invalid operating-points in device tree.\n"); - return -EINVAL; + ret = -EINVAL; + goto err_edev; } of_property_read_u32(np, "upthreshold", @@ -465,6 +471,9 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev) err_free_opp: dev_pm_opp_of_remove_table(&pdev->dev); +err_edev: + devfreq_event_disable_edev(data->edev); + return ret; } diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 75fd2a7b0842..9f38ff02a7b8 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -41,6 +41,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt +#include <linux/acpi.h> #include <linux/kernel.h> #include <linux/cpuidle.h> #include <linux/tick.h> @@ -79,6 +80,7 @@ struct idle_cpu { unsigned long auto_demotion_disable_flags; bool byt_auto_demotion_disable_flag; bool disable_promotion_to_c1e; + bool use_acpi; }; static const struct idle_cpu *icpu; @@ -90,6 +92,11 @@ static void intel_idle_s2idle(struct cpuidle_device *dev, static struct cpuidle_state *cpuidle_state_table; /* + * Enable this state by default even if the ACPI _CST does not list it. + */ +#define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15) + +/* * Set this flag for states where the HW flushes the TLB for us * and so we don't need cross-calls to keep it consistent. * If this flag is set, SW flushes the TLB, so even if the @@ -124,7 +131,7 @@ static struct cpuidle_state nehalem_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -161,7 +168,7 @@ static struct cpuidle_state snb_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -296,7 +303,7 @@ static struct cpuidle_state ivb_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -341,7 +348,7 @@ static struct cpuidle_state ivt_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 80, .enter = &intel_idle, @@ -378,7 +385,7 @@ static struct cpuidle_state ivt_cstates_4s[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 250, .enter = &intel_idle, @@ -415,7 +422,7 @@ static struct cpuidle_state ivt_cstates_8s[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 500, .enter = &intel_idle, @@ -452,7 +459,7 @@ static struct cpuidle_state hsw_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -520,7 +527,7 @@ static struct cpuidle_state bdw_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -589,7 +596,7 @@ static struct cpuidle_state skl_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -658,7 +665,7 @@ static struct cpuidle_state skx_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -808,7 +815,7 @@ static struct cpuidle_state bxt_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -869,7 +876,7 @@ static struct cpuidle_state dnv_cstates[] = { { .name = "C1E", .desc = "MWAIT 0x01", - .flags = MWAIT2flg(0x01), + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE, .exit_latency = 10, .target_residency = 20, .enter = &intel_idle, @@ -944,37 +951,19 @@ static void intel_idle_s2idle(struct cpuidle_device *dev, mwait_idle_with_hints(eax, ecx); } -static void __setup_broadcast_timer(bool on) -{ - if (on) - tick_broadcast_enable(); - else - tick_broadcast_disable(); -} - -static void auto_demotion_disable(void) -{ - unsigned long long msr_bits; - - rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr_bits); - msr_bits &= ~(icpu->auto_demotion_disable_flags); - wrmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr_bits); -} -static void c1e_promotion_disable(void) -{ - unsigned long long msr_bits; - - rdmsrl(MSR_IA32_POWER_CTL, msr_bits); - msr_bits &= ~0x2; - wrmsrl(MSR_IA32_POWER_CTL, msr_bits); -} - static const struct idle_cpu idle_cpu_nehalem = { .state_table = nehalem_cstates, .auto_demotion_disable_flags = NHM_C1_AUTO_DEMOTE | NHM_C3_AUTO_DEMOTE, .disable_promotion_to_c1e = true, }; +static const struct idle_cpu idle_cpu_nhx = { + .state_table = nehalem_cstates, + .auto_demotion_disable_flags = NHM_C1_AUTO_DEMOTE | NHM_C3_AUTO_DEMOTE, + .disable_promotion_to_c1e = true, + .use_acpi = true, +}; + static const struct idle_cpu idle_cpu_atom = { .state_table = atom_cstates, }; @@ -993,6 +982,12 @@ static const struct idle_cpu idle_cpu_snb = { .disable_promotion_to_c1e = true, }; +static const struct idle_cpu idle_cpu_snx = { + .state_table = snb_cstates, + .disable_promotion_to_c1e = true, + .use_acpi = true, +}; + static const struct idle_cpu idle_cpu_byt = { .state_table = byt_cstates, .disable_promotion_to_c1e = true, @@ -1013,6 +1008,7 @@ static const struct idle_cpu idle_cpu_ivb = { static const struct idle_cpu idle_cpu_ivt = { .state_table = ivt_cstates, .disable_promotion_to_c1e = true, + .use_acpi = true, }; static const struct idle_cpu idle_cpu_hsw = { @@ -1020,11 +1016,23 @@ static const struct idle_cpu idle_cpu_hsw = { .disable_promotion_to_c1e = true, }; +static const struct idle_cpu idle_cpu_hsx = { + .state_table = hsw_cstates, + .disable_promotion_to_c1e = true, + .use_acpi = true, +}; + static const struct idle_cpu idle_cpu_bdw = { .state_table = bdw_cstates, .disable_promotion_to_c1e = true, }; +static const struct idle_cpu idle_cpu_bdx = { + .state_table = bdw_cstates, + .disable_promotion_to_c1e = true, + .use_acpi = true, +}; + static const struct idle_cpu idle_cpu_skl = { .state_table = skl_cstates, .disable_promotion_to_c1e = true, @@ -1033,15 +1041,18 @@ static const struct idle_cpu idle_cpu_skl = { static const struct idle_cpu idle_cpu_skx = { .state_table = skx_cstates, .disable_promotion_to_c1e = true, + .use_acpi = true, }; static const struct idle_cpu idle_cpu_avn = { .state_table = avn_cstates, .disable_promotion_to_c1e = true, + .use_acpi = true, }; static const struct idle_cpu idle_cpu_knl = { .state_table = knl_cstates, + .use_acpi = true, }; static const struct idle_cpu idle_cpu_bxt = { @@ -1052,20 +1063,21 @@ static const struct idle_cpu idle_cpu_bxt = { static const struct idle_cpu idle_cpu_dnv = { .state_table = dnv_cstates, .disable_promotion_to_c1e = true, + .use_acpi = true, }; static const struct x86_cpu_id intel_idle_ids[] __initconst = { - INTEL_CPU_FAM6(NEHALEM_EP, idle_cpu_nehalem), + INTEL_CPU_FAM6(NEHALEM_EP, idle_cpu_nhx), INTEL_CPU_FAM6(NEHALEM, idle_cpu_nehalem), INTEL_CPU_FAM6(NEHALEM_G, idle_cpu_nehalem), INTEL_CPU_FAM6(WESTMERE, idle_cpu_nehalem), - INTEL_CPU_FAM6(WESTMERE_EP, idle_cpu_nehalem), - INTEL_CPU_FAM6(NEHALEM_EX, idle_cpu_nehalem), + INTEL_CPU_FAM6(WESTMERE_EP, idle_cpu_nhx), + INTEL_CPU_FAM6(NEHALEM_EX, idle_cpu_nhx), INTEL_CPU_FAM6(ATOM_BONNELL, idle_cpu_atom), INTEL_CPU_FAM6(ATOM_BONNELL_MID, idle_cpu_lincroft), - INTEL_CPU_FAM6(WESTMERE_EX, idle_cpu_nehalem), + INTEL_CPU_FAM6(WESTMERE_EX, idle_cpu_nhx), INTEL_CPU_FAM6(SANDYBRIDGE, idle_cpu_snb), - INTEL_CPU_FAM6(SANDYBRIDGE_X, idle_cpu_snb), + INTEL_CPU_FAM6(SANDYBRIDGE_X, idle_cpu_snx), INTEL_CPU_FAM6(ATOM_SALTWELL, idle_cpu_atom), INTEL_CPU_FAM6(ATOM_SILVERMONT, idle_cpu_byt), INTEL_CPU_FAM6(ATOM_SILVERMONT_MID, idle_cpu_tangier), @@ -1073,14 +1085,14 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = { INTEL_CPU_FAM6(IVYBRIDGE, idle_cpu_ivb), INTEL_CPU_FAM6(IVYBRIDGE_X, idle_cpu_ivt), INTEL_CPU_FAM6(HASWELL, idle_cpu_hsw), - INTEL_CPU_FAM6(HASWELL_X, idle_cpu_hsw), + INTEL_CPU_FAM6(HASWELL_X, idle_cpu_hsx), INTEL_CPU_FAM6(HASWELL_L, idle_cpu_hsw), INTEL_CPU_FAM6(HASWELL_G, idle_cpu_hsw), INTEL_CPU_FAM6(ATOM_SILVERMONT_D, idle_cpu_avn), INTEL_CPU_FAM6(BROADWELL, idle_cpu_bdw), INTEL_CPU_FAM6(BROADWELL_G, idle_cpu_bdw), - INTEL_CPU_FAM6(BROADWELL_X, idle_cpu_bdw), - INTEL_CPU_FAM6(BROADWELL_D, idle_cpu_bdw), + INTEL_CPU_FAM6(BROADWELL_X, idle_cpu_bdx), + INTEL_CPU_FAM6(BROADWELL_D, idle_cpu_bdx), INTEL_CPU_FAM6(SKYLAKE_L, idle_cpu_skl), INTEL_CPU_FAM6(SKYLAKE, idle_cpu_skl), INTEL_CPU_FAM6(KABYLAKE_L, idle_cpu_skl), @@ -1095,76 +1107,169 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = { {} }; -/* - * intel_idle_probe() +#define INTEL_CPU_FAM6_MWAIT \ + { X86_VENDOR_INTEL, 6, X86_MODEL_ANY, X86_FEATURE_MWAIT, 0 } + +static const struct x86_cpu_id intel_mwait_ids[] __initconst = { + INTEL_CPU_FAM6_MWAIT, + {} +}; + +static bool __init intel_idle_max_cstate_reached(int cstate) +{ + if (cstate + 1 > max_cstate) { + pr_info("max_cstate %d reached\n", max_cstate); + return true; + } + return false; +} + +#ifdef CONFIG_ACPI_PROCESSOR_CSTATE +#include <acpi/processor.h> + +static bool no_acpi __read_mostly; +module_param(no_acpi, bool, 0444); +MODULE_PARM_DESC(no_acpi, "Do not use ACPI _CST for building the idle states list"); + +static struct acpi_processor_power acpi_state_table __initdata; + +/** + * intel_idle_cst_usable - Check if the _CST information can be used. + * + * Check if all of the C-states listed by _CST in the max_cstate range are + * ACPI_CSTATE_FFH, which means that they should be entered via MWAIT. */ -static int __init intel_idle_probe(void) +static bool __init intel_idle_cst_usable(void) { - unsigned int eax, ebx, ecx; - const struct x86_cpu_id *id; + int cstate, limit; - if (max_cstate == 0) { - pr_debug("disabled\n"); - return -EPERM; - } + limit = min_t(int, min_t(int, CPUIDLE_STATE_MAX, max_cstate + 1), + acpi_state_table.count); - id = x86_match_cpu(intel_idle_ids); - if (!id) { - if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && - boot_cpu_data.x86 == 6) - pr_debug("does not run on family %d model %d\n", - boot_cpu_data.x86, boot_cpu_data.x86_model); - return -ENODEV; + for (cstate = 1; cstate < limit; cstate++) { + struct acpi_processor_cx *cx = &acpi_state_table.states[cstate]; + + if (cx->entry_method != ACPI_CSTATE_FFH) + return false; } - if (!boot_cpu_has(X86_FEATURE_MWAIT)) { - pr_debug("Please enable MWAIT in BIOS SETUP\n"); - return -ENODEV; + return true; +} + +static bool __init intel_idle_acpi_cst_extract(void) +{ + unsigned int cpu; + + if (no_acpi) { + pr_debug("Not allowed to use ACPI _CST\n"); + return false; } - if (boot_cpu_data.cpuid_level < CPUID_MWAIT_LEAF) - return -ENODEV; + for_each_possible_cpu(cpu) { + struct acpi_processor *pr = per_cpu(processors, cpu); - cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &mwait_substates); + if (!pr) + continue; - if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED) || - !(ecx & CPUID5_ECX_INTERRUPT_BREAK) || - !mwait_substates) - return -ENODEV; + if (acpi_processor_evaluate_cst(pr->handle, cpu, &acpi_state_table)) + continue; - pr_debug("MWAIT substates: 0x%x\n", mwait_substates); + acpi_state_table.count++; - icpu = (const struct idle_cpu *)id->driver_data; - cpuidle_state_table = icpu->state_table; + if (!intel_idle_cst_usable()) + continue; - pr_debug("v" INTEL_IDLE_VERSION " model 0x%X\n", - boot_cpu_data.x86_model); + if (!acpi_processor_claim_cst_control()) { + acpi_state_table.count = 0; + return false; + } - return 0; + return true; + } + + pr_debug("ACPI _CST not found or not usable\n"); + return false; } -/* - * intel_idle_cpuidle_devices_uninit() - * Unregisters the cpuidle devices. - */ -static void intel_idle_cpuidle_devices_uninit(void) +static void __init intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { - int i; - struct cpuidle_device *dev; + int cstate, limit = min_t(int, CPUIDLE_STATE_MAX, acpi_state_table.count); + + /* + * If limit > 0, intel_idle_cst_usable() has returned 'true', so all of + * the interesting states are ACPI_CSTATE_FFH. + */ + for (cstate = 1; cstate < limit; cstate++) { + struct acpi_processor_cx *cx; + struct cpuidle_state *state; - for_each_online_cpu(i) { - dev = per_cpu_ptr(intel_idle_cpuidle_devices, i); - cpuidle_unregister_device(dev); + if (intel_idle_max_cstate_reached(cstate)) + break; + + cx = &acpi_state_table.states[cstate]; + + state = &drv->states[drv->state_count++]; + + snprintf(state->name, CPUIDLE_NAME_LEN, "C%d_ACPI", cstate); + strlcpy(state->desc, cx->desc, CPUIDLE_DESC_LEN); + state->exit_latency = cx->latency; + /* + * For C1-type C-states use the same number for both the exit + * latency and target residency, because that is the case for + * C1 in the majority of the static C-states tables above. + * For the other types of C-states, however, set the target + * residency to 3 times the exit latency which should lead to + * a reasonable balance between energy-efficiency and + * performance in the majority of interesting cases. + */ + state->target_residency = cx->latency; + if (cx->type > ACPI_STATE_C1) + state->target_residency *= 3; + + state->flags = MWAIT2flg(cx->address); + if (cx->type > ACPI_STATE_C2) + state->flags |= CPUIDLE_FLAG_TLB_FLUSHED; + + state->enter = intel_idle; + state->enter_s2idle = intel_idle_s2idle; } } +static bool __init intel_idle_off_by_default(u32 mwait_hint) +{ + int cstate, limit; + + /* + * If there are no _CST C-states, do not disable any C-states by + * default. + */ + if (!acpi_state_table.count) + return false; + + limit = min_t(int, CPUIDLE_STATE_MAX, acpi_state_table.count); + /* + * If limit > 0, intel_idle_cst_usable() has returned 'true', so all of + * the interesting states are ACPI_CSTATE_FFH. + */ + for (cstate = 1; cstate < limit; cstate++) { + if (acpi_state_table.states[cstate].address == mwait_hint) + return false; + } + return true; +} +#else /* !CONFIG_ACPI_PROCESSOR_CSTATE */ +static inline bool intel_idle_acpi_cst_extract(void) { return false; } +static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { } +static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; } +#endif /* !CONFIG_ACPI_PROCESSOR_CSTATE */ + /* * ivt_idle_state_table_update(void) * * Tune IVT multi-socket targets * Assumption: num_sockets == (max_package_num + 1) */ -static void ivt_idle_state_table_update(void) +static void __init ivt_idle_state_table_update(void) { /* IVT uses a different table for 1-2, 3-4, and > 4 sockets */ int cpu, package_num, num_sockets = 1; @@ -1187,15 +1292,17 @@ static void ivt_idle_state_table_update(void) /* else, 1 and 2 socket systems use default ivt_cstates */ } -/* - * Translate IRTL (Interrupt Response Time Limit) MSR to usec +/** + * irtl_2_usec - IRTL to microseconds conversion. + * @irtl: IRTL MSR value. + * + * Translate the IRTL (Interrupt Response Time Limit) MSR value to microseconds. */ - -static unsigned int irtl_ns_units[] = { - 1, 32, 1024, 32768, 1048576, 33554432, 0, 0 }; - -static unsigned long long irtl_2_usec(unsigned long long irtl) +static unsigned long long __init irtl_2_usec(unsigned long long irtl) { + static const unsigned int irtl_ns_units[] __initconst = { + 1, 32, 1024, 32768, 1048576, 33554432, 0, 0 + }; unsigned long long ns; if (!irtl) @@ -1203,15 +1310,16 @@ static unsigned long long irtl_2_usec(unsigned long long irtl) ns = irtl_ns_units[(irtl >> 10) & 0x7]; - return div64_u64((irtl & 0x3FF) * ns, 1000); + return div_u64((irtl & 0x3FF) * ns, NSEC_PER_USEC); } + /* * bxt_idle_state_table_update(void) * * On BXT, we trust the IRTL to show the definitive maximum latency * We use the same value for target_residency. */ -static void bxt_idle_state_table_update(void) +static void __init bxt_idle_state_table_update(void) { unsigned long long msr; unsigned int usec; @@ -1258,7 +1366,7 @@ static void bxt_idle_state_table_update(void) * On SKL-H (model 0x5e) disable C8 and C9 if: * C10 is enabled and SGX disabled */ -static void sklh_idle_state_table_update(void) +static void __init sklh_idle_state_table_update(void) { unsigned long long msr; unsigned int eax, ebx, ecx, edx; @@ -1294,16 +1402,28 @@ static void sklh_idle_state_table_update(void) skl_cstates[5].flags |= CPUIDLE_FLAG_UNUSABLE; /* C8-SKL */ skl_cstates[6].flags |= CPUIDLE_FLAG_UNUSABLE; /* C9-SKL */ } -/* - * intel_idle_state_table_update() - * - * Update the default state_table for this CPU-id - */ -static void intel_idle_state_table_update(void) +static bool __init intel_idle_verify_cstate(unsigned int mwait_hint) { - switch (boot_cpu_data.x86_model) { + unsigned int mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1; + unsigned int num_substates = (mwait_substates >> mwait_cstate * 4) & + MWAIT_SUBSTATE_MASK; + + /* Ignore the C-state if there are NO sub-states in CPUID for it. */ + if (num_substates == 0) + return false; + + if (mwait_cstate > 2 && !boot_cpu_has(X86_FEATURE_NONSTOP_TSC)) + mark_tsc_unstable("TSC halts in idle states deeper than C2"); + return true; +} + +static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv) +{ + int cstate; + + switch (boot_cpu_data.x86_model) { case INTEL_FAM6_IVYBRIDGE_X: ivt_idle_state_table_update(); break; @@ -1315,62 +1435,36 @@ static void intel_idle_state_table_update(void) sklh_idle_state_table_update(); break; } -} - -/* - * intel_idle_cpuidle_driver_init() - * allocate, initialize cpuidle_states - */ -static void __init intel_idle_cpuidle_driver_init(void) -{ - int cstate; - struct cpuidle_driver *drv = &intel_idle_driver; - - intel_idle_state_table_update(); - - cpuidle_poll_state_init(drv); - drv->state_count = 1; for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) { - int num_substates, mwait_hint, mwait_cstate; + unsigned int mwait_hint; - if ((cpuidle_state_table[cstate].enter == NULL) && - (cpuidle_state_table[cstate].enter_s2idle == NULL)) + if (intel_idle_max_cstate_reached(cstate)) break; - if (cstate + 1 > max_cstate) { - pr_info("max_cstate %d reached\n", max_cstate); + if (!cpuidle_state_table[cstate].enter && + !cpuidle_state_table[cstate].enter_s2idle) break; - } - - mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags); - mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint); - - /* number of sub-states for this state in CPUID.MWAIT */ - num_substates = (mwait_substates >> ((mwait_cstate + 1) * 4)) - & MWAIT_SUBSTATE_MASK; - - /* if NO sub-states for this state in CPUID, skip it */ - if (num_substates == 0) - continue; - /* if state marked as disabled, skip it */ + /* If marked as unusable, skip this state. */ if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_UNUSABLE) { pr_debug("state %s is disabled\n", cpuidle_state_table[cstate].name); continue; } + mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags); + if (!intel_idle_verify_cstate(mwait_hint)) + continue; - if (((mwait_cstate + 1) > 2) && - !boot_cpu_has(X86_FEATURE_NONSTOP_TSC)) - mark_tsc_unstable("TSC halts in idle" - " states deeper than C2"); + /* Structure copy. */ + drv->states[drv->state_count] = cpuidle_state_table[cstate]; - drv->states[drv->state_count] = /* structure copy */ - cpuidle_state_table[cstate]; + if (icpu->use_acpi && intel_idle_off_by_default(mwait_hint) && + !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE)) + drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF; - drv->state_count += 1; + drv->state_count++; } if (icpu->byt_auto_demotion_disable_flag) { @@ -1379,6 +1473,38 @@ static void __init intel_idle_cpuidle_driver_init(void) } } +/* + * intel_idle_cpuidle_driver_init() + * allocate, initialize cpuidle_states + */ +static void __init intel_idle_cpuidle_driver_init(struct cpuidle_driver *drv) +{ + cpuidle_poll_state_init(drv); + drv->state_count = 1; + + if (icpu) + intel_idle_init_cstates_icpu(drv); + else + intel_idle_init_cstates_acpi(drv); +} + +static void auto_demotion_disable(void) +{ + unsigned long long msr_bits; + + rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr_bits); + msr_bits &= ~(icpu->auto_demotion_disable_flags); + wrmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr_bits); +} + +static void c1e_promotion_disable(void) +{ + unsigned long long msr_bits; + + rdmsrl(MSR_IA32_POWER_CTL, msr_bits); + msr_bits &= ~0x2; + wrmsrl(MSR_IA32_POWER_CTL, msr_bits); +} /* * intel_idle_cpu_init() @@ -1397,6 +1523,9 @@ static int intel_idle_cpu_init(unsigned int cpu) return -EIO; } + if (!icpu) + return 0; + if (icpu->auto_demotion_disable_flags) auto_demotion_disable(); @@ -1411,7 +1540,7 @@ static int intel_idle_cpu_online(unsigned int cpu) struct cpuidle_device *dev; if (lapic_timer_reliable_states != LAPIC_TIMER_ALWAYS_RELIABLE) - __setup_broadcast_timer(true); + tick_broadcast_enable(); /* * Some systems can hotplug a cpu at runtime after @@ -1425,23 +1554,74 @@ static int intel_idle_cpu_online(unsigned int cpu) return 0; } +/** + * intel_idle_cpuidle_devices_uninit - Unregister all cpuidle devices. + */ +static void __init intel_idle_cpuidle_devices_uninit(void) +{ + int i; + + for_each_online_cpu(i) + cpuidle_unregister_device(per_cpu_ptr(intel_idle_cpuidle_devices, i)); +} + static int __init intel_idle_init(void) { + const struct x86_cpu_id *id; + unsigned int eax, ebx, ecx; int retval; /* Do not load intel_idle at all for now if idle= is passed */ if (boot_option_idle_override != IDLE_NO_OVERRIDE) return -ENODEV; - retval = intel_idle_probe(); - if (retval) - return retval; + if (max_cstate == 0) { + pr_debug("disabled\n"); + return -EPERM; + } + + id = x86_match_cpu(intel_idle_ids); + if (id) { + if (!boot_cpu_has(X86_FEATURE_MWAIT)) { + pr_debug("Please enable MWAIT in BIOS SETUP\n"); + return -ENODEV; + } + } else { + id = x86_match_cpu(intel_mwait_ids); + if (!id) + return -ENODEV; + } + + if (boot_cpu_data.cpuid_level < CPUID_MWAIT_LEAF) + return -ENODEV; + + cpuid(CPUID_MWAIT_LEAF, &eax, &ebx, &ecx, &mwait_substates); + + if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED) || + !(ecx & CPUID5_ECX_INTERRUPT_BREAK) || + !mwait_substates) + return -ENODEV; + + pr_debug("MWAIT substates: 0x%x\n", mwait_substates); + + icpu = (const struct idle_cpu *)id->driver_data; + if (icpu) { + cpuidle_state_table = icpu->state_table; + if (icpu->use_acpi) + intel_idle_acpi_cst_extract(); + } else if (!intel_idle_acpi_cst_extract()) { + return -ENODEV; + } + + pr_debug("v" INTEL_IDLE_VERSION " model 0x%X\n", + boot_cpu_data.x86_model); intel_idle_cpuidle_devices = alloc_percpu(struct cpuidle_device); - if (intel_idle_cpuidle_devices == NULL) + if (!intel_idle_cpuidle_devices) return -ENOMEM; - intel_idle_cpuidle_driver_init(); + intel_idle_cpuidle_driver_init(&intel_idle_driver); + retval = cpuidle_register_driver(&intel_idle_driver); if (retval) { struct cpuidle_driver *drv = cpuidle_get_driver(); diff --git a/drivers/opp/core.c b/drivers/opp/core.c index be7a7d332332..ba43e6a3dc0a 100644 --- a/drivers/opp/core.c +++ b/drivers/opp/core.c @@ -988,7 +988,6 @@ static struct opp_table *_allocate_opp_table(struct device *dev, int index) BLOCKING_INIT_NOTIFIER_HEAD(&opp_table->head); INIT_LIST_HEAD(&opp_table->opp_list); kref_init(&opp_table->kref); - kref_init(&opp_table->list_kref); /* Secure the device table modification */ list_add(&opp_table->node, &opp_tables); @@ -1072,33 +1071,6 @@ static void _opp_table_kref_release(struct kref *kref) mutex_unlock(&opp_table_lock); } -void _opp_remove_all_static(struct opp_table *opp_table) -{ - struct dev_pm_opp *opp, *tmp; - - list_for_each_entry_safe(opp, tmp, &opp_table->opp_list, node) { - if (!opp->dynamic) - dev_pm_opp_put(opp); - } - - opp_table->parsed_static_opps = false; -} - -static void _opp_table_list_kref_release(struct kref *kref) -{ - struct opp_table *opp_table = container_of(kref, struct opp_table, - list_kref); - - _opp_remove_all_static(opp_table); - mutex_unlock(&opp_table_lock); -} - -void _put_opp_list_kref(struct opp_table *opp_table) -{ - kref_put_mutex(&opp_table->list_kref, _opp_table_list_kref_release, - &opp_table_lock); -} - void dev_pm_opp_put_opp_table(struct opp_table *opp_table) { kref_put_mutex(&opp_table->kref, _opp_table_kref_release, @@ -1202,6 +1174,24 @@ void dev_pm_opp_remove(struct device *dev, unsigned long freq) } EXPORT_SYMBOL_GPL(dev_pm_opp_remove); +void _opp_remove_all_static(struct opp_table *opp_table) +{ + struct dev_pm_opp *opp, *tmp; + + mutex_lock(&opp_table->lock); + + if (!opp_table->parsed_static_opps || --opp_table->parsed_static_opps) + goto unlock; + + list_for_each_entry_safe(opp, tmp, &opp_table->opp_list, node) { + if (!opp->dynamic) + dev_pm_opp_put_unlocked(opp); + } + +unlock: + mutex_unlock(&opp_table->lock); +} + /** * dev_pm_opp_remove_all_dynamic() - Remove all dynamically created OPPs * @dev: device for which we do this operation @@ -2276,7 +2266,7 @@ void _dev_pm_opp_find_and_remove_table(struct device *dev) return; } - _put_opp_list_kref(opp_table); + _opp_remove_all_static(opp_table); /* Drop reference taken by _find_opp_table() */ dev_pm_opp_put_opp_table(opp_table); diff --git a/drivers/opp/of.c b/drivers/opp/of.c index 1cbb58240b80..9cd8f0adacae 100644 --- a/drivers/opp/of.c +++ b/drivers/opp/of.c @@ -658,17 +658,15 @@ static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table) struct dev_pm_opp *opp; /* OPP table is already initialized for the device */ + mutex_lock(&opp_table->lock); if (opp_table->parsed_static_opps) { - kref_get(&opp_table->list_kref); + opp_table->parsed_static_opps++; + mutex_unlock(&opp_table->lock); return 0; } - /* - * Re-initialize list_kref every time we add static OPPs to the OPP - * table as the reference count may be 0 after the last tie static OPPs - * were removed. - */ - kref_init(&opp_table->list_kref); + opp_table->parsed_static_opps = 1; + mutex_unlock(&opp_table->lock); /* We have opp-table node now, iterate over it and add OPPs */ for_each_available_child_of_node(opp_table->np, np) { @@ -678,15 +676,17 @@ static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table) dev_err(dev, "%s: Failed to add OPP, %d\n", __func__, ret); of_node_put(np); - return ret; + goto remove_static_opp; } else if (opp) { count++; } } /* There should be one of more OPP defined */ - if (WARN_ON(!count)) - return -ENOENT; + if (WARN_ON(!count)) { + ret = -ENOENT; + goto remove_static_opp; + } list_for_each_entry(opp, &opp_table->opp_list, node) pstate_count += !!opp->pstate; @@ -695,15 +695,19 @@ static int _of_add_opp_table_v2(struct device *dev, struct opp_table *opp_table) if (pstate_count && pstate_count != count) { dev_err(dev, "Not all nodes have performance state set (%d: %d)\n", count, pstate_count); - return -ENOENT; + ret = -ENOENT; + goto remove_static_opp; } if (pstate_count) opp_table->genpd_performance_state = true; - opp_table->parsed_static_opps = true; - return 0; + +remove_static_opp: + _opp_remove_all_static(opp_table); + + return ret; } /* Initializes OPP tables based on old-deprecated bindings */ @@ -738,6 +742,7 @@ static int _of_add_opp_table_v1(struct device *dev, struct opp_table *opp_table) if (ret) { dev_err(dev, "%s: Failed to add OPP %ld (%d)\n", __func__, freq, ret); + _opp_remove_all_static(opp_table); return ret; } nr -= 2; diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h index 01a500e2c40a..d14e27102730 100644 --- a/drivers/opp/opp.h +++ b/drivers/opp/opp.h @@ -127,11 +127,10 @@ enum opp_table_access { * @dev_list: list of devices that share these OPPs * @opp_list: table of opps * @kref: for reference count of the table. - * @list_kref: for reference count of the OPP list. * @lock: mutex protecting the opp_list and dev_list. * @np: struct device_node pointer for opp's DT node. * @clock_latency_ns_max: Max clock latency in nanoseconds. - * @parsed_static_opps: True if OPPs are initialized from DT. + * @parsed_static_opps: Count of devices for which OPPs are initialized from DT. * @shared_opp: OPP is shared between multiple devices. * @suspend_opp: Pointer to OPP to be used during device suspend. * @genpd_virt_dev_lock: Mutex protecting the genpd virtual device pointers. @@ -167,7 +166,6 @@ struct opp_table { struct list_head dev_list; struct list_head opp_list; struct kref kref; - struct kref list_kref; struct mutex lock; struct device_node *np; @@ -176,7 +174,7 @@ struct opp_table { /* For backward compatibility with v1 bindings */ unsigned int voltage_tolerance_v1; - bool parsed_static_opps; + unsigned int parsed_static_opps; enum opp_table_access shared_opp; struct dev_pm_opp *suspend_opp; diff --git a/drivers/power/avs/Kconfig b/drivers/power/avs/Kconfig index 089b6244b716..b8fe166cd0d9 100644 --- a/drivers/power/avs/Kconfig +++ b/drivers/power/avs/Kconfig @@ -12,6 +12,22 @@ menuconfig POWER_AVS Say Y here to enable Adaptive Voltage Scaling class support. +config QCOM_CPR + tristate "QCOM Core Power Reduction (CPR) support" + depends on POWER_AVS + select PM_OPP + select REGMAP + help + Say Y here to enable support for the CPR hardware found on Qualcomm + SoCs like QCS404. + + This driver populates CPU OPPs tables and makes adjustments to the + tables based on feedback from the CPR hardware. If you want to do + CPUfrequency scaling say Y here. + + To compile this driver as a module, choose M here: the module will + be called qcom-cpr + config ROCKCHIP_IODOMAIN tristate "Rockchip IO domain support" depends on POWER_AVS && ARCH_ROCKCHIP && OF diff --git a/drivers/power/avs/Makefile b/drivers/power/avs/Makefile index a1b8cd453f19..9007d05853e2 100644 --- a/drivers/power/avs/Makefile +++ b/drivers/power/avs/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only obj-$(CONFIG_POWER_AVS_OMAP) += smartreflex.o +obj-$(CONFIG_QCOM_CPR) += qcom-cpr.o obj-$(CONFIG_ROCKCHIP_IODOMAIN) += rockchip-io-domain.o diff --git a/drivers/power/avs/qcom-cpr.c b/drivers/power/avs/qcom-cpr.c new file mode 100644 index 000000000000..9192fb747653 --- /dev/null +++ b/drivers/power/avs/qcom-cpr.c @@ -0,0 +1,1793 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2013-2015, The Linux Foundation. All rights reserved. + * Copyright (c) 2019, Linaro Limited + */ + +#include <linux/module.h> +#include <linux/err.h> +#include <linux/debugfs.h> +#include <linux/string.h> +#include <linux/kernel.h> +#include <linux/list.h> +#include <linux/init.h> +#include <linux/io.h> +#include <linux/bitops.h> +#include <linux/slab.h> +#include <linux/of.h> +#include <linux/of_device.h> +#include <linux/platform_device.h> +#include <linux/pm_domain.h> +#include <linux/pm_opp.h> +#include <linux/interrupt.h> +#include <linux/regmap.h> +#include <linux/mfd/syscon.h> +#include <linux/regulator/consumer.h> +#include <linux/clk.h> +#include <linux/nvmem-consumer.h> + +/* Register Offsets for RB-CPR and Bit Definitions */ + +/* RBCPR Version Register */ +#define REG_RBCPR_VERSION 0 +#define RBCPR_VER_2 0x02 +#define FLAGS_IGNORE_1ST_IRQ_STATUS BIT(0) + +/* RBCPR Gate Count and Target Registers */ +#define REG_RBCPR_GCNT_TARGET(n) (0x60 + 4 * (n)) + +#define RBCPR_GCNT_TARGET_TARGET_SHIFT 0 +#define RBCPR_GCNT_TARGET_TARGET_MASK GENMASK(11, 0) +#define RBCPR_GCNT_TARGET_GCNT_SHIFT 12 +#define RBCPR_GCNT_TARGET_GCNT_MASK GENMASK(9, 0) + +/* RBCPR Timer Control */ +#define REG_RBCPR_TIMER_INTERVAL 0x44 +#define REG_RBIF_TIMER_ADJUST 0x4c + +#define RBIF_TIMER_ADJ_CONS_UP_MASK GENMASK(3, 0) +#define RBIF_TIMER_ADJ_CONS_UP_SHIFT 0 +#define RBIF_TIMER_ADJ_CONS_DOWN_MASK GENMASK(3, 0) +#define RBIF_TIMER_ADJ_CONS_DOWN_SHIFT 4 +#define RBIF_TIMER_ADJ_CLAMP_INT_MASK GENMASK(7, 0) +#define RBIF_TIMER_ADJ_CLAMP_INT_SHIFT 8 + +/* RBCPR Config Register */ +#define REG_RBIF_LIMIT 0x48 +#define RBIF_LIMIT_CEILING_MASK GENMASK(5, 0) +#define RBIF_LIMIT_CEILING_SHIFT 6 +#define RBIF_LIMIT_FLOOR_BITS 6 +#define RBIF_LIMIT_FLOOR_MASK GENMASK(5, 0) + +#define RBIF_LIMIT_CEILING_DEFAULT RBIF_LIMIT_CEILING_MASK +#define RBIF_LIMIT_FLOOR_DEFAULT 0 + +#define REG_RBIF_SW_VLEVEL 0x94 +#define RBIF_SW_VLEVEL_DEFAULT 0x20 + +#define REG_RBCPR_STEP_QUOT 0x80 +#define RBCPR_STEP_QUOT_STEPQUOT_MASK GENMASK(7, 0) +#define RBCPR_STEP_QUOT_IDLE_CLK_MASK GENMASK(3, 0) +#define RBCPR_STEP_QUOT_IDLE_CLK_SHIFT 8 + +/* RBCPR Control Register */ +#define REG_RBCPR_CTL 0x90 + +#define RBCPR_CTL_LOOP_EN BIT(0) +#define RBCPR_CTL_TIMER_EN BIT(3) +#define RBCPR_CTL_SW_AUTO_CONT_ACK_EN BIT(5) +#define RBCPR_CTL_SW_AUTO_CONT_NACK_DN_EN BIT(6) +#define RBCPR_CTL_COUNT_MODE BIT(10) +#define RBCPR_CTL_UP_THRESHOLD_MASK GENMASK(3, 0) +#define RBCPR_CTL_UP_THRESHOLD_SHIFT 24 +#define RBCPR_CTL_DN_THRESHOLD_MASK GENMASK(3, 0) +#define RBCPR_CTL_DN_THRESHOLD_SHIFT 28 + +/* RBCPR Ack/Nack Response */ +#define REG_RBIF_CONT_ACK_CMD 0x98 +#define REG_RBIF_CONT_NACK_CMD 0x9c + +/* RBCPR Result status Register */ +#define REG_RBCPR_RESULT_0 0xa0 + +#define RBCPR_RESULT0_BUSY_SHIFT 19 +#define RBCPR_RESULT0_BUSY_MASK BIT(RBCPR_RESULT0_BUSY_SHIFT) +#define RBCPR_RESULT0_ERROR_LT0_SHIFT 18 +#define RBCPR_RESULT0_ERROR_SHIFT 6 +#define RBCPR_RESULT0_ERROR_MASK GENMASK(11, 0) +#define RBCPR_RESULT0_ERROR_STEPS_SHIFT 2 +#define RBCPR_RESULT0_ERROR_STEPS_MASK GENMASK(3, 0) +#define RBCPR_RESULT0_STEP_UP_SHIFT 1 + +/* RBCPR Interrupt Control Register */ +#define REG_RBIF_IRQ_EN(n) (0x100 + 4 * (n)) +#define REG_RBIF_IRQ_CLEAR 0x110 +#define REG_RBIF_IRQ_STATUS 0x114 + +#define CPR_INT_DONE BIT(0) +#define CPR_INT_MIN BIT(1) +#define CPR_INT_DOWN BIT(2) +#define CPR_INT_MID BIT(3) +#define CPR_INT_UP BIT(4) +#define CPR_INT_MAX BIT(5) +#define CPR_INT_CLAMP BIT(6) +#define CPR_INT_ALL (CPR_INT_DONE | CPR_INT_MIN | CPR_INT_DOWN | \ + CPR_INT_MID | CPR_INT_UP | CPR_INT_MAX | CPR_INT_CLAMP) +#define CPR_INT_DEFAULT (CPR_INT_UP | CPR_INT_DOWN) + +#define CPR_NUM_RING_OSC 8 + +/* CPR eFuse parameters */ +#define CPR_FUSE_TARGET_QUOT_BITS_MASK GENMASK(11, 0) + +#define CPR_FUSE_MIN_QUOT_DIFF 50 + +#define FUSE_REVISION_UNKNOWN (-1) + +enum voltage_change_dir { + NO_CHANGE, + DOWN, + UP, +}; + +struct cpr_fuse { + char *ring_osc; + char *init_voltage; + char *quotient; + char *quotient_offset; +}; + +struct fuse_corner_data { + int ref_uV; + int max_uV; + int min_uV; + int max_volt_scale; + int max_quot_scale; + /* fuse quot */ + int quot_offset; + int quot_scale; + int quot_adjust; + /* fuse quot_offset */ + int quot_offset_scale; + int quot_offset_adjust; +}; + +struct cpr_fuses { + int init_voltage_step; + int init_voltage_width; + struct fuse_corner_data *fuse_corner_data; +}; + +struct corner_data { + unsigned int fuse_corner; + unsigned long freq; +}; + +struct cpr_desc { + unsigned int num_fuse_corners; + int min_diff_quot; + int *step_quot; + + unsigned int timer_delay_us; + unsigned int timer_cons_up; + unsigned int timer_cons_down; + unsigned int up_threshold; + unsigned int down_threshold; + unsigned int idle_clocks; + unsigned int gcnt_us; + unsigned int vdd_apc_step_up_limit; + unsigned int vdd_apc_step_down_limit; + unsigned int clamp_timer_interval; + + struct cpr_fuses cpr_fuses; + bool reduce_to_fuse_uV; + bool reduce_to_corner_uV; +}; + +struct acc_desc { + unsigned int enable_reg; + u32 enable_mask; + + struct reg_sequence *config; + struct reg_sequence *settings; + int num_regs_per_fuse; +}; + +struct cpr_acc_desc { + const struct cpr_desc *cpr_desc; + const struct acc_desc *acc_desc; +}; + +struct fuse_corner { + int min_uV; + int max_uV; + int uV; + int quot; + int step_quot; + const struct reg_sequence *accs; + int num_accs; + unsigned long max_freq; + u8 ring_osc_idx; +}; + +struct corner { + int min_uV; + int max_uV; + int uV; + int last_uV; + int quot_adjust; + u32 save_ctl; + u32 save_irq; + unsigned long freq; + struct fuse_corner *fuse_corner; +}; + +struct cpr_drv { + unsigned int num_corners; + unsigned int ref_clk_khz; + + struct generic_pm_domain pd; + struct device *dev; + struct device *attached_cpu_dev; + struct mutex lock; + void __iomem *base; + struct corner *corner; + struct regulator *vdd_apc; + struct clk *cpu_clk; + struct regmap *tcsr; + bool loop_disabled; + u32 gcnt; + unsigned long flags; + + struct fuse_corner *fuse_corners; + struct corner *corners; + + const struct cpr_desc *desc; + const struct acc_desc *acc_desc; + const struct cpr_fuse *cpr_fuses; + + struct dentry *debugfs; +}; + +static bool cpr_is_allowed(struct cpr_drv *drv) +{ + return !drv->loop_disabled; +} + +static void cpr_write(struct cpr_drv *drv, u32 offset, u32 value) +{ + writel_relaxed(value, drv->base + offset); +} + +static u32 cpr_read(struct cpr_drv *drv, u32 offset) +{ + return readl_relaxed(drv->base + offset); +} + +static void +cpr_masked_write(struct cpr_drv *drv, u32 offset, u32 mask, u32 value) +{ + u32 val; + + val = readl_relaxed(drv->base + offset); + val &= ~mask; + val |= value & mask; + writel_relaxed(val, drv->base + offset); +} + +static void cpr_irq_clr(struct cpr_drv *drv) +{ + cpr_write(drv, REG_RBIF_IRQ_CLEAR, CPR_INT_ALL); +} + +static void cpr_irq_clr_nack(struct cpr_drv *drv) +{ + cpr_irq_clr(drv); + cpr_write(drv, REG_RBIF_CONT_NACK_CMD, 1); +} + +static void cpr_irq_clr_ack(struct cpr_drv *drv) +{ + cpr_irq_clr(drv); + cpr_write(drv, REG_RBIF_CONT_ACK_CMD, 1); +} + +static void cpr_irq_set(struct cpr_drv *drv, u32 int_bits) +{ + cpr_write(drv, REG_RBIF_IRQ_EN(0), int_bits); +} + +static void cpr_ctl_modify(struct cpr_drv *drv, u32 mask, u32 value) +{ + cpr_masked_write(drv, REG_RBCPR_CTL, mask, value); +} + +static void cpr_ctl_enable(struct cpr_drv *drv, struct corner *corner) +{ + u32 val, mask; + const struct cpr_desc *desc = drv->desc; + + /* Program Consecutive Up & Down */ + val = desc->timer_cons_down << RBIF_TIMER_ADJ_CONS_DOWN_SHIFT; + val |= desc->timer_cons_up << RBIF_TIMER_ADJ_CONS_UP_SHIFT; + mask = RBIF_TIMER_ADJ_CONS_UP_MASK | RBIF_TIMER_ADJ_CONS_DOWN_MASK; + cpr_masked_write(drv, REG_RBIF_TIMER_ADJUST, mask, val); + cpr_masked_write(drv, REG_RBCPR_CTL, + RBCPR_CTL_SW_AUTO_CONT_NACK_DN_EN | + RBCPR_CTL_SW_AUTO_CONT_ACK_EN, + corner->save_ctl); + cpr_irq_set(drv, corner->save_irq); + + if (cpr_is_allowed(drv) && corner->max_uV > corner->min_uV) + val = RBCPR_CTL_LOOP_EN; + else + val = 0; + cpr_ctl_modify(drv, RBCPR_CTL_LOOP_EN, val); +} + +static void cpr_ctl_disable(struct cpr_drv *drv) +{ + cpr_irq_set(drv, 0); + cpr_ctl_modify(drv, RBCPR_CTL_SW_AUTO_CONT_NACK_DN_EN | + RBCPR_CTL_SW_AUTO_CONT_ACK_EN, 0); + cpr_masked_write(drv, REG_RBIF_TIMER_ADJUST, + RBIF_TIMER_ADJ_CONS_UP_MASK | + RBIF_TIMER_ADJ_CONS_DOWN_MASK, 0); + cpr_irq_clr(drv); + cpr_write(drv, REG_RBIF_CONT_ACK_CMD, 1); + cpr_write(drv, REG_RBIF_CONT_NACK_CMD, 1); + cpr_ctl_modify(drv, RBCPR_CTL_LOOP_EN, 0); +} + +static bool cpr_ctl_is_enabled(struct cpr_drv *drv) +{ + u32 reg_val; + + reg_val = cpr_read(drv, REG_RBCPR_CTL); + return reg_val & RBCPR_CTL_LOOP_EN; +} + +static bool cpr_ctl_is_busy(struct cpr_drv *drv) +{ + u32 reg_val; + + reg_val = cpr_read(drv, REG_RBCPR_RESULT_0); + return reg_val & RBCPR_RESULT0_BUSY_MASK; +} + +static void cpr_corner_save(struct cpr_drv *drv, struct corner *corner) +{ + corner->save_ctl = cpr_read(drv, REG_RBCPR_CTL); + corner->save_irq = cpr_read(drv, REG_RBIF_IRQ_EN(0)); +} + +static void cpr_corner_restore(struct cpr_drv *drv, struct corner *corner) +{ + u32 gcnt, ctl, irq, ro_sel, step_quot; + struct fuse_corner *fuse = corner->fuse_corner; + const struct cpr_desc *desc = drv->desc; + int i; + + ro_sel = fuse->ring_osc_idx; + gcnt = drv->gcnt; + gcnt |= fuse->quot - corner->quot_adjust; + + /* Program the step quotient and idle clocks */ + step_quot = desc->idle_clocks << RBCPR_STEP_QUOT_IDLE_CLK_SHIFT; + step_quot |= fuse->step_quot & RBCPR_STEP_QUOT_STEPQUOT_MASK; + cpr_write(drv, REG_RBCPR_STEP_QUOT, step_quot); + + /* Clear the target quotient value and gate count of all ROs */ + for (i = 0; i < CPR_NUM_RING_OSC; i++) + cpr_write(drv, REG_RBCPR_GCNT_TARGET(i), 0); + + cpr_write(drv, REG_RBCPR_GCNT_TARGET(ro_sel), gcnt); + ctl = corner->save_ctl; + cpr_write(drv, REG_RBCPR_CTL, ctl); + irq = corner->save_irq; + cpr_irq_set(drv, irq); + dev_dbg(drv->dev, "gcnt = %#08x, ctl = %#08x, irq = %#08x\n", gcnt, + ctl, irq); +} + +static void cpr_set_acc(struct regmap *tcsr, struct fuse_corner *f, + struct fuse_corner *end) +{ + if (f == end) + return; + + if (f < end) { + for (f += 1; f <= end; f++) + regmap_multi_reg_write(tcsr, f->accs, f->num_accs); + } else { + for (f -= 1; f >= end; f--) + regmap_multi_reg_write(tcsr, f->accs, f->num_accs); + } +} + +static int cpr_pre_voltage(struct cpr_drv *drv, + struct fuse_corner *fuse_corner, + enum voltage_change_dir dir) +{ + struct fuse_corner *prev_fuse_corner = drv->corner->fuse_corner; + + if (drv->tcsr && dir == DOWN) + cpr_set_acc(drv->tcsr, prev_fuse_corner, fuse_corner); + + return 0; +} + +static int cpr_post_voltage(struct cpr_drv *drv, + struct fuse_corner *fuse_corner, + enum voltage_change_dir dir) +{ + struct fuse_corner *prev_fuse_corner = drv->corner->fuse_corner; + + if (drv->tcsr && dir == UP) + cpr_set_acc(drv->tcsr, prev_fuse_corner, fuse_corner); + + return 0; +} + +static int cpr_scale_voltage(struct cpr_drv *drv, struct corner *corner, + int new_uV, enum voltage_change_dir dir) +{ + int ret; + struct fuse_corner *fuse_corner = corner->fuse_corner; + + ret = cpr_pre_voltage(drv, fuse_corner, dir); + if (ret) + return ret; + + ret = regulator_set_voltage(drv->vdd_apc, new_uV, new_uV); + if (ret) { + dev_err_ratelimited(drv->dev, "failed to set apc voltage %d\n", + new_uV); + return ret; + } + + ret = cpr_post_voltage(drv, fuse_corner, dir); + if (ret) + return ret; + + return 0; +} + +static unsigned int cpr_get_cur_perf_state(struct cpr_drv *drv) +{ + return drv->corner ? drv->corner - drv->corners + 1 : 0; +} + +static int cpr_scale(struct cpr_drv *drv, enum voltage_change_dir dir) +{ + u32 val, error_steps, reg_mask; + int last_uV, new_uV, step_uV, ret; + struct corner *corner; + const struct cpr_desc *desc = drv->desc; + + if (dir != UP && dir != DOWN) + return 0; + + step_uV = regulator_get_linear_step(drv->vdd_apc); + if (!step_uV) + return -EINVAL; + + corner = drv->corner; + + val = cpr_read(drv, REG_RBCPR_RESULT_0); + + error_steps = val >> RBCPR_RESULT0_ERROR_STEPS_SHIFT; + error_steps &= RBCPR_RESULT0_ERROR_STEPS_MASK; + last_uV = corner->last_uV; + + if (dir == UP) { + if (desc->clamp_timer_interval && + error_steps < desc->up_threshold) { + /* + * Handle the case where another measurement started + * after the interrupt was triggered due to a core + * exiting from power collapse. + */ + error_steps = max(desc->up_threshold, + desc->vdd_apc_step_up_limit); + } + + if (last_uV >= corner->max_uV) { + cpr_irq_clr_nack(drv); + + /* Maximize the UP threshold */ + reg_mask = RBCPR_CTL_UP_THRESHOLD_MASK; + reg_mask <<= RBCPR_CTL_UP_THRESHOLD_SHIFT; + val = reg_mask; + cpr_ctl_modify(drv, reg_mask, val); + + /* Disable UP interrupt */ + cpr_irq_set(drv, CPR_INT_DEFAULT & ~CPR_INT_UP); + + return 0; + } + + if (error_steps > desc->vdd_apc_step_up_limit) + error_steps = desc->vdd_apc_step_up_limit; + + /* Calculate new voltage */ + new_uV = last_uV + error_steps * step_uV; + new_uV = min(new_uV, corner->max_uV); + + dev_dbg(drv->dev, + "UP: -> new_uV: %d last_uV: %d perf state: %u\n", + new_uV, last_uV, cpr_get_cur_perf_state(drv)); + } else if (dir == DOWN) { + if (desc->clamp_timer_interval && + error_steps < desc->down_threshold) { + /* + * Handle the case where another measurement started + * after the interrupt was triggered due to a core + * exiting from power collapse. + */ + error_steps = max(desc->down_threshold, + desc->vdd_apc_step_down_limit); + } + + if (last_uV <= corner->min_uV) { + cpr_irq_clr_nack(drv); + + /* Enable auto nack down */ + reg_mask = RBCPR_CTL_SW_AUTO_CONT_NACK_DN_EN; + val = RBCPR_CTL_SW_AUTO_CONT_NACK_DN_EN; + + cpr_ctl_modify(drv, reg_mask, val); + + /* Disable DOWN interrupt */ + cpr_irq_set(drv, CPR_INT_DEFAULT & ~CPR_INT_DOWN); + + return 0; + } + + if (error_steps > desc->vdd_apc_step_down_limit) + error_steps = desc->vdd_apc_step_down_limit; + + /* Calculate new voltage */ + new_uV = last_uV - error_steps * step_uV; + new_uV = max(new_uV, corner->min_uV); + + dev_dbg(drv->dev, + "DOWN: -> new_uV: %d last_uV: %d perf state: %u\n", + new_uV, last_uV, cpr_get_cur_perf_state(drv)); + } + + ret = cpr_scale_voltage(drv, corner, new_uV, dir); + if (ret) { + cpr_irq_clr_nack(drv); + return ret; + } + drv->corner->last_uV = new_uV; + + if (dir == UP) { + /* Disable auto nack down */ + reg_mask = RBCPR_CTL_SW_AUTO_CONT_NACK_DN_EN; + val = 0; + } else if (dir == DOWN) { + /* Restore default threshold for UP */ + reg_mask = RBCPR_CTL_UP_THRESHOLD_MASK; + reg_mask <<= RBCPR_CTL_UP_THRESHOLD_SHIFT; + val = desc->up_threshold; + val <<= RBCPR_CTL_UP_THRESHOLD_SHIFT; + } + + cpr_ctl_modify(drv, reg_mask, val); + + /* Re-enable default interrupts */ + cpr_irq_set(drv, CPR_INT_DEFAULT); + + /* Ack */ + cpr_irq_clr_ack(drv); + + return 0; +} + +static irqreturn_t cpr_irq_handler(int irq, void *dev) +{ + struct cpr_drv *drv = dev; + const struct cpr_desc *desc = drv->desc; + irqreturn_t ret = IRQ_HANDLED; + u32 val; + + mutex_lock(&drv->lock); + + val = cpr_read(drv, REG_RBIF_IRQ_STATUS); + if (drv->flags & FLAGS_IGNORE_1ST_IRQ_STATUS) + val = cpr_read(drv, REG_RBIF_IRQ_STATUS); + + dev_dbg(drv->dev, "IRQ_STATUS = %#02x\n", val); + + if (!cpr_ctl_is_enabled(drv)) { + dev_dbg(drv->dev, "CPR is disabled\n"); + ret = IRQ_NONE; + } else if (cpr_ctl_is_busy(drv) && !desc->clamp_timer_interval) { + dev_dbg(drv->dev, "CPR measurement is not ready\n"); + } else if (!cpr_is_allowed(drv)) { + val = cpr_read(drv, REG_RBCPR_CTL); + dev_err_ratelimited(drv->dev, + "Interrupt broken? RBCPR_CTL = %#02x\n", + val); + ret = IRQ_NONE; + } else { + /* + * Following sequence of handling is as per each IRQ's + * priority + */ + if (val & CPR_INT_UP) { + cpr_scale(drv, UP); + } else if (val & CPR_INT_DOWN) { + cpr_scale(drv, DOWN); + } else if (val & CPR_INT_MIN) { + cpr_irq_clr_nack(drv); + } else if (val & CPR_INT_MAX) { + cpr_irq_clr_nack(drv); + } else if (val & CPR_INT_MID) { + /* RBCPR_CTL_SW_AUTO_CONT_ACK_EN is enabled */ + dev_dbg(drv->dev, "IRQ occurred for Mid Flag\n"); + } else { + dev_dbg(drv->dev, + "IRQ occurred for unknown flag (%#08x)\n", val); + } + + /* Save register values for the corner */ + cpr_corner_save(drv, drv->corner); + } + + mutex_unlock(&drv->lock); + + return ret; +} + +static int cpr_enable(struct cpr_drv *drv) +{ + int ret; + + ret = regulator_enable(drv->vdd_apc); + if (ret) + return ret; + + mutex_lock(&drv->lock); + + if (cpr_is_allowed(drv) && drv->corner) { + cpr_irq_clr(drv); + cpr_corner_restore(drv, drv->corner); + cpr_ctl_enable(drv, drv->corner); + } + + mutex_unlock(&drv->lock); + + return 0; +} + +static int cpr_disable(struct cpr_drv *drv) +{ + int ret; + + mutex_lock(&drv->lock); + + if (cpr_is_allowed(drv)) { + cpr_ctl_disable(drv); + cpr_irq_clr(drv); + } + + mutex_unlock(&drv->lock); + + ret = regulator_disable(drv->vdd_apc); + if (ret) + return ret; + + return 0; +} + +static int cpr_config(struct cpr_drv *drv) +{ + int i; + u32 val, gcnt; + struct corner *corner; + const struct cpr_desc *desc = drv->desc; + + /* Disable interrupt and CPR */ + cpr_write(drv, REG_RBIF_IRQ_EN(0), 0); + cpr_write(drv, REG_RBCPR_CTL, 0); + + /* Program the default HW ceiling, floor and vlevel */ + val = (RBIF_LIMIT_CEILING_DEFAULT & RBIF_LIMIT_CEILING_MASK) + << RBIF_LIMIT_CEILING_SHIFT; + val |= RBIF_LIMIT_FLOOR_DEFAULT & RBIF_LIMIT_FLOOR_MASK; + cpr_write(drv, REG_RBIF_LIMIT, val); + cpr_write(drv, REG_RBIF_SW_VLEVEL, RBIF_SW_VLEVEL_DEFAULT); + + /* + * Clear the target quotient value and gate count of all + * ring oscillators + */ + for (i = 0; i < CPR_NUM_RING_OSC; i++) + cpr_write(drv, REG_RBCPR_GCNT_TARGET(i), 0); + + /* Init and save gcnt */ + gcnt = (drv->ref_clk_khz * desc->gcnt_us) / 1000; + gcnt = gcnt & RBCPR_GCNT_TARGET_GCNT_MASK; + gcnt <<= RBCPR_GCNT_TARGET_GCNT_SHIFT; + drv->gcnt = gcnt; + + /* Program the delay count for the timer */ + val = (drv->ref_clk_khz * desc->timer_delay_us) / 1000; + cpr_write(drv, REG_RBCPR_TIMER_INTERVAL, val); + dev_dbg(drv->dev, "Timer count: %#0x (for %d us)\n", val, + desc->timer_delay_us); + + /* Program Consecutive Up & Down */ + val = desc->timer_cons_down << RBIF_TIMER_ADJ_CONS_DOWN_SHIFT; + val |= desc->timer_cons_up << RBIF_TIMER_ADJ_CONS_UP_SHIFT; + val |= desc->clamp_timer_interval << RBIF_TIMER_ADJ_CLAMP_INT_SHIFT; + cpr_write(drv, REG_RBIF_TIMER_ADJUST, val); + + /* Program the control register */ + val = desc->up_threshold << RBCPR_CTL_UP_THRESHOLD_SHIFT; + val |= desc->down_threshold << RBCPR_CTL_DN_THRESHOLD_SHIFT; + val |= RBCPR_CTL_TIMER_EN | RBCPR_CTL_COUNT_MODE; + val |= RBCPR_CTL_SW_AUTO_CONT_ACK_EN; + cpr_write(drv, REG_RBCPR_CTL, val); + + for (i = 0; i < drv->num_corners; i++) { + corner = &drv->corners[i]; + corner->save_ctl = val; + corner->save_irq = CPR_INT_DEFAULT; + } + + cpr_irq_set(drv, CPR_INT_DEFAULT); + + val = cpr_read(drv, REG_RBCPR_VERSION); + if (val <= RBCPR_VER_2) + drv->flags |= FLAGS_IGNORE_1ST_IRQ_STATUS; + + return 0; +} + +static int cpr_set_performance_state(struct generic_pm_domain *domain, + unsigned int state) +{ + struct cpr_drv *drv = container_of(domain, struct cpr_drv, pd); + struct corner *corner, *end; + enum voltage_change_dir dir; + int ret = 0, new_uV; + + mutex_lock(&drv->lock); + + dev_dbg(drv->dev, "%s: setting perf state: %u (prev state: %u)\n", + __func__, state, cpr_get_cur_perf_state(drv)); + + /* + * Determine new corner we're going to. + * Remove one since lowest performance state is 1. + */ + corner = drv->corners + state - 1; + end = &drv->corners[drv->num_corners - 1]; + if (corner > end || corner < drv->corners) { + ret = -EINVAL; + goto unlock; + } + + /* Determine direction */ + if (drv->corner > corner) + dir = DOWN; + else if (drv->corner < corner) + dir = UP; + else + dir = NO_CHANGE; + + if (cpr_is_allowed(drv)) + new_uV = corner->last_uV; + else + new_uV = corner->uV; + + if (cpr_is_allowed(drv)) + cpr_ctl_disable(drv); + + ret = cpr_scale_voltage(drv, corner, new_uV, dir); + if (ret) + goto unlock; + + if (cpr_is_allowed(drv)) { + cpr_irq_clr(drv); + if (drv->corner != corner) + cpr_corner_restore(drv, corner); + cpr_ctl_enable(drv, corner); + } + + drv->corner = corner; + +unlock: + mutex_unlock(&drv->lock); + + return ret; +} + +static int cpr_read_efuse(struct device *dev, const char *cname, u32 *data) +{ + struct nvmem_cell *cell; + ssize_t len; + char *ret; + int i; + + *data = 0; + + cell = nvmem_cell_get(dev, cname); + if (IS_ERR(cell)) { + if (PTR_ERR(cell) != -EPROBE_DEFER) + dev_err(dev, "undefined cell %s\n", cname); + return PTR_ERR(cell); + } + + ret = nvmem_cell_read(cell, &len); + nvmem_cell_put(cell); + if (IS_ERR(ret)) { + dev_err(dev, "can't read cell %s\n", cname); + return PTR_ERR(ret); + } + + for (i = 0; i < len; i++) + *data |= ret[i] << (8 * i); + + kfree(ret); + dev_dbg(dev, "efuse read(%s) = %x, bytes %zd\n", cname, *data, len); + + return 0; +} + +static int +cpr_populate_ring_osc_idx(struct cpr_drv *drv) +{ + struct fuse_corner *fuse = drv->fuse_corners; + struct fuse_corner *end = fuse + drv->desc->num_fuse_corners; + const struct cpr_fuse *fuses = drv->cpr_fuses; + u32 data; + int ret; + + for (; fuse < end; fuse++, fuses++) { + ret = cpr_read_efuse(drv->dev, fuses->ring_osc, + &data); + if (ret) + return ret; + fuse->ring_osc_idx = data; + } + + return 0; +} + +static int cpr_read_fuse_uV(const struct cpr_desc *desc, + const struct fuse_corner_data *fdata, + const char *init_v_efuse, + int step_volt, + struct cpr_drv *drv) +{ + int step_size_uV, steps, uV; + u32 bits = 0; + int ret; + + ret = cpr_read_efuse(drv->dev, init_v_efuse, &bits); + if (ret) + return ret; + + steps = bits & ~BIT(desc->cpr_fuses.init_voltage_width - 1); + /* Not two's complement.. instead highest bit is sign bit */ + if (bits & BIT(desc->cpr_fuses.init_voltage_width - 1)) + steps = -steps; + + step_size_uV = desc->cpr_fuses.init_voltage_step; + + uV = fdata->ref_uV + steps * step_size_uV; + return DIV_ROUND_UP(uV, step_volt) * step_volt; +} + +static int cpr_fuse_corner_init(struct cpr_drv *drv) +{ + const struct cpr_desc *desc = drv->desc; + const struct cpr_fuse *fuses = drv->cpr_fuses; + const struct acc_desc *acc_desc = drv->acc_desc; + int i; + unsigned int step_volt; + struct fuse_corner_data *fdata; + struct fuse_corner *fuse, *end; + int uV; + const struct reg_sequence *accs; + int ret; + + accs = acc_desc->settings; + + step_volt = regulator_get_linear_step(drv->vdd_apc); + if (!step_volt) + return -EINVAL; + + /* Populate fuse_corner members */ + fuse = drv->fuse_corners; + end = &fuse[desc->num_fuse_corners - 1]; + fdata = desc->cpr_fuses.fuse_corner_data; + + for (i = 0; fuse <= end; fuse++, fuses++, i++, fdata++) { + /* + * Update SoC voltages: platforms might choose a different + * regulators than the one used to characterize the algorithms + * (ie, init_voltage_step). + */ + fdata->min_uV = roundup(fdata->min_uV, step_volt); + fdata->max_uV = roundup(fdata->max_uV, step_volt); + + /* Populate uV */ + uV = cpr_read_fuse_uV(desc, fdata, fuses->init_voltage, + step_volt, drv); + if (uV < 0) + return uV; + + fuse->min_uV = fdata->min_uV; + fuse->max_uV = fdata->max_uV; + fuse->uV = clamp(uV, fuse->min_uV, fuse->max_uV); + + if (fuse == end) { + /* + * Allow the highest fuse corner's PVS voltage to + * define the ceiling voltage for that corner in order + * to support SoC's in which variable ceiling values + * are required. + */ + end->max_uV = max(end->max_uV, end->uV); + } + + /* Populate target quotient by scaling */ + ret = cpr_read_efuse(drv->dev, fuses->quotient, &fuse->quot); + if (ret) + return ret; + + fuse->quot *= fdata->quot_scale; + fuse->quot += fdata->quot_offset; + fuse->quot += fdata->quot_adjust; + fuse->step_quot = desc->step_quot[fuse->ring_osc_idx]; + + /* Populate acc settings */ + fuse->accs = accs; + fuse->num_accs = acc_desc->num_regs_per_fuse; + accs += acc_desc->num_regs_per_fuse; + } + + /* + * Restrict all fuse corner PVS voltages based upon per corner + * ceiling and floor voltages. + */ + for (fuse = drv->fuse_corners, i = 0; fuse <= end; fuse++, i++) { + if (fuse->uV > fuse->max_uV) + fuse->uV = fuse->max_uV; + else if (fuse->uV < fuse->min_uV) + fuse->uV = fuse->min_uV; + + ret = regulator_is_supported_voltage(drv->vdd_apc, + fuse->min_uV, + fuse->min_uV); + if (!ret) { + dev_err(drv->dev, + "min uV: %d (fuse corner: %d) not supported by regulator\n", + fuse->min_uV, i); + return -EINVAL; + } + + ret = regulator_is_supported_voltage(drv->vdd_apc, + fuse->max_uV, + fuse->max_uV); + if (!ret) { + dev_err(drv->dev, + "max uV: %d (fuse corner: %d) not supported by regulator\n", + fuse->max_uV, i); + return -EINVAL; + } + + dev_dbg(drv->dev, + "fuse corner %d: [%d %d %d] RO%hhu quot %d squot %d\n", + i, fuse->min_uV, fuse->uV, fuse->max_uV, + fuse->ring_osc_idx, fuse->quot, fuse->step_quot); + } + + return 0; +} + +static int cpr_calculate_scaling(const char *quot_offset, + struct cpr_drv *drv, + const struct fuse_corner_data *fdata, + const struct corner *corner) +{ + u32 quot_diff = 0; + unsigned long freq_diff; + int scaling; + const struct fuse_corner *fuse, *prev_fuse; + int ret; + + fuse = corner->fuse_corner; + prev_fuse = fuse - 1; + + if (quot_offset) { + ret = cpr_read_efuse(drv->dev, quot_offset, "_diff); + if (ret) + return ret; + + quot_diff *= fdata->quot_offset_scale; + quot_diff += fdata->quot_offset_adjust; + } else { + quot_diff = fuse->quot - prev_fuse->quot; + } + + freq_diff = fuse->max_freq - prev_fuse->max_freq; + freq_diff /= 1000000; /* Convert to MHz */ + scaling = 1000 * quot_diff / freq_diff; + return min(scaling, fdata->max_quot_scale); +} + +static int cpr_interpolate(const struct corner *corner, int step_volt, + const struct fuse_corner_data *fdata) +{ + unsigned long f_high, f_low, f_diff; + int uV_high, uV_low, uV; + u64 temp, temp_limit; + const struct fuse_corner *fuse, *prev_fuse; + + fuse = corner->fuse_corner; + prev_fuse = fuse - 1; + + f_high = fuse->max_freq; + f_low = prev_fuse->max_freq; + uV_high = fuse->uV; + uV_low = prev_fuse->uV; + f_diff = fuse->max_freq - corner->freq; + + /* + * Don't interpolate in the wrong direction. This could happen + * if the adjusted fuse voltage overlaps with the previous fuse's + * adjusted voltage. + */ + if (f_high <= f_low || uV_high <= uV_low || f_high <= corner->freq) + return corner->uV; + + temp = f_diff * (uV_high - uV_low); + do_div(temp, f_high - f_low); + + /* + * max_volt_scale has units of uV/MHz while freq values + * have units of Hz. Divide by 1000000 to convert to. + */ + temp_limit = f_diff * fdata->max_volt_scale; + do_div(temp_limit, 1000000); + + uV = uV_high - min(temp, temp_limit); + return roundup(uV, step_volt); +} + +static unsigned int cpr_get_fuse_corner(struct dev_pm_opp *opp) +{ + struct device_node *np; + unsigned int fuse_corner = 0; + + np = dev_pm_opp_get_of_node(opp); + if (of_property_read_u32(np, "qcom,opp-fuse-level", &fuse_corner)) + pr_err("%s: missing 'qcom,opp-fuse-level' property\n", + __func__); + + of_node_put(np); + + return fuse_corner; +} + +static unsigned long cpr_get_opp_hz_for_req(struct dev_pm_opp *ref, + struct device *cpu_dev) +{ + u64 rate = 0; + struct device_node *ref_np; + struct device_node *desc_np; + struct device_node *child_np = NULL; + struct device_node *child_req_np = NULL; + + desc_np = dev_pm_opp_of_get_opp_desc_node(cpu_dev); + if (!desc_np) + return 0; + + ref_np = dev_pm_opp_get_of_node(ref); + if (!ref_np) + goto out_ref; + + do { + of_node_put(child_req_np); + child_np = of_get_next_available_child(desc_np, child_np); + child_req_np = of_parse_phandle(child_np, "required-opps", 0); + } while (child_np && child_req_np != ref_np); + + if (child_np && child_req_np == ref_np) + of_property_read_u64(child_np, "opp-hz", &rate); + + of_node_put(child_req_np); + of_node_put(child_np); + of_node_put(ref_np); +out_ref: + of_node_put(desc_np); + + return (unsigned long) rate; +} + +static int cpr_corner_init(struct cpr_drv *drv) +{ + const struct cpr_desc *desc = drv->desc; + const struct cpr_fuse *fuses = drv->cpr_fuses; + int i, level, scaling = 0; + unsigned int fnum, fc; + const char *quot_offset; + struct fuse_corner *fuse, *prev_fuse; + struct corner *corner, *end; + struct corner_data *cdata; + const struct fuse_corner_data *fdata; + bool apply_scaling; + unsigned long freq_diff, freq_diff_mhz; + unsigned long freq; + int step_volt = regulator_get_linear_step(drv->vdd_apc); + struct dev_pm_opp *opp; + + if (!step_volt) + return -EINVAL; + + corner = drv->corners; + end = &corner[drv->num_corners - 1]; + + cdata = devm_kcalloc(drv->dev, drv->num_corners, + sizeof(struct corner_data), + GFP_KERNEL); + if (!cdata) + return -ENOMEM; + + /* + * Store maximum frequency for each fuse corner based on the frequency + * plan + */ + for (level = 1; level <= drv->num_corners; level++) { + opp = dev_pm_opp_find_level_exact(&drv->pd.dev, level); + if (IS_ERR(opp)) + return -EINVAL; + fc = cpr_get_fuse_corner(opp); + if (!fc) { + dev_pm_opp_put(opp); + return -EINVAL; + } + fnum = fc - 1; + freq = cpr_get_opp_hz_for_req(opp, drv->attached_cpu_dev); + if (!freq) { + dev_pm_opp_put(opp); + return -EINVAL; + } + cdata[level - 1].fuse_corner = fnum; + cdata[level - 1].freq = freq; + + fuse = &drv->fuse_corners[fnum]; + dev_dbg(drv->dev, "freq: %lu level: %u fuse level: %u\n", + freq, dev_pm_opp_get_level(opp) - 1, fnum); + if (freq > fuse->max_freq) + fuse->max_freq = freq; + dev_pm_opp_put(opp); + } + + /* + * Get the quotient adjustment scaling factor, according to: + * + * scaling = min(1000 * (QUOT(corner_N) - QUOT(corner_N-1)) + * / (freq(corner_N) - freq(corner_N-1)), max_factor) + * + * QUOT(corner_N): quotient read from fuse for fuse corner N + * QUOT(corner_N-1): quotient read from fuse for fuse corner (N - 1) + * freq(corner_N): max frequency in MHz supported by fuse corner N + * freq(corner_N-1): max frequency in MHz supported by fuse corner + * (N - 1) + * + * Then walk through the corners mapped to each fuse corner + * and calculate the quotient adjustment for each one using the + * following formula: + * + * quot_adjust = (freq_max - freq_corner) * scaling / 1000 + * + * freq_max: max frequency in MHz supported by the fuse corner + * freq_corner: frequency in MHz corresponding to the corner + * scaling: calculated from above equation + * + * + * + + + * | v | + * q | f c o | f c + * u | c l | c + * o | f t | f + * t | c a | c + * | c f g | c f + * | e | + * +--------------- +---------------- + * 0 1 2 3 4 5 6 0 1 2 3 4 5 6 + * corner corner + * + * c = corner + * f = fuse corner + * + */ + for (apply_scaling = false, i = 0; corner <= end; corner++, i++) { + fnum = cdata[i].fuse_corner; + fdata = &desc->cpr_fuses.fuse_corner_data[fnum]; + quot_offset = fuses[fnum].quotient_offset; + fuse = &drv->fuse_corners[fnum]; + if (fnum) + prev_fuse = &drv->fuse_corners[fnum - 1]; + else + prev_fuse = NULL; + + corner->fuse_corner = fuse; + corner->freq = cdata[i].freq; + corner->uV = fuse->uV; + + if (prev_fuse && cdata[i - 1].freq == prev_fuse->max_freq) { + scaling = cpr_calculate_scaling(quot_offset, drv, + fdata, corner); + if (scaling < 0) + return scaling; + + apply_scaling = true; + } else if (corner->freq == fuse->max_freq) { + /* This is a fuse corner; don't scale anything */ + apply_scaling = false; + } + + if (apply_scaling) { + freq_diff = fuse->max_freq - corner->freq; + freq_diff_mhz = freq_diff / 1000000; + corner->quot_adjust = scaling * freq_diff_mhz / 1000; + + corner->uV = cpr_interpolate(corner, step_volt, fdata); + } + + corner->max_uV = fuse->max_uV; + corner->min_uV = fuse->min_uV; + corner->uV = clamp(corner->uV, corner->min_uV, corner->max_uV); + corner->last_uV = corner->uV; + + /* Reduce the ceiling voltage if needed */ + if (desc->reduce_to_corner_uV && corner->uV < corner->max_uV) + corner->max_uV = corner->uV; + else if (desc->reduce_to_fuse_uV && fuse->uV < corner->max_uV) + corner->max_uV = max(corner->min_uV, fuse->uV); + + dev_dbg(drv->dev, "corner %d: [%d %d %d] quot %d\n", i, + corner->min_uV, corner->uV, corner->max_uV, + fuse->quot - corner->quot_adjust); + } + + return 0; +} + +static const struct cpr_fuse *cpr_get_fuses(struct cpr_drv *drv) +{ + const struct cpr_desc *desc = drv->desc; + struct cpr_fuse *fuses; + int i; + + fuses = devm_kcalloc(drv->dev, desc->num_fuse_corners, + sizeof(struct cpr_fuse), + GFP_KERNEL); + if (!fuses) + return ERR_PTR(-ENOMEM); + + for (i = 0; i < desc->num_fuse_corners; i++) { + char tbuf[32]; + + snprintf(tbuf, 32, "cpr_ring_osc%d", i + 1); + fuses[i].ring_osc = devm_kstrdup(drv->dev, tbuf, GFP_KERNEL); + if (!fuses[i].ring_osc) + return ERR_PTR(-ENOMEM); + + snprintf(tbuf, 32, "cpr_init_voltage%d", i + 1); + fuses[i].init_voltage = devm_kstrdup(drv->dev, tbuf, + GFP_KERNEL); + if (!fuses[i].init_voltage) + return ERR_PTR(-ENOMEM); + + snprintf(tbuf, 32, "cpr_quotient%d", i + 1); + fuses[i].quotient = devm_kstrdup(drv->dev, tbuf, GFP_KERNEL); + if (!fuses[i].quotient) + return ERR_PTR(-ENOMEM); + + snprintf(tbuf, 32, "cpr_quotient_offset%d", i + 1); + fuses[i].quotient_offset = devm_kstrdup(drv->dev, tbuf, + GFP_KERNEL); + if (!fuses[i].quotient_offset) + return ERR_PTR(-ENOMEM); + } + + return fuses; +} + +static void cpr_set_loop_allowed(struct cpr_drv *drv) +{ + drv->loop_disabled = false; +} + +static int cpr_init_parameters(struct cpr_drv *drv) +{ + const struct cpr_desc *desc = drv->desc; + struct clk *clk; + + clk = clk_get(drv->dev, "ref"); + if (IS_ERR(clk)) + return PTR_ERR(clk); + + drv->ref_clk_khz = clk_get_rate(clk) / 1000; + clk_put(clk); + + if (desc->timer_cons_up > RBIF_TIMER_ADJ_CONS_UP_MASK || + desc->timer_cons_down > RBIF_TIMER_ADJ_CONS_DOWN_MASK || + desc->up_threshold > RBCPR_CTL_UP_THRESHOLD_MASK || + desc->down_threshold > RBCPR_CTL_DN_THRESHOLD_MASK || + desc->idle_clocks > RBCPR_STEP_QUOT_IDLE_CLK_MASK || + desc->clamp_timer_interval > RBIF_TIMER_ADJ_CLAMP_INT_MASK) + return -EINVAL; + + dev_dbg(drv->dev, "up threshold = %u, down threshold = %u\n", + desc->up_threshold, desc->down_threshold); + + return 0; +} + +static int cpr_find_initial_corner(struct cpr_drv *drv) +{ + unsigned long rate; + const struct corner *end; + struct corner *iter; + unsigned int i = 0; + + if (!drv->cpu_clk) { + dev_err(drv->dev, "cannot get rate from NULL clk\n"); + return -EINVAL; + } + + end = &drv->corners[drv->num_corners - 1]; + rate = clk_get_rate(drv->cpu_clk); + + /* + * Some bootloaders set a CPU clock frequency that is not defined + * in the OPP table. When running at an unlisted frequency, + * cpufreq_online() will change to the OPP which has the lowest + * frequency, at or above the unlisted frequency. + * Since cpufreq_online() always "rounds up" in the case of an + * unlisted frequency, this function always "rounds down" in case + * of an unlisted frequency. That way, when cpufreq_online() + * triggers the first ever call to cpr_set_performance_state(), + * it will correctly determine the direction as UP. + */ + for (iter = drv->corners; iter <= end; iter++) { + if (iter->freq > rate) + break; + i++; + if (iter->freq == rate) { + drv->corner = iter; + break; + } + if (iter->freq < rate) + drv->corner = iter; + } + + if (!drv->corner) { + dev_err(drv->dev, "boot up corner not found\n"); + return -EINVAL; + } + + dev_dbg(drv->dev, "boot up perf state: %u\n", i); + + return 0; +} + +static const struct cpr_desc qcs404_cpr_desc = { + .num_fuse_corners = 3, + .min_diff_quot = CPR_FUSE_MIN_QUOT_DIFF, + .step_quot = (int []){ 25, 25, 25, }, + .timer_delay_us = 5000, + .timer_cons_up = 0, + .timer_cons_down = 2, + .up_threshold = 1, + .down_threshold = 3, + .idle_clocks = 15, + .gcnt_us = 1, + .vdd_apc_step_up_limit = 1, + .vdd_apc_step_down_limit = 1, + .cpr_fuses = { + .init_voltage_step = 8000, + .init_voltage_width = 6, + .fuse_corner_data = (struct fuse_corner_data[]){ + /* fuse corner 0 */ + { + .ref_uV = 1224000, + .max_uV = 1224000, + .min_uV = 1048000, + .max_volt_scale = 0, + .max_quot_scale = 0, + .quot_offset = 0, + .quot_scale = 1, + .quot_adjust = 0, + .quot_offset_scale = 5, + .quot_offset_adjust = 0, + }, + /* fuse corner 1 */ + { + .ref_uV = 1288000, + .max_uV = 1288000, + .min_uV = 1048000, + .max_volt_scale = 2000, + .max_quot_scale = 1400, + .quot_offset = 0, + .quot_scale = 1, + .quot_adjust = -20, + .quot_offset_scale = 5, + .quot_offset_adjust = 0, + }, + /* fuse corner 2 */ + { + .ref_uV = 1352000, + .max_uV = 1384000, + .min_uV = 1088000, + .max_volt_scale = 2000, + .max_quot_scale = 1400, + .quot_offset = 0, + .quot_scale = 1, + .quot_adjust = 0, + .quot_offset_scale = 5, + .quot_offset_adjust = 0, + }, + }, + }, +}; + +static const struct acc_desc qcs404_acc_desc = { + .settings = (struct reg_sequence[]){ + { 0xb120, 0x1041040 }, + { 0xb124, 0x41 }, + { 0xb120, 0x0 }, + { 0xb124, 0x0 }, + { 0xb120, 0x0 }, + { 0xb124, 0x0 }, + }, + .config = (struct reg_sequence[]){ + { 0xb138, 0xff }, + { 0xb130, 0x5555 }, + }, + .num_regs_per_fuse = 2, +}; + +static const struct cpr_acc_desc qcs404_cpr_acc_desc = { + .cpr_desc = &qcs404_cpr_desc, + .acc_desc = &qcs404_acc_desc, +}; + +static unsigned int cpr_get_performance_state(struct generic_pm_domain *genpd, + struct dev_pm_opp *opp) +{ + return dev_pm_opp_get_level(opp); +} + +static int cpr_power_off(struct generic_pm_domain *domain) +{ + struct cpr_drv *drv = container_of(domain, struct cpr_drv, pd); + + return cpr_disable(drv); +} + +static int cpr_power_on(struct generic_pm_domain *domain) +{ + struct cpr_drv *drv = container_of(domain, struct cpr_drv, pd); + + return cpr_enable(drv); +} + +static int cpr_pd_attach_dev(struct generic_pm_domain *domain, + struct device *dev) +{ + struct cpr_drv *drv = container_of(domain, struct cpr_drv, pd); + const struct acc_desc *acc_desc = drv->acc_desc; + int ret = 0; + + mutex_lock(&drv->lock); + + dev_dbg(drv->dev, "attach callback for: %s\n", dev_name(dev)); + + /* + * This driver only supports scaling voltage for a CPU cluster + * where all CPUs in the cluster share a single regulator. + * Therefore, save the struct device pointer only for the first + * CPU device that gets attached. There is no need to do any + * additional initialization when further CPUs get attached. + */ + if (drv->attached_cpu_dev) + goto unlock; + + /* + * cpr_scale_voltage() requires the direction (if we are changing + * to a higher or lower OPP). The first time + * cpr_set_performance_state() is called, there is no previous + * performance state defined. Therefore, we call + * cpr_find_initial_corner() that gets the CPU clock frequency + * set by the bootloader, so that we can determine the direction + * the first time cpr_set_performance_state() is called. + */ + drv->cpu_clk = devm_clk_get(dev, NULL); + if (IS_ERR(drv->cpu_clk)) { + ret = PTR_ERR(drv->cpu_clk); + if (ret != -EPROBE_DEFER) + dev_err(drv->dev, "could not get cpu clk: %d\n", ret); + goto unlock; + } + drv->attached_cpu_dev = dev; + + dev_dbg(drv->dev, "using cpu clk from: %s\n", + dev_name(drv->attached_cpu_dev)); + + /* + * Everything related to (virtual) corners has to be initialized + * here, when attaching to the power domain, since we need to know + * the maximum frequency for each fuse corner, and this is only + * available after the cpufreq driver has attached to us. + * The reason for this is that we need to know the highest + * frequency associated with each fuse corner. + */ + ret = dev_pm_opp_get_opp_count(&drv->pd.dev); + if (ret < 0) { + dev_err(drv->dev, "could not get OPP count\n"); + goto unlock; + } + drv->num_corners = ret; + + if (drv->num_corners < 2) { + dev_err(drv->dev, "need at least 2 OPPs to use CPR\n"); + ret = -EINVAL; + goto unlock; + } + + dev_dbg(drv->dev, "number of OPPs: %d\n", drv->num_corners); + + drv->corners = devm_kcalloc(drv->dev, drv->num_corners, + sizeof(*drv->corners), + GFP_KERNEL); + if (!drv->corners) { + ret = -ENOMEM; + goto unlock; + } + + ret = cpr_corner_init(drv); + if (ret) + goto unlock; + + cpr_set_loop_allowed(drv); + + ret = cpr_init_parameters(drv); + if (ret) + goto unlock; + + /* Configure CPR HW but keep it disabled */ + ret = cpr_config(drv); + if (ret) + goto unlock; + + ret = cpr_find_initial_corner(drv); + if (ret) + goto unlock; + + if (acc_desc->config) + regmap_multi_reg_write(drv->tcsr, acc_desc->config, + acc_desc->num_regs_per_fuse); + + /* Enable ACC if required */ + if (acc_desc->enable_mask) + regmap_update_bits(drv->tcsr, acc_desc->enable_reg, + acc_desc->enable_mask, + acc_desc->enable_mask); + +unlock: + mutex_unlock(&drv->lock); + + return ret; +} + +static int cpr_debug_info_show(struct seq_file *s, void *unused) +{ + u32 gcnt, ro_sel, ctl, irq_status, reg, error_steps; + u32 step_dn, step_up, error, error_lt0, busy; + struct cpr_drv *drv = s->private; + struct fuse_corner *fuse_corner; + struct corner *corner; + + corner = drv->corner; + fuse_corner = corner->fuse_corner; + + seq_printf(s, "corner, current_volt = %d uV\n", + corner->last_uV); + + ro_sel = fuse_corner->ring_osc_idx; + gcnt = cpr_read(drv, REG_RBCPR_GCNT_TARGET(ro_sel)); + seq_printf(s, "rbcpr_gcnt_target (%u) = %#02X\n", ro_sel, gcnt); + + ctl = cpr_read(drv, REG_RBCPR_CTL); + seq_printf(s, "rbcpr_ctl = %#02X\n", ctl); + + irq_status = cpr_read(drv, REG_RBIF_IRQ_STATUS); + seq_printf(s, "rbcpr_irq_status = %#02X\n", irq_status); + + reg = cpr_read(drv, REG_RBCPR_RESULT_0); + seq_printf(s, "rbcpr_result_0 = %#02X\n", reg); + + step_dn = reg & 0x01; + step_up = (reg >> RBCPR_RESULT0_STEP_UP_SHIFT) & 0x01; + seq_printf(s, " [step_dn = %u", step_dn); + + seq_printf(s, ", step_up = %u", step_up); + + error_steps = (reg >> RBCPR_RESULT0_ERROR_STEPS_SHIFT) + & RBCPR_RESULT0_ERROR_STEPS_MASK; + seq_printf(s, ", error_steps = %u", error_steps); + + error = (reg >> RBCPR_RESULT0_ERROR_SHIFT) & RBCPR_RESULT0_ERROR_MASK; + seq_printf(s, ", error = %u", error); + + error_lt0 = (reg >> RBCPR_RESULT0_ERROR_LT0_SHIFT) & 0x01; + seq_printf(s, ", error_lt_0 = %u", error_lt0); + + busy = (reg >> RBCPR_RESULT0_BUSY_SHIFT) & 0x01; + seq_printf(s, ", busy = %u]\n", busy); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(cpr_debug_info); + +static void cpr_debugfs_init(struct cpr_drv *drv) +{ + drv->debugfs = debugfs_create_dir("qcom_cpr", NULL); + + debugfs_create_file("debug_info", 0444, drv->debugfs, + drv, &cpr_debug_info_fops); +} + +static int cpr_probe(struct platform_device *pdev) +{ + struct resource *res; + struct device *dev = &pdev->dev; + struct cpr_drv *drv; + int irq, ret; + const struct cpr_acc_desc *data; + struct device_node *np; + u32 cpr_rev = FUSE_REVISION_UNKNOWN; + + data = of_device_get_match_data(dev); + if (!data || !data->cpr_desc || !data->acc_desc) + return -EINVAL; + + drv = devm_kzalloc(dev, sizeof(*drv), GFP_KERNEL); + if (!drv) + return -ENOMEM; + drv->dev = dev; + drv->desc = data->cpr_desc; + drv->acc_desc = data->acc_desc; + + drv->fuse_corners = devm_kcalloc(dev, drv->desc->num_fuse_corners, + sizeof(*drv->fuse_corners), + GFP_KERNEL); + if (!drv->fuse_corners) + return -ENOMEM; + + np = of_parse_phandle(dev->of_node, "acc-syscon", 0); + if (!np) + return -ENODEV; + + drv->tcsr = syscon_node_to_regmap(np); + of_node_put(np); + if (IS_ERR(drv->tcsr)) + return PTR_ERR(drv->tcsr); + + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + drv->base = devm_ioremap_resource(dev, res); + if (IS_ERR(drv->base)) + return PTR_ERR(drv->base); + + irq = platform_get_irq(pdev, 0); + if (irq < 0) + return -EINVAL; + + drv->vdd_apc = devm_regulator_get(dev, "vdd-apc"); + if (IS_ERR(drv->vdd_apc)) + return PTR_ERR(drv->vdd_apc); + + /* + * Initialize fuse corners, since it simply depends + * on data in efuses. + * Everything related to (virtual) corners has to be + * initialized after attaching to the power domain, + * since it depends on the CPU's OPP table. + */ + ret = cpr_read_efuse(dev, "cpr_fuse_revision", &cpr_rev); + if (ret) + return ret; + + drv->cpr_fuses = cpr_get_fuses(drv); + if (IS_ERR(drv->cpr_fuses)) + return PTR_ERR(drv->cpr_fuses); + + ret = cpr_populate_ring_osc_idx(drv); + if (ret) + return ret; + + ret = cpr_fuse_corner_init(drv); + if (ret) + return ret; + + mutex_init(&drv->lock); + + ret = devm_request_threaded_irq(dev, irq, NULL, + cpr_irq_handler, + IRQF_ONESHOT | IRQF_TRIGGER_RISING, + "cpr", drv); + if (ret) + return ret; + + drv->pd.name = devm_kstrdup_const(dev, dev->of_node->full_name, + GFP_KERNEL); + if (!drv->pd.name) + return -EINVAL; + + drv->pd.power_off = cpr_power_off; + drv->pd.power_on = cpr_power_on; + drv->pd.set_performance_state = cpr_set_performance_state; + drv->pd.opp_to_performance_state = cpr_get_performance_state; + drv->pd.attach_dev = cpr_pd_attach_dev; + + ret = pm_genpd_init(&drv->pd, NULL, true); + if (ret) + return ret; + + ret = of_genpd_add_provider_simple(dev->of_node, &drv->pd); + if (ret) + return ret; + + platform_set_drvdata(pdev, drv); + cpr_debugfs_init(drv); + + return 0; +} + +static int cpr_remove(struct platform_device *pdev) +{ + struct cpr_drv *drv = platform_get_drvdata(pdev); + + if (cpr_is_allowed(drv)) { + cpr_ctl_disable(drv); + cpr_irq_set(drv, 0); + } + + of_genpd_del_provider(pdev->dev.of_node); + pm_genpd_remove(&drv->pd); + + debugfs_remove_recursive(drv->debugfs); + + return 0; +} + +static const struct of_device_id cpr_match_table[] = { + { .compatible = "qcom,qcs404-cpr", .data = &qcs404_cpr_acc_desc }, + { } +}; +MODULE_DEVICE_TABLE(of, cpr_match_table); + +static struct platform_driver cpr_driver = { + .probe = cpr_probe, + .remove = cpr_remove, + .driver = { + .name = "qcom-cpr", + .of_match_table = cpr_match_table, + }, +}; +module_platform_driver(cpr_driver); + +MODULE_DESCRIPTION("Core Power Reduction (CPR) driver"); +MODULE_LICENSE("GPL v2"); diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c index 2e5b6a6834da..73257cf107d9 100644 --- a/drivers/powercap/intel_rapl_common.c +++ b/drivers/powercap/intel_rapl_common.c @@ -980,6 +980,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = { INTEL_CPU_FAM6(ICELAKE_D, rapl_defaults_hsw_server), INTEL_CPU_FAM6(COMETLAKE_L, rapl_defaults_core), INTEL_CPU_FAM6(COMETLAKE, rapl_defaults_core), + INTEL_CPU_FAM6(TIGERLAKE_L, rapl_defaults_core), INTEL_CPU_FAM6(ATOM_SILVERMONT, rapl_defaults_byt), INTEL_CPU_FAM6(ATOM_AIRMONT, rapl_defaults_cht), @@ -989,6 +990,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = { INTEL_CPU_FAM6(ATOM_GOLDMONT_PLUS, rapl_defaults_core), INTEL_CPU_FAM6(ATOM_GOLDMONT_D, rapl_defaults_core), INTEL_CPU_FAM6(ATOM_TREMONT_D, rapl_defaults_core), + INTEL_CPU_FAM6(ATOM_TREMONT_L, rapl_defaults_core), INTEL_CPU_FAM6(XEON_PHI_KNL, rapl_defaults_hsw_server), INTEL_CPU_FAM6(XEON_PHI_KNM, rapl_defaults_hsw_server), diff --git a/include/linux/acpi.h b/include/linux/acpi.h index 0f37a7d5fa77..0f24d701fbdc 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -279,6 +279,21 @@ static inline bool invalid_phys_cpuid(phys_cpuid_t phys_id) /* Validate the processor object's proc_id */ bool acpi_duplicate_processor_id(int proc_id); +/* Processor _CTS control */ +struct acpi_processor_power; + +#ifdef CONFIG_ACPI_PROCESSOR_CSTATE +bool acpi_processor_claim_cst_control(void); +int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu, + struct acpi_processor_power *info); +#else +static inline bool acpi_processor_claim_cst_control(void) { return false; } +static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu, + struct acpi_processor_power *info) +{ + return -ENODEV; +} +#endif #ifdef CONFIG_ACPI_HOTPLUG_CPU /* Arch dependent functions for cpu hotplug support */ diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 1dabe36bd011..ec2ef63771f0 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -77,6 +77,7 @@ struct cpuidle_state { #define CPUIDLE_FLAG_COUPLED BIT(1) /* state applies to multiple cpus */ #define CPUIDLE_FLAG_TIMER_STOP BIT(2) /* timer is stopped on this state */ #define CPUIDLE_FLAG_UNUSABLE BIT(3) /* avoid using this state */ +#define CPUIDLE_FLAG_OFF BIT(4) /* disable this state by default */ struct cpuidle_device_kobj; struct cpuidle_state_kobj; @@ -115,7 +116,6 @@ DECLARE_PER_CPU(struct cpuidle_device, cpuidle_dev); struct cpuidle_driver { const char *name; struct module *owner; - int refcnt; /* used by the cpuidle framework to setup the broadcast timer */ unsigned int bctimer:1; @@ -147,8 +147,6 @@ extern u64 cpuidle_poll_time(struct cpuidle_driver *drv, extern int cpuidle_register_driver(struct cpuidle_driver *drv); extern struct cpuidle_driver *cpuidle_get_driver(void); -extern struct cpuidle_driver *cpuidle_driver_ref(void); -extern void cpuidle_driver_unref(void); extern void cpuidle_driver_state_disabled(struct cpuidle_driver *drv, int idx, bool disable); extern void cpuidle_unregister_driver(struct cpuidle_driver *drv); @@ -186,8 +184,6 @@ static inline u64 cpuidle_poll_time(struct cpuidle_driver *drv, static inline int cpuidle_register_driver(struct cpuidle_driver *drv) {return -ENODEV; } static inline struct cpuidle_driver *cpuidle_get_driver(void) {return NULL; } -static inline struct cpuidle_driver *cpuidle_driver_ref(void) {return NULL; } -static inline void cpuidle_driver_unref(void) {} static inline void cpuidle_driver_state_disabled(struct cpuidle_driver *drv, int idx, bool disable) { } static inline void cpuidle_unregister_driver(struct cpuidle_driver *drv) { } diff --git a/include/linux/devfreq.h b/include/linux/devfreq.h index fb376b5b7281..c6f82d4bec9f 100644 --- a/include/linux/devfreq.h +++ b/include/linux/devfreq.h @@ -108,6 +108,20 @@ struct devfreq_dev_profile { }; /** + * struct devfreq_stats - Statistics of devfreq device behavior + * @total_trans: Number of devfreq transitions. + * @trans_table: Statistics of devfreq transitions. + * @time_in_state: Statistics of devfreq states. + * @last_update: The last time stats were updated. + */ +struct devfreq_stats { + unsigned int total_trans; + unsigned int *trans_table; + u64 *time_in_state; + u64 last_update; +}; + +/** * struct devfreq - Device devfreq structure * @node: list node - contains the devices with devfreq that have been * registered. @@ -122,6 +136,7 @@ struct devfreq_dev_profile { * devfreq.nb to the corresponding register notifier call chain. * @work: delayed work for load monitoring. * @previous_freq: previously configured frequency value. + * @last_status: devfreq user device info, performance statistics * @data: Private data of the governor. The devfreq framework does not * touch this. * @user_min_freq_req: PM QoS minimum frequency request from user (via sysfs) @@ -132,15 +147,12 @@ struct devfreq_dev_profile { * @suspend_freq: frequency of a device set during suspend phase. * @resume_freq: frequency of a device set in resume phase. * @suspend_count: suspend requests counter for a device. - * @total_trans: Number of devfreq transitions - * @trans_table: Statistics of devfreq transitions - * @time_in_state: Statistics of devfreq states - * @last_stat_updated: The last time stat updated + * @stats: Statistics of devfreq device behavior * @transition_notifier_list: list head of DEVFREQ_TRANSITION_NOTIFIER notifier * @nb_min: Notifier block for DEV_PM_QOS_MIN_FREQUENCY * @nb_max: Notifier block for DEV_PM_QOS_MAX_FREQUENCY * - * This structure stores the devfreq information for a give device. + * This structure stores the devfreq information for a given device. * * Note that when a governor accesses entries in struct devfreq in its * functions except for the context of callbacks defined in struct @@ -174,11 +186,8 @@ struct devfreq { unsigned long resume_freq; atomic_t suspend_count; - /* information for device frequency transition */ - unsigned int total_trans; - unsigned int *trans_table; - unsigned long *time_in_state; - unsigned long last_stat_updated; + /* information for device frequency transitions */ + struct devfreq_stats stats; struct srcu_notifier_head transition_notifier_list; diff --git a/include/linux/suspend.h b/include/linux/suspend.h index 6fc8843f1c9e..4a230c2f1c31 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -329,6 +329,7 @@ extern void arch_suspend_disable_irqs(void); extern void arch_suspend_enable_irqs(void); extern int pm_suspend(suspend_state_t state); +extern bool sync_on_suspend_enabled; #else /* !CONFIG_SUSPEND */ #define suspend_valid_only_mem NULL @@ -342,6 +343,7 @@ static inline bool pm_suspend_default_s2idle(void) { return false; } static inline void suspend_set_ops(const struct platform_suspend_ops *ops) {} static inline int pm_suspend(suspend_state_t state) { return -ENOSYS; } +static inline bool sync_on_suspend_enabled(void) { return true; } static inline bool idle_should_enter_s2idle(void) { return false; } static inline void __init pm_states_init(void) {} static inline void s2idle_set_ops(const struct platform_s2idle_ops *ops) {} diff --git a/include/trace/events/rpm.h b/include/trace/events/rpm.h index 26927a560eab..3c716214dab1 100644 --- a/include/trace/events/rpm.h +++ b/include/trace/events/rpm.h @@ -74,6 +74,12 @@ DEFINE_EVENT(rpm_internal, rpm_idle, TP_ARGS(dev, flags) ); +DEFINE_EVENT(rpm_internal, rpm_usage, + + TP_PROTO(struct device *dev, int flags), + + TP_ARGS(dev, flags) +); TRACE_EVENT(rpm_return_int, TP_PROTO(struct device *dev, unsigned long ip, int ret), diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig index d3667b4075c1..7cbfbeacd68a 100644 --- a/kernel/power/Kconfig +++ b/kernel/power/Kconfig @@ -27,7 +27,10 @@ config SUSPEND_SKIP_SYNC Skip the kernel sys_sync() before freezing user processes. Some systems prefer not to pay this cost on every invocation of suspend, or they are content with invoking sync() from - user-space before invoking suspend. Say Y if that's your case. + user-space before invoking suspend. There's a run-time switch + at '/sys/power/sync_on_suspend' to configure this behaviour. + This setting changes the default for the run-tim switch. Say Y + to change the default to disable the kernel sys_sync(). config HIBERNATE_CALLBACKS bool diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 3c0a5a8170b0..6dbeedb7354c 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -9,7 +9,7 @@ * Copyright (C) 2012 Bojan Smojver <bojan@rexursive.com> */ -#define pr_fmt(fmt) "PM: " fmt +#define pr_fmt(fmt) "PM: hibernation: " fmt #include <linux/export.h> #include <linux/suspend.h> @@ -106,7 +106,7 @@ EXPORT_SYMBOL(system_entering_hibernation); #ifdef CONFIG_PM_DEBUG static void hibernation_debug_sleep(void) { - pr_info("hibernation debug: Waiting for 5 seconds.\n"); + pr_info("debug: Waiting for 5 seconds.\n"); mdelay(5000); } @@ -277,7 +277,7 @@ static int create_image(int platform_mode) error = dpm_suspend_end(PMSG_FREEZE); if (error) { - pr_err("Some devices failed to power down, aborting hibernation\n"); + pr_err("Some devices failed to power down, aborting\n"); return error; } @@ -295,7 +295,7 @@ static int create_image(int platform_mode) error = syscore_suspend(); if (error) { - pr_err("Some system devices failed to power down, aborting hibernation\n"); + pr_err("Some system devices failed to power down, aborting\n"); goto Enable_irqs; } @@ -310,7 +310,7 @@ static int create_image(int platform_mode) restore_processor_state(); trace_suspend_resume(TPS("machine_suspend"), PM_EVENT_HIBERNATE, false); if (error) - pr_err("Error %d creating hibernation image\n", error); + pr_err("Error %d creating image\n", error); if (!in_suspend) { events_check_enabled = false; @@ -680,7 +680,7 @@ static int load_image_and_restore(void) if (!error) hibernation_restore(flags & SF_PLATFORM_MODE); - pr_err("Failed to load hibernation image, recovering.\n"); + pr_err("Failed to load image, recovering.\n"); swsusp_free(); free_basic_memory_bitmaps(); Unlock: @@ -743,7 +743,7 @@ int hibernate(void) else flags |= SF_CRC32_MODE; - pm_pr_dbg("Writing image.\n"); + pm_pr_dbg("Writing hibernation image.\n"); error = swsusp_write(flags); swsusp_free(); if (!error) { @@ -755,7 +755,7 @@ int hibernate(void) in_suspend = 0; pm_restore_gfp_mask(); } else { - pm_pr_dbg("Image restored successfully.\n"); + pm_pr_dbg("Hibernation image restored successfully.\n"); } Free_bitmaps: @@ -894,7 +894,7 @@ static int software_resume(void) goto Close_Finish; } - pm_pr_dbg("Preparing processes for restore.\n"); + pm_pr_dbg("Preparing processes for hibernation restore.\n"); error = freeze_processes(); if (error) goto Close_Finish; @@ -903,7 +903,7 @@ static int software_resume(void) Finish: __pm_notifier_call_chain(PM_POST_RESTORE, nr_calls, NULL); pm_restore_console(); - pr_info("resume from hibernation failed (%d)\n", error); + pr_info("resume failed (%d)\n", error); atomic_inc(&snapshot_device_available); /* For success case, the suspend path will release the lock */ Unlock: @@ -1068,7 +1068,8 @@ static ssize_t resume_store(struct kobject *kobj, struct kobj_attribute *attr, lock_system_sleep(); swsusp_resume_device = res; unlock_system_sleep(); - pm_pr_dbg("Configured resume from disk to %u\n", swsusp_resume_device); + pm_pr_dbg("Configured hibernation resume from disk to %u\n", + swsusp_resume_device); noresume = 0; software_resume(); return n; diff --git a/kernel/power/main.c b/kernel/power/main.c index e26de7af520b..69b7a8aeca3b 100644 --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -190,6 +190,38 @@ static ssize_t mem_sleep_store(struct kobject *kobj, struct kobj_attribute *attr } power_attr(mem_sleep); + +/* + * sync_on_suspend: invoke ksys_sync_helper() before suspend. + * + * show() returns whether ksys_sync_helper() is invoked before suspend. + * store() accepts 0 or 1. 0 disables ksys_sync_helper() and 1 enables it. + */ +bool sync_on_suspend_enabled = !IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC); + +static ssize_t sync_on_suspend_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + return sprintf(buf, "%d\n", sync_on_suspend_enabled); +} + +static ssize_t sync_on_suspend_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t n) +{ + unsigned long val; + + if (kstrtoul(buf, 10, &val)) + return -EINVAL; + + if (val > 1) + return -EINVAL; + + sync_on_suspend_enabled = !!val; + return n; +} + +power_attr(sync_on_suspend); #endif /* CONFIG_SUSPEND */ #ifdef CONFIG_PM_SLEEP_DEBUG @@ -855,6 +887,7 @@ static struct attribute * g[] = { &wakeup_count_attr.attr, #ifdef CONFIG_SUSPEND &mem_sleep_attr.attr, + &sync_on_suspend_attr.attr, #endif #ifdef CONFIG_PM_AUTOSLEEP &autosleep_attr.attr, diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index d65f2d5ab694..ddade80ad276 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -8,7 +8,7 @@ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl> */ -#define pr_fmt(fmt) "PM: " fmt +#define pr_fmt(fmt) "PM: hibernation: " fmt #include <linux/version.h> #include <linux/module.h> @@ -1566,9 +1566,7 @@ static unsigned long preallocate_image_highmem(unsigned long nr_pages) */ static unsigned long __fraction(u64 x, u64 multiplier, u64 base) { - x *= multiplier; - do_div(x, base); - return (unsigned long)x; + return div64_u64(x * multiplier, base); } static unsigned long preallocate_highmem_fraction(unsigned long nr_pages, @@ -1705,16 +1703,20 @@ int hibernate_preallocate_memory(void) ktime_t start, stop; int error; - pr_info("Preallocating image memory... "); + pr_info("Preallocating image memory\n"); start = ktime_get(); error = memory_bm_create(&orig_bm, GFP_IMAGE, PG_ANY); - if (error) + if (error) { + pr_err("Cannot allocate original bitmap\n"); goto err_out; + } error = memory_bm_create(©_bm, GFP_IMAGE, PG_ANY); - if (error) + if (error) { + pr_err("Cannot allocate copy bitmap\n"); goto err_out; + } alloc_normal = 0; alloc_highmem = 0; @@ -1804,8 +1806,11 @@ int hibernate_preallocate_memory(void) alloc -= pages; pages += pages_highmem; pages_highmem = preallocate_image_highmem(alloc); - if (pages_highmem < alloc) + if (pages_highmem < alloc) { + pr_err("Image allocation is %lu pages short\n", + alloc - pages_highmem); goto err_out; + } pages += pages_highmem; /* * size is the desired number of saveable pages to leave in @@ -1836,13 +1841,12 @@ int hibernate_preallocate_memory(void) out: stop = ktime_get(); - pr_cont("done (allocated %lu pages)\n", pages); + pr_info("Allocated %lu pages for snapshot\n", pages); swsusp_show_speed(start, stop, pages, "Allocated"); return 0; err_out: - pr_cont("\n"); swsusp_free(); return -ENOMEM; } @@ -1976,7 +1980,7 @@ asmlinkage __visible int swsusp_save(void) { unsigned int nr_pages, nr_highmem; - pr_info("Creating hibernation image:\n"); + pr_info("Creating image:\n"); drain_local_pages(NULL); nr_pages = count_data_pages(); @@ -2010,7 +2014,7 @@ asmlinkage __visible int swsusp_save(void) nr_copy_pages = nr_pages; nr_meta_pages = DIV_ROUND_UP(nr_pages * sizeof(long), PAGE_SIZE); - pr_info("Hibernation image created (%d pages copied)\n", nr_pages); + pr_info("Image created (%d pages copied)\n", nr_pages); return 0; } diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index f3b7239f1892..2c47280fbfc7 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -564,7 +564,7 @@ static int enter_state(suspend_state_t state) if (state == PM_SUSPEND_TO_IDLE) s2idle_begin(); - if (!IS_ENABLED(CONFIG_SUSPEND_SKIP_SYNC)) { + if (sync_on_suspend_enabled) { trace_suspend_resume(TPS("sync_filesystems"), 0, true); ksys_sync_helper(); trace_suspend_resume(TPS("sync_filesystems"), 0, false); diff --git a/kernel/power/suspend_test.c b/kernel/power/suspend_test.c index 60564b58de07..e1ed58adb69e 100644 --- a/kernel/power/suspend_test.c +++ b/kernel/power/suspend_test.c @@ -70,7 +70,7 @@ static void __init test_wakealarm(struct rtc_device *rtc, suspend_state_t state) static char info_test[] __initdata = KERN_INFO "PM: test RTC wakeup from '%s' suspend\n"; - unsigned long now; + time64_t now; struct rtc_wkalrm alm; int status; @@ -81,10 +81,10 @@ repeat: printk(err_readtime, dev_name(&rtc->dev), status); return; } - rtc_tm_to_time(&alm.time, &now); + now = rtc_tm_to_time64(&alm.time); memset(&alm, 0, sizeof alm); - rtc_time_to_tm(now + TEST_SUSPEND_SECONDS, &alm.time); + rtc_time64_to_tm(now + TEST_SUSPEND_SECONDS, &alm.time); alm.enabled = true; status = rtc_set_alarm(rtc, &alm); |