summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/stable/sysfs-block3
-rw-r--r--Documentation/ABI/testing/sysfs-bus-css15
-rw-r--r--Documentation/ABI/testing/sysfs-fs-erofs3
-rw-r--r--Documentation/RCU/NMI-RCU.rst4
-rw-r--r--Documentation/RCU/UP.rst13
-rw-r--r--Documentation/RCU/lockdep.rst13
-rw-r--r--Documentation/RCU/rcu.rst6
-rw-r--r--Documentation/RCU/rcu_dereference.rst21
-rw-r--r--Documentation/RCU/rcubarrier.rst357
-rw-r--r--Documentation/RCU/rculist_nulls.rst109
-rw-r--r--Documentation/RCU/stallwarn.rst135
-rw-r--r--Documentation/RCU/torture.rst89
-rw-r--r--Documentation/RCU/whatisRCU.rst193
-rw-r--r--Documentation/admin-guide/cgroup-v1/cgroups.rst2
-rw-r--r--Documentation/admin-guide/cgroup-v1/memory.rst284
-rw-r--r--Documentation/admin-guide/cgroup-v2.rst18
-rw-r--r--Documentation/admin-guide/hw-vuln/cross-thread-rsb.rst91
-rw-r--r--Documentation/admin-guide/hw-vuln/index.rst1
-rw-r--r--Documentation/admin-guide/hw-vuln/spectre.rst6
-rw-r--r--Documentation/admin-guide/index.rst1
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt36
-rw-r--r--Documentation/admin-guide/media/davinci-vpbe.rst65
-rw-r--r--Documentation/admin-guide/media/platform-cardlist.rst1
-rw-r--r--Documentation/admin-guide/media/v4l-drivers.rst1
-rw-r--r--Documentation/admin-guide/mm/zswap.rst4
-rw-r--r--Documentation/admin-guide/pm/amd-pstate.rst78
-rw-r--r--Documentation/admin-guide/thermal/index.rst8
-rw-r--r--Documentation/admin-guide/thermal/intel_powerclamp.rst (renamed from Documentation/driver-api/thermal/intel_powerclamp.rst)35
-rw-r--r--Documentation/arm/index.rst1
-rw-r--r--Documentation/arm/samsung-s3c24xx/cpufreq.rst77
-rw-r--r--Documentation/arm/samsung-s3c24xx/eb2410itx.rst59
-rw-r--r--Documentation/arm/samsung-s3c24xx/gpio.rst172
-rw-r--r--Documentation/arm/samsung-s3c24xx/h1940.rst41
-rw-r--r--Documentation/arm/samsung-s3c24xx/index.rst20
-rw-r--r--Documentation/arm/samsung-s3c24xx/nand.rst30
-rw-r--r--Documentation/arm/samsung-s3c24xx/overview.rst311
-rw-r--r--Documentation/arm/samsung-s3c24xx/s3c2412.rst121
-rw-r--r--Documentation/arm/samsung-s3c24xx/s3c2413.rst22
-rw-r--r--Documentation/arm/samsung-s3c24xx/smdk2440.rst57
-rw-r--r--Documentation/arm/samsung-s3c24xx/suspend.rst137
-rw-r--r--Documentation/arm/samsung-s3c24xx/usb-host.rst91
-rw-r--r--Documentation/arm/samsung/gpio.rst8
-rw-r--r--Documentation/arm/samsung/overview.rst13
-rw-r--r--Documentation/arm64/silicon-errata.rst2
-rw-r--r--Documentation/block/capability.rst10
-rw-r--r--Documentation/block/index.rst1
-rw-r--r--Documentation/block/ublk.rst55
-rw-r--r--Documentation/conf.py12
-rw-r--r--Documentation/devicetree/bindings/.gitignore5
-rw-r--r--Documentation/devicetree/bindings/arm/altera.yaml1
-rw-r--r--Documentation/devicetree/bindings/arm/amlogic.yaml3
-rw-r--r--Documentation/devicetree/bindings/arm/aspeed/aspeed.yaml2
-rw-r--r--Documentation/devicetree/bindings/arm/atmel-at91.yaml6
-rw-r--r--Documentation/devicetree/bindings/arm/fsl.yaml96
-rw-r--r--Documentation/devicetree/bindings/arm/mediatek.yaml4
-rw-r--r--Documentation/devicetree/bindings/arm/qcom.yaml79
-rw-r--r--Documentation/devicetree/bindings/arm/rockchip.yaml61
-rw-r--r--Documentation/devicetree/bindings/arm/rockchip/pmu.yaml2
-rw-r--r--Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml1
-rw-r--r--Documentation/devicetree/bindings/arm/ti/k3.yaml15
-rw-r--r--Documentation/devicetree/bindings/bus/nvidia,tegra210-aconnect.yaml4
-rw-r--r--Documentation/devicetree/bindings/clock/qcom,qdu1000-gcc.yaml51
-rw-r--r--Documentation/devicetree/bindings/clock/qcom,sm6350-camcc.yaml49
-rw-r--r--Documentation/devicetree/bindings/clock/qcom,sm8550-dispcc.yaml105
-rw-r--r--Documentation/devicetree/bindings/clock/qcom,sm8550-tcsr.yaml55
-rw-r--r--Documentation/devicetree/bindings/clock/samsung,s3c2410-clock.txt49
-rw-r--r--Documentation/devicetree/bindings/clock/samsung,s3c2412-clock.txt49
-rw-r--r--Documentation/devicetree/bindings/clock/samsung,s3c2443-clock.txt55
-rw-r--r--Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.yaml16
-rw-r--r--Documentation/devicetree/bindings/cpufreq/qcom-cpufreq-nvmem.yaml81
-rw-r--r--Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-aes.yaml2
-rw-r--r--Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-sha.yaml2
-rw-r--r--Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-tdes.yaml2
-rw-r--r--Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml4
-rw-r--r--Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml1
-rw-r--r--Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml1
-rw-r--r--Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml4
-rw-r--r--Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml2
-rw-r--r--Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml2
-rw-r--r--Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-display.yaml4
-rw-r--r--Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml4
-rw-r--r--Documentation/devicetree/bindings/interconnect/qcom,qdu1000-rpmh.yaml70
-rw-r--r--Documentation/devicetree/bindings/interconnect/qcom,rpm.yaml24
-rw-r--r--Documentation/devicetree/bindings/interconnect/qcom,sm8550-rpmh.yaml139
-rw-r--r--Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml2
-rw-r--r--Documentation/devicetree/bindings/interrupt-controller/samsung,s3c24xx-irq.txt53
-rw-r--r--Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml45
-rw-r--r--Documentation/devicetree/bindings/mfd/syscon.yaml6
-rw-r--r--Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt42
-rw-r--r--Documentation/devicetree/bindings/opp/opp-v2-kryo-cpu.yaml18
-rw-r--r--Documentation/devicetree/bindings/opp/opp-v2-qcom-level.yaml4
-rw-r--r--Documentation/devicetree/bindings/phy/amlogic,g12a-usb2-phy.yaml (renamed from Documentation/devicetree/bindings/phy/amlogic,meson-g12a-usb2-phy.yaml)8
-rw-r--r--Documentation/devicetree/bindings/phy/amlogic,g12a-usb3-pcie-phy.yaml (renamed from Documentation/devicetree/bindings/phy/amlogic,meson-g12a-usb3-pcie-phy.yaml)6
-rw-r--r--Documentation/devicetree/bindings/phy/qcom,usb-hs-28nm.yaml1
-rw-r--r--Documentation/devicetree/bindings/phy/rockchip,inno-usb2phy.yaml (renamed from Documentation/devicetree/bindings/phy/phy-rockchip-inno-usb2.yaml)2
-rw-r--r--Documentation/devicetree/bindings/phy/rockchip,rk3288-dp-phy.yaml41
-rw-r--r--Documentation/devicetree/bindings/phy/rockchip-dp-phy.txt26
-rw-r--r--Documentation/devicetree/bindings/power/fsl,imx-gpc.yaml19
-rw-r--r--Documentation/devicetree/bindings/regulator/samsung,s2mps14.yaml21
-rw-r--r--Documentation/devicetree/bindings/riscv/cpus.yaml2
-rw-r--r--Documentation/devicetree/bindings/riscv/microchip.yaml1
-rw-r--r--Documentation/devicetree/bindings/riscv/starfive.yaml6
-rw-r--r--Documentation/devicetree/bindings/riscv/sunxi.yaml69
-rw-r--r--Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml2
-rw-r--r--Documentation/devicetree/bindings/soc/imx/fsl,imx-iomuxc-gpr.yaml57
-rw-r--r--Documentation/devicetree/bindings/soc/imx/fsl,imx8mp-hsio-blk-ctrl.yaml4
-rw-r--r--Documentation/devicetree/bindings/soc/qcom/qcom,apr-services.yaml5
-rw-r--r--Documentation/devicetree/bindings/soc/renesas/renesas,rzv2m-pwc.yaml56
-rw-r--r--Documentation/devicetree/bindings/soc/rockchip/grf.yaml12
-rw-r--r--Documentation/devicetree/bindings/soc/samsung/exynos-pmu.yaml71
-rw-r--r--Documentation/devicetree/bindings/soc/samsung/samsung,exynos-sysreg.yaml87
-rw-r--r--[-rwxr-xr-x]Documentation/devicetree/bindings/sound/everest,es8326.yaml0
-rw-r--r--Documentation/devicetree/bindings/sound/mt8186-mt6366-rt1019-rt5682s.yaml1
-rw-r--r--Documentation/devicetree/bindings/sound/nvidia,tegra210-ahub.yaml4
-rw-r--r--Documentation/devicetree/bindings/sound/nvidia,tegra210-ope.yaml4
-rw-r--r--Documentation/devicetree/bindings/sound/qcom,lpass-tx-macro.yaml4
-rw-r--r--Documentation/devicetree/bindings/sound/qcom,lpass-wsa-macro.yaml58
-rw-r--r--Documentation/devicetree/bindings/spi/atmel,at91rm9200-spi.yaml2
-rw-r--r--Documentation/devicetree/bindings/spi/atmel,quadspi.yaml2
-rw-r--r--Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml4
-rw-r--r--Documentation/devicetree/bindings/thermal/mediatek,lvts-thermal.yaml142
-rw-r--r--Documentation/devicetree/bindings/thermal/qcom-spmi-adc-tm5.yaml1
-rw-r--r--Documentation/devicetree/bindings/thermal/qcom-tsens.yaml153
-rw-r--r--Documentation/devicetree/bindings/thermal/rcar-gen3-thermal.yaml3
-rw-r--r--Documentation/devicetree/bindings/timer/mediatek,mtk-timer.txt1
-rw-r--r--Documentation/devicetree/bindings/timer/riscv,timer.yaml52
-rw-r--r--Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml1
-rw-r--r--Documentation/devicetree/bindings/timer/sifive,clint.yaml8
-rw-r--r--Documentation/devicetree/bindings/vendor-prefixes.yaml20
-rw-r--r--Documentation/driver-api/media/drivers/davinci-vpbe-devel.rst39
-rw-r--r--Documentation/driver-api/media/drivers/index.rst1
-rw-r--r--Documentation/driver-api/thermal/index.rst1
-rw-r--r--Documentation/driver-api/thermal/intel_dptf.rst3
-rw-r--r--Documentation/features/seccomp/seccomp-filter/arch-support.txt2
-rw-r--r--Documentation/filesystems/erofs.rst2
-rw-r--r--Documentation/filesystems/fscrypt.rst4
-rw-r--r--Documentation/filesystems/fsverity.rst96
-rw-r--r--Documentation/filesystems/locking.rst24
-rw-r--r--Documentation/filesystems/vfs.rst24
-rw-r--r--Documentation/firmware-guide/acpi/acpi-lid.rst2
-rw-r--r--Documentation/firmware-guide/acpi/gpio-properties.rst35
-rw-r--r--Documentation/firmware-guide/acpi/namespace.rst2
-rw-r--r--Documentation/kbuild/makefiles.rst2
-rw-r--r--Documentation/kernel-hacking/locking.rst4
-rw-r--r--Documentation/networking/bridge.rst2
-rw-r--r--Documentation/networking/device_drivers/ethernet/intel/ice.rst2
-rw-r--r--Documentation/networking/device_drivers/ethernet/wangxun/txgbe.rst2
-rw-r--r--Documentation/networking/nf_conntrack-sysctl.rst10
-rw-r--r--Documentation/networking/rxrpc.rst4
-rw-r--r--Documentation/power/power_supply_class.rst4
-rw-r--r--Documentation/power/suspend-and-interrupts.rst2
-rw-r--r--Documentation/s390/pci.rst4
-rw-r--r--Documentation/s390/vfio-ccw.rst6
-rw-r--r--Documentation/scheduler/index.rst1
-rw-r--r--Documentation/scheduler/sched-util-clamp.rst741
-rw-r--r--Documentation/sphinx/load_config.py6
-rw-r--r--Documentation/translations/it_IT/kernel-hacking/locking.rst4
-rw-r--r--Documentation/userspace-api/ioctl/ioctl-number.rst1
-rw-r--r--Documentation/virt/kvm/api.rst32
-rw-r--r--Documentation/virt/kvm/locking.rst25
-rw-r--r--Documentation/x86/amd-memory-encryption.rst36
-rw-r--r--Documentation/x86/resctrl.rst147
162 files changed, 3953 insertions, 2325 deletions
diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index cd14ecb3c9a5..ac1e519272aa 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -432,7 +432,8 @@ Contact: linux-block@vger.kernel.org
Description:
[RW] This is the maximum number of kilobytes that the block
layer will allow for a filesystem request. Must be smaller than
- or equal to the maximum size allowed by the hardware.
+ or equal to the maximum size allowed by the hardware. Write 0
+ to use default kernel settings.
What: /sys/block/<disk>/queue/max_segment_size
diff --git a/Documentation/ABI/testing/sysfs-bus-css b/Documentation/ABI/testing/sysfs-bus-css
index 12a733fe357f..d4d5cfb63b90 100644
--- a/Documentation/ABI/testing/sysfs-bus-css
+++ b/Documentation/ABI/testing/sysfs-bus-css
@@ -1,22 +1,19 @@
What: /sys/bus/css/devices/.../type
Date: March 2008
-Contact: Cornelia Huck <cornelia.huck@de.ibm.com>
- linux-s390@vger.kernel.org
+Contact: linux-s390@vger.kernel.org
Description: Contains the subchannel type, as reported by the hardware.
This attribute is present for all subchannel types.
What: /sys/bus/css/devices/.../modalias
Date: March 2008
-Contact: Cornelia Huck <cornelia.huck@de.ibm.com>
- linux-s390@vger.kernel.org
+Contact: linux-s390@vger.kernel.org
Description: Contains the module alias as reported with uevents.
It is of the format css:t<type> and present for all
subchannel types.
What: /sys/bus/css/drivers/io_subchannel/.../chpids
Date: December 2002
-Contact: Cornelia Huck <cornelia.huck@de.ibm.com>
- linux-s390@vger.kernel.org
+Contact: linux-s390@vger.kernel.org
Description: Contains the ids of the channel paths used by this
subchannel, as reported by the channel subsystem
during subchannel recognition.
@@ -26,8 +23,7 @@ Users: s390-tools, HAL
What: /sys/bus/css/drivers/io_subchannel/.../pimpampom
Date: December 2002
-Contact: Cornelia Huck <cornelia.huck@de.ibm.com>
- linux-s390@vger.kernel.org
+Contact: linux-s390@vger.kernel.org
Description: Contains the PIM/PAM/POM values, as reported by the
channel subsystem when last queried by the common I/O
layer (this implies that this attribute is not necessarily
@@ -38,8 +34,7 @@ Users: s390-tools, HAL
What: /sys/bus/css/devices/.../driver_override
Date: June 2019
-Contact: Cornelia Huck <cohuck@redhat.com>
- linux-s390@vger.kernel.org
+Contact: linux-s390@vger.kernel.org
Description: This file allows the driver for a device to be specified. When
specified, only a driver with a name matching the value written
to driver_override will have an opportunity to bind to the
diff --git a/Documentation/ABI/testing/sysfs-fs-erofs b/Documentation/ABI/testing/sysfs-fs-erofs
index bb4681a01811..284224d1b56f 100644
--- a/Documentation/ABI/testing/sysfs-fs-erofs
+++ b/Documentation/ABI/testing/sysfs-fs-erofs
@@ -4,7 +4,8 @@ Contact: "Huang Jianan" <huangjianan@oppo.com>
Description: Shows all enabled kernel features.
Supported features:
zero_padding, compr_cfgs, big_pcluster, chunked_file,
- device_table, compr_head2, sb_chksum.
+ device_table, compr_head2, sb_chksum, ztailpacking,
+ dedupe, fragments.
What: /sys/fs/erofs/<disk>/sync_decompress
Date: November 2021
diff --git a/Documentation/RCU/NMI-RCU.rst b/Documentation/RCU/NMI-RCU.rst
index 2a92bc685ef1..dff60a80b386 100644
--- a/Documentation/RCU/NMI-RCU.rst
+++ b/Documentation/RCU/NMI-RCU.rst
@@ -8,7 +8,7 @@ Although RCU is usually used to protect read-mostly data structures,
it is possible to use RCU to provide dynamic non-maskable interrupt
handlers, as well as dynamic irq handlers. This document describes
how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
-work in "arch/x86/kernel/traps.c".
+work in an old version of "arch/x86/kernel/traps.c".
The relevant pieces of code are listed below, each followed by a
brief explanation::
@@ -116,7 +116,7 @@ Answer to Quick Quiz:
This same sad story can happen on other CPUs when using
a compiler with aggressive pointer-value speculation
- optimizations.
+ optimizations. (But please don't!)
More important, the rcu_dereference_sched() makes it
clear to someone reading the code that the pointer is
diff --git a/Documentation/RCU/UP.rst b/Documentation/RCU/UP.rst
index e26dda27430c..8b20fd45f255 100644
--- a/Documentation/RCU/UP.rst
+++ b/Documentation/RCU/UP.rst
@@ -38,7 +38,7 @@ by having call_rcu() directly invoke its arguments only if it was called
from process context. However, this can fail in a similar manner.
Suppose that an RCU-based algorithm again scans a linked list containing
-elements A, B, and C in process contexts, but that it invokes a function
+elements A, B, and C in process context, but that it invokes a function
on each element as it is scanned. Suppose further that this function
deletes element B from the list, then passes it to call_rcu() for deferred
freeing. This may be a bit unconventional, but it is perfectly legal
@@ -59,7 +59,8 @@ Example 3: Death by Deadlock
Suppose that call_rcu() is invoked while holding a lock, and that the
callback function must acquire this same lock. In this case, if
call_rcu() were to directly invoke the callback, the result would
-be self-deadlock.
+be self-deadlock *even if* this invocation occurred from a later
+call_rcu() invocation a full grace period later.
In some cases, it would possible to restructure to code so that
the call_rcu() is delayed until after the lock is released. However,
@@ -85,6 +86,14 @@ Quick Quiz #2:
:ref:`Answers to Quick Quiz <answer_quick_quiz_up>`
+It is important to note that userspace RCU implementations *do*
+permit call_rcu() to directly invoke callbacks, but only if a full
+grace period has elapsed since those callbacks were queued. This is
+the case because some userspace environments are extremely constrained.
+Nevertheless, people writing userspace RCU implementations are strongly
+encouraged to avoid invoking callbacks from call_rcu(), thus obtaining
+the deadlock-avoidance benefits called out above.
+
Summary
-------
diff --git a/Documentation/RCU/lockdep.rst b/Documentation/RCU/lockdep.rst
index 9308f1bdba05..2749f43ec1b0 100644
--- a/Documentation/RCU/lockdep.rst
+++ b/Documentation/RCU/lockdep.rst
@@ -69,9 +69,8 @@ checking of rcu_dereference() primitives:
value of the pointer itself, for example, against NULL.
The rcu_dereference_check() check expression can be any boolean
-expression, but would normally include a lockdep expression. However,
-any boolean expression can be used. For a moderately ornate example,
-consider the following::
+expression, but would normally include a lockdep expression. For a
+moderately ornate example, consider the following::
file = rcu_dereference_check(fdt->fd[fd],
lockdep_is_held(&files->file_lock) ||
@@ -97,10 +96,10 @@ code, it could instead be written as follows::
atomic_read(&files->count) == 1);
This would verify cases #2 and #3 above, and furthermore lockdep would
-complain if this was used in an RCU read-side critical section unless one
-of these two cases held. Because rcu_dereference_protected() omits all
-barriers and compiler constraints, it generates better code than do the
-other flavors of rcu_dereference(). On the other hand, it is illegal
+complain even if this was used in an RCU read-side critical section unless
+one of these two cases held. Because rcu_dereference_protected() omits
+all barriers and compiler constraints, it generates better code than do
+the other flavors of rcu_dereference(). On the other hand, it is illegal
to use rcu_dereference_protected() if either the RCU-protected pointer
or the RCU-protected data that it points to can change concurrently.
diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
index 3cfe01ba9a49..bf6617b330a7 100644
--- a/Documentation/RCU/rcu.rst
+++ b/Documentation/RCU/rcu.rst
@@ -77,15 +77,17 @@ Frequently Asked Questions
search for the string "Patent" in Documentation/RCU/RTFP.txt to find them.
Of these, one was allowed to lapse by the assignee, and the
others have been contributed to the Linux kernel under GPL.
+ Many (but not all) have long since expired.
There are now also LGPL implementations of user-level RCU
available (https://liburcu.org/).
- I hear that RCU needs work in order to support realtime kernels?
- Realtime-friendly RCU can be enabled via the CONFIG_PREEMPT_RCU
+ Realtime-friendly RCU are enabled via the CONFIG_PREEMPTION
kernel configuration parameter.
- Where can I find more information on RCU?
See the Documentation/RCU/RTFP.txt file.
- Or point your browser at (http://www.rdrop.com/users/paulmck/RCU/).
+ Or point your browser at (https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit)
+ or (https://docs.google.com/document/d/1GCdQC8SDbb54W1shjEXqGZ0Rq8a6kIeYutdSIajfpLA/edit?usp=sharing).
diff --git a/Documentation/RCU/rcu_dereference.rst b/Documentation/RCU/rcu_dereference.rst
index 81e828c8313b..3b739f6243c8 100644
--- a/Documentation/RCU/rcu_dereference.rst
+++ b/Documentation/RCU/rcu_dereference.rst
@@ -19,8 +19,9 @@ Follow these rules to keep your RCU code working properly:
can reload the value, and won't your code have fun with two
different values for a single pointer! Without rcu_dereference(),
DEC Alpha can load a pointer, dereference that pointer, and
- return data preceding initialization that preceded the store of
- the pointer.
+ return data preceding initialization that preceded the store
+ of the pointer. (As noted later, in recent kernels READ_ONCE()
+ also prevents DEC Alpha from playing these tricks.)
In addition, the volatile cast in rcu_dereference() prevents the
compiler from deducing the resulting pointer value. Please see
@@ -34,7 +35,7 @@ Follow these rules to keep your RCU code working properly:
takes on the role of the lockless_dereference() primitive that
was removed in v4.15.
-- You are only permitted to use rcu_dereference on pointer values.
+- You are only permitted to use rcu_dereference() on pointer values.
The compiler simply knows too much about integral values to
trust it to carry dependencies through integer operations.
There are a very few exceptions, namely that you can temporarily
@@ -240,6 +241,7 @@ precautions. To see this, consider the following code fragment::
struct foo *q;
int r1, r2;
+ rcu_read_lock();
p = rcu_dereference(gp2);
if (p == NULL)
return;
@@ -248,7 +250,10 @@ precautions. To see this, consider the following code fragment::
if (p == q) {
/* The compiler decides that q->c is same as p->c. */
r2 = p->c; /* Could get 44 on weakly order system. */
+ } else {
+ r2 = p->c - r1; /* Unconditional access to p->c. */
}
+ rcu_read_unlock();
do_something_with(r1, r2);
}
@@ -297,6 +302,7 @@ Then one approach is to use locking, for example, as follows::
struct foo *q;
int r1, r2;
+ rcu_read_lock();
p = rcu_dereference(gp2);
if (p == NULL)
return;
@@ -306,7 +312,12 @@ Then one approach is to use locking, for example, as follows::
if (p == q) {
/* The compiler decides that q->c is same as p->c. */
r2 = p->c; /* Locking guarantees r2 == 144. */
+ } else {
+ spin_lock(&q->lock);
+ r2 = q->c - r1;
+ spin_unlock(&q->lock);
}
+ rcu_read_unlock();
spin_unlock(&p->lock);
do_something_with(r1, r2);
}
@@ -364,7 +375,7 @@ the exact value of "p" even in the not-equals case. This allows the
compiler to make the return values independent of the load from "gp",
in turn destroying the ordering between this load and the loads of the
return values. This can result in "p->b" returning pre-initialization
-garbage values.
+garbage values on weakly ordered systems.
In short, rcu_dereference() is *not* optional when you are going to
dereference the resulting pointer.
@@ -430,7 +441,7 @@ member of the rcu_dereference() to use in various situations:
SPARSE CHECKING OF RCU-PROTECTED POINTERS
-----------------------------------------
-The sparse static-analysis tool checks for direct access to RCU-protected
+The sparse static-analysis tool checks for non-RCU access to RCU-protected
pointers, which can result in "interesting" bugs due to compiler
optimizations involving invented loads and perhaps also load tearing.
For example, suppose someone mistakenly does something like this::
diff --git a/Documentation/RCU/rcubarrier.rst b/Documentation/RCU/rcubarrier.rst
index 3b4a24877496..6da7f66da2a8 100644
--- a/Documentation/RCU/rcubarrier.rst
+++ b/Documentation/RCU/rcubarrier.rst
@@ -5,37 +5,12 @@ RCU and Unloadable Modules
[Originally published in LWN Jan. 14, 2007: http://lwn.net/Articles/217484/]
-RCU (read-copy update) is a synchronization mechanism that can be thought
-of as a replacement for read-writer locking (among other things), but with
-very low-overhead readers that are immune to deadlock, priority inversion,
-and unbounded latency. RCU read-side critical sections are delimited
-by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPTION
-kernels, generate no code whatsoever.
-
-This means that RCU writers are unaware of the presence of concurrent
-readers, so that RCU updates to shared data must be undertaken quite
-carefully, leaving an old version of the data structure in place until all
-pre-existing readers have finished. These old versions are needed because
-such readers might hold a reference to them. RCU updates can therefore be
-rather expensive, and RCU is thus best suited for read-mostly situations.
-
-How can an RCU writer possibly determine when all readers are finished,
-given that readers might well leave absolutely no trace of their
-presence? There is a synchronize_rcu() primitive that blocks until all
-pre-existing readers have completed. An updater wishing to delete an
-element p from a linked list might do the following, while holding an
-appropriate lock, of course::
-
- list_del_rcu(p);
- synchronize_rcu();
- kfree(p);
-
-But the above code cannot be used in IRQ context -- the call_rcu()
-primitive must be used instead. This primitive takes a pointer to an
-rcu_head struct placed within the RCU-protected data structure and
-another pointer to a function that may be invoked later to free that
-structure. Code to delete an element p from the linked list from IRQ
-context might then be as follows::
+RCU updaters sometimes use call_rcu() to initiate an asynchronous wait for
+a grace period to elapse. This primitive takes a pointer to an rcu_head
+struct placed within the RCU-protected data structure and another pointer
+to a function that may be invoked later to free that structure. Code to
+delete an element p from the linked list from IRQ context might then be
+as follows::
list_del_rcu(p);
call_rcu(&p->rcu, p_callback);
@@ -54,7 +29,7 @@ IRQ context. The function p_callback() might be defined as follows::
Unloading Modules That Use call_rcu()
-------------------------------------
-But what if p_callback is defined in an unloadable module?
+But what if the p_callback() function is defined in an unloadable module?
If we unload the module while some RCU callbacks are pending,
the CPUs executing these callbacks are going to be severely
@@ -67,20 +42,21 @@ grace period to elapse, it does not wait for the callbacks to complete.
One might be tempted to try several back-to-back synchronize_rcu()
calls, but this is still not guaranteed to work. If there is a very
-heavy RCU-callback load, then some of the callbacks might be deferred
-in order to allow other processing to proceed. Such deferral is required
-in realtime kernels in order to avoid excessive scheduling latencies.
+heavy RCU-callback load, then some of the callbacks might be deferred in
+order to allow other processing to proceed. For but one example, such
+deferral is required in realtime kernels in order to avoid excessive
+scheduling latencies.
rcu_barrier()
-------------
-We instead need the rcu_barrier() primitive. Rather than waiting for
-a grace period to elapse, rcu_barrier() waits for all outstanding RCU
-callbacks to complete. Please note that rcu_barrier() does **not** imply
-synchronize_rcu(), in particular, if there are no RCU callbacks queued
-anywhere, rcu_barrier() is within its rights to return immediately,
-without waiting for a grace period to elapse.
+This situation can be handled by the rcu_barrier() primitive. Rather
+than waiting for a grace period to elapse, rcu_barrier() waits for all
+outstanding RCU callbacks to complete. Please note that rcu_barrier()
+does **not** imply synchronize_rcu(), in particular, if there are no RCU
+callbacks queued anywhere, rcu_barrier() is within its rights to return
+immediately, without waiting for anything, let alone a grace period.
Pseudo-code using rcu_barrier() is as follows:
@@ -89,83 +65,86 @@ Pseudo-code using rcu_barrier() is as follows:
3. Allow the module to be unloaded.
There is also an srcu_barrier() function for SRCU, and you of course
-must match the flavor of rcu_barrier() with that of call_rcu(). If your
-module uses multiple flavors of call_rcu(), then it must also use multiple
-flavors of rcu_barrier() when unloading that module. For example, if
-it uses call_rcu(), call_srcu() on srcu_struct_1, and call_srcu() on
-srcu_struct_2, then the following three lines of code will be required
-when unloading::
-
- 1 rcu_barrier();
- 2 srcu_barrier(&srcu_struct_1);
- 3 srcu_barrier(&srcu_struct_2);
-
-The rcutorture module makes use of rcu_barrier() in its exit function
-as follows::
-
- 1 static void
- 2 rcu_torture_cleanup(void)
- 3 {
- 4 int i;
- 5
- 6 fullstop = 1;
- 7 if (shuffler_task != NULL) {
- 8 VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task");
- 9 kthread_stop(shuffler_task);
- 10 }
- 11 shuffler_task = NULL;
+must match the flavor of srcu_barrier() with that of call_srcu().
+If your module uses multiple srcu_struct structures, then it must also
+use multiple invocations of srcu_barrier() when unloading that module.
+For example, if it uses call_rcu(), call_srcu() on srcu_struct_1, and
+call_srcu() on srcu_struct_2, then the following three lines of code
+will be required when unloading::
+
+ 1 rcu_barrier();
+ 2 srcu_barrier(&srcu_struct_1);
+ 3 srcu_barrier(&srcu_struct_2);
+
+If latency is of the essence, workqueues could be used to run these
+three functions concurrently.
+
+An ancient version of the rcutorture module makes use of rcu_barrier()
+in its exit function as follows::
+
+ 1 static void
+ 2 rcu_torture_cleanup(void)
+ 3 {
+ 4 int i;
+ 5
+ 6 fullstop = 1;
+ 7 if (shuffler_task != NULL) {
+ 8 VERBOSE_PRINTK_STRING("Stopping rcu_torture_shuffle task");
+ 9 kthread_stop(shuffler_task);
+ 10 }
+ 11 shuffler_task = NULL;
12
- 13 if (writer_task != NULL) {
- 14 VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task");
- 15 kthread_stop(writer_task);
- 16 }
- 17 writer_task = NULL;
+ 13 if (writer_task != NULL) {
+ 14 VERBOSE_PRINTK_STRING("Stopping rcu_torture_writer task");
+ 15 kthread_stop(writer_task);
+ 16 }
+ 17 writer_task = NULL;
18
- 19 if (reader_tasks != NULL) {
- 20 for (i = 0; i < nrealreaders; i++) {
- 21 if (reader_tasks[i] != NULL) {
- 22 VERBOSE_PRINTK_STRING(
- 23 "Stopping rcu_torture_reader task");
- 24 kthread_stop(reader_tasks[i]);
- 25 }
- 26 reader_tasks[i] = NULL;
- 27 }
- 28 kfree(reader_tasks);
- 29 reader_tasks = NULL;
- 30 }
- 31 rcu_torture_current = NULL;
+ 19 if (reader_tasks != NULL) {
+ 20 for (i = 0; i < nrealreaders; i++) {
+ 21 if (reader_tasks[i] != NULL) {
+ 22 VERBOSE_PRINTK_STRING(
+ 23 "Stopping rcu_torture_reader task");
+ 24 kthread_stop(reader_tasks[i]);
+ 25 }
+ 26 reader_tasks[i] = NULL;
+ 27 }
+ 28 kfree(reader_tasks);
+ 29 reader_tasks = NULL;
+ 30 }
+ 31 rcu_torture_current = NULL;
32
- 33 if (fakewriter_tasks != NULL) {
- 34 for (i = 0; i < nfakewriters; i++) {
- 35 if (fakewriter_tasks[i] != NULL) {
- 36 VERBOSE_PRINTK_STRING(
- 37 "Stopping rcu_torture_fakewriter task");
- 38 kthread_stop(fakewriter_tasks[i]);
- 39 }
- 40 fakewriter_tasks[i] = NULL;
- 41 }
- 42 kfree(fakewriter_tasks);
- 43 fakewriter_tasks = NULL;
- 44 }
+ 33 if (fakewriter_tasks != NULL) {
+ 34 for (i = 0; i < nfakewriters; i++) {
+ 35 if (fakewriter_tasks[i] != NULL) {
+ 36 VERBOSE_PRINTK_STRING(
+ 37 "Stopping rcu_torture_fakewriter task");
+ 38 kthread_stop(fakewriter_tasks[i]);
+ 39 }
+ 40 fakewriter_tasks[i] = NULL;
+ 41 }
+ 42 kfree(fakewriter_tasks);
+ 43 fakewriter_tasks = NULL;
+ 44 }
45
- 46 if (stats_task != NULL) {
- 47 VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task");
- 48 kthread_stop(stats_task);
- 49 }
- 50 stats_task = NULL;
+ 46 if (stats_task != NULL) {
+ 47 VERBOSE_PRINTK_STRING("Stopping rcu_torture_stats task");
+ 48 kthread_stop(stats_task);
+ 49 }
+ 50 stats_task = NULL;
51
- 52 /* Wait for all RCU callbacks to fire. */
- 53 rcu_barrier();
+ 52 /* Wait for all RCU callbacks to fire. */
+ 53 rcu_barrier();
54
- 55 rcu_torture_stats_print(); /* -After- the stats thread is stopped! */
+ 55 rcu_torture_stats_print(); /* -After- the stats thread is stopped! */
56
- 57 if (cur_ops->cleanup != NULL)
- 58 cur_ops->cleanup();
- 59 if (atomic_read(&n_rcu_torture_error))
- 60 rcu_torture_print_module_parms("End of test: FAILURE");
- 61 else
- 62 rcu_torture_print_module_parms("End of test: SUCCESS");
- 63 }
+ 57 if (cur_ops->cleanup != NULL)
+ 58 cur_ops->cleanup();
+ 59 if (atomic_read(&n_rcu_torture_error))
+ 60 rcu_torture_print_module_parms("End of test: FAILURE");
+ 61 else
+ 62 rcu_torture_print_module_parms("End of test: SUCCESS");
+ 63 }
Line 6 sets a global variable that prevents any RCU callbacks from
re-posting themselves. This will not be necessary in most cases, since
@@ -190,16 +169,17 @@ Quick Quiz #1:
:ref:`Answer to Quick Quiz #1 <answer_rcubarrier_quiz_1>`
Your module might have additional complications. For example, if your
-module invokes call_rcu() from timers, you will need to first cancel all
-the timers, and only then invoke rcu_barrier() to wait for any remaining
+module invokes call_rcu() from timers, you will need to first refrain
+from posting new timers, cancel (or wait for) all the already-posted
+timers, and only then invoke rcu_barrier() to wait for any remaining
RCU callbacks to complete.
-Of course, if you module uses call_rcu(), you will need to invoke
+Of course, if your module uses call_rcu(), you will need to invoke
rcu_barrier() before unloading. Similarly, if your module uses
call_srcu(), you will need to invoke srcu_barrier() before unloading,
and on the same srcu_struct structure. If your module uses call_rcu()
-**and** call_srcu(), then you will need to invoke rcu_barrier() **and**
-srcu_barrier().
+**and** call_srcu(), then (as noted above) you will need to invoke
+rcu_barrier() **and** srcu_barrier().
Implementing rcu_barrier()
@@ -211,27 +191,40 @@ queues. His implementation queues an RCU callback on each of the per-CPU
callback queues, and then waits until they have all started executing, at
which point, all earlier RCU callbacks are guaranteed to have completed.
-The original code for rcu_barrier() was as follows::
-
- 1 void rcu_barrier(void)
- 2 {
- 3 BUG_ON(in_interrupt());
- 4 /* Take cpucontrol mutex to protect against CPU hotplug */
- 5 mutex_lock(&rcu_barrier_mutex);
- 6 init_completion(&rcu_barrier_completion);
- 7 atomic_set(&rcu_barrier_cpu_count, 0);
- 8 on_each_cpu(rcu_barrier_func, NULL, 0, 1);
- 9 wait_for_completion(&rcu_barrier_completion);
- 10 mutex_unlock(&rcu_barrier_mutex);
- 11 }
-
-Line 3 verifies that the caller is in process context, and lines 5 and 10
+The original code for rcu_barrier() was roughly as follows::
+
+ 1 void rcu_barrier(void)
+ 2 {
+ 3 BUG_ON(in_interrupt());
+ 4 /* Take cpucontrol mutex to protect against CPU hotplug */
+ 5 mutex_lock(&rcu_barrier_mutex);
+ 6 init_completion(&rcu_barrier_completion);
+ 7 atomic_set(&rcu_barrier_cpu_count, 1);
+ 8 on_each_cpu(rcu_barrier_func, NULL, 0, 1);
+ 9 if (atomic_dec_and_test(&rcu_barrier_cpu_count))
+ 10 complete(&rcu_barrier_completion);
+ 11 wait_for_completion(&rcu_barrier_completion);
+ 12 mutex_unlock(&rcu_barrier_mutex);
+ 13 }
+
+Line 3 verifies that the caller is in process context, and lines 5 and 12
use rcu_barrier_mutex to ensure that only one rcu_barrier() is using the
global completion and counters at a time, which are initialized on lines
6 and 7. Line 8 causes each CPU to invoke rcu_barrier_func(), which is
shown below. Note that the final "1" in on_each_cpu()'s argument list
ensures that all the calls to rcu_barrier_func() will have completed
-before on_each_cpu() returns. Line 9 then waits for the completion.
+before on_each_cpu() returns. Line 9 removes the initial count from
+rcu_barrier_cpu_count, and if this count is now zero, line 10 finalizes
+the completion, which prevents line 11 from blocking. Either way,
+line 11 then waits (if needed) for the completion.
+
+.. _rcubarrier_quiz_2:
+
+Quick Quiz #2:
+ Why doesn't line 8 initialize rcu_barrier_cpu_count to zero,
+ thereby avoiding the need for lines 9 and 10?
+
+:ref:`Answer to Quick Quiz #2 <answer_rcubarrier_quiz_2>`
This code was rewritten in 2008 and several times thereafter, but this
still gives the general idea.
@@ -239,21 +232,21 @@ still gives the general idea.
The rcu_barrier_func() runs on each CPU, where it invokes call_rcu()
to post an RCU callback, as follows::
- 1 static void rcu_barrier_func(void *notused)
- 2 {
- 3 int cpu = smp_processor_id();
- 4 struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
- 5 struct rcu_head *head;
- 6
- 7 head = &rdp->barrier;
- 8 atomic_inc(&rcu_barrier_cpu_count);
- 9 call_rcu(head, rcu_barrier_callback);
- 10 }
+ 1 static void rcu_barrier_func(void *notused)
+ 2 {
+ 3 int cpu = smp_processor_id();
+ 4 struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
+ 5 struct rcu_head *head;
+ 6
+ 7 head = &rdp->barrier;
+ 8 atomic_inc(&rcu_barrier_cpu_count);
+ 9 call_rcu(head, rcu_barrier_callback);
+ 10 }
Lines 3 and 4 locate RCU's internal per-CPU rcu_data structure,
which contains the struct rcu_head that needed for the later call to
call_rcu(). Line 7 picks up a pointer to this struct rcu_head, and line
-8 increments a global counter. This counter will later be decremented
+8 increments the global counter. This counter will later be decremented
by the callback. Line 9 then registers the rcu_barrier_callback() on
the current CPU's queue.
@@ -261,33 +254,34 @@ The rcu_barrier_callback() function simply atomically decrements the
rcu_barrier_cpu_count variable and finalizes the completion when it
reaches zero, as follows::
- 1 static void rcu_barrier_callback(struct rcu_head *notused)
- 2 {
- 3 if (atomic_dec_and_test(&rcu_barrier_cpu_count))
- 4 complete(&rcu_barrier_completion);
- 5 }
+ 1 static void rcu_barrier_callback(struct rcu_head *notused)
+ 2 {
+ 3 if (atomic_dec_and_test(&rcu_barrier_cpu_count))
+ 4 complete(&rcu_barrier_completion);
+ 5 }
-.. _rcubarrier_quiz_2:
+.. _rcubarrier_quiz_3:
-Quick Quiz #2:
+Quick Quiz #3:
What happens if CPU 0's rcu_barrier_func() executes
immediately (thus incrementing rcu_barrier_cpu_count to the
value one), but the other CPU's rcu_barrier_func() invocations
are delayed for a full grace period? Couldn't this result in
rcu_barrier() returning prematurely?
-:ref:`Answer to Quick Quiz #2 <answer_rcubarrier_quiz_2>`
+:ref:`Answer to Quick Quiz #3 <answer_rcubarrier_quiz_3>`
The current rcu_barrier() implementation is more complex, due to the need
to avoid disturbing idle CPUs (especially on battery-powered systems)
and the need to minimally disturb non-idle CPUs in real-time systems.
-However, the code above illustrates the concepts.
+In addition, a great many optimizations have been applied. However,
+the code above illustrates the concepts.
rcu_barrier() Summary
---------------------
-The rcu_barrier() primitive has seen relatively little use, since most
+The rcu_barrier() primitive is used relatively infrequently, since most
code using RCU is in the core kernel rather than in modules. However, if
you are using RCU from an unloadable module, you need to use rcu_barrier()
so that your module may be safely unloaded.
@@ -302,7 +296,8 @@ Quick Quiz #1:
Is there any other situation where rcu_barrier() might
be required?
-Answer: Interestingly enough, rcu_barrier() was not originally
+Answer:
+ Interestingly enough, rcu_barrier() was not originally
implemented for module unloading. Nikita Danilov was using
RCU in a filesystem, which resulted in a similar situation at
filesystem-unmount time. Dipankar Sarma coded up rcu_barrier()
@@ -318,13 +313,48 @@ Answer: Interestingly enough, rcu_barrier() was not originally
.. _answer_rcubarrier_quiz_2:
Quick Quiz #2:
+ Why doesn't line 8 initialize rcu_barrier_cpu_count to zero,
+ thereby avoiding the need for lines 9 and 10?
+
+Answer:
+ Suppose that the on_each_cpu() function shown on line 8 was
+ delayed, so that CPU 0's rcu_barrier_func() executed and
+ the corresponding grace period elapsed, all before CPU 1's
+ rcu_barrier_func() started executing. This would result in
+ rcu_barrier_cpu_count being decremented to zero, so that line
+ 11's wait_for_completion() would return immediately, failing to
+ wait for CPU 1's callbacks to be invoked.
+
+ Note that this was not a problem when the rcu_barrier() code
+ was first added back in 2005. This is because on_each_cpu()
+ disables preemption, which acted as an RCU read-side critical
+ section, thus preventing CPU 0's grace period from completing
+ until on_each_cpu() had dealt with all of the CPUs. However,
+ with the advent of preemptible RCU, rcu_barrier() no longer
+ waited on nonpreemptible regions of code in preemptible kernels,
+ that being the job of the new rcu_barrier_sched() function.
+
+ However, with the RCU flavor consolidation around v4.20, this
+ possibility was once again ruled out, because the consolidated
+ RCU once again waits on nonpreemptible regions of code.
+
+ Nevertheless, that extra count might still be a good idea.
+ Relying on these sort of accidents of implementation can result
+ in later surprise bugs when the implementation changes.
+
+:ref:`Back to Quick Quiz #2 <rcubarrier_quiz_2>`
+
+.. _answer_rcubarrier_quiz_3:
+
+Quick Quiz #3:
What happens if CPU 0's rcu_barrier_func() executes
immediately (thus incrementing rcu_barrier_cpu_count to the
value one), but the other CPU's rcu_barrier_func() invocations
are delayed for a full grace period? Couldn't this result in
rcu_barrier() returning prematurely?
-Answer: This cannot happen. The reason is that on_each_cpu() has its last
+Answer:
+ This cannot happen. The reason is that on_each_cpu() has its last
argument, the wait flag, set to "1". This flag is passed through
to smp_call_function() and further to smp_call_function_on_cpu(),
causing this latter to spin until the cross-CPU invocation of
@@ -336,18 +366,15 @@ Answer: This cannot happen. The reason is that on_each_cpu() has its last
Therefore, on_each_cpu() disables preemption across its call
to smp_call_function() and also across the local call to
- rcu_barrier_func(). This prevents the local CPU from context
- switching, again preventing grace periods from completing. This
+ rcu_barrier_func(). Because recent RCU implementations treat
+ preemption-disabled regions of code as RCU read-side critical
+ sections, this prevents grace periods from completing. This
means that all CPUs have executed rcu_barrier_func() before
the first rcu_barrier_callback() can possibly execute, in turn
preventing rcu_barrier_cpu_count from prematurely reaching zero.
- Currently, -rt implementations of RCU keep but a single global
- queue for RCU callbacks, and thus do not suffer from this
- problem. However, when the -rt RCU eventually does have per-CPU
- callback queues, things will have to change. One simple change
- is to add an rcu_read_lock() before line 8 of rcu_barrier()
- and an rcu_read_unlock() after line 8 of this same function. If
- you can think of a better change, please let me know!
+ But if on_each_cpu() ever decides to forgo disabling preemption,
+ as might well happen due to real-time latency considerations,
+ initializing rcu_barrier_cpu_count to one will save the day.
-:ref:`Back to Quick Quiz #2 <rcubarrier_quiz_2>`
+:ref:`Back to Quick Quiz #3 <rcubarrier_quiz_3>`
diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
index ca4692775ad4..9a734bf54b76 100644
--- a/Documentation/RCU/rculist_nulls.rst
+++ b/Documentation/RCU/rculist_nulls.rst
@@ -14,19 +14,19 @@ Using 'nulls'
=============
Using special makers (called 'nulls') is a convenient way
-to solve following problem :
+to solve following problem.
-A typical RCU linked list managing objects which are
-allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can
-use following algos :
+Without 'nulls', a typical RCU linked list managing objects which are
+allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can use the following
+algorithms:
-1) Lookup algo
---------------
+1) Lookup algorithm
+-------------------
::
- rcu_read_lock()
begin:
+ rcu_read_lock()
obj = lockless_lookup(key);
if (obj) {
if (!try_get_ref(obj)) // might fail for free objects
@@ -38,6 +38,7 @@ use following algos :
*/
if (obj->key != key) { // not the object we expected
put_ref(obj);
+ rcu_read_unlock();
goto begin;
}
}
@@ -52,9 +53,9 @@ but a version with an additional memory barrier (smp_rmb())
{
struct hlist_node *node, *next;
for (pos = rcu_dereference((head)->first);
- pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
- ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
- pos = rcu_dereference(next))
+ pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
+ ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
+ pos = rcu_dereference(next))
if (obj->key == key)
return obj;
return NULL;
@@ -64,9 +65,9 @@ And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb()::
struct hlist_node *node;
for (pos = rcu_dereference((head)->first);
- pos && ({ prefetch(pos->next); 1; }) &&
- ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
- pos = rcu_dereference(pos->next))
+ pos && ({ prefetch(pos->next); 1; }) &&
+ ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
+ pos = rcu_dereference(pos->next))
if (obj->key == key)
return obj;
return NULL;
@@ -82,36 +83,32 @@ Quoting Corey Minyard::
solved by pre-fetching the "next" field (with proper barriers) before
checking the key."
-2) Insert algo
---------------
+2) Insertion algorithm
+----------------------
We need to make sure a reader cannot read the new 'obj->obj_next' value
-and previous value of 'obj->key'. Or else, an item could be deleted
+and previous value of 'obj->key'. Otherwise, an item could be deleted
from a chain, and inserted into another chain. If new chain was empty
-before the move, 'next' pointer is NULL, and lockless reader can
-not detect it missed following items in original chain.
+before the move, 'next' pointer is NULL, and lockless reader can not
+detect the fact that it missed following items in original chain.
::
/*
- * Please note that new inserts are done at the head of list,
- * not in the middle or end.
- */
+ * Please note that new inserts are done at the head of list,
+ * not in the middle or end.
+ */
obj = kmem_cache_alloc(...);
lock_chain(); // typically a spin_lock()
obj->key = key;
- /*
- * we need to make sure obj->key is updated before obj->next
- * or obj->refcnt
- */
- smp_wmb();
- atomic_set(&obj->refcnt, 1);
+ atomic_set_release(&obj->refcnt, 1); // key before refcnt
hlist_add_head_rcu(&obj->obj_node, list);
unlock_chain(); // typically a spin_unlock()
-3) Remove algo
---------------
+3) Removal algorithm
+--------------------
+
Nothing special here, we can use a standard RCU hlist deletion.
But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused
very very fast (before the end of RCU grace period)
@@ -133,7 +130,7 @@ Avoiding extra smp_rmb()
========================
With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup()
-and extra smp_wmb() in insert function.
+and extra _release() in insert function.
For example, if we choose to store the slot number as the 'nulls'
end-of-list marker for each slot of the hash table, we can detect
@@ -142,59 +139,61 @@ to another chain) checking the final 'nulls' value if
the lookup met the end of chain. If final 'nulls' value
is not the slot number, then we must restart the lookup at
the beginning. If the object was moved to the same chain,
-then the reader doesn't care : It might eventually
+then the reader doesn't care: It might occasionally
scan the list again without harm.
-1) lookup algo
---------------
+1) lookup algorithm
+-------------------
::
head = &table[slot];
- rcu_read_lock();
begin:
+ rcu_read_lock();
hlist_nulls_for_each_entry_rcu(obj, node, head, member) {
if (obj->key == key) {
- if (!try_get_ref(obj)) // might fail for free objects
+ if (!try_get_ref(obj)) { // might fail for free objects
+ rcu_read_unlock();
goto begin;
+ }
if (obj->key != key) { // not the object we expected
put_ref(obj);
+ rcu_read_unlock();
goto begin;
}
- goto out;
+ goto out;
+ }
+ }
+
+ // If the nulls value we got at the end of this lookup is
+ // not the expected one, we must restart lookup.
+ // We probably met an item that was moved to another chain.
+ if (get_nulls_value(node) != slot) {
+ put_ref(obj);
+ rcu_read_unlock();
+ goto begin;
}
- /*
- * if the nulls value we got at the end of this lookup is
- * not the expected one, we must restart lookup.
- * We probably met an item that was moved to another chain.
- */
- if (get_nulls_value(node) != slot)
- goto begin;
obj = NULL;
out:
rcu_read_unlock();
-2) Insert function
-------------------
+2) Insert algorithm
+-------------------
::
/*
- * Please note that new inserts are done at the head of list,
- * not in the middle or end.
- */
+ * Please note that new inserts are done at the head of list,
+ * not in the middle or end.
+ */
obj = kmem_cache_alloc(cachep);
lock_chain(); // typically a spin_lock()
obj->key = key;
+ atomic_set_release(&obj->refcnt, 1); // key before refcnt
/*
- * changes to obj->key must be visible before refcnt one
- */
- smp_wmb();
- atomic_set(&obj->refcnt, 1);
- /*
- * insert obj in RCU way (readers might be traversing chain)
- */
+ * insert obj in RCU way (readers might be traversing chain)
+ */
hlist_nulls_add_head_rcu(&obj->obj_node, list);
unlock_chain(); // typically a spin_unlock()
diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst
index e38c587067fc..ca7b7cd806a1 100644
--- a/Documentation/RCU/stallwarn.rst
+++ b/Documentation/RCU/stallwarn.rst
@@ -25,10 +25,10 @@ warnings:
- A CPU looping with bottom halves disabled.
-- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel
- without invoking schedule(). If the looping in the kernel is
- really expected and desirable behavior, you might need to add
- some calls to cond_resched().
+- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the
+ kernel without potentially invoking schedule(). If the looping
+ in the kernel is really expected and desirable behavior, you
+ might need to add some calls to cond_resched().
- Booting Linux using a console connection that is too slow to
keep up with the boot-time console-message rate. For example,
@@ -108,16 +108,17 @@ warnings:
- A bug in the RCU implementation.
-- A hardware failure. This is quite unlikely, but has occurred
- at least once in real life. A CPU failed in a running system,
- becoming unresponsive, but not causing an immediate crash.
- This resulted in a series of RCU CPU stall warnings, eventually
- leading the realization that the CPU had failed.
+- A hardware failure. This is quite unlikely, but is not at all
+ uncommon in large datacenter. In one memorable case some decades
+ back, a CPU failed in a running system, becoming unresponsive,
+ but not causing an immediate crash. This resulted in a series
+ of RCU CPU stall warnings, eventually leading the realization
+ that the CPU had failed.
-The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning.
-Note that SRCU does *not* have CPU stall warnings. Please note that
-RCU only detects CPU stalls when there is a grace period in progress.
-No grace period, no CPU stall warnings.
+The RCU, RCU-sched, RCU-tasks, and RCU-tasks-trace implementations have
+CPU stall warning. Note that SRCU does *not* have CPU stall warnings.
+Please note that RCU only detects CPU stalls when there is a grace period
+in progress. No grace period, no CPU stall warnings.
To diagnose the cause of the stall, inspect the stack traces.
The offending function will usually be near the top of the stack.
@@ -205,16 +206,21 @@ RCU_STALL_RAT_DELAY
rcupdate.rcu_task_stall_timeout
-------------------------------
- This boot/sysfs parameter controls the RCU-tasks stall warning
- interval. A value of zero or less suppresses RCU-tasks stall
- warnings. A positive value sets the stall-warning interval
- in seconds. An RCU-tasks stall warning starts with the line:
+ This boot/sysfs parameter controls the RCU-tasks and
+ RCU-tasks-trace stall warning intervals. A value of zero or less
+ suppresses RCU-tasks stall warnings. A positive value sets the
+ stall-warning interval in seconds. An RCU-tasks stall warning
+ starts with the line:
INFO: rcu_tasks detected stalls on tasks:
And continues with the output of sched_show_task() for each
task stalling the current RCU-tasks grace period.
+ An RCU-tasks-trace stall warning starts (and continues) similarly:
+
+ INFO: rcu_tasks_trace detected stalls on tasks
+
Interpreting RCU's CPU Stall-Detector "Splats"
==============================================
@@ -248,7 +254,8 @@ dynticks counter, which will have an even-numbered value if the CPU
is in dyntick-idle mode and an odd-numbered value otherwise. The hex
number between the two "/"s is the value of the nesting, which will be
a small non-negative number if in the idle loop (as shown above) and a
-very large positive number otherwise.
+very large positive number otherwise. The number following the final
+"/" is the NMI nesting, which will be a small non-negative number.
The "softirq=" portion of the message tracks the number of RCU softirq
handlers that the stalled CPU has executed. The number before the "/"
@@ -383,3 +390,95 @@ for example, "P3421".
It is entirely possible to see stall warnings from normal and from
expedited grace periods at about the same time during the same run.
+
+RCU_CPU_STALL_CPUTIME
+=====================
+
+In kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y or booted with
+rcupdate.rcu_cpu_stall_cputime=1, the following additional information
+is supplied with each RCU CPU stall warning::
+
+ rcu: hardirqs softirqs csw/system
+ rcu: number: 624 45 0
+ rcu: cputime: 69 1 2425 ==> 2500(ms)
+
+These statistics are collected during the sampling period. The values
+in row "number:" are the number of hard interrupts, number of soft
+interrupts, and number of context switches on the stalled CPU. The
+first three values in row "cputime:" indicate the CPU time in
+milliseconds consumed by hard interrupts, soft interrupts, and tasks
+on the stalled CPU. The last number is the measurement interval, again
+in milliseconds. Because user-mode tasks normally do not cause RCU CPU
+stalls, these tasks are typically kernel tasks, which is why only the
+system CPU time are considered.
+
+The sampling period is shown as follows::
+
+ |<------------first timeout---------->|<-----second timeout----->|
+ |<--half timeout-->|<--half timeout-->| |
+ | |<--first period-->| |
+ | |<-----------second sampling period---------->|
+ | | | |
+ snapshot time point 1st-stall 2nd-stall
+
+The following describes four typical scenarios:
+
+1. A CPU looping with interrupts disabled.
+
+ ::
+
+ rcu: hardirqs softirqs csw/system
+ rcu: number: 0 0 0
+ rcu: cputime: 0 0 0 ==> 2500(ms)
+
+ Because interrupts have been disabled throughout the measurement
+ interval, there are no interrupts and no context switches.
+ Furthermore, because CPU time consumption was measured using interrupt
+ handlers, the system CPU consumption is misleadingly measured as zero.
+ This scenario will normally also have "(0 ticks this GP)" printed on
+ this CPU's summary line.
+
+2. A CPU looping with bottom halves disabled.
+
+ This is similar to the previous example, but with non-zero number of
+ and CPU time consumed by hard interrupts, along with non-zero CPU
+ time consumed by in-kernel execution::
+
+ rcu: hardirqs softirqs csw/system
+ rcu: number: 624 0 0
+ rcu: cputime: 49 0 2446 ==> 2500(ms)
+
+ The fact that there are zero softirqs gives a hint that these were
+ disabled, perhaps via local_bh_disable(). It is of course possible
+ that there were no softirqs, perhaps because all events that would
+ result in softirq execution are confined to other CPUs. In this case,
+ the diagnosis should continue as shown in the next example.
+
+3. A CPU looping with preemption disabled.
+
+ Here, only the number of context switches is zero::
+
+ rcu: hardirqs softirqs csw/system
+ rcu: number: 624 45 0
+ rcu: cputime: 69 1 2425 ==> 2500(ms)
+
+ This situation hints that the stalled CPU was looping with preemption
+ disabled.
+
+4. No looping, but massive hard and soft interrupts.
+
+ ::
+
+ rcu: hardirqs softirqs csw/system
+ rcu: number: xx xx 0
+ rcu: cputime: xx xx 0 ==> 2500(ms)
+
+ Here, the number and CPU time of hard interrupts are all non-zero,
+ but the number of context switches and the in-kernel CPU time consumed
+ are zero. The number and cputime of soft interrupts will usually be
+ non-zero, but could be zero, for example, if the CPU was spinning
+ within a single hard interrupt handler.
+
+ If this type of RCU CPU stall warning can be reproduced, you can
+ narrow it down by looking at /proc/interrupts or by writing code to
+ trace each interrupt, for example, by referring to show_interrupts().
diff --git a/Documentation/RCU/torture.rst b/Documentation/RCU/torture.rst
index a90147713062..0316ba0c6922 100644
--- a/Documentation/RCU/torture.rst
+++ b/Documentation/RCU/torture.rst
@@ -206,7 +206,11 @@ values for memory may require disabling the callback-flooding tests
using the --bootargs parameter discussed below.
Sometimes additional debugging is useful, and in such cases the --kconfig
-parameter to kvm.sh may be used, for example, ``--kconfig 'CONFIG_KASAN=y'``.
+parameter to kvm.sh may be used, for example, ``--kconfig 'CONFIG_RCU_EQS_DEBUG=y'``.
+In addition, there are the --gdb, --kasan, and --kcsan parameters.
+Note that --gdb limits you to one scenario per kvm.sh run and requires
+that you have another window open from which to run ``gdb`` as instructed
+by the script.
Kernel boot arguments can also be supplied, for example, to control
rcutorture's module parameters. For example, to test a change to RCU's
@@ -219,10 +223,17 @@ require disabling rcutorture's callback-flooding tests::
--bootargs 'rcutorture.fwd_progress=0'
Sometimes all that is needed is a full set of kernel builds. This is
-what the --buildonly argument does.
+what the --buildonly parameter does.
-Finally, the --trust-make argument allows each kernel build to reuse what
-it can from the previous kernel build.
+The --duration parameter can override the default run time of 30 minutes.
+For example, ``--duration 2d`` would run for two days, ``--duration 3h``
+would run for three hours, ``--duration 5m`` would run for five minutes,
+and ``--duration 45s`` would run for 45 seconds. This last can be useful
+for tracking down rare boot-time failures.
+
+Finally, the --trust-make parameter allows each kernel build to reuse what
+it can from the previous kernel build. Please note that without the
+--trust-make parameter, your tags files may be demolished.
There are additional more arcane arguments that are documented in the
source code of the kvm.sh script.
@@ -291,3 +302,73 @@ the following summary at the end of the run on a 12-CPU system::
TREE07 ------- 167347 GPs (30.9902/s) [rcu: g1079021 f0x0 ] n_max_cbs: 478732
CPU count limited from 16 to 12
TREE09 ------- 752238 GPs (139.303/s) [rcu: g13075057 f0x0 ] n_max_cbs: 99011
+
+
+Repeated Runs
+=============
+
+Suppose that you are chasing down a rare boot-time failure. Although you
+could use kvm.sh, doing so will rebuild the kernel on each run. If you
+need (say) 1,000 runs to have confidence that you have fixed the bug,
+these pointless rebuilds can become extremely annoying.
+
+This is why kvm-again.sh exists.
+
+Suppose that a previous kvm.sh run left its output in this directory::
+
+ tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28
+
+Then this run can be re-run without rebuilding as follow:
+
+ kvm-again.sh tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28
+
+A few of the original run's kvm.sh parameters may be overridden, perhaps
+most notably --duration and --bootargs. For example::
+
+ kvm-again.sh tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28 \
+ --duration 45s
+
+would re-run the previous test, but for only 45 seconds, thus facilitating
+tracking down the aforementioned rare boot-time failure.
+
+
+Distributed Runs
+================
+
+Although kvm.sh is quite useful, its testing is confined to a single
+system. It is not all that hard to use your favorite framework to cause
+(say) 5 instances of kvm.sh to run on your 5 systems, but this will very
+likely unnecessarily rebuild kernels. In addition, manually distributing
+the desired rcutorture scenarios across the available systems can be
+painstaking and error-prone.
+
+And this is why the kvm-remote.sh script exists.
+
+If you the following command works::
+
+ ssh system0 date
+
+and if it also works for system1, system2, system3, system4, and system5,
+and all of these systems have 64 CPUs, you can type::
+
+ kvm-remote.sh "system0 system1 system2 system3 system4 system5" \
+ --cpus 64 --duration 8h --configs "5*CFLIST"
+
+This will build each default scenario's kernel on the local system, then
+spread each of five instances of each scenario over the systems listed,
+running each scenario for eight hours. At the end of the runs, the
+results will be gathered, recorded, and printed. Most of the parameters
+that kvm.sh will accept can be passed to kvm-remote.sh, but the list of
+systems must come first.
+
+The kvm.sh ``--dryrun scenarios`` argument is useful for working out
+how many scenarios may be run in one batch across a group of systems.
+
+You can also re-run a previous remote run in a manner similar to kvm.sh:
+
+ kvm-remote.sh "system0 system1 system2 system3 system4 system5" \
+ tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28-remote \
+ --duration 24h
+
+In this case, most of the kvm-again.sh parmeters may be supplied following
+the pathname of the old run-results directory.
diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst
index 1c747ac3f2c8..2c5563a91998 100644
--- a/Documentation/RCU/whatisRCU.rst
+++ b/Documentation/RCU/whatisRCU.rst
@@ -16,18 +16,23 @@ to start learning about RCU:
| 6. The RCU API, 2019 Edition https://lwn.net/Articles/777036/
| 2019 Big API Table https://lwn.net/Articles/777165/
+For those preferring video:
+
+| 1. Unraveling RCU Mysteries: Fundamentals https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries
+| 2. Unraveling RCU Mysteries: Additional Use Cases https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries-additional-use-cases
+
What is RCU?
RCU is a synchronization mechanism that was added to the Linux kernel
during the 2.5 development effort that is optimized for read-mostly
-situations. Although RCU is actually quite simple once you understand it,
-getting there can sometimes be a challenge. Part of the problem is that
-most of the past descriptions of RCU have been written with the mistaken
-assumption that there is "one true way" to describe RCU. Instead,
-the experience has been that different people must take different paths
-to arrive at an understanding of RCU. This document provides several
-different paths, as follows:
+situations. Although RCU is actually quite simple, making effective use
+of it requires you to think differently about your code. Another part
+of the problem is the mistaken assumption that there is "one true way" to
+describe and to use RCU. Instead, the experience has been that different
+people must take different paths to arrive at an understanding of RCU,
+depending on their experiences and use cases. This document provides
+several different paths, as follows:
:ref:`1. RCU OVERVIEW <1_whatisRCU>`
@@ -157,34 +162,36 @@ rcu_read_lock()
^^^^^^^^^^^^^^^
void rcu_read_lock(void);
- Used by a reader to inform the reclaimer that the reader is
- entering an RCU read-side critical section. It is illegal
- to block while in an RCU read-side critical section, though
- kernels built with CONFIG_PREEMPT_RCU can preempt RCU
- read-side critical sections. Any RCU-protected data structure
- accessed during an RCU read-side critical section is guaranteed to
- remain unreclaimed for the full duration of that critical section.
- Reference counts may be used in conjunction with RCU to maintain
- longer-term references to data structures.
+ This temporal primitive is used by a reader to inform the
+ reclaimer that the reader is entering an RCU read-side critical
+ section. It is illegal to block while in an RCU read-side
+ critical section, though kernels built with CONFIG_PREEMPT_RCU
+ can preempt RCU read-side critical sections. Any RCU-protected
+ data structure accessed during an RCU read-side critical section
+ is guaranteed to remain unreclaimed for the full duration of that
+ critical section. Reference counts may be used in conjunction
+ with RCU to maintain longer-term references to data structures.
rcu_read_unlock()
^^^^^^^^^^^^^^^^^
void rcu_read_unlock(void);
- Used by a reader to inform the reclaimer that the reader is
- exiting an RCU read-side critical section. Note that RCU
- read-side critical sections may be nested and/or overlapping.
+ This temporal primitives is used by a reader to inform the
+ reclaimer that the reader is exiting an RCU read-side critical
+ section. Note that RCU read-side critical sections may be nested
+ and/or overlapping.
synchronize_rcu()
^^^^^^^^^^^^^^^^^
void synchronize_rcu(void);
- Marks the end of updater code and the beginning of reclaimer
- code. It does this by blocking until all pre-existing RCU
- read-side critical sections on all CPUs have completed.
- Note that synchronize_rcu() will **not** necessarily wait for
- any subsequent RCU read-side critical sections to complete.
- For example, consider the following sequence of events::
+ This temporal primitive marks the end of updater code and the
+ beginning of reclaimer code. It does this by blocking until
+ all pre-existing RCU read-side critical sections on all CPUs
+ have completed. Note that synchronize_rcu() will **not**
+ necessarily wait for any subsequent RCU read-side critical
+ sections to complete. For example, consider the following
+ sequence of events::
CPU 0 CPU 1 CPU 2
----------------- ------------------------- ---------------
@@ -211,13 +218,13 @@ synchronize_rcu()
to be useful in all but the most read-intensive situations,
synchronize_rcu()'s overhead must also be quite small.
- The call_rcu() API is a callback form of synchronize_rcu(),
- and is described in more detail in a later section. Instead of
- blocking, it registers a function and argument which are invoked
- after all ongoing RCU read-side critical sections have completed.
- This callback variant is particularly useful in situations where
- it is illegal to block or where update-side performance is
- critically important.
+ The call_rcu() API is an asynchronous callback form of
+ synchronize_rcu(), and is described in more detail in a later
+ section. Instead of blocking, it registers a function and
+ argument which are invoked after all ongoing RCU read-side
+ critical sections have completed. This callback variant is
+ particularly useful in situations where it is illegal to block
+ or where update-side performance is critically important.
However, the call_rcu() API should not be used lightly, as use
of the synchronize_rcu() API generally results in simpler code.
@@ -236,11 +243,13 @@ rcu_assign_pointer()
would be cool to be able to declare a function in this manner.
(Compiler experts will no doubt disagree.)
- The updater uses this function to assign a new value to an
+ The updater uses this spatial macro to assign a new value to an
RCU-protected pointer, in order to safely communicate the change
- in value from the updater to the reader. This macro does not
- evaluate to an rvalue, but it does execute any memory-barrier
- instructions required for a given CPU architecture.
+ in value from the updater to the reader. This is a spatial (as
+ opposed to temporal) macro. It does not evaluate to an rvalue,
+ but it does execute any memory-barrier instructions required
+ for a given CPU architecture. Its ordering properties are that
+ of a store-release operation.
Perhaps just as important, it serves to document (1) which
pointers are protected by RCU and (2) the point at which a
@@ -255,14 +264,15 @@ rcu_dereference()
Like rcu_assign_pointer(), rcu_dereference() must be implemented
as a macro.
- The reader uses rcu_dereference() to fetch an RCU-protected
- pointer, which returns a value that may then be safely
- dereferenced. Note that rcu_dereference() does not actually
- dereference the pointer, instead, it protects the pointer for
- later dereferencing. It also executes any needed memory-barrier
- instructions for a given CPU architecture. Currently, only Alpha
- needs memory barriers within rcu_dereference() -- on other CPUs,
- it compiles to nothing, not even a compiler directive.
+ The reader uses the spatial rcu_dereference() macro to fetch
+ an RCU-protected pointer, which returns a value that may
+ then be safely dereferenced. Note that rcu_dereference()
+ does not actually dereference the pointer, instead, it
+ protects the pointer for later dereferencing. It also
+ executes any needed memory-barrier instructions for a given
+ CPU architecture. Currently, only Alpha needs memory barriers
+ within rcu_dereference() -- on other CPUs, it compiles to a
+ volatile load.
Common coding practice uses rcu_dereference() to copy an
RCU-protected pointer to a local variable, then dereferences
@@ -355,12 +365,15 @@ reader, updater, and reclaimer.
synchronize_rcu() & call_rcu()
-The RCU infrastructure observes the time sequence of rcu_read_lock(),
+The RCU infrastructure observes the temporal sequence of rcu_read_lock(),
rcu_read_unlock(), synchronize_rcu(), and call_rcu() invocations in
order to determine when (1) synchronize_rcu() invocations may return
to their callers and (2) call_rcu() callbacks may be invoked. Efficient
implementations of the RCU infrastructure make heavy use of batching in
order to amortize their overhead over many uses of the corresponding APIs.
+The rcu_assign_pointer() and rcu_dereference() invocations communicate
+spatial changes via stores to and loads from the RCU-protected pointer in
+question.
There are at least three flavors of RCU usage in the Linux kernel. The diagram
above shows the most common one. On the updater side, the rcu_assign_pointer(),
@@ -392,7 +405,9 @@ b. RCU applied to networking data structures that may be subjected
c. RCU applied to scheduler and interrupt/NMI-handler tasks.
Again, most uses will be of (a). The (b) and (c) cases are important
-for specialized uses, but are relatively uncommon.
+for specialized uses, but are relatively uncommon. The SRCU, RCU-Tasks,
+RCU-Tasks-Rude, and RCU-Tasks-Trace have similar relationships among
+their assorted primitives.
.. _3_whatisRCU:
@@ -468,7 +483,7 @@ So, to sum up:
- Within an RCU read-side critical section, use rcu_dereference()
to dereference RCU-protected pointers.
-- Use some solid scheme (such as locks or semaphores) to
+- Use some solid design (such as locks or semaphores) to
keep concurrent updates from interfering with each other.
- Use rcu_assign_pointer() to update an RCU-protected pointer.
@@ -579,6 +594,14 @@ to avoid having to write your own callback::
kfree_rcu(old_fp, rcu);
+If the occasional sleep is permitted, the single-argument form may
+be used, omitting the rcu_head structure from struct foo.
+
+ kfree_rcu(old_fp);
+
+This variant of kfree_rcu() almost never blocks, but might do so by
+invoking synchronize_rcu() in response to memory-allocation failure.
+
Again, see checklist.rst for additional rules governing the use of RCU.
.. _5_whatisRCU:
@@ -596,7 +619,7 @@ lacking both functionality and performance. However, they are useful
in getting a feel for how RCU works. See kernel/rcu/update.c for a
production-quality implementation, and see:
- http://www.rdrop.com/users/paulmck/RCU
+ https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit
for papers describing the Linux kernel RCU implementation. The OLS'01
and OLS'02 papers are a good introduction, and the dissertation provides
@@ -929,6 +952,8 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be
initialized after each and every call to kmem_cache_alloc(), which renders
reference-free spinlock acquisition completely unsafe. Therefore, when
using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter.
+(Those willing to use a kmem_cache constructor may also use locking,
+including cache-friendly sequence locking.)
With traditional reference counting -- such as that implemented by the
kref library in Linux -- there is typically code that runs when the last
@@ -1047,6 +1072,30 @@ sched::
rcu_read_lock_sched_held
+RCU-Tasks::
+
+ Critical sections Grace period Barrier
+
+ N/A call_rcu_tasks rcu_barrier_tasks
+ synchronize_rcu_tasks
+
+
+RCU-Tasks-Rude::
+
+ Critical sections Grace period Barrier
+
+ N/A call_rcu_tasks_rude rcu_barrier_tasks_rude
+ synchronize_rcu_tasks_rude
+
+
+RCU-Tasks-Trace::
+
+ Critical sections Grace period Barrier
+
+ rcu_read_lock_trace call_rcu_tasks_trace rcu_barrier_tasks_trace
+ rcu_read_unlock_trace synchronize_rcu_tasks_trace
+
+
SRCU::
Critical sections Grace period Barrier
@@ -1087,35 +1136,43 @@ list can be helpful:
a. Will readers need to block? If so, you need SRCU.
-b. What about the -rt patchset? If readers would need to block
- in an non-rt kernel, you need SRCU. If readers would block
- in a -rt kernel, but not in a non-rt kernel, SRCU is not
- necessary. (The -rt patchset turns spinlocks into sleeplocks,
- hence this distinction.)
+b. Will readers need to block and are you doing tracing, for
+ example, ftrace or BPF? If so, you need RCU-tasks,
+ RCU-tasks-rude, and/or RCU-tasks-trace.
+
+c. What about the -rt patchset? If readers would need to block in
+ an non-rt kernel, you need SRCU. If readers would block when
+ acquiring spinlocks in a -rt kernel, but not in a non-rt kernel,
+ SRCU is not necessary. (The -rt patchset turns spinlocks into
+ sleeplocks, hence this distinction.)
-c. Do you need to treat NMI handlers, hardirq handlers,
+d. Do you need to treat NMI handlers, hardirq handlers,
and code segments with preemption disabled (whether
via preempt_disable(), local_irq_save(), local_bh_disable(),
or some other mechanism) as if they were explicit RCU readers?
- If so, RCU-sched is the only choice that will work for you.
-
-d. Do you need RCU grace periods to complete even in the face
- of softirq monopolization of one or more of the CPUs? For
- example, is your code subject to network-based denial-of-service
- attacks? If so, you should disable softirq across your readers,
- for example, by using rcu_read_lock_bh().
-
-e. Is your workload too update-intensive for normal use of
+ If so, RCU-sched readers are the only choice that will work
+ for you, but since about v4.20 you use can use the vanilla RCU
+ update primitives.
+
+e. Do you need RCU grace periods to complete even in the face of
+ softirq monopolization of one or more of the CPUs? For example,
+ is your code subject to network-based denial-of-service attacks?
+ If so, you should disable softirq across your readers, for
+ example, by using rcu_read_lock_bh(). Since about v4.20 you
+ use can use the vanilla RCU update primitives.
+
+f. Is your workload too update-intensive for normal use of
RCU, but inappropriate for other synchronization mechanisms?
If so, consider SLAB_TYPESAFE_BY_RCU (which was originally
named SLAB_DESTROY_BY_RCU). But please be careful!
-f. Do you need read-side critical sections that are respected
- even though they are in the middle of the idle loop, during
- user-mode execution, or on an offlined CPU? If so, SRCU is the
- only choice that will work for you.
+g. Do you need read-side critical sections that are respected even
+ on CPUs that are deep in the idle loop, during entry to or exit
+ from user-mode execution, or on an offlined CPU? If so, SRCU
+ and RCU Tasks Trace are the only choices that will work for you,
+ with SRCU being strongly preferred in almost all cases.
-g. Otherwise, use RCU.
+h. Otherwise, use RCU.
Of course, this all assumes that you have determined that RCU is in fact
the right tool for your job.
diff --git a/Documentation/admin-guide/cgroup-v1/cgroups.rst b/Documentation/admin-guide/cgroup-v1/cgroups.rst
index b0688011ed06..9343148ee993 100644
--- a/Documentation/admin-guide/cgroup-v1/cgroups.rst
+++ b/Documentation/admin-guide/cgroup-v1/cgroups.rst
@@ -80,6 +80,8 @@ access. For example, cpusets (see Documentation/admin-guide/cgroup-v1/cpusets.rs
you to associate a set of CPUs and a set of memory nodes with the
tasks in each cgroup.
+.. _cgroups-why-needed:
+
1.2 Why are cgroups needed ?
----------------------------
diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 60370f2c67b9..27d89495ac88 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -2,18 +2,18 @@
Memory Resource Controller
==========================
-NOTE:
+.. caution::
This document is hopelessly outdated and it asks for a complete
rewrite. It still contains a useful information so we are keeping it
here but make sure to check the current code if you need a deeper
understanding.
-NOTE:
+.. note::
The Memory Resource Controller has generically been referred to as the
memory controller in this document. Do not confuse memory controller
used here with the memory controller that is used in hardware.
-(For editors) In this document:
+.. hint::
When we mention a cgroup (cgroupfs's directory) with memory controller,
we call it "memory cgroup". When you see git-log and source code, you'll
see patch's title and function names tend to use "memcg".
@@ -23,7 +23,7 @@ Benefits and Purpose of the memory controller
=============================================
The memory controller isolates the memory behaviour of a group of tasks
-from the rest of the system. The article on LWN [12] mentions some probable
+from the rest of the system. The article on LWN [12]_ mentions some probable
uses of the memory controller. The memory controller can be used to
a. Isolate an application or a group of applications
@@ -55,7 +55,8 @@ Features:
- Root cgroup has no limit controls.
Kernel memory support is a work in progress, and the current version provides
- basically functionality. (See Section 2.7)
+ basically functionality. (See :ref:`section 2.7
+ <cgroup-v1-memory-kernel-extension>`)
Brief summary of control files.
@@ -107,16 +108,16 @@ Brief summary of control files.
==========
The memory controller has a long history. A request for comments for the memory
-controller was posted by Balbir Singh [1]. At the time the RFC was posted
+controller was posted by Balbir Singh [1]_. At the time the RFC was posted
there were several implementations for memory control. The goal of the
RFC was to build consensus and agreement for the minimal features required
-for memory control. The first RSS controller was posted by Balbir Singh[2]
-in Feb 2007. Pavel Emelianov [3][4][5] has since posted three versions of the
-RSS controller. At OLS, at the resource management BoF, everyone suggested
-that we handle both page cache and RSS together. Another request was raised
-to allow user space handling of OOM. The current memory controller is
+for memory control. The first RSS controller was posted by Balbir Singh [2]_
+in Feb 2007. Pavel Emelianov [3]_ [4]_ [5]_ has since posted three versions
+of the RSS controller. At OLS, at the resource management BoF, everyone
+suggested that we handle both page cache and RSS together. Another request was
+raised to allow user space handling of OOM. The current memory controller is
at version 6; it combines both mapped (RSS) and unmapped Page
-Cache Control [11].
+Cache Control [11]_.
2. Memory Control
=================
@@ -147,7 +148,8 @@ specific data structure (mem_cgroup) associated with it.
2.2. Accounting
---------------
-::
+.. code-block::
+ :caption: Figure 1: Hierarchy of Accounting
+--------------------+
| mem_cgroup |
@@ -167,7 +169,6 @@ specific data structure (mem_cgroup) associated with it.
| | | |
+---------------+ +---------------+
- (Figure 1: Hierarchy of Accounting)
Figure 1 shows the important aspects of the controller
@@ -221,8 +222,9 @@ behind this approach is that a cgroup that aggressively uses a shared
page will eventually get charged for it (once it is uncharged from
the cgroup that brought it in -- this will happen on memory pressure).
-But see section 8.2: when moving a task to another cgroup, its pages may
-be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.
+But see :ref:`section 8.2 <cgroup-v1-memory-movable-charges>` when moving a
+task to another cgroup, its pages may be recharged to the new cgroup, if
+move_charge_at_immigrate has been chosen.
2.4 Swap Extension
--------------------------------------
@@ -244,7 +246,8 @@ In this case, setting memsw.limit_in_bytes=3G will prevent bad use of swap.
By using the memsw limit, you can avoid system OOM which can be caused by swap
shortage.
-**why 'memory+swap' rather than swap**
+2.4.1 why 'memory+swap' rather than swap
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The global LRU(kswapd) can swap out arbitrary pages. Swap-out means
to move account from memory to swap...there is no change in usage of
@@ -252,7 +255,8 @@ memory+swap. In other words, when we want to limit the usage of swap without
affecting global LRU, memory+swap limit is better than just limiting swap from
an OS point of view.
-**What happens when a cgroup hits memory.memsw.limit_in_bytes**
+2.4.2. What happens when a cgroup hits memory.memsw.limit_in_bytes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When a cgroup hits memory.memsw.limit_in_bytes, it's useless to do swap-out
in this cgroup. Then, swap-out will not be done by cgroup routine and file
@@ -268,26 +272,26 @@ global VM. When a cgroup goes over its limit, we first try
to reclaim memory from the cgroup so as to make space for the new
pages that the cgroup has touched. If the reclaim is unsuccessful,
an OOM routine is invoked to select and kill the bulkiest task in the
-cgroup. (See 10. OOM Control below.)
+cgroup. (See :ref:`10. OOM Control <cgroup-v1-memory-oom-control>` below.)
The reclaim algorithm has not been modified for cgroups, except that
pages that are selected for reclaiming come from the per-cgroup LRU
list.
-NOTE:
- Reclaim does not work for the root cgroup, since we cannot set any
- limits on the root cgroup.
+.. note::
+ Reclaim does not work for the root cgroup, since we cannot set any
+ limits on the root cgroup.
-Note2:
- When panic_on_oom is set to "2", the whole system will panic.
+.. note::
+ When panic_on_oom is set to "2", the whole system will panic.
When oom event notifier is registered, event will be delivered.
-(See oom_control section)
+(See :ref:`oom_control <cgroup-v1-memory-oom-control>` section)
2.6 Locking
-----------
-Lock order is as follows:
+Lock order is as follows::
Page lock (PG_locked bit of page->flags)
mm->page_table_lock or split pte_lock
@@ -299,6 +303,8 @@ Per-node-per-memcgroup LRU (cgroup's private LRU) is guarded by
lruvec->lru_lock; PG_lru bit of page->flags is cleared before
isolating a page from its LRU under lruvec->lru_lock.
+.. _cgroup-v1-memory-kernel-extension:
+
2.7 Kernel Memory Extension
-----------------------------------------------
@@ -367,10 +373,10 @@ U != 0, K < U:
never greater than the total memory, and freely set U at the cost of his
QoS.
-WARNING:
- In the current implementation, memory reclaim will NOT be
- triggered for a cgroup when it hits K while staying below U, which makes
- this setup impractical.
+ .. warning::
+ In the current implementation, memory reclaim will NOT be triggered for
+ a cgroup when it hits K while staying below U, which makes this setup
+ impractical.
U != 0, K >= U:
Since kmem charges will also be fed to the user counter and reclaim will be
@@ -381,45 +387,41 @@ U != 0, K >= U:
3. User Interface
=================
-3.0. Configuration
-------------------
-
-a. Enable CONFIG_CGROUPS
-b. Enable CONFIG_MEMCG
-
-3.1. Prepare the cgroups (see cgroups.txt, Why are cgroups needed?)
--------------------------------------------------------------------
+To use the user interface:
-::
+1. Enable CONFIG_CGROUPS and CONFIG_MEMCG options
+2. Prepare the cgroups (see :ref:`Why are cgroups needed?
+ <cgroups-why-needed>` for the background information)::
# mount -t tmpfs none /sys/fs/cgroup
# mkdir /sys/fs/cgroup/memory
# mount -t cgroup none /sys/fs/cgroup/memory -o memory
-3.2. Make the new group and move bash into it::
+3. Make the new group and move bash into it::
# mkdir /sys/fs/cgroup/memory/0
# echo $$ > /sys/fs/cgroup/memory/0/tasks
-Since now we're in the 0 cgroup, we can alter the memory limit::
+4. Since now we're in the 0 cgroup, we can alter the memory limit::
# echo 4M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
-NOTE:
- We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
- mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes,
- Gibibytes.)
+ The limit can now be queried::
-NOTE:
- We can write "-1" to reset the ``*.limit_in_bytes(unlimited)``.
+ # cat /sys/fs/cgroup/memory/0/memory.limit_in_bytes
+ 4194304
-NOTE:
- We cannot set limits on the root cgroup any more.
+.. note::
+ We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
+ mega or gigabytes. (Here, Kilo, Mega, Giga are Kibibytes, Mebibytes,
+ Gibibytes.)
-::
+.. note::
+ We can write "-1" to reset the ``*.limit_in_bytes(unlimited)``.
+
+.. note::
+ We cannot set limits on the root cgroup any more.
- # cat /sys/fs/cgroup/memory/0/memory.limit_in_bytes
- 4194304
We can check the usage::
@@ -458,6 +460,8 @@ test because it has noise of shared objects/status.
But the above two are testing extreme situations.
Trying usual test under memory controller is always helpful.
+.. _cgroup-v1-memory-test-troubleshoot:
+
4.1 Troubleshooting
-------------------
@@ -470,8 +474,11 @@ terminated by the OOM killer. There are several causes for this:
A sync followed by echo 1 > /proc/sys/vm/drop_caches will help get rid of
some of the pages cached in the cgroup (page cache pages).
-To know what happens, disabling OOM_Kill as per "10. OOM Control" (below) and
-seeing what happens will be helpful.
+To know what happens, disabling OOM_Kill as per :ref:`"10. OOM Control"
+<cgroup-v1-memory-oom-control>` (below) and seeing what happens will be
+helpful.
+
+.. _cgroup-v1-memory-test-task-migration:
4.2 Task migration
------------------
@@ -482,15 +489,16 @@ remain charged to it, the charge is dropped when the page is freed or
reclaimed.
You can move charges of a task along with task migration.
-See 8. "Move charges at task migration"
+See :ref:`8. "Move charges at task migration" <cgroup-v1-memory-move-charges>`
4.3 Removing a cgroup
---------------------
-A cgroup can be removed by rmdir, but as discussed in sections 4.1 and 4.2, a
-cgroup might have some charge associated with it, even though all
-tasks have migrated away from it. (because we charge against pages, not
-against tasks.)
+A cgroup can be removed by rmdir, but as discussed in :ref:`sections 4.1
+<cgroup-v1-memory-test-troubleshoot>` and :ref:`4.2
+<cgroup-v1-memory-test-task-migration>`, a cgroup might have some charge
+associated with it, even though all tasks have migrated away from it. (because
+we charge against pages, not against tasks.)
We move the stats to parent, and no change on the charge except uncharging
from the child.
@@ -519,67 +527,66 @@ will be charged as a new owner of it.
5.2 stat file
-------------
-memory.stat file includes following statistics
-
-per-memory cgroup local status
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-=============== ===============================================================
-cache # of bytes of page cache memory.
-rss # of bytes of anonymous and swap cache memory (includes
- transparent hugepages).
-rss_huge # of bytes of anonymous transparent hugepages.
-mapped_file # of bytes of mapped file (includes tmpfs/shmem)
-pgpgin # of charging events to the memory cgroup. The charging
- event happens each time a page is accounted as either mapped
- anon page(RSS) or cache page(Page Cache) to the cgroup.
-pgpgout # of uncharging events to the memory cgroup. The uncharging
- event happens each time a page is unaccounted from the cgroup.
-swap # of bytes of swap usage
-dirty # of bytes that are waiting to get written back to the disk.
-writeback # of bytes of file/anon cache that are queued for syncing to
- disk.
-inactive_anon # of bytes of anonymous and swap cache memory on inactive
- LRU list.
-active_anon # of bytes of anonymous and swap cache memory on active
- LRU list.
-inactive_file # of bytes of file-backed memory and MADV_FREE anonymous memory(
- LazyFree pages) on inactive LRU list.
-active_file # of bytes of file-backed memory on active LRU list.
-unevictable # of bytes of memory that cannot be reclaimed (mlocked etc).
-=============== ===============================================================
-
-status considering hierarchy (see memory.use_hierarchy settings)
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-========================= ===================================================
-hierarchical_memory_limit # of bytes of memory limit with regard to hierarchy
- under which the memory cgroup is
-hierarchical_memsw_limit # of bytes of memory+swap limit with regard to
- hierarchy under which memory cgroup is.
-
-total_<counter> # hierarchical version of <counter>, which in
- addition to the cgroup's own value includes the
- sum of all hierarchical children's values of
- <counter>, i.e. total_cache
-========================= ===================================================
-
-The following additional stats are dependent on CONFIG_DEBUG_VM
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-========================= ========================================
-recent_rotated_anon VM internal parameter. (see mm/vmscan.c)
-recent_rotated_file VM internal parameter. (see mm/vmscan.c)
-recent_scanned_anon VM internal parameter. (see mm/vmscan.c)
-recent_scanned_file VM internal parameter. (see mm/vmscan.c)
-========================= ========================================
-
-Memo:
+memory.stat file includes following statistics:
+
+ * per-memory cgroup local status
+
+ =============== ===============================================================
+ cache # of bytes of page cache memory.
+ rss # of bytes of anonymous and swap cache memory (includes
+ transparent hugepages).
+ rss_huge # of bytes of anonymous transparent hugepages.
+ mapped_file # of bytes of mapped file (includes tmpfs/shmem)
+ pgpgin # of charging events to the memory cgroup. The charging
+ event happens each time a page is accounted as either mapped
+ anon page(RSS) or cache page(Page Cache) to the cgroup.
+ pgpgout # of uncharging events to the memory cgroup. The uncharging
+ event happens each time a page is unaccounted from the
+ cgroup.
+ swap # of bytes of swap usage
+ dirty # of bytes that are waiting to get written back to the disk.
+ writeback # of bytes of file/anon cache that are queued for syncing to
+ disk.
+ inactive_anon # of bytes of anonymous and swap cache memory on inactive
+ LRU list.
+ active_anon # of bytes of anonymous and swap cache memory on active
+ LRU list.
+ inactive_file # of bytes of file-backed memory and MADV_FREE anonymous
+ memory (LazyFree pages) on inactive LRU list.
+ active_file # of bytes of file-backed memory on active LRU list.
+ unevictable # of bytes of memory that cannot be reclaimed (mlocked etc).
+ =============== ===============================================================
+
+ * status considering hierarchy (see memory.use_hierarchy settings):
+
+ ========================= ===================================================
+ hierarchical_memory_limit # of bytes of memory limit with regard to
+ hierarchy
+ under which the memory cgroup is
+ hierarchical_memsw_limit # of bytes of memory+swap limit with regard to
+ hierarchy under which memory cgroup is.
+
+ total_<counter> # hierarchical version of <counter>, which in
+ addition to the cgroup's own value includes the
+ sum of all hierarchical children's values of
+ <counter>, i.e. total_cache
+ ========================= ===================================================
+
+ * additional vm parameters (depends on CONFIG_DEBUG_VM):
+
+ ========================= ========================================
+ recent_rotated_anon VM internal parameter. (see mm/vmscan.c)
+ recent_rotated_file VM internal parameter. (see mm/vmscan.c)
+ recent_scanned_anon VM internal parameter. (see mm/vmscan.c)
+ recent_scanned_file VM internal parameter. (see mm/vmscan.c)
+ ========================= ========================================
+
+.. hint::
recent_rotated means recent frequency of LRU rotation.
recent_scanned means recent # of scans to LRU.
showing for better debug please see the code for meanings.
-Note:
+.. note::
Only anonymous and swap cache memory is listed as part of 'rss' stat.
This should not be confused with the true 'resident set size' or the
amount of physical memory used by the cgroup.
@@ -710,13 +717,16 @@ If we want to change this to 1G, we can at any time use::
# echo 1G > memory.soft_limit_in_bytes
-NOTE1:
+.. note::
Soft limits take effect over a long period of time, since they involve
reclaiming memory for balancing between memory cgroups
-NOTE2:
+
+.. note::
It is recommended to set the soft limit always below the hard limit,
otherwise the hard limit will take precedence.
+.. _cgroup-v1-memory-move-charges:
+
8. Move charges at task migration
=================================
@@ -735,23 +745,29 @@ If you want to enable it::
# echo (some positive value) > memory.move_charge_at_immigrate
-Note:
+.. note::
Each bits of move_charge_at_immigrate has its own meaning about what type
- of charges should be moved. See 8.2 for details.
-Note:
+ of charges should be moved. See :ref:`section 8.2
+ <cgroup-v1-memory-movable-charges>` for details.
+
+.. note::
Charges are moved only when you move mm->owner, in other words,
a leader of a thread group.
-Note:
+
+.. note::
If we cannot find enough space for the task in the destination cgroup, we
try to make space by reclaiming memory. Task migration may fail if we
cannot make enough space.
-Note:
+
+.. note::
It can take several seconds if you move charges much.
And if you want disable it again::
# echo 0 > memory.move_charge_at_immigrate
+.. _cgroup-v1-memory-movable-charges:
+
8.2 Type of charges which can be moved
--------------------------------------
@@ -801,6 +817,8 @@ threshold in any direction.
It's applicable for root and non-root cgroup.
+.. _cgroup-v1-memory-oom-control:
+
10. OOM Control
===============
@@ -956,15 +974,16 @@ commented and discussed quite extensively in the community.
References
==========
-1. Singh, Balbir. RFC: Memory Controller, http://lwn.net/Articles/206697/
-2. Singh, Balbir. Memory Controller (RSS Control),
+.. [1] Singh, Balbir. RFC: Memory Controller, http://lwn.net/Articles/206697/
+.. [2] Singh, Balbir. Memory Controller (RSS Control),
http://lwn.net/Articles/222762/
-3. Emelianov, Pavel. Resource controllers based on process cgroups
+.. [3] Emelianov, Pavel. Resource controllers based on process cgroups
https://lore.kernel.org/r/45ED7DEC.7010403@sw.ru
-4. Emelianov, Pavel. RSS controller based on process cgroups (v2)
+.. [4] Emelianov, Pavel. RSS controller based on process cgroups (v2)
https://lore.kernel.org/r/461A3010.90403@sw.ru
-5. Emelianov, Pavel. RSS controller based on process cgroups (v3)
+.. [5] Emelianov, Pavel. RSS controller based on process cgroups (v3)
https://lore.kernel.org/r/465D9739.8070209@openvz.org
+
6. Menage, Paul. Control Groups v10, http://lwn.net/Articles/236032/
7. Vaidyanathan, Srinivasan, Control Groups: Pagecache accounting and control
subsystem (v3), http://lwn.net/Articles/235534/
@@ -974,7 +993,8 @@ References
https://lore.kernel.org/r/464D267A.50107@linux.vnet.ibm.com
10. Singh, Balbir. Memory controller v6 test results,
https://lore.kernel.org/r/20070819094658.654.84837.sendpatchset@balbir-laptop
-11. Singh, Balbir. Memory controller introduction (v6),
- https://lore.kernel.org/r/20070817084228.26003.12568.sendpatchset@balbir-laptop
-12. Corbet, Jonathan, Controlling memory use in cgroups,
- http://lwn.net/Articles/243795/
+
+.. [11] Singh, Balbir. Memory controller introduction (v6),
+ https://lore.kernel.org/r/20070817084228.26003.12568.sendpatchset@balbir-laptop
+.. [12] Corbet, Jonathan, Controlling memory use in cgroups,
+ http://lwn.net/Articles/243795/
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index c8ae7c897f14..5db4c4dd5bb4 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -619,6 +619,8 @@ process migrations.
and is an example of this type.
+.. _cgroupv2-limits-distributor:
+
Limits
------
@@ -635,6 +637,7 @@ process migrations.
"io.max" limits the maximum BPS and/or IOPS that a cgroup can consume
on an IO device and is an example of this type.
+.. _cgroupv2-protections-distributor:
Protections
-----------
@@ -1245,13 +1248,17 @@ PAGE_SIZE multiple when read back.
This is a simple interface to trigger memory reclaim in the
target cgroup.
- This file accepts a string which contains the number of bytes to
- reclaim.
+ This file accepts a single key, the number of bytes to reclaim.
+ No nested keys are currently supported.
Example::
echo "1G" > memory.reclaim
+ The interface can be later extended with nested keys to
+ configure the reclaim behavior. For example, specify the
+ type of memory to reclaim from (anon, file, ..).
+
Please note that the kernel can over or under reclaim from
the target cgroup. If less bytes are reclaimed than the
specified amount, -EAGAIN is returned.
@@ -1263,13 +1270,6 @@ PAGE_SIZE multiple when read back.
This means that the networking layer will not adapt based on
reclaim induced by memory.reclaim.
- This file also allows the user to specify the nodes to reclaim from,
- via the 'nodes=' key, for example::
-
- echo "1G nodes=0,1" > memory.reclaim
-
- The above instructs the kernel to reclaim memory from nodes 0,1.
-
memory.peak
A read-only single value file which exists on non-root
cgroups.
diff --git a/Documentation/admin-guide/hw-vuln/cross-thread-rsb.rst b/Documentation/admin-guide/hw-vuln/cross-thread-rsb.rst
new file mode 100644
index 000000000000..875616d675fe
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/cross-thread-rsb.rst
@@ -0,0 +1,91 @@
+
+.. SPDX-License-Identifier: GPL-2.0
+
+Cross-Thread Return Address Predictions
+=======================================
+
+Certain AMD and Hygon processors are subject to a cross-thread return address
+predictions vulnerability. When running in SMT mode and one sibling thread
+transitions out of C0 state, the other sibling thread could use return target
+predictions from the sibling thread that transitioned out of C0.
+
+The Spectre v2 mitigations protect the Linux kernel, as it fills the return
+address prediction entries with safe targets when context switching to the idle
+thread. However, KVM does allow a VMM to prevent exiting guest mode when
+transitioning out of C0. This could result in a guest-controlled return target
+being consumed by the sibling thread.
+
+Affected processors
+-------------------
+
+The following CPUs are vulnerable:
+
+ - AMD Family 17h processors
+ - Hygon Family 18h processors
+
+Related CVEs
+------------
+
+The following CVE entry is related to this issue:
+
+ ============== =======================================
+ CVE-2022-27672 Cross-Thread Return Address Predictions
+ ============== =======================================
+
+Problem
+-------
+
+Affected SMT-capable processors support 1T and 2T modes of execution when SMT
+is enabled. In 2T mode, both threads in a core are executing code. For the
+processor core to enter 1T mode, it is required that one of the threads
+requests to transition out of the C0 state. This can be communicated with the
+HLT instruction or with an MWAIT instruction that requests non-C0.
+When the thread re-enters the C0 state, the processor transitions back
+to 2T mode, assuming the other thread is also still in C0 state.
+
+In affected processors, the return address predictor (RAP) is partitioned
+depending on the SMT mode. For instance, in 2T mode each thread uses a private
+16-entry RAP, but in 1T mode, the active thread uses a 32-entry RAP. Upon
+transition between 1T/2T mode, the RAP contents are not modified but the RAP
+pointers (which control the next return target to use for predictions) may
+change. This behavior may result in return targets from one SMT thread being
+used by RET predictions in the sibling thread following a 1T/2T switch. In
+particular, a RET instruction executed immediately after a transition to 1T may
+use a return target from the thread that just became idle. In theory, this
+could lead to information disclosure if the return targets used do not come
+from trustworthy code.
+
+Attack scenarios
+----------------
+
+An attack can be mounted on affected processors by performing a series of CALL
+instructions with targeted return locations and then transitioning out of C0
+state.
+
+Mitigation mechanism
+--------------------
+
+Before entering idle state, the kernel context switches to the idle thread. The
+context switch fills the RAP entries (referred to as the RSB in Linux) with safe
+targets by performing a sequence of CALL instructions.
+
+Prevent a guest VM from directly putting the processor into an idle state by
+intercepting HLT and MWAIT instructions.
+
+Both mitigations are required to fully address this issue.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+Use existing Spectre v2 mitigations that will fill the RSB on context switch.
+
+Mitigation control for KVM - module parameter
+---------------------------------------------
+
+By default, the KVM hypervisor mitigates this issue by intercepting guest
+attempts to transition out of C0. A VMM can use the KVM_CAP_X86_DISABLE_EXITS
+capability to override those interceptions, but since this is not common, the
+mitigation that covers this path is not enabled by default.
+
+The mitigation for the KVM_CAP_X86_DISABLE_EXITS capability can be turned on
+using the boolean module parameter mitigate_smt_rsb, e.g. ``kvm.mitigate_smt_rsb=1``.
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 4df436e7c417..e0614760a99e 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -18,3 +18,4 @@ are configurable at compile, boot or run time.
core-scheduling.rst
l1d_flush.rst
processor_mmio_stale_data.rst
+ cross-thread-rsb.rst
diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
index c4dcdb3d0d45..3fe6511c5405 100644
--- a/Documentation/admin-guide/hw-vuln/spectre.rst
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -610,9 +610,9 @@ kernel command line.
retpoline,generic Retpolines
retpoline,lfence LFENCE; indirect branch
retpoline,amd alias for retpoline,lfence
- eibrs enhanced IBRS
- eibrs,retpoline enhanced IBRS + Retpolines
- eibrs,lfence enhanced IBRS + LFENCE
+ eibrs Enhanced/Auto IBRS
+ eibrs,retpoline Enhanced/Auto IBRS + Retpolines
+ eibrs,lfence Enhanced/Auto IBRS + LFENCE
ibrs use IBRS to protect kernel
Not specifying this option is equivalent to
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 5bfafcbb9562..0571938ecdc8 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -116,6 +116,7 @@ configure specific aspects of kernel behavior to your liking.
svga
syscall-user-dispatch
sysrq
+ thermal/index
thunderbolt
ufs
unicode
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6cfa6e3996cf..993f3b6c10ff 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5113,6 +5113,17 @@
rcupdate.rcu_cpu_stall_timeout to be used (after
conversion from seconds to milliseconds).
+ rcupdate.rcu_cpu_stall_cputime= [KNL]
+ Provide statistics on the cputime and count of
+ interrupts and tasks during the sampling period. For
+ multiple continuous RCU stalls, all sampling periods
+ begin at half of the first RCU stall timeout.
+
+ rcupdate.rcu_exp_stall_task_details= [KNL]
+ Print stack dumps of any tasks blocking the
+ current expedited RCU grace period during an
+ expedited RCU CPU stall warning.
+
rcupdate.rcu_expedited= [KNL]
Use expedited grace-period primitives, for
example, synchronize_rcu_expedited() instead
@@ -5221,7 +5232,7 @@
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
- mba.
+ mba, smba, bmec.
E.g. to turn on cmt and turn off mba use:
rdt=cmt,!mba
@@ -5729,9 +5740,9 @@
retpoline,generic - Retpolines
retpoline,lfence - LFENCE; indirect branch
retpoline,amd - alias for retpoline,lfence
- eibrs - enhanced IBRS
- eibrs,retpoline - enhanced IBRS + Retpolines
- eibrs,lfence - enhanced IBRS + LFENCE
+ eibrs - Enhanced/Auto IBRS
+ eibrs,retpoline - Enhanced/Auto IBRS + Retpolines
+ eibrs,lfence - Enhanced/Auto IBRS + LFENCE
ibrs - use IBRS to protect kernel
Not specifying this option is equivalent to
@@ -6369,6 +6380,16 @@
in situations with strict latency requirements (where
interruptions from clocksource watchdog are not
acceptable).
+ [x86] recalibrate: force recalibration against a HW timer
+ (HPET or PM timer) on systems whose TSC frequency was
+ obtained from HW or FW using either an MSR or CPUID(0x15).
+ Warn if the difference is more than 500 ppm.
+ [x86] watchdog: Use TSC as the watchdog clocksource with
+ which to check other HW timers (HPET or PM timer), but
+ only on systems where TSC has been deemed trustworthy.
+ This will be suppressed by an earlier tsc=nowatchdog and
+ can be overridden by a later tsc=nowatchdog. A console
+ message will flag any such suppression or overriding.
tsc_early_khz= [X86] Skip early TSC calibration and use the given
value instead. Useful when the early TSC frequency discovery
@@ -7020,3 +7041,10 @@
management firmware translates the requests into actual
hardware states (core frequency, data fabric and memory
clocks etc.)
+ active
+ Use amd_pstate_epp driver instance as the scaling driver,
+ driver provides a hint to the hardware if software wants
+ to bias toward performance (0x0) or energy efficiency (0xff)
+ to the CPPC firmware. then CPPC power algorithm will
+ calculate the runtime workload and adjust the realtime cores
+ frequency.
diff --git a/Documentation/admin-guide/media/davinci-vpbe.rst b/Documentation/admin-guide/media/davinci-vpbe.rst
deleted file mode 100644
index 9e6360fd02db..000000000000
--- a/Documentation/admin-guide/media/davinci-vpbe.rst
+++ /dev/null
@@ -1,65 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-The VPBE V4L2 driver design
-===========================
-
-Functional partitioning
------------------------
-
-Consists of the following:
-
- 1. V4L2 display driver
-
- Implements creation of video2 and video3 device nodes and
- provides v4l2 device interface to manage VID0 and VID1 layers.
-
- 2. Display controller
-
- Loads up VENC, OSD and external encoders such as ths8200. It provides
- a set of API calls to V4L2 drivers to set the output/standards
- in the VENC or external sub devices. It also provides
- a device object to access the services from OSD subdevice
- using sub device ops. The connection of external encoders to VENC LCD
- controller port is done at init time based on default output and standard
- selection or at run time when application change the output through
- V4L2 IOCTLs.
-
- When connected to an external encoder, vpbe controller is also responsible
- for setting up the interface between VENC and external encoders based on
- board specific settings (specified in board-xxx-evm.c). This allows
- interfacing external encoders such as ths8200. The setup_if_config()
- is implemented for this as well as configure_venc() (part of the next patch)
- API to set timings in VENC for a specific display resolution. As of this
- patch series, the interconnection and enabling and setting of the external
- encoders is not present, and would be a part of the next patch series.
-
- 3. VENC subdevice module
-
- Responsible for setting outputs provided through internal DACs and also
- setting timings at LCD controller port when external encoders are connected
- at the port or LCD panel timings required. When external encoder/LCD panel
- is connected, the timings for a specific standard/preset is retrieved from
- the board specific table and the values are used to set the timings in
- venc using non-standard timing mode.
-
- Support LCD Panel displays using the VENC. For example to support a Logic
- PD display, it requires setting up the LCD controller port with a set of
- timings for the resolution supported and setting the dot clock. So we could
- add the available outputs as a board specific entry (i.e add the "LogicPD"
- output name to board-xxx-evm.c). A table of timings for various LCDs
- supported can be maintained in the board specific setup file to support
- various LCD displays.As of this patch a basic driver is present, and this
- support for external encoders and displays forms a part of the next
- patch series.
-
- 4. OSD module
-
- OSD module implements all OSD layer management and hardware specific
- features. The VPBE module interacts with the OSD for enabling and
- disabling appropriate features of the OSD.
-
-Current status
---------------
-
-A fully functional working version of the V4L2 driver is available. This
-driver has been tested with NTSC and PAL standards and buffer streaming.
diff --git a/Documentation/admin-guide/media/platform-cardlist.rst b/Documentation/admin-guide/media/platform-cardlist.rst
index ac73c4166d1e..8ef57cd13dec 100644
--- a/Documentation/admin-guide/media/platform-cardlist.rst
+++ b/Documentation/admin-guide/media/platform-cardlist.rst
@@ -73,7 +73,6 @@ via-camera VIAFB camera controller
video-mux Video Multiplexer
vpif_display TI DaVinci VPIF V4L2-Display
vpif_capture TI DaVinci VPIF video capture
-vpss TI DaVinci VPBE V4L2-Display
vsp1 Renesas VSP1 Video Processing Engine
xilinx-tpg Xilinx Video Test Pattern Generator
xilinx-video Xilinx Video IP (EXPERIMENTAL)
diff --git a/Documentation/admin-guide/media/v4l-drivers.rst b/Documentation/admin-guide/media/v4l-drivers.rst
index 90a026ee05c6..734e18c310bd 100644
--- a/Documentation/admin-guide/media/v4l-drivers.rst
+++ b/Documentation/admin-guide/media/v4l-drivers.rst
@@ -13,7 +13,6 @@ Video4Linux (V4L) driver-specific documentation
cafe_ccic
cpia2
cx88
- davinci-vpbe
fimc
imx
imx7
diff --git a/Documentation/admin-guide/mm/zswap.rst b/Documentation/admin-guide/mm/zswap.rst
index f67de481c7f6..6dd74a18268b 100644
--- a/Documentation/admin-guide/mm/zswap.rst
+++ b/Documentation/admin-guide/mm/zswap.rst
@@ -70,9 +70,7 @@ e.g. ``zswap.zpool=zbud``. It can also be changed at runtime using the sysfs
The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
means the compression ratio will always be 2:1 or worse (because of half-full
zbud pages). The zsmalloc type zpool has a more complex compressed page
-storage method, and it can achieve greater storage densities. However,
-zsmalloc does not implement compressed page eviction, so once zswap fills it
-cannot evict the oldest page, it can only reject new pages.
+storage method, and it can achieve greater storage densities.
When a swap page is passed from frontswap to zswap, zswap maintains a mapping
of the swap entry, a combination of the swap type and swap offset, to the zpool
diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
index 5376d53faaa8..d143e72cf93e 100644
--- a/Documentation/admin-guide/pm/amd-pstate.rst
+++ b/Documentation/admin-guide/pm/amd-pstate.rst
@@ -230,8 +230,8 @@ with :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, it will respond
to the request from AMD P-States.
-User Space Interface in ``sysfs``
-==================================
+User Space Interface in ``sysfs`` - Per-policy control
+======================================================
``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
control its functionality at the system level. They are located in the
@@ -262,6 +262,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
<perf_cap_>`_.)
This attribute is read-only.
+``energy_performance_available_preferences``
+
+A list of all the supported EPP preferences that could be used for
+``energy_performance_preference`` on this system.
+These profiles represent different hints that are provided
+to the low-level firmware about the user's desired energy vs efficiency
+tradeoff. ``default`` represents the epp value is set by platform
+firmware. This attribute is read-only.
+
+``energy_performance_preference``
+
+The current energy performance preference can be read from this attribute.
+and user can change current preference according to energy or performance needs
+Please get all support profiles list from
+``energy_performance_available_preferences`` attribute, all the profiles are
+integer values defined between 0 to 255 when EPP feature is enabled by platform
+firmware, if EPP feature is disabled, driver will ignore the written value
+This attribute is read-write.
+
Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
@@ -280,8 +299,30 @@ module which supports the new AMD P-States mechanism on most of the future AMD
platforms. The AMD P-States mechanism is the more performance and energy
efficiency frequency management method on AMD processors.
-Kernel Module Options for ``amd-pstate``
-=========================================
+
+AMD Pstate Driver Operation Modes
+=================================
+
+``amd_pstate`` CPPC has two operation modes: CPPC Autonomous(active) mode and
+CPPC non-autonomous(passive) mode.
+active mode and passive mode can be chosen by different kernel parameters.
+When in Autonomous mode, CPPC ignores requests done in the Desired Performance
+Target register and takes into account only the values set to the Minimum requested
+performance, Maximum requested performance, and Energy Performance Preference
+registers. When Autonomous is disabled, it only considers the Desired Performance Target.
+
+Active Mode
+------------
+
+``amd_pstate=active``
+
+This is the low-level firmware control mode which is implemented by ``amd_pstate_epp``
+driver with ``amd_pstate=active`` passed to the kernel in the command line.
+In this mode, ``amd_pstate_epp`` driver provides a hint to the hardware if software
+wants to bias toward performance (0x0) or energy efficiency (0xff) to the CPPC firmware.
+then CPPC power algorithm will calculate the runtime workload and adjust the realtime
+cores frequency according to the power supply and thermal, core voltage and some other
+hardware conditions.
Passive Mode
------------
@@ -298,6 +339,35 @@ processor must provide at least nominal performance requested and go higher if c
operating conditions allow.
+User Space Interface in ``sysfs`` - General
+===========================================
+
+Global Attributes
+-----------------
+
+``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
+control its functionality at the system level. They are located in the
+``/sys/devices/system/cpu/amd-pstate/`` directory and affect all CPUs.
+
+``status``
+ Operation mode of the driver: "active", "passive" or "disable".
+
+ "active"
+ The driver is functional and in the ``active mode``
+
+ "passive"
+ The driver is functional and in the ``passive mode``
+
+ "disable"
+ The driver is unregistered and not functional now.
+
+ This attribute can be written to in order to change the driver's
+ operation mode or to unregister it. The string written to it must be
+ one of the possible values of it and, if successful, writing one of
+ these values to the sysfs file will cause the driver to switch over
+ to the operation mode represented by that string - or to be
+ unregistered in the "disable" case.
+
``cpupower`` tool support for ``amd-pstate``
===============================================
diff --git a/Documentation/admin-guide/thermal/index.rst b/Documentation/admin-guide/thermal/index.rst
new file mode 100644
index 000000000000..193b7b01a87d
--- /dev/null
+++ b/Documentation/admin-guide/thermal/index.rst
@@ -0,0 +1,8 @@
+=================
+Thermal Subsystem
+=================
+
+.. toctree::
+ :maxdepth: 1
+
+ intel_powerclamp
diff --git a/Documentation/driver-api/thermal/intel_powerclamp.rst b/Documentation/admin-guide/thermal/intel_powerclamp.rst
index 3f6dfb0b3ea6..3ce96043af17 100644
--- a/Documentation/driver-api/thermal/intel_powerclamp.rst
+++ b/Documentation/admin-guide/thermal/intel_powerclamp.rst
@@ -26,6 +26,8 @@ By:
- Generic Thermal Layer (sysfs)
- Kernel APIs (TBD)
+ (*) Module Parameters
+
INTRODUCTION
============
@@ -153,13 +155,15 @@ b) determine the amount of compensation needed at each target ratio
Compensation to each target ratio consists of two parts:
a) steady state error compensation
- This is to offset the error occurring when the system can
- enter idle without extra wakeups (such as external interrupts).
+
+ This is to offset the error occurring when the system can
+ enter idle without extra wakeups (such as external interrupts).
b) dynamic error compensation
- When an excessive amount of wakeups occurs during idle, an
- additional idle ratio can be added to quiet interrupts, by
- slowing down CPU activities.
+
+ When an excessive amount of wakeups occurs during idle, an
+ additional idle ratio can be added to quiet interrupts, by
+ slowing down CPU activities.
A debugfs file is provided for the user to examine compensation
progress and results, such as on a Westmere system::
@@ -281,6 +285,7 @@ cur_state returns value -1 instead of 0 which is to avoid confusing
100% busy state with the disabled state.
Example usage:
+
- To inject 25% idle time::
$ sudo sh -c "echo 25 > /sys/class/thermal/cooling_device80/cur_state
@@ -318,3 +323,23 @@ device, a PID based userspace thermal controller can manage to
control CPU temperature effectively, when no other thermal influence
is added. For example, a UltraBook user can compile the kernel under
certain temperature (below most active trip points).
+
+Module Parameters
+=================
+
+``cpumask`` (RW)
+ A bit mask of CPUs to inject idle. The format of the bitmask is same as
+ used in other subsystems like in /proc/irq/\*/smp_affinity. The mask is
+ comma separated 32 bit groups. Each CPU is one bit. For example for a 256
+ CPU system the full mask is:
+ ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff
+
+ The rightmost mask is for CPU 0-32.
+
+``max_idle`` (RW)
+ Maximum injected idle time to the total CPU time ratio in percent range
+ from 1 to 100. Even if the cooling device max_state is always 100 (100%),
+ this parameter allows to add a max idle percent limit. The default is 50,
+ to match the current implementation of powerclamp driver. Also doesn't
+ allow value more than 75, if the cpumask includes every CPU present in
+ the system.
diff --git a/Documentation/arm/index.rst b/Documentation/arm/index.rst
index 8c636d4a061f..ae42fe886f0d 100644
--- a/Documentation/arm/index.rst
+++ b/Documentation/arm/index.rst
@@ -64,7 +64,6 @@ SoC-specific documents
sunxi
samsung/index
- samsung-s3c24xx/index
sunxi/clocks
diff --git a/Documentation/arm/samsung-s3c24xx/cpufreq.rst b/Documentation/arm/samsung-s3c24xx/cpufreq.rst
deleted file mode 100644
index cd22697cf606..000000000000
--- a/Documentation/arm/samsung-s3c24xx/cpufreq.rst
+++ /dev/null
@@ -1,77 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0-only
-
-=======================
-S3C24XX CPUfreq support
-=======================
-
-Introduction
-------------
-
- The S3C24XX series support a number of power saving systems, such as
- the ability to change the core, memory and peripheral operating
- frequencies. The core control is exported via the CPUFreq driver
- which has a number of different manual or automatic controls over the
- rate the core is running at.
-
- There are two forms of the driver depending on the specific CPU and
- how the clocks are arranged. The first implementation used as single
- PLL to feed the ARM, memory and peripherals via a series of dividers
- and muxes and this is the implementation that is documented here. A
- newer version where there is a separate PLL and clock divider for the
- ARM core is available as a separate driver.
-
-
-Layout
-------
-
- The code core manages the CPU specific drivers, any data that they
- need to register and the interface to the generic drivers/cpufreq
- system. Each CPU registers a driver to control the PLL, clock dividers
- and anything else associated with it. Any board that wants to use this
- framework needs to supply at least basic details of what is required.
-
- The core registers with drivers/cpufreq at init time if all the data
- necessary has been supplied.
-
-
-CPU support
------------
-
- The support for each CPU depends on the facilities provided by the
- SoC and the driver as each device has different PLL and clock chains
- associated with it.
-
-
-Slow Mode
----------
-
- The SLOW mode where the PLL is turned off altogether and the
- system is fed by the external crystal input is currently not
- supported.
-
-
-sysfs
------
-
- The core code exports extra information via sysfs in the directory
- devices/system/cpu/cpu0/arch-freq.
-
-
-Board Support
--------------
-
- Each board that wants to use the cpufreq code must register some basic
- information with the core driver to provide information about what the
- board requires and any restrictions being placed on it.
-
- The board needs to supply information about whether it needs the IO bank
- timings changing, any maximum frequency limits and information about the
- SDRAM refresh rate.
-
-
-
-
-Document Author
----------------
-
-Ben Dooks, Copyright 2009 Simtec Electronics
diff --git a/Documentation/arm/samsung-s3c24xx/eb2410itx.rst b/Documentation/arm/samsung-s3c24xx/eb2410itx.rst
deleted file mode 100644
index 7863c93652f8..000000000000
--- a/Documentation/arm/samsung-s3c24xx/eb2410itx.rst
+++ /dev/null
@@ -1,59 +0,0 @@
-===================================
-Simtec Electronics EB2410ITX (BAST)
-===================================
-
- http://www.simtec.co.uk/products/EB2410ITX/
-
-Introduction
-------------
-
- The EB2410ITX is a S3C2410 based development board with a variety of
- peripherals and expansion connectors. This board is also known by
- the shortened name of Bast.
-
-
-Configuration
--------------
-
- To set the default configuration, use `make bast_defconfig` which
- supports the commonly used features of this board.
-
-
-Support
--------
-
- Official support information can be found on the Simtec Electronics
- website, at the product page http://www.simtec.co.uk/products/EB2410ITX/
-
- Useful links:
-
- - Resources Page http://www.simtec.co.uk/products/EB2410ITX/resources.html
-
- - Board FAQ at http://www.simtec.co.uk/products/EB2410ITX/faq.html
-
- - Bootloader info http://www.simtec.co.uk/products/SWABLE/resources.html
- and FAQ http://www.simtec.co.uk/products/SWABLE/faq.html
-
-
-MTD
----
-
- The NAND and NOR support has been merged from the linux-mtd project.
- Any problems, see http://www.linux-mtd.infradead.org/ for more
- information or up-to-date versions of linux-mtd.
-
-
-IDE
----
-
- Both onboard IDE ports are supported, however there is no support for
- changing speed of devices, PIO Mode 4 capable drives should be used.
-
-
-Maintainers
------------
-
- This board is maintained by Simtec Electronics.
-
-
-Copyright 2004 Ben Dooks, Simtec Electronics
diff --git a/Documentation/arm/samsung-s3c24xx/gpio.rst b/Documentation/arm/samsung-s3c24xx/gpio.rst
deleted file mode 100644
index f4a8c800a457..000000000000
--- a/Documentation/arm/samsung-s3c24xx/gpio.rst
+++ /dev/null
@@ -1,172 +0,0 @@
-====================
-S3C24XX GPIO Control
-====================
-
-Introduction
-------------
-
- The s3c2410 kernel provides an interface to configure and
- manipulate the state of the GPIO pins, and find out other
- information about them.
-
- There are a number of conditions attached to the configuration
- of the s3c2410 GPIO system, please read the Samsung provided
- data-sheet/users manual to find out the complete list.
-
- See Documentation/arm/samsung/gpio.rst for the core implementation.
-
-
-GPIOLIB
--------
-
- With the event of the GPIOLIB in drivers/gpio, support for some
- of the GPIO functions such as reading and writing a pin will
- be removed in favour of this common access method.
-
- Once all the extant drivers have been converted, the functions
- listed below will be removed (they may be marked as __deprecated
- in the near future).
-
- The following functions now either have a `s3c_` specific variant
- or are merged into gpiolib. See the definitions in
- arch/arm/mach-s3c/gpio-cfg.h:
-
- - s3c2410_gpio_setpin() gpio_set_value() or gpio_direction_output()
- - s3c2410_gpio_getpin() gpio_get_value() or gpio_direction_input()
- - s3c2410_gpio_getirq() gpio_to_irq()
- - s3c2410_gpio_cfgpin() s3c_gpio_cfgpin()
- - s3c2410_gpio_getcfg() s3c_gpio_getcfg()
- - s3c2410_gpio_pullup() s3c_gpio_setpull()
-
-
-GPIOLIB conversion
-------------------
-
-If you need to convert your board or driver to use gpiolib from the phased
-out s3c2410 API, then here are some notes on the process.
-
-1) If your board is exclusively using an GPIO, say to control peripheral
- power, then it will require to claim the gpio with gpio_request() before
- it can use it.
-
- It is recommended to check the return value, with at least WARN_ON()
- during initialisation.
-
-2) The s3c2410_gpio_cfgpin() can be directly replaced with s3c_gpio_cfgpin()
- as they have the same arguments, and can either take the pin specific
- values, or the more generic special-function-number arguments.
-
-3) s3c2410_gpio_pullup() changes have the problem that while the
- s3c2410_gpio_pullup(x, 1) can be easily translated to the
- s3c_gpio_setpull(x, S3C_GPIO_PULL_NONE), the s3c2410_gpio_pullup(x, 0)
- are not so easy.
-
- The s3c2410_gpio_pullup(x, 0) case enables the pull-up (or in the case
- of some of the devices, a pull-down) and as such the new API distinguishes
- between the UP and DOWN case. There is currently no 'just turn on' setting
- which may be required if this becomes a problem.
-
-4) s3c2410_gpio_setpin() can be replaced by gpio_set_value(), the old call
- does not implicitly configure the relevant gpio to output. The gpio
- direction should be changed before using gpio_set_value().
-
-5) s3c2410_gpio_getpin() is replaceable by gpio_get_value() if the pin
- has been set to input. It is currently unknown what the behaviour is
- when using gpio_get_value() on an output pin (s3c2410_gpio_getpin
- would return the value the pin is supposed to be outputting).
-
-6) s3c2410_gpio_getirq() should be directly replaceable with the
- gpio_to_irq() call.
-
-The s3c2410_gpio and `gpio_` calls have always operated on the same gpio
-numberspace, so there is no problem with converting the gpio numbering
-between the calls.
-
-
-Headers
--------
-
- See arch/arm/mach-s3c/regs-gpio-s3c24xx.h for the list
- of GPIO pins, and the configuration values for them. This
- is included by using #include <mach/regs-gpio.h>
-
-
-PIN Numbers
------------
-
- Each pin has an unique number associated with it in regs-gpio.h,
- e.g. S3C2410_GPA(0) or S3C2410_GPF(1). These defines are used to tell
- the GPIO functions which pin is to be used.
-
- With the conversion to gpiolib, there is no longer a direct conversion
- from gpio pin number to register base address as in earlier kernels. This
- is due to the number space required for newer SoCs where the later
- GPIOs are not contiguous.
-
-
-Configuring a pin
------------------
-
- The following function allows the configuration of a given pin to
- be changed.
-
- void s3c_gpio_cfgpin(unsigned int pin, unsigned int function);
-
- e.g.:
-
- s3c_gpio_cfgpin(S3C2410_GPA(0), S3C_GPIO_SFN(1));
- s3c_gpio_cfgpin(S3C2410_GPE(8), S3C_GPIO_SFN(2));
-
- which would turn GPA(0) into the lowest Address line A0, and set
- GPE(8) to be connected to the SDIO/MMC controller's SDDAT1 line.
-
-
-Reading the current configuration
----------------------------------
-
- The current configuration of a pin can be read by using standard
- gpiolib function:
-
- s3c_gpio_getcfg(unsigned int pin);
-
- The return value will be from the same set of values which can be
- passed to s3c_gpio_cfgpin().
-
-
-Configuring a pull-up resistor
-------------------------------
-
- A large proportion of the GPIO pins on the S3C2410 can have weak
- pull-up resistors enabled. This can be configured by the following
- function:
-
- void s3c_gpio_setpull(unsigned int pin, unsigned int to);
-
- Where the to value is S3C_GPIO_PULL_NONE to set the pull-up off,
- and S3C_GPIO_PULL_UP to enable the specified pull-up. Any other
- values are currently undefined.
-
-
-Getting and setting the state of a PIN
---------------------------------------
-
- These calls are now implemented by the relevant gpiolib calls, convert
- your board or driver to use gpiolib.
-
-
-Getting the IRQ number associated with a PIN
---------------------------------------------
-
- A standard gpiolib function can map the given pin number to an IRQ
- number to pass to the IRQ system.
-
- int gpio_to_irq(unsigned int pin);
-
- Note, not all pins have an IRQ.
-
-
-Author
--------
-
-Ben Dooks, 03 October 2004
-Copyright 2004 Ben Dooks, Simtec Electronics
diff --git a/Documentation/arm/samsung-s3c24xx/h1940.rst b/Documentation/arm/samsung-s3c24xx/h1940.rst
deleted file mode 100644
index 62a562c178e3..000000000000
--- a/Documentation/arm/samsung-s3c24xx/h1940.rst
+++ /dev/null
@@ -1,41 +0,0 @@
-=============
-HP IPAQ H1940
-=============
-
-http://www.handhelds.org/projects/h1940.html
-
-Introduction
-------------
-
- The HP H1940 is a S3C2410 based handheld device, with
- bluetooth connectivity.
-
-
-Support
--------
-
- A variety of information is available
-
- handhelds.org project page:
-
- http://www.handhelds.org/projects/h1940.html
-
- handhelds.org wiki page:
-
- http://handhelds.org/moin/moin.cgi/HpIpaqH1940
-
- Herbert Pötzl pages:
-
- http://vserver.13thfloor.at/H1940/
-
-
-Maintainers
------------
-
- This project is being maintained and developed by a variety
- of people, including Ben Dooks, Arnaud Patard, and Herbert Pötzl.
-
- Thanks to the many others who have also provided support.
-
-
-(c) 2005 Ben Dooks
diff --git a/Documentation/arm/samsung-s3c24xx/index.rst b/Documentation/arm/samsung-s3c24xx/index.rst
deleted file mode 100644
index ccb951a0bedb..000000000000
--- a/Documentation/arm/samsung-s3c24xx/index.rst
+++ /dev/null
@@ -1,20 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-==========================
-Samsung S3C24XX SoC Family
-==========================
-
-.. toctree::
- :maxdepth: 1
-
- h1940
- gpio
- cpufreq
- suspend
- usb-host
- s3c2412
- eb2410itx
- nand
- smdk2440
- s3c2413
- overview
diff --git a/Documentation/arm/samsung-s3c24xx/nand.rst b/Documentation/arm/samsung-s3c24xx/nand.rst
deleted file mode 100644
index 938995694ee7..000000000000
--- a/Documentation/arm/samsung-s3c24xx/nand.rst
+++ /dev/null
@@ -1,30 +0,0 @@
-====================
-S3C24XX NAND Support
-====================
-
-Introduction
-------------
-
-Small Page NAND
----------------
-
-The driver uses a 512 byte (1 page) ECC code for this setup. The
-ECC code is not directly compatible with the default kernel ECC
-code, so the driver enforces its own OOB layout and ECC parameters
-
-Large Page NAND
----------------
-
-The driver is capable of handling NAND flash with a 2KiB page
-size, with support for hardware ECC generation and correction.
-
-Unlike the 512byte page mode, the driver generates ECC data for
-each 256 byte block in an 2KiB page. This means that more than
-one error in a page can be rectified. It also means that the
-OOB layout remains the default kernel layout for these flashes.
-
-
-Document Author
----------------
-
-Ben Dooks, Copyright 2007 Simtec Electronics
diff --git a/Documentation/arm/samsung-s3c24xx/overview.rst b/Documentation/arm/samsung-s3c24xx/overview.rst
deleted file mode 100644
index 14535e5cffb7..000000000000
--- a/Documentation/arm/samsung-s3c24xx/overview.rst
+++ /dev/null
@@ -1,311 +0,0 @@
-==========================
-S3C24XX ARM Linux Overview
-==========================
-
-
-
-Introduction
-------------
-
- The Samsung S3C24XX range of ARM9 System-on-Chip CPUs are supported
- by the 's3c2410' architecture of ARM Linux. Currently the S3C2410,
- S3C2412, S3C2413, S3C2416, S3C2440, S3C2442, S3C2443 and S3C2450 devices
- are supported.
-
- Support for the S3C2400 and S3C24A0 series was never completed and the
- corresponding code has been removed after a while. If someone wishes to
- revive this effort, partial support can be retrieved from earlier Linux
- versions.
-
- The S3C2416 and S3C2450 devices are very similar and S3C2450 support is
- included under the arch/arm/mach-s3c directory. Note, while core
- support for these SoCs is in, work on some of the extra peripherals
- and extra interrupts is still ongoing.
-
-
-Configuration
--------------
-
- A generic S3C2410 configuration is provided, and can be used as the
- default by `make s3c2410_defconfig`. This configuration has support
- for all the machines, and the commonly used features on them.
-
- Certain machines may have their own default configurations as well,
- please check the machine specific documentation.
-
-
-Layout
-------
-
- The core support files, register, kernel and paltform data are located in the
- platform code contained in arch/arm/mach-s3c with headers in
- arch/arm/mach-s3c/include
-
-arch/arm/mach-s3c:
-
- Files in here are either common to all the s3c24xx family,
- or are common to only some of them with names to indicate this
- status. The files that are not common to all are generally named
- with the initial cpu they support in the series to ensure a short
- name without any possibility of confusion with newer devices.
-
- As an example, initially s3c244x would cover s3c2440 and s3c2442, but
- with the s3c2443 which does not share many of the same drivers in
- this directory, the name becomes invalid. We stick to s3c2440-<x>
- to indicate a driver that is s3c2440 and s3c2442 compatible.
-
- This does mean that to find the status of any given SoC, a number
- of directories may need to be searched.
-
-
-Machines
---------
-
- The currently supported machines are as follows:
-
- Simtec Electronics EB2410ITX (BAST)
-
- A general purpose development board, see EB2410ITX.txt for further
- details
-
- Simtec Electronics IM2440D20 (Osiris)
-
- CPU Module from Simtec Electronics, with a S3C2440A CPU, nand flash
- and a PCMCIA controller.
-
- Samsung SMDK2410
-
- Samsung's own development board, geared for PDA work.
-
- Samsung/Aiji SMDK2412
-
- The S3C2412 version of the SMDK2440.
-
- Samsung/Aiji SMDK2413
-
- The S3C2412 version of the SMDK2440.
-
- Samsung/Meritech SMDK2440
-
- The S3C2440 compatible version of the SMDK2440, which has the
- option of an S3C2440 or S3C2442 CPU module.
-
- Thorcom VR1000
-
- Custom embedded board
-
- HP IPAQ 1940
-
- Handheld (IPAQ), available in several varieties
-
- HP iPAQ rx3715
-
- S3C2440 based IPAQ, with a number of variations depending on
- features shipped.
-
- Acer N30
-
- A S3C2410 based PDA from Acer. There is a Wiki page at
- http://handhelds.org/moin/moin.cgi/AcerN30Documentation .
-
- AML M5900
-
- American Microsystems' M5900
-
- Nex Vision Nexcoder
- Nex Vision Otom
-
- Two machines by Nex Vision
-
-
-Adding New Machines
--------------------
-
- The architecture has been designed to support as many machines as can
- be configured for it in one kernel build, and any future additions
- should keep this in mind before altering items outside of their own
- machine files.
-
- Machine definitions should be kept in arch/arm/mach-s3c,
- and there are a number of examples that can be looked at.
-
- Read the kernel patch submission policies as well as the
- Documentation/arm directory before submitting patches. The
- ARM kernel series is managed by Russell King, and has a patch system
- located at http://www.arm.linux.org.uk/developer/patches/
- as well as mailing lists that can be found from the same site.
-
- As a courtesy, please notify <ben-linux@fluff.org> of any new
- machines or other modifications.
-
- Any large scale modifications, or new drivers should be discussed
- on the ARM kernel mailing list (linux-arm-kernel) before being
- attempted. See http://www.arm.linux.org.uk/mailinglists/ for the
- mailing list information.
-
-
-I2C
----
-
- The hardware I2C core in the CPU is supported in single master
- mode, and can be configured via platform data.
-
-
-RTC
----
-
- Support for the onboard RTC unit, including alarm function.
-
- This has recently been upgraded to use the new RTC core,
- and the module has been renamed to rtc-s3c to fit in with
- the new rtc naming scheme.
-
-
-Watchdog
---------
-
- The onchip watchdog is available via the standard watchdog
- interface.
-
-
-NAND
-----
-
- The current kernels now have support for the s3c2410 NAND
- controller. If there are any problems the latest linux-mtd
- code can be found from http://www.linux-mtd.infradead.org/
-
- For more information see Documentation/arm/samsung-s3c24xx/nand.rst
-
-
-SD/MMC
-------
-
- The SD/MMC hardware pre S3C2443 is supported in the current
- kernel, the driver is drivers/mmc/host/s3cmci.c and supports
- 1 and 4 bit SD or MMC cards.
-
- The SDIO behaviour of this driver has not been fully tested. There is no
- current support for hardware SDIO interrupts.
-
-
-Serial
-------
-
- The s3c2410 serial driver provides support for the internal
- serial ports. These devices appear as /dev/ttySAC0 through 3.
-
- To create device nodes for these, use the following commands
-
- mknod ttySAC0 c 204 64
- mknod ttySAC1 c 204 65
- mknod ttySAC2 c 204 66
-
-
-GPIO
-----
-
- The core contains support for manipulating the GPIO, see the
- documentation in GPIO.txt in the same directory as this file.
-
- Newer kernels carry GPIOLIB, and support is being moved towards
- this with some of the older support in line to be removed.
-
- As of v2.6.34, the move towards using gpiolib support is almost
- complete, and very little of the old calls are left.
-
- See Documentation/arm/samsung-s3c24xx/gpio.rst for the S3C24XX specific
- support and Documentation/arm/samsung/gpio.rst for the core Samsung
- implementation.
-
-
-Clock Management
-----------------
-
- The core provides the interface defined in the header file
- include/asm-arm/hardware/clock.h, to allow control over the
- various clock units
-
-
-Suspend to RAM
---------------
-
- For boards that provide support for suspend to RAM, the
- system can be placed into low power suspend.
-
- See Suspend.txt for more information.
-
-
-SPI
----
-
- SPI drivers are available for both the in-built hardware
- (although there is no DMA support yet) and a generic
- GPIO based solution.
-
-
-LEDs
-----
-
- There is support for GPIO based LEDs via a platform driver
- in the LED subsystem.
-
-
-Platform Data
--------------
-
- Whenever a device has platform specific data that is specified
- on a per-machine basis, care should be taken to ensure the
- following:
-
- 1) that default data is not left in the device to confuse the
- driver if a machine does not set it at startup
-
- 2) the data should (if possible) be marked as __initdata,
- to ensure that the data is thrown away if the machine is
- not the one currently in use.
-
- The best way of doing this is to make a function that
- kmalloc()s an area of memory, and copies the __initdata
- and then sets the relevant device's platform data. Making
- the function `__init` takes care of ensuring it is discarded
- with the rest of the initialisation code::
-
- static __init void s3c24xx_xxx_set_platdata(struct xxx_data *pd)
- {
- struct s3c2410_xxx_mach_info *npd;
-
- npd = kmalloc(sizeof(struct s3c2410_xxx_mach_info), GFP_KERNEL);
- if (npd) {
- memcpy(npd, pd, sizeof(struct s3c2410_xxx_mach_info));
- s3c_device_xxx.dev.platform_data = npd;
- } else {
- printk(KERN_ERR "no memory for xxx platform data\n");
- }
- }
-
- Note, since the code is marked as __init, it should not be
- exported outside arch/arm/mach-s3c/, or exported to
- modules via EXPORT_SYMBOL() and related functions.
-
-
-Port Contributors
------------------
-
- Ben Dooks (BJD)
- Vincent Sanders
- Herbert Potzl
- Arnaud Patard (RTP)
- Roc Wu
- Klaus Fetscher
- Dimitry Andric
- Shannon Holland
- Guillaume Gourat (NexVision)
- Christer Weinigel (wingel) (Acer N30)
- Lucas Correia Villa Real (S3C2400 port)
-
-
-Document Author
----------------
-
-Ben Dooks, Copyright 2004-2006 Simtec Electronics
diff --git a/Documentation/arm/samsung-s3c24xx/s3c2412.rst b/Documentation/arm/samsung-s3c24xx/s3c2412.rst
deleted file mode 100644
index 68b985fc6bf4..000000000000
--- a/Documentation/arm/samsung-s3c24xx/s3c2412.rst
+++ /dev/null
@@ -1,121 +0,0 @@
-==========================
-S3C2412 ARM Linux Overview
-==========================
-
-Introduction
-------------
-
- The S3C2412 is part of the S3C24XX range of ARM9 System-on-Chip CPUs
- from Samsung. This part has an ARM926-EJS core, capable of running up
- to 266MHz (see data-sheet for more information)
-
-
-Clock
------
-
- The core clock code provides a set of clocks to the drivers, and allows
- for source selection and a number of other features.
-
-
-Power
------
-
- No support for suspend/resume to RAM in the current system.
-
-
-DMA
----
-
- No current support for DMA.
-
-
-GPIO
-----
-
- There is support for setting the GPIO to input/output/special function
- and reading or writing to them.
-
-
-UART
-----
-
- The UART hardware is similar to the S3C2440, and is supported by the
- s3c2410 driver in the drivers/serial directory.
-
-
-NAND
-----
-
- The NAND hardware is similar to the S3C2440, and is supported by the
- s3c2410 driver in the drivers/mtd/nand/raw directory.
-
-
-USB Host
---------
-
- The USB hardware is similar to the S3C2410, with extended clock source
- control. The OHCI portion is supported by the ohci-s3c2410 driver, and
- the clock control selection is supported by the core clock code.
-
-
-USB Device
-----------
-
- No current support in the kernel
-
-
-IRQs
-----
-
- All the standard, and external interrupt sources are supported. The
- extra sub-sources are not yet supported.
-
-
-RTC
----
-
- The RTC hardware is similar to the S3C2410, and is supported by the
- s3c2410-rtc driver.
-
-
-Watchdog
---------
-
- The watchdog hardware is the same as the S3C2410, and is supported by
- the s3c2410_wdt driver.
-
-
-MMC/SD/SDIO
------------
-
- No current support for the MMC/SD/SDIO block.
-
-IIC
----
-
- The IIC hardware is the same as the S3C2410, and is supported by the
- i2c-s3c24xx driver.
-
-
-IIS
----
-
- No current support for the IIS interface.
-
-
-SPI
----
-
- No current support for the SPI interfaces.
-
-
-ATA
----
-
- No current support for the on-board ATA block.
-
-
-Document Author
----------------
-
-Ben Dooks, Copyright 2006 Simtec Electronics
diff --git a/Documentation/arm/samsung-s3c24xx/s3c2413.rst b/Documentation/arm/samsung-s3c24xx/s3c2413.rst
deleted file mode 100644
index 1f51e207fc46..000000000000
--- a/Documentation/arm/samsung-s3c24xx/s3c2413.rst
+++ /dev/null
@@ -1,22 +0,0 @@
-==========================
-S3C2413 ARM Linux Overview
-==========================
-
-Introduction
-------------
-
- The S3C2413 is an extended version of the S3C2412, with an camera
- interface and mobile DDR memory support. See the S3C2412 support
- documentation for more information.
-
-
-Camera Interface
-----------------
-
- This block is currently not supported.
-
-
-Document Author
----------------
-
-Ben Dooks, Copyright 2006 Simtec Electronics
diff --git a/Documentation/arm/samsung-s3c24xx/smdk2440.rst b/Documentation/arm/samsung-s3c24xx/smdk2440.rst
deleted file mode 100644
index 524fd0b4afaf..000000000000
--- a/Documentation/arm/samsung-s3c24xx/smdk2440.rst
+++ /dev/null
@@ -1,57 +0,0 @@
-=========================
-Samsung/Meritech SMDK2440
-=========================
-
-Introduction
-------------
-
- The SMDK2440 is a two part evaluation board for the Samsung S3C2440
- processor. It includes support for LCD, SmartMedia, Audio, SD and
- 10MBit Ethernet, and expansion headers for various signals, including
- the camera and unused GPIO.
-
-
-Configuration
--------------
-
- To set the default configuration, use `make smdk2440_defconfig` which
- will configure the common features of this board, or use
- `make s3c2410_config` to include support for all s3c2410/s3c2440 machines
-
-
-Support
--------
-
- Ben Dooks' SMDK2440 site at http://www.fluff.org/ben/smdk2440/ which
- includes linux based USB download tools.
-
- Some of the h1940 patches that can be found from the H1940 project
- site at http://www.handhelds.org/projects/h1940.html can also be
- applied to this board.
-
-
-Peripherals
------------
-
- There is no current support for any of the extra peripherals on the
- base-board itself.
-
-
-MTD
----
-
- The NAND flash should be supported by the in kernel MTD NAND support,
- NOR flash will be added later.
-
-
-Maintainers
------------
-
- This board is being maintained by Ben Dooks, for more info, see
- http://www.fluff.org/ben/smdk2440/
-
- Many thanks to Dimitry Andric of TomTom for the loan of the SMDK2440,
- and to Simtec Electronics for allowing me time to work on this.
-
-
-(c) 2004 Ben Dooks
diff --git a/Documentation/arm/samsung-s3c24xx/suspend.rst b/Documentation/arm/samsung-s3c24xx/suspend.rst
deleted file mode 100644
index b4f3ae9fe76e..000000000000
--- a/Documentation/arm/samsung-s3c24xx/suspend.rst
+++ /dev/null
@@ -1,137 +0,0 @@
-=======================
-S3C24XX Suspend Support
-=======================
-
-
-Introduction
-------------
-
- The S3C24XX supports a low-power suspend mode, where the SDRAM is kept
- in Self-Refresh mode, and all but the essential peripheral blocks are
- powered down. For more information on how this works, please look
- at the relevant CPU datasheet from Samsung.
-
-
-Requirements
-------------
-
- 1) A bootloader that can support the necessary resume operation
-
- 2) Support for at least 1 source for resume
-
- 3) CONFIG_PM enabled in the kernel
-
- 4) Any peripherals that are going to be powered down at the same
- time require suspend/resume support.
-
-
-Resuming
---------
-
- The S3C2410 user manual defines the process of sending the CPU to
- sleep and how it resumes. The default behaviour of the Linux code
- is to set the GSTATUS3 register to the physical address of the
- code to resume Linux operation.
-
- GSTATUS4 is currently left alone by the sleep code, and is free to
- use for any other purposes (for example, the EB2410ITX uses this to
- save memory configuration in).
-
-
-Machine Support
----------------
-
- The machine specific functions must call the s3c_pm_init() function
- to say that its bootloader is capable of resuming. This can be as
- simple as adding the following to the machine's definition:
-
- INITMACHINE(s3c_pm_init)
-
- A board can do its own setup before calling s3c_pm_init, if it
- needs to setup anything else for power management support.
-
- There is currently no support for over-riding the default method of
- saving the resume address, if your board requires it, then contact
- the maintainer and discuss what is required.
-
- Note, the original method of adding an late_initcall() is wrong,
- and will end up initialising all compiled machines' pm init!
-
- The following is an example of code used for testing wakeup from
- an falling edge on IRQ_EINT0::
-
-
- static irqreturn_t button_irq(int irq, void *pw)
- {
- return IRQ_HANDLED;
- }
-
- statuc void __init machine_init(void)
- {
- ...
-
- request_irq(IRQ_EINT0, button_irq, IRQF_TRIGGER_FALLING,
- "button-irq-eint0", NULL);
-
- enable_irq_wake(IRQ_EINT0);
-
- s3c_pm_init();
- }
-
-
-Debugging
----------
-
- There are several important things to remember when using PM suspend:
-
- 1) The uart drivers will disable the clocks to the UART blocks when
- suspending, which means that use of printascii() or similar direct
- access to the UARTs will cause the debug to stop.
-
- 2) While the pm code itself will attempt to re-enable the UART clocks,
- care should be taken that any external clock sources that the UARTs
- rely on are still enabled at that point.
-
- 3) If any debugging is placed in the resume path, then it must have the
- relevant clocks and peripherals setup before use (ie, bootloader).
-
- For example, if you transmit a character from the UART, the baud
- rate and uart controls must be setup beforehand.
-
-
-Configuration
--------------
-
- The S3C2410 specific configuration in `System Type` defines various
- aspects of how the S3C2410 suspend and resume support is configured
-
- `S3C2410 PM Suspend debug`
-
- This option prints messages to the serial console before and after
- the actual suspend, giving detailed information on what is
- happening
-
-
- `S3C2410 PM Suspend Memory CRC`
-
- Allows the entire memory to be checksummed before and after the
- suspend to see if there has been any corruption of the contents.
-
- Note, the time to calculate the CRC is dependent on the CPU speed
- and the size of memory. For an 64Mbyte RAM area on an 200MHz
- S3C2410, this can take approximately 4 seconds to complete.
-
- This support requires the CRC32 function to be enabled.
-
-
- `S3C2410 PM Suspend CRC Chunksize (KiB)`
-
- Defines the size of memory each CRC chunk covers. A smaller value
- will mean that the CRC data block will take more memory, but will
- identify any faults with better precision
-
-
-Document Author
----------------
-
-Ben Dooks, Copyright 2004 Simtec Electronics
diff --git a/Documentation/arm/samsung-s3c24xx/usb-host.rst b/Documentation/arm/samsung-s3c24xx/usb-host.rst
deleted file mode 100644
index 7aaffac89e04..000000000000
--- a/Documentation/arm/samsung-s3c24xx/usb-host.rst
+++ /dev/null
@@ -1,91 +0,0 @@
-========================
-S3C24XX USB Host support
-========================
-
-
-
-Introduction
-------------
-
- This document details the S3C2410/S3C2440 in-built OHCI USB host support.
-
-Configuration
--------------
-
- Enable at least the following kernel options:
-
- menuconfig::
-
- Device Drivers --->
- USB support --->
- <*> Support for Host-side USB
- <*> OHCI HCD support
-
-
- .config:
-
- - CONFIG_USB
- - CONFIG_USB_OHCI_HCD
-
-
- Once these options are configured, the standard set of USB device
- drivers can be configured and used.
-
-
-Board Support
--------------
-
- The driver attaches to a platform device, which will need to be
- added by the board specific support file in arch/arm/mach-s3c,
- such as mach-bast.c or mach-smdk2410.c
-
- The platform device's platform_data field is only needed if the
- board implements extra power control or over-current monitoring.
-
- The OHCI driver does not ensure the state of the S3C2410's MISCCTRL
- register, so if both ports are to be used for the host, then it is
- the board support file's responsibility to ensure that the second
- port is configured to be connected to the OHCI core.
-
-
-Platform Data
--------------
-
- See include/linux/platform_data/usb-ohci-s3c2410.h for the
- descriptions of the platform device data. An implementation
- can be found in arch/arm/mach-s3c/simtec-usb.c .
-
- The `struct s3c2410_hcd_info` contains a pair of functions
- that get called to enable over-current detection, and to
- control the port power status.
-
- The ports are numbered 0 and 1.
-
- power_control:
- Called to enable or disable the power on the port.
-
- enable_oc:
- Called to enable or disable the over-current monitoring.
- This should claim or release the resources being used to
- check the power condition on the port, such as an IRQ.
-
- report_oc:
- The OHCI driver fills this field in for the over-current code
- to call when there is a change to the over-current state on
- an port. The ports argument is a bitmask of 1 bit per port,
- with bit X being 1 for an over-current on port X.
-
- The function s3c2410_usb_report_oc() has been provided to
- ensure this is called correctly.
-
- port[x]:
- This is struct describes each port, 0 or 1. The platform driver
- should set the flags field of each port to S3C_HCDFLG_USED if
- the port is enabled.
-
-
-
-Document Author
----------------
-
-Ben Dooks, Copyright 2005 Simtec Electronics
diff --git a/Documentation/arm/samsung/gpio.rst b/Documentation/arm/samsung/gpio.rst
index f6e27b07c993..27fae0d50361 100644
--- a/Documentation/arm/samsung/gpio.rst
+++ b/Documentation/arm/samsung/gpio.rst
@@ -9,14 +9,6 @@ This outlines the Samsung GPIO implementation and the architecture
specific calls provided alongside the drivers/gpio core.
-S3C24XX (Legacy)
-----------------
-
-See Documentation/arm/samsung-s3c24xx/gpio.rst for more information
-about these devices. Their implementation has been brought into line
-with the core samsung implementation described in this document.
-
-
GPIOLIB integration
-------------------
diff --git a/Documentation/arm/samsung/overview.rst b/Documentation/arm/samsung/overview.rst
index e74307897416..8b15a190169b 100644
--- a/Documentation/arm/samsung/overview.rst
+++ b/Documentation/arm/samsung/overview.rst
@@ -12,21 +12,10 @@ Introduction
The currently supported SoCs are:
- - S3C24XX: See Documentation/arm/samsung-s3c24xx/overview.rst for full list
- S3C64XX: S3C6400 and S3C6410
- S5PC110 / S5PV210
-S3C24XX Systems
----------------
-
- There is still documentation in Documnetation/arm/Samsung-S3C24XX/ which
- deals with the architecture and drivers specific to these devices.
-
- See Documentation/arm/samsung-s3c24xx/overview.rst for more information
- on the implementation details and specific support.
-
-
Configuration
-------------
@@ -51,8 +40,6 @@ Layout
specific information. It contains the base clock, GPIO and device definitions
to get the system running.
- plat-s3c24xx is for s3c24xx specific builds, see the S3C24XX docs.
-
plat-s5p is for s5p specific builds, and contains common support for the
S5P specific systems. Not all S5Ps use all the features in this directory
due to differences in the hardware.
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 808ade4cc008..ec5f889d7681 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -120,6 +120,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A710 | #2224489 | ARM64_ERRATUM_2224489 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A715 | #2645198 | ARM64_ERRATUM_2645198 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-X2 | #2119858 | ARM64_ERRATUM_2119858 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-X2 | #2224489 | ARM64_ERRATUM_2224489 |
diff --git a/Documentation/block/capability.rst b/Documentation/block/capability.rst
deleted file mode 100644
index 2ae7f064736a..000000000000
--- a/Documentation/block/capability.rst
+++ /dev/null
@@ -1,10 +0,0 @@
-===============================
-Generic Block Device Capability
-===============================
-
-This file documents the sysfs file ``block/<disk>/capability``.
-
-``capability`` is a bitfield, printed in hexadecimal, indicating which
-capabilities a specific block device supports:
-
-.. kernel-doc:: include/linux/blkdev.h
diff --git a/Documentation/block/index.rst b/Documentation/block/index.rst
index c4c73db748a8..102953166429 100644
--- a/Documentation/block/index.rst
+++ b/Documentation/block/index.rst
@@ -10,7 +10,6 @@ Block
bfq-iosched
biovecs
blk-mq
- capability
cmdline-partition
data-integrity
deadline-iosched
diff --git a/Documentation/block/ublk.rst b/Documentation/block/ublk.rst
index ba45c46cc0da..1713b2890abb 100644
--- a/Documentation/block/ublk.rst
+++ b/Documentation/block/ublk.rst
@@ -144,6 +144,43 @@ managing and controlling ublk devices with help of several control commands:
For retrieving device info via ``ublksrv_ctrl_dev_info``. It is the server's
responsibility to save IO target specific info in userspace.
+- ``UBLK_CMD_GET_DEV_INFO2``
+ Same purpose with ``UBLK_CMD_GET_DEV_INFO``, but ublk server has to
+ provide path of the char device of ``/dev/ublkc*`` for kernel to run
+ permission check, and this command is added for supporting unprivileged
+ ublk device, and introduced with ``UBLK_F_UNPRIVILEGED_DEV`` together.
+ Only the user owning the requested device can retrieve the device info.
+
+ How to deal with userspace/kernel compatibility:
+
+ 1) if kernel is capable of handling ``UBLK_F_UNPRIVILEGED_DEV``
+
+ If ublk server supports ``UBLK_F_UNPRIVILEGED_DEV``:
+
+ ublk server should send ``UBLK_CMD_GET_DEV_INFO2``, given anytime
+ unprivileged application needs to query devices the current user owns,
+ when the application has no idea if ``UBLK_F_UNPRIVILEGED_DEV`` is set
+ given the capability info is stateless, and application should always
+ retrieve it via ``UBLK_CMD_GET_DEV_INFO2``
+
+ If ublk server doesn't support ``UBLK_F_UNPRIVILEGED_DEV``:
+
+ ``UBLK_CMD_GET_DEV_INFO`` is always sent to kernel, and the feature of
+ UBLK_F_UNPRIVILEGED_DEV isn't available for user
+
+ 2) if kernel isn't capable of handling ``UBLK_F_UNPRIVILEGED_DEV``
+
+ If ublk server supports ``UBLK_F_UNPRIVILEGED_DEV``:
+
+ ``UBLK_CMD_GET_DEV_INFO2`` is tried first, and will be failed, then
+ ``UBLK_CMD_GET_DEV_INFO`` needs to be retried given
+ ``UBLK_F_UNPRIVILEGED_DEV`` can't be set
+
+ If ublk server doesn't support ``UBLK_F_UNPRIVILEGED_DEV``:
+
+ ``UBLK_CMD_GET_DEV_INFO`` is always sent to kernel, and the feature of
+ ``UBLK_F_UNPRIVILEGED_DEV`` isn't available for user
+
- ``UBLK_CMD_START_USER_RECOVERY``
This command is valid if ``UBLK_F_USER_RECOVERY`` feature is enabled. This
@@ -180,6 +217,15 @@ managing and controlling ublk devices with help of several control commands:
double-write since the driver may issue the same I/O request twice. It
might be useful to a read-only FS or a VM backend.
+Unprivileged ublk device is supported by passing ``UBLK_F_UNPRIVILEGED_DEV``.
+Once the flag is set, all control commands can be sent by unprivileged
+user. Except for command of ``UBLK_CMD_ADD_DEV``, permission check on
+the specified char device(``/dev/ublkc*``) is done for all other control
+commands by ublk driver, for doing that, path of the char device has to
+be provided in these commands' payload from ublk server. With this way,
+ublk device becomes container-ware, and device created in one container
+can be controlled/accessed just inside this container.
+
Data plane
----------
@@ -254,15 +300,6 @@ with specified IO tag in the command data:
Future development
==================
-Container-aware ublk deivice
-----------------------------
-
-ublk driver doesn't handle any IO logic. Its function is well defined
-for now and very limited userspace interfaces are needed, which is also
-well defined too. It is possible to make ublk devices container-aware block
-devices in future as Stefan Hajnoczi suggested [#stefan]_, by removing
-ADMIN privilege.
-
Zero copy
---------
diff --git a/Documentation/conf.py b/Documentation/conf.py
index a5c45df0bd83..d927737e3c10 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -31,6 +31,12 @@ def have_command(cmd):
# Get Sphinx version
major, minor, patch = sphinx.version_info[:3]
+#
+# Warn about older versions that we don't want to support for much
+# longer.
+#
+if (major < 2) or (major == 2 and minor < 4):
+ print('WARNING: support for Sphinx < 2.4 will be removed soon.')
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
@@ -339,7 +345,11 @@ html_use_smartypants = False
# Custom sidebar templates, maps document names to template names.
# Note that the RTD theme ignores this
-html_sidebars = { '**': ["about.html", 'searchbox.html', 'localtoc.html', 'sourcelink.html']}
+html_sidebars = { '**': ['searchbox.html', 'localtoc.html', 'sourcelink.html']}
+
+# about.html is available for alabaster theme. Add it at the front.
+if html_theme == 'alabaster':
+ html_sidebars['**'].insert(0, 'about.html')
# Output file base name for HTML help builder.
htmlhelp_basename = 'TheLinuxKerneldoc'
diff --git a/Documentation/devicetree/bindings/.gitignore b/Documentation/devicetree/bindings/.gitignore
index a77719968a7e..51ddb26d93f0 100644
--- a/Documentation/devicetree/bindings/.gitignore
+++ b/Documentation/devicetree/bindings/.gitignore
@@ -2,3 +2,8 @@
*.example.dts
/processed-schema*.yaml
/processed-schema*.json
+
+#
+# We don't want to ignore the following even if they are dot-files
+#
+!.yamllint
diff --git a/Documentation/devicetree/bindings/arm/altera.yaml b/Documentation/devicetree/bindings/arm/altera.yaml
index 3eee03aa935c..8c7575455422 100644
--- a/Documentation/devicetree/bindings/arm/altera.yaml
+++ b/Documentation/devicetree/bindings/arm/altera.yaml
@@ -31,6 +31,7 @@ properties:
- description: Mercury+ AA1 boards
items:
- enum:
+ - enclustra,mercury-pe1
- google,chameleon-v3
- const: enclustra,mercury-aa1
- const: altr,socfpga-arria10
diff --git a/Documentation/devicetree/bindings/arm/amlogic.yaml b/Documentation/devicetree/bindings/arm/amlogic.yaml
index e16b5fa55847..b634d5b04e15 100644
--- a/Documentation/devicetree/bindings/arm/amlogic.yaml
+++ b/Documentation/devicetree/bindings/arm/amlogic.yaml
@@ -154,6 +154,7 @@ properties:
items:
- enum:
- khadas,vim3
+ - radxa,zero2
- const: amlogic,a311d
- const: amlogic,g12b
@@ -165,6 +166,7 @@ properties:
- azw,gtking-pro
- hardkernel,odroid-go-ultra
- hardkernel,odroid-n2
+ - hardkernel,odroid-n2l
- hardkernel,odroid-n2-plus
- khadas,vim3
- ugoos,am6
@@ -176,6 +178,7 @@ properties:
- enum:
- amediatech,x96-air
- amediatech,x96-air-gbit
+ - bananapi,bpi-m2-pro
- bananapi,bpi-m5
- cyx,a95xf3-air
- cyx,a95xf3-air-gbit
diff --git a/Documentation/devicetree/bindings/arm/aspeed/aspeed.yaml b/Documentation/devicetree/bindings/arm/aspeed/aspeed.yaml
index 73f272664e83..e0eff4c05879 100644
--- a/Documentation/devicetree/bindings/arm/aspeed/aspeed.yaml
+++ b/Documentation/devicetree/bindings/arm/aspeed/aspeed.yaml
@@ -78,6 +78,7 @@ properties:
- facebook,cloudripper-bmc
- facebook,elbert-bmc
- facebook,fuji-bmc
+ - facebook,greatlakes-bmc
- ibm,everest-bmc
- ibm,rainier-bmc
- ibm,tacoma-bmc
@@ -85,6 +86,7 @@ properties:
- jabil,rbp-bmc
- qcom,dc-scm-v1-bmc
- quanta,s6q-bmc
+ - ufispace,ncplite-bmc
- const: aspeed,ast2600
additionalProperties: true
diff --git a/Documentation/devicetree/bindings/arm/atmel-at91.yaml b/Documentation/devicetree/bindings/arm/atmel-at91.yaml
index 2224b18801a1..dfb8fd089197 100644
--- a/Documentation/devicetree/bindings/arm/atmel-at91.yaml
+++ b/Documentation/devicetree/bindings/arm/atmel-at91.yaml
@@ -91,9 +91,11 @@ properties:
- const: atmel,sama5d2
- const: atmel,sama5
- - description: SAM9X60-EK board
+ - description: Microchip SAM9X60 Evaluation Boards
items:
- - const: microchip,sam9x60ek
+ - enum:
+ - microchip,sam9x60ek
+ - microchip,sam9x60-curiosity
- const: microchip,sam9x60
- const: atmel,at91sam9
diff --git a/Documentation/devicetree/bindings/arm/fsl.yaml b/Documentation/devicetree/bindings/arm/fsl.yaml
index 05b5276a0e14..442ce8f4d675 100644
--- a/Documentation/devicetree/bindings/arm/fsl.yaml
+++ b/Documentation/devicetree/bindings/arm/fsl.yaml
@@ -88,12 +88,56 @@ properties:
items:
- enum:
- armadeus,imx28-apf28 # APF28 SoM
- - armadeus,imx28-apf28dev # APF28 SoM on APF28Dev board
+ - bluegiga,apx4devkit # Bluegiga APx4 SoM on dev board
+ - crystalfontz,cfa10036 # Crystalfontz CFA-10036 SoM
+ - eukrea,mbmx28lc
- fsl,imx28-evk
- i2se,duckbill
- i2se,duckbill-2
+ - karo,tx28 # Ka-Ro electronics TX28 module
+ - lwn,imx28-xea
+ - msr,m28cu3 # M28 SoM with custom base board
+ - schulercontrol,imx28-sps1
- technologic,imx28-ts4600
- const: fsl,imx28
+
+ - description: i.MX28 Aries M28 SoM Board
+ items:
+ - const: aries,m28
+ - const: denx,m28
+ - const: fsl,imx28
+
+ - description: i.MX28 Aries M28EVK Board
+ items:
+ - const: aries,m28evk
+ - const: denx,m28evk
+ - const: fsl,imx28
+
+ - description: i.MX28 Armadeus Systems APF28Dev Board
+ items:
+ - const: armadeus,imx28-apf28dev
+ - const: armadeus,imx28-apf28
+ - const: fsl,imx28
+
+ - description: i.MX28 Crystalfontz CFA-10036 based Boards
+ items:
+ - enum:
+ - crystalfontz,cfa10037
+ - crystalfontz,cfa10049
+ - crystalfontz,cfa10057
+ - crystalfontz,cfa10058
+ - const: crystalfontz,cfa10036
+ - const: fsl,imx28
+
+ - description: i.MX28 Crystalfontz CFA-10037 based Boards
+ items:
+ - enum:
+ - crystalfontz,cfa10055
+ - crystalfontz,cfa10056
+ - const: crystalfontz,cfa10037
+ - const: crystalfontz,cfa10036
+ - const: fsl,imx28
+
- description: i.MX28 Duckbill 2 based Boards
items:
- enum:
@@ -103,6 +147,19 @@ properties:
- const: i2se,duckbill-2
- const: fsl,imx28
+ - description: i.MX28 Eukrea Electromatique MBMX283LC Board
+ items:
+ - const: eukrea,mbmx283lc
+ - const: eukrea,mbmx28lc
+ - const: fsl,imx28
+
+ - description: i.MX28 Eukrea Electromatique MBMX287LC Board
+ items:
+ - const: eukrea,mbmx287lc
+ - const: eukrea,mbmx283lc
+ - const: eukrea,mbmx28lc
+ - const: fsl,imx28
+
- description: i.MX31 based Boards
items:
- enum:
@@ -173,6 +230,7 @@ properties:
- kiebackpeter,imx53-ddc # K+P imx53 DDC
- kiebackpeter,imx53-hsc # K+P imx53 HSC
- menlo,m53menlo # i.MX53 Menlo board
+ - starterkit,sk-imx53
- voipac,imx53-dmm-668 # Voipac i.MX53 X53-DMM-668
- const: fsl,imx53
@@ -644,6 +702,16 @@ properties:
- const: armadeus,imx6ull-opos6ul # OPOS6UL (i.MX6ULL) SoM
- const: fsl,imx6ull
+ - description: i.MX6ULL DHCOM SoM based Boards
+ items:
+ - enum:
+ - dh,imx6ull-dhcom-drc02
+ - dh,imx6ull-dhcom-pdk2
+ - dh,imx6ull-dhcom-picoitx
+ - const: dh,imx6ull-dhcom-som # The DHCOR is soldered on the DHCOM
+ - const: dh,imx6ull-dhcor-som
+ - const: fsl,imx6ull
+
- description: i.MX6ULL PHYTEC phyBOARD-Segin
items:
- enum:
@@ -815,7 +883,6 @@ properties:
- enum:
- beacon,imx8mm-beacon-kit # i.MX8MM Beacon Development Kit
- boundary,imx8mm-nitrogen8mm # i.MX8MM Nitrogen Board
- - cloos,imx8mm-phg # i.MX8MM Cloos PHG Board
- dmo,imx8mm-data-modul-edm-sbc # i.MX8MM eDM SBC
- emtrion,emcon-mx8mm-avari # emCON-MX8MM SoM on Avari Base
- fsl,imx8mm-ddr4-evk # i.MX8MM DDR4 EVK Board
@@ -830,7 +897,6 @@ properties:
- innocomm,wb15-evk # i.MX8MM Innocomm EVK board with WB15 SoM
- kontron,imx8mm-sl # i.MX8MM Kontron SL (N801X) SOM
- kontron,imx8mm-osm-s # i.MX8MM Kontron OSM-S (N802X) SOM
- - menlo,mx8menlo # i.MX8MM Menlo board with Verdin SoM
- toradex,verdin-imx8mm # Verdin iMX8M Mini Modules
- toradex,verdin-imx8mm-nonwifi # Verdin iMX8M Mini Modules without Wi-Fi / BT
- toradex,verdin-imx8mm-wifi # Verdin iMX8M Mini Wi-Fi / BT Modules
@@ -861,8 +927,10 @@ properties:
- description: Toradex Boards with Verdin iMX8M Mini Modules
items:
- enum:
+ - menlo,mx8menlo # Verdin iMX8M Mini Module on i.MX8MM Menlo board
- toradex,verdin-imx8mm-nonwifi-dahlia # Verdin iMX8M Mini Module on Dahlia
- toradex,verdin-imx8mm-nonwifi-dev # Verdin iMX8M Mini Module on Verdin Development Board
+ - toradex,verdin-imx8mm-nonwifi-yavia # Verdin iMX8M Mini Module on Yavia
- const: toradex,verdin-imx8mm-nonwifi # Verdin iMX8M Mini Module without Wi-Fi / BT
- const: toradex,verdin-imx8mm # Verdin iMX8M Mini Module
- const: fsl,imx8mm
@@ -872,6 +940,7 @@ properties:
- enum:
- toradex,verdin-imx8mm-wifi-dahlia # Verdin iMX8M Mini Wi-Fi / BT Module on Dahlia
- toradex,verdin-imx8mm-wifi-dev # Verdin iMX8M Mini Wi-Fi / BT M. on Verdin Development B.
+ - toradex,verdin-imx8mm-wifi-yavia # Verdin iMX8M Mini Wi-Fi / BT Module on Yavia
- const: toradex,verdin-imx8mm-wifi # Verdin iMX8M Mini Wi-Fi / BT Module
- const: toradex,verdin-imx8mm # Verdin iMX8M Mini Module
- const: fsl,imx8mm
@@ -895,6 +964,7 @@ properties:
one compatible is needed.
items:
- enum:
+ - cloos,imx8mm-phg # i.MX8MM Cloos PHG Board
- tq,imx8mm-tqma8mqml-mba8mx # TQ-Systems GmbH i.MX8MM TQMa8MQML SOM on MBa8Mx
- const: tq,imx8mm-tqma8mqml # TQ-Systems GmbH i.MX8MM TQMa8MQML SOM
- const: fsl,imx8mm
@@ -931,10 +1001,11 @@ properties:
- description: i.MX8MP based Boards
items:
- enum:
- - dh,imx8mp-dhcom-som # i.MX8MP DHCOM SoM
- - dh,imx8mp-dhcom-pdk2 # i.MX8MP DHCOM SoM on PDK2 board
+ - beacon,imx8mp-beacon-kit # i.MX8MP Beacon Development Kit
- fsl,imx8mp-evk # i.MX8MP EVK Board
- gateworks,imx8mp-gw74xx # i.MX8MP Gateworks Board
+ - polyhex,imx8mp-debix # Polyhex Debix boards
+ - polyhex,imx8mp-debix-model-a # Polyhex Debix Model A Board
- toradex,verdin-imx8mp # Verdin iMX8M Plus Modules
- toradex,verdin-imx8mp-nonwifi # Verdin iMX8M Plus Modules without Wi-Fi / BT
- toradex,verdin-imx8mp-wifi # Verdin iMX8M Plus Wi-Fi / BT Modules
@@ -947,6 +1018,12 @@ properties:
- const: avnet,sm2s-imx8mp # SM2S-IMX8PLUS SoM
- const: fsl,imx8mp
+ - description: i.MX8MP DHCOM based Boards
+ items:
+ - const: dh,imx8mp-dhcom-pdk2 # i.MX8MP DHCOM SoM on PDK2 board
+ - const: dh,imx8mp-dhcom-som # i.MX8MP DHCOM SoM
+ - const: fsl,imx8mp
+
- description: Engicam i.Core MX8M Plus SoM based boards
items:
- enum:
@@ -965,6 +1042,7 @@ properties:
- enum:
- toradex,verdin-imx8mp-nonwifi-dahlia # Verdin iMX8M Plus Module on Dahlia
- toradex,verdin-imx8mp-nonwifi-dev # Verdin iMX8M Plus Module on Verdin Development Board
+ - toradex,verdin-imx8mp-nonwifi-yavia # Verdin iMX8M Plus Module on Yavia
- const: toradex,verdin-imx8mp-nonwifi # Verdin iMX8M Plus Module without Wi-Fi / BT
- const: toradex,verdin-imx8mp # Verdin iMX8M Plus Module
- const: fsl,imx8mp
@@ -974,6 +1052,7 @@ properties:
- enum:
- toradex,verdin-imx8mp-wifi-dahlia # Verdin iMX8M Plus Wi-Fi / BT Module on Dahlia
- toradex,verdin-imx8mp-wifi-dev # Verdin iMX8M Plus Wi-Fi / BT M. on Verdin Development B.
+ - toradex,verdin-imx8mp-wifi-yavia # Verdin iMX8M Plus Wi-Fi / BT Module on Yavia
- const: toradex,verdin-imx8mp-wifi # Verdin iMX8M Plus Wi-Fi / BT Module
- const: toradex,verdin-imx8mp # Verdin iMX8M Plus Module
- const: fsl,imx8mp
@@ -999,12 +1078,17 @@ properties:
- fsl,imx8mq-evk # i.MX8MQ EVK Board
- google,imx8mq-phanbell # Google Coral Edge TPU
- kontron,pitx-imx8m # Kontron pITX-imx8m Board
- - mntre,reform2 # MNT Reform2 Laptop
- purism,librem5-devkit # Purism Librem5 devkit
- solidrun,hummingboard-pulse # SolidRun Hummingboard Pulse
- technexion,pico-pi-imx8m # TechNexion PICO-PI-8M evk
- const: fsl,imx8mq
+ - description: i.MX8MQ NITROGEN SoM based Boards
+ items:
+ - const: mntre,reform2 # MNT Reform2 Laptop
+ - const: boundary,imx8mq-nitrogen8m-som # i.MX8MQ NITROGEN SoM
+ - const: fsl,imx8mq
+
- description: Purism Librem5 phones
items:
- enum:
diff --git a/Documentation/devicetree/bindings/arm/mediatek.yaml b/Documentation/devicetree/bindings/arm/mediatek.yaml
index 2275e5d93721..ae12b1cab9fb 100644
--- a/Documentation/devicetree/bindings/arm/mediatek.yaml
+++ b/Documentation/devicetree/bindings/arm/mediatek.yaml
@@ -246,6 +246,10 @@ properties:
- const: mediatek,mt8183
- items:
- enum:
+ - mediatek,mt8365-evk
+ - const: mediatek,mt8365
+ - items:
+ - enum:
- mediatek,mt8516-pumpkin
- const: mediatek,mt8516
diff --git a/Documentation/devicetree/bindings/arm/qcom.yaml b/Documentation/devicetree/bindings/arm/qcom.yaml
index 27063a045bd0..1bb24d46e4ee 100644
--- a/Documentation/devicetree/bindings/arm/qcom.yaml
+++ b/Documentation/devicetree/bindings/arm/qcom.yaml
@@ -35,6 +35,8 @@ description: |
mdm9615
msm8226
msm8916
+ msm8939
+ msm8953
msm8956
msm8974
msm8976
@@ -47,11 +49,13 @@ description: |
qru1000
sa8155p
sa8540p
+ sa8775p
sc7180
sc7280
sc8180x
sc8280xp
sda660
+ sdm450
sdm630
sdm632
sdm636
@@ -62,6 +66,7 @@ description: |
sdx65
sm4250
sm6115
+ sm6115p
sm6125
sm6350
sm6375
@@ -70,6 +75,7 @@ description: |
sm8250
sm8350
sm8450
+ sm8550
The 'board' element must be one of the following strings:
@@ -84,6 +90,7 @@ description: |
liquid
mtp
qrd
+ ride
sbc
x100
@@ -162,6 +169,12 @@ properties:
- items:
- enum:
+ - sony,kanuti-tulip
+ - square,apq8039-t2
+ - const: qcom,msm8939
+
+ - items:
+ - enum:
- sony,kugo-row
- sony,suzu-row
- const: qcom,msm8956
@@ -194,8 +207,10 @@ properties:
- items:
- enum:
+ - acer,a1-724
- alcatel,idol347
- asus,z00l
+ - gplus,fl8005a
- huawei,g7
- longcheer,l8910
- samsung,a3u-eur
@@ -203,8 +218,13 @@ properties:
- samsung,e5
- samsung,e7
- samsung,grandmax
+ - samsung,gt510
+ - samsung,gt58
- samsung,j5
+ - samsung,j5x
- samsung,serranove
+ - thwc,uf896
+ - thwc,ufi001c
- wingtech,wt88047
- const: qcom,msm8916
@@ -215,6 +235,15 @@ properties:
- items:
- enum:
+ - motorola,potter
+ - xiaomi,daisy
+ - xiaomi,mido
+ - xiaomi,tissot
+ - xiaomi,vince
+ - const: qcom,msm8953
+
+ - items:
+ - enum:
- lg,bullhead
- microsoft,talkman
- xiaomi,libra
@@ -627,6 +656,12 @@ properties:
- const: google,hoglin
- const: qcom,sc7280
+ - description: Qualcomm Technologies, Inc. sc7280 CRD Pro platform (newest rev)
+ items:
+ - const: google,zoglin-sku1536
+ - const: google,hoglin-sku1536
+ - const: qcom,sc7280
+
- description: Qualcomm Technologies, Inc. sc7280 IDP SKU1 platform
items:
- const: qcom,sc7280-idp
@@ -679,6 +714,18 @@ properties:
- const: google,zombie-sku512
- const: qcom,sc7280
+ - description: Google Zombie with NVMe (newest rev)
+ items:
+ - const: google,zombie-sku2
+ - const: google,zombie-sku3
+ - const: google,zombie-sku515
+ - const: qcom,sc7280
+
+ - description: Google Zombie with LTE and NVMe (newest rev)
+ items:
+ - const: google,zombie-sku514
+ - const: qcom,sc7280
+
- items:
- enum:
- lenovo,flex-5g
@@ -695,6 +742,11 @@ properties:
- items:
- enum:
+ - motorola,ali
+ - const: qcom,sdm450
+
+ - items:
+ - enum:
- sony,discovery-row
- sony,kirin-row
- sony,pioneer-row
@@ -709,6 +761,7 @@ properties:
- items:
- enum:
- fairphone,fp3
+ - motorola,ocean
- const: qcom,sdm632
- items:
@@ -764,6 +817,11 @@ properties:
- items:
- enum:
+ - qcom,sa8775p-ride
+ - const: qcom,sa8775p
+
+ - items:
+ - enum:
- google,cheza
- google,cheza-rev1
- google,cheza-rev2
@@ -792,6 +850,12 @@ properties:
- items:
- enum:
+ - lenovo,j606f
+ - const: qcom,sm6115p
+ - const: qcom,sm6115
+
+ - items:
+ - enum:
- sony,pdx201
- const: qcom,sm6125
@@ -826,6 +890,7 @@ properties:
- qcom,sm8250-mtp
- sony,pdx203-generic
- sony,pdx206-generic
+ - xiaomi,elish
- const: qcom,sm8250
- items:
@@ -845,6 +910,11 @@ properties:
- sony,pdx224
- const: qcom,sm8450
+ - items:
+ - enum:
+ - qcom,sm8550-mtp
+ - const: qcom,sm8550
+
# Board compatibles go above
qcom,msm-id:
@@ -922,15 +992,22 @@ allOf:
- qcom,apq8026
- qcom,apq8094
- qcom,apq8096
+ - qcom,msm8939
+ - qcom,msm8953
+ - qcom,msm8956
- qcom,msm8992
- qcom,msm8994
- qcom,msm8996
- qcom,msm8998
+ - qcom,sdm450
- qcom,sdm630
- qcom,sdm632
+ - qcom,sdm636
- qcom,sdm845
- qcom,sdx55
- qcom,sdx65
+ - qcom,sm4250
+ - qcom,sm6115
- qcom,sm6125
- qcom,sm6350
- qcom,sm7225
@@ -954,6 +1031,8 @@ allOf:
- oneplus,dumpling
- oneplus,enchilada
- oneplus,fajita
+ - oneplus,oneplus3
+ - oneplus,oneplus3t
then:
properties:
qcom,board-id:
diff --git a/Documentation/devicetree/bindings/arm/rockchip.yaml b/Documentation/devicetree/bindings/arm/rockchip.yaml
index 88ff4422a8c1..35f74eda30ae 100644
--- a/Documentation/devicetree/bindings/arm/rockchip.yaml
+++ b/Documentation/devicetree/bindings/arm/rockchip.yaml
@@ -90,11 +90,33 @@ properties:
- const: chipspark,rayeager-px2
- const: rockchip,rk3066a
+ - description: Edgeble Neural Compute Module 2(Neu2) SoM based boards
+ items:
+ - const: edgeble,neural-compute-module-2-io # Edgeble Neural Compute Module 2 IO Board
+ - const: edgeble,neural-compute-module-2 # Edgeble Neural Compute Module 2 SoM
+ - const: rockchip,rv1126
+
+ - description: Edgeble Neural Compute Module 6(Neu6) Model A SoM based boards
+ items:
+ - const: edgeble,neural-compute-module-6a-io # Edgeble Neural Compute Module 6A IO Board
+ - const: edgeble,neural-compute-module-6a # Edgeble Neural Compute Module 6A SoM
+ - const: rockchip,rk3588
+
- description: Elgin RV1108 R1
items:
- const: elgin,rv1108-r1
- const: rockchip,rv1108
+ - description: EmbedFire LubanCat 1
+ items:
+ - const: embedfire,lubancat-1
+ - const: rockchip,rk3566
+
+ - description: EmbedFire LubanCat 2
+ items:
+ - const: embedfire,lubancat-2
+ - const: rockchip,rk3568
+
- description: Engicam PX30.Core C.TOUCH 2.0
items:
- const: engicam,px30-core-ctouch2
@@ -599,6 +621,20 @@ properties:
- const: pine64,soquartz
- const: rockchip,rk3566
+ - description: Radxa Compute Module 3(CM3)
+ items:
+ - enum:
+ - radxa,cm3-io
+ - const: radxa,cm3
+ - const: rockchip,rk3566
+
+ - description: Radxa CM3 Industrial
+ items:
+ - enum:
+ - radxa,e25
+ - const: radxa,cm3i
+ - const: rockchip,rk3568
+
- description: Radxa Rock
items:
- const: radxa,rock
@@ -652,6 +688,16 @@ properties:
- const: radxa,rock3a
- const: rockchip,rk3568
+ - description: Radxa ROCK 5 Model A
+ items:
+ - const: radxa,rock-5a
+ - const: rockchip,rk3588s
+
+ - description: Radxa ROCK 5 Model B
+ items:
+ - const: radxa,rock-5b
+ - const: rockchip,rk3588
+
- description: Rikomagic MK808 v1
items:
- const: rikomagic,mk808
@@ -689,6 +735,11 @@ properties:
- const: rockchip,rk3036-evb
- const: rockchip,rk3036
+ - description: Rockchip RK3128 Evaluation board
+ items:
+ - const: rockchip,rk3128-evb
+ - const: rockchip,rk3128
+
- description: Rockchip RK3228 Evaluation board
items:
- const: rockchip,rk3228-evb
@@ -736,6 +787,11 @@ properties:
- const: rockchip,rk3399-sapphire-excavator
- const: rockchip,rk3399
+ - description: Rockchip RK3588 Evaluation board
+ items:
+ - const: rockchip,rk3588-evb1-v10
+ - const: rockchip,rk3588
+
- description: Rockchip RV1108 Evaluation board
items:
- const: rockchip,rv1108-evb
@@ -761,6 +817,11 @@ properties:
- const: tronsmart,orion-r68-meta
- const: rockchip,rk3368
+ - description: Xunlong Orange Pi R1 Plus
+ items:
+ - const: xunlong,orangepi-r1-plus
+ - const: rockchip,rk3328
+
- description: Zkmagic A95X Z2
items:
- const: zkmagic,a95x-z2
diff --git a/Documentation/devicetree/bindings/arm/rockchip/pmu.yaml b/Documentation/devicetree/bindings/arm/rockchip/pmu.yaml
index 8c73bc7f4009..b79c81cd9f0e 100644
--- a/Documentation/devicetree/bindings/arm/rockchip/pmu.yaml
+++ b/Documentation/devicetree/bindings/arm/rockchip/pmu.yaml
@@ -27,6 +27,7 @@ select:
- rockchip,rk3399-pmu
- rockchip,rk3568-pmu
- rockchip,rk3588-pmu
+ - rockchip,rv1126-pmu
required:
- compatible
@@ -43,6 +44,7 @@ properties:
- rockchip,rk3399-pmu
- rockchip,rk3568-pmu
- rockchip,rk3588-pmu
+ - rockchip,rv1126-pmu
- const: syscon
- const: simple-mfd
diff --git a/Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml b/Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml
index faea33e4f731..deb2cf971871 100644
--- a/Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml
+++ b/Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml
@@ -171,6 +171,7 @@ properties:
- hardkernel,odroid-xu3-lite # Hardkernel Odroid XU3 Lite
- hardkernel,odroid-xu4 # Hardkernel Odroid XU4
- hardkernel,odroid-hc1 # Hardkernel Odroid HC1
+ - samsung,k3g # Samsung Galaxy S5 (SM-G900H)
- const: samsung,exynos5800
- const: samsung,exynos5
diff --git a/Documentation/devicetree/bindings/arm/ti/k3.yaml b/Documentation/devicetree/bindings/arm/ti/k3.yaml
index 203faab80142..a60a4065caa8 100644
--- a/Documentation/devicetree/bindings/arm/ti/k3.yaml
+++ b/Documentation/devicetree/bindings/arm/ti/k3.yaml
@@ -38,10 +38,17 @@ properties:
- ti,am642-sk
- const: ti,am642
+ - description: K3 AM642 SoC PHYTEC phyBOARD-Electra
+ items:
+ - const: phytec,am642-phyboard-electra-rdk
+ - const: phytec,am64-phycore-som
+ - const: ti,am642
+
- description: K3 AM654 SoC
items:
- enum:
- siemens,iot2050-advanced
+ - siemens,iot2050-advanced-m2
- siemens,iot2050-advanced-pg2
- siemens,iot2050-basic
- siemens,iot2050-basic-pg2
@@ -69,9 +76,17 @@ properties:
- description: K3 J721s2 SoC
items:
- enum:
+ - ti,am68-sk
- ti,j721s2-evm
- const: ti,j721s2
+ - description: K3 J784s4 SoC
+ items:
+ - enum:
+ - ti,am69-sk
+ - ti,j784s4-evm
+ - const: ti,j784s4
+
additionalProperties: true
...
diff --git a/Documentation/devicetree/bindings/bus/nvidia,tegra210-aconnect.yaml b/Documentation/devicetree/bindings/bus/nvidia,tegra210-aconnect.yaml
index d3ed048c9521..4157e885c6e7 100644
--- a/Documentation/devicetree/bindings/bus/nvidia,tegra210-aconnect.yaml
+++ b/Documentation/devicetree/bindings/bus/nvidia,tegra210-aconnect.yaml
@@ -40,10 +40,10 @@ properties:
maxItems: 1
"#address-cells":
- const: 1
+ enum: [ 1, 2 ]
"#size-cells":
- const: 1
+ enum: [ 1, 2 ]
ranges: true
diff --git a/Documentation/devicetree/bindings/clock/qcom,qdu1000-gcc.yaml b/Documentation/devicetree/bindings/clock/qcom,qdu1000-gcc.yaml
new file mode 100644
index 000000000000..767a9d03aa32
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/qcom,qdu1000-gcc.yaml
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/qcom,qdu1000-gcc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Global Clock & Reset Controller for QDU1000 and QRU1000
+
+maintainers:
+ - Melody Olvera <quic_molvera@quicinc.com>
+
+description: |
+ Qualcomm global clock control module which supports the clocks, resets and
+ power domains on QDU1000 and QRU1000
+
+ See also:: include/dt-bindings/clock/qcom,qdu1000-gcc.h
+
+properties:
+ compatible:
+ const: qcom,qdu1000-gcc
+
+ clocks:
+ items:
+ - description: Board XO source
+ - description: Sleep clock source
+ - description: PCIE 0 Pipe clock source
+ - description: PCIE 0 Phy Auxiliary clock source
+ - description: USB3 Phy wrapper pipe clock source
+
+required:
+ - compatible
+ - clocks
+
+allOf:
+ - $ref: qcom,gcc.yaml#
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/qcom,rpmh.h>
+ clock-controller@100000 {
+ compatible = "qcom,qdu1000-gcc";
+ reg = <0x00100000 0x001f4200>;
+ clocks = <&rpmhcc RPMH_CXO_CLK>, <&sleep_clk>,
+ <&pcie_0_pipe_clk>, <&pcie_0_phy_aux_clk>,
+ <&usb3_phy_wrapper_pipe_clk>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ #power-domain-cells = <1>;
+ };
diff --git a/Documentation/devicetree/bindings/clock/qcom,sm6350-camcc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm6350-camcc.yaml
new file mode 100644
index 000000000000..fd6658cb793d
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/qcom,sm6350-camcc.yaml
@@ -0,0 +1,49 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/qcom,sm6350-camcc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Camera Clock & Reset Controller on SM6350
+
+maintainers:
+ - Konrad Dybcio <konrad.dybcio@linaro.org>
+
+description: |
+ Qualcomm camera clock control module provides the clocks, resets and power
+ domains on SM6350.
+
+ See also:: include/dt-bindings/clock/qcom,sm6350-camcc.h
+
+properties:
+ compatible:
+ const: qcom,sm6350-camcc
+
+ clocks:
+ items:
+ - description: Board XO source
+
+ reg:
+ maxItems: 1
+
+required:
+ - compatible
+ - clocks
+
+allOf:
+ - $ref: qcom,gcc.yaml#
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/qcom,rpmh.h>
+ clock-controller@ad00000 {
+ compatible = "qcom,sm6350-camcc";
+ reg = <0x0ad00000 0x16000>;
+ clocks = <&rpmhcc RPMH_CXO_CLK>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ #power-domain-cells = <1>;
+ };
+...
diff --git a/Documentation/devicetree/bindings/clock/qcom,sm8550-dispcc.yaml b/Documentation/devicetree/bindings/clock/qcom,sm8550-dispcc.yaml
new file mode 100644
index 000000000000..ab25f7cbaa2e
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/qcom,sm8550-dispcc.yaml
@@ -0,0 +1,105 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/qcom,sm8550-dispcc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Display Clock & Reset Controller for SM8550
+
+maintainers:
+ - Bjorn Andersson <andersson@kernel.org>
+ - Neil Armstrong <neil.armstrong@linaro.org>
+
+description: |
+ Qualcomm display clock control module provides the clocks, resets and power
+ domains on SM8550.
+
+ See also:: include/dt-bindings/clock/qcom,sm8550-dispcc.h
+
+properties:
+ compatible:
+ enum:
+ - qcom,sm8550-dispcc
+
+ clocks:
+ items:
+ - description: Board XO source
+ - description: Board Always On XO source
+ - description: Display's AHB clock
+ - description: sleep clock
+ - description: Byte clock from DSI PHY0
+ - description: Pixel clock from DSI PHY0
+ - description: Byte clock from DSI PHY1
+ - description: Pixel clock from DSI PHY1
+ - description: Link clock from DP PHY0
+ - description: VCO DIV clock from DP PHY0
+ - description: Link clock from DP PHY1
+ - description: VCO DIV clock from DP PHY1
+ - description: Link clock from DP PHY2
+ - description: VCO DIV clock from DP PHY2
+ - description: Link clock from DP PHY3
+ - description: VCO DIV clock from DP PHY3
+
+ '#clock-cells':
+ const: 1
+
+ '#reset-cells':
+ const: 1
+
+ '#power-domain-cells':
+ const: 1
+
+ reg:
+ maxItems: 1
+
+ power-domains:
+ description:
+ A phandle and PM domain specifier for the MMCX power domain.
+ maxItems: 1
+
+ required-opps:
+ description:
+ A phandle to an OPP node describing required MMCX performance point.
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+ - clocks
+ - '#clock-cells'
+ - '#reset-cells'
+ - '#power-domain-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/qcom,sm8550-gcc.h>
+ #include <dt-bindings/clock/qcom,rpmh.h>
+ #include <dt-bindings/power/qcom-rpmpd.h>
+ clock-controller@af00000 {
+ compatible = "qcom,sm8550-dispcc";
+ reg = <0x0af00000 0x10000>;
+ clocks = <&rpmhcc RPMH_CXO_CLK>,
+ <&rpmhcc RPMH_CXO_CLK_A>,
+ <&gcc GCC_DISP_AHB_CLK>,
+ <&sleep_clk>,
+ <&dsi0_phy 0>,
+ <&dsi0_phy 1>,
+ <&dsi1_phy 0>,
+ <&dsi1_phy 1>,
+ <&dp0_phy 0>,
+ <&dp0_phy 1>,
+ <&dp1_phy 0>,
+ <&dp1_phy 1>,
+ <&dp2_phy 0>,
+ <&dp2_phy 1>,
+ <&dp3_phy 0>,
+ <&dp3_phy 1>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ #power-domain-cells = <1>;
+ power-domains = <&rpmhpd SM8550_MMCX>;
+ required-opps = <&rpmhpd_opp_low_svs>;
+ };
+...
diff --git a/Documentation/devicetree/bindings/clock/qcom,sm8550-tcsr.yaml b/Documentation/devicetree/bindings/clock/qcom,sm8550-tcsr.yaml
new file mode 100644
index 000000000000..1bf1a41fd89c
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/qcom,sm8550-tcsr.yaml
@@ -0,0 +1,55 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/qcom,sm8550-tcsr.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm TCSR Clock Controller on SM8550
+
+maintainers:
+ - Bjorn Andersson <andersson@kernel.org>
+
+description: |
+ Qualcomm TCSR clock control module provides the clocks, resets and
+ power domains on SM8550
+
+ See also:: include/dt-bindings/clock/qcom,sm8550-tcsr.h
+
+properties:
+ compatible:
+ items:
+ - const: qcom,sm8550-tcsr
+ - const: syscon
+
+ clocks:
+ items:
+ - description: TCXO pad clock
+
+ reg:
+ maxItems: 1
+
+ '#clock-cells':
+ const: 1
+
+ '#reset-cells':
+ const: 1
+
+required:
+ - compatible
+ - clocks
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/qcom,rpmh.h>
+
+ clock-controller@1fc0000 {
+ compatible = "qcom,sm8550-tcsr", "syscon";
+ reg = <0x1fc0000 0x30000>;
+ clocks = <&rpmhcc RPMH_CXO_CLK>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ };
+
+...
diff --git a/Documentation/devicetree/bindings/clock/samsung,s3c2410-clock.txt b/Documentation/devicetree/bindings/clock/samsung,s3c2410-clock.txt
deleted file mode 100644
index 2632d3f13004..000000000000
--- a/Documentation/devicetree/bindings/clock/samsung,s3c2410-clock.txt
+++ /dev/null
@@ -1,49 +0,0 @@
-* Samsung S3C2410 Clock Controller
-
-The S3C2410 clock controller generates and supplies clock to various controllers
-within the SoC. The clock binding described here is applicable to the s3c2410,
-s3c2440 and s3c2442 SoCs in the s3c24x family.
-
-Required Properties:
-
-- compatible: should be one of the following.
- - "samsung,s3c2410-clock" - controller compatible with S3C2410 SoC.
- - "samsung,s3c2440-clock" - controller compatible with S3C2440 SoC.
- - "samsung,s3c2442-clock" - controller compatible with S3C2442 SoC.
-- reg: physical base address of the controller and length of memory mapped
- region.
-- #clock-cells: should be 1.
-
-Each clock is assigned an identifier and client nodes can use this identifier
-to specify the clock which they consume. Some of the clocks are available only
-on a particular SoC.
-
-All available clocks are defined as preprocessor macros in
-dt-bindings/clock/s3c2410.h header and can be used in device
-tree sources.
-
-External clocks:
-
-The xti clock used as input for the plls is generated outside the SoC. It is
-expected that is are defined using standard clock bindings with a
-clock-output-names value of "xti".
-
-Example: Clock controller node:
-
- clocks: clock-controller@4c000000 {
- compatible = "samsung,s3c2410-clock";
- reg = <0x4c000000 0x20>;
- #clock-cells = <1>;
- };
-
-Example: UART controller node that consumes the clock generated by the clock
- controller (refer to the standard clock bindings for information about
- "clocks" and "clock-names" properties):
-
- serial@50004000 {
- compatible = "samsung,s3c2440-uart";
- reg = <0x50004000 0x4000>;
- interrupts = <1 23 3 4>, <1 23 4 4>;
- clock-names = "uart", "clk_uart_baud2";
- clocks = <&clocks PCLK_UART0>, <&clocks PCLK_UART0>;
- };
diff --git a/Documentation/devicetree/bindings/clock/samsung,s3c2412-clock.txt b/Documentation/devicetree/bindings/clock/samsung,s3c2412-clock.txt
deleted file mode 100644
index 21a8c23e658f..000000000000
--- a/Documentation/devicetree/bindings/clock/samsung,s3c2412-clock.txt
+++ /dev/null
@@ -1,49 +0,0 @@
-* Samsung S3C2412 Clock Controller
-
-The S3C2412 clock controller generates and supplies clock to various controllers
-within the SoC. The clock binding described here is applicable to the s3c2412
-and s3c2413 SoCs in the s3c24x family.
-
-Required Properties:
-
-- compatible: should be "samsung,s3c2412-clock"
-- reg: physical base address of the controller and length of memory mapped
- region.
-- #clock-cells: should be 1.
-
-Each clock is assigned an identifier and client nodes can use this identifier
-to specify the clock which they consume. Some of the clocks are available only
-on a particular SoC.
-
-All available clocks are defined as preprocessor macros in
-dt-bindings/clock/s3c2412.h header and can be used in device
-tree sources.
-
-External clocks:
-
-There are several clocks that are generated outside the SoC. It is expected
-that they are defined using standard clock bindings with following
-clock-output-names:
- - "xti" - crystal input - required,
- - "ext" - external clock source - optional,
-
-Example: Clock controller node:
-
- clocks: clock-controller@4c000000 {
- compatible = "samsung,s3c2412-clock";
- reg = <0x4c000000 0x20>;
- #clock-cells = <1>;
- };
-
-Example: UART controller node that consumes the clock generated by the clock
- controller (refer to the standard clock bindings for information about
- "clocks" and "clock-names" properties):
-
- serial@50004000 {
- compatible = "samsung,s3c2412-uart";
- reg = <0x50004000 0x4000>;
- interrupts = <1 23 3 4>, <1 23 4 4>;
- clock-names = "uart", "clk_uart_baud2", "clk_uart_baud3";
- clocks = <&clocks PCLK_UART0>, <&clocks PCLK_UART0>,
- <&clocks SCLK_UART>;
- };
diff --git a/Documentation/devicetree/bindings/clock/samsung,s3c2443-clock.txt b/Documentation/devicetree/bindings/clock/samsung,s3c2443-clock.txt
deleted file mode 100644
index 985c0f574e9a..000000000000
--- a/Documentation/devicetree/bindings/clock/samsung,s3c2443-clock.txt
+++ /dev/null
@@ -1,55 +0,0 @@
-* Samsung S3C2443 Clock Controller
-
-The S3C2443 clock controller generates and supplies clock to various controllers
-within the SoC. The clock binding described here is applicable to all SoCs in
-the s3c24x family starting with the s3c2443.
-
-Required Properties:
-
-- compatible: should be one of the following.
- - "samsung,s3c2416-clock" - controller compatible with S3C2416 SoC.
- - "samsung,s3c2443-clock" - controller compatible with S3C2443 SoC.
- - "samsung,s3c2450-clock" - controller compatible with S3C2450 SoC.
-- reg: physical base address of the controller and length of memory mapped
- region.
-- #clock-cells: should be 1.
-
-Each clock is assigned an identifier and client nodes can use this identifier
-to specify the clock which they consume. Some of the clocks are available only
-on a particular SoC.
-
-All available clocks are defined as preprocessor macros in
-dt-bindings/clock/s3c2443.h header and can be used in device
-tree sources.
-
-External clocks:
-
-There are several clocks that are generated outside the SoC. It is expected
-that they are defined using standard clock bindings with following
-clock-output-names:
- - "xti" - crystal input - required,
- - "ext" - external clock source - optional,
- - "ext_i2s" - external I2S clock - optional,
- - "ext_uart" - external uart clock - optional,
-
-Example: Clock controller node:
-
- clocks: clock-controller@4c000000 {
- compatible = "samsung,s3c2416-clock";
- reg = <0x4c000000 0x40>;
- #clock-cells = <1>;
- };
-
-Example: UART controller node that consumes the clock generated by the clock
- controller (refer to the standard clock bindings for information about
- "clocks" and "clock-names" properties):
-
- serial@50004000 {
- compatible = "samsung,s3c2440-uart";
- reg = <0x50004000 0x4000>;
- interrupts = <1 23 3 4>, <1 23 4 4>;
- clock-names = "uart", "clk_uart_baud2",
- "clk_uart_baud3";
- clocks = <&clocks PCLK_UART0>, <&clocks PCLK_UART0>,
- <&clocks SCLK_UART>;
- };
diff --git a/Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.yaml b/Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.yaml
index 903b31129f01..e4aa8c67d532 100644
--- a/Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.yaml
+++ b/Documentation/devicetree/bindings/cpufreq/cpufreq-qcom-hw.yaml
@@ -26,8 +26,13 @@ properties:
items:
- enum:
- qcom,qdu1000-cpufreq-epss
+ - qcom,sc7280-cpufreq-epss
+ - qcom,sc8280xp-cpufreq-epss
- qcom,sm6375-cpufreq-epss
- qcom,sm8250-cpufreq-epss
+ - qcom,sm8350-cpufreq-epss
+ - qcom,sm8450-cpufreq-epss
+ - qcom,sm8550-cpufreq-epss
- const: qcom,cpufreq-epss
reg:
@@ -54,6 +59,17 @@ properties:
- const: xo
- const: alternate
+ interrupts:
+ minItems: 1
+ maxItems: 3
+
+ interrupt-names:
+ minItems: 1
+ items:
+ - const: dcvsh-irq-0
+ - const: dcvsh-irq-1
+ - const: dcvsh-irq-2
+
'#freq-domain-cells':
const: 1
diff --git a/Documentation/devicetree/bindings/cpufreq/qcom-cpufreq-nvmem.yaml b/Documentation/devicetree/bindings/cpufreq/qcom-cpufreq-nvmem.yaml
index 9c086eac6ca7..6f5e7904181f 100644
--- a/Documentation/devicetree/bindings/cpufreq/qcom-cpufreq-nvmem.yaml
+++ b/Documentation/devicetree/bindings/cpufreq/qcom-cpufreq-nvmem.yaml
@@ -17,6 +17,9 @@ description: |
on the CPU OPP in use. The CPUFreq driver sets the CPR power domain level
according to the required OPPs defined in the CPU OPP tables.
+ For old implementation efuses are parsed to select the correct opp table and
+ voltage and CPR is not supported/used.
+
select:
properties:
compatible:
@@ -33,37 +36,65 @@ select:
required:
- compatible
-properties:
- cpus:
- type: object
-
- patternProperties:
- '^cpu@[0-9a-f]+$':
- type: object
-
- properties:
- power-domains:
- maxItems: 1
-
- power-domain-names:
- items:
- - const: cpr
-
- required:
- - power-domains
- - power-domain-names
-
patternProperties:
'^opp-table(-[a-z0-9]+)?$':
- if:
+ allOf:
+ - if:
+ properties:
+ compatible:
+ const: operating-points-v2-kryo-cpu
+ then:
+ $ref: /schemas/opp/opp-v2-kryo-cpu.yaml#
+
+ - if:
+ properties:
+ compatible:
+ const: operating-points-v2-qcom-level
+ then:
+ $ref: /schemas/opp/opp-v2-qcom-level.yaml#
+
+ unevaluatedProperties: false
+
+allOf:
+ - if:
properties:
compatible:
- const: operating-points-v2-kryo-cpu
+ contains:
+ enum:
+ - qcom,qcs404
+
then:
+ properties:
+ cpus:
+ type: object
+
+ patternProperties:
+ '^cpu@[0-9a-f]+$':
+ type: object
+
+ properties:
+ power-domains:
+ maxItems: 1
+
+ power-domain-names:
+ items:
+ - const: cpr
+
+ required:
+ - power-domains
+ - power-domain-names
+
patternProperties:
- '^opp-?[0-9]+$':
- required:
- - required-opps
+ '^opp-table(-[a-z0-9]+)?$':
+ if:
+ properties:
+ compatible:
+ const: operating-points-v2-kryo-cpu
+ then:
+ patternProperties:
+ '^opp-?[0-9]+$':
+ required:
+ - required-opps
additionalProperties: true
diff --git a/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-aes.yaml b/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-aes.yaml
index 0ccaab16dc61..0b7383b3106b 100644
--- a/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-aes.yaml
+++ b/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-aes.yaml
@@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: Atmel Advanced Encryption Standard (AES) HW cryptographic accelerator
maintainers:
- - Tudor Ambarus <tudor.ambarus@microchip.com>
+ - Tudor Ambarus <tudor.ambarus@linaro.org>
properties:
compatible:
diff --git a/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-sha.yaml b/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-sha.yaml
index 5163c51b4547..ee2ffb034325 100644
--- a/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-sha.yaml
+++ b/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-sha.yaml
@@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: Atmel Secure Hash Algorithm (SHA) HW cryptographic accelerator
maintainers:
- - Tudor Ambarus <tudor.ambarus@microchip.com>
+ - Tudor Ambarus <tudor.ambarus@linaro.org>
properties:
compatible:
diff --git a/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-tdes.yaml b/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-tdes.yaml
index fcc5adf03cad..3d6ed24b1b00 100644
--- a/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-tdes.yaml
+++ b/Documentation/devicetree/bindings/crypto/atmel,at91sam9g46-tdes.yaml
@@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: Atmel Triple Data Encryption Standard (TDES) HW cryptographic accelerator
maintainers:
- - Tudor Ambarus <tudor.ambarus@microchip.com>
+ - Tudor Ambarus <tudor.ambarus@linaro.org>
properties:
compatible:
diff --git a/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
index f2c143730a55..6e2fd6e9fa7f 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-controller-main.yaml
@@ -32,7 +32,7 @@ properties:
- description: Display byte clock
- description: Display byte interface clock
- description: Display pixel clock
- - description: Display escape clock
+ - description: Display core clock
- description: Display AHB clock
- description: Display AXI clock
@@ -137,8 +137,6 @@ required:
- phys
- assigned-clocks
- assigned-clock-parents
- - power-domains
- - operating-points-v2
- ports
additionalProperties: false
diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml b/Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
index d9ad8b659f58..3ec466c3ab38 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
@@ -69,7 +69,6 @@ required:
- compatible
- reg
- reg-names
- - vdds-supply
unevaluatedProperties: false
diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml b/Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml
index 819de5ce0bc9..a43e11d3b00d 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml
@@ -39,7 +39,6 @@ required:
- compatible
- reg
- reg-names
- - vcca-supply
unevaluatedProperties: false
diff --git a/Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml b/Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml
index 3d8540a06fe2..2f1fd140c87d 100644
--- a/Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml
+++ b/Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml
@@ -34,6 +34,10 @@ properties:
vddio-supply:
description: Phandle to vdd-io regulator device node.
+ qcom,dsi-phy-regulator-ldo-mode:
+ type: boolean
+ description: Indicates if the LDO mode PHY regulator is wanted.
+
required:
- compatible
- reg
diff --git a/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml b/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml
index d6f043a4b08d..4795e13c7b59 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,qcm2290-mdss.yaml
@@ -72,7 +72,7 @@ examples:
#include <dt-bindings/interconnect/qcom,qcm2290.h>
#include <dt-bindings/power/qcom-rpmpd.h>
- mdss@5e00000 {
+ display-subsystem@5e00000 {
#address-cells = <1>;
#size-cells = <1>;
compatible = "qcom,qcm2290-mdss";
diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml b/Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml
index a86d7f53fa84..886858ef6700 100644
--- a/Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml
+++ b/Documentation/devicetree/bindings/display/msm/qcom,sm6115-mdss.yaml
@@ -62,7 +62,7 @@ examples:
#include <dt-bindings/interrupt-controller/arm-gic.h>
#include <dt-bindings/power/qcom-rpmpd.h>
- mdss@5e00000 {
+ display-subsystem@5e00000 {
#address-cells = <1>;
#size-cells = <1>;
compatible = "qcom,sm6115-mdss";
diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-display.yaml b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-display.yaml
index ce5c673f940c..117c371ce24b 100644
--- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-display.yaml
+++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-display.yaml
@@ -20,10 +20,10 @@ properties:
- nvidia,tegra194-display
'#address-cells':
- const: 1
+ enum: [ 1, 2 ]
'#size-cells':
- const: 1
+ enum: [ 1, 2 ]
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml b/Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml
index c46378efc123..92e899905ef8 100644
--- a/Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml
+++ b/Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml
@@ -16,7 +16,7 @@ properties:
compatible:
items:
- enum:
- - renesas,i2c-r9a09g011 # RZ/V2M
+ - renesas,r9a09g011-i2c # RZ/V2M
- const: renesas,rzv2m-i2c
reg:
@@ -66,7 +66,7 @@ examples:
#include <dt-bindings/interrupt-controller/arm-gic.h>
i2c0: i2c@a4030000 {
- compatible = "renesas,i2c-r9a09g011", "renesas,rzv2m-i2c";
+ compatible = "renesas,r9a09g011-i2c", "renesas,rzv2m-i2c";
reg = <0xa4030000 0x80>;
interrupts = <GIC_SPI 232 IRQ_TYPE_EDGE_RISING>,
<GIC_SPI 236 IRQ_TYPE_EDGE_RISING>;
diff --git a/Documentation/devicetree/bindings/interconnect/qcom,qdu1000-rpmh.yaml b/Documentation/devicetree/bindings/interconnect/qcom,qdu1000-rpmh.yaml
new file mode 100644
index 000000000000..0070b0396e31
--- /dev/null
+++ b/Documentation/devicetree/bindings/interconnect/qcom,qdu1000-rpmh.yaml
@@ -0,0 +1,70 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interconnect/qcom,qdu1000-rpmh.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm RPMh Network-On-Chip Interconnect on QDU1000
+
+maintainers:
+ - Georgi Djakov <djakov@kernel.org>
+ - Odelu Kukatla <quic_okukatla@quicinc.com>
+
+description: |
+ RPMh interconnect providers support system bandwidth requirements through
+ RPMh hardware accelerators known as Bus Clock Manager (BCM). The provider is
+ able to communicate with the BCM through the Resource State Coordinator (RSC)
+ associated with each execution environment. Provider nodes must point to at
+ least one RPMh device child node pertaining to their RSC and each provider
+ can map to multiple RPMh resources.
+
+properties:
+ compatible:
+ enum:
+ - qcom,qdu1000-clk-virt
+ - qcom,qdu1000-gem-noc
+ - qcom,qdu1000-mc-virt
+ - qcom,qdu1000-system-noc
+
+ '#interconnect-cells': true
+
+ reg:
+ maxItems: 1
+
+allOf:
+ - $ref: qcom,rpmh-common.yaml#
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - qcom,qdu1000-clk-virt
+ - qcom,qdu1000-mc-virt
+ then:
+ properties:
+ reg: false
+ else:
+ required:
+ - reg
+
+required:
+ - compatible
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interconnect/qcom,qdu1000-rpmh.h>
+
+ system_noc: interconnect@1640000 {
+ compatible = "qcom,qdu1000-system-noc";
+ reg = <0x1640000 0x45080>;
+ #interconnect-cells = <2>;
+ qcom,bcm-voters = <&apps_bcm_voter>;
+ };
+
+ clk_virt: interconnect-0 {
+ compatible = "qcom,qdu1000-clk-virt";
+ #interconnect-cells = <2>;
+ qcom,bcm-voters = <&apps_bcm_voter>;
+ };
diff --git a/Documentation/devicetree/bindings/interconnect/qcom,rpm.yaml b/Documentation/devicetree/bindings/interconnect/qcom,rpm.yaml
index 4b37aa88a375..5e6be4e79201 100644
--- a/Documentation/devicetree/bindings/interconnect/qcom,rpm.yaml
+++ b/Documentation/devicetree/bindings/interconnect/qcom,rpm.yaml
@@ -84,7 +84,6 @@ allOf:
- qcom,msm8939-pcnoc
- qcom,msm8939-snoc
- qcom,msm8996-a1noc
- - qcom,msm8996-a2noc
- qcom,msm8996-bimc
- qcom,msm8996-cnoc
- qcom,msm8996-pnoc
@@ -191,6 +190,29 @@ allOf:
compatible:
contains:
enum:
+ - qcom,msm8996-a2noc
+
+ then:
+ properties:
+ clock-names:
+ items:
+ - const: bus
+ - const: bus_a
+ - const: aggre2_ufs_axi
+ - const: ufs_axi
+
+ clocks:
+ items:
+ - description: Bus Clock
+ - description: Bus A Clock
+ - description: Aggregate2 NoC UFS AXI Clock
+ - description: UFS AXI Clock
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
- qcom,sdm660-a2noc
then:
diff --git a/Documentation/devicetree/bindings/interconnect/qcom,sm8550-rpmh.yaml b/Documentation/devicetree/bindings/interconnect/qcom,sm8550-rpmh.yaml
new file mode 100644
index 000000000000..716bd21f6041
--- /dev/null
+++ b/Documentation/devicetree/bindings/interconnect/qcom,sm8550-rpmh.yaml
@@ -0,0 +1,139 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/interconnect/qcom,sm8550-rpmh.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm RPMh Network-On-Chip Interconnect on SM8550
+
+maintainers:
+ - Abel Vesa <abel.vesa@linaro.org>
+ - Neil Armstrong <neil.armstrong@linaro.org>
+
+description: |
+ RPMh interconnect providers support system bandwidth requirements through
+ RPMh hardware accelerators known as Bus Clock Manager (BCM). The provider is
+ able to communicate with the BCM through the Resource State Coordinator (RSC)
+ associated with each execution environment. Provider nodes must point to at
+ least one RPMh device child node pertaining to their RSC and each provider
+ can map to multiple RPMh resources.
+
+ See also:: include/dt-bindings/interconnect/qcom,sm8550-rpmh.h
+
+properties:
+ compatible:
+ enum:
+ - qcom,sm8550-aggre1-noc
+ - qcom,sm8550-aggre2-noc
+ - qcom,sm8550-clk-virt
+ - qcom,sm8550-cnoc-main
+ - qcom,sm8550-config-noc
+ - qcom,sm8550-gem-noc
+ - qcom,sm8550-lpass-ag-noc
+ - qcom,sm8550-lpass-lpiaon-noc
+ - qcom,sm8550-lpass-lpicx-noc
+ - qcom,sm8550-mc-virt
+ - qcom,sm8550-mmss-noc
+ - qcom,sm8550-nsp-noc
+ - qcom,sm8550-pcie-anoc
+ - qcom,sm8550-system-noc
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ minItems: 1
+ maxItems: 2
+
+allOf:
+ - $ref: qcom,rpmh-common.yaml#
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - qcom,sm8550-clk-virt
+ - qcom,sm8550-mc-virt
+ then:
+ properties:
+ reg: false
+ else:
+ required:
+ - reg
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - qcom,sm8550-pcie-anoc
+ then:
+ properties:
+ clocks:
+ items:
+ - description: aggre-NOC PCIe AXI clock
+ - description: cfg-NOC PCIe a-NOC AHB clock
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - qcom,sm8550-aggre1-noc
+ then:
+ properties:
+ clocks:
+ items:
+ - description: aggre UFS PHY AXI clock
+ - description: aggre USB3 PRIM AXI clock
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - qcom,sm8550-aggre2-noc
+ then:
+ properties:
+ clocks:
+ items:
+ - description: RPMH CC IPA clock
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - qcom,sm8550-aggre1-noc
+ - qcom,sm8550-aggre2-noc
+ - qcom,sm8550-pcie-anoc
+ then:
+ required:
+ - clocks
+ else:
+ properties:
+ clocks: false
+
+required:
+ - compatible
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/qcom,sm8550-gcc.h>
+
+ clk_virt: interconnect-0 {
+ compatible = "qcom,sm8550-clk-virt";
+ #interconnect-cells = <2>;
+ qcom,bcm-voters = <&apps_bcm_voter>;
+ };
+
+ aggre1_noc: interconnect@16e0000 {
+ compatible = "qcom,sm8550-aggre1-noc";
+ reg = <0x016e0000 0x14400>;
+ #interconnect-cells = <2>;
+ clocks = <&gcc GCC_AGGRE_UFS_PHY_AXI_CLK>,
+ <&gcc GCC_AGGRE_USB3_PRIM_AXI_CLK>;
+ qcom,bcm-voters = <&apps_bcm_voter>;
+ };
diff --git a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
index 9f7d3e11aacb..8449e14af9f3 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
+++ b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml
@@ -108,7 +108,7 @@ properties:
msi-controller:
description:
- Only present if the Message Based Interrupt functionnality is
+ Only present if the Message Based Interrupt functionality is
being exposed by the HW, and the mbi-ranges property present.
mbi-ranges:
diff --git a/Documentation/devicetree/bindings/interrupt-controller/samsung,s3c24xx-irq.txt b/Documentation/devicetree/bindings/interrupt-controller/samsung,s3c24xx-irq.txt
deleted file mode 100644
index c54c5a9a2a90..000000000000
--- a/Documentation/devicetree/bindings/interrupt-controller/samsung,s3c24xx-irq.txt
+++ /dev/null
@@ -1,53 +0,0 @@
-Samsung S3C24XX Interrupt Controllers
-
-The S3C24XX SoCs contain a custom set of interrupt controllers providing a
-varying number of interrupt sources. The set consists of a main- and sub-
-controller and on newer SoCs even a second main controller.
-
-Required properties:
-- compatible: Compatible property value should be "samsung,s3c2410-irq"
- for machines before s3c2416 and "samsung,s3c2416-irq" for s3c2416 and later.
-
-- reg: Physical base address of the controller and length of memory mapped
- region.
-
-- interrupt-controller : Identifies the node as an interrupt controller
-
-- #interrupt-cells : Specifies the number of cells needed to encode an
- interrupt source. The value shall be 4 and interrupt descriptor shall
- have the following format:
- <ctrl_num parent_irq ctrl_irq type>
-
- ctrl_num contains the controller to use:
- - 0 ... main controller
- - 1 ... sub controller
- - 2 ... second main controller on s3c2416 and s3c2450
- parent_irq contains the parent bit in the main controller and will be
- ignored in main controllers
- ctrl_irq contains the interrupt bit of the controller
- type contains the trigger type to use
-
-Example:
-
- interrupt-controller@4a000000 {
- compatible = "samsung,s3c2410-irq";
- reg = <0x4a000000 0x100>;
- interrupt-controller;
- #interrupt-cells=<4>;
- };
-
- [...]
-
- serial@50000000 {
- compatible = "samsung,s3c2410-uart";
- reg = <0x50000000 0x4000>;
- interrupt-parent = <&subintc>;
- interrupts = <1 28 0 4>, <1 28 1 4>;
- };
-
- rtc@57000000 {
- compatible = "samsung,s3c2410-rtc";
- reg = <0x57000000 0x100>;
- interrupt-parent = <&intc>;
- interrupts = <0 30 0 3>, <0 8 0 3>;
- };
diff --git a/Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml b/Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml
new file mode 100644
index 000000000000..ca9fc747bf4f
--- /dev/null
+++ b/Documentation/devicetree/bindings/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml
@@ -0,0 +1,45 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/memory-controllers/xlnx,zynqmp-ocmc-1.0.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Xilinx Zynqmp OCM(On-Chip Memory) Controller
+
+maintainers:
+ - Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
+ - Sai Krishna Potthuri <sai.krishna.potthuri@amd.com>
+
+description: |
+ The OCM supports 64-bit wide ECC functionality to detect multi-bit errors
+ and recover from a single-bit memory fault.On a write, if all bytes are
+ being written, the ECC is generated and written into the ECC RAM along with
+ the write-data that is written into the data RAM. If one or more bytes are
+ not written, then the read operation results in an correctable error or
+ uncorrectable error.
+
+properties:
+ compatible:
+ const: xlnx,zynqmp-ocmc-1.0
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+ - interrupts
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ memory-controller@ff960000 {
+ compatible = "xlnx,zynqmp-ocmc-1.0";
+ reg = <0xff960000 0x1000>;
+ interrupts = <GIC_SPI 10 IRQ_TYPE_LEVEL_HIGH>;
+ };
diff --git a/Documentation/devicetree/bindings/mfd/syscon.yaml b/Documentation/devicetree/bindings/mfd/syscon.yaml
index 1b01bd010431..b73ba1ea08f7 100644
--- a/Documentation/devicetree/bindings/mfd/syscon.yaml
+++ b/Documentation/devicetree/bindings/mfd/syscon.yaml
@@ -64,12 +64,6 @@ properties:
- rockchip,rk3568-qos
- rockchip,rk3588-qos
- rockchip,rv1126-qos
- - samsung,exynos3-sysreg
- - samsung,exynos4-sysreg
- - samsung,exynos5-sysreg
- - samsung,exynos5433-sysreg
- - samsung,exynos850-sysreg
- - samsung,exynosautov9-sysreg
- const: syscon
diff --git a/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt b/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt
deleted file mode 100644
index 5f68feb9f9d6..000000000000
--- a/Documentation/devicetree/bindings/mmc/samsung,s3cmci.txt
+++ /dev/null
@@ -1,42 +0,0 @@
-* Samsung's S3C24XX MMC/SD/SDIO controller device tree bindings
-
-Samsung's S3C24XX MMC/SD/SDIO controller is used as a connectivity interface
-with external MMC, SD and SDIO storage mediums.
-
-This file documents differences between the core mmc properties described by
-mmc.txt and the properties used by the Samsung S3C24XX MMC/SD/SDIO controller
-implementation.
-
-Required SoC Specific Properties:
-- compatible: should be one of the following
- - "samsung,s3c2410-sdi": for controllers compatible with s3c2410
- - "samsung,s3c2412-sdi": for controllers compatible with s3c2412
- - "samsung,s3c2440-sdi": for controllers compatible with s3c2440
-- reg: register location and length
-- interrupts: mmc controller interrupt
-- clocks: Should reference the controller clock
-- clock-names: Should contain "sdi"
-
-Required Board Specific Properties:
-- pinctrl-0: Should specify pin control groups used for this controller.
-- pinctrl-names: Should contain only one value - "default".
-
-Optional Properties:
-- bus-width: number of data lines (see mmc.txt)
-- cd-gpios: gpio for card detection (see mmc.txt)
-- wp-gpios: gpio for write protection (see mmc.txt)
-
-Example:
-
- mmc0: mmc@5a000000 {
- compatible = "samsung,s3c2440-sdi";
- pinctrl-names = "default";
- pinctrl-0 = <&sdi_pins>;
- reg = <0x5a000000 0x100000>;
- interrupts = <0 0 21 3>;
- clocks = <&clocks PCLK_SDI>;
- clock-names = "sdi";
- bus-width = <4>;
- cd-gpios = <&gpg 8 GPIO_ACTIVE_LOW>;
- wp-gpios = <&gph 8 GPIO_ACTIVE_LOW>;
- };
diff --git a/Documentation/devicetree/bindings/opp/opp-v2-kryo-cpu.yaml b/Documentation/devicetree/bindings/opp/opp-v2-kryo-cpu.yaml
index 60cf3cbde4c5..bbbad31ae4ca 100644
--- a/Documentation/devicetree/bindings/opp/opp-v2-kryo-cpu.yaml
+++ b/Documentation/devicetree/bindings/opp/opp-v2-kryo-cpu.yaml
@@ -50,12 +50,22 @@ patternProperties:
opp-supported-hw:
description: |
A single 32 bit bitmap value, representing compatible HW.
- Bitmap:
+ Bitmap for MSM8996 format:
0: MSM8996, speedbin 0
1: MSM8996, speedbin 1
2: MSM8996, speedbin 2
- 3-31: unused
- maximum: 0x7
+ 3: MSM8996, speedbin 3
+ 4-31: unused
+
+ Bitmap for MSM8996SG format (speedbin shifted of 4 left):
+ 0-3: unused
+ 4: MSM8996SG, speedbin 0
+ 5: MSM8996SG, speedbin 1
+ 6: MSM8996SG, speedbin 2
+ 7-31: unused
+ enum: [0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
+ 0x9, 0xd, 0xe, 0xf,
+ 0x10, 0x20, 0x30, 0x70]
clock-latency-ns: true
@@ -106,6 +116,7 @@ examples:
L2_0: l2-cache {
compatible = "cache";
cache-level = <2>;
+ cache-unified;
};
};
@@ -140,6 +151,7 @@ examples:
L2_1: l2-cache {
compatible = "cache";
cache-level = <2>;
+ cache-unified;
};
};
diff --git a/Documentation/devicetree/bindings/opp/opp-v2-qcom-level.yaml b/Documentation/devicetree/bindings/opp/opp-v2-qcom-level.yaml
index b9ce2e099ce9..a30ef93213c0 100644
--- a/Documentation/devicetree/bindings/opp/opp-v2-qcom-level.yaml
+++ b/Documentation/devicetree/bindings/opp/opp-v2-qcom-level.yaml
@@ -30,7 +30,9 @@ patternProperties:
this OPP node. Sometimes several corners/levels shares a certain fuse
corner/level. A fuse corner/level contains e.g. ref uV, min uV,
and max uV.
- $ref: /schemas/types.yaml#/definitions/uint32
+ $ref: /schemas/types.yaml#/definitions/uint32-array
+ minItems: 1
+ maxItems: 2
required:
- opp-level
diff --git a/Documentation/devicetree/bindings/phy/amlogic,meson-g12a-usb2-phy.yaml b/Documentation/devicetree/bindings/phy/amlogic,g12a-usb2-phy.yaml
index f3a5fbabbbb5..bb01c6b34dab 100644
--- a/Documentation/devicetree/bindings/phy/amlogic,meson-g12a-usb2-phy.yaml
+++ b/Documentation/devicetree/bindings/phy/amlogic,g12a-usb2-phy.yaml
@@ -2,7 +2,7 @@
# Copyright 2019 BayLibre, SAS
%YAML 1.2
---
-$id: "http://devicetree.org/schemas/phy/amlogic,meson-g12a-usb2-phy.yaml#"
+$id: "http://devicetree.org/schemas/phy/amlogic,g12a-usb2-phy.yaml#"
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
title: Amlogic G12A USB2 PHY
@@ -13,8 +13,8 @@ maintainers:
properties:
compatible:
enum:
- - amlogic,meson-g12a-usb2-phy
- - amlogic,meson-a1-usb2-phy
+ - amlogic,g12a-usb2-phy
+ - amlogic,a1-usb2-phy
reg:
maxItems: 1
@@ -68,7 +68,7 @@ additionalProperties: false
examples:
- |
phy@36000 {
- compatible = "amlogic,meson-g12a-usb2-phy";
+ compatible = "amlogic,g12a-usb2-phy";
reg = <0x36000 0x2000>;
clocks = <&xtal>;
clock-names = "xtal";
diff --git a/Documentation/devicetree/bindings/phy/amlogic,meson-g12a-usb3-pcie-phy.yaml b/Documentation/devicetree/bindings/phy/amlogic,g12a-usb3-pcie-phy.yaml
index 868b4e6fde71..129d26e99776 100644
--- a/Documentation/devicetree/bindings/phy/amlogic,meson-g12a-usb3-pcie-phy.yaml
+++ b/Documentation/devicetree/bindings/phy/amlogic,g12a-usb3-pcie-phy.yaml
@@ -2,7 +2,7 @@
# Copyright 2019 BayLibre, SAS
%YAML 1.2
---
-$id: "http://devicetree.org/schemas/phy/amlogic,meson-g12a-usb3-pcie-phy.yaml#"
+$id: "http://devicetree.org/schemas/phy/amlogic,g12a-usb3-pcie-phy.yaml#"
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
title: Amlogic G12A USB3 + PCIE Combo PHY
@@ -13,7 +13,7 @@ maintainers:
properties:
compatible:
enum:
- - amlogic,meson-g12a-usb3-pcie-phy
+ - amlogic,g12a-usb3-pcie-phy
reg:
maxItems: 1
@@ -49,7 +49,7 @@ additionalProperties: false
examples:
- |
phy@46000 {
- compatible = "amlogic,meson-g12a-usb3-pcie-phy";
+ compatible = "amlogic,g12a-usb3-pcie-phy";
reg = <0x46000 0x2000>;
clocks = <&ref_clk>;
clock-names = "ref_clk";
diff --git a/Documentation/devicetree/bindings/phy/qcom,usb-hs-28nm.yaml b/Documentation/devicetree/bindings/phy/qcom,usb-hs-28nm.yaml
index abcc4373f39e..ca6a0836b53c 100644
--- a/Documentation/devicetree/bindings/phy/qcom,usb-hs-28nm.yaml
+++ b/Documentation/devicetree/bindings/phy/qcom,usb-hs-28nm.yaml
@@ -16,7 +16,6 @@ properties:
compatible:
enum:
- qcom,usb-hs-28nm-femtophy
- - qcom,usb-hs-28nm-mdm9607
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/phy/phy-rockchip-inno-usb2.yaml b/Documentation/devicetree/bindings/phy/rockchip,inno-usb2phy.yaml
index f71920082fa3..0d6b8c28be07 100644
--- a/Documentation/devicetree/bindings/phy/phy-rockchip-inno-usb2.yaml
+++ b/Documentation/devicetree/bindings/phy/rockchip,inno-usb2phy.yaml
@@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
-$id: http://devicetree.org/schemas/phy/phy-rockchip-inno-usb2.yaml#
+$id: http://devicetree.org/schemas/phy/rockchip,inno-usb2phy.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Rockchip USB2.0 phy with inno IP block
diff --git a/Documentation/devicetree/bindings/phy/rockchip,rk3288-dp-phy.yaml b/Documentation/devicetree/bindings/phy/rockchip,rk3288-dp-phy.yaml
new file mode 100644
index 000000000000..2538235c5ac6
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/rockchip,rk3288-dp-phy.yaml
@@ -0,0 +1,41 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/rockchip,rk3288-dp-phy.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Rockchip specific extensions to the Analogix Display Port PHY
+
+maintainers:
+ - Heiko Stuebner <heiko@sntech.de>
+
+properties:
+ compatible:
+ const: rockchip,rk3288-dp-phy
+
+ clocks:
+ maxItems: 1
+
+ clock-names:
+ const: 24m
+
+ "#phy-cells":
+ const: 0
+
+required:
+ - compatible
+ - clocks
+ - clock-names
+ - "#phy-cells"
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/rk3288-cru.h>
+ edp-phy {
+ compatible = "rockchip,rk3288-dp-phy";
+ clocks = <&cru SCLK_EDP_24M>;
+ clock-names = "24m";
+ #phy-cells = <0>;
+ };
diff --git a/Documentation/devicetree/bindings/phy/rockchip-dp-phy.txt b/Documentation/devicetree/bindings/phy/rockchip-dp-phy.txt
deleted file mode 100644
index e3b4809fbe82..000000000000
--- a/Documentation/devicetree/bindings/phy/rockchip-dp-phy.txt
+++ /dev/null
@@ -1,26 +0,0 @@
-Rockchip specific extensions to the Analogix Display Port PHY
-------------------------------------
-
-Required properties:
-- compatible : should be one of the following supported values:
- - "rockchip.rk3288-dp-phy"
-- clocks: from common clock binding: handle to dp clock.
- of memory mapped region.
-- clock-names: from common clock binding:
- Required elements: "24m"
-- #phy-cells : from the generic PHY bindings, must be 0;
-
-Example:
-
-grf: syscon@ff770000 {
- compatible = "rockchip,rk3288-grf", "syscon", "simple-mfd";
-
-...
-
- edp_phy: edp-phy {
- compatible = "rockchip,rk3288-dp-phy";
- clocks = <&cru SCLK_EDP_24M>;
- clock-names = "24m";
- #phy-cells = <0>;
- };
-};
diff --git a/Documentation/devicetree/bindings/power/fsl,imx-gpc.yaml b/Documentation/devicetree/bindings/power/fsl,imx-gpc.yaml
index 777e1d852ddd..c21a66422d4f 100644
--- a/Documentation/devicetree/bindings/power/fsl,imx-gpc.yaml
+++ b/Documentation/devicetree/bindings/power/fsl,imx-gpc.yaml
@@ -23,11 +23,16 @@ description: |
properties:
compatible:
- enum:
- - fsl,imx6q-gpc
- - fsl,imx6qp-gpc
- - fsl,imx6sl-gpc
- - fsl,imx6sx-gpc
+ oneOf:
+ - enum:
+ - fsl,imx6q-gpc
+ - items:
+ - enum:
+ - fsl,imx6qp-gpc
+ - fsl,imx6sl-gpc
+ - fsl,imx6sx-gpc
+ - fsl,imx6ul-gpc
+ - const: fsl,imx6q-gpc
reg:
maxItems: 1
@@ -35,6 +40,10 @@ properties:
interrupts:
maxItems: 1
+ interrupt-controller: true
+ '#interrupt-cells':
+ const: 3
+
clocks:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/regulator/samsung,s2mps14.yaml b/Documentation/devicetree/bindings/regulator/samsung,s2mps14.yaml
index 01f9d4e236e9..a7feb497eb89 100644
--- a/Documentation/devicetree/bindings/regulator/samsung,s2mps14.yaml
+++ b/Documentation/devicetree/bindings/regulator/samsung,s2mps14.yaml
@@ -19,8 +19,8 @@ description: |
additional information and example.
patternProperties:
- # 25 LDOs
- "^LDO([1-9]|[1][0-9]|2[0-5])$":
+ # 25 LDOs, without LDO10-12
+ "^LDO([1-9]|1[3-9]|2[0-5])$":
type: object
$ref: regulator.yaml#
unevaluatedProperties: false
@@ -30,6 +30,23 @@ patternProperties:
required:
- regulator-name
+ "^LDO(1[0-2])$":
+ type: object
+ $ref: regulator.yaml#
+ unevaluatedProperties: false
+ description:
+ Properties for single LDO regulator.
+
+ properties:
+ samsung,ext-control-gpios:
+ maxItems: 1
+ description:
+ LDO10, LDO11 and LDO12 can be configured to external control over
+ GPIO.
+
+ required:
+ - regulator-name
+
# 5 bucks
"^BUCK[1-5]$":
type: object
diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
index c6720764e765..a2884e3113da 100644
--- a/Documentation/devicetree/bindings/riscv/cpus.yaml
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -83,7 +83,7 @@ properties:
insensitive, letters in the riscv,isa string must be all
lowercase to simplify parsing.
$ref: "/schemas/types.yaml#/definitions/string"
- pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
+ pattern: ^rv(?:64|32)imaf?d?q?c?b?k?j?p?v?h?(?:[hsxz](?:[a-z])+)?(?:_[hsxz](?:[a-z])+)*$
# RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
timebase-frequency: false
diff --git a/Documentation/devicetree/bindings/riscv/microchip.yaml b/Documentation/devicetree/bindings/riscv/microchip.yaml
index 714d0fcab399..4a29c890619a 100644
--- a/Documentation/devicetree/bindings/riscv/microchip.yaml
+++ b/Documentation/devicetree/bindings/riscv/microchip.yaml
@@ -27,6 +27,7 @@ properties:
- items:
- enum:
+ - aldec,tysom-m-mpfs250t-rev2
- aries,m100pfsevp
- microchip,mpfs-sev-kit
- sundance,polarberry
diff --git a/Documentation/devicetree/bindings/riscv/starfive.yaml b/Documentation/devicetree/bindings/riscv/starfive.yaml
index 5d3fcee52d59..cc4d92f0a1bf 100644
--- a/Documentation/devicetree/bindings/riscv/starfive.yaml
+++ b/Documentation/devicetree/bindings/riscv/starfive.yaml
@@ -24,6 +24,12 @@ properties:
- starfive,visionfive-v1
- const: starfive,jh7100
+ - items:
+ - enum:
+ - starfive,visionfive-2-v1.2a
+ - starfive,visionfive-2-v1.3b
+ - const: starfive,jh7110
+
additionalProperties: true
...
diff --git a/Documentation/devicetree/bindings/riscv/sunxi.yaml b/Documentation/devicetree/bindings/riscv/sunxi.yaml
new file mode 100644
index 000000000000..9edb5e5992b1
--- /dev/null
+++ b/Documentation/devicetree/bindings/riscv/sunxi.yaml
@@ -0,0 +1,69 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/riscv/sunxi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Allwinner RISC-V SoC-based boards
+
+maintainers:
+ - Chen-Yu Tsai <wens@csie.org>
+ - Jernej Skrabec <jernej.skrabec@gmail.com>
+ - Samuel Holland <samuel@sholland.org>
+
+description:
+ Allwinner RISC-V SoC-based boards
+
+properties:
+ $nodename:
+ const: '/'
+ compatible:
+ oneOf:
+ - description: Dongshan Nezha STU SoM
+ items:
+ - const: 100ask,dongshan-nezha-stu
+ - const: allwinner,sun20i-d1
+
+ - description: D1 Nezha board
+ items:
+ - const: allwinner,d1-nezha
+ - const: allwinner,sun20i-d1
+
+ - description: ClockworkPi R-01 SoM and v3.14 board
+ items:
+ - const: clockwork,r-01-clockworkpi-v3.14
+ - const: allwinner,sun20i-d1
+
+ - description: ClockworkPi R-01 SoM, v3.14 board, and DevTerm expansion
+ items:
+ - const: clockwork,r-01-devterm-v3.14
+ - const: clockwork,r-01-clockworkpi-v3.14
+ - const: allwinner,sun20i-d1
+
+ - description: Lichee RV SoM
+ items:
+ - const: sipeed,lichee-rv
+ - const: allwinner,sun20i-d1
+
+ - description: Carrier boards for the Lichee RV SoM
+ items:
+ - enum:
+ - sipeed,lichee-rv-86-panel-480p
+ - sipeed,lichee-rv-86-panel-720p
+ - sipeed,lichee-rv-dock
+ - const: sipeed,lichee-rv
+ - const: allwinner,sun20i-d1
+
+ - description: MangoPi MQ board
+ items:
+ - const: widora,mangopi-mq
+ - const: allwinner,sun20i-d1s
+
+ - description: MangoPi MQ Pro board
+ items:
+ - const: widora,mangopi-mq-pro
+ - const: allwinner,sun20i-d1
+
+additionalProperties: true
+
+...
diff --git a/Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml b/Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml
index 0a7aa29563c1..21c8ea08ff0a 100644
--- a/Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml
+++ b/Documentation/devicetree/bindings/rtc/qcom-pm8xxx-rtc.yaml
@@ -40,6 +40,8 @@ properties:
description:
Indicates that the setting of RTC time is allowed by the host CPU.
+ wakeup-source: true
+
required:
- compatible
- reg
diff --git a/Documentation/devicetree/bindings/soc/imx/fsl,imx-iomuxc-gpr.yaml b/Documentation/devicetree/bindings/soc/imx/fsl,imx-iomuxc-gpr.yaml
new file mode 100644
index 000000000000..1da1b758b4ae
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/imx/fsl,imx-iomuxc-gpr.yaml
@@ -0,0 +1,57 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/imx/fsl,imx-iomuxc-gpr.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale IOMUX Controller General Purpose Registers
+
+maintainers:
+ - Peng Fan <peng.fan@nxp.com>
+
+description:
+ i.MX Processors have an IOMUXC General Purpose Register group for
+ various System Settings
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - const: fsl,imx8mq-iomuxc-gpr
+ - const: syscon
+ - const: simple-mfd
+ - items:
+ - enum:
+ - fsl,imx8mm-iomuxc-gpr
+ - fsl,imx8mn-iomuxc-gpr
+ - fsl,imx8mp-iomuxc-gpr
+ - const: syscon
+
+ reg:
+ maxItems: 1
+
+ mux-controller:
+ type: object
+ $ref: /schemas/mux/reg-mux.yaml
+
+additionalProperties: false
+
+required:
+ - compatible
+ - reg
+
+examples:
+ # Pinmux controller node
+ - |
+ iomuxc_gpr: syscon@30340000 {
+ compatible = "fsl,imx8mq-iomuxc-gpr", "syscon", "simple-mfd";
+ reg = <0x30340000 0x10000>;
+
+ mux: mux-controller {
+ compatible = "mmio-mux";
+ #mux-control-cells = <1>;
+ mux-reg-masks = <0x34 0x00000004>; /* MIPI_MUX_SEL */
+ };
+ };
+
+...
diff --git a/Documentation/devicetree/bindings/soc/imx/fsl,imx8mp-hsio-blk-ctrl.yaml b/Documentation/devicetree/bindings/soc/imx/fsl,imx8mp-hsio-blk-ctrl.yaml
index c29181a9745b..1fe68b53b1d8 100644
--- a/Documentation/devicetree/bindings/soc/imx/fsl,imx8mp-hsio-blk-ctrl.yaml
+++ b/Documentation/devicetree/bindings/soc/imx/fsl,imx8mp-hsio-blk-ctrl.yaml
@@ -39,6 +39,9 @@ properties:
- const: pcie
- const: pcie-phy
+ '#clock-cells':
+ const: 0
+
clocks:
minItems: 2
maxItems: 2
@@ -85,4 +88,5 @@ examples:
power-domain-names = "bus", "usb", "usb-phy1",
"usb-phy2", "pcie", "pcie-phy";
#power-domain-cells = <1>;
+ #clock-cells = <0>;
};
diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,apr-services.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,apr-services.yaml
index 290555426c39..bdf482db32aa 100644
--- a/Documentation/devicetree/bindings/soc/qcom/qcom,apr-services.yaml
+++ b/Documentation/devicetree/bindings/soc/qcom/qcom,apr-services.yaml
@@ -39,8 +39,8 @@ properties:
qcom,protection-domain:
$ref: /schemas/types.yaml#/definitions/string-array
description: |
- Protection domain service name and path for APR service
- possible values are::
+ Protection domain service name and path for APR service (if supported).
+ Possible values are::
"avs/audio", "msm/adsp/audio_pd".
"kernel/elf_loader", "msm/modem/wlan_pd".
"tms/servreg", "msm/adsp/audio_pd".
@@ -49,6 +49,5 @@ properties:
required:
- reg
- - qcom,protection-domain
additionalProperties: true
diff --git a/Documentation/devicetree/bindings/soc/renesas/renesas,rzv2m-pwc.yaml b/Documentation/devicetree/bindings/soc/renesas/renesas,rzv2m-pwc.yaml
new file mode 100644
index 000000000000..12df33f58484
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/renesas/renesas,rzv2m-pwc.yaml
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/renesas/renesas,rzv2m-pwc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Renesas RZ/V2M External Power Sequence Controller (PWC)
+
+description: |+
+ The PWC IP found in the RZ/V2M family of chips comes with the below
+ capabilities
+ - external power supply on/off sequence generation
+ - on/off signal generation for the LPDDR4 core power supply (LPVDD)
+ - key input signals processing
+ - general-purpose output pins
+
+maintainers:
+ - Fabrizio Castro <fabrizio.castro.jz@renesas.com>
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - renesas,r9a09g011-pwc # RZ/V2M
+ - renesas,r9a09g055-pwc # RZ/V2MA
+ - const: renesas,rzv2m-pwc
+
+ reg:
+ maxItems: 1
+
+ gpio-controller: true
+
+ '#gpio-cells':
+ const: 2
+
+ renesas,rzv2m-pwc-power:
+ description: The PWC is used to control the system power supplies.
+ type: boolean
+
+required:
+ - compatible
+ - reg
+ - gpio-controller
+ - '#gpio-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ pwc: pwc@a3700000 {
+ compatible = "renesas,r9a09g011-pwc", "renesas,rzv2m-pwc";
+ reg = <0xa3700000 0x800>;
+ gpio-controller;
+ #gpio-cells = <2>;
+ renesas,rzv2m-pwc-power;
+ };
diff --git a/Documentation/devicetree/bindings/soc/rockchip/grf.yaml b/Documentation/devicetree/bindings/soc/rockchip/grf.yaml
index 2ed8cca79b59..e697c928900d 100644
--- a/Documentation/devicetree/bindings/soc/rockchip/grf.yaml
+++ b/Documentation/devicetree/bindings/soc/rockchip/grf.yaml
@@ -20,6 +20,11 @@ properties:
- rockchip,rk3568-pipe-grf
- rockchip,rk3568-pipe-phy-grf
- rockchip,rk3568-usb2phy-grf
+ - rockchip,rk3588-bigcore0-grf
+ - rockchip,rk3588-bigcore1-grf
+ - rockchip,rk3588-ioc
+ - rockchip,rk3588-php-grf
+ - rockchip,rk3588-sys-grf
- rockchip,rk3588-pcie3-phy-grf
- rockchip,rk3588-pcie3-pipe-grf
- rockchip,rv1108-usbgrf
@@ -92,8 +97,9 @@ allOf:
then:
properties:
edp-phy:
- description:
- Documentation/devicetree/bindings/phy/rockchip-dp-phy.txt
+ type: object
+ $ref: /schemas/phy/rockchip,rk3288-dp-phy.yaml#
+ unevaluatedProperties: false
- if:
properties:
@@ -200,7 +206,7 @@ allOf:
"usb2phy@[0-9a-f]+$":
type: object
- $ref: "/schemas/phy/phy-rockchip-inno-usb2.yaml#"
+ $ref: /schemas/phy/rockchip,inno-usb2phy.yaml#
unevaluatedProperties: false
diff --git a/Documentation/devicetree/bindings/soc/samsung/exynos-pmu.yaml b/Documentation/devicetree/bindings/soc/samsung/exynos-pmu.yaml
index 13bb8dfcefe6..f7c141dd11ec 100644
--- a/Documentation/devicetree/bindings/soc/samsung/exynos-pmu.yaml
+++ b/Documentation/devicetree/bindings/soc/samsung/exynos-pmu.yaml
@@ -31,20 +31,28 @@ select:
properties:
compatible:
- items:
- - enum:
- - samsung,exynos3250-pmu
- - samsung,exynos4210-pmu
- - samsung,exynos4412-pmu
- - samsung,exynos5250-pmu
- - samsung,exynos5260-pmu
- - samsung,exynos5410-pmu
- - samsung,exynos5420-pmu
- - samsung,exynos5433-pmu
- - samsung,exynos7-pmu
- - samsung,exynos850-pmu
- - samsung-s5pv210-pmu
- - const: syscon
+ oneOf:
+ - items:
+ - enum:
+ - samsung,exynos3250-pmu
+ - samsung,exynos4210-pmu
+ - samsung,exynos4412-pmu
+ - samsung,exynos5250-pmu
+ - samsung,exynos5260-pmu
+ - samsung,exynos5410-pmu
+ - samsung,exynos5420-pmu
+ - samsung,exynos5433-pmu
+ - samsung,exynos7-pmu
+ - samsung,exynos850-pmu
+ - samsung-s5pv210-pmu
+ - const: syscon
+ - items:
+ - enum:
+ - samsung,exynos5250-pmu
+ - samsung,exynos5420-pmu
+ - samsung,exynos5433-pmu
+ - const: simple-mfd
+ - const: syscon
reg:
maxItems: 1
@@ -64,6 +72,10 @@ properties:
minItems: 1
maxItems: 32
+ dp-phy:
+ $ref: /schemas/phy/samsung,dp-video-phy.yaml
+ unevaluatedProperties: false
+
interrupt-controller:
description:
Some PMUs are capable of behaving as an interrupt controller (mostly
@@ -74,6 +86,10 @@ properties:
Must be identical to the that of the parent interrupt controller.
const: 3
+ mipi-phy:
+ $ref: /schemas/phy/samsung,mipi-video-phy.yaml
+ unevaluatedProperties: false
+
reboot-mode:
$ref: /schemas/power/reset/syscon-reboot-mode.yaml
type: object
@@ -117,6 +133,23 @@ allOf:
- clock-names
- clocks
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - samsung,exynos5250-pmu
+ - samsung,exynos5420-pmu
+ - samsung,exynos5433-pmu
+ then:
+ properties:
+ dp-phy: true
+ mipi-phy: true
+ else:
+ properties:
+ dp-phy: false
+ mipi-phy: false
+
examples:
- |
#include <dt-bindings/clock/exynos5250.h>
@@ -130,4 +163,14 @@ examples:
#clock-cells = <1>;
clock-names = "clkout16";
clocks = <&clock CLK_FIN_PLL>;
+
+ dp-phy {
+ compatible = "samsung,exynos5250-dp-video-phy";
+ #phy-cells = <0>;
+ };
+
+ mipi-phy {
+ compatible = "samsung,s5pv210-mipi-video-phy";
+ #phy-cells = <1>;
+ };
};
diff --git a/Documentation/devicetree/bindings/soc/samsung/samsung,exynos-sysreg.yaml b/Documentation/devicetree/bindings/soc/samsung/samsung,exynos-sysreg.yaml
new file mode 100644
index 000000000000..163e912e9cad
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/samsung/samsung,exynos-sysreg.yaml
@@ -0,0 +1,87 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/samsung/samsung,exynos-sysreg.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Samsung Exynos SoC series System Registers (SYSREG)
+
+maintainers:
+ - Krzysztof Kozlowski <krzk@kernel.org>
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - enum:
+ - samsung,exynos3-sysreg
+ - samsung,exynos4-sysreg
+ - samsung,exynos5-sysreg
+ - tesla,fsd-cam-sysreg
+ - tesla,fsd-fsys0-sysreg
+ - tesla,fsd-fsys1-sysreg
+ - tesla,fsd-peric-sysreg
+ - const: syscon
+ - items:
+ - enum:
+ - samsung,exynos5433-cam0-sysreg
+ - samsung,exynos5433-cam1-sysreg
+ - samsung,exynos5433-disp-sysreg
+ - samsung,exynos5433-fsys-sysreg
+ - const: samsung,exynos5433-sysreg
+ - const: syscon
+ - items:
+ - enum:
+ - samsung,exynos5433-sysreg
+ - samsung,exynos850-sysreg
+ - samsung,exynosautov9-sysreg
+ - const: syscon
+ deprecated: true
+ - items:
+ - enum:
+ - samsung,exynos850-cmgp-sysreg
+ - samsung,exynos850-peri-sysreg
+ - const: samsung,exynos850-sysreg
+ - const: syscon
+ - items:
+ - enum:
+ - samsung,exynosautov9-fsys2-sysreg
+ - samsung,exynosautov9-peric0-sysreg
+ - samsung,exynosautov9-peric1-sysreg
+ - const: samsung,exynosautov9-sysreg
+ - const: syscon
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+
+allOf:
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - samsung,exynos850-cmgp-sysreg
+ - samsung,exynos850-peri-sysreg
+ - samsung,exynos850-sysreg
+ then:
+ required:
+ - clocks
+ else:
+ properties:
+ clocks: false
+
+additionalProperties: false
+
+examples:
+ - |
+ system-controller@10010000 {
+ compatible = "samsung,exynos4-sysreg", "syscon";
+ reg = <0x10010000 0x400>;
+ };
diff --git a/Documentation/devicetree/bindings/sound/everest,es8326.yaml b/Documentation/devicetree/bindings/sound/everest,es8326.yaml
index 07781408e788..07781408e788 100755..100644
--- a/Documentation/devicetree/bindings/sound/everest,es8326.yaml
+++ b/Documentation/devicetree/bindings/sound/everest,es8326.yaml
diff --git a/Documentation/devicetree/bindings/sound/mt8186-mt6366-rt1019-rt5682s.yaml b/Documentation/devicetree/bindings/sound/mt8186-mt6366-rt1019-rt5682s.yaml
index 9d3139990237..aa23b0024c46 100644
--- a/Documentation/devicetree/bindings/sound/mt8186-mt6366-rt1019-rt5682s.yaml
+++ b/Documentation/devicetree/bindings/sound/mt8186-mt6366-rt1019-rt5682s.yaml
@@ -16,6 +16,7 @@ properties:
compatible:
enum:
- mediatek,mt8186-mt6366-rt1019-rt5682s-sound
+ - mediatek,mt8186-mt6366-rt5682s-max98360-sound
mediatek,platform:
$ref: "/schemas/types.yaml#/definitions/phandle"
diff --git a/Documentation/devicetree/bindings/sound/nvidia,tegra210-ahub.yaml b/Documentation/devicetree/bindings/sound/nvidia,tegra210-ahub.yaml
index 89f7805de274..c4abac81f207 100644
--- a/Documentation/devicetree/bindings/sound/nvidia,tegra210-ahub.yaml
+++ b/Documentation/devicetree/bindings/sound/nvidia,tegra210-ahub.yaml
@@ -50,10 +50,10 @@ properties:
maxItems: 1
"#address-cells":
- const: 1
+ enum: [ 1, 2 ]
"#size-cells":
- const: 1
+ enum: [ 1, 2 ]
ranges: true
diff --git a/Documentation/devicetree/bindings/sound/nvidia,tegra210-ope.yaml b/Documentation/devicetree/bindings/sound/nvidia,tegra210-ope.yaml
index 5fc03b8771b1..9017fb6d575d 100644
--- a/Documentation/devicetree/bindings/sound/nvidia,tegra210-ope.yaml
+++ b/Documentation/devicetree/bindings/sound/nvidia,tegra210-ope.yaml
@@ -34,10 +34,10 @@ properties:
maxItems: 1
"#address-cells":
- const: 1
+ enum: [ 1, 2 ]
"#size-cells":
- const: 1
+ enum: [ 1, 2 ]
ranges: true
diff --git a/Documentation/devicetree/bindings/sound/qcom,lpass-tx-macro.yaml b/Documentation/devicetree/bindings/sound/qcom,lpass-tx-macro.yaml
index 66431aade3b7..da5f70910da5 100644
--- a/Documentation/devicetree/bindings/sound/qcom,lpass-tx-macro.yaml
+++ b/Documentation/devicetree/bindings/sound/qcom,lpass-tx-macro.yaml
@@ -30,7 +30,9 @@ properties:
const: 0
clocks:
- maxItems: 5
+ oneOf:
+ - maxItems: 3
+ - maxItems: 5
clock-names:
oneOf:
diff --git a/Documentation/devicetree/bindings/sound/qcom,lpass-wsa-macro.yaml b/Documentation/devicetree/bindings/sound/qcom,lpass-wsa-macro.yaml
index 2bf8d082f8f1..66cbb1f5e31a 100644
--- a/Documentation/devicetree/bindings/sound/qcom,lpass-wsa-macro.yaml
+++ b/Documentation/devicetree/bindings/sound/qcom,lpass-wsa-macro.yaml
@@ -9,9 +9,6 @@ title: LPASS(Low Power Audio Subsystem) VA Macro audio codec
maintainers:
- Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
-allOf:
- - $ref: dai-common.yaml#
-
properties:
compatible:
enum:
@@ -30,15 +27,12 @@ properties:
const: 0
clocks:
- maxItems: 5
+ minItems: 5
+ maxItems: 6
clock-names:
- items:
- - const: mclk
- - const: npl
- - const: macro
- - const: dcodec
- - const: fsgen
+ minItems: 5
+ maxItems: 6
clock-output-names:
maxItems: 1
@@ -55,10 +49,51 @@ required:
- reg
- "#sound-dai-cells"
+allOf:
+ - $ref: dai-common.yaml#
+
+ - if:
+ properties:
+ compatible:
+ enum:
+ - qcom,sc7280-lpass-wsa-macro
+ - qcom,sm8450-lpass-wsa-macro
+ - qcom,sc8280xp-lpass-wsa-macro
+ then:
+ properties:
+ clocks:
+ maxItems: 5
+ clock-names:
+ items:
+ - const: mclk
+ - const: npl
+ - const: macro
+ - const: dcodec
+ - const: fsgen
+
+ - if:
+ properties:
+ compatible:
+ enum:
+ - qcom,sm8250-lpass-wsa-macro
+ then:
+ properties:
+ clocks:
+ minItems: 6
+ clock-names:
+ items:
+ - const: mclk
+ - const: npl
+ - const: macro
+ - const: dcodec
+ - const: va
+ - const: fsgen
+
unevaluatedProperties: false
examples:
- |
+ #include <dt-bindings/clock/qcom,sm8250-lpass-aoncc.h>
#include <dt-bindings/sound/qcom,q6afe.h>
codec@3240000 {
compatible = "qcom,sm8250-lpass-wsa-macro";
@@ -69,7 +104,8 @@ examples:
<&audiocc 0>,
<&q6afecc LPASS_HW_MACRO_VOTE LPASS_CLK_ATTRIBUTE_COUPLE_NO>,
<&q6afecc LPASS_HW_DCODEC_VOTE LPASS_CLK_ATTRIBUTE_COUPLE_NO>,
+ <&aoncc LPASS_CDC_VA_MCLK>,
<&vamacro>;
- clock-names = "mclk", "npl", "macro", "dcodec", "fsgen";
+ clock-names = "mclk", "npl", "macro", "dcodec", "va", "fsgen";
clock-output-names = "mclk";
};
diff --git a/Documentation/devicetree/bindings/spi/atmel,at91rm9200-spi.yaml b/Documentation/devicetree/bindings/spi/atmel,at91rm9200-spi.yaml
index 4dd973e341e6..6c57dd6c3a36 100644
--- a/Documentation/devicetree/bindings/spi/atmel,at91rm9200-spi.yaml
+++ b/Documentation/devicetree/bindings/spi/atmel,at91rm9200-spi.yaml
@@ -8,7 +8,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: Atmel SPI device
maintainers:
- - Tudor Ambarus <tudor.ambarus@microchip.com>
+ - Tudor Ambarus <tudor.ambarus@linaro.org>
allOf:
- $ref: spi-controller.yaml#
diff --git a/Documentation/devicetree/bindings/spi/atmel,quadspi.yaml b/Documentation/devicetree/bindings/spi/atmel,quadspi.yaml
index 1d493add4053..b0d99bc10535 100644
--- a/Documentation/devicetree/bindings/spi/atmel,quadspi.yaml
+++ b/Documentation/devicetree/bindings/spi/atmel,quadspi.yaml
@@ -7,7 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: Atmel Quad Serial Peripheral Interface (QSPI)
maintainers:
- - Tudor Ambarus <tudor.ambarus@microchip.com>
+ - Tudor Ambarus <tudor.ambarus@linaro.org>
allOf:
- $ref: spi-controller.yaml#
diff --git a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
index ead2cccf658f..9a60c0664bbe 100644
--- a/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
+++ b/Documentation/devicetree/bindings/spi/spi-peripheral-props.yaml
@@ -44,9 +44,9 @@ properties:
description:
Maximum SPI clocking speed of the device in Hz.
- spi-cs-setup-ns:
+ spi-cs-setup-delay-ns:
description:
- Delay in nanosecods to be introduced by the controller after CS is
+ Delay in nanoseconds to be introduced by the controller after CS is
asserted.
spi-rx-bus-width:
diff --git a/Documentation/devicetree/bindings/thermal/mediatek,lvts-thermal.yaml b/Documentation/devicetree/bindings/thermal/mediatek,lvts-thermal.yaml
new file mode 100644
index 000000000000..fe9ae4c425c0
--- /dev/null
+++ b/Documentation/devicetree/bindings/thermal/mediatek,lvts-thermal.yaml
@@ -0,0 +1,142 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/thermal/mediatek,lvts-thermal.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: MediaTek SoC Low Voltage Thermal Sensor (LVTS)
+
+maintainers:
+ - Balsam CHIHI <bchihi@baylibre.com>
+
+description: |
+ LVTS is a thermal management architecture composed of three subsystems,
+ a Sensing device - Thermal Sensing Micro Circuit Unit (TSMCU),
+ a Converter - Low Voltage Thermal Sensor converter (LVTS), and
+ a Digital controller (LVTS_CTRL).
+
+properties:
+ compatible:
+ enum:
+ - mediatek,mt8192-lvts-ap
+ - mediatek,mt8192-lvts-mcu
+ - mediatek,mt8195-lvts-ap
+ - mediatek,mt8195-lvts-mcu
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ maxItems: 1
+
+ resets:
+ maxItems: 1
+ description: LVTS reset for clearing temporary data on AP/MCU.
+
+ nvmem-cells:
+ minItems: 1
+ items:
+ - description: Calibration eFuse data 1 for LVTS
+ - description: Calibration eFuse data 2 for LVTS
+
+ nvmem-cell-names:
+ minItems: 1
+ items:
+ - const: lvts-calib-data-1
+ - const: lvts-calib-data-2
+
+ "#thermal-sensor-cells":
+ const: 1
+
+allOf:
+ - $ref: thermal-sensor.yaml#
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - mediatek,mt8192-lvts-ap
+ - mediatek,mt8192-lvts-mcu
+ then:
+ properties:
+ nvmem-cells:
+ maxItems: 1
+
+ nvmem-cell-names:
+ maxItems: 1
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - mediatek,mt8195-lvts-ap
+ - mediatek,mt8195-lvts-mcu
+ then:
+ properties:
+ nvmem-cells:
+ minItems: 2
+
+ nvmem-cell-names:
+ minItems: 2
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - resets
+ - nvmem-cells
+ - nvmem-cell-names
+ - "#thermal-sensor-cells"
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/clock/mt8195-clk.h>
+ #include <dt-bindings/reset/mt8195-resets.h>
+ #include <dt-bindings/thermal/mediatek,lvts-thermal.h>
+
+ soc {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ lvts_mcu: thermal-sensor@11278000 {
+ compatible = "mediatek,mt8195-lvts-mcu";
+ reg = <0 0x11278000 0 0x1000>;
+ interrupts = <GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH 0>;
+ clocks = <&infracfg_ao CLK_INFRA_AO_THERM>;
+ resets = <&infracfg_ao MT8195_INFRA_RST4_THERM_CTRL_MCU_SWRST>;
+ nvmem-cells = <&lvts_efuse_data1 &lvts_efuse_data2>;
+ nvmem-cell-names = "lvts-calib-data-1", "lvts-calib-data-2";
+ #thermal-sensor-cells = <1>;
+ };
+ };
+
+ thermal_zones: thermal-zones {
+ cpu0-thermal {
+ polling-delay = <1000>;
+ polling-delay-passive = <250>;
+ thermal-sensors = <&lvts_mcu MT8195_MCU_LITTLE_CPU0>;
+
+ trips {
+ cpu0_alert: trip-alert {
+ temperature = <85000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+
+ cpu0_crit: trip-crit {
+ temperature = <100000>;
+ hysteresis = <2000>;
+ type = "critical";
+ };
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/thermal/qcom-spmi-adc-tm5.yaml b/Documentation/devicetree/bindings/thermal/qcom-spmi-adc-tm5.yaml
index d20569b9b763..52ec18cf1eda 100644
--- a/Documentation/devicetree/bindings/thermal/qcom-spmi-adc-tm5.yaml
+++ b/Documentation/devicetree/bindings/thermal/qcom-spmi-adc-tm5.yaml
@@ -13,6 +13,7 @@ properties:
enum:
- qcom,spmi-adc-tm5
- qcom,spmi-adc-tm5-gen2
+ - qcom,adc-tm7 # Incomplete / subject to change
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml
index 0231f187b097..926e9c51c93c 100644
--- a/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml
+++ b/Documentation/devicetree/bindings/thermal/qcom-tsens.yaml
@@ -37,6 +37,7 @@ properties:
- description: v1 of TSENS
items:
- enum:
+ - qcom,msm8956-tsens
- qcom,msm8976-tsens
- qcom,qcs404-tsens
- const: qcom,tsens-v1
@@ -80,18 +81,120 @@ properties:
maxItems: 2
nvmem-cells:
- minItems: 1
- maxItems: 2
- description:
- Reference to an nvmem node for the calibration data
+ oneOf:
+ - minItems: 1
+ maxItems: 2
+ description:
+ Reference to an nvmem node for the calibration data
+ - minItems: 5
+ maxItems: 35
+ description: |
+ Reference to nvmem cells for the calibration mode, two calibration
+ bases and two cells per each sensor
+ # special case for msm8974 / apq8084
+ - maxItems: 51
+ description: |
+ Reference to nvmem cells for the calibration mode, two calibration
+ bases and two cells per each sensor, main and backup copies, plus use_backup cell
nvmem-cell-names:
- minItems: 1
- items:
- - const: calib
- - enum:
- - calib_backup
- - calib_sel
+ oneOf:
+ - minItems: 1
+ items:
+ - const: calib
+ - enum:
+ - calib_backup
+ - calib_sel
+ - minItems: 5
+ items:
+ - const: mode
+ - const: base1
+ - const: base2
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ - pattern: '^s[0-9]+_p1$'
+ - pattern: '^s[0-9]+_p2$'
+ # special case for msm8974 / apq8084
+ - items:
+ - const: mode
+ - const: base1
+ - const: base2
+ - const: use_backup
+ - const: mode_backup
+ - const: base1_backup
+ - const: base2_backup
+ - const: s0_p1
+ - const: s0_p2
+ - const: s1_p1
+ - const: s1_p2
+ - const: s2_p1
+ - const: s2_p2
+ - const: s3_p1
+ - const: s3_p2
+ - const: s4_p1
+ - const: s4_p2
+ - const: s5_p1
+ - const: s5_p2
+ - const: s6_p1
+ - const: s6_p2
+ - const: s7_p1
+ - const: s7_p2
+ - const: s8_p1
+ - const: s8_p2
+ - const: s9_p1
+ - const: s9_p2
+ - const: s10_p1
+ - const: s10_p2
+ - const: s0_p1_backup
+ - const: s0_p2_backup
+ - const: s1_p1_backup
+ - const: s1_p2_backup
+ - const: s2_p1_backup
+ - const: s2_p2_backup
+ - const: s3_p1_backup
+ - const: s3_p2_backup
+ - const: s4_p1_backup
+ - const: s4_p2_backup
+ - const: s5_p1_backup
+ - const: s5_p2_backup
+ - const: s6_p1_backup
+ - const: s6_p2_backup
+ - const: s7_p1_backup
+ - const: s7_p2_backup
+ - const: s8_p1_backup
+ - const: s8_p2_backup
+ - const: s9_p1_backup
+ - const: s9_p2_backup
+ - const: s10_p1_backup
+ - const: s10_p2_backup
"#qcom,sensors":
description:
@@ -222,6 +325,36 @@ examples:
- |
#include <dt-bindings/interrupt-controller/arm-gic.h>
+ // Example 1 (new calbiration data: for pre v1 IP):
+ thermal-sensor@900000 {
+ compatible = "qcom,msm8916-tsens", "qcom,tsens-v0_1";
+ reg = <0x4a9000 0x1000>, /* TM */
+ <0x4a8000 0x1000>; /* SROT */
+
+ nvmem-cells = <&tsens_mode>,
+ <&tsens_base1>, <&tsens_base2>,
+ <&tsens_s0_p1>, <&tsens_s0_p2>,
+ <&tsens_s1_p1>, <&tsens_s1_p2>,
+ <&tsens_s2_p1>, <&tsens_s2_p2>,
+ <&tsens_s4_p1>, <&tsens_s4_p2>,
+ <&tsens_s5_p1>, <&tsens_s5_p2>;
+ nvmem-cell-names = "mode",
+ "base1", "base2",
+ "s0_p1", "s0_p2",
+ "s1_p1", "s1_p2",
+ "s2_p1", "s2_p2",
+ "s4_p1", "s4_p2",
+ "s5_p1", "s5_p2";
+
+ interrupts = <GIC_SPI 184 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "uplow";
+
+ #qcom,sensors = <5>;
+ #thermal-sensor-cells = <1>;
+ };
+
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
// Example 1 (legacy: for pre v1 IP):
tsens1: thermal-sensor@900000 {
compatible = "qcom,msm8916-tsens", "qcom,tsens-v0_1";
diff --git a/Documentation/devicetree/bindings/thermal/rcar-gen3-thermal.yaml b/Documentation/devicetree/bindings/thermal/rcar-gen3-thermal.yaml
index 0f05f5c886c5..ecf276fd155c 100644
--- a/Documentation/devicetree/bindings/thermal/rcar-gen3-thermal.yaml
+++ b/Documentation/devicetree/bindings/thermal/rcar-gen3-thermal.yaml
@@ -28,6 +28,7 @@ properties:
- renesas,r8a77980-thermal # R-Car V3H
- renesas,r8a779a0-thermal # R-Car V3U
- renesas,r8a779f0-thermal # R-Car S4-8
+ - renesas,r8a779g0-thermal # R-Car V4H
reg: true
@@ -80,6 +81,7 @@ else:
- description: TSC1 registers
- description: TSC2 registers
- description: TSC3 registers
+ - description: TSC4 registers
if:
not:
properties:
@@ -87,6 +89,7 @@ else:
contains:
enum:
- renesas,r8a779f0-thermal
+ - renesas,r8a779g0-thermal
then:
required:
- interrupts
diff --git a/Documentation/devicetree/bindings/timer/mediatek,mtk-timer.txt b/Documentation/devicetree/bindings/timer/mediatek,mtk-timer.txt
index 8bbb6e94508b..b3e797e8aa31 100644
--- a/Documentation/devicetree/bindings/timer/mediatek,mtk-timer.txt
+++ b/Documentation/devicetree/bindings/timer/mediatek,mtk-timer.txt
@@ -33,6 +33,7 @@ Required properties:
For those SoCs that use CPUX
* "mediatek,mt6795-systimer" for MT6795 compatible timers (CPUX)
+ * "mediatek,mt8365-systimer" for MT8365 compatible timers (CPUX)
- reg: Should contain location and length for timer register.
- clocks: Should contain system clock.
diff --git a/Documentation/devicetree/bindings/timer/riscv,timer.yaml b/Documentation/devicetree/bindings/timer/riscv,timer.yaml
new file mode 100644
index 000000000000..38d67e1a5a79
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/riscv,timer.yaml
@@ -0,0 +1,52 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/timer/riscv,timer.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: RISC-V timer
+
+maintainers:
+ - Anup Patel <anup@brainfault.org>
+
+description: |+
+ RISC-V platforms always have a RISC-V timer device for the supervisor-mode
+ based on the time CSR defined by the RISC-V privileged specification. The
+ timer interrupts of this device are configured using the RISC-V SBI Time
+ extension or the RISC-V Sstc extension.
+
+ The clock frequency of RISC-V timer device is specified via the
+ "timebase-frequency" DT property of "/cpus" DT node which is described
+ in Documentation/devicetree/bindings/riscv/cpus.yaml
+
+properties:
+ compatible:
+ enum:
+ - riscv,timer
+
+ interrupts-extended:
+ minItems: 1
+ maxItems: 4096 # Should be enough?
+
+ riscv,timer-cannot-wake-cpu:
+ type: boolean
+ description:
+ If present, the timer interrupt cannot wake up the CPU from one or
+ more suspend/idle states.
+
+additionalProperties: false
+
+required:
+ - compatible
+ - interrupts-extended
+
+examples:
+ - |
+ timer {
+ compatible = "riscv,timer";
+ interrupts-extended = <&cpu1intc 5>,
+ <&cpu2intc 5>,
+ <&cpu3intc 5>,
+ <&cpu4intc 5>;
+ };
+...
diff --git a/Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml b/Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml
index b61ed1a431bb..65e59836a660 100644
--- a/Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml
+++ b/Documentation/devicetree/bindings/timer/rockchip,rk-timer.yaml
@@ -17,6 +17,7 @@ properties:
- items:
- enum:
- rockchip,rv1108-timer
+ - rockchip,rv1126-timer
- rockchip,rk3036-timer
- rockchip,rk3128-timer
- rockchip,rk3188-timer
diff --git a/Documentation/devicetree/bindings/timer/sifive,clint.yaml b/Documentation/devicetree/bindings/timer/sifive,clint.yaml
index bbad24165837..aada6957216c 100644
--- a/Documentation/devicetree/bindings/timer/sifive,clint.yaml
+++ b/Documentation/devicetree/bindings/timer/sifive,clint.yaml
@@ -20,6 +20,10 @@ description:
property of "/cpus" DT node. The "timebase-frequency" DT property is
described in Documentation/devicetree/bindings/riscv/cpus.yaml
+ T-Head C906/C910 CPU cores include an implementation of CLINT too, however
+ their implementation lacks a memory-mapped MTIME register, thus not
+ compatible with SiFive ones.
+
properties:
compatible:
oneOf:
@@ -30,6 +34,10 @@ properties:
- canaan,k210-clint
- const: sifive,clint0
- items:
+ - enum:
+ - allwinner,sun20i-d1-clint
+ - const: thead,c900-clint
+ - items:
- const: sifive,clint0
- const: riscv,clint0
deprecated: true
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml b/Documentation/devicetree/bindings/vendor-prefixes.yaml
index 70ffb3780621..852910c0ce30 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
+++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
@@ -69,6 +69,8 @@ patternProperties:
description: Annapurna Labs
"^alcatel,.*":
description: Alcatel
+ "^aldec,.*":
+ description: Aldec, Inc.
"^alfa-network,.*":
description: ALFA Network Inc.
"^allegro,.*":
@@ -264,6 +266,8 @@ patternProperties:
description: Cirrus Logic, Inc.
"^cisco,.*":
description: Cisco Systems, Inc.
+ "^clockwork,.*":
+ description: Clockwork Tech LLC
"^cloos,.*":
description: Carl Cloos Schweisstechnik GmbH.
"^cloudengines,.*":
@@ -374,6 +378,8 @@ patternProperties:
description: EBV Elektronik
"^eckelmann,.*":
description: Eckelmann AG
+ "^edgeble,.*":
+ description: Edgeble AI Technologies Pvt. Ltd.
"^edimax,.*":
description: EDIMAX Technology Co., Ltd
"^edt,.*":
@@ -396,6 +402,8 @@ patternProperties:
description: Elimo Engineering Ltd.
"^elpida,.*":
description: Elpida Memory, Inc.
+ "^embedfire,.*":
+ description: Dongguan EmbedFire Electronic Technology Co., Ltd.
"^embest,.*":
description: Shenzhen Embest Technology Co., Ltd.
"^emlid,.*":
@@ -512,6 +520,8 @@ patternProperties:
description: Shenzhen Huiding Technology Co., Ltd.
"^google,.*":
description: Google, Inc.
+ "^gplus,.*":
+ description: GPLUS
"^grinn,.*":
description: Grinn
"^grmn,.*":
@@ -1025,6 +1035,8 @@ patternProperties:
description: PocketBook International SA
"^polaroid,.*":
description: Polaroid Corporation
+ "^polyhex,.*":
+ description: Polyhex Technology Co. Ltd.
"^portwell,.*":
description: Portwell Inc.
"^poslab,.*":
@@ -1246,6 +1258,8 @@ patternProperties:
description: Starry Electronic Technology (ShenZhen) Co., LTD
"^startek,.*":
description: Startek
+ "^starterkit,.*":
+ description: Starterkit
"^ste,.*":
description: ST-Ericsson
deprecated: true
@@ -1319,6 +1333,8 @@ patternProperties:
description: thingy.jp
"^thundercomm,.*":
description: Thundercomm Technology Co., Ltd.
+ "^thwc,.*":
+ description: Shenzhen Tong Heng Wei Chuang Technology Co., Ltd.
"^ti,.*":
description: Texas Instruments
"^tianma,.*":
@@ -1372,6 +1388,8 @@ patternProperties:
description: uCRobotics
"^udoo,.*":
description: Udoo
+ "^ufispace,.*":
+ description: Ufi Space Co., Ltd.
"^ugoos,.*":
description: Ugoos Industrial Co., Ltd.
"^uniwest,.*":
@@ -1444,6 +1462,8 @@ patternProperties:
description: Shenzhen whwave Electronics, Inc.
"^wi2wi,.*":
description: Wi2Wi, Inc.
+ "^widora,.*":
+ description: Beijing Widora Technology Co., Ltd.
"^wiligear,.*":
description: Wiligear, Ltd.
"^willsemi,.*":
diff --git a/Documentation/driver-api/media/drivers/davinci-vpbe-devel.rst b/Documentation/driver-api/media/drivers/davinci-vpbe-devel.rst
deleted file mode 100644
index 4e87bdbc7ae4..000000000000
--- a/Documentation/driver-api/media/drivers/davinci-vpbe-devel.rst
+++ /dev/null
@@ -1,39 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-The VPBE V4L2 driver design
-===========================
-
-File partitioning
------------------
-
- V4L2 display device driver
- drivers/media/platform/ti/davinci/vpbe_display.c
- drivers/media/platform/ti/davinci/vpbe_display.h
-
- VPBE display controller
- drivers/media/platform/ti/davinci/vpbe.c
- drivers/media/platform/ti/davinci/vpbe.h
-
- VPBE venc sub device driver
- drivers/media/platform/ti/davinci/vpbe_venc.c
- drivers/media/platform/ti/davinci/vpbe_venc.h
- drivers/media/platform/ti/davinci/vpbe_venc_regs.h
-
- VPBE osd driver
- drivers/media/platform/ti/davinci/vpbe_osd.c
- drivers/media/platform/ti/davinci/vpbe_osd.h
- drivers/media/platform/ti/davinci/vpbe_osd_regs.h
-
-To be done
-----------
-
-vpbe display controller
- - Add support for external encoders.
- - add support for selecting external encoder as default at probe time.
-
-vpbe venc sub device
- - add timings for supporting ths8200
- - add support for LogicPD LCD.
-
-FB drivers
- - Add support for fbdev drivers.- Ready and part of subsequent patches.
diff --git a/Documentation/driver-api/media/drivers/index.rst b/Documentation/driver-api/media/drivers/index.rst
index 32406490557c..3c17d48f83c0 100644
--- a/Documentation/driver-api/media/drivers/index.rst
+++ b/Documentation/driver-api/media/drivers/index.rst
@@ -16,7 +16,6 @@ Video4Linux (V4L) drivers
cpia2_devel
cx2341x-devel
cx88-devel
- davinci-vpbe-devel
fimc-devel
pvrusb2
pxa_camera
diff --git a/Documentation/driver-api/thermal/index.rst b/Documentation/driver-api/thermal/index.rst
index 030306ffa408..a886028014ab 100644
--- a/Documentation/driver-api/thermal/index.rst
+++ b/Documentation/driver-api/thermal/index.rst
@@ -14,7 +14,6 @@ Thermal
exynos_thermal
exynos_thermal_emulation
- intel_powerclamp
nouveau_thermal
x86_pkg_temperature_thermal
intel_dptf
diff --git a/Documentation/driver-api/thermal/intel_dptf.rst b/Documentation/driver-api/thermal/intel_dptf.rst
index 372bdb4d04c6..f5c193cccbda 100644
--- a/Documentation/driver-api/thermal/intel_dptf.rst
+++ b/Documentation/driver-api/thermal/intel_dptf.rst
@@ -84,6 +84,9 @@ DPTF ACPI Drivers interface
https:/github.com/intel/thermal_daemon for decoding
thermal table.
+``production_mode`` (RO)
+ When different from zero, manufacturer locked thermal configuration
+ from further changes.
ACPI Thermal Relationship table interface
------------------------------------------
diff --git a/Documentation/features/seccomp/seccomp-filter/arch-support.txt b/Documentation/features/seccomp/seccomp-filter/arch-support.txt
index dc71bf7b1a7e..3a7237b989cd 100644
--- a/Documentation/features/seccomp/seccomp-filter/arch-support.txt
+++ b/Documentation/features/seccomp/seccomp-filter/arch-support.txt
@@ -14,7 +14,7 @@
| hexagon: | TODO |
| ia64: | TODO |
| loongarch: | ok |
- | m68k: | TODO |
+ | m68k: | ok |
| microblaze: | TODO |
| mips: | ok |
| nios2: | TODO |
diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
index 067fd1670b1f..a43aacf1494e 100644
--- a/Documentation/filesystems/erofs.rst
+++ b/Documentation/filesystems/erofs.rst
@@ -120,6 +120,8 @@ dax={always,never} Use direct access (no page cache). See
dax A legacy option which is an alias for ``dax=always``.
device=%s Specify a path to an extra device to be used together.
fsid=%s Specify a filesystem image ID for Fscache back-end.
+domain_id=%s Specify a domain ID in fscache mode so that different images
+ with the same blobs under a given domain ID can share storage.
=================== =========================================================
Sysfs Entries
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index ef183387da20..eccd327e6df5 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -1277,8 +1277,8 @@ the file contents themselves, as described below:
For the read path (->read_folio()) of regular files, filesystems can
read the ciphertext into the page cache and decrypt it in-place. The
-page lock must be held until decryption has finished, to prevent the
-page from becoming visible to userspace prematurely.
+folio lock must be held until decryption has finished, to prevent the
+folio from becoming visible to userspace prematurely.
For the write path (->writepage()) of regular files, filesystems
cannot encrypt data in-place in the page cache, since the cached
diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index cb8e7573882a..ede672dedf11 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -118,10 +118,11 @@ as follows:
- ``hash_algorithm`` must be the identifier for the hash algorithm to
use for the Merkle tree, such as FS_VERITY_HASH_ALG_SHA256. See
``include/uapi/linux/fsverity.h`` for the list of possible values.
-- ``block_size`` must be the Merkle tree block size. Currently, this
- must be equal to the system page size, which is usually 4096 bytes.
- Other sizes may be supported in the future. This value is not
- necessarily the same as the filesystem block size.
+- ``block_size`` is the Merkle tree block size, in bytes. In Linux
+ v6.3 and later, this can be any power of 2 between (inclusively)
+ 1024 and the minimum of the system page size and the filesystem
+ block size. In earlier versions, the page size was the only allowed
+ value.
- ``salt_size`` is the size of the salt in bytes, or 0 if no salt is
provided. The salt is a value that is prepended to every hashed
block; it can be used to personalize the hashing for a particular
@@ -161,6 +162,7 @@ FS_IOC_ENABLE_VERITY can fail with the following errors:
- ``EBUSY``: this ioctl is already running on the file
- ``EEXIST``: the file already has verity enabled
- ``EFAULT``: the caller provided inaccessible memory
+- ``EFBIG``: the file is too large to enable verity on
- ``EINTR``: the operation was interrupted by a fatal signal
- ``EINVAL``: unsupported version, hash algorithm, or block size; or
reserved bits are set; or the file descriptor refers to neither a
@@ -495,9 +497,11 @@ To create verity files on an ext4 filesystem, the filesystem must have
been formatted with ``-O verity`` or had ``tune2fs -O verity`` run on
it. "verity" is an RO_COMPAT filesystem feature, so once set, old
kernels will only be able to mount the filesystem readonly, and old
-versions of e2fsck will be unable to check the filesystem. Moreover,
-currently ext4 only supports mounting a filesystem with the "verity"
-feature when its block size is equal to PAGE_SIZE (often 4096 bytes).
+versions of e2fsck will be unable to check the filesystem.
+
+Originally, an ext4 filesystem with the "verity" feature could only be
+mounted when its block size was equal to the system page size
+(typically 4096 bytes). In Linux v6.3, this limitation was removed.
ext4 sets the EXT4_VERITY_FL on-disk inode flag on verity files. It
can only be set by `FS_IOC_ENABLE_VERITY`_, and it cannot be cleared.
@@ -518,9 +522,7 @@ support paging multi-gigabyte xattrs into memory, and to support
encrypting xattrs. Note that the verity metadata *must* be encrypted
when the file is, since it contains hashes of the plaintext data.
-Currently, ext4 verity only supports the case where the Merkle tree
-block size, filesystem block size, and page size are all the same. It
-also only supports extent-based files.
+ext4 only allows verity on extent-based files.
f2fs
----
@@ -538,11 +540,10 @@ Like ext4, f2fs stores the verity metadata (Merkle tree and
fsverity_descriptor) past the end of the file, starting at the first
64K boundary beyond i_size. See explanation for ext4 above.
Moreover, f2fs supports at most 4096 bytes of xattr entries per inode
-which wouldn't be enough for even a single Merkle tree block.
+which usually wouldn't be enough for even a single Merkle tree block.
-Currently, f2fs verity only supports a Merkle tree block size of 4096.
-Also, f2fs doesn't support enabling verity on files that currently
-have atomic or volatile writes pending.
+f2fs doesn't support enabling verity on files that currently have
+atomic or volatile writes pending.
btrfs
-----
@@ -567,51 +568,48 @@ Pagecache
~~~~~~~~~
For filesystems using Linux's pagecache, the ``->read_folio()`` and
-``->readahead()`` methods must be modified to verify pages before they
-are marked Uptodate. Merely hooking ``->read_iter()`` would be
+``->readahead()`` methods must be modified to verify folios before
+they are marked Uptodate. Merely hooking ``->read_iter()`` would be
insufficient, since ``->read_iter()`` is not used for memory maps.
-Therefore, fs/verity/ provides a function fsverity_verify_page() which
-verifies a page that has been read into the pagecache of a verity
-inode, but is still locked and not Uptodate, so it's not yet readable
-by userspace. As needed to do the verification,
-fsverity_verify_page() will call back into the filesystem to read
-Merkle tree pages via fsverity_operations::read_merkle_tree_page().
+Therefore, fs/verity/ provides the function fsverity_verify_blocks()
+which verifies data that has been read into the pagecache of a verity
+inode. The containing folio must still be locked and not Uptodate, so
+it's not yet readable by userspace. As needed to do the verification,
+fsverity_verify_blocks() will call back into the filesystem to read
+hash blocks via fsverity_operations::read_merkle_tree_page().
-fsverity_verify_page() returns false if verification failed; in this
-case, the filesystem must not set the page Uptodate. Following this,
+fsverity_verify_blocks() returns false if verification failed; in this
+case, the filesystem must not set the folio Uptodate. Following this,
as per the usual Linux pagecache behavior, attempts by userspace to
-read() from the part of the file containing the page will fail with
-EIO, and accesses to the page within a memory map will raise SIGBUS.
-
-fsverity_verify_page() currently only supports the case where the
-Merkle tree block size is equal to PAGE_SIZE (often 4096 bytes).
+read() from the part of the file containing the folio will fail with
+EIO, and accesses to the folio within a memory map will raise SIGBUS.
-In principle, fsverity_verify_page() verifies the entire path in the
-Merkle tree from the data page to the root hash. However, for
-efficiency the filesystem may cache the hash pages. Therefore,
-fsverity_verify_page() only ascends the tree reading hash pages until
-an already-verified hash page is seen, as indicated by the PageChecked
-bit being set. It then verifies the path to that page.
+In principle, verifying a data block requires verifying the entire
+path in the Merkle tree from the data block to the root hash.
+However, for efficiency the filesystem may cache the hash blocks.
+Therefore, fsverity_verify_blocks() only ascends the tree reading hash
+blocks until an already-verified hash block is seen. It then verifies
+the path to that block.
This optimization, which is also used by dm-verity, results in
excellent sequential read performance. This is because usually (e.g.
-127 in 128 times for 4K blocks and SHA-256) the hash page from the
+127 in 128 times for 4K blocks and SHA-256) the hash block from the
bottom level of the tree will already be cached and checked from
-reading a previous data page. However, random reads perform worse.
+reading a previous data block. However, random reads perform worse.
Block device based filesystems
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Block device based filesystems (e.g. ext4 and f2fs) in Linux also use
the pagecache, so the above subsection applies too. However, they
-also usually read many pages from a file at once, grouped into a
+also usually read many data blocks from a file at once, grouped into a
structure called a "bio". To make it easier for these types of
filesystems to support fs-verity, fs/verity/ also provides a function
-fsverity_verify_bio() which verifies all pages in a bio.
+fsverity_verify_bio() which verifies all data blocks in a bio.
ext4 and f2fs also support encryption. If a verity file is also
-encrypted, the pages must be decrypted before being verified. To
+encrypted, the data must be decrypted before being verified. To
support this, these filesystems allocate a "post-read context" for
each bio and store it in ``->bi_private``::
@@ -626,14 +624,14 @@ each bio and store it in ``->bi_private``::
verity, or both is enabled. After the bio completes, for each needed
postprocessing step the filesystem enqueues the bio_post_read_ctx on a
workqueue, and then the workqueue work does the decryption or
-verification. Finally, pages where no decryption or verity error
-occurred are marked Uptodate, and the pages are unlocked.
+verification. Finally, folios where no decryption or verity error
+occurred are marked Uptodate, and the folios are unlocked.
On many filesystems, files can contain holes. Normally,
-``->readahead()`` simply zeroes holes and sets the corresponding pages
-Uptodate; no bios are issued. To prevent this case from bypassing
-fs-verity, these filesystems use fsverity_verify_page() to verify hole
-pages.
+``->readahead()`` simply zeroes hole blocks and considers the
+corresponding data to be up-to-date; no bios are issued. To prevent
+this case from bypassing fs-verity, filesystems use
+fsverity_verify_blocks() to verify hole blocks.
Filesystems also disable direct I/O on verity files, since otherwise
direct I/O would bypass fs-verity.
@@ -644,7 +642,7 @@ Userspace utility
This document focuses on the kernel, but a userspace utility for
fs-verity can be found at:
- https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/fsverity-utils.git
+ https://git.kernel.org/pub/scm/fs/fsverity/fsverity-utils.git
See the README.md file in the fsverity-utils source tree for details,
including examples of setting up fs-verity protected files.
@@ -793,9 +791,9 @@ weren't already directly answered in other parts of this document.
:A: There are many reasons why this is not possible or would be very
difficult, including the following:
- - To prevent bypassing verification, pages must not be marked
+ - To prevent bypassing verification, folios must not be marked
Uptodate until they've been verified. Currently, each
- filesystem is responsible for marking pages Uptodate via
+ filesystem is responsible for marking folios Uptodate via
``->readahead()``. Therefore, currently it's not possible for
the VFS to do the verification on its own. Changing this would
require significant changes to the VFS and all filesystems.
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 36fa2a83d714..7de7a7272a5e 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -56,35 +56,35 @@ inode_operations
prototypes::
- int (*create) (struct inode *,struct dentry *,umode_t, bool);
+ int (*create) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t, bool);
struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
int (*link) (struct dentry *,struct inode *,struct dentry *);
int (*unlink) (struct inode *,struct dentry *);
- int (*symlink) (struct inode *,struct dentry *,const char *);
- int (*mkdir) (struct inode *,struct dentry *,umode_t);
+ int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *);
+ int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
int (*rmdir) (struct inode *,struct dentry *);
- int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t);
- int (*rename) (struct inode *, struct dentry *,
+ int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t);
+ int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *,
struct inode *, struct dentry *, unsigned int);
int (*readlink) (struct dentry *, char __user *,int);
const char *(*get_link) (struct dentry *, struct inode *, struct delayed_call *);
void (*truncate) (struct inode *);
- int (*permission) (struct inode *, int, unsigned int);
+ int (*permission) (struct mnt_idmap *, struct inode *, int, unsigned int);
struct posix_acl * (*get_inode_acl)(struct inode *, int, bool);
- int (*setattr) (struct dentry *, struct iattr *);
- int (*getattr) (const struct path *, struct kstat *, u32, unsigned int);
+ int (*setattr) (struct mnt_idmap *, struct dentry *, struct iattr *);
+ int (*getattr) (struct mnt_idmap *, const struct path *, struct kstat *, u32, unsigned int);
ssize_t (*listxattr) (struct dentry *, char *, size_t);
int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start, u64 len);
void (*update_time)(struct inode *, struct timespec *, int);
int (*atomic_open)(struct inode *, struct dentry *,
struct file *, unsigned open_flag,
umode_t create_mode);
- int (*tmpfile) (struct user_namespace *, struct inode *,
+ int (*tmpfile) (struct mnt_idmap *, struct inode *,
struct file *, umode_t);
- int (*fileattr_set)(struct user_namespace *mnt_userns,
+ int (*fileattr_set)(struct mnt_idmap *idmap,
struct dentry *dentry, struct fileattr *fa);
int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);
- struct posix_acl * (*get_acl)(struct user_namespace *, struct dentry *, int);
+ struct posix_acl * (*get_acl)(struct mnt_idmap *, struct dentry *, int);
locking rules:
all may block
@@ -135,7 +135,7 @@ prototypes::
struct inode *inode, const char *name, void *buffer,
size_t size);
int (*set)(const struct xattr_handler *handler,
- struct user_namespace *mnt_userns,
+ struct mnt_idmap *idmap,
struct dentry *dentry, struct inode *inode, const char *name,
const void *buffer, size_t size, int flags);
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 2c15e7053113..c53f30251a66 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -421,31 +421,31 @@ As of kernel 2.6.22, the following members are defined:
.. code-block:: c
struct inode_operations {
- int (*create) (struct user_namespace *, struct inode *,struct dentry *, umode_t, bool);
+ int (*create) (struct mnt_idmap *, struct inode *,struct dentry *, umode_t, bool);
struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
int (*link) (struct dentry *,struct inode *,struct dentry *);
int (*unlink) (struct inode *,struct dentry *);
- int (*symlink) (struct user_namespace *, struct inode *,struct dentry *,const char *);
- int (*mkdir) (struct user_namespace *, struct inode *,struct dentry *,umode_t);
+ int (*symlink) (struct mnt_idmap *, struct inode *,struct dentry *,const char *);
+ int (*mkdir) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t);
int (*rmdir) (struct inode *,struct dentry *);
- int (*mknod) (struct user_namespace *, struct inode *,struct dentry *,umode_t,dev_t);
- int (*rename) (struct user_namespace *, struct inode *, struct dentry *,
+ int (*mknod) (struct mnt_idmap *, struct inode *,struct dentry *,umode_t,dev_t);
+ int (*rename) (struct mnt_idmap *, struct inode *, struct dentry *,
struct inode *, struct dentry *, unsigned int);
int (*readlink) (struct dentry *, char __user *,int);
const char *(*get_link) (struct dentry *, struct inode *,
struct delayed_call *);
- int (*permission) (struct user_namespace *, struct inode *, int);
+ int (*permission) (struct mnt_idmap *, struct inode *, int);
struct posix_acl * (*get_inode_acl)(struct inode *, int, bool);
- int (*setattr) (struct user_namespace *, struct dentry *, struct iattr *);
- int (*getattr) (struct user_namespace *, const struct path *, struct kstat *, u32, unsigned int);
+ int (*setattr) (struct mnt_idmap *, struct dentry *, struct iattr *);
+ int (*getattr) (struct mnt_idmap *, const struct path *, struct kstat *, u32, unsigned int);
ssize_t (*listxattr) (struct dentry *, char *, size_t);
void (*update_time)(struct inode *, struct timespec *, int);
int (*atomic_open)(struct inode *, struct dentry *, struct file *,
unsigned open_flag, umode_t create_mode);
- int (*tmpfile) (struct user_namespace *, struct inode *, struct file *, umode_t);
- struct posix_acl * (*get_acl)(struct user_namespace *, struct dentry *, int);
- int (*set_acl)(struct user_namespace *, struct dentry *, struct posix_acl *, int);
- int (*fileattr_set)(struct user_namespace *mnt_userns,
+ int (*tmpfile) (struct mnt_idmap *, struct inode *, struct file *, umode_t);
+ struct posix_acl * (*get_acl)(struct mnt_idmap *, struct dentry *, int);
+ int (*set_acl)(struct mnt_idmap *, struct dentry *, struct posix_acl *, int);
+ int (*fileattr_set)(struct mnt_idmap *idmap,
struct dentry *dentry, struct fileattr *fa);
int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);
};
diff --git a/Documentation/firmware-guide/acpi/acpi-lid.rst b/Documentation/firmware-guide/acpi/acpi-lid.rst
index 71b9af13a048..03cbad6c6730 100644
--- a/Documentation/firmware-guide/acpi/acpi-lid.rst
+++ b/Documentation/firmware-guide/acpi/acpi-lid.rst
@@ -34,7 +34,7 @@ state upon the last _LID evaluation. There won't be difference when the
_LID control method is evaluated during the runtime, the problem is its
initial returning value. When the AML tables implement this control method
with cached value, the initial returning value is likely not reliable.
-There are platforms always retun "closed" as initial lid state.
+There are platforms always return "closed" as initial lid state.
Restrictions of the lid state change notifications
==================================================
diff --git a/Documentation/firmware-guide/acpi/gpio-properties.rst b/Documentation/firmware-guide/acpi/gpio-properties.rst
index eaec732cc77c..db0c0b1f3700 100644
--- a/Documentation/firmware-guide/acpi/gpio-properties.rst
+++ b/Documentation/firmware-guide/acpi/gpio-properties.rst
@@ -67,17 +67,30 @@ state of the output pin which driver should use during its initialization.
Linux tries to use common sense here and derives the state from the bias
and polarity settings. The table below shows the expectations:
-========= ============= ==============
-Pull Bias Polarity Requested...
-========= ============= ==============
-Implicit x AS IS (assumed firmware configured for us)
-Explicit x (no _DSD) as Pull Bias (Up == High, Down == Low),
- assuming non-active (Polarity = !Pull Bias)
-Down Low as low, assuming active
-Down High as low, assuming non-active
-Up Low as high, assuming non-active
-Up High as high, assuming active
-========= ============= ==============
++-------------+-------------+-----------------------------------------------+
+| Pull Bias | Polarity | Requested... |
++=============+=============+===============================================+
+| Implicit |
++-------------+-------------+-----------------------------------------------+
+| **Default** | x | AS IS (assumed firmware configured it for us) |
++-------------+-------------+-----------------------------------------------+
+| Explicit |
++-------------+-------------+-----------------------------------------------+
+| **None** | x | AS IS (assumed firmware configured it for us) |
+| | | with no Pull Bias |
++-------------+-------------+-----------------------------------------------+
+| **Up** | x (no _DSD) | |
+| +-------------+ as high, assuming non-active |
+| | Low | |
+| +-------------+-----------------------------------------------+
+| | High | as high, assuming active |
++-------------+-------------+-----------------------------------------------+
+| **Down** | x (no _DSD) | |
+| +-------------+ as low, assuming non-active |
+| | High | |
+| +-------------+-----------------------------------------------+
+| | Low | as low, assuming active |
++-------------+-------------+-----------------------------------------------+
That said, for our above example the both GPIOs, since the bias setting
is explicit and _DSD is present, will be treated as active with a high
diff --git a/Documentation/firmware-guide/acpi/namespace.rst b/Documentation/firmware-guide/acpi/namespace.rst
index 6193582a2204..4ef963679a3d 100644
--- a/Documentation/firmware-guide/acpi/namespace.rst
+++ b/Documentation/firmware-guide/acpi/namespace.rst
@@ -31,7 +31,7 @@ Description Table). The XSDT always points to the FADT (Fixed ACPI
Description Table) using its first entry, the data within the FADT
includes various fixed-length entries that describe fixed ACPI features
of the hardware. The FADT contains a pointer to the DSDT
-(Differentiated System Descripition Table). The XSDT also contains
+(Differentiated System Description Table). The XSDT also contains
entries pointing to possibly multiple SSDTs (Secondary System
Description Table).
diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst
index 6b7368d1f516..38bc74eaa547 100644
--- a/Documentation/kbuild/makefiles.rst
+++ b/Documentation/kbuild/makefiles.rst
@@ -1042,7 +1042,7 @@ $(clean-files).
When executing "make clean", the file "crc32table.h" will be deleted.
Kbuild will assume files to be in the same relative directory as the
-Makefile, except if prefixed with $(objtree).
+Makefile.
To exclude certain files or directories from make clean, use the
$(no-clean-files) variable.
diff --git a/Documentation/kernel-hacking/locking.rst b/Documentation/kernel-hacking/locking.rst
index c756786e17ae..dff0646a717b 100644
--- a/Documentation/kernel-hacking/locking.rst
+++ b/Documentation/kernel-hacking/locking.rst
@@ -1277,11 +1277,11 @@ Manfred Spraul points out that you can still do this, even if the data
is very occasionally accessed in user context or softirqs/tasklets. The
irq handler doesn't use a lock, and all other accesses are done as so::
- spin_lock(&lock);
+ mutex_lock(&lock);
disable_irq(irq);
...
enable_irq(irq);
- spin_unlock(&lock);
+ mutex_unlock(&lock);
The disable_irq() prevents the irq handler from running
(and waits for it to finish if it's currently running on other CPUs).
diff --git a/Documentation/networking/bridge.rst b/Documentation/networking/bridge.rst
index 4aef9cddde2f..c859f3c1636e 100644
--- a/Documentation/networking/bridge.rst
+++ b/Documentation/networking/bridge.rst
@@ -8,7 +8,7 @@ In order to use the Ethernet bridging functionality, you'll need the
userspace tools.
Documentation for Linux bridging is on:
- http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
+ https://wiki.linuxfoundation.org/networking/bridge
The bridge-utilities are maintained at:
git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/bridge-utils.git
diff --git a/Documentation/networking/device_drivers/ethernet/intel/ice.rst b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
index dc2e60ced927..b481b81f3be5 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/ice.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
@@ -819,7 +819,7 @@ NAPI
----
This driver supports NAPI (Rx polling mode).
For more information on NAPI, see
-https://www.linuxfoundation.org/collaborate/workgroups/networking/napi
+https://wiki.linuxfoundation.org/networking/napi
MACVLAN
diff --git a/Documentation/networking/device_drivers/ethernet/wangxun/txgbe.rst b/Documentation/networking/device_drivers/ethernet/wangxun/txgbe.rst
index eaa87dbe8848..d052ef40fe36 100644
--- a/Documentation/networking/device_drivers/ethernet/wangxun/txgbe.rst
+++ b/Documentation/networking/device_drivers/ethernet/wangxun/txgbe.rst
@@ -16,5 +16,5 @@ Contents
Support
=======
-If you got any problem, contact Wangxun support team via support@trustnetic.com
+If you got any problem, contact Wangxun support team via nic-support@net-swift.com
and Cc: netdev.
diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst
index 49db1d11d7c4..8b1045c3b59e 100644
--- a/Documentation/networking/nf_conntrack-sysctl.rst
+++ b/Documentation/networking/nf_conntrack-sysctl.rst
@@ -173,7 +173,9 @@ nf_conntrack_sctp_timeout_cookie_echoed - INTEGER (seconds)
default 3
nf_conntrack_sctp_timeout_established - INTEGER (seconds)
- default 432000 (5 days)
+ default 210
+
+ Default is set to (hb_interval * path_max_retrans + rto_max)
nf_conntrack_sctp_timeout_shutdown_sent - INTEGER (seconds)
default 0.3
@@ -190,12 +192,6 @@ nf_conntrack_sctp_timeout_heartbeat_sent - INTEGER (seconds)
This timeout is used to setup conntrack entry on secondary paths.
Default is set to hb_interval.
-nf_conntrack_sctp_timeout_heartbeat_acked - INTEGER (seconds)
- default 210
-
- This timeout is used to setup conntrack entry on secondary paths.
- Default is set to (hb_interval * path_max_retrans + rto_max)
-
nf_conntrack_udp_timeout - INTEGER (seconds)
default 30
diff --git a/Documentation/networking/rxrpc.rst b/Documentation/networking/rxrpc.rst
index 39494a6ea739..e1af54424192 100644
--- a/Documentation/networking/rxrpc.rst
+++ b/Documentation/networking/rxrpc.rst
@@ -880,8 +880,8 @@ The kernel interface functions are as follows:
notify_end_rx can be NULL or it can be used to specify a function to be
called when the call changes state to end the Tx phase. This function is
- called with the call-state spinlock held to prevent any reply or final ACK
- from being delivered first.
+ called with a spinlock held to prevent the last DATA packet from being
+ transmitted until the function returns.
(#) Receive data from a call::
diff --git a/Documentation/power/power_supply_class.rst b/Documentation/power/power_supply_class.rst
index c04fabee0a58..da8e275a14ff 100644
--- a/Documentation/power/power_supply_class.rst
+++ b/Documentation/power/power_supply_class.rst
@@ -40,8 +40,8 @@ kind of power supply, and can process/present them to a user in consistent
manner. Results for different power supplies and machines are also directly
comparable.
-See drivers/power/supply/ds2760_battery.c and drivers/power/supply/pda_power.c
-for the example how to declare and handle attributes.
+See drivers/power/supply/ds2760_battery.c for the example how to declare
+and handle attributes.
Units
diff --git a/Documentation/power/suspend-and-interrupts.rst b/Documentation/power/suspend-and-interrupts.rst
index 4cda6617709a..dfbace2f4600 100644
--- a/Documentation/power/suspend-and-interrupts.rst
+++ b/Documentation/power/suspend-and-interrupts.rst
@@ -67,7 +67,7 @@ That may involve turning on a special signal handling logic within the platform
during system sleep so as to trigger a system wakeup when needed. For example,
the platform may include a dedicated interrupt controller used specifically for
handling system wakeup events. Then, if a given interrupt line is supposed to
-wake up the system from sleep sates, the corresponding input of that interrupt
+wake up the system from sleep states, the corresponding input of that interrupt
controller needs to be enabled to receive signals from the line in question.
After wakeup, it generally is better to disable that input to prevent the
dedicated controller from triggering interrupts unnecessarily.
diff --git a/Documentation/s390/pci.rst b/Documentation/s390/pci.rst
index 8157f0cddbc2..a1a72a47dc96 100644
--- a/Documentation/s390/pci.rst
+++ b/Documentation/s390/pci.rst
@@ -51,7 +51,7 @@ Entries specific to zPCI functions and entries that hold zPCI information.
The slot entries are set up using the function identifier (FID) of the
PCI function. The format depicted as XXXXXXXX above is 8 hexadecimal digits
- with 0 padding and lower case hexadecimal digitis.
+ with 0 padding and lower case hexadecimal digits.
- /sys/bus/pci/slots/XXXXXXXX/power
@@ -66,7 +66,7 @@ Entries specific to zPCI functions and entries that hold zPCI information.
- function_handle
Low-level identifier used for a configured PCI function.
- It might be useful for debuging.
+ It might be useful for debugging.
- pchid
Model-dependent location of the I/O adapter.
diff --git a/Documentation/s390/vfio-ccw.rst b/Documentation/s390/vfio-ccw.rst
index ea928a3806f4..37026fa18179 100644
--- a/Documentation/s390/vfio-ccw.rst
+++ b/Documentation/s390/vfio-ccw.rst
@@ -176,7 +176,7 @@ The process of how these work together.
Use the 'mdev_create' sysfs file, we need to manually create one (and
only one for our case) mediated device.
3. vfio_mdev.ko drives the mediated ccw device.
- vfio_mdev is also the vfio device drvier. It will probe the mdev and
+ vfio_mdev is also the vfio device driver. It will probe the mdev and
add it to an iommu_group and a vfio_group. Then we could pass through
the mdev to a guest.
@@ -219,8 +219,8 @@ values may occur:
The operation was successful.
``-EOPNOTSUPP``
- The orb specified transport mode or an unidentified IDAW format, or the
- scsw specified a function other than the start function.
+ The ORB specified transport mode or the
+ SCSW specified a function other than the start function.
``-EIO``
A request was issued while the device was not in a state ready to accept
diff --git a/Documentation/scheduler/index.rst b/Documentation/scheduler/index.rst
index b430d856056a..f12d0d06de3a 100644
--- a/Documentation/scheduler/index.rst
+++ b/Documentation/scheduler/index.rst
@@ -15,6 +15,7 @@ Linux Scheduler
sched-capacity
sched-energy
schedutil
+ sched-util-clamp
sched-nice-design
sched-rt-group
sched-stats
diff --git a/Documentation/scheduler/sched-util-clamp.rst b/Documentation/scheduler/sched-util-clamp.rst
new file mode 100644
index 000000000000..74d5b7c6431d
--- /dev/null
+++ b/Documentation/scheduler/sched-util-clamp.rst
@@ -0,0 +1,741 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Utilization Clamping
+====================
+
+1. Introduction
+===============
+
+Utilization clamping, also known as util clamp or uclamp, is a scheduler
+feature that allows user space to help in managing the performance requirement
+of tasks. It was introduced in v5.3 release. The CGroup support was merged in
+v5.4.
+
+Uclamp is a hinting mechanism that allows the scheduler to understand the
+performance requirements and restrictions of the tasks, thus it helps the
+scheduler to make a better decision. And when schedutil cpufreq governor is
+used, util clamp will influence the CPU frequency selection as well.
+
+Since the scheduler and schedutil are both driven by PELT (util_avg) signals,
+util clamp acts on that to achieve its goal by clamping the signal to a certain
+point; hence the name. That is, by clamping utilization we are making the
+system run at a certain performance point.
+
+The right way to view util clamp is as a mechanism to make request or hint on
+performance constraints. It consists of two tunables:
+
+ * UCLAMP_MIN, which sets the lower bound.
+ * UCLAMP_MAX, which sets the upper bound.
+
+These two bounds will ensure a task will operate within this performance range
+of the system. UCLAMP_MIN implies boosting a task, while UCLAMP_MAX implies
+capping a task.
+
+One can tell the system (scheduler) that some tasks require a minimum
+performance point to operate at to deliver the desired user experience. Or one
+can tell the system that some tasks should be restricted from consuming too
+much resources and should not go above a specific performance point. Viewing
+the uclamp values as performance points rather than utilization is a better
+abstraction from user space point of view.
+
+As an example, a game can use util clamp to form a feedback loop with its
+perceived Frames Per Second (FPS). It can dynamically increase the minimum
+performance point required by its display pipeline to ensure no frame is
+dropped. It can also dynamically 'prime' up these tasks if it knows in the
+coming few hundred milliseconds a computationally intensive scene is about to
+happen.
+
+On mobile hardware where the capability of the devices varies a lot, this
+dynamic feedback loop offers a great flexibility to ensure best user experience
+given the capabilities of any system.
+
+Of course a static configuration is possible too. The exact usage will depend
+on the system, application and the desired outcome.
+
+Another example is in Android where tasks are classified as background,
+foreground, top-app, etc. Util clamp can be used to constrain how much
+resources background tasks are consuming by capping the performance point they
+can run at. This constraint helps reserve resources for important tasks, like
+the ones belonging to the currently active app (top-app group). Beside this
+helps in limiting how much power they consume. This can be more obvious in
+heterogeneous systems (e.g. Arm big.LITTLE); the constraint will help bias the
+background tasks to stay on the little cores which will ensure that:
+
+ 1. The big cores are free to run top-app tasks immediately. top-app
+ tasks are the tasks the user is currently interacting with, hence
+ the most important tasks in the system.
+ 2. They don't run on a power hungry core and drain battery even if they
+ are CPU intensive tasks.
+
+.. note::
+ **little cores**:
+ CPUs with capacity < 1024
+
+ **big cores**:
+ CPUs with capacity = 1024
+
+By making these uclamp performance requests, or rather hints, user space can
+ensure system resources are used optimally to deliver the best possible user
+experience.
+
+Another use case is to help with **overcoming the ramp up latency inherit in
+how scheduler utilization signal is calculated**.
+
+On the other hand, a busy task for instance that requires to run at maximum
+performance point will suffer a delay of ~200ms (PELT HALFIFE = 32ms) for the
+scheduler to realize that. This is known to affect workloads like gaming on
+mobile devices where frames will drop due to slow response time to select the
+higher frequency required for the tasks to finish their work in time. Setting
+UCLAMP_MIN=1024 will ensure such tasks will always see the highest performance
+level when they start running.
+
+The overall visible effect goes beyond better perceived user
+experience/performance and stretches to help achieve a better overall
+performance/watt if used effectively.
+
+User space can form a feedback loop with the thermal subsystem too to ensure
+the device doesn't heat up to the point where it will throttle.
+
+Both SCHED_NORMAL/OTHER and SCHED_FIFO/RR honour uclamp requests/hints.
+
+In the SCHED_FIFO/RR case, uclamp gives the option to run RT tasks at any
+performance point rather than being tied to MAX frequency all the time. Which
+can be useful on general purpose systems that run on battery powered devices.
+
+Note that by design RT tasks don't have per-task PELT signal and must always
+run at a constant frequency to combat undeterministic DVFS rampup delays.
+
+Note that using schedutil always implies a single delay to modify the frequency
+when an RT task wakes up. This cost is unchanged by using uclamp. Uclamp only
+helps picking what frequency to request instead of schedutil always requesting
+MAX for all RT tasks.
+
+See :ref:`section 3.4 <uclamp-default-values>` for default values and
+:ref:`3.4.1 <sched-util-clamp-min-rt-default>` on how to change RT tasks
+default value.
+
+2. Design
+=========
+
+Util clamp is a property of every task in the system. It sets the boundaries of
+its utilization signal; acting as a bias mechanism that influences certain
+decisions within the scheduler.
+
+The actual utilization signal of a task is never clamped in reality. If you
+inspect PELT signals at any point of time you should continue to see them as
+they are intact. Clamping happens only when needed, e.g: when a task wakes up
+and the scheduler needs to select a suitable CPU for it to run on.
+
+Since the goal of util clamp is to allow requesting a minimum and maximum
+performance point for a task to run on, it must be able to influence the
+frequency selection as well as task placement to be most effective. Both of
+which have implications on the utilization value at CPU runqueue (rq for short)
+level, which brings us to the main design challenge.
+
+When a task wakes up on an rq, the utilization signal of the rq will be
+affected by the uclamp settings of all the tasks enqueued on it. For example if
+a task requests to run at UTIL_MIN = 512, then the util signal of the rq needs
+to respect to this request as well as all other requests from all of the
+enqueued tasks.
+
+To be able to aggregate the util clamp value of all the tasks attached to the
+rq, uclamp must do some housekeeping at every enqueue/dequeue, which is the
+scheduler hot path. Hence care must be taken since any slow down will have
+significant impact on a lot of use cases and could hinder its usability in
+practice.
+
+The way this is handled is by dividing the utilization range into buckets
+(struct uclamp_bucket) which allows us to reduce the search space from every
+task on the rq to only a subset of tasks on the top-most bucket.
+
+When a task is enqueued, the counter in the matching bucket is incremented,
+and on dequeue it is decremented. This makes keeping track of the effective
+uclamp value at rq level a lot easier.
+
+As tasks are enqueued and dequeued, we keep track of the current effective
+uclamp value of the rq. See :ref:`section 2.1 <uclamp-buckets>` for details on
+how this works.
+
+Later at any path that wants to identify the effective uclamp value of the rq,
+it will simply need to read this effective uclamp value of the rq at that exact
+moment of time it needs to take a decision.
+
+For task placement case, only Energy Aware and Capacity Aware Scheduling
+(EAS/CAS) make use of uclamp for now, which implies that it is applied on
+heterogeneous systems only.
+When a task wakes up, the scheduler will look at the current effective uclamp
+value of every rq and compare it with the potential new value if the task were
+to be enqueued there. Favoring the rq that will end up with the most energy
+efficient combination.
+
+Similarly in schedutil, when it needs to make a frequency update it will look
+at the current effective uclamp value of the rq which is influenced by the set
+of tasks currently enqueued there and select the appropriate frequency that
+will satisfy constraints from requests.
+
+Other paths like setting overutilization state (which effectively disables EAS)
+make use of uclamp as well. Such cases are considered necessary housekeeping to
+allow the 2 main use cases above and will not be covered in detail here as they
+could change with implementation details.
+
+.. _uclamp-buckets:
+
+2.1. Buckets
+------------
+
+::
+
+ [struct rq]
+
+ (bottom) (top)
+
+ 0 1024
+ | |
+ +-----------+-----------+-----------+---- ----+-----------+
+ | Bucket 0 | Bucket 1 | Bucket 2 | ... | Bucket N |
+ +-----------+-----------+-----------+---- ----+-----------+
+ : : :
+ +- p0 +- p3 +- p4
+ : :
+ +- p1 +- p5
+ :
+ +- p2
+
+
+.. note::
+ The diagram above is an illustration rather than a true depiction of the
+ internal data structure.
+
+To reduce the search space when trying to decide the effective uclamp value of
+an rq as tasks are enqueued/dequeued, the whole utilization range is divided
+into N buckets where N is configured at compile time by setting
+CONFIG_UCLAMP_BUCKETS_COUNT. By default it is set to 5.
+
+The rq has a bucket for each uclamp_id tunables: [UCLAMP_MIN, UCLAMP_MAX].
+
+The range of each bucket is 1024/N. For example, for the default value of
+5 there will be 5 buckets, each of which will cover the following range:
+
+::
+
+ DELTA = round_closest(1024/5) = 204.8 = 205
+
+ Bucket 0: [0:204]
+ Bucket 1: [205:409]
+ Bucket 2: [410:614]
+ Bucket 3: [615:819]
+ Bucket 4: [820:1024]
+
+When a task p with following tunable parameters
+
+::
+
+ p->uclamp[UCLAMP_MIN] = 300
+ p->uclamp[UCLAMP_MAX] = 1024
+
+is enqueued into the rq, bucket 1 will be incremented for UCLAMP_MIN and bucket
+4 will be incremented for UCLAMP_MAX to reflect the fact the rq has a task in
+this range.
+
+The rq then keeps track of its current effective uclamp value for each
+uclamp_id.
+
+When a task p is enqueued, the rq value changes to:
+
+::
+
+ // update bucket logic goes here
+ rq->uclamp[UCLAMP_MIN] = max(rq->uclamp[UCLAMP_MIN], p->uclamp[UCLAMP_MIN])
+ // repeat for UCLAMP_MAX
+
+Similarly, when p is dequeued the rq value changes to:
+
+::
+
+ // update bucket logic goes here
+ rq->uclamp[UCLAMP_MIN] = search_top_bucket_for_highest_value()
+ // repeat for UCLAMP_MAX
+
+When all buckets are empty, the rq uclamp values are reset to system defaults.
+See :ref:`section 3.4 <uclamp-default-values>` for details on default values.
+
+
+2.2. Max aggregation
+--------------------
+
+Util clamp is tuned to honour the request for the task that requires the
+highest performance point.
+
+When multiple tasks are attached to the same rq, then util clamp must make sure
+the task that needs the highest performance point gets it even if there's
+another task that doesn't need it or is disallowed from reaching this point.
+
+For example, if there are multiple tasks attached to an rq with the following
+values:
+
+::
+
+ p0->uclamp[UCLAMP_MIN] = 300
+ p0->uclamp[UCLAMP_MAX] = 900
+
+ p1->uclamp[UCLAMP_MIN] = 500
+ p1->uclamp[UCLAMP_MAX] = 500
+
+then assuming both p0 and p1 are enqueued to the same rq, both UCLAMP_MIN
+and UCLAMP_MAX become:
+
+::
+
+ rq->uclamp[UCLAMP_MIN] = max(300, 500) = 500
+ rq->uclamp[UCLAMP_MAX] = max(900, 500) = 900
+
+As we shall see in :ref:`section 5.1 <uclamp-capping-fail>`, this max
+aggregation is the cause of one of limitations when using util clamp, in
+particular for UCLAMP_MAX hint when user space would like to save power.
+
+2.3. Hierarchical aggregation
+-----------------------------
+
+As stated earlier, util clamp is a property of every task in the system. But
+the actual applied (effective) value can be influenced by more than just the
+request made by the task or another actor on its behalf (middleware library).
+
+The effective util clamp value of any task is restricted as follows:
+
+ 1. By the uclamp settings defined by the cgroup CPU controller it is attached
+ to, if any.
+ 2. The restricted value in (1) is then further restricted by the system wide
+ uclamp settings.
+
+:ref:`Section 3 <uclamp-interfaces>` discusses the interfaces and will expand
+further on that.
+
+For now suffice to say that if a task makes a request, its actual effective
+value will have to adhere to some restrictions imposed by cgroup and system
+wide settings.
+
+The system will still accept the request even if effectively will be beyond the
+constraints, but as soon as the task moves to a different cgroup or a sysadmin
+modifies the system settings, the request will be satisfied only if it is
+within new constraints.
+
+In other words, this aggregation will not cause an error when a task changes
+its uclamp values, but rather the system may not be able to satisfy requests
+based on those factors.
+
+2.4. Range
+----------
+
+Uclamp performance request has the range of 0 to 1024 inclusive.
+
+For cgroup interface percentage is used (that is 0 to 100 inclusive).
+Just like other cgroup interfaces, you can use 'max' instead of 100.
+
+.. _uclamp-interfaces:
+
+3. Interfaces
+=============
+
+3.1. Per task interface
+-----------------------
+
+sched_setattr() syscall was extended to accept two new fields:
+
+* sched_util_min: requests the minimum performance point the system should run
+ at when this task is running. Or lower performance bound.
+* sched_util_max: requests the maximum performance point the system should run
+ at when this task is running. Or upper performance bound.
+
+For example, the following scenario have 40% to 80% utilization constraints:
+
+::
+
+ attr->sched_util_min = 40% * 1024;
+ attr->sched_util_max = 80% * 1024;
+
+When task @p is running, **the scheduler should try its best to ensure it
+starts at 40% performance level**. If the task runs for a long enough time so
+that its actual utilization goes above 80%, the utilization, or performance
+level, will be capped.
+
+The special value -1 is used to reset the uclamp settings to the system
+default.
+
+Note that resetting the uclamp value to system default using -1 is not the same
+as manually setting uclamp value to system default. This distinction is
+important because as we shall see in system interfaces, the default value for
+RT could be changed. SCHED_NORMAL/OTHER might gain similar knobs too in the
+future.
+
+3.2. cgroup interface
+---------------------
+
+There are two uclamp related values in the CPU cgroup controller:
+
+* cpu.uclamp.min
+* cpu.uclamp.max
+
+When a task is attached to a CPU controller, its uclamp values will be impacted
+as follows:
+
+* cpu.uclamp.min is a protection as described in :ref:`section 3-3 of cgroup
+ v2 documentation <cgroupv2-protections-distributor>`.
+
+ If a task uclamp_min value is lower than cpu.uclamp.min, then the task will
+ inherit the cgroup cpu.uclamp.min value.
+
+ In a cgroup hierarchy, effective cpu.uclamp.min is the max of (child,
+ parent).
+
+* cpu.uclamp.max is a limit as described in :ref:`section 3-2 of cgroup v2
+ documentation <cgroupv2-limits-distributor>`.
+
+ If a task uclamp_max value is higher than cpu.uclamp.max, then the task will
+ inherit the cgroup cpu.uclamp.max value.
+
+ In a cgroup hierarchy, effective cpu.uclamp.max is the min of (child,
+ parent).
+
+For example, given following parameters:
+
+::
+
+ p0->uclamp[UCLAMP_MIN] = // system default;
+ p0->uclamp[UCLAMP_MAX] = // system default;
+
+ p1->uclamp[UCLAMP_MIN] = 40% * 1024;
+ p1->uclamp[UCLAMP_MAX] = 50% * 1024;
+
+ cgroup0->cpu.uclamp.min = 20% * 1024;
+ cgroup0->cpu.uclamp.max = 60% * 1024;
+
+ cgroup1->cpu.uclamp.min = 60% * 1024;
+ cgroup1->cpu.uclamp.max = 100% * 1024;
+
+when p0 and p1 are attached to cgroup0, the values become:
+
+::
+
+ p0->uclamp[UCLAMP_MIN] = cgroup0->cpu.uclamp.min = 20% * 1024;
+ p0->uclamp[UCLAMP_MAX] = cgroup0->cpu.uclamp.max = 60% * 1024;
+
+ p1->uclamp[UCLAMP_MIN] = 40% * 1024; // intact
+ p1->uclamp[UCLAMP_MAX] = 50% * 1024; // intact
+
+when p0 and p1 are attached to cgroup1, these instead become:
+
+::
+
+ p0->uclamp[UCLAMP_MIN] = cgroup1->cpu.uclamp.min = 60% * 1024;
+ p0->uclamp[UCLAMP_MAX] = cgroup1->cpu.uclamp.max = 100% * 1024;
+
+ p1->uclamp[UCLAMP_MIN] = cgroup1->cpu.uclamp.min = 60% * 1024;
+ p1->uclamp[UCLAMP_MAX] = 50% * 1024; // intact
+
+Note that cgroup interfaces allows cpu.uclamp.max value to be lower than
+cpu.uclamp.min. Other interfaces don't allow that.
+
+3.3. System interface
+---------------------
+
+3.3.1 sched_util_clamp_min
+--------------------------
+
+System wide limit of allowed UCLAMP_MIN range. By default it is set to 1024,
+which means that permitted effective UCLAMP_MIN range for tasks is [0:1024].
+By changing it to 512 for example the range reduces to [0:512]. This is useful
+to restrict how much boosting tasks are allowed to acquire.
+
+Requests from tasks to go above this knob value will still succeed, but
+they won't be satisfied until it is more than p->uclamp[UCLAMP_MIN].
+
+The value must be smaller than or equal to sched_util_clamp_max.
+
+3.3.2 sched_util_clamp_max
+--------------------------
+
+System wide limit of allowed UCLAMP_MAX range. By default it is set to 1024,
+which means that permitted effective UCLAMP_MAX range for tasks is [0:1024].
+
+By changing it to 512 for example the effective allowed range reduces to
+[0:512]. This means is that no task can run above 512, which implies that all
+rqs are restricted too. IOW, the whole system is capped to half its performance
+capacity.
+
+This is useful to restrict the overall maximum performance point of the system.
+For example, it can be handy to limit performance when running low on battery
+or when the system wants to limit access to more energy hungry performance
+levels when it's in idle state or screen is off.
+
+Requests from tasks to go above this knob value will still succeed, but they
+won't be satisfied until it is more than p->uclamp[UCLAMP_MAX].
+
+The value must be greater than or equal to sched_util_clamp_min.
+
+.. _uclamp-default-values:
+
+3.4. Default values
+-------------------
+
+By default all SCHED_NORMAL/SCHED_OTHER tasks are initialized to:
+
+::
+
+ p_fair->uclamp[UCLAMP_MIN] = 0
+ p_fair->uclamp[UCLAMP_MAX] = 1024
+
+That is, by default they're boosted to run at the maximum performance point of
+changed at boot or runtime. No argument was made yet as to why we should
+provide this, but can be added in the future.
+
+For SCHED_FIFO/SCHED_RR tasks:
+
+::
+
+ p_rt->uclamp[UCLAMP_MIN] = 1024
+ p_rt->uclamp[UCLAMP_MAX] = 1024
+
+That is by default they're boosted to run at the maximum performance point of
+the system which retains the historical behavior of the RT tasks.
+
+RT tasks default uclamp_min value can be modified at boot or runtime via
+sysctl. See below section.
+
+.. _sched-util-clamp-min-rt-default:
+
+3.4.1 sched_util_clamp_min_rt_default
+-------------------------------------
+
+Running RT tasks at maximum performance point is expensive on battery powered
+devices and not necessary. To allow system developer to offer good performance
+guarantees for these tasks without pushing it all the way to maximum
+performance point, this sysctl knob allows tuning the best boost value to
+address the system requirement without burning power running at maximum
+performance point all the time.
+
+Application developer are encouraged to use the per task util clamp interface
+to ensure they are performance and power aware. Ideally this knob should be set
+to 0 by system designers and leave the task of managing performance
+requirements to the apps.
+
+4. How to use util clamp
+========================
+
+Util clamp promotes the concept of user space assisted power and performance
+management. At the scheduler level there is no info required to make the best
+decision. However, with util clamp user space can hint to the scheduler to make
+better decision about task placement and frequency selection.
+
+Best results are achieved by not making any assumptions about the system the
+application is running on and to use it in conjunction with a feedback loop to
+dynamically monitor and adjust. Ultimately this will allow for a better user
+experience at a better perf/watt.
+
+For some systems and use cases, static setup will help to achieve good results.
+Portability will be a problem in this case. How much work one can do at 100,
+200 or 1024 is different for each system. Unless there's a specific target
+system, static setup should be avoided.
+
+There are enough possibilities to create a whole framework based on util clamp
+or self contained app that makes use of it directly.
+
+4.1. Boost important and DVFS-latency-sensitive tasks
+-----------------------------------------------------
+
+A GUI task might not be busy to warrant driving the frequency high when it
+wakes up. However, it requires to finish its work within a specific time window
+to deliver the desired user experience. The right frequency it requires at
+wakeup will be system dependent. On some underpowered systems it will be high,
+on other overpowered ones it will be low or 0.
+
+This task can increase its UCLAMP_MIN value every time it misses the deadline
+to ensure on next wake up it runs at a higher performance point. It should try
+to approach the lowest UCLAMP_MIN value that allows to meet its deadline on any
+particular system to achieve the best possible perf/watt for that system.
+
+On heterogeneous systems, it might be important for this task to run on
+a faster CPU.
+
+**Generally it is advised to perceive the input as performance level or point
+which will imply both task placement and frequency selection**.
+
+4.2. Cap background tasks
+-------------------------
+
+Like explained for Android case in the introduction. Any app can lower
+UCLAMP_MAX for some background tasks that don't care about performance but
+could end up being busy and consume unnecessary system resources on the system.
+
+4.3. Powersave mode
+-------------------
+
+sched_util_clamp_max system wide interface can be used to limit all tasks from
+operating at the higher performance points which are usually energy
+inefficient.
+
+This is not unique to uclamp as one can achieve the same by reducing max
+frequency of the cpufreq governor. It can be considered a more convenient
+alternative interface.
+
+4.4. Per-app performance restriction
+------------------------------------
+
+Middleware/Utility can provide the user an option to set UCLAMP_MIN/MAX for an
+app every time it is executed to guarantee a minimum performance point and/or
+limit it from draining system power at the cost of reduced performance for
+these apps.
+
+If you want to prevent your laptop from heating up while on the go from
+compiling the kernel and happy to sacrifice performance to save power, but
+still would like to keep your browser performance intact, uclamp makes it
+possible.
+
+5. Limitations
+==============
+
+.. _uclamp-capping-fail:
+
+5.1. Capping frequency with uclamp_max fails under certain conditions
+---------------------------------------------------------------------
+
+If task p0 is capped to run at 512:
+
+::
+
+ p0->uclamp[UCLAMP_MAX] = 512
+
+and it shares the rq with p1 which is free to run at any performance point:
+
+::
+
+ p1->uclamp[UCLAMP_MAX] = 1024
+
+then due to max aggregation the rq will be allowed to reach max performance
+point:
+
+::
+
+ rq->uclamp[UCLAMP_MAX] = max(512, 1024) = 1024
+
+Assuming both p0 and p1 have UCLAMP_MIN = 0, then the frequency selection for
+the rq will depend on the actual utilization value of the tasks.
+
+If p1 is a small task but p0 is a CPU intensive task, then due to the fact that
+both are running at the same rq, p1 will cause the frequency capping to be left
+from the rq although p1, which is allowed to run at any performance point,
+doesn't actually need to run at that frequency.
+
+5.2. UCLAMP_MAX can break PELT (util_avg) signal
+------------------------------------------------
+
+PELT assumes that frequency will always increase as the signals grow to ensure
+there's always some idle time on the CPU. But with UCLAMP_MAX, this frequency
+increase will be prevented which can lead to no idle time in some
+circumstances. When there's no idle time, a task will stuck in a busy loop,
+which would result in util_avg being 1024.
+
+Combing with issue described below, this can lead to unwanted frequency spikes
+when severely capped tasks share the rq with a small non capped task.
+
+As an example if task p, which have:
+
+::
+
+ p0->util_avg = 300
+ p0->uclamp[UCLAMP_MAX] = 0
+
+wakes up on an idle CPU, then it will run at min frequency (Fmin) this
+CPU is capable of. The max CPU frequency (Fmax) matters here as well,
+since it designates the shortest computational time to finish the task's
+work on this CPU.
+
+::
+
+ rq->uclamp[UCLAMP_MAX] = 0
+
+If the ratio of Fmax/Fmin is 3, then maximum value will be:
+
+::
+
+ 300 * (Fmax/Fmin) = 900
+
+which indicates the CPU will still see idle time since 900 is < 1024. The
+_actual_ util_avg will not be 900 though, but somewhere between 300 and 900. As
+long as there's idle time, p->util_avg updates will be off by a some margin,
+but not proportional to Fmax/Fmin.
+
+::
+
+ p0->util_avg = 300 + small_error
+
+Now if the ratio of Fmax/Fmin is 4, the maximum value becomes:
+
+::
+
+ 300 * (Fmax/Fmin) = 1200
+
+which is higher than 1024 and indicates that the CPU has no idle time. When
+this happens, then the _actual_ util_avg will become:
+
+::
+
+ p0->util_avg = 1024
+
+If task p1 wakes up on this CPU, which have:
+
+::
+
+ p1->util_avg = 200
+ p1->uclamp[UCLAMP_MAX] = 1024
+
+then the effective UCLAMP_MAX for the CPU will be 1024 according to max
+aggregation rule. But since the capped p0 task was running and throttled
+severely, then the rq->util_avg will be:
+
+::
+
+ p0->util_avg = 1024
+ p1->util_avg = 200
+
+ rq->util_avg = 1024
+ rq->uclamp[UCLAMP_MAX] = 1024
+
+Hence lead to a frequency spike since if p0 wasn't throttled we should get:
+
+::
+
+ p0->util_avg = 300
+ p1->util_avg = 200
+
+ rq->util_avg = 500
+
+and run somewhere near mid performance point of that CPU, not the Fmax we get.
+
+5.3. Schedutil response time issues
+-----------------------------------
+
+schedutil has three limitations:
+
+ 1. Hardware takes non-zero time to respond to any frequency change
+ request. On some platforms can be in the order of few ms.
+ 2. Non fast-switch systems require a worker deadline thread to wake up
+ and perform the frequency change, which adds measurable overhead.
+ 3. schedutil rate_limit_us drops any requests during this rate_limit_us
+ window.
+
+If a relatively small task is doing critical job and requires a certain
+performance point when it wakes up and starts running, then all these
+limitations will prevent it from getting what it wants in the time scale it
+expects.
+
+This limitation is not only impactful when using uclamp, but will be more
+prevalent as we no longer gradually ramp up or down. We could easily be
+jumping between frequencies depending on the order tasks wake up, and their
+respective uclamp values.
+
+We regard that as a limitation of the capabilities of the underlying system
+itself.
+
+There is room to improve the behavior of schedutil rate_limit_us, but not much
+to be done for 1 or 2. They are considered hard limitations of the system.
diff --git a/Documentation/sphinx/load_config.py b/Documentation/sphinx/load_config.py
index eeb394b39e2c..8b416bfd75ac 100644
--- a/Documentation/sphinx/load_config.py
+++ b/Documentation/sphinx/load_config.py
@@ -3,7 +3,7 @@
import os
import sys
-from sphinx.util.pycompat import execfile_
+from sphinx.util.osutil import fs_encoding
# ------------------------------------------------------------------------------
def loadConfig(namespace):
@@ -48,7 +48,9 @@ def loadConfig(namespace):
sys.stdout.write("load additional sphinx-config: %s\n" % config_file)
config = namespace.copy()
config['__file__'] = config_file
- execfile_(config_file, config)
+ with open(config_file, 'rb') as f:
+ code = compile(f.read(), fs_encoding, 'exec')
+ exec(code, config)
del config['__file__']
namespace.update(config)
else:
diff --git a/Documentation/translations/it_IT/kernel-hacking/locking.rst b/Documentation/translations/it_IT/kernel-hacking/locking.rst
index b8ecf41273c5..05d362b16bf0 100644
--- a/Documentation/translations/it_IT/kernel-hacking/locking.rst
+++ b/Documentation/translations/it_IT/kernel-hacking/locking.rst
@@ -1307,11 +1307,11 @@ se i dati vengono occasionalmente utilizzati da un contesto utente o
da un'interruzione software. Il gestore d'interruzione non utilizza alcun
*lock*, e tutti gli altri accessi verranno fatti così::
- spin_lock(&lock);
+ mutex_lock(&lock);
disable_irq(irq);
...
enable_irq(irq);
- spin_unlock(&lock);
+ mutex_unlock(&lock);
La funzione disable_irq() impedisce al gestore d'interruzioni
d'essere eseguito (e aspetta che finisca nel caso fosse in esecuzione su
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index eb045fc495a4..362bee0be8ab 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -201,7 +201,6 @@ Code Seq# Include File Comments
'V' all linux/videodev2.h conflict!
'V' C0 linux/ivtvfb.h conflict!
'V' C0 linux/ivtv.h conflict!
-'V' C0 media/davinci/vpfe_capture.h conflict!
'V' C0 media/si4713.h conflict!
'W' 00-1F linux/watchdog.h conflict!
'W' 00-1F linux/wanrouter.h conflict! (pre 3.9)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index deb494f759ed..0a67cb738013 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1354,6 +1354,14 @@ the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately. Another
example is madvise(MADV_DROP).
+Note: On arm64, a write generated by the page-table walker (to update
+the Access and Dirty flags, for example) never results in a
+KVM_EXIT_MMIO exit when the slot has the KVM_MEM_READONLY flag. This
+is because KVM cannot provide the data that would be written by the
+page-table walker, making it impossible to emulate the access.
+Instead, an abort (data abort if the cause of the page-table update
+was a load or a store, instruction abort if it was an instruction
+fetch) is injected in the guest.
4.36 KVM_SET_TSS_ADDR
---------------------
@@ -8062,9 +8070,13 @@ considering the state as complete. VMM needs to ensure that the dirty
state is final and avoid missing dirty pages from another ioctl ordered
after the bitmap collection.
-NOTE: One example of using the backup bitmap is saving arm64 vgic/its
-tables through KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on
-KVM device "kvm-arm-vgic-its" when dirty ring is enabled.
+NOTE: Multiple examples of using the backup bitmap: (1) save vgic/its
+tables through command KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} on
+KVM device "kvm-arm-vgic-its". (2) restore vgic/its tables through
+command KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_RESTORE_TABLES} on KVM device
+"kvm-arm-vgic-its". VGICv3 LPI pending status is restored. (3) save
+vgic3 pending table through KVM_DEV_ARM_VGIC_{GRP_CTRL, SAVE_PENDING_TABLES}
+command on KVM device "kvm-arm-vgic-v3".
8.30 KVM_CAP_XEN_HVM
--------------------
@@ -8310,6 +8322,20 @@ CPU[EAX=1]:ECX[24] (TSC_DEADLINE) is not reported by ``KVM_GET_SUPPORTED_CPUID``
It can be enabled if ``KVM_CAP_TSC_DEADLINE_TIMER`` is present and the kernel
has enabled in-kernel emulation of the local APIC.
+CPU topology
+~~~~~~~~~~~~
+
+Several CPUID values include topology information for the host CPU:
+0x0b and 0x1f for Intel systems, 0x8000001e for AMD systems. Different
+versions of KVM return different values for this information and userspace
+should not rely on it. Currently they return all zeroes.
+
+If userspace wishes to set up a guest topology, it should be careful that
+the values of these three leaves differ for each CPU. In particular,
+the APIC ID is found in EDX for all subleaves of 0x0b and 0x1f, and in EAX
+for 0x8000001e; the latter also encodes the core id and node id in bits
+7:0 of EBX and ECX respectively.
+
Obsolete ioctls and capabilities
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index a3ca76f9be75..a0146793d197 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -24,21 +24,22 @@ The acquisition orders for mutexes are as follows:
For SRCU:
-- ``synchronize_srcu(&kvm->srcu)`` is called _inside_
- the kvm->slots_lock critical section, therefore kvm->slots_lock
- cannot be taken inside a kvm->srcu read-side critical section.
- Instead, kvm->slots_arch_lock is released before the call
- to ``synchronize_srcu()`` and _can_ be taken inside a
- kvm->srcu read-side critical section.
-
-- kvm->lock is taken inside kvm->srcu, therefore
- ``synchronize_srcu(&kvm->srcu)`` cannot be called inside
- a kvm->lock critical section. If you cannot delay the
- call until after kvm->lock is released, use ``call_srcu``.
+- ``synchronize_srcu(&kvm->srcu)`` is called inside critical sections
+ for kvm->lock, vcpu->mutex and kvm->slots_lock. These locks _cannot_
+ be taken inside a kvm->srcu read-side critical section; that is, the
+ following is broken::
+
+ srcu_read_lock(&kvm->srcu);
+ mutex_lock(&kvm->slots_lock);
+
+- kvm->slots_arch_lock instead is released before the call to
+ ``synchronize_srcu()``. It _can_ therefore be taken inside a
+ kvm->srcu read-side critical section, for example while processing
+ a vmexit.
On x86:
-- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
+- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock and kvm->arch.xen.xen_lock
- kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock and
kvm->arch.mmu_unsync_pages_lock are taken inside kvm->arch.mmu_lock, and
diff --git a/Documentation/x86/amd-memory-encryption.rst b/Documentation/x86/amd-memory-encryption.rst
index a1940ebe7be5..934310ce7258 100644
--- a/Documentation/x86/amd-memory-encryption.rst
+++ b/Documentation/x86/amd-memory-encryption.rst
@@ -95,3 +95,39 @@ by supplying mem_encrypt=on on the kernel command line. However, if BIOS does
not enable SME, then Linux will not be able to activate memory encryption, even
if configured to do so by default or the mem_encrypt=on command line parameter
is specified.
+
+Secure Nested Paging (SNP)
+==========================
+
+SEV-SNP introduces new features (SEV_FEATURES[1:63]) which can be enabled
+by the hypervisor for security enhancements. Some of these features need
+guest side implementation to function correctly. The below table lists the
+expected guest behavior with various possible scenarios of guest/hypervisor
+SNP feature support.
+
++-----------------+---------------+---------------+------------------+
+| Feature Enabled | Guest needs | Guest has | Guest boot |
+| by the HV | implementation| implementation| behaviour |
++=================+===============+===============+==================+
+| No | No | No | Boot |
+| | | | |
++-----------------+---------------+---------------+------------------+
+| No | Yes | No | Boot |
+| | | | |
++-----------------+---------------+---------------+------------------+
+| No | Yes | Yes | Boot |
+| | | | |
++-----------------+---------------+---------------+------------------+
+| Yes | No | No | Boot with |
+| | | | feature enabled |
++-----------------+---------------+---------------+------------------+
+| Yes | Yes | No | Graceful boot |
+| | | | failure |
++-----------------+---------------+---------------+------------------+
+| Yes | Yes | Yes | Boot with |
+| | | | feature enabled |
++-----------------+---------------+---------------+------------------+
+
+More details in AMD64 APM[1] Vol 2: 15.34.10 SEV_STATUS MSR
+
+[1] https://www.amd.com/system/files/TechDocs/40332.pdf
diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 71a531061e4e..058257dc56c8 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -17,14 +17,21 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
flag bits:
-============================================= ================================
+=============================================== ================================
RDT (Resource Director Technology) Allocation "rdt_a"
CAT (Cache Allocation Technology) "cat_l3", "cat_l2"
CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2"
CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc"
MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local"
MBA (Memory Bandwidth Allocation) "mba"
-============================================= ================================
+SMBA (Slow Memory Bandwidth Allocation) ""
+BMEC (Bandwidth Monitoring Event Configuration) ""
+=============================================== ================================
+
+Historically, new features were made visible by default in /proc/cpuinfo. This
+resulted in the feature flags becoming hard to parse by humans. Adding a new
+flag to /proc/cpuinfo should be avoided if user space can obtain information
+about the feature from resctrl's info directory.
To use the feature mount the file system::
@@ -161,6 +168,83 @@ with the following files:
"mon_features":
Lists the monitoring events if
monitoring is enabled for the resource.
+ Example::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
+ llc_occupancy
+ mbm_total_bytes
+ mbm_local_bytes
+
+ If the system supports Bandwidth Monitoring Event
+ Configuration (BMEC), then the bandwidth events will
+ be configurable. The output will be::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mon_features
+ llc_occupancy
+ mbm_total_bytes
+ mbm_total_bytes_config
+ mbm_local_bytes
+ mbm_local_bytes_config
+
+"mbm_total_bytes_config", "mbm_local_bytes_config":
+ Read/write files containing the configuration for the mbm_total_bytes
+ and mbm_local_bytes events, respectively, when the Bandwidth
+ Monitoring Event Configuration (BMEC) feature is supported.
+ The event configuration settings are domain specific and affect
+ all the CPUs in the domain. When either event configuration is
+ changed, the bandwidth counters for all RMIDs of both events
+ (mbm_total_bytes as well as mbm_local_bytes) are cleared for that
+ domain. The next read for every RMID will report "Unavailable"
+ and subsequent reads will report the valid value.
+
+ Following are the types of events supported:
+
+ ==== ========================================================
+ Bits Description
+ ==== ========================================================
+ 6 Dirty Victims from the QOS domain to all types of memory
+ 5 Reads to slow memory in the non-local NUMA domain
+ 4 Reads to slow memory in the local NUMA domain
+ 3 Non-temporal writes to non-local NUMA domain
+ 2 Non-temporal writes to local NUMA domain
+ 1 Reads to memory in the non-local NUMA domain
+ 0 Reads to memory in the local NUMA domain
+ ==== ========================================================
+
+ By default, the mbm_total_bytes configuration is set to 0x7f to count
+ all the event types and the mbm_local_bytes configuration is set to
+ 0x15 to count all the local memory events.
+
+ Examples:
+
+ * To view the current configuration::
+ ::
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+ 0=0x7f;1=0x7f;2=0x7f;3=0x7f
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+ 0=0x15;1=0x15;3=0x15;4=0x15
+
+ * To change the mbm_total_bytes to count only reads on domain 0,
+ the bits 0, 1, 4 and 5 needs to be set, which is 110011b in binary
+ (in hexadecimal 0x33):
+ ::
+
+ # echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+ 0=0x33;1=0x7f;2=0x7f;3=0x7f
+
+ * To change the mbm_local_bytes to count all the slow memory reads on
+ domain 0 and 1, the bits 4 and 5 needs to be set, which is 110000b
+ in binary (in hexadecimal 0x30):
+ ::
+
+ # echo "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+
+ # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+ 0=0x30;1=0x30;3=0x15;4=0x15
"max_threshold_occupancy":
Read/write file provides the largest value (in
@@ -464,6 +548,25 @@ Memory bandwidth domain is L3 cache.
MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
+Slow Memory Bandwidth Allocation (SMBA)
+---------------------------------------
+AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).
+CXL.memory is the only supported "slow" memory device. With the
+support of SMBA, the hardware enables bandwidth allocation on
+the slow memory devices. If there are multiple such devices in
+the system, the throttling logic groups all the slow sources
+together and applies the limit on them as a whole.
+
+The presence of SMBA (with CXL.memory) is independent of slow memory
+devices presence. If there are no such devices on the system, then
+configuring SMBA will have no impact on the performance of the system.
+
+The bandwidth domain for slow memory is L3 cache. Its schemata file
+is formatted as:
+::
+
+ SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
+
Reading/writing the schemata file
---------------------------------
Reading the schemata file will show the state of all resources
@@ -479,6 +582,46 @@ which you wish to change. E.g.
L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+Reading/writing the schemata file (on AMD systems)
+--------------------------------------------------
+Reading the schemata file will show the current bandwidth limit on all
+domains. The allocated resources are in multiples of one eighth GB/s.
+When writing to the file, you need to specify what cache id you wish to
+configure the bandwidth limit.
+
+For example, to allocate 2GB/s limit on the first cache id:
+
+::
+
+ # cat schemata
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "MB:1=16" > schemata
+ # cat schemata
+ MB:0=2048;1= 16;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+Reading/writing the schemata file (on AMD systems) with SMBA feature
+--------------------------------------------------------------------
+Reading and writing the schemata file is the same as without SMBA in
+above section.
+
+For example, to allocate 8GB/s limit on the first cache id:
+
+::
+
+ # cat schemata
+ SMBA:0=2048;1=2048;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "SMBA:1=64" > schemata
+ # cat schemata
+ SMBA:0=2048;1= 64;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
Cache Pseudo-Locking
====================
CAT enables a user to specify the amount of cache space that an