diff options
Diffstat (limited to 'Documentation/x86/resctrl.rst')
-rw-r--r-- | Documentation/x86/resctrl.rst | 147 |
1 files changed, 145 insertions, 2 deletions
diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst index 71a531061e4e..058257dc56c8 100644 --- a/Documentation/x86/resctrl.rst +++ b/Documentation/x86/resctrl.rst @@ -17,14 +17,21 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS). This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo flag bits: -============================================= ================================ +=============================================== ================================ RDT (Resource Director Technology) Allocation "rdt_a" CAT (Cache Allocation Technology) "cat_l3", "cat_l2" CDP (Code and Data Prioritization) "cdp_l3", "cdp_l2" CQM (Cache QoS Monitoring) "cqm_llc", "cqm_occup_llc" MBM (Memory Bandwidth Monitoring) "cqm_mbm_total", "cqm_mbm_local" MBA (Memory Bandwidth Allocation) "mba" -============================================= ================================ +SMBA (Slow Memory Bandwidth Allocation) "" +BMEC (Bandwidth Monitoring Event Configuration) "" +=============================================== ================================ + +Historically, new features were made visible by default in /proc/cpuinfo. This +resulted in the feature flags becoming hard to parse by humans. Adding a new +flag to /proc/cpuinfo should be avoided if user space can obtain information +about the feature from resctrl's info directory. To use the feature mount the file system:: @@ -161,6 +168,83 @@ with the following files: "mon_features": Lists the monitoring events if monitoring is enabled for the resource. + Example:: + + # cat /sys/fs/resctrl/info/L3_MON/mon_features + llc_occupancy + mbm_total_bytes + mbm_local_bytes + + If the system supports Bandwidth Monitoring Event + Configuration (BMEC), then the bandwidth events will + be configurable. The output will be:: + + # cat /sys/fs/resctrl/info/L3_MON/mon_features + llc_occupancy + mbm_total_bytes + mbm_total_bytes_config + mbm_local_bytes + mbm_local_bytes_config + +"mbm_total_bytes_config", "mbm_local_bytes_config": + Read/write files containing the configuration for the mbm_total_bytes + and mbm_local_bytes events, respectively, when the Bandwidth + Monitoring Event Configuration (BMEC) feature is supported. + The event configuration settings are domain specific and affect + all the CPUs in the domain. When either event configuration is + changed, the bandwidth counters for all RMIDs of both events + (mbm_total_bytes as well as mbm_local_bytes) are cleared for that + domain. The next read for every RMID will report "Unavailable" + and subsequent reads will report the valid value. + + Following are the types of events supported: + + ==== ======================================================== + Bits Description + ==== ======================================================== + 6 Dirty Victims from the QOS domain to all types of memory + 5 Reads to slow memory in the non-local NUMA domain + 4 Reads to slow memory in the local NUMA domain + 3 Non-temporal writes to non-local NUMA domain + 2 Non-temporal writes to local NUMA domain + 1 Reads to memory in the non-local NUMA domain + 0 Reads to memory in the local NUMA domain + ==== ======================================================== + + By default, the mbm_total_bytes configuration is set to 0x7f to count + all the event types and the mbm_local_bytes configuration is set to + 0x15 to count all the local memory events. + + Examples: + + * To view the current configuration:: + :: + + # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config + 0=0x7f;1=0x7f;2=0x7f;3=0x7f + + # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config + 0=0x15;1=0x15;3=0x15;4=0x15 + + * To change the mbm_total_bytes to count only reads on domain 0, + the bits 0, 1, 4 and 5 needs to be set, which is 110011b in binary + (in hexadecimal 0x33): + :: + + # echo "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config + + # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config + 0=0x33;1=0x7f;2=0x7f;3=0x7f + + * To change the mbm_local_bytes to count all the slow memory reads on + domain 0 and 1, the bits 4 and 5 needs to be set, which is 110000b + in binary (in hexadecimal 0x30): + :: + + # echo "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config + + # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config + 0=0x30;1=0x30;3=0x15;4=0x15 "max_threshold_occupancy": Read/write file provides the largest value (in @@ -464,6 +548,25 @@ Memory bandwidth domain is L3 cache. MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;... +Slow Memory Bandwidth Allocation (SMBA) +--------------------------------------- +AMD hardware supports Slow Memory Bandwidth Allocation (SMBA). +CXL.memory is the only supported "slow" memory device. With the +support of SMBA, the hardware enables bandwidth allocation on +the slow memory devices. If there are multiple such devices in +the system, the throttling logic groups all the slow sources +together and applies the limit on them as a whole. + +The presence of SMBA (with CXL.memory) is independent of slow memory +devices presence. If there are no such devices on the system, then +configuring SMBA will have no impact on the performance of the system. + +The bandwidth domain for slow memory is L3 cache. Its schemata file +is formatted as: +:: + + SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;... + Reading/writing the schemata file --------------------------------- Reading the schemata file will show the state of all resources @@ -479,6 +582,46 @@ which you wish to change. E.g. L3DATA:0=fffff;1=fffff;2=3c0;3=fffff L3CODE:0=fffff;1=fffff;2=fffff;3=fffff +Reading/writing the schemata file (on AMD systems) +-------------------------------------------------- +Reading the schemata file will show the current bandwidth limit on all +domains. The allocated resources are in multiples of one eighth GB/s. +When writing to the file, you need to specify what cache id you wish to +configure the bandwidth limit. + +For example, to allocate 2GB/s limit on the first cache id: + +:: + + # cat schemata + MB:0=2048;1=2048;2=2048;3=2048 + L3:0=ffff;1=ffff;2=ffff;3=ffff + + # echo "MB:1=16" > schemata + # cat schemata + MB:0=2048;1= 16;2=2048;3=2048 + L3:0=ffff;1=ffff;2=ffff;3=ffff + +Reading/writing the schemata file (on AMD systems) with SMBA feature +-------------------------------------------------------------------- +Reading and writing the schemata file is the same as without SMBA in +above section. + +For example, to allocate 8GB/s limit on the first cache id: + +:: + + # cat schemata + SMBA:0=2048;1=2048;2=2048;3=2048 + MB:0=2048;1=2048;2=2048;3=2048 + L3:0=ffff;1=ffff;2=ffff;3=ffff + + # echo "SMBA:1=64" > schemata + # cat schemata + SMBA:0=2048;1= 64;2=2048;3=2048 + MB:0=2048;1=2048;2=2048;3=2048 + L3:0=ffff;1=ffff;2=ffff;3=ffff + Cache Pseudo-Locking ==================== CAT enables a user to specify the amount of cache space that an |