summaryrefslogtreecommitdiff
path: root/fs/resctrl/internal.h
AgeCommit message (Collapse)AuthorFilesLines
2025-09-15fs/resctrl: Introduce the interface to switch between monitor modesBabu Moger1-0/+6
Resctrl subsystem can support two monitoring modes, "mbm_event" or "default". In mbm_event mode, monitoring event can only accumulate data while it is backed by a hardware counter. In "default" mode, resctrl assumes there is a hardware counter for each event within every CTRL_MON and MON group. Introduce mbm_assign_mode resctrl file to switch between mbm_event and default modes. Example: To list the MBM monitor modes supported: $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode [mbm_event] default To enable the "mbm_event" counter assignment mode: $ echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode To enable the "default" monitoring mode: $ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode Reset MBM event counters automatically as part of changing the mode. Clear both architectural and non-architectural event states to prevent overflow conditions during the next event read. Clear assignable counter configuration on all the domains. Also, enable auto assignment when switching to "mbm_event" mode. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Introduce the interface to modify assignments in a groupBabu Moger1-0/+3
Enable the mbm_l3_assignments resctrl file to be used to modify counter assignments of CTRL_MON and MON groups when the "mbm_event" counter assignment mode is enabled. Process the assignment modifications in the following format: <Event>:<Domain id>=<Assignment state>;<Domain id>=<Assignment state> Event: A valid MBM event in the /sys/fs/resctrl/info/L3_MON/event_configs directory. Domain ID: A valid domain ID. When writing, '*' applies the changes to all domains. Assignment states: _ : Unassign a counter. e : Assign a counter exclusively. Examples: $ cd /sys/fs/resctrl $ cat /sys/fs/resctrl/mbm_L3_assignments mbm_total_bytes:0=e;1=e mbm_local_bytes:0=e;1=e To unassign the counter associated with the mbm_total_bytes event on domain 0: $ echo "mbm_total_bytes:0=_" > mbm_L3_assignments $ cat /sys/fs/resctrl/mbm_L3_assignments mbm_total_bytes:0=_;1=e mbm_local_bytes:0=e;1=e To unassign the counter associated with the mbm_total_bytes event on all the domains: $ echo "mbm_total_bytes:*=_" > mbm_L3_assignments $ cat /sys/fs/resctrl/mbm_L3_assignments mbm_total_bytes:0=_;1=_ mbm_local_bytes:0=e;1=e Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Introduce mbm_L3_assignments to list assignments in a groupBabu Moger1-0/+2
Introduce the mbm_L3_assignments resctrl file associated with CTRL_MON and MON resource groups to display the counter assignment states of the resource group when "mbm_event" counter assignment mode is enabled. Display the list in the following format: <Event>:<Domain id>=<Assignment state>;<Domain id>=<Assignment state> Event: A valid MBM event listed in /sys/fs/resctrl/info/L3_MON/event_configs directory. Domain ID: A valid domain ID. The assignment state can be one of the following: _ : No counter assigned. e : Counter assigned exclusively. Example: To list the assignment states for the default group $ cd /sys/fs/resctrl $ cat /sys/fs/resctrl/mbm_L3_assignments mbm_total_bytes:0=e;1=e mbm_local_bytes:0=e;1=e Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Introduce mbm_assign_on_mkdir to enable assignments on mkdirBabu Moger1-0/+6
The "mbm_event" counter assignment mode allows users to assign a hardware counter to an RMID, event pair and monitor the bandwidth as long as it is assigned. Introduce a user-configurable option that determines if a counter will automatically be assigned to an RMID, event pair when its associated monitor group is created via mkdir. Accessible when "mbm_event" counter assignment mode is enabled. Suggested-by: Peter Newman <peternewman@google.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Provide interface to update the event configurationsBabu Moger1-0/+3
When "mbm_event" counter assignment mode is enabled, users can modify the event configuration by writing to the 'event_filter' resctrl file. The event configurations for mbm_event mode are located in /sys/fs/resctrl/info/L3_MON/event_configs/. Update the assignments of all CTRL_MON and MON resource groups when the event configuration is modified. Example: $ mount -t resctrl resctrl /sys/fs/resctrl $ cd /sys/fs/resctrl/ $ cat info/L3_MON/event_configs/mbm_local_bytes/event_filter local_reads,local_non_temporal_writes,local_reads_slow_memory $ echo "local_reads,local_non_temporal_writes" > info/L3_MON/event_configs/mbm_total_bytes/event_filter $ cat info/L3_MON/event_configs/mbm_total_bytes/event_filter local_reads,local_non_temporal_writes Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Add event configuration directory under info/L3_MON/Babu Moger1-0/+4
The "mbm_event" counter assignment mode allows the user to assign a hardware counter to an RMID, event pair and monitor the bandwidth as long as it is assigned. The user can specify the memory transaction(s) for the counter to track. When this mode is supported, the /sys/fs/resctrl/info/L3_MON/event_configs directory contains a sub-directory for each MBM event that can be assigned to a counter. The MBM event sub-directory contains a file named "event_filter" that is used to view and modify which memory transactions the MBM event is configured with. Create /sys/fs/resctrl/info/L3_MON/event_configs directory on resctrl mount and pre-populate it with directories for the two existing MBM events: mbm_total_bytes and mbm_local_bytes. Create the "event_filter" file within each MBM event directory with the needed *show() that displays the memory transactions with which the MBM event is configured. Example: $ mount -t resctrl resctrl /sys/fs/resctrl $ cd /sys/fs/resctrl/ $ cat info/L3_MON/event_configs/mbm_total_bytes/event_filter local_reads,remote_reads,local_non_temporal_writes, remote_non_temporal_writes,local_reads_slow_memory, remote_reads_slow_memory,dirty_victim_writes_all $ cat info/L3_MON/event_configs/mbm_local_bytes/event_filter local_reads,local_non_temporal_writes,local_reads_slow_memory Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Support counter read/reset with mbm_event assignment modeBabu Moger1-0/+3
When "mbm_event" counter assignment mode is enabled, the architecture requires a counter ID to read the event data. Introduce an is_mbm_cntr field in struct rmid_read to indicate whether counter assignment mode is in use. Update the logic to call resctrl_arch_cntr_read() and resctrl_arch_reset_cntr() when the assignment mode is active. Report 'Unassigned' in case the user attempts to read an event without assigning a hardware counter. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Add the functionality to unassign MBM eventsBabu Moger1-0/+2
The "mbm_event" counter assignment mode offers "num_mbm_cntrs" number of counters that can be assigned to RMID, event pairs and monitor bandwidth usage as long as it is assigned. If all the counters are in use, the kernel logs the error message "Failed to allocate counter for <event> in domain <id>" in /sys/fs/resctrl/info/last_cmd_status when a new assignment is requested. To make space for a new assignment, users must unassign an already assigned counter and retry the assignment again. Add the functionality to unassign and free the counters in the domain. Also, add the helper rdtgroup_unassign_cntrs() to unassign counters in the group. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Add the functionality to assign MBM eventsBabu Moger1-0/+2
When supported, "mbm_event" counter assignment mode offers "num_mbm_cntrs" number of counters that can be assigned to RMID, event pairs and monitor bandwidth usage as long as it is assigned. Add the functionality to allocate and assign a counter to an RMID, event pair in the domain. Also, add the helper rdtgroup_assign_cntrs() to assign counters in the group. Log the error message "Failed to allocate counter for <event> in domain <id>" in /sys/fs/resctrl/info/last_cmd_status if all the counters are in use. Exit on the first failure when assigning counters across all the domains. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Introduce event configuration field in struct mon_evtBabu Moger1-0/+5
When supported, mbm_event counter assignment mode allows the user to configure events to track specific types of memory transactions. Introduce an evt_cfg field in struct mon_evt to define the type of memory transactions tracked by a monitoring event. Also add a helper function to get the evt_cfg value. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Introduce interface to display number of free MBM countersBabu Moger1-0/+3
Introduce the "available_mbm_cntrs" resctrl file to display the number of counters available for assignment in each domain when "mbm_event" mode is enabled. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Add resctrl file to display number of assignable countersBabu Moger1-0/+2
The "mbm_event" counter assignment mode allows users to assign a hardware counter to an RMID, event pair and monitor bandwidth usage as long as it is assigned. The hardware continues to track the assigned counter until it is explicitly unassigned by the user. Create 'num_mbm_cntrs' resctrl file that displays the number of counters supported in each domain. 'num_mbm_cntrs' is only visible to user space when the system supports "mbm_event" mode. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15fs/resctrl: Introduce the interface to display monitoring modesBabu Moger1-0/+4
Introduce the resctrl file "mbm_assign_mode" to list the supported counter assignment modes. The "mbm_event" counter assignment mode allows users to assign a hardware counter to an RMID, event pair and monitor bandwidth usage as long as it is assigned. The hardware continues to track the assigned counter until it is explicitly unassigned by the user. Each event within a resctrl group can be assigned independently in this mode. On AMD systems "mbm_event" mode is backed by the ABMC (Assignable Bandwidth Monitoring Counters) hardware feature and is enabled by default. The "default" mode is the existing mode that works without the explicit counter assignment, instead relying on dynamic counter assignment by hardware that may result in hardware not dedicating a counter resulting in monitoring data reads returning "Unavailable". Provide an interface to display the monitor modes on the system. $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode [mbm_event] default Add IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check to support Arm64. On x86, CONFIG_RESCTRL_ASSIGN_FIXED is not defined. On Arm64, it will be defined when the "mbm_event" mode is supported. Add IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) check early to ensure the user interface remains compatible with upcoming Arm64 support. IS_ENABLED() safely evaluates to 0 when the configuration is not defined. As a result, for MPAM, the display would be either: [default] or [mbm_event] Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15x86,fs/resctrl: Consolidate monitor event descriptionsTony Luck1-3/+10
There are currently only three monitor events, all associated with the RDT_RESOURCE_L3 resource. Growing support for additional events will be easier with some restructuring to have a single point in file system code where all attributes of all events are defined. Place all event descriptions into an array mon_event_all[]. Doing this has the beneficial side effect of removing the need for rdt_resource::evt_list. Add resctrl_event_id::QOS_FIRST_EVENT for a lower bound on range checks for event ids and as the starting index to scan mon_event_all[]. Drop the code that builds evt_list and change the two places where the list is scanned to scan mon_event_all[] instead using a new helper macro for_each_mon_event(). Architecture code now informs file system code which events are available with resctrl_enable_mon_event(). Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-09fs/resctrl: Eliminate false positive lockdep warning when reading SNC countersReinette Chatre1-2/+2
Running resctrl_tests on an SNC-2 system with lockdep debugging enabled triggers several warnings with following trace: WARNING: CPU: 0 PID: 1914 at kernel/cpu.c:528 lockdep_assert_cpus_held ... Call Trace: __mon_event_count ? __lock_acquire ? __pfx___mon_event_count mon_event_count ? __pfx_smp_mon_event_count smp_mon_event_count smp_call_on_cpu_callback get_cpu_cacheinfo_level() called from __mon_event_count() requires CPU hotplug lock to be held. The hotplug lock is indeed held during this time, as confirmed by the lockdep_assert_cpus_held() within mon_event_read() that calls mon_event_count() via IPI, but the lockdep tracking is not able to follow the IPI. Fresh CPU cache information via get_cpu_cacheinfo_level() from __mon_event_count() was added to support the fix for the issue where resctrl inappropriately maintained links to L3 cache information that will be stale in the case when the associated CPU goes offline. Keep the cacheinfo ID in struct rdt_mon_domain to ensure that resctrl does not maintain stale cache information while CPUs can go offline. Return to using a pointer to the L3 cache information (struct cacheinfo) in struct rmid_read, rmid_read::ci. Initialize rmid_read::ci before the IPI where it is used. CPU hotplug lock is held across rmid_read::ci initialization and use to ensure that it points to accurate cache information. Fixes: 594902c986e2 ("x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-06-16x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl ↵Qinyun Tan1-2/+2
subsystem In the resctrl subsystem's Sub-NUMA Cluster (SNC) mode, the rdt_mon_domain structure representing a NUMA node relies on the cacheinfo interface (rdt_mon_domain::ci) to store L3 cache information (e.g., shared_cpu_map) for monitoring. The L3 cache information of a SNC NUMA node determines which domains are summed for the "top level" L3-scoped events. rdt_mon_domain::ci is initialized using the first online CPU of a NUMA node. When this CPU goes offline, its shared_cpu_map is cleared to contain only the offline CPU itself. Subsequently, attempting to read counters via smp_call_on_cpu(offline_cpu) fails (and error ignored), returning zero values for "top-level events" without any error indication. Replace the cacheinfo references in struct rdt_mon_domain and struct rmid_read with the cacheinfo ID (a unique identifier for the L3 cache). rdt_domain_hdr::cpu_mask contains the online CPUs associated with that domain. When reading "top-level events", select a CPU from rdt_domain_hdr::cpu_mask and utilize its L3 shared_cpu_map to determine valid CPUs for reading RMID counter via the MSR interface. Considering all CPUs associated with the L3 cache improves the chances of picking a housekeeping CPU on which the counter reading work can be queued, avoiding an unnecessary IPI. Fixes: 328ea68874642 ("x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files") Signed-off-by: Qinyun Tan <qinyuntan@linux.alibaba.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250530182053.37502-2-qinyuntan@linux.alibaba.com
2025-05-16x86,fs/resctrl: Move the resctrl filesystem code to live in /fs/resctrlJames Morse1-0/+426
Resctrl is a filesystem interface to hardware that provides cache allocation policy and bandwidth control for groups of tasks or CPUs. To support more than one architecture, resctrl needs to live in /fs/. Move the code that is concerned with the filesystem interface to /fs/resctrl. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-25-james.morse@arm.com
2025-05-16fs/resctrl: Add boiler plate for external resctrl codeJames Morse1-0/+0
Add Makefile and Kconfig for fs/resctrl. Add ARCH_HAS_CPU_RESCTRL for the common parts of the resctrl interface and make X86_CPU_RESCTRL select this. Adding an include of asm/resctrl.h to linux/resctrl.h allows the /fs/resctrl files to switch over to using this header instead. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Fenghua Yu <fenghuay@nvidia.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Tested-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/20250515165855.31452-16-james.morse@arm.com