summaryrefslogtreecommitdiff
path: root/arch/x86/include/asm/resctrl.h
AgeCommit message (Collapse)AuthorFilesLines
2023-03-11x86/resctl: fix scheduler confusion with 'current'Linus Torvalds1-6/+6
commit 7fef099702527c3b2c5234a2ea6a24411485a13a upstream. The implementation of 'current' on x86 is very intentionally special: it is a very common thing to look up, and it uses 'this_cpu_read_stable()' to get the current thread pointer efficiently from per-cpu storage. And the keyword in there is 'stable': the current thread pointer never changes as far as a single thread is concerned. Even if when a thread is preempted, or moved to another CPU, or even across an explicit call 'schedule()' that thread will still have the same value for 'current'. It is, after all, the kernel base pointer to thread-local storage. That's why it's stable to begin with, but it's also why it's important enough that we have that special 'this_cpu_read_stable()' access for it. So this is all done very intentionally to allow the compiler to treat 'current' as a value that never visibly changes, so that the compiler can do CSE and combine multiple different 'current' accesses into one. However, there is obviously one very special situation when the currently running thread does actually change: inside the scheduler itself. So the scheduler code paths are special, and do not have a 'current' thread at all. Instead there are _two_ threads: the previous and the next thread - typically called 'prev' and 'next' (or prev_p/next_p) internally. So this is all actually quite straightforward and simple, and not all that complicated. Except for when you then have special code that is run in scheduler context, that code then has to be aware that 'current' isn't really a valid thing. Did you mean 'prev'? Did you mean 'next'? In fact, even if then look at the code, and you use 'current' after the new value has been assigned to the percpu variable, we have explicitly told the compiler that 'current' is magical and always stable. So the compiler is quite free to use an older (or newer) value of 'current', and the actual assignment to the percpu storage is not relevant even if it might look that way. Which is exactly what happened in the resctl code, that blithely used 'current' in '__resctrl_sched_in()' when it really wanted the new process state (as implied by the name: we're scheduling 'into' that new resctl state). And clang would end up just using the old thread pointer value at least in some configurations. This could have happened with gcc too, and purely depends on random compiler details. Clang just seems to have been more aggressive about moving the read of the per-cpu current_task pointer around. The fix is trivial: just make the resctl code adhere to the scheduler rules of using the prev/next thread pointer explicitly, instead of using 'current' in a situation where it just wasn't valid. That same code is then also used outside of the scheduler context (when a thread resctl state is explicitly changed), and then we will just pass in 'current' as that pointer, of course. There is no ambiguity in that case. The fix may be trivial, but noticing and figuring out what went wrong was not. The credit for that goes to Stephane Eranian. Reported-by: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/ Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/ Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Stephane Eranian <eranian@google.com> Tested-by: Babu Moger <babu.moger@amd.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-28x86/resctrl: Move MSR defines into msr-index.hBorislav Petkov1-5/+3
msr-index.h should contain all MSRs for easier grepping for MSR numbers when dealing with unchecked MSR access warnings, for example. Move the resctrl ones. Prefix IA32_PQR_ASSOC with "MSR_" while at it. No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221106212923.20699-1-bp@alien8.de
2022-09-23x86/resctrl: Rename and change the units of resctrl_cqm_thresholdJames Morse1-0/+9
resctrl_cqm_threshold is stored in a hardware specific chunk size, but exposed to user-space as bytes. This means the filesystem parts of resctrl need to know how the hardware counts, to convert the user provided byte value to chunks. The interface between the architecture's resctrl code and the filesystem ought to treat everything as bytes. Change the unit of resctrl_cqm_threshold to bytes. resctrl_arch_rmid_read() still returns its value in chunks, so this needs converting to bytes. As all the users have been touched, rename the variable to resctrl_rmid_realloc_threshold, which describes what the value is for. Neither r->num_rmid nor hw_res->mon_scale are guaranteed to be a power of 2, so the existing code introduces a rounding error from resctrl's theoretical fraction of the cache usage. This behaviour is kept as it ensures the user visible value matches the value read from hardware when the rmid will be reallocated. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-20-james.morse@arm.com
2021-01-11x86/resctrl: Apply READ_ONCE/WRITE_ONCE to task_struct.{rmid,closid}Valentin Schneider1-4/+7
A CPU's current task can have its {closid, rmid} fields read locally while they are being concurrently written to from another CPU. This can happen anytime __resctrl_sched_in() races with either __rdtgroup_move_task() or rdt_move_group_tasks(). Prevent load / store tearing for those accesses by giving them the READ_ONCE() / WRITE_ONCE() treatment. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/9921fda88ad81afb9885b517fbe864a2bc7c35a9.1608243147.git.reinette.chatre@intel.com
2020-05-06x86/cpu: Move resctrl CPUID code to resctrl/Reinette Chatre1-0/+3
The function determining a platform's support and properties of cache occupancy and memory bandwidth monitoring (properties of X86_FEATURE_CQM_LLC) can be found among the common CPU code. After the feature's properties is populated in the per-CPU data the resctrl subsystem is the only consumer (via boot_cpu_data). Move the function that obtains the CPU information used by resctrl to the resctrl subsystem and rename it from init_cqm() to resctrl_cpu_detect(). The function continues to be called from the common CPU code. This move is done in preparation of the addition of some vendor specific code. No functional change. Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/38433b99f9d16c8f4ee796f8cc42b871531fa203.1588715690.git.reinette.chatre@intel.com
2020-05-06x86/resctrl: Rename asm/resctrl_sched.h to asm/resctrl.hReinette Chatre1-0/+93
asm/resctrl_sched.h is dedicated to the code used for configuration of the CPU resource control state when a task is scheduled. Rename resctrl_sched.h to resctrl.h in preparation of additions that will no longer make this file dedicated to work done during scheduling. No functional change. Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/6914e0ef880b539a82a6d889f9423496d471ad1d.1588715690.git.reinette.chatre@intel.com