summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu
AgeCommit message (Collapse)AuthorFilesLines
2023-11-06random: Make it work on rtThomas Gleixner1-1/+2
Delegate the random insertion to the forced threaded interrupt handler. Store the return IP of the hard interrupt handler in the irq descriptor and feed it into the random generator as a source of entropy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2021-10-06x86/entry: Clear X86_FEATURE_SMAP when CONFIG_X86_SMAP=nVegard Nossum1-0/+1
Commit 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks") added a warning if AC is set when in the kernel. Commit 662a0221893a3d ("x86/entry: Fix AC assertion") changed the warning to only fire if the CPU supports SMAP. However, the warning can still trigger on a machine that supports SMAP but where it's disabled in the kernel config and when running the syscall_nt selftest, for example: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 49 at irqentry_enter_from_user_mode CPU: 0 PID: 49 Comm: init Tainted: G T 5.15.0-rc4+ #98 e6202628ee053b4f310759978284bd8bb0ce6905 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 RIP: 0010:irqentry_enter_from_user_mode ... Call Trace: ? irqentry_enter ? exc_general_protection ? asm_exc_general_protection ? asm_exc_general_protectio IS_ENABLED(CONFIG_X86_SMAP) could be added to the warning condition, but even this would not be enough in case SMAP is disabled at boot time with the "nosmap" parameter. To be consistent with "nosmap" behaviour, clear X86_FEATURE_SMAP when !CONFIG_X86_SMAP. Found using entry-fuzz + satrandconfig. [ bp: Massage commit message. ] Fixes: 3c73b81a9164 ("x86/entry, selftests: Further improve user entry sanity checks") Fixes: 662a0221893a ("x86/entry: Fix AC assertion") Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20211003223423.8666-1-vegard.nossum@oracle.com
2021-10-06x86/resctrl: Fix kfree() of the wrong type in domain_add_cpu()James Morse1-2/+2
Commit in Fixes separated the architecture specific and filesystem parts of the resctrl domain structures. This left the error paths in domain_add_cpu() kfree()ing the memory with the wrong type. This will cause a problem if someone adds a new member to struct rdt_hw_domain meaning d_resctrl is no longer the first member. Fixes: 792e0f6f789b ("x86/resctrl: Split struct rdt_domain") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lkml.kernel.org/r/20210917165924.28254-1-james.morse@arm.com
2021-10-06x86/resctrl: Free the ctrlval arrays when domain_setup_mon_state() failsJames Morse1-0/+2
domain_add_cpu() is called whenever a CPU is brought online. The earlier call to domain_setup_ctrlval() allocates the control value arrays. If domain_setup_mon_state() fails, the control value arrays are not freed. Add the missing kfree() calls. Fixes: 1bd2a63b4f0de ("x86/intel_rdt/mba_sc: Add initialization support") Fixes: edf6fa1c4a951 ("x86/intel_rdt/cqm: Add RMID (Resource monitoring ID) management") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20210917165958.28313-1-james.morse@arm.com
2021-09-19Merge tag 'x86_urgent_for_v5.15_rc2' of ↵Linus Torvalds1-11/+32
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Prevent a infinite loop in the MCE recovery on return to user space, which was caused by a second MCE queueing work for the same page and thereby creating a circular work list. - Make kern_addr_valid() handle existing PMD entries, which are marked not present in the higher level page table, correctly instead of blindly dereferencing them. - Pass a valid address to sanitize_phys(). This was caused by the mixture of inclusive and exclusive ranges. memtype_reserve() expect 'end' being exclusive, but sanitize_phys() wants it inclusive. This worked so far, but with end being the end of the physical address space the fail is exposed. - Increase the maximum supported GPIO numbers for 64bit. Newer SoCs exceed the previous maximum. * tag 'x86_urgent_for_v5.15_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Avoid infinite loop for copy from user recovery x86/mm: Fix kern_addr_valid() to cope with existing but not present entries x86/platform: Increase maximum GPIO number for X86_64 x86/pat: Pass valid address to sanitize_phys()
2021-09-14x86/mce: Avoid infinite loop for copy from user recoveryTony Luck1-11/+32
There are two cases for machine check recovery: 1) The machine check was triggered by ring3 (application) code. This is the simpler case. The machine check handler simply queues work to be executed on return to user. That code unmaps the page from all users and arranges to send a SIGBUS to the task that triggered the poison. 2) The machine check was triggered in kernel code that is covered by an exception table entry. In this case the machine check handler still queues a work entry to unmap the page, etc. but this will not be called right away because the #MC handler returns to the fix up code address in the exception table entry. Problems occur if the kernel triggers another machine check before the return to user processes the first queued work item. Specifically, the work is queued using the ->mce_kill_me callback structure in the task struct for the current thread. Attempting to queue a second work item using this same callback results in a loop in the linked list of work functions to call. So when the kernel does return to user, it enters an infinite loop processing the same entry for ever. There are some legitimate scenarios where the kernel may take a second machine check before returning to the user. 1) Some code (e.g. futex) first tries a get_user() with page faults disabled. If this fails, the code retries with page faults enabled expecting that this will resolve the page fault. 2) Copy from user code retries a copy in byte-at-time mode to check whether any additional bytes can be copied. On the other side of the fence are some bad drivers that do not check the return value from individual get_user() calls and may access multiple user addresses without noticing that some/all calls have failed. Fix by adding a counter (current->mce_count) to keep track of repeated machine checks before task_work() is called. First machine check saves the address information and calls task_work_add(). Subsequent machine checks before that task_work call back is executed check that the address is in the same page as the first machine check (since the callback will offline exactly one page). Expected worst case is four machine checks before moving on (e.g. one user access with page faults disabled, then a repeat to the same address with page faults enabled ... repeat in copy tail bytes). Just in case there is some code that loops forever enforce a limit of 10. [ bp: Massage commit message, drop noinstr, fix typo, extend panic messages. ] Fixes: 5567d11c21a1 ("x86/mce: Send #MC singal from task work") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com
2021-09-11Merge branch 'linus' into smp/urgentThomas Gleixner1-22/+16
Ensure that all usage sites of get/put_online_cpus() except for the struggler in drivers/thermal are gone. So the last user and the deprecated inlines can be removed.
2021-09-02Merge tag 'hyperv-next-signed-20210831' of ↵Linus Torvalds1-22/+16
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - make Hyper-V code arch-agnostic (Michael Kelley) - fix sched_clock behaviour on Hyper-V (Ani Sinha) - fix a fault when Linux runs as the root partition on MSHV (Praveen Kumar) - fix VSS driver (Vitaly Kuznetsov) - cleanup (Sonia Sharma) * tag 'hyperv-next-signed-20210831' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: hv_utils: Set the maximum packet size for VSS driver to the length of the receive buffer Drivers: hv: Enable Hyper-V code to be built on ARM64 arm64: efi: Export screen_info arm64: hyperv: Initialize hypervisor on boot arm64: hyperv: Add panic handler arm64: hyperv: Add Hyper-V hypercall and register access utilities x86/hyperv: fix root partition faults when writing to VP assist page MSR hv: hyperv.h: Remove unused inline functions drivers: hv: Decouple Hyper-V clock/timer code from VMbus drivers x86/hyperv: add comment describing TSC_INVARIANT_CONTROL MSR setting bit 0 Drivers: hv: Move Hyper-V misc functionality to arch-neutral code Drivers: hv: Add arch independent default functions for some Hyper-V handlers Drivers: hv: Make portions of Hyper-V init code be arch neutral x86/hyperv: fix for unwanted manipulation of sched_clock when TSC marked unstable asm-generic/hyperv: Add missing #include of nmi.h
2021-09-01drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()Thomas Gleixner1-5/+2
DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework to ensure that the cache related functions are called on the upcoming CPU because the notifier itself could run on any online CPU. The hotplug state machine guarantees that the callbacks are invoked on the upcoming CPU. So there is no need to have this SMP function call obfuscation. That indirection was missed when the hotplug notifiers were converted. This also solves the problem of ARM64 init_cache_level() invoking ACPI functions which take a semaphore in that context. That's invalid as SMP function calls run with interrupts disabled. Running it just from the callback in context of the CPU hotplug thread solves this. Fixes: 8571890e1513 ("arm64: Add support for ACPI based firmware tables") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx
2021-08-31Merge tag 'x86-cpu-2021-08-30' of ↵Linus Torvalds1-0/+70
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache flush updates from Thomas Gleixner: "A reworked version of the opt-in L1D flush mechanism. This is a stop gap for potential future speculation related hardware vulnerabilities and a mechanism for truly security paranoid applications. It allows a task to request that the L1D cache is flushed when the kernel switches to a different mm. This can be requested via prctl(). Changes vs the previous versions: - Get rid of the software flush fallback - Make the handling consistent with other mitigations - Kill the task when it ends up on a SMT enabled core which defeats the purpose of L1D flushing obviously" * tag 'x86-cpu-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation: Add L1D flushing Documentation x86, prctl: Hook L1D flushing in via prctl x86/mm: Prepare for opt-in based L1D flush in switch_mm() x86/process: Make room for TIF_SPEC_L1D_FLUSH sched: Add task_work callback for paranoid L1D flush x86/mm: Refactor cond_ibpb() to support other use cases x86/smp: Add a per-cpu view of SMT state
2021-08-30Merge tag 'perf-core-2021-08-30' of ↵Linus Torvalds2-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf event updates from Ingo Molnar: - Add support for Intel Sapphire Rapids server CPU uncore events - Allow the AMD uncore driver to be built as a module - Misc cleanups and fixes * tag 'perf-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) perf/x86/amd/ibs: Add bitfield definitions in new <asm/amd-ibs.h> header perf/amd/uncore: Allow the driver to be built as a module x86/cpu: Add get_llc_id() helper function perf/amd/uncore: Clean up header use, use <linux/ include paths instead of <asm/ perf/amd/uncore: Simplify code, use free_percpu()'s built-in check for NULL perf/hw_breakpoint: Replace deprecated CPU-hotplug functions perf/x86/intel: Replace deprecated CPU-hotplug functions perf/x86: Remove unused assignment to pointer 'e' perf/x86/intel/uncore: Fix IIO cleanup mapping procedure for SNR/ICX perf/x86/intel/uncore: Support IMC free-running counters on Sapphire Rapids server perf/x86/intel/uncore: Support IIO free-running counters on Sapphire Rapids server perf/x86/intel/uncore: Factor out snr_uncore_mmio_map() perf/x86/intel/uncore: Add alias PMU name perf/x86/intel/uncore: Add Sapphire Rapids server MDF support perf/x86/intel/uncore: Add Sapphire Rapids server M3UPI support perf/x86/intel/uncore: Add Sapphire Rapids server UPI support perf/x86/intel/uncore: Add Sapphire Rapids server M2M support perf/x86/intel/uncore: Add Sapphire Rapids server IMC support perf/x86/intel/uncore: Add Sapphire Rapids server PCU support perf/x86/intel/uncore: Add Sapphire Rapids server M2PCIe support ...
2021-08-30Merge tag 'x86_cleanups_for_v5.15' of ↵Linus Torvalds3-17/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: "The usual round of minor cleanups and fixes" * tag 'x86_cleanups_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kaslr: Have process_mem_region() return a boolean x86/power: Fix kernel-doc warnings in cpu.c x86/mce/inject: Replace deprecated CPU-hotplug functions. x86/microcode: Replace deprecated CPU-hotplug functions. x86/mtrr: Replace deprecated CPU-hotplug functions. x86/mmiotrace: Replace deprecated CPU-hotplug functions.
2021-08-30Merge tag 'x86_cache_for_v5.15' of ↵Linus Torvalds6-595/+592
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resource control updates from Borislav Petkov: "A first round of changes towards splitting the arch-specific bits from the filesystem bits of resctrl, the ultimate goal being to support ARM's equivalent technology MPAM, with the same fs interface (James Morse)" * tag 'x86_cache_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) x86/resctrl: Make resctrl_arch_get_config() return its value x86/resctrl: Merge the CDP resources x86/resctrl: Expand resctrl_arch_update_domains()'s msr_param range x86/resctrl: Remove rdt_cdp_peer_get() x86/resctrl: Merge the ctrl_val arrays x86/resctrl: Calculate the index from the configuration type x86/resctrl: Apply offset correction when config is staged x86/resctrl: Make ctrlval arrays the same size x86/resctrl: Pass configuration type to resctrl_arch_get_config() x86/resctrl: Add a helper to read a closid's configuration x86/resctrl: Rename update_domains() to resctrl_arch_update_domains() x86/resctrl: Allow different CODE/DATA configurations to be staged x86/resctrl: Group staged configuration into a separate struct x86/resctrl: Move the schemata names into struct resctrl_schema x86/resctrl: Add a helper to read/set the CDP configuration x86/resctrl: Swizzle rdt_resource and resctrl_schema in pseudo_lock_region x86/resctrl: Pass the schema to resctrl filesystem functions x86/resctrl: Add resctrl_arch_get_num_closid() x86/resctrl: Store the effective num_closid in the schema x86/resctrl: Walk the resctrl schema list instead of an arch list ...
2021-08-30Merge tag 'ras_core_for_v5.15' of ↵Linus Torvalds1-3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS update from Borislav Petkov: "A single RAS change for 5.15: - Do not start processing MCEs logged early because the decoding chain is not up yet - delay that processing until everything is ready" * tag 'ras_core_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Defer processing of early errors
2021-08-26x86/cpu: Add get_llc_id() helper functionKim Phillips2-1/+7
Factor out a helper function rather than export cpu_llc_id, which is needed in order to be able to build the AMD uncore driver as a module. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210817221048.88063-7-kim.phillips@amd.com
2021-08-24x86/mce: Defer processing of early errorsBorislav Petkov1-3/+8
When a fatal machine check results in a system reset, Linux does not clear the error(s) from machine check bank(s) - hardware preserves the machine check banks across a warm reset. During initialization of the kernel after the reboot, Linux reads, logs, and clears all machine check banks. But there is a problem. In: 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver") the call to mce_register_decode_chain() moved later in the boot sequence. This means that /dev/mcelog doesn't see those early error logs. This was partially fixed by: cd9c57cad3fe ("x86/MCE: Dump MCE to dmesg if no consumers") which made sure that the logs were not lost completely by printing to the console. But parsing console logs is error prone. Users of /dev/mcelog should expect to find any early errors logged to standard places. Add a new flag MCP_QUEUE_LOG to machine_check_poll() to be used in early machine check initialization to indicate that any errors found should just be queued to genpool. When mcheck_late_init() is called it will call mce_schedule_work() to actually log and flush any errors queued in the genpool. [ Based on an original patch, commit message by and completely productized by Tony Luck. ] Fixes: 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver") Reported-by: Sumanth Kamatala <skamatala@juniper.net> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210824003129.GA1642753@agluck-desk2.amr.corp.intel.com
2021-08-22x86/resctrl: Fix a maybe-uninitialized build warning treated as errorBabu Moger1-0/+6
The recent commit 064855a69003 ("x86/resctrl: Fix default monitoring groups reporting") caused a RHEL build failure with an uninitialized variable warning treated as an error because it removed the default case snippet. The RHEL Makefile uses '-Werror=maybe-uninitialized' to force possibly uninitialized variable warnings to be treated as errors. This is also reported by smatch via the 0day robot. The error from the RHEL build is: arch/x86/kernel/cpu/resctrl/monitor.c: In function ‘__mon_event_count’: arch/x86/kernel/cpu/resctrl/monitor.c:261:12: error: ‘m’ may be used uninitialized in this function [-Werror=maybe-uninitialized] m->chunks += chunks; ^~ The upstream Makefile does not build using '-Werror=maybe-uninitialized'. So, the problem is not seen there. Fix the problem by putting back the default case snippet. [ bp: note that there's nothing wrong with the code and other compilers do not trigger this warning - this is being done just so the RHEL compiler is happy. ] Fixes: 064855a69003 ("x86/resctrl: Fix default monitoring groups reporting") Reported-by: Terry Bowman <Terry.Bowman@amd.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/162949631908.23903.17090272726012848523.stgit@bmoger-ubuntu
2021-08-12x86/resctrl: Fix default monitoring groups reportingBabu Moger1-14/+13
Creating a new sub monitoring group in the root /sys/fs/resctrl leads to getting the "Unavailable" value for mbm_total_bytes and mbm_local_bytes on the entire filesystem. Steps to reproduce: 1. mount -t resctrl resctrl /sys/fs/resctrl/ 2. cd /sys/fs/resctrl/ 3. cat mon_data/mon_L3_00/mbm_total_bytes 23189832 4. Create sub monitor group: mkdir mon_groups/test1 5. cat mon_data/mon_L3_00/mbm_total_bytes Unavailable When a new monitoring group is created, a new RMID is assigned to the new group. But the RMID is not active yet. When the events are read on the new RMID, it is expected to report the status as "Unavailable". When the user reads the events on the default monitoring group with multiple subgroups, the events on all subgroups are consolidated together. Currently, if any of the RMID reads report as "Unavailable", then everything will be reported as "Unavailable". Fix the issue by discarding the "Unavailable" reads and reporting all the successful RMID reads. This is not a problem on Intel systems as Intel reports 0 on Inactive RMIDs. Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data") Reported-by: Paweł Szulik <pawel.szulik@intel.com> Signed-off-by: Babu Moger <Babu.Moger@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=213311 Link: https://lkml.kernel.org/r/162793309296.9224.15871659871696482080.stgit@bmoger-ubuntu
2021-08-11x86/resctrl: Make resctrl_arch_get_config() return its valueJames Morse3-16/+19
resctrl_arch_get_config() has no return, but does pass a single value back via one of its arguments. Return the value instead. Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210811163831.14917-1-james.morse@arm.com
2021-08-11x86/resctrl: Merge the CDP resourcesJames Morse3-239/+85
resctrl uses struct rdt_resource to describe the available hardware resources. The domains of the CDP aliases share a single ctrl_val[] array. The only differences between the struct rdt_hw_resource aliases is the name and conf_type. The name from struct rdt_hw_resource is visible to user-space. To support another architecture, as many user-visible details should be handled in the filesystem parts of the code that is common to all architectures. The name and conf_type go together. Remove conf_type and the CDP aliases. When CDP is supported and enabled, schemata_list_create() can create two schemata using the single resource, generating the CODE/DATA suffix to the schema name itself. This allows the alloc_ctrlval_array() and complications around free()ing the ctrl_val arrays to be removed. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-25-james.morse@arm.com
2021-08-11x86/resctrl: Expand resctrl_arch_update_domains()'s msr_param rangeJames Morse1-3/+9
resctrl_arch_update_domains() specifies the one closid that has been modified and needs copying to the hardware. resctrl_arch_update_domains() takes a struct rdt_resource and a closid as arguments, but copies all the staged configurations for that closid into the ctrl_val[] array. resctrl_arch_update_domains() is called once per schema, but once the resources and domains are merged, the second call of a L2CODE/L2DATA pair will find no staged configurations, as they were previously applied. The msr_param of the first call only has one index, so would only have update the hardware for the last staged configuration. To avoid a second round of IPIs when changing L2CODE and L2DATA in one go, expand the range of the msr_param if multiple staged configurations are found. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-24-james.morse@arm.com
2021-08-11x86/resctrl: Remove rdt_cdp_peer_get()James Morse1-85/+14
When CDP is enabled, rdt_cdp_peer_get() finds the alternative CODE/DATA resource and returns the alternative domain. This is used to determine if bitmaps overlap when there are aliased entries in the two struct rdt_hw_resources. Now that the ctrl_val[] used by the CODE/DATA resources is the same, the search for an alternate resource/domain is not needed. Replace rdt_cdp_peer_get() with resctrl_peer_type(), which returns the alternative type. This can be passed to resctrl_arch_get_config() with the same resource and domain. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-23-james.morse@arm.com
2021-08-11x86/resctrl: Merge the ctrl_val arraysJames Morse1-4/+61
Each struct rdt_hw_resource has its own ctrl_val[] array. When CDP is enabled, two resources are in use, each with its own ctrl_val[] array that holds half of the configuration used by hardware. One uses the odd slots, the other the even. rdt_cdp_peer_get() is the helper to find the alternate resource, its domain, and corresponding entry in the other ctrl_val[] array. Once the CDP resources are merged there will be one struct rdt_hw_resource and one ctrl_val[] array for each hardware resource. This will include changes to rdt_cdp_peer_get(), making it hard to bisect any issue. Merge the ctrl_val[] arrays for three CODE/DATA/NONE resources first. Doing this before merging the resources temporarily complicates allocating and freeing the ctrl_val arrays. Add a helper to allocate the ctrl_val array, that returns the value on the L2 or L3 resource if it already exists. This gets removed once the resources are merged, and there really is only one ctrl_val[] array. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-22-james.morse@arm.com
2021-08-11x86/resctrl: Calculate the index from the configuration typeJames Morse2-22/+15
resctrl uses cbm_idx() to map a closid to an index in the configuration array. This is based on a multiplier and offset that are held in the resource. To merge the resources, the resctrl arch code needs to calculate the index from something else, as there will only be one resource. Decide based on the staged configuration type. This makes the static mult and offset parameters redundant. [ bp: Remove superfluous brackets in get_config_index() ] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-21-james.morse@arm.com
2021-08-11x86/resctrl: Apply offset correction when config is stagedJames Morse4-31/+27
When resctrl comes to copy the CAT MSR values from the ctrl_val[] array into hardware, it applies an offset adjustment based on the type of the resource. CODE and DATA resources have their closid mapped into an odd/even range. This mapping is based on a property of the resource. This happens once the new control value has been written to the ctrl_val[] array. Once the CDP resources are merged, there will only be a single property that needs to cover both odd/even mappings to the single ctrl_val[] array. The offset adjustment must be applied before the new value is written to the array. Move the logic from cat_wrmsr() to resctrl_arch_update_domains(). The value provided to apply_config() is now an index in the array, not the closid. The parameters provided via struct msr_param are now indexes too. As resctrl's use of closid is a u32, struct msr_param's type is changed to match. With this, the CODE and DATA resources only use the odd or even indexes in the array. This allows the temporary num_closid/2 fixes in domain_setup_ctrlval() and reset_all_ctrls() to be removed. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-20-james.morse@arm.com
2021-08-11x86/resctrl: Make ctrlval arrays the same sizeJames Morse2-1/+18
The CODE and DATA resources report a num_closid that is half the actual size supported by the hardware. This behaviour is visible to user-space when CDP is enabled. The CODE and DATA resources have their own ctrlval arrays which are half the size of the underlying hardware because num_closid was already adjusted. One holds the odd configurations values, the other even. Before the CDP resources can be merged, the 'half the closids' behaviour needs to be implemented by schemata_list_create(), but this causes the ctrl_val[] array to be full sized. Remove the logic from the architecture specific rdt_get_cdp_config() setup, and add it to schemata_list_create(). Functions that walk all the configurations, such as domain_setup_ctrlval() and reset_all_ctrls(), take num_closid directly from struct rdt_hw_resource also have to halve num_closid as only the lower half of each array is in use. domain_setup_ctrlval() and reset_all_ctrls() both copy struct rdt_hw_resource's num_closid to a struct msr_param. Correct the value here. This is temporary as a subsequent patch will merge all three ctrl_val[] arrays such that when CDP is in use, the CODA/DATA layout in the array matches the hardware. reset_all_ctrls()'s loop over the whole of ctrl_val[] is not touched as this is harmless, and will be required as it is once the resources are merged. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-19-james.morse@arm.com
2021-08-11x86/resctrl: Pass configuration type to resctrl_arch_get_config()James Morse3-15/+27
The ctrl_val[] array for a struct rdt_hw_resource only holds configurations of one type. The type is implicit. Once the CDP resources are merged, the ctrl_val[] array will hold all the configurations for the hardware resource. When a particular type of configuration is needed, it must be specified explicitly. Pass the expected type from the schema into resctrl_arch_get_config(). Nothing uses this yet, but once a single ctrl_val[] array is used for the three struct rdt_hw_resources that share hardware, the type will be used to return the correct configuration value from the shared array. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-18-james.morse@arm.com
2021-08-11x86/resctrl: Add a helper to read a closid's configurationJames Morse3-30/+35
Functions like show_doms() reach into the architecture's private structure to retrieve the configuration from the struct rdt_hw_resource. The hardware configuration may look completely different to the values resctrl gets from user-space. The staged configuration and resctrl_arch_update_domains() allow the architecture to convert or translate these values. Resctrl shouldn't read or write the ctrl_val[] values directly. Add a helper to read the current configuration. This will allow another architecture to scale the bitmaps if necessary, and possibly use controls that don't take the user-space control format at all. Of the remaining functions that access ctrl_val[] directly, apply_config() is part of the architecture-specific code, and is called via resctrl_arch_update_domains(). reset_all_ctrls() will be an architecture specific helper. update_mba_bw() manipulates both ctrl_val[], mbps_val[] and the hardware. The mbps_val[] that matches the mba_sc state of the resource is changed, but the other is left unchanged. Abstracting this is the subject of later patches that affect set_mba_sc() too. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-17-james.morse@arm.com
2021-08-11x86/resctrl: Rename update_domains() to resctrl_arch_update_domains()James Morse3-4/+3
update_domains() merges the staged configuration changes into the arch codes configuration array. Rename to make it clear it is part of the arch code interface to resctrl. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-16-james.morse@arm.com
2021-08-11x86/resctrl: Allow different CODE/DATA configurations to be stagedJames Morse2-8/+17
Before the CDP resources can be merged, struct rdt_domain will need an array of struct resctrl_staged_config, one per type of configuration. Use the type as an index to the array to ensure that a schema configuration string can't specify the same domain twice. This will allow two schemata to apply configuration changes to one resource. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-15-james.morse@arm.com
2021-08-11x86/resctrl: Group staged configuration into a separate structJames Morse2-23/+42
When configuration changes are made, the new value is written to struct rdt_domain's new_ctrl field and the have_new_ctrl flag is set. Later new_ctrl is copied to hardware by a call to update_domains(). Once the CDP resources are merged, there will be one new_ctrl field in use by two struct resctrl_schema requiring a per-schema IPI to copy the value to hardware. Move new_ctrl and have_new_ctrl into a new struct resctrl_staged_config. Before the CDP resources can be merged, struct rdt_domain will need an array of these, one per type of configuration. Using the type as an index to the array will ensure that a schema configuration string can't specify the same domain twice. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-14-james.morse@arm.com
2021-08-11x86/resctrl: Move the schemata names into struct resctrl_schemaJames Morse3-16/+28
resctrl 'info' directories and schema parsing use the schema name. This lives in the struct rdt_resource, and is specified by the architecture code. Once the CDP resources are merged, there will only be one resource (and one name) in use by two schemata. To allow the CDP CODE/DATA property to be the type of configuration the schema uses, the name should also be per-schema. Add a name field to struct resctrl_schema, and use this wherever the schema name is exposed (or read from) user-space. Calculating max_name_width for padding the schemata file also moves as this is visible to user-space. As the names in struct rdt_resource already include the CDP information, schemata_list_create() copies them. schemata_list_create() includes the length of the CDP suffix when calculating max_name_width in preparation for CDP resources being merged. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-13-james.morse@arm.com
2021-08-11x86/resctrl: Add a helper to read/set the CDP configurationJames Morse4-34/+60
Whether CDP is enabled for a hardware resource like the L3 cache can be found by inspecting the alloc_enabled flags of the L3CODE/L3DATA struct rdt_hw_resources, even if they aren't in use. Once these resources are merged, the flags can't be compared. Whether CDP is enabled needs tracking explicitly. If another architecture is emulating CDP the behaviour may not be per-resource. 'cdp_capable' needs to be visible to resctrl, even if its not in use, as this affects the padding of the schemata table visible to user-space. Add cdp_enabled to struct rdt_hw_resource and cdp_capable to struct rdt_resource. Add resctrl_arch_set_cdp_enabled() to let resctrl enable or disable CDP on a resource. resctrl_arch_get_cdp_enabled() lets it read the current state. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-12-james.morse@arm.com
2021-08-11x86/resctrl: Swizzle rdt_resource and resctrl_schema in pseudo_lock_regionJames Morse4-11/+11
struct pseudo_lock_region points to the rdt_resource. Once the resources are merged, this won't be unique. The resource name is moving into the schema, so that the filesystem portions of resctrl can generate it. Swap pseudo_lock_region's rdt_resource pointer for a schema pointer. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-11-james.morse@arm.com
2021-08-11x86/resctrl: Pass the schema to resctrl filesystem functionsJames Morse3-21/+27
Once the CDP resources are merged, there will be two struct resctrl_schema for one struct rdt_resource. CDP becomes a type of configuration that belongs to the schema. Helpers like rdtgroup_cbm_overlaps() need access to the schema to query the configuration (or configurations) based on schema properties. Change these functions to take a struct schema instead of the struct rdt_resource. All the modified functions are part of the filesystem code that will move to /fs/resctrl once it is possible to support a second architecture. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-10-james.morse@arm.com
2021-08-11x86/resctrl: Add resctrl_arch_get_num_closid()James Morse3-4/+13
To initialise struct resctrl_schema's num_closid, schemata_list_create() reaches into the architectures private structure to retrieve num_closid from the struct rdt_hw_resource. The 'half the closids' behaviour should be part of the filesystem parts of resctrl that are the same on any architecture. struct resctrl_schema's num_closid should include any correction for CDP. Having two properties called num_closid is likely to be confusing when they have different values. Add a helper to read the resource's num_closid from the arch code. This should return the number of closid that the resource supports, regardless of whether CDP is in use. Once the CDP resources are merged, schemata_list_create() can apply the correction itself. Using a type with an obvious size for the arch helper means changing the type of num_closid to u32, which matches the type already used by struct rdtgroup. reset_all_ctrls() does not use resctrl_arch_get_num_closid(), even though it sets up a structure for modifying the hardware. This function will be part of the architecture code, the maximum closid should be the maximum value the hardware has, regardless of the way resctrl is using it. All the uses of num_closid in core.c are naturally part of the architecture specific code. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-9-james.morse@arm.com
2021-08-11x86/resctrl: Store the effective num_closid in the schemaJames Morse2-15/+7
Struct resctrl_schema holds properties that vary with the style of configuration that resctrl applies to a resource. There are already two values for the hardware's num_closid, depending on whether the architecture presents the L3 or L3CODE/L3DATA resources. As the way CDP changes the number of control groups that resctrl can create is part of the user-space interface, it should be managed by the filesystem parts of resctrl. This allows the architecture code to only describe the value the hardware supports. Add num_closid to resctrl_schema. This is the value seen by the filesystem, which may be different to the maximum value described by the arch code when CDP is enabled. These functions operate on the num_closid value that is exposed to user-space: * rdtgroup_parse_resource() * rdtgroup_schemata_show() * rdt_num_closids_show() * closid_init() Change them to use the schema value instead. schemata_list_create() sets this value, and reaches into the architecture-specific structure to get the value. This will eventually be replaced with a helper. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-8-james.morse@arm.com
2021-08-11x86/resctrl: Walk the resctrl schema list instead of an arch listJames Morse2-14/+27
When parsing a schema configuration value from user-space, resctrl walks the architectures rdt_resources_all[] array to find a matching struct rdt_resource. Once the CDP resources are merged there will be one resource in use by two schemata. Anything walking rdt_resources_all[] on behalf of a user-space request should walk the list of struct resctrl_schema instead. Change the users of for_each_alloc_enabled_rdt_resource() to walk the schema instead. Schemata were only created for alloc_enabled resources so these two lists are currently equivalent. schemata_list_create() and rdt_kill_sb() are ignored. The first creates the schema list, and will eventually loop over the resource indexes using an arch helper to retrieve the resource. rdt_kill_sb() will eventually make use of an arch 'reset everything' helper. After the filesystem code is moved, rdtgroup_pseudo_locked_in_hierarchy() remains part of the x86 specific hooks to support pseudo lock. This code walks each domain, and still does this after the separate resources are merged. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-7-james.morse@arm.com
2021-08-11x86/resctrl: Label the resources with their configuration typeJames Morse3-0/+10
The names of resources are used for the schema name presented to user-space. The name used is rooted in a structure provided by the architecture code because the names are different when CDP is enabled. x86 implements this by swapping between two sets of resource structures based on their alloc_enabled flag. The type of configuration in-use is encoded in the name (and cbm_idx_offset). Once the CDP behaviour is moved into the parts of resctrl that will move to /fs/, there will be two struct resctrl_schema for one struct rdt_resource. The schema describes the type of configuration being applied to the resource. The name of the schema should be generated by resctrl, base on the type of configuration. To do this struct resctrl_schema needs to store the type of configuration in use for a schema. Create an enum resctrl_conf_type describing the options, and add it to struct resctrl_schema. The underlying resources are still separate, as cbm_idx_offset is still in use. Temporarily label all the entries in rdt_resources_all[] and copy that value to struct resctrl_schema. Copying the value ensures there is no mismatch while the filesystem parts of resctrl are modified to use the schema. Once the resources are merged, the filesystem code can assign this value based on the schema being created. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-6-james.morse@arm.com
2021-08-11x86/resctrl: Pass the schema in info dir's private pointerJames Morse1-13/+25
Many of resctrl's per-schema files return a value from struct rdt_resource, which they take as their 'priv' pointer. Moving properties that resctrl exposes to user-space into the core 'fs' code, (e.g. the name of the schema), means some of the functions that back the filesystem need the schema struct (to where the properties are moved), but currently take struct rdt_resource. For example, once the CDP resources are merged, struct rdt_resource no longer reflects all the properties of the schema. For the info dirs that represent a control, the information needed will be accessed via struct resctrl_schema, as this is how the resource is being used. For the monitors, its still struct rdt_resource as the monitors aren't described as schema. This difference means the type of the private pointers varies between control and monitor info dirs. Change the 'priv' pointer to point to struct resctrl_schema for the per-schema files that represent a control. The type can be determined from the fflags field. If the flags are RF_MON_INFO, its a struct rdt_resource. If the flags are RF_CTRL_INFO, its a struct resctrl_schema. No entry in res_common_files[] has both flags. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-5-james.morse@arm.com
2021-08-11x86/resctrl: Add a separate schema list for resctrlJames Morse2-1/+43
Resctrl exposes schemata to user-space, which allow the control values to be specified for a group of tasks. User-visible properties of the interface, (such as the schemata names and how the values are parsed) are rooted in a struct provided by the architecture code. (struct rdt_hw_resource). Once a second architecture uses resctrl, this would allow user-visible properties to diverge between architectures. These properties should come from the resctrl code that will be common to all architectures. Resctrl has no per-schema structure, only struct rdt_{hw_,}resource. Create a struct resctrl_schema to hold the rdt_resource. Before a second architecture can be supported, this structure will also need to hold the schema name visible to user-space and the type of configuration values for resctrl. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-4-james.morse@arm.com
2021-08-11x86/resctrl: Split struct rdt_domainJames Morse5-59/+63
resctrl is the defacto Linux ABI for SoC resource partitioning features. To support it on another architecture, it needs to be abstracted from the features provided by Intel RDT and AMD PQoS, and moved to /fs/. struct rdt_domain contains a mix of architecture private details and properties of the filesystem interface user-space uses. Continue by splitting struct rdt_domain, into an architecture private 'hw' struct, which contains the common resctrl structure that would be used by any architecture. The hardware values in ctrl_val and mbps_val need to be accessed via helpers to allow another architecture to convert these into a different format if necessary. After this split, filesystem code paths touching a 'hw' struct indicates where an abstraction is needed. Splitting this structure only moves types around, and should not lead to any change in behaviour. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-3-james.morse@arm.com
2021-08-11x86/resctrl: Split struct rdt_resourceJames Morse6-272/+252
resctrl is the defacto Linux ABI for SoC resource partitioning features. To support it on another architecture, it needs to be abstracted from the features provided by Intel RDT and AMD PQoS, and moved to /fs/. struct rdt_resource contains a mix of architecture private details and properties of the filesystem interface user-space uses. Start by splitting struct rdt_resource, into an architecture private 'hw' struct, which contains the common resctrl structure that would be used by any architecture. The foreach helpers are most commonly used by the filesystem code, and should return the common resctrl structure. for_each_rdt_resource() is changed to walk the common structure in its parent arch private structure. Move as much of the structure as possible into the common structure in the core code's header file. The x86 hardware accessors remain part of the architecture private code, as do num_closid, mon_scale and mbm_width. mon_scale and mbm_width are used to detect overflow of the hardware counters, and convert them from their native size to bytes. Any cross-architecture abstraction should be in terms of bytes, making these properties private. The hardware's num_closid is kept in the private structure to force the filesystem code to use a helper to access it. MPAM would return a single value for the system, regardless of the resource. Using the helper prevents this field from being confused with the version of num_closid that is being exposed to user-space (added in a later patch). After this split, filesystem code touching a 'hw' struct indicates where an abstraction is needed. Splitting this structure only moves types around, and should not lead to any change in behaviour. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lkml.kernel.org/r/20210728170637.25610-2-james.morse@arm.com
2021-08-10x86/mce/inject: Replace deprecated CPU-hotplug functions.Sebastian Andrzej Siewior1-4/+4
The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210803141621.780504-10-bigeasy@linutronix.de
2021-08-10x86/microcode: Replace deprecated CPU-hotplug functions.Sebastian Andrzej Siewior1-9/+9
The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210803141621.780504-9-bigeasy@linutronix.de
2021-08-10x86/mtrr: Replace deprecated CPU-hotplug functions.Sebastian Andrzej Siewior1-4/+4
The functions get_online_cpus() and put_online_cpus() have been deprecated during the CPU hotplug rework. They map directly to cpus_read_lock() and cpus_read_unlock(). Replace deprecated CPU-hotplug functions with the official version. The behavior remains unchanged. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210803141621.780504-8-bigeasy@linutronix.de
2021-07-28x86, prctl: Hook L1D flushing in via prctlBalbir Singh1-0/+33
Use the existing PR_GET/SET_SPECULATION_CTRL API to expose the L1D flush capability. For L1D flushing PR_SPEC_FORCE_DISABLE and PR_SPEC_DISABLE_NOEXEC are not supported. Enabling L1D flush does not check if the task is running on an SMT enabled core, rather a check is done at runtime (at the time of flush), if the task runs on a SMT sibling then the task is sent a SIGBUS which is executed before the task returns to user space or to a guest. This is better than the other alternatives of: a. Ensuring strict affinity of the task (hard to enforce without further changes in the scheduler) b. Silently skipping flush for tasks that move to SMT enabled cores. Hook up the core prctl and implement the x86 specific parts which in turn makes it functional. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-5-sblbir@amazon.com
2021-07-28x86/mm: Prepare for opt-in based L1D flush in switch_mm()Balbir Singh1-0/+37
The goal of this is to allow tasks that want to protect sensitive information, against e.g. the recently found snoop assisted data sampling vulnerabilites, to flush their L1D on being switched out. This protects their data from being snooped or leaked via side channels after the task has context switched out. This could also be used to wipe L1D when an untrusted task is switched in, but that's not a really well defined scenario while the opt-in variant is clearly defined. The mechanism is default disabled and can be enabled on the kernel command line. Prepare for the actual prctl based opt-in: 1) Provide the necessary setup functionality similar to the other mitigations and enable the static branch when the command line option is set and the CPU provides support for hardware assisted L1D flushing. Software based L1D flush is not supported because it's CPU model specific and not really well defined. This does not come with a sysfs file like the other mitigations because it is not bound to any specific vulnerability. Support has to be queried via the prctl(2) interface. 2) Add TIF_SPEC_L1D_FLUSH next to L1D_SPEC_IB so the two bits can be mangled into the mm pointer in one go which allows to reuse the existing mechanism in switch_mm() for the conditional IBPB speculation barrier efficiently. 3) Add the L1D flush specific functionality which flushes L1D when the outgoing task opted in. Also check whether the incoming task has requested L1D flush and if so validate that it is not accidentaly running on an SMT sibling as this makes the whole excercise moot because SMT siblings share L1D which opens tons of other attack vectors. If that happens schedule task work which signals the incoming task on return to user/guest with SIGBUS as this is part of the paranoid L1D flush contract. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Balbir Singh <sblbir@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210108121056.21940-1-sblbir@amazon.com
2021-07-21Revert "x86/hyperv: fix logical processor creation"Wei Liu1-1/+1
This reverts commit 450605c28d571eddca39a65fdbc1338add44c6d9. Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-07-16x86/hyperv: add comment describing TSC_INVARIANT_CONTROL MSR setting bit 0Ani Sinha1-0/+9
Commit dce7cd62754b5 ("x86/hyperv: Allow guests to enable InvariantTSC") added the support for HV_X64_MSR_TSC_INVARIANT_CONTROL. Setting bit 0 of this synthetic MSR will allow hyper-v guests to report invariant TSC CPU feature through CPUID. This comment adds this explanation to the code and mentions where the Intel's generic platform init code reads this feature bit from CPUID. The comment will help developers understand how the two parts of the initialization (hyperV specific and non-hyperV specific generic hw init) are related. Signed-off-by: Ani Sinha <ani@anisinha.ca> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210716133245.3272672-1-ani@anisinha.ca Signed-off-by: Wei Liu <wei.liu@kernel.org>